-
Notifications
You must be signed in to change notification settings - Fork 493
[#7993] Update error message for insert into select from source #34872
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#7993] Update error message for insert into select from source #34872
Conversation
ggevay
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote some comments.
src/adapter/src/error.rs
Outdated
| {object_type} '{}' may not be used in this operation; \ | ||
| the selection may refer to views and materialized views, but transitive \ | ||
| dependencies must not include sources or source-export tables", | ||
| object_name.quoted() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please split off the longer explanation into the AdapterError::detail? We have an error msg style guide that mentions this and various other minor things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, some possible further tweaks for the message content:
- I'm thinking that even the first part of the error should mention that we mean transitive dependencies, because just saying "refer" sounds like direct references, so some users might get confused.
- We should mention somewhere that this is a limitation of
INSERT INTO ... SELECT ...operations, and not otherSELECTs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(If the first part of the message changes, then please also update it in sqlsmith.py.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to produce the same problem with update/delete. So I don't think it makes sense to explicitly mention INSERT INTO ... SELECT ... operations here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "the selection" makes sense because of the context!
| # INSERT INTO ... SELECT ... from view that transitively refers to source -> error | ||
| statement error invalid selection | ||
| INSERT INTO salary_exceptions (employee, salary, type) | ||
| SELECT 'Alice', salary, type FROM emp_view LIMIT 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use the simple directive to test the full error msg. For an example, see
materialize/test/sqllogictest/arrays.slt
Lines 720 to 724 in ee5d332
| simple | |
| SELECT array_cat(ARRAY[[1,2]], ARRAY[[3,4,5]]); | |
| ---- | |
| db error: ERROR: cannot concatenate incompatible arrays | |
| DETAIL: Arrays with differing dimensions are not compatible for concatenation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we provoke this error in an UPDATE or a DELETE statement somehow? E.g. maybe with a subquery in the WHERE clause? At first glance it looks like we do allow that:
materialize/src/sql/src/plan/query.rs
Line 827 in 17d21f7
| allow_subqueries: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me try
| }; | ||
| if !valid { | ||
| return Err(AdapterError::InvalidTableMutationSelection); | ||
| let (object_name, object_type) = match catalog.try_get_entry(id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be maybe slightly simpler if we put the error return also into the first match above, instead of introducing the valid boolean and doing another match? I'm not sure, so if you think it's simpler with the current approach, then it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The valid boolean was already there. I kind of like keeping the error handling path separate. I had tried to write it originally with avoiding the catalog look up again. But in the end I found it a bit complicated.
I will leave it as is, unless others feel strongly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok!
81f53e9 to
047faa1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree to all of Gabor's feedback, otw code looks good to me~
One thing I noticed is if you have multiple transitive source dependencies, we only call out the first one when it'd be nice to call out all of them. I don't think this needs to be blocking, but if you're up for it, we can create a followup github issue/PR! Or even add onto this PR. Totally up to you though
src/adapter/src/error.rs
Outdated
| AdapterError::InvalidTableMutationSelection { .. } => { | ||
| write!( | ||
| f, | ||
| "invalid selection: operation may only refer to user-defined tables, even transitively." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor things:
CREATE TABLE FROM SOURCEis also kind of a user-defined table, but is not allowed here. How about instead of "user-defined tables", saying "non-source, non-system tables"?- I wouldn't say "even" here, because that sounds like as if all dependencies could only be tables, but actually views and MVs can also be among the (intermediate) dependencies.
- The error msg style guide says no punctuation at the end of the primary message.
So, how about something like:
invalid selection: operation may only (transitively) refer to non-source, non-system tables
I will create a follow up issue. In general, what is the philosophy on error messages? Do you think it is better to try and report as many errors as possible in one? Or just have a single error that a user can then iteratively address? |
Problem: It's not allowed for an `INSERT INTO ... SELECT ...` to (transitively) refer to any sources. The reason is that with Serializability, it is not possible to take a lock on sources. And it is possible that there are writes in between the read timestamp and the write timestamp which violates serializability. Solution: Change the error to take two parameters: object type and object name and output this state Testing: Introduce new SQL logic test: insert-into-select-source-error.slt which shows the error case for inserting from a source. As well as non error cases Additionally show the error arises for update and delete
047faa1 to
fe6fe4e
Compare
|
We often just stop at the first error, but I think Jun has a good point that in this case it would be useful to report as many as possible, so that the user immediately has an overview of all the issues. But I also wouldn't block this PR on this. |
ggevay
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, looks good to me!
https://github.com/MaterializeInc/database-issues/issues/7993
Problem:
It's not allowed for an
INSERT INTO ... SELECT ...to (transitively) refer to any sources.The reason is that with Serializability, it is not possible to take a lock on sources. And it is possible that there are writes in between the read timestamp and the write timestamp which violates serializability.
Solution:
Change the error to take two parameters: object type and object name and output this state
Testing:
Introduce new SQL logic test: insert-into-select-source-error.slt which shows the error case for inserting from a source. As well as non error cases
Motivation
Tips for reviewer
Please checkout the test for positive/negative cases
Checklist
$T ⇔ Proto$Tmapping (possibly in a backwards-incompatible way), then it is tagged with aT-protolabel.