-
Notifications
You must be signed in to change notification settings - Fork 41
Fix autonumbering locking with SQL named locks and Django row locking #7455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Hey @melton-jason, can you test this PR out before I open it up to other testers. |
|
I created some other solutions that try to avoid race conditions when doing autonumbering, but none are fully satisfactory:
|
Triggered by e604dec on branch refs/heads/issue-6490-1
|
TODO: add indexes to all the common used autonumbering fields. |
melton-jason
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for taking so long on this.
I haven't yet finished reviewing the autonumbering/locking changes, but here are my comments so far regarding the migrations!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had never heard of GET_LOCK before this PR!
Looks like it has potential and this implementation handles the implicit transaction commit problem with LOCK_TABLES mentioned in #6490 (comment) and #6490 (comment) 👏.
I don't think it's mentioned in this PR or on the Issue yet, but this PR would subsume/replace #5404.
I will provide a note here that this implementation utilizing the advisory GET_LOCKS and row level locks from FOR UPDATE is more vulnerable then the prior exclusive full table lock.
Notably, if there are any database triggers or people are interacting directly with the database, outside from the row level update locks, there are little safe guards in the way to prevent "messing up" or interfering with Specify's autoincrementing code (e.g., Specify is vulnerable to race conditions related to record inserts).
For example, it is now possible for some agent with a direct connection to the database to insert a record while Specify is autoincrementing (after it has read the largest record but before committing the new record), resulting in Specify and the agent inserting "duplicate" or "out-of-sync with the numbering scheme" records
…dd_indexes_to_help_autonumbering.py
Triggered by a425131 on branch refs/heads/issue-6490-1
Fixes #6490
Look at Django docs, it seems that explicitly locking mysql tables from within a Django atomic transaction can lead to issues of tables not getting unlocked. The
post_resourcefunction has thetransaction.atomicdecorator, which calls create_obj -> autonumber_and_save -> do_autonumbering -> lock_tables. This needs to be avoid in order to evade the issue.LOCK TABLES implicitly commits any active transaction, and while tables are locked the connection is restricted to those tables only. When Django later tries to manage the savepoint/transaction stack, the connection state can no longer matches expectations. This can leave the connection stuck.
This solution tries using
GET_LOCKandRELEASE_LOCK, acting like a mutex. Using these instead oflock tablesshould avoid problems of query transactions getting stuck. These sql named locks are scoped so that only one autonumbering operation can only be performed on a table at a time. This will avoid race-conditions between autonumbering operations.In addition to the sql named lock, we'll use Django's
select_for_updatefunction to perform row-level locking during the autonumbering operation. This implementation will replace the need to create explicit table locks. While the sql named lock will prevent race-conditions between autonumbering process, this row-level locking will ensure that other processes will not create/edit records that the autonumbering operation depends on while running. Indexes are added to the common autonumbering fields in order to get the row-level locking to work with Django.Checklist
self-explanatory (or properly documented)
Testing instructions