Implement task restart policies #280
* Test: test_add_task_restart_policy_patterns
* Test: test_get_task_restart_policy_patterns
* Test: test_remove_task_restart_policy_patterns
* Test: test_clear_task_restart_policy_patterns
* Test: test_task_resolve_restarts

* TaskRestartPattern
* TaskRestartPolicy
* TaskHistory

* Removed TaskRestartPolicy and TaskHistory
* Added Traceback

* TaskReturnPattern: Confirm that the input pattern is a string type and that it is not empty.
* Traceback: Confirm that the input is a list of strings and that none of them are empty.
7e82f54 to 6a167f1
Similar to `TaskHub`s, the `TaskRestartPattern` needs additional hashed data to uniquely identify it as a Neo4j node (via the gufe key). The unit tests have been updated to reflect this change.
`statestore` methods have been added to modify the database state:
* add_task_restart_patterns
* remove_task_restart_patterns
* get_task_restart_patterns

Tests were added for each method in the integration tests for the statestore.
The `add_task_restart_patterns` method now establishes the APPLIES relationship between each new pattern and all Tasks ACTIONED on the corresponding TaskHub. Added testing for creation of the APPLIES relationship, asserting the number of created connections over multiple TaskHubs and Tasks. Further subdivided the test classes. Additionally added a `set_task_restart_patterns_max_retries` method for updating the max_retries of a TaskRestartPattern.
Actioning a Task on a TaskHub with preexisting TaskRestartPatterns creates the APPLIES relationship between them with a num_retries value of 0. This behavior is tested in the test_action_task function in the statestore.
When an actioned Task is canceled and also has an APPLIES relationship with a TaskRestartPattern, the APPLIES relationship between the two nodes is removed. Removed org, project, and campaign fields since they are not necessary for the APPLIES relationship.
Setting the status of an actioned Task to any of the following now removes the APPLIES relationship from attached TaskRestartPatterns:
* complete
* invalid
* deleted

NOTE: tests have not been added for this yet
Confirming that changing the status of an actioned Task to any of the following removes the APPLIES relationship:
* complete
* invalid
* deleted
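The APPLIES lifecycle described in the commits above can be pictured with a toy in-memory model. This is only a sketch for illustration: `AppliesModel` and all of its methods are hypothetical names, the real implementation stores these relationships in Neo4j, and starting at `num_retries = 0` when patterns are added (rather than only when a Task is actioned) is an assumption.

```python
TERMINAL_STATUSES = {"complete", "invalid", "deleted"}


class AppliesModel:
    """Toy in-memory model of APPLIES bookkeeping (hypothetical, not the Neo4j code)."""

    def __init__(self):
        self.patterns = {}  # taskhub -> set of restart patterns
        self.actioned = {}  # taskhub -> set of actioned tasks
        self.applies = {}   # (taskhub, pattern, task) -> num_retries

    def add_patterns(self, hub, patterns):
        # New patterns apply to every task already actioned on the hub.
        self.patterns.setdefault(hub, set()).update(patterns)
        for pattern in patterns:
            for task in self.actioned.get(hub, set()):
                self.applies.setdefault((hub, pattern, task), 0)

    def action_task(self, hub, task):
        # Actioning a task links it to every pattern already on the hub.
        self.actioned.setdefault(hub, set()).add(task)
        for pattern in self.patterns.get(hub, set()):
            self.applies.setdefault((hub, pattern, task), 0)

    def cancel_task(self, hub, task):
        # Canceling removes APPLIES between the task and that hub's patterns.
        self.actioned.get(hub, set()).discard(task)
        self.applies = {
            key: n for key, n in self.applies.items()
            if (key[0], key[2]) != (hub, task)
        }

    def set_status(self, task, status):
        # A terminal status removes every APPLIES relationship for the task.
        if status in TERMINAL_STATUSES:
            self.applies = {
                key: n for key, n in self.applies.items() if key[2] != task
            }


model = AppliesModel()
model.action_task("hub-a", "task-1")
model.add_patterns("hub-a", ["OOM"])
model.action_task("hub-a", "task-2")
model.action_task("hub-b", "task-1")
model.add_patterns("hub-b", ["OOM"])
created = len(model.applies)

model.cancel_task("hub-a", "task-2")
after_cancel = len(model.applies)

model.set_status("task-1", "complete")
after_complete = len(model.applies)
```

Keying the relationship on the hub as well as the pattern mirrors the earlier note that a pattern needs extra hashed data to be unique per TaskHub.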
New statestore method placeholders:
- add_task_traceback
- resolve_task_restarts

The compute API will add a Task Traceback and resolve restarts for returned failed Tasks. When a list of restart patterns is added, restarts are resolved.
* Renamed add_task_traceback to add_protocol_dag_result_ref_traceback
* Added tests for add_protocol_dag_result_ref_traceback
Implemented half of the resolve_task_restarts test
With this decorator, if a transaction isn't passed as a keyword arg, one is automatically created (and closed). This allows a chaining behavior where many method calls share a single transaction object.
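A minimal sketch of the decorator behavior described above, assuming the transaction is passed as a keyword argument named `tx`. The names `with_transaction`, `FakeTransaction`, and `FakeStore` are hypothetical stand-ins and not the alchemiscale or Neo4j driver API.

```python
import functools


def with_transaction(method):
    """Open (and close) a transaction when the caller does not pass one as `tx`."""

    @functools.wraps(method)
    def wrapper(self, *args, tx=None, **kwargs):
        if tx is not None:
            # Caller supplied a transaction: reuse it and leave closing to them.
            return method(self, *args, tx=tx, **kwargs)
        # No transaction supplied: create one, run the method, always close it.
        tx = self.begin()
        try:
            return method(self, *args, tx=tx, **kwargs)
        finally:
            tx.close()

    return wrapper


class FakeTransaction:
    """Stand-in for a database transaction; records the queries it runs."""

    def __init__(self):
        self.queries = []
        self.closed = False

    def run(self, query):
        self.queries.append(query)

    def close(self):
        self.closed = True


class FakeStore:
    """Hypothetical store used only to exercise the decorator."""

    def __init__(self):
        self.opened = []  # every transaction this store has created

    def begin(self):
        tx = FakeTransaction()
        self.opened.append(tx)
        return tx

    @with_transaction
    def add_pattern(self, pattern, *, tx=None):
        tx.run(f"ADD {pattern}")

    @with_transaction
    def add_patterns(self, patterns, *, tx=None):
        # Passing tx through makes the inner calls share one transaction.
        for pattern in patterns:
            self.add_pattern(pattern, tx=tx)


store = FakeStore()
store.add_patterns(["a", "b"])
```

Here the outer call opens a single transaction, both inner calls reuse it, and it is closed once the outer call returns.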
* Removed custom tokenization
* Implemented _defaults to allow default tokenization to work
cancel_map has been changed from a defaultdict to a plain dict, with the dict.get method now used to return None for missing keys. Additionally added a set of all task/taskhub pairs that is later used to determine what should be canceled. I've also added grouping on taskhubs so the number of calls to cancel_tasks is minimized.
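The grouping idea can be sketched roughly as follows. The helper names, the string keys, and the `cancel_tasks(tasks, taskhub)` call shape are assumptions made for illustration; they are not taken from the statestore code.

```python
from itertools import groupby
from operator import itemgetter


def group_by_taskhub(pairs):
    """Map each taskhub to the sorted list of tasks paired with it."""
    unique_pairs = set(pairs)  # duplicate (task, taskhub) pairs collapse here
    ordered = sorted(unique_pairs, key=itemgetter(1, 0))  # groupby needs sorted input
    return {
        hub: [task for task, _ in group]
        for hub, group in groupby(ordered, key=itemgetter(1))
    }


def cancel_all(pairs, cancel_tasks):
    """Call `cancel_tasks(tasks, taskhub)` exactly once per taskhub."""
    grouped = group_by_taskhub(pairs)
    for hub, tasks in grouped.items():
        cancel_tasks(tasks, hub)
    return grouped


calls = []
pairs = [
    ("task-1", "hub-a"),
    ("task-2", "hub-a"),
    ("task-3", "hub-b"),
    ("task-1", "hub-a"),  # duplicate pair
]
grouped = cancel_all(pairs, lambda tasks, hub: calls.append((hub, tasks)))

# On a plain dict, .get returns None for a missing key instead of raising.
missing = grouped.get("hub-missing")
```

Four input pairs result in two cancel calls, one per taskhub.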
1071369 to f03417c
…t pattern endpoints
Thanks for this impressive feature @ianmkenney! I have a few notes, and I've made some modifications where it was obvious to me what to do.
Can you address the notes and fix any broken tests? After that, I think we should be good to merge!
alchemiscale/storage/statestore.py
Outdated
@@ -1411,30 +1461,51 @@ def action_tasks(
# so we can properly return `None` if needed
task_map = {str(task): None for task in tasks}

q = f"""
query_safe_task_list = [str(task) for task in tasks if task]
Why the trailing `if task`? Unclear under what conditions this would apply.
If a `None` were passed in, this would filter it out. This was made more explicit in the current branch with a sync with main (d331cc4).
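For reference, a small sketch of what the trailing `if task` does. The function names and the string task keys are hypothetical, and the `is not None` variant is only a guess at what a more explicit form could look like, not the code from d331cc4.

```python
def query_safe(tasks):
    """Stringify tasks, skipping any falsy entry (None, empty string, ...)."""
    return [str(task) for task in tasks if task]


def query_safe_explicit(tasks):
    """Hypothetical explicit variant: skip only entries that are None."""
    return [str(task) for task in tasks if task is not None]


tasks = ["Task-abc", None, "Task-def"]
safe = query_safe(tasks)
explicit = query_safe_explicit(tasks)
```

The two differ only for falsy values other than `None`: the truthiness check would also drop an empty string, while the explicit check would keep it.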
closes #277