-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[L0] Enable Copy engine support with in-order command lists #2636
base: main
Are you sure you want to change the base?
[L0] Enable Copy engine support with in-order command lists #2636
Conversation
Added a createSyncPointBetweenCopyAndCompute() function which appends a signal event to the previous command list to wait on. This can happen in a scenario when we use Compute -> Copy -> Compute, when the last Compute needs to wait for Copy to preserve in-order behaviour. For the immediateAppend path, this works only if the counter-based events are explicitly not used.
b47d4ef
to
85f8911
Compare
// Get sync point and register the event with it. | ||
ur_exp_command_buffer_sync_point_t SyncPoint = | ||
CommandBuffer->getNextSyncPoint(); | ||
CommandBuffer->registerSyncPoint(SyncPoint, SignalPrevCommandEvent); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is every call to this function inside a check for CommandBuffer->IsInOrderCmdList
? If so I don't think we need to care about sync points (and should also stick that in function comment) Also preference for renaming the function to remove "syncpoint" in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow. createSyncPointBetweenCopyAndCompute
is only ever called inside CommandBuffer->IsInOrderCmdList
check to synchronize between the potential in-orderCopy
vs in-orderCompute
race.
AFAIK Registering SignalPrevCommandEvent
adds it to the event list which is later being reset. If we managed to use counter-based events, lines 223-226
wouldn't be necessary.
I might be misunderstanding the purpose of getNextSyncPoint()
and registerSyncPoint()
, maybe there is a more straightforward way to cache this signal event and later reset it but I thought that's the current way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be misunderstanding the purpose of getNextSyncPoint() and registerSyncPoint(), maybe there is a more straightforward way to cache this signal event and later reset it but I thought that's the current way.
This is more what I was getting at. As was trying to ask if we need this sync point if the UR user doesn't care about sync points. However, you've answered my question by saying that we need to do this to cache the L0 event and reset it later. So I guess my feedback is to update the code comment to append ", so that the event is reset between executions"
ur_command_t CommandType, ur_exp_command_buffer_handle_t CommandBuffer, | ||
bool IsFirstNode, bool IsCopy, | ||
ze_event_handle_t &ZeSignalPrevCommandEvent) { | ||
if (IsFirstNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be CommandBuffer->IsFirstNode()
now? and we drop the function parameter
// Get sync point and register the event with it. | ||
ur_exp_command_buffer_sync_point_t SyncPoint = | ||
CommandBuffer->getNextSyncPoint(); | ||
CommandBuffer->registerSyncPoint(SyncPoint, SignalPrevCommandEvent); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be misunderstanding the purpose of getNextSyncPoint() and registerSyncPoint(), maybe there is a more straightforward way to cache this signal event and later reset it but I thought that's the current way.
This is more what I was getting at. As was trying to ask if we need this sync point if the UR user doesn't care about sync points. However, you've answered my question by saying that we need to do this to cache the L0 event and reset it later. So I guess my feedback is to update the code comment to append ", so that the event is reset between executions"
* command list (copy or compute), to indicate when it's finished executing. | ||
* @param[in] CommandType The type of the command. | ||
* @param[in] CommandBuffer The CommandBuffer where the command is appended. | ||
* @param[in] IsFirstNode A boolean inidicating if the current node is at the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* @param[in] IsFirstNode A boolean inidicating if the current node is at the | |
* @param[in] IsFirstNode A boolean indicating if the current node is at the |
ze_command_list_handle_t ZeCommandList = | ||
CommandBuffer->chooseCommandList(PreferCopyEngine); | ||
|
||
// Create a sync point if both Copy and Compute in-order command lists are | ||
// used. | ||
if (CommandBuffer->IsInOrderCmdList && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bit of code is copy pasted a lot. It can probably be refactored into its own function.
bool IsFirstNode, bool IsCopy, | ||
ze_event_handle_t &ZeSignalPrevCommandEvent) { | ||
if (IsFirstNode) { | ||
CommandBuffer->MCopyCommandListEmpty = !IsCopy; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this bit of code needed? Isn't chooseCommandList()
already assigning a value to MCopyCommandListEmpty and MComputeCommandListEmtpy?
ur_command_t CommandType, ur_exp_command_buffer_handle_t CommandBuffer, | ||
bool IsFirstNode, bool IsCopy, | ||
ze_event_handle_t &ZeSignalPrevCommandEvent) { | ||
if (IsFirstNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid using IsFirstNode
completely. It seems to only be used in this check and if I understand it correctly, initializing MWasPrevCopyCommandList
to true
in the CommandBuffer constructor would do the same. I could be missing something though
Currently, copy engine is only available to use for certain commands in the out-of-order command-buffer path. This PR adds required synchronization between the 2 command lists, when both are in-order. Hence, it enables using the copy engine with in-order path.
For the immediateAppend path, this only works for now if the environment variable
UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=0
is set.intel/llvm PR