Hi! Thanks for the excellent library! I'm testing the model with the pre-trained 'hey jarvis' wake word. It appears that the example streaming detection implementation feeds each 80 ms sample chunk to the model to detect the word. Since 'hey jarvis' is fairly long (probably 300 ms-ish), the model ends up detecting multiple activations for a single utterance of the wake word. I think simple debounce logic would take care of this issue, but I wanted to ask whether there are standard techniques to handle it. Would it make sense to increase the chunk size, for example? Thanks!
Thank you, I'm glad you are finding it useful!

Yes, the current way the model works (independent predictions on chunks of audio with a sliding window) does often produce multiple activations within a short time, as the audio data associated with the word is still present in the chunk. To your point, openWakeWord does not currently implement any debounce logic. This was originally an intentional choice, as different deployment environments and application scenarios might require different types of logic for what happens after an activation. But I agree that providing at least a default approach for this situation would help.

As for increasing the chunk size, that would actually have a similar outcome to debounce logic. In the current version of openWakeWord, increasing the chunk size will still predict internally at the same rate (every 80 ms), but will then take the maximum prediction within the chunk and simply return that. So by increasing the chunk size you increase the latency of the response, but also increase the chances that you'll get just one activation.
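For reference, here is a minimal sketch of what application-side debounce logic could look like on top of the streaming predictions. It assumes the `Model.predict()` interface shown in the README (a dict of `{model_name: score}` per frame); the threshold, cooldown length, and `get_audio_frame()` helper are illustrative placeholders, not part of the library:

```python
import time

import numpy as np
from openwakeword.model import Model

THRESHOLD = 0.5    # activation threshold (illustrative)
COOLDOWN_S = 2.0   # ignore repeat activations of the same model for this long (illustrative)

model = Model()          # loads the included pre-trained models
last_activation = {}     # model name -> timestamp of the last accepted activation


def get_audio_frame() -> np.ndarray:
    """Placeholder: return the next 80 ms of 16 kHz, 16-bit PCM audio (1280 samples)."""
    return np.zeros(1280, dtype=np.int16)


while True:
    scores = model.predict(get_audio_frame())
    now = time.time()
    for name, score in scores.items():
        # Accept an activation only if the score clears the threshold and
        # enough time has passed since the last accepted activation.
        if score >= THRESHOLD and now - last_activation.get(name, 0.0) >= COOLDOWN_S:
            last_activation[name] = now
            print(f"Activation: {name} (score={score:.2f})")
```

A per-model cooldown like this suppresses the repeated activations caused by overlapping windows without changing how often the model itself runs.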