Refactor audio crate #1327

EzraEllette · 2025-02-11T05:46:33Z

description

Moved pcm_to_mel into screenpipe and switched it to the tokio runtime.
Reapplied the LRU cache for speaker embeddings.
ONNX sessions are now lazy-loaded.
STT is now fully async.
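
For readers unfamiliar with the LRU idea mentioned above, here is a minimal std-only sketch of a fixed-capacity cache that evicts the least recently used entry. It is not the code in this PR: `EmbeddingCache`, the `u64` speaker id key, and the `Vec<f32>` embedding type are all assumptions made for illustration, and the real crate may well use an existing LRU library instead.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical fixed-capacity LRU cache keyed by a speaker id.
/// Names and types are illustrative only, not screenpipe's.
pub struct EmbeddingCache {
    capacity: usize,
    entries: HashMap<u64, Vec<f32>>,
    // Front = least recently used, back = most recently used.
    order: VecDeque<u64>,
}

impl EmbeddingCache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            entries: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    // Move `id` to the most-recently-used position (O(n) scan).
    fn touch(&mut self, id: u64) {
        self.order.retain(|k| *k != id);
        self.order.push_back(id);
    }

    /// Look up an embedding and mark it as recently used.
    pub fn get(&mut self, id: u64) -> Option<&Vec<f32>> {
        if self.entries.contains_key(&id) {
            self.touch(id);
        }
        self.entries.get(&id)
    }

    /// Insert or update an embedding, evicting the oldest entry when full.
    pub fn put(&mut self, id: u64, embedding: Vec<f32>) {
        if !self.entries.contains_key(&id) && self.entries.len() == self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        self.entries.insert(id, embedding);
        self.touch(id);
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

The point of the bound is that memory held by embeddings stays proportional to `capacity` rather than growing with every speaker seen. The linear scan in `touch` is acceptable for a small cache; a production implementation would typically use a linked structure for O(1) updates.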

related issue: #1318

how to test

Test using Instruments (ships with Xcode).

Update your profile (`~/.zshrc`)

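# Open Instruments from the shell
alias instruments="open /Applications/Xcode.app/Contents/Applications/Instruments.app"
# Ad-hoc re-sign a binary with the get-task-allow entitlement so Instruments can attach to it
get-task-allow () {
	codesign -s - -v -f --entitlements ~/.get-task-allow.ent "$1"
}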

Create `.get-task-allow.ent` in `~`

.get-task-allow.ent
<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
    <dict>
        <key>com.apple.security.get-task-allow</key>
        <true/>
    </dict>
</plist>

Update screenpipe/Cargo.toml `[profile.release]` section (optional)

[profile.release]
debug = true

Testing (TEST THOROUGHLY)

1. Build screenpipe with `--release` (optional).
2. Run `get-task-allow <path_to_screenpipe_build>`.
3. Open Instruments with `instruments`.
4. Select the Leaks template and your screenpipe build as the target.
5. Edit the target to add arguments.
6. Press the record button and fix permissions. It helps to remove screenpipe completely from permissions and add it back when prompted.
7. Use screenpipe normally.

Screenshots

Before

After

Ran for 13:33 with memory usage under 400 MB.

/claim #1318

- Improve audio segment handling with more efficient buffer management
- Implement lazy-loading and caching for Pyannote and Whisper model sessions
- Add LRU cache for speaker embeddings
- Enhance async handling of STT and embedding extraction
- Optimize memory usage in audio processing components
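
As a rough illustration of the lazy-loading and caching of model sessions listed above, the sketch below initializes a session on first use and reuses it afterwards via `std::sync::OnceLock`. It is an assumption-laden sketch rather than this PR's code: `Session`, `load_session`, `segmentation_session`, the `"segmentation.onnx"` path, and the `LOAD_COUNT` counter are all hypothetical stand-ins, and no real ONNX runtime is involved.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for an expensive-to-build ONNX session.
pub struct Session {
    pub model_path: String,
}

// Counts how often the loader actually ran; for demonstration only.
pub static LOAD_COUNT: AtomicUsize = AtomicUsize::new(0);

// Process-wide slot that stays empty until the first caller needs it.
static SEGMENTATION_SESSION: OnceLock<Session> = OnceLock::new();

// Placeholder for the costly model load (hypothetical).
fn load_session(model_path: &str) -> Session {
    LOAD_COUNT.fetch_add(1, Ordering::SeqCst);
    Session {
        model_path: model_path.to_string(),
    }
}

/// Returns the shared session, loading it on the first call only.
pub fn segmentation_session() -> &'static Session {
    SEGMENTATION_SESSION.get_or_init(|| load_session("segmentation.onnx"))
}
```

With this pattern nothing is loaded at startup, and repeated calls return the same instance instead of building a new session each time, which is one plausible way to avoid both eager memory use and repeated allocations.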

- Update log_mel_spectrogram_ and pcm_to_mel functions to be async
- Replace thread::scope with tokio::task::spawn for parallel processing
- Remove Arc and thread synchronization in favor of async task spawning
- Add .await calls to async functions in processing pipeline

- Upgrade Candle libraries from 0.7.2 to 0.8.2
- Update tokenizers from 0.20.0 to 0.21.0
- Add num-traits dependency to screenpipe-audio

vercel · 2025-02-11T05:46:38Z

The latest updates on your projects.

Name	Status	Updated (UTC)
screenpipe	✅ Ready	Feb 11, 2025 5:47am

EzraEllette added 4 commits February 10, 2025 22:23

move pcm_to_mel into screenpipe

761583f

chore: Update Candle and tokenizers dependencies

4be0923

algora-pbc bot mentioned this pull request Feb 11, 2025

[bounty] fix memory leak #1318

Open

algora-pbc bot added the 🙋 Bounty claim label Feb 11, 2025

vercel bot deployed to Preview February 11, 2025 05:47
