Optimize audio capture startup#517

Open
altic-dev wants to merge 5 commits into main from B/audio-pipeline-optimization
Conversation

@altic-dev

@altic-dev altic-dev commented Jul 4, 2026

Owner

Description

Makes dictation start much faster so FluidVoice is less likely to miss the first word. Also closes the overlay faster after dictation, keeps overlay behavior safe when recordings happen back-to-back, removes an old Parakeet experiment, and makes the faster recording path the default.

Type of Change

  • 🐞 Bug fix
  • ✨ New feature
  • 💥 Breaking change
  • 🧹 Chore
  • 📝 Documentation update

Related Issue or Discussion

Tracking PR: #517

Testing

  • Tested on Intel Mac
  • Tested on Apple Silicon Mac
  • Tested on macOS version: macOS 26
  • Ran linter locally: swiftlint --strict --config .swiftlint.yml Sources
  • Ran formatter locally: swiftformat --config .swiftformat Sources
  • Ran tests locally: sh build_with_FI_incremental.sh

Screenshots / Video

Fluid Intelligence settings

  • No UI/visual changes; screenshots/video are not applicable.

Notes

Bluetooth microphone smoke test passed.

Add the low-latency Core Audio capture path, keep modifier-only dictation starts instant, guard stale overlay hides, and remove the experimental Parakeet fast finalization mode.
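
One way to read "guard stale overlay hides" is a generation token: each show hands out a token, and a hide is honored only if its token still matches the latest show. The sketch below is an assumption about the approach, not the code in this PR; `OverlayController`, `show()`, and `hide(token:)` are hypothetical names.

```swift
/// Hypothetical overlay controller; not FluidVoice's NotchOverlayManager.
final class OverlayController {
    private(set) var isVisible = false
    private var generation = 0

    /// Shows the overlay and returns a token identifying this recording session.
    func show() -> Int {
        generation += 1
        isVisible = true
        return generation
    }

    /// Hides the overlay only if `token` belongs to the most recent show.
    /// Returns false when the hide was stale and therefore ignored.
    @discardableResult
    func hide(token: Int) -> Bool {
        guard token == generation else { return false }
        isVisible = false
        return true
    }
}

/// Simulates two back-to-back recordings where the first hide arrives late.
func runOverlayDemo() -> Bool {
    let overlay = OverlayController()
    let first = overlay.show()
    let second = overlay.show()          // second recording starts before the first hide lands
    overlay.hide(token: first)           // stale hide: ignored
    let visibleAfterStaleHide = overlay.isVisible
    overlay.hide(token: second)          // current hide: honored
    return visibleAfterStaleHide && !overlay.isVisible
}
```

With this shape, a delayed hide from an earlier recording cannot dismiss the overlay that belongs to the recording currently in progress.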
@github-actions

github-actions Bot commented Jul 4, 2026

The PR Policy check is blocking this PR because required template information is missing.

Please update the PR description with:

  • Related Issue or Discussion
  • Screenshots / Video

Visual files detected:

  • .github/screenshots/audio-pipeline-optimization-settings.png
  • Sources/Fluid/ContentView.swift
  • Sources/Fluid/Persistence/SettingsStore.swift
  • Sources/Fluid/Services/MenuBarManager.swift
  • Sources/Fluid/Services/NotchOverlayManager.swift
  • Sources/Fluid/UI/SettingsView.swift
  • Sources/Fluid/Views/BottomOverlayView.swift

Screenshots or video are required for UI, UX, settings, onboarding, overlay, menu bar, or visual behavior changes. If this PR has no visual changes, check the no-visual-change box in the template.

If this remains incomplete for 48 hours after opening, the PR may be closed.

@github-actions github-actions Bot added the "needs PR template" (pull request is missing required template content) and "needs screenshots" (pull request needs screenshot or video evidence) labels Jul 4, 2026
@github-actions github-actions Bot removed the "needs PR template" and "needs screenshots" labels Jul 4, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b75c23f8cf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread Sources/Fluid/ContentView.swift Outdated
Comment on lines +3614 to +3615
if SettingsStore.shared.enableTranscriptionSounds {
    TranscriptionSoundPlayer.shared.playStartSound()

P2 Badge Avoid recording the start cue

With transcription sounds enabled on speakers or an open mic, playing the start sound inside onCaptureStarted records FluidVoice's own cue at the beginning of the dictation, because ASRService.start invokes this callback only after the capture pipeline has been enabled and the hardware backend is running. Previously the cue was played before asr.start(), so it stayed outside the captured buffer; move the cue back before capture starts or otherwise exclude it from input.

if self.hotkeyMode == .hold {
    self.markHoldModeStartTriggered(for: behavior.holdModeType)
}
behavior.onHoldStart()

P2 Badge Defer modifier-only starts until combos are ruled out

For modifier-only shortcuts in hold/automatic modes, calling onHoldStart() immediately means any normal chord that begins with the same modifier (for example Fn+C when Fn is the dictation shortcut) starts the mic, overlay, and sounds before otherKeyPressedDuringModifier can mark the press as interrupted. The later release may stop it, but the app still briefly records and steals the shortcut path; keep a short cancellable delay or otherwise avoid starting until the modifier-only press is known to be clean.
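
The "short cancellable delay" suggested above could look roughly like the following. This is a minimal sketch, not the project's hotkey code: `ModifierStartGate`, `modifierPressed(onStart:)`, and `otherKeyPressed()` are hypothetical names, and the 50 ms delay is an arbitrary value chosen for illustration.

```swift
import Foundation

/// Lock-protected flag so the demo can observe a start from another thread.
final class Flag: @unchecked Sendable {
    private let lock = NSLock()
    private var value = false

    func set() {
        lock.lock()
        value = true
        lock.unlock()
    }

    var isSet: Bool {
        lock.lock()
        defer { lock.unlock() }
        return value
    }
}

/// Hypothetical gate: delays a modifier-only start briefly so a chord can cancel it.
final class ModifierStartGate {
    private let queue = DispatchQueue(label: "example.modifier.gate")
    private let delay: TimeInterval
    private var pending: DispatchWorkItem?

    init(delay: TimeInterval) { self.delay = delay }

    /// Call when the modifier goes down; `onStart` stands in for behavior.onHoldStart().
    func modifierPressed(onStart: @escaping @Sendable () -> Void) {
        pending?.cancel()
        let item = DispatchWorkItem { onStart() }
        pending = item
        queue.asyncAfter(deadline: .now() + delay, execute: item)
    }

    /// Call when another key arrives while the modifier is held (for example Fn+C).
    func otherKeyPressed() {
        pending?.cancel()
        pending = nil
    }
}

/// Runs one clean modifier-only press and one chord, then reports which ones started.
func runGateDemo() -> (clean: Bool, chord: Bool) {
    let cleanStarted = Flag()
    let chordStarted = Flag()

    let cleanGate = ModifierStartGate(delay: 0.05)
    cleanGate.modifierPressed { cleanStarted.set() }

    let chordGate = ModifierStartGate(delay: 0.05)
    chordGate.modifierPressed { chordStarted.set() }
    chordGate.otherKeyPressed()          // chord detected before the delay elapses

    Thread.sleep(forTimeInterval: 0.3)   // give any pending start time to fire
    return (cleanStarted.isSet, chordStarted.isSet)
}
```

The trade-off is that every clean modifier-only press pays the delay before capture starts, which works against the PR's goal of instant starts, so the delay would need to be kept as short as chord detection allows.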

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f9be08a0ff

let settings = SettingsStore.shared
let desiredVolume = overrideVolume ?? settings.transcriptionSoundVolume
self.playbackQueue.async { [weak self] in

P2 Badge Keep start cue playback synchronous before capture

When transcription sounds are enabled and the start cue is played immediately before asr.start(), the new queue hop means playStartSound() returns before AVAudioPlayer.play() actually runs, so the direct/Core Audio capture path can start first and record FluidVoice's own cue on speakers or open mics. Fresh evidence beyond the earlier start-cue concern is that playback is now only enqueued on playbackQueue here; either keep the start cue synchronous until player.play() has been dispatched or provide a completion that callers await before enabling capture.
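
The completion-based option mentioned above can be sketched as follows. This assumes a simplified player and is not the real `TranscriptionSoundPlayer`: `CuePlayer`, `playStartSound(onPlaybackIssued:)`, and `EventLog` are hypothetical, and the log entries stand in for `AVAudioPlayer.play()` and `asr.start()`.

```swift
import Foundation

/// Thread-safe event log used only to observe ordering in this sketch.
final class EventLog: @unchecked Sendable {
    private let lock = NSLock()
    private var storage: [String] = []

    func record(_ event: String) {
        lock.lock()
        storage.append(event)
        lock.unlock()
    }

    var events: [String] {
        lock.lock()
        defer { lock.unlock() }
        return storage
    }
}

/// Hypothetical cue player: playback still hops to a background queue,
/// but the caller is told once play() has actually been issued.
final class CuePlayer: @unchecked Sendable {
    private let playbackQueue = DispatchQueue(label: "example.cue.playback")
    private let log: EventLog

    init(log: EventLog) { self.log = log }

    func playStartSound(onPlaybackIssued: @escaping @Sendable () -> Void) {
        playbackQueue.async {
            self.log.record("cue")       // stands in for AVAudioPlayer.play()
            onPlaybackIssued()
        }
    }
}

/// Starts "capture" only from the completion, then returns the observed order.
func runCueDemo() -> [String] {
    let log = EventLog()
    let player = CuePlayer(log: log)
    let done = DispatchSemaphore(value: 0)

    player.playStartSound {
        log.record("capture")            // stands in for asr.start()
        done.signal()
    }
    done.wait()
    return log.events
}
```

Because capture is enabled from the completion, it cannot begin before playback has been issued, regardless of how long the queue hop takes. Whether the cue's audible tail still reaches the microphone depends on the cue length and is not addressed by this sketch.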
