Optimize audio capture startup #517
Conversation
Add the low-latency Core Audio capture path, keep modifier-only dictation starts instant, guard stale overlay hides, and remove the experimental Parakeet fast finalization mode.
The PR Policy check is blocking this PR because required template information is missing. Please update the PR description with:
Visual files detected:
Screenshots or video are required for UI, UX, settings, onboarding, overlay, menu bar, or visual behavior changes. If this PR has no visual changes, check the no-visual-change box in the template. If this remains incomplete for 48 hours after opening, the PR may be closed.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b75c23f8cf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
if SettingsStore.shared.enableTranscriptionSounds {
    TranscriptionSoundPlayer.shared.playStartSound()
With transcription sounds enabled on speakers or an open mic, playing the start sound inside onCaptureStarted records FluidVoice's own cue at the beginning of the dictation, because ASRService.start invokes this callback only after the capture pipeline has been enabled and the hardware backend is running. Previously the cue was played before asr.start(), so it stayed outside the captured buffer; move the cue back before capture starts or otherwise exclude it from input.
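The ordering this comment asks for can be sketched as follows. This is a minimal illustration, not FluidVoice's code: `DictationStarter` and its three closures are hypothetical stand-ins for the cue player, `asr.start()`, and the `onCaptureStarted` callback.

```swift
/// Hypothetical sketch: the type and closure names are assumptions for illustration.
struct DictationStarter {
    let soundsEnabled: Bool
    let playStartCue: () -> Void      // stand-in for the start sound
    let startCapture: () -> Void      // stand-in for asr.start()
    let onCaptureStarted: () -> Void  // runs only once capture is live

    func start() {
        // Trigger the cue first so it is requested before the capture
        // pipeline is enabled, rather than from inside onCaptureStarted.
        if soundsEnabled {
            playStartCue()
        }
        startCapture()
        // Anything that must wait for live capture stays in the callback.
        onCaptureStarted()
    }
}
```

With this shape, `onCaptureStarted` can still drive work that depends on the hardware backend running, while the cue is no longer triggered after capture is live.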
if self.hotkeyMode == .hold {
    self.markHoldModeStartTriggered(for: behavior.holdModeType)
}
behavior.onHoldStart()
Defer modifier-only starts until combos are ruled out
For modifier-only shortcuts in hold/automatic modes, calling onHoldStart() immediately means any normal chord that begins with the same modifier (for example Fn+C when Fn is the dictation shortcut) starts the mic, overlay, and sounds before otherKeyPressedDuringModifier can mark the press as interrupted. The later release may stop it, but the app still briefly records and steals the shortcut path; keep a short cancellable delay or otherwise avoid starting until the modifier-only press is known to be clean.
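One way to express the "short cancellable delay" is a small state machine that starts dictation only after the modifier has been held cleanly for a grace interval. The sketch below is an assumption-laden illustration: `ModifierOnlyGate`, its method names, and the grace interval value are all hypothetical, and time is passed in explicitly so the logic can be tested without timers.

```swift
import Foundation

/// Hypothetical sketch: decides when a modifier-only press may start dictation.
struct ModifierOnlyGate {
    enum State: Equatable {
        case idle
        case pending(since: TimeInterval)
        case started
        case interrupted
    }

    /// Assumed tuning knob; the right value would need real-world testing.
    let graceInterval: TimeInterval
    private(set) var state: State = .idle

    init(graceInterval: TimeInterval) {
        self.graceInterval = graceInterval
    }

    /// The dictation modifier went down: begin waiting instead of starting.
    mutating func modifierDown(at time: TimeInterval) {
        if state == .idle {
            state = .pending(since: time)
        }
    }

    /// Another key arrived while the modifier was pending: treat it as a chord.
    mutating func otherKeyDown() {
        if case .pending = state {
            state = .interrupted
        }
    }

    /// Returns true exactly once, when the press has stayed clean long enough.
    mutating func poll(at time: TimeInterval) -> Bool {
        if case .pending(let since) = state, time - since >= graceInterval {
            state = .started
            return true
        }
        return false
    }

    /// The modifier was released. Returns true if a running dictation should stop.
    mutating func modifierUp() -> Bool {
        let wasStarted = (state == .started)
        state = .idle
        return wasStarted
    }
}
```

The trade-off is that any grace interval adds latency to every modifier-only start, which works against this PR's goal of instant starts, so the interval would have to be kept as short as chord detection allows.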
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f9be08a0ff
let settings = SettingsStore.shared
let desiredVolume = overrideVolume ?? settings.transcriptionSoundVolume
self.playbackQueue.async { [weak self] in
Keep start cue playback synchronous before capture
When transcription sounds are enabled and the start cue is played immediately before asr.start(), the new queue hop means playStartSound() returns before AVAudioPlayer.play() actually runs, so the direct/Core Audio capture path can start first and record FluidVoice's own cue on speakers or open mics. Fresh evidence beyond the earlier start-cue concern is that playback is now only enqueued on playbackQueue here; either keep the start cue synchronous until player.play() has been dispatched or provide a completion that callers await before enabling capture.
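The completion-based option could look roughly like the sketch below. `CuePlayer` and its injected `play` closure are hypothetical stand-ins for the real sound player and its `AVAudioPlayer.play()` call. The only real API used is `DispatchQueue`.

```swift
import Foundation

/// Hypothetical stand-in for the sound player; not the project's actual type.
final class CuePlayer {
    // Serial queue, mirroring the playbackQueue hop described in the comment.
    private let playbackQueue = DispatchQueue(label: "example.cue.playback")
    private let play: () -> Void

    /// `play` stands in for the call that actually starts the cue.
    init(play: @escaping () -> Void) {
        self.play = play
    }

    /// Enqueues the cue, then calls `completion` on the same queue once
    /// `play` has returned, so callers can enable capture only afterwards.
    func playStartSound(completion: @escaping () -> Void) {
        playbackQueue.async {
            self.play()
            completion()
        }
    }
}
```

A caller would start capture from inside `completion` instead of right after `playStartSound` returns. This only orders the start of playback relative to the start of capture; whether the rest of the cue overlaps the recording depends on the cue's length and is a separate question.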
Description
Makes dictation start much faster so FluidVoice is less likely to miss the first word. Also closes the overlay faster after dictation, keeps overlay behavior safe when recordings happen back-to-back, removes an old Parakeet experiment, and makes the faster recording path the default.
Type of Change
Related Issue or Discussion
Tracking PR: #517
Testing
swiftlint --strict --config .swiftlint.yml Sources
swiftformat --config .swiftformat Sources
sh build_with_FI_incremental.sh
Screenshots / Video
Notes
Bluetooth microphone smoke test passed.