Sounds like there's plenty of interest in these kinds of tools. I'm not a huge fan of API transcription, given how good local models are.
I built https://github.com/bwarzecha/Axii to keep EVERYTHING local and fully open source, so it can easily be used at any company. No data is sent anywhere.
arcologies1985 | 2026-02-16 21:34 UTC
Could you make it use Parakeet? That's an offline model that runs very quickly even without a GPU, so you could get much lower latency than using an API.
zachlatta | 2026-02-16 21:56 UTC
I love this idea, and originally planned to build it using local models, but to have post-processing (that's where you get correctly spelled names when replying to emails / etc), you need to have a local LLM too.
If you do that, the total pipeline takes too long for the UX to be good (5-10 seconds per transcription instead of <1s). I also had concerns around battery life.
I used to use VoiceInk, but I found Spokenly [0] to be easier to use for post-processing the output, and more stable overall (local version with Parakeet or whisper is free).
I just learned about Handy in this thread and it looks great!
I think the biggest difference between FreeFlow and Handy is that FreeFlow implements what Monologue calls "deep context", where it post-processes the raw transcription with context from your currently open window.
This fixes misspelled names if you're replying to an email / makes sure technical terms are spelled right / etc.
The original hope for FreeFlow was for it to use all local models like Handy does, but with the post-processing step the pipeline took 5-10 seconds instead of <1 second with Groq.
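To make the "deep context" step concrete, here is a minimal Python sketch of the shape described above: take the raw transcript, attach text from the active window, and ask an LLM to fix names and terms. This is not FreeFlow's actual code. `WindowContext`, `build_prompt`, `post_process`, the prompt wording, and the `llm` callable are all hypothetical names chosen for illustration; any local or hosted model could sit behind `llm`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class WindowContext:
    """Hypothetical container for what was captured from the active window."""
    app_name: str
    visible_text: str


def build_prompt(raw_transcript: str, ctx: WindowContext,
                 max_context_chars: int = 2000) -> str:
    """Combine the raw transcript with (truncated) window text into one prompt."""
    context = ctx.visible_text[:max_context_chars]
    return (
        "Clean up the dictated text below. Fix the spelling of names and "
        "technical terms using the window context. Do not add new content.\n\n"
        f"Active app: {ctx.app_name}\n"
        f"Window context:\n{context}\n\n"
        f"Raw transcript:\n{raw_transcript}\n"
    )


def post_process(raw_transcript: str, ctx: WindowContext,
                 llm: Callable[[str], str]) -> str:
    """Run the optional post-processing step, falling back to the raw text."""
    if not raw_transcript.strip():
        return raw_transcript  # nothing to clean up
    try:
        cleaned = llm(build_prompt(raw_transcript, ctx)).strip()
    except Exception:
        # If the model is unavailable, raw text is better than no text.
        return raw_transcript
    return cleaned or raw_transcript
```

Falling back to the raw transcript keeps dictation usable when the post-processing model is slow or offline, which matters given the latency trade-off discussed here.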
stavros | 2026-02-16 23:02 UTC
As a very happy Handy user, I can confirm it doesn't do that. It will be interesting to see if it works better. I'll give FreeFlow a shot, thanks!
lemming | 2026-02-17 00:38 UTC
Could you go into a little more detail about the deep context - what does it grab, and which model is used to process it? Are you also using a groq model for the transcription?
sipjca | 2026-02-17 02:53 UTC
There's an open PR in the repo, which will be merged, that adds this support. Post-processing is an optional feature, and when it is enabled, end-to-end latency can still easily be under 3 seconds.
k9294 | 2026-02-17 10:34 UTC
You can try ottex for this use case - it has both context capture (app screenshots) and native LLM support, meaning it can send audio AND a screenshot directly to gemini 3 flash to produce the bespoke result.
hendersoon | 2026-02-16 22:48 UTC
Yes, I also use Handy. It supports local transcription via Nvidia Parakeet TDT2, which is extremely fast and accurate. I also use gemini 2.5 flash lite for post-processing via the free AI studio API (post-processing is optional and can also use a locally-hosted LM).
stavros | 2026-02-16 23:02 UTC
I use Handy as well, and love it.
vogtb | 2026-02-16 23:15 UTC
Handy rocks. I recently had minor surgery on my shoulder that required me to be in a sling for about a month, and I thought I'd give Handy a try for dictating notes and so on. It works phenomenally well for most speech-to-text use cases - homonyms included.
irrationalfab | 2026-02-17 00:39 UTC
Handy is genuinely great and it supports Parakeet V3. It’s starting to change how I "type" on my computer.
d4rkp4ttern | 2026-02-17 02:30 UTC
Big fan of Handy, and it’s cross-platform as well. Parakeet V3 gives the best experience, with very fast and accurate-enough transcriptions when talking to AIs that can read between the lines. It does have stuttering issues though. My primary use for these tools is talking to coding agents.
But a few weeks ago someone on HN pointed me to Hex, which also supports Parakeet-V3, and incredibly enough, is even faster than Handy because it’s a native macOS-only app that leverages CoreML/Neural Engine for extremely quick transcriptions. Long ramblings transcribed in under a second!
I installed a few different STT apps at the same time that used Parakeet and I think they disagreed with each other. But Hex otherwise would’ve won for me I think. Wanna reformat the Mac & try again (been a while anyway).
Thanks for the recommendation! I picked the smallest model (Moonshine Base @ 58MB), and it works great for transcribing English.
Surprisingly, it produced a better output (at least I liked its version) than the recommended but heavy model (Parakeet V3 @ 478 MB).
sipjca | 2026-02-17 05:37 UTC
Great feedback :) also support for the v2 versions of the moonshine models should be out today!
smcleod | 2026-02-17 04:18 UTC
Handy is nothing short of fantastic, really brilliant when combined with Parakeet v2!
arach | 2026-02-17 05:49 UTC
Handy's great! I find the latency to be just a bit too much for my taste. Like half the people on this thread, I built my own, but with a bit more emphasis on speed.
I didn't try Handy, but I've been using Whisper-Key. It's a super simple, get-out-of-your-way, all-local, single-file executable (portable, so zero install too). That's for Windows; idk about the Mac version.
the astroturfing here off topic of op post is unbearable
spelk | 2026-02-16 22:38 UTC
Does anyone know of an effective alternative for Android?
jskherman | 2026-02-16 22:57 UTC
Check out the FUTO keyboard or FUTO voice input apps. They only use the Whisper models so far, though.
xnx | 2026-02-16 23:30 UTC
Does the Android keyboard transcription not work for your needs?
baseh | 2026-02-17 17:04 UTC
For Android I find Google GBoard transcription most accurate and pretty solid.
uncharted9 | 2026-02-17 00:01 UTC
I have been using VoiceFlow. It works incredibly well and uses Groq to transcribe using the Whisper V3 Turbo model. You can also use it in an offline scenario with an on-device model, but I am mostly connected to the internet whenever I am transcribing.
windthrown | 2026-02-17 05:44 UTC
I installed Whisper+ through FDroid and it works well for my basic needs. Only 30s at a time but you can append multiple recordings to the same transcript: https://github.com/woheller69/whisperIMEplus
lemming | 2026-02-16 22:56 UTC
Is it possible to customise the key binding? Most of these services let you customise the binding, and also support toggle for push-to-talk mode.
vesterde | 2026-02-16 23:00 UTC
Since many are asking about apps with similar capabilities: I’m very happy with MacWhisper. It has Parakeet and near-instant transcription of my lengthy monologues. All local.
Edit: Ah but Parakeet I think isn’t available for free. But very worthwhile single purchase app nonetheless!
SOLAR_FIELDS | 2026-02-17 05:08 UTC
I actually got MacWhisper originally for speech to text so I could talk to my machine like a crazy person. I realized I didn't like doing that but the actual killer feature for buying it that I really enjoy is the fully local transcription of meetings, with a nice little button to start recording that pops up when you launch zoom, teams, etc. It means I can safely record meetings and encrypt them locally and keep internal notes without handing off all of that to some nebulous cloud platform.
I had previously used Hyprnote to record meetings in this way - and indeed I still use that as a backup, it's a great free option - but the meeting prompting to record and better transcription offered by Macwhisper is a much better experience.
arach | 2026-02-17 06:03 UTC
I initially built Talkie to talk to it like a crazy person when I was on long runs and ideas would pop into my head haha
Been a power user of SuperWhisper and Wispr Flow for a long time and eventually decided to unify those flows - memos & dictations, everything is a file and local first, BYOK
baxtr | 2026-02-16 23:00 UTC
Is there a tool that preserves the audio? I want both, the transcript and the audio.
heyalexej | 2026-02-16 23:23 UTC
At a quick glance: FreeFlow already saves WAV recordings for every transcript to ~/Lib../App../FreeFlow/audio/ with UUIDs linking them to pipeline history entries in CoreData. The audio files are automatically deleted, though, when their associated history entries are deleted. Should be a quick fix. I recently did the same for hyprvoice, for debugging and auditing.
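For anyone wondering what that "quick fix" might look like in principle, here is a small in-memory Python sketch of the relationship described: each history entry and its audio share a UUID, and deleting the entry normally removes the audio too. It is only a model of the behavior, not FreeFlow's Swift/CoreData code. `HistoryStore` and the `keep_audio_on_delete` flag are hypothetical names.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class HistoryStore:
    """Toy model: transcripts and audio are keyed by the same UUID string."""
    entries: dict[str, str] = field(default_factory=dict)   # id -> transcript
    audio: dict[str, bytes] = field(default_factory=dict)   # id -> WAV bytes
    keep_audio_on_delete: bool = False  # hypothetical setting

    def add(self, transcript: str, wav: bytes) -> str:
        """Store a transcript and its recording under a fresh UUID."""
        entry_id = str(uuid.uuid4())
        self.entries[entry_id] = transcript
        self.audio[entry_id] = wav
        return entry_id

    def delete_entry(self, entry_id: str) -> None:
        """Delete the history entry; drop the audio unless told to keep it."""
        self.entries.pop(entry_id, None)
        if not self.keep_audio_on_delete:
            self.audio.pop(entry_id, None)
```

With `keep_audio_on_delete=True` the recording outlives its history entry, which is the behavior being asked for upthread.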
BizarroLand | 2026-02-17 20:17 UTC
Handy appears to keep the audio clips. It has a section in the settings to limit how many of those it keeps, and there does not appear to be an upper limit, but it does have to be set manually. (I set mine to 99,999.)
It would be nice if below 0 it had a -1 option to keep all recordings.
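The suggested `-1` convention is easy to express. The Python sketch below is an illustration only, not Handy's implementation: `select_for_deletion` and `keep_limit` are hypothetical names, and recordings are modeled as `(filename, modified_time)` pairs.

```python
def select_for_deletion(recordings: list[tuple[str, float]],
                        keep_limit: int) -> list[str]:
    """Return the filenames to delete so only the newest `keep_limit` remain.

    A negative limit (e.g. -1) means "keep everything".
    """
    if keep_limit < 0:
        return []
    # Newest first, so everything past the limit is the oldest material.
    newest_first = sorted(recordings, key=lambda rec: rec[1], reverse=True)
    return [name for name, _ in newest_first[keep_limit:]]
```

Treating any negative value as "no limit" avoids having to pick an arbitrarily large number like 99,999.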
Comments
Some day!
It’s free and offline
[0] https://github.com/EpicenterHQ/epicenter
https://news.ycombinator.com/item?id=36460246
https://blazingbanana.com/work/whistle
https://github.com/Beingpax/VoiceInk
[0]: https://spokenly.app/
just bought the one-time licence. this is the future of AI pricing - local models and one-time fee.
It’s now my favorite fully local STT for MacOS:
https://github.com/kitlangton/Hex
My comment on this from a month back: https://news.ycombinator.com/item?id=46637040
https://usetalkie.com
[1] https://github.com/PinW/whisper-key-local
https://handy.computer/