Open beta · Linux · v0.2.3

Private dictation
for Linux,
built to stay local.

VocaPulse is a GPU-accelerated speech-to-text application for Linux. Press a hotkey, speak, paste clean text into any app. An optional on-device language model turns spoken thoughts into structured writing — prose, bullet points, or Markdown. Nothing is ever uploaded.

Get early access · See it work

Ctrl+Shift+Space · default hotkey, fully remappable

How it works

From hotkey to clean paste, in seconds.

The recording overlay appears wherever you are working. Speech recognition runs on your own GPU. An optional second hotkey reshapes the raw transcript into the format you need, then pastes it into the window that had focus.

Capture

The overlay captures your microphone directly and handles USB, Bluetooth, and built-in audio devices.

Transcribe

On-device speech recognition on your GPU. Custom dictionary rules apply before the text reaches your clipboard.

Restructure

An optional on-device language model reshapes speech into prose, bullet points, or Markdown — local, offline, on your hardware.

Features

Built for how you actually talk.

VocaPulse is not a browser extension, a wrapper around a cloud API, or a demo. It is a native Linux application designed to sit in your tray, wake on a hotkey, and get out of your way.

Thought-to-Structure

Speak freely. Publish cleanly.

An optional on-device language model rewrites raw dictation into one of three formats: continuous prose, bullet points, or Markdown. Choose which one runs by default, or switch per use with a dedicated hotkey. Nothing about the text — including the decision to restructure it — is visible outside your machine.

three output formats · dedicated hotkey · offline

Three customizable hotkeys

Record, restructure, paste the last transcript.

Each shortcut is remappable in Settings, including bare function keys. Recording works as a toggle or push-to-talk. The overlay never takes keyboard focus, so the paste always lands in the window and cursor position that were active when you started.

Ctrl+Shift+Space · Ctrl+Shift+S · F8
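
The difference between toggle and push-to-talk recording can be sketched as a small state machine. This is an illustration only, not VocaPulse's code: the `Mode` and `Recorder` names are hypothetical, and a real handler would also need to ignore key auto-repeat events.

```python
from enum import Enum


class Mode(Enum):
    TOGGLE = "toggle"
    PUSH_TO_TALK = "push_to_talk"


class Recorder:
    """Tracks whether recording is active for one hotkey (hypothetical model)."""

    def __init__(self, mode: Mode) -> None:
        self.mode = mode
        self.recording = False

    def on_key_down(self) -> None:
        if self.mode is Mode.TOGGLE:
            # Each press flips the state: first press starts, second stops.
            self.recording = not self.recording
        else:
            # Push-to-talk records only while the key is held.
            self.recording = True

    def on_key_up(self) -> None:
        # Releasing the key matters only in push-to-talk mode.
        if self.mode is Mode.PUSH_TO_TALK:
            self.recording = False


toggle = Recorder(Mode.TOGGLE)
toggle.on_key_down()
toggle.on_key_up()
print(toggle.recording)  # True: still recording after release

ptt = Recorder(Mode.PUSH_TO_TALK)
ptt.on_key_down()
ptt.on_key_up()
print(ptt.recording)  # False: recording stopped on release
```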

Per-user dictionary

Your vocabulary, recognised on the first try.

Define whole-word replacements for names, technical jargon, brand names, and project codenames. Rules apply before the transcript leaves the recognition engine, so the text arriving in your editor is already correct. Rules live in a plain JSON file you can sync between machines.

case-insensitive · whole-word · portable JSON
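
As a rough illustration of case-insensitive, whole-word replacement driven by a JSON file, here is a minimal sketch. The flat `{"spoken form": "replacement"}` layout and the function names are assumptions made for the example; the actual VocaPulse rule format is not documented here.

```python
import json
import re


def load_rules(json_text: str) -> dict[str, str]:
    """Parse a flat {"spoken form": "replacement"} mapping (assumed layout)."""
    return {str(k): str(v) for k, v in json.loads(json_text).items()}


def apply_rules(text: str, rules: dict[str, str]) -> str:
    """Replace whole words only, ignoring case, in a single pass."""
    if not rules:
        return text
    # Longest keys first, so multi-word rules win over shorter overlapping ones.
    keys = sorted(rules, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in rules.items()}
    return pattern.sub(
        lambda m: lookup.get(m.group(0).lower(), m.group(0)), text
    )


RULES = load_rules('{"voca pulse": "VocaPulse", "k8s": "Kubernetes"}')
print(apply_rules("Deploy Voca Pulse on K8S, not k8sx.", RULES))
# Deploy VocaPulse on Kubernetes, not k8sx.
```

The lookarounds keep `k8sx` untouched, because the match would end in the middle of a word.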

Encrypted transcript history

AES-256-GCM, bound to your machine.

An optional local history of every transcription. Each entry is encrypted at rest. The encryption key is derived from a hardware identifier unique to your computer, so copying the database file to another machine produces an unreadable blob. Searchable locally; never uploaded.

local SQLite · AES-256-GCM · HMAC(machine-id)
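
One way a machine-bound key could be derived is sketched below, using only the Python standard library. The tag line above names HMAC over a machine ID, but the hash function, the identifier source, and the keying are not specified, so HMAC-SHA256, the `/etc/machine-id` path, and the `CONTEXT` label are all assumptions. The AES-256-GCM step itself is omitted because it needs a third-party library.

```python
import hashlib
import hmac
from pathlib import Path

# Hypothetical label that ties the derived key to a single purpose.
CONTEXT = b"transcript-history-v1"


def read_machine_id(path: str = "/etc/machine-id") -> bytes:
    """Read a machine identifier; this path is an assumption for the sketch."""
    return Path(path).read_text(encoding="ascii").strip().encode("ascii")


def derive_key(machine_id: bytes, context: bytes = CONTEXT) -> bytes:
    """HMAC-SHA256 keyed by the machine ID gives 32 bytes, the AES-256 key size."""
    return hmac.new(machine_id, context, hashlib.sha256).digest()


key_a = derive_key(b"machine-a")
key_b = derive_key(b"machine-b")
print(len(key_a), key_a != key_b)  # 32 True
```

Because a different machine ID produces a different key, a database copied to another computer cannot be decrypted there.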

Runs on any modern GPU

One binary. AMD, NVIDIA, or Intel.

Speech recognition and language-model inference both run on the Vulkan graphics API, supported by every major GPU vendor without proprietary toolkits. On machines without a capable GPU, the application falls back automatically to CPU — still fast enough for conversational use.

Vulkan backend · CPU fallback · single binary

Works in every text field

Browser, editor, terminal, messenger.

A persistent virtual keyboard, initialised once at startup, emits keystrokes into whichever window currently has focus. No per-application integration, no browser extension, and no special permissions beyond write access to /dev/uinput. If you can paste, VocaPulse works.

Wayland clipboard · /dev/uinput keystrokes

Who it is for

Built for the people who actually write for a living.

VocaPulse is designed around six common workflows. Not marketing personas — real patterns of daily work we watched and measured.

Writers & managers

Long-form email without the RSI

Dictate a full reply, restructure it into short paragraphs, paste directly into your email client. Faster than typing, cleaner than speaking.

Software engineers

Describe the code, not the grammar

Dictate commit messages, pull-request descriptions, and code comments. Your custom dictionary fixes project-specific terms before the text reaches your editor.

Journalists & students

Interview-to-notes in one pass

Capture what you heard from an interview or lecture directly into your notes application. Convert the raw transcript into bullet points with a single shortcut.

Thinkers out loud

Stream-of-consciousness, tidied

Speak everything you are considering. The on-device language model extracts the points worth keeping. No editor required, no context lost.

Chat-first teams

Messengers at speaking pace

Dictate directly into any composer. Push-to-talk behaves like familiar voice modes — but your words are delivered as text, not audio.

Accessibility-first

Typing-free workflows

Keyboard-driven when you want it, voice-driven when you do not. Suitable for repetitive-strain injury recovery, mobility constraints, or a simple preference for speaking over typing.

Privacy by architecture

Privacy isn't a setting. It's the design.

Most dictation software records your voice, sends it to a remote server, and asks you to trust a privacy policy. VocaPulse takes a different approach. Every step of the pipeline — microphone capture, speech recognition, restructuring, clipboard delivery — runs locally on your machine. Your audio is not uploaded, not retained on our infrastructure, and not retrievable by us, our staff, or any third party. It cannot be. It never leaves your device.

The sections below describe, in plain language and with specific technical choices, what that commitment actually means.

Audio is never persisted.

Voice samples exist only in volatile memory during recognition and are discarded the moment the transcript is produced. The application has no filesystem code path for writing raw audio, and no configurable option that would create one.

No cloud dependencies during use.

Routine operation — microphone capture, speech recognition, restructuring, and paste — opens zero network connections. The application contacts our servers only to check for updates on a schedule you control, and to download language models when you explicitly initiate a download.

No telemetry. No behavioral analytics.

VocaPulse does not ship crash reporting, usage pings, A/B feature flags, or behavioral tracking of any kind. We learn that something is broken only because you tell us. No session replay, no funnel analytics, no identifiers.

Transcript history, when enabled, is encrypted at rest.

Opting into local history stores each entry in a SQLite database on your machine. Every row is encrypted with AES-256-GCM. The encryption key is derived from a hardware identifier unique to your computer, which means the database file is unreadable if copied elsewhere.

Usage quotas are enforced client-side.

The free tier's daily word limit is counted and enforced by the application itself. Our servers neither receive nor store transcription content or word counts. We cannot audit the limit, because we collect no data that would let us audit it. Honor system, by architecture.
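
A purely local daily word counter might look like the sketch below. The class name and the limit of five words are hypothetical, chosen only to keep the example short; the point is that nothing in it touches the network.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DailyWordQuota:
    limit: int                    # hypothetical daily allowance
    day: Optional[date] = None    # the day the current count belongs to
    used: int = 0

    def try_consume(self, transcript: str, today: date) -> bool:
        """Count the transcript's words against today's allowance."""
        if self.day != today:
            # First use on a new day: start counting from zero again.
            self.day, self.used = today, 0
        words = len(transcript.split())
        if self.used + words > self.limit:
            return False
        self.used += words
        return True


quota = DailyWordQuota(limit=5)
monday, tuesday = date(2025, 1, 6), date(2025, 1, 7)
print(quota.try_consume("one two three", monday))   # True
print(quota.try_consume("four five six", monday))   # False: would exceed 5
print(quota.try_consume("four five six", tuesday))  # True: counter reset
```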

Open source where it matters.

The storage schema, the cryptographic parameters, and the platform integrations are documented in the public product architecture. You do not need to trust us — the security posture is inspectable.

In plain English: we sell you software, not your voice.

There is no business model in the design that could be improved by looking at your data. The software you install today is complete. The privacy posture described here is a technical property of the application, not a promise that could be quietly reversed in a future release without rearchitecting everything.

System requirements

Runs on the hardware you already own.

Linux-first. Wayland-native. Operational on a laptop up to five years old, with graceful CPU fallback for machines without a dedicated GPU.

Operating system

Linux, Wayland

KDE Plasma 5.27+, GNOME 45+, any wlroots compositor

GPU

Any Vulkan-capable

AMD, NVIDIA, Intel — CPU fallback available

VRAM for structuring

4 GB small · 8 GB medium · 16 GB large

dictation alone runs on 2 GB or CPU; structure models scale with VRAM

RAM

4 GB min · 8 GB recommended

Disk

~500 MB

application binary plus the default recognition model

Microphone

Built-in, USB, or Bluetooth

Web Audio API handles A2DP cleanly — no 3-second cutoffs

Keystroke injection

/dev/uinput access

access is typically granted through the `input` group on modern distros
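
To check the keystroke-injection requirement before installing, a short script along these lines could help. It is a sketch, not an official diagnostic: it only tests whether the current user can write to the device node, and the function name is made up for the example.

```python
import os


def can_inject_keystrokes(device: str = "/dev/uinput") -> bool:
    """True when the current user may write to the uinput device node."""
    return os.access(device, os.W_OK)


if __name__ == "__main__":
    if can_inject_keystrokes():
        print("uinput is writable: keystroke injection should work")
    else:
        print("no write access to /dev/uinput: check group membership or udev rules")
```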

Native deb and rpm packages

Signed installers for Debian, Ubuntu, Fedora, and derivatives. First-class integration with your distribution's package manager.

Flatpak in development

Portal-based audio access introduces a set of open issues we intend to fully resolve before publishing a Flatpak build.

AppImage intentionally not supported

The AppImage runtime cannot reliably access the required audio subsystem. We decided against shipping a format that would silently fail for a meaningful subset of users.

Pricing

Free during the beta.

VocaPulse is in active development and every feature is unlocked for early users. A commercial model will be announced — in full, with specific numbers — before any payment is ever requested.

Beta

€0 during beta

On-device speech recognition, GPU-accelerated
Thought-to-Structure restructuring
Custom dictionary
Three customizable hotkeys
Encrypted transcript history (optional)
Automatic pause for media players

Get early access

One email. No card. No spam.

Frequently asked

Answers, before you install.

Does my voice ever leave my computer?

No. Microphone capture, speech recognition, and optional restructuring all run locally on your computer. Voice samples exist only in memory during processing and are discarded immediately after. The only network traffic VocaPulse ever initiates is an on-demand model download and a manifest check for application updates.

Start dictating on your own terms.

Private. Fast. Fully offline. Installation takes a few minutes, and you will be typing with your voice within five.

Get early access · See it work again
