# 10 Bits Per Second


I bought a [DJI Mic Mini](https://www.amazon.com/dp/B0FQJH54TR?tag=kevindotmd-20) last week. It's a wireless lavalier microphone—the kind YouTubers clip to their shirts. It has a tiny transmitter that weighs 10 grams, a receiver with a USB-C adapter, and 400 meters of wireless range. It was designed for content creators filming vlogs.
I'm using it to write code.
If you're not already familiar with the shift: since late 2025, AI coding agents have fundamentally changed how software gets built. Tools like Anthropic's [Claude Code](https://docs.anthropic.com/en/docs/claude-code) and OpenAI's [Codex](https://openai.com/index/introducing-codex/) live in your terminal, read your codebase, write code, run tests, and commit to git—autonomously.
Andrej Karpathy coined the term "vibe coding" in February 2025<sup><a href="#cite-4" id="ref-4">[4]</a></sup>, describing a workflow where you speak your intent and the AI handles the implementation. By the end of the year, "vibe coding" was Collins Dictionary's Word of the Year, Claude Code alone was responsible for 4% of all public GitHub commits and growing fast<sup><a href="#cite-5" id="ref-5">[5]</a></sup>, and three companies—GitHub Copilot, Claude Code, and Cursor—had each crossed $1 billion in annual revenue.

I got the [two-pack](https://www.amazon.com/dp/B0FQJH54TR?tag=kevindotmd-20) (2 transmitters + 1 receiver) for $59. It comes with windscreens, magnetic clips for attaching to your shirt, a charging dock, a USB-C splitter cable so you can charge both mics at the same time, and a carrying pouch. The receiver is tiny—just a small USB-C dongle that plugs directly into my laptop. Having two transmitters means I can swap one in while the other charges, though with 11.5 hours of battery life per transmitter I haven't actually needed to swap yet. When I'm not coding, the same USB-C receiver plugs into my phone for recording content.
## The Setup

I pair it once, and then I just talk. It doesn't matter if I stand up to stretch, walk to the kitchen to get water, or pace around while thinking through an architecture problem. The mic picks up my voice clearly from anywhere because it's physically attached to me.
I use [VoiceInk](https://tryvoiceink.com/), an open source speech-to-text app, for transcription. Here are my stats after a few months:

That's 609,430 keystrokes I didn't have to type.
Before the DJI Mic Mini, I was using my MacBook's built-in microphone. It worked fine if I was sitting right in front of my laptop, enunciating clearly, speaking in its direction. But as soon as I stood up and walked across the room, the transcription quality fell off a cliff.
Another issue was volume. If you're talking to AI all day, you don't want to be projecting your voice toward a laptop across the room. That's a recipe for vocal strain. You'd think the solution is to just speak quietly, but there's a floor to how quiet you can go before the laptop mic can't pick you up. And going all the way down to a whisper is actually worse for you—otolaryngologists have found that whispering can cause *more* stress on your vocal cords than normal speech, because many people tighten the muscles around the voice box to compensate<sup><a href="#cite-3" id="ref-3">[3]</a></sup>.
A lavalier mic solves both problems. The vocal strain one is obvious—it's inches from your mouth, so I can talk at a quiet indoor voice without projecting. But the accuracy improvement is just as important. Transcription models like Whisper and Parakeet can make errors, but those models are only as good as their input audio. A laptop mic across the room picks up fan noise, room reverb, and a quieter voice signal. A mic clipped to your chest picks up a clean, close-range signal every time. Better source data means fewer transcription errors, which means less friction, which means you actually stay in the voice workflow instead of getting frustrated and reaching for the keyboard.
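The "cleaner signal" argument can be made concrete with a little level arithmetic. This is an illustrative sketch with made-up numbers, not measurements from my setup: signal-to-noise ratio is the voice level over the noise level in decibels, and moving the capsule close to your mouth raises the voice term without touching the room's noise floor.

```python
import math

def rms(samples: list[float]) -> float:
    """Root-mean-square level of a waveform."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def snr_db(voice_rms: float, noise_rms: float) -> float:
    """Signal-to-noise ratio in decibels."""
    return 20 * math.log10(voice_rms / noise_rms)

# Illustrative levels (normalized, full scale = 1.0), not measurements:
noise = 0.01          # room noise floor picked up by either mic
lav_voice = 0.30      # voice a few inches from a chest-mounted capsule
laptop_voice = 0.03   # the same voice from across the room

print(f"lav mic SNR:    {snr_db(lav_voice, noise):.1f} dB")    # ~29.5 dB
print(f"laptop mic SNR: {snr_db(laptop_voice, noise):.1f} dB")  # ~9.5 dB
```

Every extra decibel of SNR is headroom the transcription model doesn't have to spend separating your voice from the room.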
I'm running this through a speech-to-text tool that feeds directly into Claude Code. I describe what I want, the agent builds it, I look at the result on my phone, I describe what's wrong, and the agent fixes it. The entire feedback loop is voice-driven. My hands never touch the keyboard—not even for approving tool calls. I handle those with [ControllerKeys](https://kevintang.xyz/apps/controller-keys/), an app I built that lets me control macOS entirely with an Xbox controller. I wrote about it in [Every Shortcut Within Reach](/every-shortcut-within-reach.md). Between the mic and the controller, my keyboard is essentially a decoration.
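My actual pipeline is VoiceInk feeding Claude Code, but the shape of the loop is simple enough to sketch. Here's a hypothetical minimal version, assuming the open-source `whisper` package for transcription and a coding-agent CLI that accepts a one-shot prompt (the command name and `-p` flag are placeholders for whatever agent you run, not my exact configuration):

```python
# Hypothetical sketch of a voice -> transcription -> coding-agent loop.
import subprocess
from typing import Callable

def transcribe_with_whisper(audio_path: str) -> str:
    """Transcribe a clip with the open-source whisper package (assumed installed)."""
    import whisper  # pip install openai-whisper
    model = whisper.load_model("base")  # small, fast model
    return model.transcribe(audio_path)["text"].strip()

def dictate_to_agent(
    audio_path: str,
    transcribe: Callable[[str], str] = transcribe_with_whisper,
    agent_cmd: str = "claude",
) -> str:
    """Turn a recorded clip into a one-shot prompt for a coding-agent CLI."""
    prompt = transcribe(audio_path)
    # Hand the transcript to the agent as a single non-interactive prompt.
    result = subprocess.run(
        [agent_cmd, "-p", prompt], capture_output=True, text=True
    )
    return result.stdout

# dictate_to_agent("clip.wav")  # e.g. "fix the sidebar overlap on mobile"
```

The transcriber is injectable so you can swap in VoiceInk, Parakeet, or anything else that turns audio into text; the agent end is just a subprocess call.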
## 10 Bits Per Second
In 2024, Caltech researchers published a paper called "The Unbearable Slowness of Being"<sup><a href="#cite-1" id="ref-1">[1]</a></sup> that quantified something neuroscientists had suspected for decades: conscious human thought runs at approximately 10 bits per second.
Ten. Bits. Per second.
Your eyes take in about a billion bits per second. Your ears, your skin, your proprioceptive system—all of it operates at enormous bandwidth. But the conscious part of your brain, the part that makes decisions and forms intentions and chooses what to do next, operates at a rate that would embarrass a 1970s modem.
This isn't a metaphor. The researchers surveyed decades of studies across wildly different tasks—typing, speaking, solving Rubik's Cubes, playing video games, reading—and found the same speed limit everywhere. About 10 bits per second of behavioral output, regardless of the task. Our sensory systems take in billions of bits per second, but somehow, at the level of conscious thought, we bottleneck to a trickle.
The implication is startling: **no input device can ever be faster than you can think.**
## The Bandwidth Hierarchy
Here's how our current input methods stack up:
| Method | Words Per Minute |
|:-------------------------------|----------------:|
| Eye tracking | ~20 WPM |
| Neuralink BCI (current best) | ~40 WPM |
| Typing (average developer) | ~54 WPM |
| Typing (fast developer) | ~100 WPM |
| Natural speech | ~100–150 WPM |
Voice wins. Not by a little—by a lot compared to average typing, and roughly tied with the fastest typists.
But here's the thing that surprised me: Neuralink, the brain-computer interface that is supposed to be the future of human-computer interaction, currently tops out at 40 words per minute<sup><a href="#cite-2" id="ref-2">[2]</a></sup>. That's slower than the average developer types. Their ambitious goal with the VOICE clinical trial is 140 WPM—which is just... normal talking speed.
The sci-fi dream of thinking commands directly into a computer runs straight into the Caltech wall. Even if you could read neural signals perfectly, the conscious thoughts generating those signals only produce 10 bits per second. The bottleneck was never the interface. It was always the brain.
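The WPM figures above can be sanity-checked against the 10 bits/s number with back-of-envelope information theory. Assuming Shannon's classic estimate of roughly 1 bit of entropy per character of English and an average word length of about 5 characters (both approximations):

```python
# Convert words-per-minute rates into rough information rates.
# Assumptions: ~1 bit of entropy per English character (Shannon's
# classic estimate) and ~5 characters per average word.
BITS_PER_CHAR = 1.0
CHARS_PER_WORD = 5.0

def bits_per_second(wpm: float) -> float:
    """Approximate information rate of an input method, in bits/s."""
    return wpm * CHARS_PER_WORD * BITS_PER_CHAR / 60.0

for name, wpm in [
    ("Eye tracking", 20),
    ("Neuralink BCI", 40),
    ("Typing (average)", 54),
    ("Typing (fast)", 100),
    ("Natural speech", 150),
]:
    print(f"{name:<17} {wpm:>3} WPM  ~{bits_per_second(wpm):.1f} bits/s")
```

Natural speech at 150 WPM works out to about 12.5 bits/s—right around the ceiling the Caltech paper measures. Speech runs at roughly the rate conscious thought can emit anyway.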
## Why Voice Beats Typing for Programming
When I say "voice is faster," I don't just mean words per minute. I mean the full loop.
**You can talk while looking at something else.** Say you're working on a web app. You navigate to the page in your browser, start talking: "the sidebar is overlapping the main content, and the nav links are wrapping to two lines." Then you pick up your phone, pull up the same page on mobile, and keep going: "on mobile the hamburger menu isn't opening, and the hero image is way too tall." You put your phone down, switch back to your terminal, end the transcription, and it's all there as one prompt. You stayed in the same flow the entire time—looking at the thing, talking about the thing, never breaking out of that observational mode to sit down and translate your thoughts into typed text.
**You can talk while thinking.** This sounds obvious but it's genuinely different from typing. When I type, I think first, then type. There's a serialization step—I formulate the thought, then I encode it into keystrokes. When I speak, the thought and the expression happen almost simultaneously. The bandwidth of speech is close enough to the bandwidth of thought that there's barely any buffering.
**Errors don't matter.** I wrote about this in [Misheard Lyrics for Robots](/misheard-lyrics-for-robots.md)—I said "run make install build from source" and my transcription software heard "Ryan Lacon stall book for source." Claude Code ran `make install BUILD_FROM_SOURCE=1` anyway. When your listener is an LLM, transcription errors are just noise that gets filtered out. The error tolerance of natural language is orders of magnitude higher than the error tolerance of a keyboard.
**You can move.** The DJI Mic Mini on my shirt handles voice input. The DualSense controller in my hand handles everything else—approving tool calls, switching windows, scrolling, navigating—via [ControllerKeys](https://kevintang.xyz/apps/controller-keys/). Together, they make me completely untethered. I live in a small unit and I haven't had a desk in about three years. My back was starting to hurt from hunching over a laptop on the couch, which is part of why I built ControllerKeys in the first place—I needed a way to work that didn't chain me to one position. Now I can stand up, walk to the kitchen, pace around while thinking through a problem, and keep working the entire time. The only limit I've found is wireless range—thick panes of glass can cut the signal short, but barring that, I can work from anywhere in my apartment.
## The Real Unlock: Compression of Intent
Here's the argument that goes beyond raw words-per-minute.
AI coding agents have compressed programming from writing code to describing intent. Instead of typing 47 lines of Swift to implement a camera animation, you say "make the camera do a cinematic swoop into the photo when you tap it" and the agent writes the Bézier math for you. You could type that sentence too — but once programming becomes a series of short, conversational exchanges, voice is the natural medium. You're not writing code anymore. You're just talking. And talking is what voice was literally designed for.
The advantages compound from there. You talk while staring at the bug on your phone. You talk while pacing through an architecture problem. You talk while the agent is still finishing its last task, queueing up your next thought. There's no context-switch to the keyboard, no breaking out of the flow to sit down and type. The conversation just keeps going.
## A Content Creator Accessory Is Now a Programming Tool
The DJI Mic Mini costs $59. It was built for people who make YouTube videos and TikToks. The product page shows influencers filming themselves cooking and traveling.
I'm using it to debug RealityKit coordinate transforms.
There's something funny about the fact that the highest-bandwidth programming peripheral you can buy in 2026 isn't a mechanical keyboard or an ergonomic split board—it's a lavalier microphone originally designed for vloggers. The tool categories are converging. Content creation and software engineering now share the same input device because they share the same upstream constraint: getting human intent into a computer as fast as possible.
I think five years from now — assuming on-device mics don't get significantly better — a wireless mic will be as standard in a developer's kit as a second monitor. Not because everyone will be recording themselves—but because talking to your AI agent is faster than typing to it, and a good mic is the difference between "works okay" and "works every time."
## The 10-Bit Ceiling
The Caltech paper ends with a question that nobody has answered: *why* is human conscious thought so slow? We have 86 billion neurons, each capable of transmitting hundreds of bits per second, yet we think one thought at a time at 10 bits per second. The researchers suggest we're limited not by hardware but by some deep architectural constraint—perhaps the brain can only maintain one coherent "thread" of consciousness at a time.
If that's true, then the optimizations we should be chasing aren't about faster interfaces. They're about richer compression—making each of those 10 bits count for more. And that's exactly what AI does. It takes a low-bandwidth, noisy, sometimes garbled human signal and reconstructs the high-bandwidth intent behind it.
Voice input is already close to saturating our conscious output bandwidth. The next frontier isn't a faster pipe from brain to computer. It's a smarter decoder on the other end.
## Citations
<p id="cite-1">[1] <a href="https://arxiv.org/html/2408.10234v2" target="_blank" rel="noopener noreferrer">The Unbearable Slowness of Being: Why do we live at 10 bits/s?</a> â Zheng & Meister, Neuron (2024) <a href="#ref-1">â©</a></p>
<p id="cite-2">[2] <a href="https://teslanorth.com/2026/01/28/from-paralysis-to-neuroscience-how-21-people-are-using-neuralink-in-2026/" target="_blank" rel="noopener noreferrer">From Paralysis to Neuroscience: How 21 People are Using Neuralink in 2026</a> â TeslaNorth <a href="#ref-2">â©</a></p>
<p id="cite-3">[3] <a href="https://pubmed.ncbi.nlm.nih.gov/16503476/" target="_blank" rel="noopener noreferrer">Laryngeal hyperfunction during whispering: reality or myth?</a> â Journal of Voice (2006) <a href="#ref-3">â©</a></p>
<p id="cite-4">[4] <a href="https://x.com/karpathy/status/1886192184808149383" target="_blank" rel="noopener noreferrer">Andrej Karpathy on "vibe coding"</a> â X/Twitter (2025) <a href="#ref-4">â©</a></p>
<p id="cite-5">[5] <a href="https://newsletter.semianalysis.com/p/claude-code-is-the-inflection-point" target="_blank" rel="noopener noreferrer">Claude Code is the Inflection Point</a> â SemiAnalysis (2026) <a href="#ref-5">â©</a></p>