Designing a Terminal for an Audio-First Workflow
Most shell environments assume a visual interface. Prompts are colorful, information-dense, and optimized for quick scanning. My workflow is different. I'm blind, and I work with a screen reader, which means my terminal is fundamentally an audio interface. Instead of scanning the screen, I'm listening to the environment as I work. That changes what matters.
Most terminal workflows carry a lot of unnecessary noise such as long paths, redundant context, and visual-only cues. Designing for audio forced me to remove that noise entirely.
The result is a shell environment that’s faster to navigate, easier to parse, and often better even if you can see the screen. If you like what I describe here, take a look at my dotfiles repo for your own inspiration. If you maintain shell code and are interested in what path substitution built directly into shell configuration might look like, I'd love to talk.
Where this started
While I was working at Google, I spent a lot of time inside the monorepo. (If you haven't encountered that model before: imagine nearly the entire company's code living inside one enormous repository.) The ideas that eventually led to this system started there, but the implementation in this repository was written later from scratch and is entirely my own code.
The paths were extremely long. In some cases it could take several seconds for my screen reader, even reading at 700 WPM, just to tell me where I was during a large refactor. A typical path might look something like this (simplified):
/mounted/path/to/the/monorepo/perforce_client/root/javascript/namespace_part/namespace_subpart/namespace_subpartsubpart/app_root/subsystem/main/tests/audioInformation_test.ts
For a sighted developer, this is mostly a visual annoyance. You glance at the prompt, pick out the important segment, and move on. For me, the terminal had to speak the entire thing, or I had to interrupt my work and read it piece by piece. Every time the prompt appeared, my screen reader would start reading the path, and after a while I realized I often had no idea where I was unless I waited several seconds for the prompt to finish speaking. That was not going to work.
So I started building tools to compensate. First came "teleport" functions. These were small helpers that jumped directly to important parts of the repository:
function jsdf {
  cd <path_to_google_drive_javascript_code>
}
function jdf {
  cd <path_to_java_drive_code>
}
I can't show the real internal layout because I don't want to reveal how Google's internal monorepo is organized, but jsdf simply meant "JavaScript Drive frontend code." That saved a ton of typing by itself: a simple jsdf; cd infra would get me to the directory I needed in order to work on a specific file. These helpers worked, but they were crude, and I was still constantly hearing huge paths in the prompt. Eventually I added a small rewriting pipeline that shortened common segments before displaying the path. The first version was extremely simple:
sed -e 's/path_1/short_segment/' \
-e 's/path_2/another_short/'
It was ugly, but it proved something important: shortening paths dramatically reduced cognitive load. To make that concrete, here is the kind of transformation I was trying to achieve.
Before:
/mnt/c/users/driem/programs/python/ai_image_describer/main/tests/audioInformation_test.ts
After:
py/ai_image_describer/main/tests/audioInformation_test.ts
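A minimal sketch of that kind of sed rewrite, applied to the example above. The function name shorten_path and the single substitution rule are illustrative assumptions, not the original rules:

```shell
# Hypothetical helper: rewrite a long, well-known path prefix into a short label.
# The rule below is made up to reproduce the before/after example.
shorten_path() {
  printf '%s\n' "$1" | sed \
    -e 's|^/mnt/c/users/driem/programs/python/|py/|'
}

shorten_path /mnt/c/users/driem/programs/python/ai_image_describer/main/tests/audioInformation_test.ts
# prints: py/ai_image_describer/main/tests/audioInformation_test.ts
```

Paths that match no rule pass through unchanged. In Bash, a function like this could feed the prompt with something like PS1='$(shorten_path "$PWD")\$ '.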
Saving keystrokes was important, but not as important as saving attention. Hearing a short, predictable path segment is far easier than listening through a long directory hierarchy every time the prompt appears. At Google, the savings were well over two thirds of the path length, which left me more mental capacity to actually think about the problem at hand. That idea eventually evolved into the alias and namespace system I use today.
Making terminal sessions resilient
My workflow makes this problem harder to ignore. At Google, and still today, I am often working from home, a café, my desk, or even while camping in the middle of the desert, so most of my development happens inside a persistent tmux session on a remote machine. I jump between windows, reconnect from different devices, and keep long-running work alive in the background. That means I'm constantly re-orienting myself: switching panes, reattaching sessions, and resuming work. Every time I do, the first thing my terminal speaks is the full working directory. If that path is long, I'm back to waiting several seconds just to answer a simple question: "Where am I?"
Path aliases
The first step was simple: shorten frequently used paths. For example, instead of typing cd /mnt/c/users/driem, I can define an alias:
driem /mnt/c/users/driem
and then navigate with:
p driem
The p command is a thin wrapper around cd, backed by the alias map. I also use pd for pushd when I want a quick stack to bounce between locations. Both commands support tab completion over the alias names, so navigation stays fast even as the alias set grows. This alone reduces both typing and how much the prompt has to speak.
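As a rough sketch of how such a wrapper could be built (this is not the repository's code), the alias map can be a Bash associative array. The names PATH_ALIASES, _alias_target, and _p_complete are hypothetical, and the sketch assumes Bash 4 or later:

```shell
# Hypothetical alias map: name -> real directory.
declare -A PATH_ALIASES=(
  [driem]=/mnt/c/users/driem
)

# Print the directory registered under an alias name, or fail with a message.
_alias_target() {
  if [[ -z ${1-} ]]; then
    echo "usage: p ALIAS" >&2
    return 1
  fi
  if [[ -z ${PATH_ALIASES[$1]-} ]]; then
    echo "unknown alias: $1" >&2
    return 1
  fi
  printf '%s\n' "${PATH_ALIASES[$1]}"
}

# p ALIAS: cd to the aliased directory.
p() { local t; t=$(_alias_target "${1-}") && cd -- "$t"; }

# pd ALIAS: same lookup, but pushd so the previous directory stays on the stack.
pd() { local t; t=$(_alias_target "${1-}") && pushd -- "$t" >/dev/null; }

# Tab completion: offer the alias names that start with the word being typed.
_p_complete() {
  COMPREPLY=( $(compgen -W "${!PATH_ALIASES[*]}" -- "${COMP_WORDS[COMP_CWORD]}") )
}
complete -F _p_complete p pd
```

Keeping the lookup in one helper means p and pd cannot drift apart, and completion reads the same map, so a new alias is completable as soon as it is defined.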
Each alias also exports an environment variable (for example, P_driem), which makes it easy to reuse paths in scripts:
tail -f $P_driem/log.txt
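One way the export step might look, as a sketch rather than the actual implementation. PATH_ALIASES and export_alias_vars are hypothetical names, and Bash 4 or later is assumed:

```shell
# Hypothetical alias table: name -> real directory.
declare -A PATH_ALIASES=(
  [driem]=/mnt/c/users/driem
)

# Export P_<name> for every alias so scripts and one-liners can reuse the path.
# Names that would not form a valid shell identifier are skipped.
export_alias_vars() {
  local name
  for name in "${!PATH_ALIASES[@]}"; do
    [[ $name =~ ^[A-Za-z0-9_]+$ ]] || continue
    export "P_${name}=${PATH_ALIASES[$name]}"
  done
}

export_alias_vars
echo "$P_driem"   # prints: /mnt/c/users/driem
```

Because the variables are exported, child processes such as scripts started from the shell see them too.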
This helped, but it didn't scale well once directory structures became deeper and shortcuts naturally nested. That led to the namespace system.
Namespaced paths
Earlier versions of my environment experimented with something I called namespaced paths. The idea was to treat filesystem paths more like hierarchical identifiers than raw strings. After the sed rewriting pipeline, I briefly used a space-separated list of simple substitutions, applied in order. This was much better, but I started wanting something that wasn't just a prefix match and didn't strictly do a longest match either. This led me to a new, more elegant solution.
The new system is fundamentally a key-value store based on aliases. Each alias has two roles: a real filesystem expansion used for navigation, and a display representation used when rendering the prompt. That distinction allows the prompt to either preserve hierarchy for orientation or collapse it to reduce noise, depending on how an alias is defined. The prompt uses a display path rather than a literal filesystem path, optimized for readability rather than exact reproduction. In the rare case I need the real path, I'll just run pwd.
Example aliases:
driem /mnt/c/users/driem
gmscripts [driem]/mydrive/software/gm_scripts
py driem/programs/python
easy_ui py/easy_ui
Literal paths
If an alias target starts with /, it is treated as a normal filesystem path:
driem /mnt/c/users/driem
Visible composition with [alias]
If the target starts with [alias], the referenced alias is expanded for the real path while remaining visible in the prompt:
gmscripts [driem]/mydrive/software/gm_scripts
This keeps the parent namespace visible when it carries useful meaning. When working inside gm_scripts, the prompt can display:
/driem/gm_scripts$
Hidden-prefix composition with alias/...
If the target starts with alias/..., the alias is expanded for the real path but its ancestry can be collapsed in the prompt:
py driem/programs/python
easy_ui py/easy_ui
When working inside easy_ui, the prompt can simply display:
easy_ui$
while the filesystem path remains long.
Mental model of the algorithm
Conceptually the system resolves aliases while tracking two outputs: the expanded filesystem path, and the compressed display path.
resolve(value):
  if value starts with /:
    return real_path=value, display=value
  if value starts with [alias]suffix:
    expand alias for real path
    keep alias in display path
  if value starts with alias/suffix:
    expand alias for real path
    allow display to collapse the prefix
When rendering the prompt, the system finds the best matching expansion and substitutes the corresponding display form.
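The three cases can be sketched in Bash as a recursive lookup that fills in two variables. This is one reading of the rules, not the repository's code. In particular, the labeling choices are assumptions: a literal alias is labeled by its own name, [parent] prepends the parent's label, and parent/ drops it, so the exact strings the real prompt speaks may differ. ALIAS_TARGET, REAL, and LABEL are hypothetical names, and Bash 4 or later is assumed:

```shell
# Hypothetical alias table (name -> target), using the example aliases from this post.
declare -A ALIAS_TARGET=(
  [driem]='/mnt/c/users/driem'
  [gmscripts]='[driem]/mydrive/software/gm_scripts'
  [py]='driem/programs/python'
  [easy_ui]='py/easy_ui'
)

# resolve NAME: set REAL (filesystem path) and LABEL (what the prompt would show).
# Assumes the alias definitions contain no cycles.
resolve() {
  local name=${1-} value parent suffix
  [[ -n $name ]] || { echo "resolve: missing alias name" >&2; return 1; }
  value=${ALIAS_TARGET[$name]-}
  [[ -n $value ]] || { echo "resolve: unknown alias: $name" >&2; return 1; }
  case $value in
    /*)        # literal path: nothing to expand
      REAL=$value
      LABEL=$name ;;
    \[*\]*)    # [parent]suffix: expand the parent and keep it visible
      parent=${value#\[}
      parent=${parent%%\]*}
      suffix=${value#*\]}
      resolve "$parent" || return
      REAL=$REAL$suffix
      LABEL=$LABEL/$name ;;
    */*)       # parent/suffix: expand the parent but hide it in the label
      parent=${value%%/*}
      suffix=${value#*/}
      resolve "$parent" || return
      REAL=$REAL/$suffix
      LABEL=$name ;;
    *)
      echo "resolve: unsupported target: $value" >&2
      return 1 ;;
  esac
}

resolve easy_ui && printf '%s -> %s\n' "$LABEL" "$REAL"
# prints: easy_ui -> /mnt/c/users/driem/programs/python/easy_ui
```

Under these assumptions, resolving gmscripts yields the label driem/gmscripts, with the parent namespace still visible, while easy_ui collapses to its own name.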
Concrete example
Real filesystem path:
/mnt/c/users/driem/programs/python/easy_ui
Prompt display:
easy_ui
The filesystem stays long and stable, but the spoken prompt stays short.
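To make the rendering step concrete, here is a hedged sketch that maps a real directory to its display form. It uses the longest matching prefix as a stand-in for "best match", which is a simplifying assumption (the actual matcher is described above as not strictly longest-match). DISPLAY_FOR and display_path are hypothetical names, the table is hard-coded as if the aliases had already been resolved, and Bash 4 or later is assumed:

```shell
# Hypothetical, already-resolved table: real directory -> display label.
declare -A DISPLAY_FOR=(
  [/mnt/c/users/driem]='driem'
  [/mnt/c/users/driem/programs/python]='py'
  [/mnt/c/users/driem/programs/python/easy_ui]='easy_ui'
)

# display_path DIR: replace the longest known prefix of DIR with its label.
display_path() {
  local dir=$1 real best=''
  for real in "${!DISPLAY_FOR[@]}"; do
    # Match whole path components only: an exact match, or a prefix followed by a slash.
    if [[ $dir == "$real" || $dir == "$real"/* ]] && (( ${#real} > ${#best} )); then
      best=$real
    fi
  done
  if [[ -n $best ]]; then
    printf '%s\n' "${DISPLAY_FOR[$best]}${dir#"$best"}"
  else
    printf '%s\n' "$dir"
  fi
}

display_path /mnt/c/users/driem/programs/python/easy_ui        # prints: easy_ui
display_path /mnt/c/users/driem/programs/python/easy_ui/tests  # prints: easy_ui/tests
display_path /etc                                              # prints: /etc
```

A prompt could call it with something like PS1='$(display_path "$PWD")\$ '. Matching on whole components keeps a directory such as /mnt/c/users/driem2 from being mistaken for the driem alias.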
Hacks to make reading code faster
Another trick that speeds up spoken output is shortening how punctuation is pronounced. Instead of hearing every punctuation symbol spoken in full, I use shorthand pronunciations:
| symbol | pronunciation | shorthand |
|---|---|---|
| ( | left paren | par |
| ) | right paren | ren |
| [ | left bracket | brà |
| ] | right bracket | ket |
| { | left brace | curl |
| } | right brace | lea |
| : | colon | coal |
| ; | semicolon | dah |
| ... | ... | ... |
For example, a typical C-style loop would normally be spoken like this:
for left paren i equals zero semi i less ten semi i plus plus right paren left brace
With shorthand it becomes:
for par i eq zero dah i less ten dah i plus plus ren curl
This dramatically reduces how long it takes to listen to code. The learning curve was a little weird at first, but I adapted to it within a week.
Small shell tweaks that reduce friction
Most of the rest of the repository consists of small adjustments to default shell behavior.
Nice to have: Immediate history syncing
Normally Bash writes command history when a shell exits, which means that if you have multiple terminals open, commands from one session may not appear in another until much later. This configuration appends history immediately after each command so history search remains consistent across terminals, and survives reboots.
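A common way to get this behavior in Bash looks like the following. It is a sketch of the general technique, and the repository's exact settings may differ:

```shell
# Append to the history file instead of overwriting it when a shell exits.
shopt -s histappend

# After every command, write this session's new history lines to $HISTFILE.
# Any PROMPT_COMMAND that was already set is kept and runs afterwards.
PROMPT_COMMAND="history -a${PROMPT_COMMAND:+; $PROMPT_COMMAND}"
```

With history -a alone, a newly opened terminal sees everything written so far. Pulling new lines into a shell that is already running would additionally need history -n.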
Per-machine overrides
The repository includes a .bash_local file that allows machine-specific configuration. Local settings are stored separately so the main configuration remains portable across machines. Something as trivial as the hostname adds words to my prompt, so the host is one such local knob, and only remote systems have that knob turned on.
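A minimal sketch of how such an override might be wired up. PROMPT_SHOW_HOST and load_local_overrides are hypothetical names invented for this example, and \W stands in for the display path that the real prompt uses:

```shell
# Load machine-specific settings, then build the prompt from them.
load_local_overrides() {
  # Default: do not speak the hostname.
  PROMPT_SHOW_HOST=0
  if [[ -r "$HOME/.bash_local" ]]; then
    source "$HOME/.bash_local"
  fi
  if [[ $PROMPT_SHOW_HOST == 1 ]]; then
    PS1='\h \W\$ '   # remote machine: hostname first
  else
    PS1='\W\$ '      # local machine: no hostname
  fi
}

load_local_overrides
```

On a remote machine, .bash_local would then contain a single line such as PROMPT_SHOW_HOST=1, and the shared configuration stays identical everywhere.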
Bootstrapping and configuration drift
Installers sometimes modify shell startup files without asking. To reduce ambiguity, this repository bootstraps itself by symlinking dotfiles into $HOME, so there is always one canonical copy. If .bashrc changes unexpectedly, the difference becomes immediately visible and I know exactly who to blame for messing with my config without asking.
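The symlink approach can be sketched as two small functions. DOTFILES_DIR, link_dotfile, and check_dotfile are hypothetical names, and the default repository location is an assumption:

```shell
# Hypothetical location of the canonical copies.
DOTFILES_DIR=${DOTFILES_DIR:-$HOME/dotfiles}

# link_dotfile NAME: make $HOME/NAME a symlink to the repository copy,
# keeping any pre-existing regular file as NAME.bak.
link_dotfile() {
  local src=$DOTFILES_DIR/$1 dest=$HOME/$1
  if [[ -e $dest && ! -L $dest ]]; then
    mv -- "$dest" "$dest.bak"
  fi
  ln -sfn -- "$src" "$dest"
}

# check_dotfile NAME: succeed only if $HOME/NAME still points at the repository copy.
check_dotfile() {
  [[ -L $HOME/$1 && $(readlink "$HOME/$1") == "$DOTFILES_DIR/$1" ]]
}

# Example usage (not run here):
#   link_dotfile .bashrc
#   check_dotfile .bashrc || echo ".bashrc is no longer the canonical symlink" >&2
```

If an installer appends to .bashrc through the symlink, the change lands in the repository and shows up in version control. If it replaces the symlink with a regular file, check_dotfile fails.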
WSL as a practical compromise
I split my time between WSL and native Linux. Windows has strong accessibility tooling, which makes it a great environment for me, while Linux offers the developer tooling I prefer. WSL lets me keep a Linux shell while still accessing the Windows GUI when necessary. Linux GUI accessibility is ... complicated.
A tiny Vim tweak that matters
set noru
This disables Vim's ruler display. For a screen reader, the ruler constantly announces line and column numbers, and turning it off removes unnecessary speech. In general, the only time I want changing text on screen is if that text matters right now.
Designing terminals for listening instead of looking
Most terminal environments assume the user is visually scanning the screen. When the interface is audio, different tradeoffs emerge:
| Visual shell design | Audio-first shell design |
|---|---|
| show lots of context | minimize repeated information |
| color cues | text cues |
| long paths are fine | long paths create audio noise |
| visual scanning | stable spoken landmarks |
| highlight changes | highlight importance |
Individually, these changes are small. Together, they turn the terminal into something I can navigate at the speed I can think. Interestingly, many of these ideas are useful even for sighted users. Shorter prompts, clearer cues, and stable navigation primitives improve terminal workflows regardless of how you interact with them.
The full configuration is available here: https://codeberg.org/derekriemer/dotfiles-public
Disclaimers:
- This repo is a cleaned-up, minimal version of my actual setup. I’ve removed machine-specific config (SSH, hostnames, private paths), but the structure and workflows are the same as what I use day to day.
- My crude Bash implementations are far from elegant and do not even attempt to maximize performance. If the pipeline ever becomes noticeably slow, I'll likely rewrite the core matcher as a simple Rust program that uses the current directory to walk a tree of substitutions, or something like that. However, I'm not going to optimize something that works well enough, with no noticeable overhead, for me, the intended user.