Voice Control
Assistive technology that allows users to operate computers and mobile devices using spoken commands, enabling people with motor disabilities to navigate, type, and interact without a keyboard or mouse.
In simple terms: Voice control lets you use your computer or phone just by talking to it. You can say things like 'click search' or 'open email' and the device does what you say, so people who cannot use their hands can still do everything.
What Is Voice Control?
Voice control is an assistive technology category that allows users to operate computers, smartphones, and other digital devices entirely through spoken commands. Unlike voice assistants that respond to specific queries (such as asking for the weather), voice control provides comprehensive device interaction, enabling users to navigate interfaces, select elements, enter text, scroll pages, and perform any action that would normally require a keyboard or mouse.

Voice control is primarily used by people with motor disabilities who have difficulty using traditional input devices. This includes people with repetitive strain injuries, quadriplegia, cerebral palsy, arthritis, limb differences, and other conditions that affect hand and arm function. It is also used by people with temporary injuries and by some users as a productivity tool.

The major voice control systems include Dragon NaturallySpeaking (now Nuance Dragon), the commercial standard for decades; Apple Voice Control, built into macOS and iOS since 2019; and Windows Voice Access, built into Windows 11. Each allows users to dictate text, issue navigation commands, and interact with interface elements by speaking their names or numbers.

Voice control represents a fundamentally different interaction paradigm from keyboard or mouse input. Understanding how voice control users interact with web content reveals why certain accessibility practices, particularly around naming and labeling, are critical.
Why It Matters
Voice control users interact with websites in ways that directly depend on specific accessibility features. Understanding these interactions explains why certain WCAG success criteria exist.

**Label in Name (WCAG 2.5.3).** This is the single most important accessibility consideration for voice control users. Voice control users interact with buttons, links, and form fields by speaking the visible label. If a button displays "Search" on screen, the user says "click Search." This works only if the element's accessible name (the name exposed to the accessibility API) includes the word "Search." When developers override visible text with a different `aria-label`, voice control users see one label but must guess a different name to activate the element. WCAG 2.5.3 requires that the accessible name include the visible label text.

**Visible and descriptive labels.** Voice control users need to see a label in order to speak it. Icon-only buttons without visible text, unlabeled form fields, and generic link text all create barriers. Even when ARIA provides an accessible name for screen readers, voice control users cannot see ARIA labels on screen. Visible text labels benefit both screen reader users and voice control users.

**Consistent identification.** When the same functionality appears across multiple pages with different labels, voice control users must learn different commands for the same action. Consistent labeling across a site reduces cognitive overhead.

**Unique accessible names.** When multiple elements on a page share the same visible label (such as several "Read more" links), voice control systems cannot determine which one the user intends. The system typically displays numbers next to each matching element, requiring the user to say a number. Unique, descriptive labels eliminate this ambiguity.
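The core of the Label in Name requirement can be expressed as a simple containment check: the visible label, after normalization, must appear within the accessible name. The sketch below illustrates that rule in Python. It is a simplified model for explanation only; real accessible-name computation follows the W3C accname algorithm, and the function name here is hypothetical.

```python
import re

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace, roughly mirroring how voice
    control software loosely matches spoken labels against names."""
    return re.sub(r"\s+", " ", text).strip().lower()

def passes_label_in_name(visible_label: str, accessible_name: str) -> bool:
    """Return True when the accessible name contains the visible label,
    the core requirement of WCAG 2.5.3 (Label in Name)."""
    return normalize(visible_label) in normalize(accessible_name)

# Accessible name matches or extends the visible label: fine.
print(passes_label_in_name("Search", "Search"))           # True
print(passes_label_in_name("Search", "Search products"))  # True

# aria-label replaces the visible text entirely: a 2.5.3 failure,
# because saying "click Search" would not match the name "Magnifier".
print(passes_label_in_name("Search", "Magnifier"))        # False
```

Note that an accessible name may add context ("Search products") without failing, as long as the visible label survives inside it.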
How It Works
Voice control systems offer several methods for interacting with interfaces:

**Name-based commands.** The primary method is speaking an element's visible label preceded by an action word. "Click Submit," "tap Search," or "press Enter" all instruct the system to interact with the named element. The voice control software matches the spoken label against visible text and accessible names on screen to find the target element.

**Number overlays.** When name-based commands are ambiguous or unavailable, the user can say "show numbers" (or a similar command, depending on the software). The system displays a number next to every interactive element on the screen, and the user says the number of the element they want to activate. This works as a fallback but is slower and more cognitively demanding than name-based interaction.

**Grid navigation.** For precise screen positioning, voice control systems can overlay a numbered grid on the screen. The user says a grid number to zoom into that area, and the grid subdivides for finer targeting. This allows clicking on elements that lack proper labels, but it is the slowest and most cumbersome interaction method.

**Dictation.** When a text field is focused, voice control switches to dictation mode, converting speech to text. Users can also speak punctuation, formatting commands, and editing instructions ("select all," "delete that," "capitalize that"). Modern voice control dictation is highly accurate and supports continuous natural speech.

**Navigation commands.** Voice control supports page-level commands like "scroll down," "scroll up," "go back," "go to address bar," and "press Tab." These allow users to navigate without targeting specific elements.

**Web accessibility considerations.** For websites to work well with voice control, visible labels must match accessible names (WCAG 2.5.3). All interactive elements need visible text labels, not just icon-only representations. Form fields must have associated visible labels, not just placeholder text. Link text should be descriptive and unique within the page. Focus management should work correctly, because voice control systems often rely on focus state. Avoid using `aria-label` or `aria-labelledby` to override visible text with different content.

**Testing for voice control.** Compare every element's visible text with its accessible name (inspect in browser dev tools or use an accessibility testing tool). If the accessible name does not contain the visible text, voice control users will have trouble. Also check that all interactive elements have visible labels and that similar functions use consistent names across pages.
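The testing step described above can be partially automated. The sketch below scans markup for buttons whose `aria-label` does not contain their visible text, using only Python's standard-library HTML parser. It is a rough heuristic for illustration, not the full accessible-name computation: it checks only `<button>` elements and only the `aria-label` attribute, and the class name is an assumption of this example.

```python
from html.parser import HTMLParser

class LabelInNameScanner(HTMLParser):
    """Collect buttons whose aria-label does not contain their visible
    text. A rough heuristic, not the full accname algorithm."""

    def __init__(self):
        super().__init__()
        self._stack = []    # (aria_label, text_chunks) for open <button>s
        self.failures = []  # (visible_text, aria_label) mismatches found

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._stack.append((dict(attrs).get("aria-label"), []))

    def handle_data(self, data):
        if self._stack:
            self._stack[-1][1].append(data)

    def handle_endtag(self, tag):
        if tag == "button" and self._stack:
            aria_label, chunks = self._stack.pop()
            visible = " ".join("".join(chunks).split())
            # Flag the button when its visible text is absent from the
            # aria-label that overrides it (a WCAG 2.5.3 failure).
            if aria_label and visible and visible.lower() not in aria_label.lower():
                self.failures.append((visible, aria_label))

scanner = LabelInNameScanner()
scanner.feed('<button aria-label="Send form">Submit</button>'
             '<button aria-label="Search products">Search</button>')
print(scanner.failures)  # [('Submit', 'Send form')]
```

The first button fails because a user saying "click Submit" would not match the name "Send form"; the second passes because the visible label "Search" survives inside the longer accessible name.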
Frequently Asked Questions
- What is the difference between voice control and voice assistants like Siri or Alexa?
- Voice assistants respond to specific queries and commands. Voice control provides full device control, allowing users to navigate interfaces, click elements, type text, and perform any action a mouse or keyboard could. Dragon NaturallySpeaking, Apple Voice Control, and Windows Voice Access are voice control tools.
- Why do accessible names matter for voice control users?
- Voice control users interact with elements by speaking their visible labels. If a button shows 'Submit' visually but has an aria-label of 'Send Form,' the user says 'click Submit' and nothing happens. WCAG 2.5.3 (Label in Name) addresses this by requiring that accessible names include the visible text.
- What is the most common voice control software?
- Dragon NaturallySpeaking (now Nuance Dragon) has been the leading commercial voice control software for decades. Apple Voice Control (built into macOS and iOS) and Windows Voice Access (built into Windows 11) are free built-in alternatives.
Need help making your website ADA compliant?
Our team specializes in ADA-compliant web design and remediation. Get a free accessibility audit today.
Last updated: 2026-03-15