MacSpeech Dictate: Voice Recognition for OS X

I have always been fascinated by speech recognition technology. Ever since experimenting with it in early versions of Microsoft Office, I have been enthusiastic about trying each new dictation application. Unfortunately, they rarely meet my expectations. Typically, I experiment with one for a few days before it falls out of use. Speech recognition is a complex technology, and a very difficult one to perfect.

MacSpeech Dictate is undoubtedly the leading speech recognition application for the Mac, designed for the platform from the ground up. At $200 it certainly doesn’t come cheap, but it offers an incredibly powerful feature set and arrives bundled with a high-quality, noise-canceling microphone headset.

This review will assess the quality of speech recognition in MacSpeech Dictate, take a look at the features on offer, and outline how it is capable of controlling your Mac.

The Microphone

MacSpeech comes bundled with a high-quality noise canceling headset. Mine was a Plantronics set, complete with a USB adaptor. It felt sturdy, and produced an excellent quality of sound. Full instructions on how to set it up for the best dictation are provided:

Microphone Instructions

Creating a Profile

Before you get started, the first step is to create a profile. This involves training MacSpeech Dictate to the microphone and background noise, along with reading a passage to get the software accustomed to your voice.

The whole process takes around five minutes, and should be repeated whenever you move to a different setting (for example, create one profile for quiet home use, and another for use in a noisy office environment).

Profile Creation

The Four Modes

Status Window

There are four primary modes when using MacSpeech Dictate: dictation mode, spelling mode, command mode, and sleep mode. I shall approach each of these in turn to offer a general overview of what’s possible with the application.

It’s also worth noting that the interface to MacSpeech itself is fairly simple. You’re provided with a menu bar icon for turning the microphone on/off, and a hovering status window displaying what you’ve recently spoken and a few useful hints about the current state of capitalization etc.

Dictation

In dictation mode, you can both dictate text to be typed and issue commands to be obeyed. This is the most common mode, and the one most likely to be turned on regularly.

MacSpeech Dictate differentiates between commands and dictation based upon the pause before and after you speak a phrase. If you pause both before and after speaking a phrase, the application will consider that phrase as a possible command.
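To make the idea concrete, here is a minimal Python sketch of how pause-based classification might work. It is purely illustrative: the threshold value, the command list, and every name in it are my own assumptions, not anything taken from MacSpeech Dictate itself.

```python
from dataclasses import dataclass

# Hypothetical threshold in seconds; the value MacSpeech Dictate uses is not known.
PAUSE_THRESHOLD_S = 0.5

# A few commands mentioned in this review, used here as a stand-in command list.
KNOWN_COMMANDS = {"press ok", "capture screen", "paste from clipboard"}


@dataclass
class Utterance:
    text: str
    pause_before: float  # silence before the phrase, in seconds
    pause_after: float   # silence after the phrase, in seconds


def classify(utterance: Utterance) -> str:
    """Return "command" only for a known phrase spoken in isolation."""
    isolated = (
        utterance.pause_before >= PAUSE_THRESHOLD_S
        and utterance.pause_after >= PAUSE_THRESHOLD_S
    )
    if isolated and utterance.text.lower() in KNOWN_COMMANDS:
        return "command"
    return "dictation"
```

In this sketch, the same words spoken mid-sentence (with no pause before them) are simply typed, which matches the behaviour described above.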

When in dictation mode, you can say:

  • Words, sentences and phrases to be typed
  • Instructions for spacing and capitalization
  • Punctuation (e.g. to enter a full stop, you simply say “full stop”)
  • Names of letters
  • Commands for controlling applications

It is also possible to train MacSpeech on-the-fly with new words, and edit text. For instance, speaking Insert Before the Word “Someword” will do exactly what you’d expect it to.

Spelling

In spelling mode, the application will type individual letters, numbers, and punctuation. It will also attempt to listen for commands. No spaces are automatically added in this mode, making it particularly useful for typing URLs.

Command

In command mode, no dictation occurs. The application will interpret everything you say as a command and control your Mac appropriately. Because commands can also be given in the other modes, this one is slightly redundant – useful only if you want to be sure what you say will, in fact, be taken as a command rather than text to be typed.

Sleep

Sleep mode is a special state in which MacSpeech will respond to only two commands: “wake up” and “turn the microphone on”. This essentially allows you to pause the application, temporarily ignoring all other microphone input.
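The four modes can be thought of as a small state machine. The Python sketch below shows only the sleep behaviour, under stated assumptions: the names are hypothetical, and the choice to return to dictation mode on waking is my guess rather than documented behaviour.

```python
from enum import Enum


class Mode(Enum):
    DICTATION = "dictation"
    SPELLING = "spelling"
    COMMAND = "command"
    SLEEP = "sleep"


# The two phrases the review says are recognised while asleep.
WAKE_PHRASES = {"wake up", "turn the microphone on"}


def handle_while_asleep(phrase: str) -> Mode:
    """Ignore everything except a wake phrase while in sleep mode."""
    if phrase.strip().lower() in WAKE_PHRASES:
        # Assumption: waking returns the application to dictation mode.
        return Mode.DICTATION
    return Mode.SLEEP
```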

Accuracy & Phrase Training

From the outset, I was remarkably impressed with the level of accuracy in MacSpeech Dictate. After listening to my voice for less than five minutes, the recognition engine rarely made a mistake. Provided you speak clearly (like a newsreader, the training manual suggests), the software will surpass your expectations.

Obviously, it isn’t possible for the software to understand every phrase you may like to attempt – for instance, speaking the words “text edit” will produce, as you’d expect, the words “text edit” rather than “TextEdit”.

Fortunately, training the application to understand a different spelling is very straightforward. You only need to do it once, and each subsequent time you say “text edit”, MacSpeech should determine that you’re talking about the OS X application and spell it accordingly.
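Conceptually, a trained phrase behaves like an entry in a lookup table from what was heard to how it should be written. The following Python sketch illustrates that idea only; the function name and the dictionary are hypothetical, and MacSpeech’s real training almost certainly works at the level of the recognition engine rather than as text substitution.

```python
import re


def apply_trained_spellings(text: str, trained: dict) -> str:
    """Replace each trained spoken phrase with its preferred spelling."""
    for spoken, written in trained.items():
        # Match whole words only, ignoring case.
        pattern = re.compile(r"\b" + re.escape(spoken) + r"\b", re.IGNORECASE)
        # A function replacement avoids treating backslashes in `written` specially.
        text = pattern.sub(lambda match, w=written: w, text)
    return text


# Hypothetical table built up by training the phrase once.
trained_spellings = {"text edit": "TextEdit"}
```

Calling `apply_trained_spellings("open text edit now", trained_spellings)` rewrites the phrase to the application’s spelling, while unrelated text is left alone.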

The “Golden Rule”

MacSpeech has an ongoing recommendation called “The Golden Rule” – essentially a warning that you shouldn’t combine mouse/keyboard interaction with dictation when working in a document. As you dictate, MacSpeech creates a cache of what has been inputted, along with tracking the current position of the cursor.

This makes it possible for you to say Select the words “Dear James”, and for MacSpeech to quickly know exactly where that text is. It can then send a signal to the application in question to move the cursor appropriately and select that text. If you’ve been moving the cursor manually or editing text with the keyboard, this cache system breaks down.
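A small Python sketch may help show why manual edits break this approach. It is an illustration under my own assumptions (the class and method names are invented, and the real cache is surely more sophisticated): the cache knows only what was dictated, so any edit made outside it leaves its record of the document out of date.

```python
class DictationCache:
    """Tracks dictated text and cursor position, as described above."""

    def __init__(self):
        self.text = ""
        self.cursor = 0
        self.valid = True  # becomes False once the user edits by hand

    def dictate(self, phrase):
        # Insert the phrase at the cursor and advance the cursor past it.
        self.text = self.text[:self.cursor] + phrase + self.text[self.cursor:]
        self.cursor += len(phrase)

    def external_edit(self):
        # A keyboard or mouse edit happened; the cache no longer mirrors the document.
        self.valid = False

    def select(self, words):
        """Return (start, end) offsets of the most recent match, or None."""
        if not self.valid:
            return None
        start = self.text.rfind(words)
        if start == -1:
            return None
        return (start, start + len(words))
```

After dictating “Dear James, ” and then more text, `select("Dear James")` can answer from the cache alone. Once `external_edit()` has been called, the sketch refuses to guess, which is roughly the failure the Golden Rule warns about.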

Although the software can dictate text into any application, MacSpeech comes bundled with a simple notepad application. This is designed to work especially well with the speech recognition system and offer the best results. It also allows greater leniency with regard to the “golden rule”.

Notepad

Interacting with OS X

Whilst primarily useful for dictation, the software is also handy for navigating around OS X and controlling settings. Commands exist both globally (for use anywhere in the OS), and specifically for applications such as Finder, Safari and Mail.

Here are a few global commands that I found particularly useful:

  • “Paste from clipboard” – A quick way to paste something
  • “Press OK” – Will press the OK button for any open dialog box
  • “Capture Screen” – Takes a quick screenshot
  • “Quit [Name of Application]” – Speedily quit any open application
  • “Turn Dock Hiding On/Off” – Far quicker than right clicking the Dock to do it manually
  • “Activate [Name of Application]” – Every time you launch MacSpeech, it takes an inventory of your Applications folder and allows you to launch each one by speaking its name.
  • “Web 100” – A list of 100 popular websites accessible by simply saying, for instance, “jump to eBay”

Each application also has a phenomenally long list of commands, allowing you to perform specific actions appropriate to only that piece of software. For instance, speaking “reply to message” in Mail will initiate a reply to the currently selected message.

At any given time, you can show a list of currently available commands by speaking “Show Available Commands Window”. This is very useful when first getting to grips with the application.

In Conclusion – Who is MacSpeech for?

MacSpeech Dictate is priced at $200, and more information is available at their website (including a few useful video examples). Other versions are also available for those in the legal or medical profession, with a range of advanced built-in terminology. As you’d expect, these are considerably more expensive.

Broadly speaking, there are two types of people who would use this application: those who want to, and those who need to. Those in the former category may just like the idea of speaking to their computer, or may find that it makes them slightly more productive. For those who simply want a piece of dictation software, $200 may seem slightly expensive.

However, there are many Mac users who either have a physical disability or simply struggle to type for long periods. For this type of potential user, MacSpeech Dictate is an absolute must-have application. It makes using a computer without a keyboard and mouse possible, and does so with the style expected from an OS X application.

For this type of user, I cannot recommend MacSpeech highly enough. Accuracy is excellent, controls are intuitive, and – whilst the price may seem steep – you receive one box containing everything you need to get started.


  • Royal8

    The headset concept still changes my perception of any application. What kind of results did you see using the built-in microphone?

    • http://www.2dforever.com Tom

      I’ve tried this with the inbuilt microphone on an oldish MacBook. While it said the microphone quality wasn’t great, the results weren’t bad. But using a headset resulted in a much higher accuracy overall, which makes it worthwhile.

  • http://www.twitter.com/secondfret Joshua Johnson

    Sounds awesome! I could use that for writing more lengthy AppStorm articles. :)

  • Zaphod

    “designed for the platform from the ground up.”

    Not true. MacSpeech Dictate uses the same engine as Dragon Naturally Speaking for the PC (they licensed it from Nuance). One of the reasons for the “Golden Rule” on the Mac is that the engine can’t obtain the necessary cursor/edit information from OS X to allow the mixing of voice dictation w/keyboard edits. Dragon Naturally Speaking doesn’t have this problem in Windows.

    So while MacSpeech is a nice app, its major shortcoming is that it *wasn’t* designed for the Mac platform from the ground up.

    • http://www.macchuck.com Chuck Rogers

      Actually, the engine is platform agnostic, so in terms of the way the program interacts with the user, MacSpeech Dictate is 100% designed from the ground up. This is to say that while the company may have learned important design lessons from its former speech recognition product, iListen, MacSpeech Dictate does not use one line of code from it. Neither does it use any code from Dragon NaturallySpeaking, the PC program from Nuance based on the same engine.

  • http://tomschlick.com trs21219

    It would be interesting to see how fast you can write php code with this haha :)

  • http://twitter.com/billyjp BillyJP

    “designed for the platform from the ground up.”

    @Zaphod It depends on where you consider the “ground” to be. ;-)

    Like Mac OS X is different from Windows or Linux, even though they run the same Intel “platform”. My understanding is the speech recognition engine…recognizes speech. That’s it. It doesn’t interact with text or a user. This part is all in the Dictate.app which MacSpeech built from scratch. Not 100% originated on Mac, but definitely not Dragon ported from Windows.

    Why is this important? You mention one reason: Understanding how to use the product, especially not mixing voice dictation w/keyboard edits. I support several lawyers who switched from using Windows/Dragon to using Mac/Dictate (as I did too). We had the darnedest time getting past this quirky “golden rule”. Then I “untrained” everyone from Windows/Dragon and learned the correct way to use Mac/Dictate…from the ground up…on Mac. Works great for dictation, now.

    If you use Windows, Dragon is better. For Mac users, nothing is better than Dictate.

  • http://www.jaywingard.com Jay Wingard

    I own this software and use it to dictate my teaching notes for several Bible classes I teach. This software is amazing. It takes very little training and even correctly handles less common words and even some of the less known Biblical words. This is great software for the price.

  • http://twitter.com/JoelDrapper Joel Drapper

    Is there any kind of demo? I don’t want to spend that much on a piece of software that might not work well for me. Not only do I have no guarantee that it will work with my voice, but I have no idea if I’ll work well talking rather than typing to my computer…

  • BillyJP

    @Joel Drapper: You have my recommendation. ;-)

    They include a microphone, which is an important piece. Plus there’s so much vocabulary data it comes on a DVD. That’s MacSpeech’s reasoning for no try before you buy. Perhaps they have a money back guarantee?

    Also, it’s Intel-Mac only and now Leopard or Snow Leo only.

  • http://speech.salmat.com.au John Livers

    MacSpeech Dictate is very powerful voice recognition software, but the price might turn many away. Apple has already incorporated Siri into its iOS devices, and that is also a great voice recognition service.
