MacSpeech Dictate: First Impressions

Thursday, November 19, 2009

I suspect that the first thing any developer does when they get their hands on MacSpeech Dictate is to begin writing a review of MacSpeech Dictate in MacSpeech Dictate. I am no exception.

Over the last three months I’ve developed more pain than usual in my wrists and elbows. It’s not been painful enough to stop me from typing altogether, but certainly enough to make me stop and think about the amount of time I spend typing at the computer.

My dad has been playing with Dragon NaturallySpeaking on and off for quite some time now, so speech recognition software has been on my radar for years. In fact, I suspect he personally financed the software’s research and development over the first few years of its life… Of course, the software was initially more or less worthless; its output was simply miserable through the first few versions. But my dad continued to upgrade. Year after year he picked up a new version of Dragon, installed it, spent hours training it, and eventually began to actually get things done using the software. Now it’s my turn.

So, first impressions (beyond the fact that it’s simply strange to talk instead of type):

I’m a little surprised at the delay between my speech and its recognition. It shouldn’t be surprising when considering how much work is involved in this sort of thing, but it is. I speak, and about a second and a half later text appears on the screen. As long as I’m not actually paying attention to the screen, this doesn’t bother me at all. At the moment, however, I’m paying quite a bit of attention to the screen, so it’s a little distracting.

The text is generally correct, so that delay is easy enough to ignore. There is a slider in the preferences that offers a balance between speed and accuracy. I’ll play with that later, as I’m interested in the effect it has on the text produced.

I’m very surprised at the out-of-the-box accuracy. Dictate is doing a much better job than I expected it to, especially given my lack of experience with speech recognition software, and the seemingly insubstantial amount of text used to train the program. I read to it for about 10 minutes total and, based on that small amount of training, I’m getting something like 90% accuracy. Reading back over the text that Dictate has produced, I certainly see some issues: it’s not perfect by any means. I’ll need to go back and edit a few portions, but overall I’m impressed.

I need to get used to the command structure. Specifically, I need to train myself to pause before and after commands. Dictate will delete the last phrase when I say “scratch that,” and delete the last word when I say “scratch word.” If I speak too fluidly, Dictate doesn’t recognize “scratch that” as a command. When I make a mistake I have to remember to pause a moment before correcting myself. That’ll take some time to get used to.

Dictate forces me to pay much more attention to the words I choose than I usually do. I normally write very fluidly, editing myself as I go. I’ll write half a sentence, jump back to the paragraph before to make a change, delete the sentence I’d started, rewrite the sentence, and start again. That process simply doesn’t work with speech recognition software. It’s much more difficult to jump around in the document, so it’s much more important for me to address a topic more linearly. I have to think before I speak, which is new to me.

Overall, I like this. I think it’s something I can get used to. It’s certainly better on my hands and wrists, and that’s something I very much appreciate at the moment. So far, it’s looking like a worthwhile purchase. Let’s see how I feel in a week or two.

A note on process: I dictated this document via Dictate in a bit more than half an hour, stopping relatively often to figure out exactly what I was doing. I then copied the text out of Dictate’s notepad, pasted it into a real editor, and cleaned up the text. I’m not thrilled with the workflow, but it’s not bad. Much less typing, which I’m certainly happy about.

— Mike West