Speech to Text Software

Many of us have had to transcribe a recording of some type at one point or another. Maybe you wanted a quote from an important interview. Perhaps you make personal voice memos and then write them all down later. Or you may just want to generate a searchable text from a long speech. In any case, the process is tedious and can take up a significant amount of your time. That's where transcription services can be of assistance. The process is simple; just upload a file, select your options, and add a payment method. Wait a bit, and the best of them generate very usable transcripts without the headache.

There are a couple of things to consider before choosing a transcription service, however. First, you need to determine the complexity of your file, since that will determine whether you can use an automatic or human-based service. Accuracy is the most important concern of all, and choosing the wrong type of service at the outset might leave you with a significant amount of editing to do. Cost is also an important factor. Although most transcription services charge on a per-minute basis, prices vary, and some services offer bulk plans at better values. Before you commit to any service, make sure to read the rest of our guide and click through the blurbs below to read the full reviews, too.

Types of Transcription

The cheapest (and probably most accurate) way to transcribe an audio or video file is to do it yourself. In other words, you listen to the audio file and type or dictate what you hear. You can do this with any number of programs, but it's often cumbersome to synchronize media playback to your typing speed. The biggest disadvantage of this method is the time and effort it requires. Between laboring over every word, setting up the correct formatting, and reviewing the finished product, it's enough to steer most people towards a dedicated service.

Automatic transcription services are the next step up from the manual approach. For this method, you upload files to a program that processes the audio quickly using automatic speech recognition (ASR) and spits back out a transcript. This voice recognition method sometimes includes extras that may not typically be free, such as time stamping and basic speaker identification. The downside of automatic services is that they are far less accurate than other methods. Otter, Trint, Temi, and Scribie all offer automated transcription services—Scribie also offers a human-based service.

With higher-tier transcription services, a trained transcriptionist (often more than one) completes the work on your file. These services are highly accurate, but they're also pricier and typically require a longer turnaround time. This also introduces some privacy concerns, but all the services we've reviewed operate under strict NDA policies and let you remove your files from their servers at any time.

Cost and Turnaround Time

Many transcription services charge on a per-minute basis. For example, a 30-minute transcription at $1 per minute would cost $30. Costs can quickly add up, and some services bill extra fees for a faster turnaround, for verbatim files (including all the "ums" and "ahs"), or if the audio is of poor quality. Trint is unique in that it offers both a per-hour rate and a subscription tier for individuals and teams who need it to process multiple hours of recordings each month. Otter and Scribie's automated transcription tiers are the only free options we've reviewed, though the former will eventually require a monthly subscription.

As you might guess, the amount of time it takes to turn around a file usually depends on its length. Automatic services can typically process a file in a matter of minutes. Human-based services take quite a bit longer and you may have to pay for faster delivery speeds. Rev is simple in that it promises to return your file (in most cases) in a 12-hour timeframe. Scribie's and GoTranscript's slowest options (five days) are also their cheapest, though you can fast-track that to 12 hours for an additional $1.60 per minute in both cases. There are also intermediary options for both services.
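To make the per-minute arithmetic above concrete, here is a minimal Python sketch of a cost estimate. The function name and parameters are our own invention for illustration and are not part of any service's API. The $1-per-minute base rate is the example figure used earlier in this section, not a quoted price from Scribie or GoTranscript; the $1.60-per-minute surcharge is the 12-hour rush fee mentioned above.

```python
from decimal import Decimal


def estimate_cost(minutes, rate_per_minute, surcharge_per_minute="0"):
    """Estimate the price of a transcription job billed per minute.

    All three arguments are hypothetical inputs: the recording length in
    minutes, the base rate in dollars per minute, and any extra per-minute
    fee (for example, a rush-delivery surcharge).
    """
    per_minute = Decimal(str(rate_per_minute)) + Decimal(str(surcharge_per_minute))
    total = Decimal(str(minutes)) * per_minute
    # Round to whole cents so the result reads like a price.
    return total.quantize(Decimal("0.01"))


# A 30-minute file at the example rate of $1 per minute.
base = estimate_cost(30, "1.00")

# The same file with a $1.60-per-minute rush surcharge added on top.
rushed = estimate_cost(30, "1.00", "1.60")

print(f"Standard delivery: ${base}")
print(f"Rush delivery: ${rushed}")
```

At these rates, the rush fee alone more than doubles the bill for the same file, which is worth knowing before you pay for speed you may not need.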

Audio Quality and Accuracy

One of the most important things you can do to ensure an accurate transcription is to capture a high-quality recording of a conversation or interview in the first place. It is vital that your subjects are close by and speak in loud, clear voices. If there are multiple speakers present for a recording, participants should only speak one at a time to avoid interference. Most services point out that speakers with heavy accents may also pose some issues, though there's not much you can do to avoid that. Audio editing software such as Audacity can clear up some issues, but it's not a miracle-worker. It's also worthwhile to use a dedicated digital voice recorder. In-person recordings also produce better results than recordings of phone calls.

In our testing, the overall accuracy of transcripts varied considerably. We evaluated with two different files: a recording of a conference call with multiple speakers and an in-person interview with just two participants. The human-based services did a (mostly) excellent job with the more difficult file, whereas the automatic ones produced nearly unusable results. The latter did considerably better on the second (easier) recording, but they still weren't perfect. Keep in mind that your experience may vary, as we cannot control every variable in tests of human-based transcription services.

Basically, the automatic services are only useful if your recording is on the simple side and you do not need the utmost accuracy. They are fine for personal voice memos and similar applications, but not for a professional setting.
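If you want to put a number on accuracy for your own recordings, one common measure is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the service's transcript into a correct one, divided by the length of the correct one. The Python sketch below is an illustration of that metric under simple assumptions (lowercased text split on whitespace, with no special handling of punctuation). It is not necessarily the method used for the testing described above, and the sample sentences are made up.

```python
def word_error_rate(reference, hypothesis):
    """Return the word error rate of a hypothesis against a reference.

    The reference is a transcript you trust (for example, one you corrected
    by hand); the hypothesis is what the service returned.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j words of the hypothesis.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref)


# Made-up example: one wrong word ("a" for "the") and one dropped word ("by").
reference = "please send the report by friday"
hypothesis = "please send a report friday"
print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

A lower score is better, and a score of 0 means the two transcripts match word for word. Checking a few minutes of a representative recording this way can help you decide whether an automatic service is accurate enough for your purposes.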

Edit and Revise

Regardless of the service you choose, chances are that you will need to correct some parts of your transcript. As such, most services include a built-in editor for making these changes before you export the final document. Typically, these interfaces combine playback controls with a text editor. This is a much more convenient setup than, say, switching between a document and audio player every couple of minutes. In the case of the human-based transcription services, these web editors are often just modified versions of what the freelance transcriptionists use themselves.

Some include extra tools for highlighting selected parts of a transcript or editing the start time of the recording. Playback speeds and quick rewind buttons (all controllable via keyboard shortcuts) are also fairly standard. GoTranscript is notably the only service that does not offer an online editor; your only option is to edit the exported transcript after it completes a job.

Mobile Apps and Manual Alternatives

GoTranscript, Otter, Rev, Trint, and Temi also offer mobile apps in addition to their web dashboard. All offer both Android apps and iPhone apps. For the most part, these apps function as digital voice recorders, but they do let you order transcripts of the recordings directly from your mobile device. The drawback is that you can't import audio files or links the way that you can via their respective web interfaces. Many let you view the completed transcript directly on your device. Otter goes one step further than the others with excellent organizational features and the ability to edit transcripts on the go.

If you want to avoid the transcription services entirely—for privacy reasons or to save on costs—there are alternatives. For doing your own manual transcriptions (you listen to the recording and type what you hear), Transcribe is a great option, at only $20 per year. It has great built-in keyboard shortcuts and useful playback modes that reduce the number of times you need to pause and rewind.

For those who don't want to spend any money, Google Docs may be the best solution. With Google Docs, you can use its voice typing feature to put words down on the page, which is certainly quicker than typing everything out. Another completely free option is oTranscribe, but it operates more similarly to Transcribe, with a similar layout and set of keyboard controls.

Convert Conversations

Any transcription method or service you choose is better than simply letting your recordings go to waste. Yes, transcribing can be a hassle and some services are costly, but the value of accurate and usable transcripts far outweighs these annoyances. At least one of the services in the chart should suit your needs; make sure to read our full reviews for help picking the right one. Do you use a service not mentioned here? Let us know in the comments and it may make the chart in our next update.

Bottom Line: GMR Transcription offers a rich set of services and produces reasonably accurate transcripts, but it costs more than competitors and its interface is dated and confusing.

Cons: Expensive. Mixed performance in testing. No mobile apps.

Bottom Line: Transcription service Sonix has an excellent web editor and innovative features (such as embeddable transcripts and multiple collaboration roles), but it's expensive and its performance was iffy in our testing.

About the Author

Ben Moore is a Junior Analyst for PCMag’s software team. He has previously written for Laptop Mag, Neowin.net, and Tom’s Guide on everything from hardware to business acquisitions across the tech industry. Ben holds a degree in New Media and Digital Design from Fordham University at Lincoln Center, where he served as the Editor-in-Chief of The Observer, the student-run newspaper. He spends his free time taking photos and reading books. You can follow him on Twitter at @benmoore214.