Comparing Machine Transcription Options from Rev and Sonix

As part of our continuing exploration of new options for transcription and captioning, two members of our media production team tested the automated services offered by both Rev and Sonix. We submitted the same audio and video files to each service and compared the results. Overall, both services were surprisingly accurate and easy to use. Sonix, in particular, offers some unique exporting options that could be especially useful to media producers. Below is an outline of our experience and some thoughts on potential uses.

Accuracy

The quality and accuracy of the transcription seemed comparable. Both services produced transcripts with about the same number of errors. Interestingly, although the errors occurred at similar rates, they almost always occurred in different places. All of the transcripts would need cleaning up for official use but would work just fine for editing or review purposes. The slight edge might go to Rev here. It did a noticeably better job of distinguishing and identifying unique speakers, punctuating, and in general (but not always) recognizing names and acronyms.

Interface

When it came time to share and edit the transcripts, both services offered similar web-based collaborative tools. The tools feature basic word processing functions and allow multiple users to highlight, strike through, and attach notes to sections of text. After its recent updates, the Rev interface is slightly cleaner and more streamlined. Again, the services are pretty much even in this category.

Export Options

This is where things get interesting. Both services allow users to export transcripts as documents (Microsoft Word, Text File, and, for Sonix, PDF) and captions (SubRip and WebVTT). However, Sonix offers some unique export options. When exporting captions, Rev automatically formats the length and line breaks of the subtitles and produces reliable results. Sonix, on the other hand, provides several options for formatting captions including character length, time duration, number of lines, and whether or not to include speaker names. The downside was that using the default settings for caption exporting in Sonix led to cluttered, clunky results, but the additional options would be useful for those looking for more control of how their captions are displayed.
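To make those caption settings more concrete, here is a minimal Python sketch of how limits on line length, line count, and duration might shape SubRip output. It is not how Rev or Sonix actually work. The function names, the `(word, start, end)` input format, and the default limits are all assumptions made for illustration.

```python
import textwrap


def format_timestamp(seconds):
    """Render seconds as a SubRip timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_cues(words, max_chars=32, max_lines=2, max_duration=5.0):
    """Group (word, start, end) tuples into cues that respect the limits.

    Hypothetical helper: a new cue starts whenever adding the next word
    would exceed the line count or the maximum on-screen duration.
    """
    cues, current = [], []
    for word in words:
        candidate = current + [word]
        text = " ".join(w[0] for w in candidate)
        too_long = len(textwrap.wrap(text, width=max_chars)) > max_lines
        too_slow = candidate[-1][2] - candidate[0][1] > max_duration
        if current and (too_long or too_slow):
            cues.append(current)
            current = [word]
        else:
            current = candidate
    if current:
        cues.append(current)
    return cues


def to_srt(cues, max_chars=32):
    """Serialize cues as SubRip text, wrapping each cue at max_chars."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        text = " ".join(w[0] for w in cue)
        body = "\n".join(textwrap.wrap(text, width=max_chars))
        start = format_timestamp(cue[0][1])
        end = format_timestamp(cue[-1][2])
        blocks.append(f"{index}\n{start} --> {end}\n{body}")
    return "\n\n".join(blocks) + "\n"
```

Tightening `max_chars` or `max_lines` yields more, shorter captions, while loosening them yields fewer, denser ones. That is the kind of trade-off the Sonix settings expose and that Rev decides automatically.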

Sonix also offers two entirely different kinds of export. First, users can export audio or video files that include only highlighted sections of the transcript or exclude strikethroughs. Basically, you can produce a very basic audio or video edit by editing the transcript text. Unfortunately, it does not allow users to move or rearrange sections of media, and the edits are all hard cuts, so it’s a rather blunt instrument. Still, it could be useful for rough cuts or for those with minimal editing skills.
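The idea behind a strikethrough-driven edit can be sketched in a few lines of Python. This is only an illustration of the concept, not Sonix's implementation. The `kept_segments` function and its inputs (a media duration in seconds and a list of struck time ranges) are hypothetical.

```python
def kept_segments(duration, struck):
    """Return the (start, end) ranges that survive after hard-cutting
    every struck range out of a file of the given duration (seconds)."""
    kept = []
    cursor = 0.0
    for start, end in sorted(struck):
        # Clamp each struck range to the bounds of the media.
        start = min(max(start, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if start > cursor:
            kept.append((cursor, start))
        # Overlapping struck ranges simply extend the current cut.
        cursor = max(cursor, end)
    if cursor < duration:
        kept.append((cursor, duration))
    return kept


# A 60-second clip with two overlapping struck passages.
segments = kept_segments(60.0, [(10.0, 20.0), (15.0, 30.0)])
```

The surviving segments always come back in their original order, which mirrors the limitation noted above: material can be removed but not rearranged.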

Second, Sonix provides the option of exporting XML files that are compatible with Adobe Audition, Adobe Premiere, and Final Cut Pro. When imported into the editing software, these work like edit decision lists that automatically cut and label media in a timeline. We tried this with two different audio files intended for a podcast, and it worked great. This has the potential to be useful for more complicated and collaborative post-production workflows, an online equivalent of an old-school “paper edit”. Again, the big drawback here is the inability to rearrange the text. It could save time when cutting down raw footage, but a true paper edit would still require editing the transcript with timecode in a word processing program.

And the winner is…

Everyone. Both Rev and Sonix offer viable and cost-effective alternatives to traditional human transcription. Though there is an obvious compromise in accuracy, it is much less severe than you might expect. Official transcripts or captions could be produced with some light editing, and, from a media production perspective, quick and cheap transcripts can be an extremely useful tool in the post-production process. Those looking to try a new service or stick with the one they’re familiar with can be confident that they’re getting high-quality machine transcription from either company. As more features like those offered by Sonix get added and improved, machine transcription could become a helpful tool throughout the production process.


This entry was posted on Wednesday, June 26th, 2019 at 2:17 pm and is filed under Accessibility, Captioning, DDMC Info, Video Production.
