The Marchex Blog

Nearly Human: How Marchex is Leading the Way in AI-Powered Voice Transcription

Did you say coach or couch? Was that leaf or leave? Did I hear hello or marshmallow? When it comes to recognizing speech, accuracy DOES matter.

Our team of engineers has been heads down on one goal: developing and training the most accurate speech recognition platform in the industry. And we did just that, almost reaching parity with human accuracy. In fact, did you know that the average human level of performance for speech recognition, measured as word error rate (WER), is about 5 percent? With its recent enhancements, Marchex speech analytics technology achieved a new industry record for automatic transcription accuracy, with an 8.4 percent WER for consumer-to-business phone conversations. That’s a more than 20 percent improvement since 2017, and well below the industry average.

How Marchex Compares to IBM, Amazon, Google, and Microsoft

Marchex Speech Analytics was evaluated internally against several other commercially available voice recognition solutions in a series of tests designed to compare capabilities directly. Using the same sample audio data from the business conversation domain, the tests measured the word error rate (WER), which counts the words that are inserted, deleted, or substituted in a transcript relative to the number of words actually spoken, to gauge overall transcription accuracy. Marchex’s solution scored favorably across the board, achieving an overall WER of 8.4 percent, which is more than 35 percent more accurate than either IBM Watson’s or Microsoft Azure’s respective solutions. Amazon’s tool reached an overall WER of 9.5 percent, while Google Speech (phone) had a 9.7 percent overall WER.
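For readers who want to see how a WER figure is produced, here is a minimal sketch in Python. It is not Marchex's evaluation code: the function names, the lowercase-and-split-on-whitespace tokenization, and the sample sentences are all assumptions made for illustration. The second helper shows one common way to express the gap between two WER scores as a relative reduction, which may or may not be how the percentages above were calculated.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: (substitutions + deletions + insertions) / reference words.

    Illustrative only. Tokenization here is a simple lowercase + whitespace
    split, which is an assumption, not a description of any vendor's method.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word added by the transcript
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Fraction by which improved_wer is lower than baseline_wer (hypothetical helper)."""
    return (baseline_wer - improved_wer) / baseline_wer


if __name__ == "__main__":
    # One substituted word out of five spoken words.
    print(word_error_rate("please sit on the couch", "please sit on the coach"))
    # Relative gap between the 9.5 and 8.4 percent scores quoted above.
    print(relative_wer_reduction(9.5, 8.4))
```

In the coach/couch example, one wrong word in a five-word sentence gives a WER of 20 percent, which is why a point or two of difference adds up quickly across millions of calls.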

[Chart: Word Error Rate (WER) performance comparison results]

Why does accuracy matter?

For industries that rely heavily on phone calls from consumers to drive appointments and sales, such as automotive manufacturers, AI-powered conversation analytics technology offers a better understanding of the customer experience and a view into exactly what happens during every call. Accuracy plays a huge role in automatically pinpointing why a customer is calling and how the conversation turned out, and in logging these classifications in real time. If you are a business looking to implement a speech analytics solution, having the most reliably accurate transcription might just be the competitive edge that puts you on top.

Listen, learn, repeat

Today, Marchex’s call and speech analytics business handles more than one million calls per business day, analyzing tens of millions of minutes of audio per week. These spontaneous consumer-to-business phone calls take place on a mix of mobile phones and landlines, capturing everyday North American dialogue across a wide range of accents. The more our speech recognition systems train and learn from that data, the more accurate the transcriptions become.

Over the past few years, we have invested in the technology and talent needed to build an unparalleled conversational analytics solution. Drawing on the experience of our data scientists, who have deep knowledge of the consumer-to-business industry, we have built a best-in-class speech recognition system capable of a level of accuracy previously unavailable on the market.

The beneficiaries of these advancements are the more than 1,200-and-growing brands that trust Marchex to help them enhance marketing and sales to drive stronger business results.

For more information about Marchex Speech Analytics, visit the product page.
