Getting real with speech

As part of our continuous research into new ideas, I’ve been helping to research and evaluate RealSpeak Telecom from Nuance.

One thing that’s not instantly clear from the documentation is how to play back the ‘standard.pcm’ file generated by the standard demonstration application they provide. To be fair to them, they do tell you that the generated file is a 16-bit, 8 kHz PCM wave file. What they leave out of the documentation is any suggestion of software that can play it back easily.

I tried playing the file back in QuickTime, Audacity and Winamp, and tried converting it using dBpoweramp, but to no avail. Thankfully the helpful people at Nuance replied to the incident I raised with them and suggested I try GoldWave. Sure enough, when I tried to load the standard.pcm file, GoldWave admitted it was having trouble loading it and offered a dialog box for me to set the details.


And there were the Attributes and Rate boxes that I’d been missing in other software. Setting these to 16-bit and 8 kHz results in the file playing back perfectly!
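If you’d rather not install GoldWave, the same trick can be done in a few lines of Python by wrapping the raw samples in a WAV header, so that ordinary players know the format without being told. This is only a rough sketch and not anything from the Nuance documentation: the function name pcm_to_wav is my own invention, and I’m assuming the file is mono, headerless, signed and little-endian. Only the 16-bit and 8 kHz figures come from the docs, so if the result sounds like noise, one of those assumptions is probably wrong.

```python
import wave


def pcm_to_wav(pcm_path, wav_path, sample_rate=8000, sample_width=2, channels=1):
    """Wrap headerless PCM samples in a WAV container.

    Defaults match the documented 16-bit (2 bytes per sample), 8 kHz format.
    Mono and little-endian signed samples are assumptions, not documented facts.
    """
    # Read the raw sample bytes exactly as they are on disk.
    with open(pcm_path, "rb") as src:
        frames = src.read()

    # The wave module writes the header and copies the samples unchanged.
    with wave.open(wav_path, "wb") as dst:
        dst.setnchannels(channels)
        dst.setsampwidth(sample_width)
        dst.setframerate(sample_rate)
        dst.writeframes(frames)
```

Calling pcm_to_wav("standard.pcm", "standard.wav") should then give you a file that most audio players will open without asking any questions.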

It was a hark back to the heady days of the 80s and 90s, when I was first exposed to the emerging world of home computing. I was reminded all over again of the sample frequencies that were referred to in tracker music. The Mod and Demo communities were rinsing everything they could get out of the Amiga and Acorn series, with some very impressive demos that fitted on one floppy disk.

