Let’s learn about audi...

March 29, 2011

Creating pre-recorded audio files is an involved process, made harder by the fact that most people don’t have a firm grasp of how an audio file format is specified in the first place.

When audio is recorded on a computer, it is encoded as a series of numbers that, when read and decoded by the IVR, can be converted back into sound. In order for this data to be encoded and then, in turn, successfully decoded and converted into sound, the encoder and decoder both need to agree on a set of descriptors for what the numbers represent.
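To see why that agreement matters, here is a minimal Python sketch that decodes the same four bytes under two different assumptions about bit depth. The byte values are made up for illustration and do not come from a real recording.

```python
import struct

# Four raw bytes as they might sit in an audio file (illustrative values only).
raw = bytes([0x00, 0x80, 0xFF, 0x7F])

# Decoder A assumes 8-bit unsigned samples: one byte per data point.
as_8bit = struct.unpack("<4B", raw)

# Decoder B assumes 16-bit signed little-endian samples: two bytes per data point.
as_16bit = struct.unpack("<2h", raw)

print(as_8bit)   # (0, 128, 255, 127)
print(as_16bit)  # (-32768, 32767)
```

The bytes are identical in both cases; only the assumed descriptors differ, and the resulting numbers (and therefore the sound) have nothing in common.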

The typical descriptors for an audio file recorded in a non-lossy format are as follows:

  • audio format: linear PCM, u-law, and a-law are all examples of audio formats, each of which specifies a different way to map a numerical data point in a file to a real sound generated by a speaker.
  • bit depth: the number of bits used to specify each data point. Linear PCM, for instance, is usually 8 or 16 bits. u-law and a-law are always 8 bits.
  • number of channels: 1 for mono, 2 for stereo, etc.
  • frequency (sample rate): the number of data points written to the audio file per channel per second. This is measured in hertz (Hz).
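As a rough sketch, the four descriptors above can be grouped into one small record. The class name `AudioDescriptors` and its field names are hypothetical, chosen only for this example. One thing the descriptors let you work out is how many bytes of raw audio data one second of sound occupies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioDescriptors:
    """Hypothetical container for the four descriptors listed above."""
    audio_format: str    # e.g. "linear PCM", "u-law", "a-law"
    bit_depth: int       # bits per data point
    channels: int        # 1 for mono, 2 for stereo
    sample_rate_hz: int  # data points per channel per second

    def bytes_per_second(self) -> int:
        # data points per second, times channels, times bytes per data point
        return self.sample_rate_hz * self.channels * self.bit_depth // 8


# Illustrative values: 8-bit mono at 8000 Hz, and 16-bit stereo at 44100 Hz.
mono_8bit = AudioDescriptors("u-law", 8, 1, 8000)
stereo_16bit = AudioDescriptors("linear PCM", 16, 2, 44100)

print(mono_8bit.bytes_per_second())     # 8000
print(stereo_16bit.bytes_per_second())  # 176400
```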

The Plum IVR can handle audio files that are 16-bit linear PCM, 8-bit u-law, or 8-bit a-law, single channel (mono) recordings sampled at 8000 Hz. These descriptors are important for the IVR for three reasons. First, if you try to use an audio file that was not recorded with an acceptable encoding, the Plum IVR will not be able to play it. Second, when you initially record your file, it’s always preferable to record it in one of these formats so you won’t have to re-encode the file and possibly introduce noise artifacts into it. Third, these three formats were chosen because they can all be re-encoded with minimal or zero quality loss to 8000 Hz mono 8-bit u-law, the standard audio encoding format used by the U.S. public telephone system.
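To give a feel for what re-encoding 16-bit linear PCM to 8-bit u-law involves, here is a minimal sketch of the commonly used G.711-style u-law conversion. It is an assumption that this matches what any particular IVR does internally; the function names are hypothetical, and the bias (0x84) and clip level (32635) are the values used by the widely circulated reference implementations.

```python
BIAS = 0x84   # added before encoding so the smallest segment starts at 0
CLIP = 32635  # largest magnitude that can be encoded after adding BIAS


def linear_to_ulaw(sample: int) -> int:
    """Map one signed 16-bit linear PCM sample to one 8-bit u-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS

    # Find the segment (exponent): the position of the highest set bit,
    # counted relative to bit 7.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1

    # Keep the four bits just below the leading bit as the mantissa.
    mantissa = (magnitude >> (exponent + 3)) & 0x0F

    # u-law bytes are stored bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte: int) -> int:
    """Map one 8-bit u-law byte back to a signed 16-bit linear PCM sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


print(hex(linear_to_ulaw(0)))                # 0xff (silence)
print(ulaw_to_linear(linear_to_ulaw(1000)))  # 988
```

A sample of 1000 comes back as 988 after a round trip, which is the kind of small loss the re-encoding introduces: 16 bits of linear data are squeezed into 8 bits by using finer steps for quiet sounds and coarser steps for loud ones.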

This leads to the final question: how do the encoder and decoder agree on the encoding format for the data? We shall discuss encapsulation next week…
