Copyright Michael Karbo, Denmark, Europe.




    Chapter 7. Sound compression

    Data media such as DVDs and CDs are expensive and have a limited capacity, so it matters how much space the sound data takes up. In the same way, data transport, for example via the Internet, costs time and also kroner (or euros).

    The music business’ distribution of music on CD, SACD and DVD-A does not require that the music be compressed. But in almost all other cases there are good reasons to compress music data, and it is easy to do. We will look at this in this chapter, where I describe the mechanics behind the sort of compression found in Sony’s MiniDisc, mp3 files and the digital soundtracks of DVD films.

    Need for compression

    The problem with uncompressed sound is that it takes up a lot of space. It is easy to calculate how much. If we, for example, sample for three minutes at 44.1 kHz with 16-bit resolution in stereo, the result is just over 30 megabytes of data. The calculation looks like this; remember that 16 bits = 2 bytes:

    3 minutes = 180 seconds

    180 seconds x 44,100 samples/second x 2 channels
    = 15,876,000 samples

    15,876,000 samples x 2 bytes = 31,752,000 bytes = 30.3 MB

    Figure 32. Three minutes of music fills more than 30 MB in an uncompressed CD format.
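
    The arithmetic in Figure 32 can be reproduced with a short Python sketch. The function name and its parameters are illustrative choices of mine, and MB is assumed here to mean 1,048,576 bytes, which is the convention that gives the 30.3 figure.

```python
def pcm_size_bytes(seconds, sample_rate=44_100, channels=2, bits_per_sample=16):
    """Return the size in bytes of uncompressed PCM audio."""
    bytes_per_sample = bits_per_sample // 8
    return seconds * sample_rate * channels * bytes_per_sample


size = pcm_size_bytes(180)          # three minutes of CD-quality stereo
print(size)                         # 31752000
print(round(size / 2**20, 1))       # 30.3
```

    Calling the same function with, say, a 96,000 Hz sample rate and 24 bits shows how quickly the amount of data grows with higher frequency and resolution.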

    If music is sampled at a higher frequency and resolution, the amount of data will be correspondingly greater.

    Digital data is well suited to transmission and copying, for example via computer networks. But if digital music is to be transmitted over a network, the amount of data must be kept down, and all the more so if it is sent via the Internet. An Internet connection can transfer from 300 KB up to maybe 4 MB of data per minute (with a powerful ADSL line). A quick calculation shows that it takes far too long to transmit uncompressed sound. Even the fastest ADSL line listed here would take almost 80 minutes to transmit 30 minutes of music:

    Internet      Bandwidth        Transmission time for 300 MB data
    ----------------------------------------------------------------
    K56 modem     350 KB/minute    877 minutes
    ISDN          480 KB/minute    640 minutes
    ADSL 256      1.9 MB/minute    158 minutes
    ADSL 512      3.8 MB/minute    79 minutes

    Figure 33. Approximate times for the transmission of 30 minutes of stereo music, which takes up roughly 300 MB (about 317 million bytes) in uncompressed CD quality.
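
    The transmission times in Figure 33 follow from a simple division. The sketch below is only an illustration: the function name is hypothetical, and it assumes that 1 MB = 1024 KB, which is the convention that matches the table most closely.

```python
def transmission_minutes(size_mb, rate_kb_per_minute):
    """Minutes needed to send size_mb megabytes (1 MB = 1024 KB assumed)."""
    return size_mb * 1024 / rate_kb_per_minute


# Bandwidths from Figure 33, all expressed in KB per minute.
rates = {
    "K56 modem": 350,
    "ISDN": 480,
    "ADSL 256": 1.9 * 1024,
    "ADSL 512": 3.8 * 1024,
}

for name, rate in rates.items():
    print(f"{name}: about {transmission_minutes(300, rate):.0f} minutes")
```

    The results agree with the table to within a minute; small differences come from rounding.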

    Because of the explosive development of the Internet in the 1990s, among other things, a need for compressed sound data arose, so that music, etc. could be distributed more easily.

    Compression without audible loss

    All types of data can be compressed, and there are fundamentally two forms of compression: with or without loss. When the data is sound (like music) then it is not possible to compress very much, unless you accept some loss. The aim is, therefore, to compress sound data with as little audible loss as possible.

    You can make a comparison with the world of graphics, where digital colour photographs are compressed with lossy JPEG algorithms. Even though the image data is compressed very heavily, the photographs still look very good. It is the same with digital sound. Using different algorithms (methods), details that we cannot hear anyway can be removed from the sound data.

    Figure 34. The principle behind compression. The trick is to remove the sound information that can’t be heard anyway.

    Consider an extreme situation. We make a three-minute stereo recording with two microphones. The sound is sampled at the familiar 44.1 kHz with 16-bit resolution. But what we record is silence! The three-minute stereo recording will still take up 30 MB in PCM format (as calculated in Figure 32).

    So three minutes of silence takes up more than 30 MB, which is an enormous waste because no sound information has been recorded. The principle is the same with normal music – there is a lot of superfluous data, which can easily be cut out without the quality of the music being reduced.
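
    The waste in the silence example can be made visible with a small experiment. Note the assumptions: the sketch uses idealized silence in which every sample is exactly zero (a real microphone recording would contain some noise), and it uses Python's general-purpose, lossless zlib compression as a stand-in. It is not the lossy method used in mp3 or ATRAC; it only shows how much redundancy such data holds.

```python
import zlib

SAMPLE_RATE = 44_100
CHANNELS = 2
BYTES_PER_SAMPLE = 2

# Ten seconds of digital silence: every 16-bit sample is zero.
silence = bytes(10 * SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE)
packed = zlib.compress(silence)

print(len(silence))                        # 1764000
print(len(packed) < len(silence) // 100)   # True: far less than 1% remains

# Lossless: the original data comes back bit for bit.
assert zlib.decompress(packed) == silence
```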

    Superfluous data

    With the help of special software, uncompressed PCM sound can be processed so that “superfluous” data is removed in a number of ways. One example: it takes a huge amount of data to keep noise down.

    Noise, however, is only a problem if the music otherwise has a low sound level. This is why sound can be encoded so that more noise is accepted in passages with a high sound level. The good signal-to-noise ratio is only kept in the soft passages of the music. This is just one of many mechanisms that can compress digital sound data.
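
    The idea of accepting more noise in loud passages can be sketched with a very simple quantizer whose step size follows the level of each block of samples. This is an illustration of my own, not the actual scheme used in mp3 or ATRAC; the function names, the 8-bit setting and the test tone are all assumptions.

```python
import math


def quantize_block(block, bits=8):
    """Quantize one block with a step size scaled to the block's own peak.

    Loud blocks get a coarse step (more absolute noise), soft blocks a fine
    step, while both use the same number of bits per sample.
    """
    peak = max(abs(x) for x in block) or 1.0   # avoid a zero step for silence
    levels = 2 ** (bits - 1) - 1               # 127 for 8 bits
    step = peak / levels
    return [round(x / step) * step for x in block]


def rms_error(original, decoded):
    """Root-mean-square difference, used here as a measure of the noise."""
    total = sum((a - b) ** 2 for a, b in zip(original, decoded))
    return math.sqrt(total / len(original))


tone = [math.sin(2 * math.pi * 440 * n / 44_100) for n in range(1024)]
loud = [20_000 * s for s in tone]   # loud passage
soft = [200 * s for s in tone]      # soft passage

loud_noise = rms_error(loud, quantize_block(loud))
soft_noise = rms_error(soft, quantize_block(soft))
print(f"noise in loud block: {loud_noise:.2f}")
print(f"noise in soft block: {soft_noise:.2f}")
```

    Both blocks cost the same number of bits, but the absolute noise in the loud block is far higher than in the soft block, where it would be most audible.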

    These compression mechanisms are used, as mentioned, in a number of sound formats, for example:

  • Mp3 and ATRAC (MiniDisc).

  • Real, QuickTime and Windows Media (streaming media).

  • AC-3 (Dolby Digital), DTS and MPEG-2, all of them home cinema and DVD sound formats.

    Common to all compressed sound formats, too, is that there has to be software, and thus hardware as well, which can encode and decode them. These mechanisms (codecs) have to be built into the sound device that handles the digital sound:

    Software         Found in
    ------------------------------------------------------------------------
    Mp3 decoder      Computer software, mp3 players, many DVD and CD players
    Mp3 coder        Computer software, mp3 recorders
    ATRAC decoder    MiniDisc players
    ATRAC coder      MiniDisc recorders
    AC-3 decoder     DVD players
    AC-3 coder       Professional recording equipment

    Figure 35. Software is required, if digital sound is to be encoded and decoded.

    A more detailed description of the mp3 format and its possibilities follows later in this booklet, but as mentioned, all methods are built on the same principle: to remove as much as possible of the sound information that we cannot hear anyway. You can also read more about codecs later in the booklet (in chapter 24).

    If we try to illustrate the conversion from analog sound to digital sound and the reproduction of it, then it could look like this:

    Figure 36. The path from sound recording to reproduction via, for example, the mp3 format goes from analog signals to digital data with compression and back to analog signals.

    Reduced quality

    Sound compression isn’t done in one particular way. Just as digital photographs can be compressed at different JPEG quality levels, digital sound files can be encoded with varying degrees of compression.

    In practice, many of the compression algorithms can be adjusted so that they work more or less powerfully, which means sound can be compressed to a higher or lower degree. The more powerful the compression, the worse the sound quality; that’s the way it is. Compression removes information and, all other things being equal, can only reduce the quality of the sound. This reduction is experienced in several ways:

  • Fewer nuances in the music, for example a reduced feeling of depth.

  • Digital noise (so-called artefacts), which is unwanted sound information.

    What is brilliant about a format like mp3 is that you can decide for yourself how good the quality should be. If you choose a weak compression, you get a compressed sound file that is almost identical to the original recording. With slightly more powerful compression, you get a minimal reduction in quality.

    Variable bandwidth

    To vary the compression, you choose a bit rate. This means deciding in advance how much room the final sound file should take up: you specify the amount of data to be played per second. The amount of data is measured in bits or, more precisely, in kilobits per second.

    In Figure 37 you can see the bit rates that Windows can normally work with (when a sound card is installed in the computer):

    Figure 37. Different bit rates for compression of sound in Windows.

    A bit rate is a measure of bandwidth. In the table in Figure 33 I have listed the bandwidth of different kinds of Internet connections, measured in kilobytes per minute. Music, however, is compressed to the above-mentioned kilobits per second. This is written as Kbps, as can also be seen in Figure 37.
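
    To compare the two units, kilobytes per minute can be converted to kilobits per second. The sketch below is only an illustration with a hypothetical function name; it assumes 1 KB = 1024 bytes and 1 kilobit = 1000 bits, and other conventions give slightly different numbers.

```python
def kb_per_minute_to_kbps(kb_per_minute, bytes_per_kb=1024):
    """Convert kilobytes per minute to kilobits per second (1 kilobit = 1000 bits)."""
    bits_per_minute = kb_per_minute * bytes_per_kb * 8
    return bits_per_minute / 60 / 1000


print(round(kb_per_minute_to_kbps(350), 1))   # 47.8  (K56 modem in Figure 33)
print(round(kb_per_minute_to_kbps(480), 1))   # 65.5  (ISDN in Figure 33)
```

    Under these assumptions, the ISDN line in Figure 33 carries roughly 65 Kbps.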

    Figure 38. Mp3 compression with different bit rates.

    Please note the bit rate of 6.5 Kbps in the next-to-last line and, further up, the bit rates of 28 Kbps. These are very powerful compressions, which give really small files. They are not suitable for music.

    Music is compressed to bit rates between 64 and 320 Kbps. Most experts agree that a bit rate of 256 Kbps gives a sound reproduction as close to the original as possible. In practice, however, a bit rate of 128 Kbps is the most common. This gives, in my opinion, an acceptable sound quality:

    Bit rate    Amount of data (circa) per minute    Quality
    ------------------------------------------------------------
    28 Kbps     220 KB                               For speech
    64 Kbps     500 KB                               Acceptable
    128 Kbps    1 MB                                 Very good
    192 Kbps    1.5 MB                               Excellent
    256 Kbps    2 MB                                 Perfect

    Figure 39. Bit rates, used for compression of stereo sound.
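
    The amounts of data in Figure 39 can be estimated directly from the bit rate, and the same numbers show how strongly each setting compresses compared with uncompressed CD sound. This is a rough sketch with hypothetical function names; it assumes 1 KB = 1000 bytes, so the results only approximate the rounded figures in the table.

```python
# Uncompressed CD stereo: 44,100 samples/s x 16 bits x 2 channels.
CD_KBPS = 44_100 * 16 * 2 / 1000   # 1411.2 Kbps


def kb_per_minute(kbps):
    """Approximate data volume per minute in KB (1 KB = 1000 bytes assumed)."""
    return kbps * 1000 * 60 / 8 / 1000


def compression_ratio(kbps):
    """How many times smaller the stream is than uncompressed CD sound."""
    return CD_KBPS / kbps


for rate in (28, 64, 128, 192, 256):
    print(f"{rate} Kbps: {kb_per_minute(rate):.0f} KB per minute, "
          f"about {compression_ratio(rate):.1f}:1")
```

    At 128 Kbps, for example, this gives 960 KB per minute, or about one eleventh of the original CD data.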

