Copyright Michael Karbo and ELI Aps., Denmark, Europe.

  • Next chapter.
  • Previous chapter.

    29. Data and instructions

    Now it’s time to look more closely at the work of the CPU. After all, what does it actually do?

    Instructions and data

    Our CPU processes instructions and data. It receives orders from the software. The CPU is fed a gentle stream of binary data via the RAM.

    These instructions can also be called program code. They include the commands which you constantly – via user programs – send to your PC using your keyboard and mouse. Commands to print, save, open, etc.

    Data is typically user data. Think about that email you are writing. The actual contents (the text, the letters) is user data. But when you and your software say “send”, your are sending program code (instructions) to the processor:

    Figure 80. The instructions process the user data.

    Instructions and compatibility

    Instructions are binary code which the CPU can understand. Binary code (machine code) is the mechanism by which PC programs communicate with the processor.

    All processors, whether they are in PC’s or other types of computers, work with a particular instruction set. These instructions are the language that the CPU understands, and thus all programs have to communicate using these instructions. Here is a simplified example of some “machine code” – instructions written in the language the processor understands:

    proc near

    mov AX,01

    mov BX,01

    inc AX

    add BX,AX

    You can no doubt see that it wouldn’t be much fun to have to use these kinds of instructions in order to write a program. That is why people use programming tools. Programs are written in a programming language (like Visual Basic or C++). But these program lines have to be translated into machine code, they have to be compiled, before they can run on a PC. The compiled program file contains instructions which can be understood by the particular processor (or processor family) the program has been “coded” for:

    Figure 81. The program code produced has to match the CPU’s instruction set. Otherwise it cannot be run.

    The processors from AMD and Intel which we have been focusing on in this guide, are compatible, in that they understand the same instructions.

    There can be big differences in the way two processors, such as the Pentium and Pentium 4, process the instructions internally. But externally – from the programmer’s perspective – they all basically function the same way. All the processors in the PC family (regardless of manufacturer) can execute the same instructions and hence the same programs.

    And that’s precisely the advantage of the PC: Regardless of which PC you have, it can run the Windows programs you want to use.

    Figure 82. The x86 instruction set is common to all PC’s.

    As the years have passed, changes have been made in the instruction set along the way. A PC with a Pentium 4 processor from 2002 can handle very different applications to those which an IBM XT with an 8088 processor from 1985 can. But on the other hand, you can expect all the programs which could run on the 8088, to still run on a Pentium 4 and on a Athlon 64. The software is backwards compatible.

    The entire software industry built up around the PC is based on the common x86 instruction, which goes back to the earliest PC’s. Extensions have been made, but the original instruction set from 1979 is still being used.

    x86 and CISC

    People sometimes differentiate between RISC and CISC based CPU’s. The (x86) instruction set of the original Intel 8086 processor is of the CISC type, which stands for Complex Instruction Set Computer.

    That means that the instructions are quite diverse and complex. The individual instructions vary in length from 8 to 120 bits. It is designed for the 8086 processor, with just 29,000 transistors. The opposite of CISC, is RISC instructions.

    RISC stands for Reduced Instruction Set Computer, which is fundamentally a completely different type of instruction set to CISC. RISC instructions can all have the same length (e.g. 32 bits). They can therefore be executed much faster than CISC instructions. Modern CPU’s like the AthlonXP and Pentium 4 are based on a mixture of RISC and CISC.

    Figure 83. PC’s running Windows still work with the old fashioned CISC instructions.

    In order to maintain compatibility with the older DOS/Windows programs, the later CPU’s still understand CISC instructions. They are just converted to shorter, more RISC-like, sub-operations (called micro-ops), before being executed. Most CISC instructions can be converted into 2-3 micro-ops.

    Figure 84. The CISC instructions are decoded before being executed in a modern processor. This preserves compatibility with older software.

    Extensions to the instruction set

    For each new generation of CPU’s, the original instruction set has been extended. The 80386 processor added 26 new instructions, the 80486 added six, and the Pentium added eight new instructions.

    At the same time, execution of the instructions was made more efficient. For example, it took an 80386 processor six clock ticks to add one number to a running summation. This task could be done in the 80486 (see page 40), in just two clock ticks, due to more efficient decoding of the instructions.

    These changes have meant that certain programs require at least a 386 or a Pentium processor in order to run. This is true, for example, of all Windows programs. Since then, the MMX and SSE extensions have followed, which are completely new instruction sets which will be discussed later in the guide. They can make certain parts of program execution much more efficient.

    Another innovation is the 64-bit extension, which both AMD and Intel use in their top-processors. Normally the pc operates in 32-bit mode, but one way to improve the performance is using a 64-bit mode. This requires new software, which is not available yet.

  • Next chapter.
  • Previous chapter.