Chapter 2 Instructions: Language of the Computer
I speak Spanish to God, Italian to women, French to men, and German to my horse.
Charles V, Holy Roman Emperor (1500-1558)
Instructions: Language of the Computer
2.1 Introduction
2.2 Operations of the Computer Hardware
2.3 Operands of the Computer Hardware
2.4 Signed and Unsigned Numbers
2.5 Representing Instructions in the Computer
2.6 Logical Operations
2.7 Instructions for Making Decisions
2.8 Supporting Procedures in Computer Hardware
2.9 MIPS Addressing for 32-Bit Immediates and Addresses
2.10 Parallelism and Instructions: Synchronization
2.11 Translating and Starting a Program
2.12 A C Sort Example to Put It All Together
2.13 Advanced Material: Compiling C and Interpreting Java
2.14 Real Stuff: ARMv7 (32-bit) Instructions
2.15 Real Stuff: x86 Instructions
2.16 Real Stuff: ARMv8 (64-bit) Instructions
2.17 Fallacies and Pitfalls
2.18 Concluding Remarks
2.19 Historical Perspective and Further Reading
2.20 Exercises
The Five Classic Components of a Computer
2.1 Introduction
To command a computer's hardware, you must speak its language. The words of a computer's language are called instructions, and its vocabulary is called an instruction set. In this chapter, you will see the instruction set of a real computer, both in the form written by people and in the form read by the computer. We introduce instructions in a top-down fashion. Starting from a notation that looks like a restricted programming language, we refine it step-by-step until you see the real language of a real computer. Chapter 3 continues our downward descent, unveiling the hardware for arithmetic and the representation of floating-point numbers.
You might think that the languages of computers would be as diverse as those of people, but in reality computer languages are quite similar, more like regional dialects than like independent languages. Hence, once you learn one, it is easy to pick up others.
The chosen instruction set comes from MIPS Technologies, and it is an elegant example of the instruction sets designed since the 1980s. To demonstrate how easy it is to pick up other instruction sets, we will take a quick look at three other popular instruction sets:
1. ARMv7 is similar to MIPS. More than 9 billion chips with ARM processors were manufactured in 2011, making it the most popular instruction set in the world.
2. The second example is the Intel x86, which powers both the PC and the cloud of the PostPC Era.
3. The third example is ARMv8, which extends the address size of ARMv7 from 32 bits to 64 bits. Ironically, as we shall see, this 2013 instruction set is closer to MIPS than it is to ARMv7.
This similarity of instruction sets occurs because all computers are constructed from hardware technologies based on similar underlying principles and because there are a few basic operations that all computers must provide. Moreover, computer designers have a common goal: to find a language that makes it easy to build the hardware and the compiler while maximizing performance and minimizing cost and energy. This goal is time-honored; the following quote was written before you could buy a computer, and it is as true today as it was in 1947:
It is easy to see by formal-logical methods, that there exist certain [instruction sets] that are in abstract adequate to control and cause the execution of any sequence of operations... The really decisive considerations from the present point of view, in selecting an [instruction set], are more of a practical nature: simplicity of equipment demanded by the [instruction set], and the clarity of its application to the actually important problems together with the speed of its handling of those problems.
Burks, Goldstine, and von Neumann, 1947
The "simplicity of the equipment" is as valuable a consideration for today's computers as it was for those of the 1950s. The goal of this chapter is to teach an instruction set that follows this advice, showing both how it is represented in hardware amd the relationship between high-level programming languages and this more primitive one. We examples are in the C programming language; Section 2.13 shows how these would change for an object-oriented language like Java.
By learning how to represent instructions, you will also discover the secret of computing: the stored-program concept. Moreover, you will exercise your “foreign language” skills by writing programs in the language of the computer and running them on the simulator that comes with this book. You will also see the impact of programming languages and compiler optimization on performance. We conclude with a look at the historical evolution of instruction sets and an overview of other computer dialects.
We reveal our first instruction set a piece at a time, giving the rationale along with the computer structures. This top-down, step-by-step tutorial weaves the components with their explanations, making the computer's language more palatable. Figure 2.1 gives a sneak preview of the instruction set covered in this chapter.
2.2 Operations of the Computer Hardware
Every computer must be able to perform arithmetic. The MIPS assembly language notation:
add a, b, c
instructs a computer to add the two variables b and c and put their sum in a.
This notation is rigid in that each MIPS arithmetic instruction performs only one operation and must always have exactly three variables. For example, suppose we want to place the sum of four variables b, c, d, and e into variable a. (In this section we are being deliberately vague about what a "variable" is; in the next section we'll explain in detail.)
The following sequence of instructions adds the four variables:
add a, b, c # The sum of b and c is placed in a
add a, a, d # The sum of b, c, and d is now in a
add a, a, e # The sum of b, c, d, and e is now in a
Thus, it takes three instructions to sum the four variables.
The words to the right of the sharp symbol (#) on each line above are comments for the human reader, so the computer ignores them. Note that unlike other programming languages, each line of this language can contain at most one instruction. Another difference from C is that comments always terminate at the end of a line.
The natural number of operands for an operation like addition is three: the two numbers being added together and a place to put the sum. Requiring every instruction to have exactly three operands, no more and no less, conforms to the philosophy of keeping the hardware simple: hardware for a variable number of operands is more complicated than hardware for a fixed number. This situation illustrates the first of three underlying principles of hardware design:
Design Principle 1: Simplicity favors regularity.
We can now show, in the two examples that follow, the relationship of programs written in higher-level programming languages to programs in this more primitive notation.
Compiling Two C Assignment Statements into MIPS
This segment of a C program contains the five variables a, b, c, d, and e. Since Java evolved from C, this example and the next few work for either high-level programming language:
a = b + c;
d = a - e;
The translation from C to MIPS assembly language instructions is performed by the compiler. Show the MIPS code produced by a compiler.
A MIPS instruction operates on two source operands and places the result in one destination operand. Hence, the two simple statements above compile directly into these two MIPS assembly language instructions:
add a, b, c
sub d, a, e
Compiling a Complex C Assignment into MIPS
A somewhat complex statement contains the five variables f, g, h, i, and j:
f = (g + h) - (i + j);
What might a compiler produce?
The compiler must break this statement into several assembly instructions, since only one operation is performed per MIPS instruction. The first MIPS instruction calculates the sum of g and h. We must place the result somewhere, so the compiler creates a temporary variable, called t0:
add t0, g, h # temporary variable t0 contains g + h
Although the next operation is subtract, we need to calculate the sum of i and j before we can subtract. Thus, the second instruction places the sum of i and j in another temporary variable created by the compiler, called t1:
add t1, i, j # temporary variable t1 contains i + j
Finally, the subtract instruction subtracts the second sum from the first and places the difference in the variable f, completing the compiled code:
sub f, t0, t1 # f gets t0 - t1, which is (g + h) - (i + j)
Check Yourself: For a given function, which programming language likely takes the most lines of code? Put the three representations below in order.
1. Java
2. C
3. MIPS assembly language
Elaboration: To increase portability, Java was originally envisioned as relying on a software interpreter. The instruction set of this interpreter is called Java bytecodes (see Section 2.13), which is quite different from the MIPS instruction set. To get performance close to the equivalent C program, Java systems today typically compile Java bytecodes into the native instruction set, like MIPS. Because this compilation is normally done much later than for C programs, such Java compilers are often called Just In Time (JIT) compilers. Section 2.11 shows how JITs are used later than C compilers in the start-up process, and Section 2.12 shows the performance consequences of compiling versus interpreting Java programs.
2.3 Operands of the Computer Hardware
Unlike programs in high-level languages, the operands of arithmetic instructions are restricted; they must be from a limited number of special locations built directly in hardware called registers. Registers are primitives used in hardware design that are also visible to the programmer when the computer is completed, so you can think of registers as the bricks of computer construction. The size of a register in the MIPS architecture is 32 bits; groups of 32 bits occur so frequently that they are given the name word in the MIPS architecture.
One major difference between the variables of a programming language and registers is the limited number of registers, typically 32 on current computers, like MIPS. (See Section 2.19 for the history of the number of registers.) Thus, continuing in our top-down, stepwise evolution of the symbolic representation of the MIPS language, in this section we have added the restriction that the three operands of MIPS arithmetic instructions must each be chosen from one of the 32 32-bit registers.
The reason for the limit of 32 registers may be found in the second of our three underlying design principles of hardware technology:
Design Principle 2: Smaller is faster.
A very large number of registers may increase the clock cycle time simply because it takes electronic signals longer when they must travel farther.
Guidelines such as "smaller is faster" are not absolutes; 31 registers may not be faster than 32. Yet, the truth behind such observations causes computer designers to take them seriously. In this case, the designer must balance the craving of programs for more registers with the designer's desire to keep the clock cycle fast. Another reason for not using more than 32 is the number of bits it would take in the instruction format, as Section 2.5 demonstrates.
Chapter 4 shows the central role that registers play in hardware construction; as we shall see in this chapter, effective use of registers is critical to program performance.
Although we could simply write instructions using numbers for registers, from 0 to 31, the MIPS convention is to use two-character names following a dollar sign to represent a register. Section 2.8 will explain the reasons behind these names. For now, we will use $s0, $s1, ... for registers that correspond to variables in C and Java programs and $t0, $t1, ... for temporary registers needed to compile the program into MIPS instructions.
Compiling a C Assignment Using Registers
It is the compiler's job to associate program variables with registers. Take, for instance, the assignment statement from our earlier example:
f = (g + h) - (i + j);
The variables f, g, h, i, and j are assigned to the registers $s0, $s1, $s2, $s3, and $s4, respectively. What is the compiled MIPS code?
The compiled program is very similar to the prior example, except that we replace the variables with the register names mentioned above plus two temporary registers, $t0 and $t1, which correspond to the temporary variables above:
add $t0, $s1, $s2 # register $t0 contains g+h
add $t1, $s3, $s4 # register $t1 contains i+j
sub $s0, $t0, $t1 # f gets $t0 - $t1, which is (g+h)-(i+j)
Memory Operands
Programming languages have simple variables that contain single data elements, as in these examples, but they also have more complex data structures, such as arrays and structures. These complex data structures can contain many more data elements than there are registers in a computer. How can a computer represent and access such large structures?
Recall the five components of a computer introduced in Chapter 1 and repeated on page 61. The processor can keep only a small amount of data in registers, but computer memory contains billions of data elements. Hence, data structures (arrays and structures) are kept in memory.
As explained above, arithmetic operations occur only on registers in MIPS instructions; thus, MIPS must include instructions that transfer data between memory and registers. Such instructions are called data transfer instructions. To access a word in memory, the instruction must supply the memory address. Memory is just a large, single-dimensional array, with the address acting as the index to that array, starting at 0. For example, in Figure 2.2, the address of the third data element is 2, and the value of Memory[2] is 10.
The data transfer instruction that copies data from memory to a register is traditionally called load. The format of the load instruction is the name of the operation followed by the register to be loaded, then a constant and register used to access memory. The sum of the constant portion of the instruction and the contents of the second register forms the memory address. The actual MIPS name for this instruction is lw, standing for load word.
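To make this format concrete, here is a minimal sketch; the particular registers and constant below are illustrative placeholders rather than part of the running examples:
lw $t0, 8($s2) # Register $t0 gets the word at memory address $s2 + 8
Here $t0 is the register to be loaded, 8 is the constant, and $s2 is the register whose contents supply the base of the address.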
Compiling an Assignment When an Operand Is in Memory
Let's assume that A is an array of 100 words and that the compiler has associated the variables g and h with the registers $s1 and $s2 as before. Let's also assume that the starting address, or base address, of the array is in $s3. Compile this C assignment statement:
g = h + A[8];
Although there is a single operation in this assignment statement, one of the operands is in memory, so we must first transfer A[8] to a register. The address of this array element is the sum of the base of the array A, found in register $s3, plus the number to select element 8. The data should be placed in a temporary register for use in the next instruction. Based on Figure 2.2, the first compiled instruction is
lw $t0, 8($s3) # Temporary reg $t0 gets A[8]
(We'll be making a slight adjustment to this instruction, but we'll use this simplified version for now.) The following instruction can operate on the value in $t0 (which equals A[8]) since it is in a register. The instruction must add h (contained in $s2) to A[8] (contained in $t0) and put the sum in the register corresponding to g (associated with $s1):
add $s1, $s2, $t0 # g = h + A[8]
The constant in a data transfer instruction (8) is called the offset, and the register added to form the address ($s3) is called the base register.
In addition to associating variables with registers, the compiler allocates data structures like arrays and structures to locations in memory. The compiler can then place the proper starting address into the data transfer instructions.
Since 8-bit bytes are useful in many programs, virtually all architectures today address individual bytes. Therefore, the address of a word matches the address of one of the 4 bytes within the word, and addresses of sequential words differ by 4. For example, Figure 2.3 shows the actual MIPS addresses for the words in Figure 2.2; the byte address of the third word is 8.
In MIPS, words must start at addresses that are multiples of 4. This requirement is called an alignment restriction, and many architectures have it. (Chapter 4 suggests why alignment leads to faster data transfers.)
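As a brief sketch of what the restriction means in practice (the registers and offsets here are hypothetical, and we assume $s3 holds a word-aligned address):
lw $t0, 0($s3) # fine: the effective address is a multiple of 4
lw $t1, 4($s3) # fine: the next word, 4 bytes later
# lw $t2, 6($s3) would violate the alignment restriction, since the effective address is not a multiple of 4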
Computers divide into those that use the address of the leftmost or "big end" byte as the word address versus those that use the rightmost or "little end" byte. MIPS is in the big-endian camp. Since the order matters only if you access the identical data both as a word and as four bytes, few need to be aware of the endianness. (Appendix A shows the two options to number bytes in a word.)
Byte addressing also affects the array index. To get the proper byte address in the code above, the offset to be added to the base register $s3 must be 4 × 8, or 32, so that the load address will select A[8] and not A[8/4]. (See the related pitfall in Section 2.17.)
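This is the slight adjustment promised earlier; a sketch of the adjusted sequence for g = h + A[8], using the same registers as before, is:
lw $t0, 32($s3) # Temporary reg $t0 gets A[8]; the byte offset is 4 × 8 = 32
add $s1, $s2, $t0 # g = h + A[8]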
The instruction complementary to load is traditionally called store; it copies data from a register to memory. The format of a store is similar to that of a load: the name of the operation, followed by the register to be stored, then the offset to select the array element, and finally the base register. Once again, the MIPS address is specified in part by a constant and in part by the contents of a register. The actual MIPS name is sw, standing for store word.
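For example, here is a sketch of loading an array element, adding to it, and storing the result back into a different element; the registers are as in the earlier example, and the destination element A[12], with byte offset 4 × 12 = 48, is chosen purely for illustration:
lw $t0, 32($s3) # Temporary reg $t0 gets A[8]
add $t0, $s2, $t0 # Temporary reg $t0 gets h + A[8]
sw $t0, 48($s3) # Stores h + A[8] into A[12]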