[Understanding Computer Systems in Depth] 6: AVR Architecture and Execution Environment
6.1. The Execution Environment
The architecture of a microprocessor consists of several modules that allow the execution of the instructions in its machine language. These circuits are too complex to be studied in detail. To give an idea of this complexity, the generic microprocessors used in personal computers in 1974 had around 6000 transistors, and in 2004 they contained approximately 50 million transistors. These microprocessors are now designed over periods of several years by large teams of designers. The following figure shows an image of an 8-bit microcontroller manufactured by Atmel Corporation similar to the one used in the Arduino platform.
By ZeptoBars CC-BY-3.0, via Wikimedia Commons
The chip in the previous figure measures 2855 x 2795 µm (one meter has 1,000,000 µm). That means that about 125,000 of these circuits fit in a square meter. The complexity of designing one of these circuits can be managed with the help of sophisticated technology that supports the whole cycle, from initial design to the required ultra-high level of component integration. The final product is a chip ready to be used in a circuit board as shown in the following figure.
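The figure of roughly 125,000 chips per square meter can be checked with a few lines of arithmetic. The sketch below is only a rough estimate: it divides areas and ignores any packing or cutting losses, and the function name is made up for this illustration.

```python
# Die dimensions quoted in the text, in micrometres.
DIE_WIDTH_UM = 2855
DIE_HEIGHT_UM = 2795
UM_PER_METRE = 1_000_000

def dies_per_square_metre(width_um, height_um):
    """Whole dies that fit in 1 m^2 by area alone (no packing losses)."""
    die_area_um2 = width_um * height_um
    square_metre_um2 = UM_PER_METRE ** 2
    return square_metre_um2 // die_area_um2

print(dies_per_square_metre(DIE_WIDTH_UM, DIE_HEIGHT_UM))  # 125317
```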
The term architecture is used to denote a generic description of a set of microcontrollers that are capable of executing the same machine instructions but that may differ in some characteristics such as the size of the memory, speed, etc. The AVR architecture refers to a family of microcontrollers that process most of their data in sizes of 8 bits, and it has a reduced set of instructions with fixed format.
In this document we will describe the basic elements of what is known as the AVR architecture. An in-depth study of this architecture is required to design advanced applications that make use of this chip. In that case, the most important source of information is what is known as the datasheet, which contains the description of all the aspects of the chip. You may check the Atmel 8-bit Microcontroller datasheet to see the type of information included in these documents. Section 7 of that document contains a block diagram of the CPU, the description of the register file, the stack, and how the instructions are executed.
In this section we will present a summary of this architecture to understand how the most common instructions are executed.
The execution environment of an AVR microcontroller refers to the elements used to execute instructions: type of data managed, data and status registers, available memory, etc. The units of information manipulated by a circuit are usually denoted by the following terms:
Name | Bits | Bytes |
---|---|---|
Byte | 8 bits | 1 byte |
Word | 16 bits | 2 bytes |
Doubleword | 32 bits | 4 bytes |
Quadword | 64 bits | 8 bytes |
Double Quadword | 128 bits | 16 bytes |
The following figure shows the relative sizes of these units of information and the nomenclature used to refer to their bytes. Bits are numbered starting at zero.
Information Sizes and Names
6.1.1. The Data-path
The main blocks of the AVR architecture are illustrated in the following figure.
Block Diagram of the AVR Architecture
You may download the Block Diagram of the AVR Architecture.
The blocks are all interconnected by an 8-bit data bus over which data can be exchanged between all the modules. The modules shown below the bus are in charge of executing the machine language, whereas those above the bus handle other tasks such as input/output, interrupts, timer events, etc.
6.1.2. Program and Data Memories
The AVR architecture contains two types of memory to store programs and data. Programs are stored in the Program Memory and the data that these programs manipulate is stored in the Data Memory. The machine instructions are read from the Program Memory and stored in the Instruction Register, where they are decoded so that the proper control signals can be sent to the rest of the components to execute the appropriate operation.
The program memory is a flash memory which is 16 bit (or 2 byte) addressable. That is, an address in this memory uniquely identifies a group of 16 bits. The size of this memory depends on the microcontroller model, but we will assume that it has 32 Kbytes. A size of 32 Kbytes means $2^{15}$ bytes, but since the memory is 2 byte addressable, addresses for this memory require 14 bits ($2^{14}$ locations by 2 bytes each, 32 Kbytes). Thus, the range of values for the addresses is from 0x0000 to 0x3FFF. The following figure illustrates this configuration.
The read and write operations in this memory are then carried out with 16 bits of data. The purpose of this memory is to store the code to execute, which is comprised of instructions of either 16 bits, occupying a single cell of this memory, or 32 bits, occupying two consecutive cells.
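The address range of the program memory follows from its size and the size of each addressable cell. The following sketch recomputes the numbers used in the text (32 Kbytes, 2 bytes per cell); the function name is made up for this illustration.

```python
PROGRAM_MEMORY_BYTES = 32 * 1024   # size assumed in the text
BYTES_PER_CELL = 2                 # each address names one 16-bit cell

def address_space(total_bytes, bytes_per_cell):
    """Return (number of cells, address bits needed, highest address)."""
    cells = total_bytes // bytes_per_cell
    address_bits = (cells - 1).bit_length()
    return cells, address_bits, cells - 1

cells, bits, last = address_space(PROGRAM_MEMORY_BYTES, BYTES_PER_CELL)
print(cells, bits, hex(last))  # 16384 14 0x3fff
```

Calling the same function with one byte per cell, `address_space(32 * 1024, 1)`, gives 15 address bits, which shows where the extra bit goes when the memory is addressed byte by byte.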
The data memory is a static RAM which is byte addressable. As in the previous case, the size of this memory varies depending on the model, but we will assume that its size is 2 Kbytes. As a consequence of this size, the range for address values should be from 0x000 to 0x7FF. However, the chip is designed so that the memory offers addresses from 0x000 to 0x8FF, and the first 256 positions, that is from 0x000 to 0x0FF, are used to access registers instead of RAM: the 32 General Purpose Registers, 64 input/output registers and, in models such as the ATmega328P, 160 extended input/output registers. The 2 Kbytes of RAM then occupy addresses 0x100 to 0x8FF. The following figure illustrates the structure of this memory and how the access to the 256 registers is mapped as the first positions.
With this implementation, accessing any of the 256 registers and accessing data at a given address in the data memory are exactly the same operation. This technique is known as memory-mapped input/output.
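One way to picture this mapping is as a function that takes a data address and says what it refers to. The sketch below uses the ranges given in the text (32 general purpose registers first, then 64 input/output registers, RAM up to 0x8FF). Treating the remaining positions below 0x100 as extended input/output registers is an assumption based on ATmega328P-class devices, and the function name is hypothetical.

```python
def classify_data_address(address):
    """Name the region of the data address space that `address` falls in."""
    if not 0x000 <= address <= 0x8FF:
        raise ValueError(f"address {address:#05x} is outside the data space")
    if address <= 0x01F:
        return f"general purpose register R{address}"
    if address <= 0x05F:
        # I/O register numbers start at 0, so subtract the 32 registers.
        return f"I/O register {address - 0x20:#04x}"
    if address <= 0x0FF:
        return "extended I/O register"   # assumption, see the note above
    return f"SRAM byte {address - 0x100}"

for a in (0x01F, 0x05D, 0x100, 0x8FF):
    print(hex(a), "->", classify_data_address(a))
```

The same load or store instruction reaches all four regions; only the address decides which one is touched.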
The architecture has a third memory called the EEPROM (Electrically Erasable Programmable Read-Only Memory), with a size between 256 bytes and 1 Kbyte, which is used to store information, such as configuration values, that does not need to be modified frequently.
6.1.3. Program Counter
The program counter is one of the most important registers in any processor. It contains the address in the program memory of the next instruction to be executed. The value of this register is modified by every instruction as it has to point to the next instruction to execute. Some special instructions modify this register in different ways, for example, the branching or jumping instructions. The following figure illustrates the way the program counter is used in the AVR architecture.
Access to Program Memory with the Program Counter
As will be explained in a later section, the first step in executing an instruction is to read it from memory and load it in the instruction register. That initial memory access is done with the program counter.
6.1.4. General Purpose Registers
The AVR architecture has 32 8-bit general purpose registers. This module is typically known as the register file. The registers are named with the prefix R and a number, thus ranging from R0 to R31. The size of these registers is one byte because the data memory is 1 byte addressable (each cell has an address and contains one byte). However, in some situations, the information stored in the register file is used to access the program memory and therefore 16 bits are required for the address or the data. To address this issue, the register file allows the last six registers, R26 to R31, to be treated as three 16-bit registers, each formed by concatenating two consecutive registers. Their names are X (R27:R26), Y (R29:R28) and Z (R31:R30) respectively.
Additionally, and as was explained in the previous section, the registers can also be accessed as if they were located in the lowest positions of the data memory, that is, addresses 0x000 to 0x01F. The following figure shows the structure of the register file and how the last six registers are treated as three 16-bit registers.
AVR Register File
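The pairing of the last six registers can be sketched with a list of 32 bytes and two helper functions. The helper names are invented for this illustration; the sketch assumes that the even-numbered register of each pair holds the low byte and the odd-numbered one holds the high byte.

```python
registers = [0] * 32          # R0..R31, one byte each

# Index of the low byte of each 16-bit pair; the high byte is the next register.
PAIR_LOW = {"X": 26, "Y": 28, "Z": 30}

def read_pair(name):
    """Concatenate the two 8-bit registers of a pair into a 16-bit value."""
    low = PAIR_LOW[name]
    return (registers[low + 1] << 8) | registers[low]

def write_pair(name, value):
    """Split a 16-bit value across the two 8-bit registers of a pair."""
    low = PAIR_LOW[name]
    registers[low] = value & 0xFF
    registers[low + 1] = (value >> 8) & 0xFF

write_pair("Z", 0x08FF)
print(hex(registers[30]), hex(registers[31]), hex(read_pair("Z")))  # 0xff 0x8 0x8ff
```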
6.1.5. Arithmetic/Logic Unit
The Arithmetic/Logic Unit (ALU) is a combinational circuit capable of performing operations of three types: arithmetic, logical, and bit level functions. The operands are obtained directly from the register file, or for some instructions, one from the register file and the second from the instruction register (a field inside the encoding of a machine instruction). In the most advanced models of the architecture, the ALU is also capable of multiplying two signed or unsigned integers.
Examples of arithmetic operations are addition and subtraction, with or without carry, both for signed and unsigned numbers. Examples of logical operations are bit-wise conjunction, disjunction, negation, one’s complement, two’s complement and exclusive or. Examples of bit level operations are arithmetic and logical shifts, rotations, and nibble swaps.
6.1.6. Status Register
During the execution of programs, several special situations must be recorded the moment they occur. This is usually done using the status register. These values are stored so that instructions at a later stage can consult them and perhaps make decisions based on them. For example, if an addition has produced a carry, this event might affect the execution of the program, so recording it in the status register means that subsequent operations may use this information. The way to offer this functionality is to refresh the content of the status register every time an operation is executed. The number of bits and conditions stored in this type of register is highly dependent on the architecture. Different architectures may have radically different status registers both in size and in the type of conditions being stored.
An analogy to understand the purpose of this register would be the notifications that appear in our mobile phone. Those icons inform us about some special condition that occurred internally in the device (a new message, battery fully charged, wireless coverage, etc.). In the context of a microcontroller, it is enough to store these values in a register and include machine instructions that allow us to check them and change the behaviour of a program.
The status register of the AVR architecture contains the following eight bits:
- Carry flag (C): indicates if a carry has occurred in the latest arithmetic or logic operation.
- Zero flag (Z): indicates if the result of the latest arithmetic or logic operation has been zero.
- Negative flag (N): indicates if the result of the latest arithmetic or logic operation has been negative (it is the most significant bit of the latest result).
- Two’s complement overflow flag (V): if true, indicates that the latest arithmetic or logic operation, if considered over integers encoded in two’s complement, has produced an overflow.
- Sign flag (S): the sign flag indicates the sign of the result of the latest arithmetic or logic operation. It is always the exclusive or between the negative flag and the two’s complement overflow flag (S = N ⊕ V).
- Half carry flag (H): indicates if a carry has occurred out of bit 3, that is, out of the lower half of the operand. Useful for Binary Coded Decimal (BCD) arithmetic.
- Bit copy storage (T): the source or destination for the bit that is the operand of bit copy operations BST and BLD.
- Global interrupt enable (I): if set, the interrupts in the microcontroller are processed. If zero, they are ignored.
The following figure illustrates the location of these bits in the status register.
Bits in the Status Register
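Since the status register is a single byte, each flag can be tested or set with a bit mask. The sketch below assumes the bit order documented in the AVR datasheets, with C in bit 0 and I in bit 7; the helper names are made up for this illustration.

```python
# Bit positions within the status register, least significant first.
SREG_BITS = {"C": 0, "Z": 1, "N": 2, "V": 3, "S": 4, "H": 5, "T": 6, "I": 7}

def pack_sreg(flags):
    """Build the 8-bit register value from a {flag name: bool} mapping."""
    value = 0
    for name, bit in SREG_BITS.items():
        if flags.get(name, False):
            value |= 1 << bit
    return value

def flag_is_set(sreg, name):
    """Check a single flag in an 8-bit status register value."""
    return bool(sreg & (1 << SREG_BITS[name]))

sreg = pack_sreg({"I": True, "Z": True})
print(f"{sreg:#010b}")  # 0b10000010
```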
6.2. The Execution Cycle
The execution cycle of a microprocessor in general refers to the internal steps to execute an instruction. The number of these steps and their duration vary significantly from processor to processor and are totally dependent on the architecture. Most of the techniques used to increase the performance of a circuit focus on how to optimise the execution cycle.
The execution of most of the instructions in the AVR architecture is divided into two steps:
- Instruction Fetch (IF). In this stage the instruction is obtained from the program memory and stored in the instruction register.
- Execution. The second step is where the instruction gets decoded and executed. For a typical instruction requiring an operation in the ALU, the second stage is further divided in the following sub-stages:
  - Register Operand Fetch (ROF)
  - ALU Execution (ALU)
  - Register Write Back (RWB)
The following figure shows the sequence of stages and sub-stages executed in the AVR microcontrollers.
Sequence of Stages in Instruction Execution
6.2.1. Instruction Fetch
In this stage the processor obtains the next instruction to execute from the program memory. The value stored in the program counter is given as the memory address and a read operation is started. The result of the operation is stored in the instruction register. At the same time, the program counter is automatically updated to point to the next memory location. The following figure shows a simplified version of the architecture with the components that are used during this stage.
At this stage, the controller finds out the type of instruction that is about to execute by processing the bits in the instruction register. The controller contains a sequential digital circuit (similar to a Finite State Machine) that receives the instruction as input and generates the appropriate control signals over the next clock cycles for the rest of the components in the datapath to perform the required operations. This sequential circuit is one of the most important components of the architecture, as its structure captures both the structure of the datapath (because it provides control signals to all its components) and the set of instructions (because it must be capable of interpreting all of them).
Some special instructions are 32 bits in size and therefore require an additional value from the program memory. In these cases a second fetch stage is executed, the value of the program counter is updated again, and the two additional bytes are stored in the instruction register. This extra stage takes the same time as the first one.
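The fetch stage, including the optional second fetch, can be sketched as a function that reads one or two cells and returns the updated program counter. The bit masks used to recognise the 32-bit encodings (LDS, STS, JMP and CALL) are taken from the published AVR instruction encodings as best understood here and should be checked against the instruction set manual. The function names are hypothetical, and the sketch only fetches: it does not execute the jump.

```python
def is_two_word(opcode):
    """True for the 32-bit encodings (LDS, STS, JMP, CALL)."""
    return (opcode & 0xFC0F) == 0x9000 or (opcode & 0xFE0C) == 0x940C

def fetch(program_memory, pc):
    """Return (instruction words, updated program counter)."""
    words = [program_memory[pc]]
    pc += 1                                  # point to the next cell
    if is_two_word(words[0]):
        words.append(program_memory[pc])     # second fetch stage
        pc += 1
    return words, pc

# NOP, then JMP 0x0000 (two cells), then NOP.
memory = [0x0000, 0x940C, 0x0000, 0x0000]
pc = 0
while pc < len(memory):
    words, pc = fetch(memory, pc)
    print([f"{w:#06x}" for w in words], "next pc =", pc)
```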
6.2.2. Register Operand Fetch
The following three stages are valid only for those instructions that require the use of the ALU and a result written back to the register file. Other instructions have different stages but with similar timing.
During the register operand fetch the operands that are required by the instruction are obtained from various sources. Most of the instructions use the values stored in the register file. This is the reason why this module has been designed with the following modes of operation with respect to inputs and outputs (to be processed by the ALU):
- One 8-bit output, one 8-bit result input
- Two 8-bit outputs, one 8-bit result input
- Two 8-bit outputs, one 16-bit result input
- One 16-bit output, one 16-bit result input
For example, if an operation requires two 8-bit values stored in two different registers, the register file can output both of them simultaneously into the ALU. For the same instruction, an 8-bit value can be written in another register.
Some instructions contain one operand that is not in a register but is part of the instruction itself. A pre-defined subset of the instruction bits is extracted and used as operand. The AVR Instruction Set Architecture contains the description of which combinations of operands are allowed for each instruction.
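As an illustration of an operand carried inside the instruction, the sketch below extracts the destination register and the 8-bit constant from the encoding of LDI (load immediate), whose 16-bit layout is `1110 KKKK dddd KKKK`. The layout is quoted from the AVR instruction set documentation as best recalled here; the function name is made up.

```python
def decode_ldi(opcode):
    """Split an LDI encoding (1110 KKKK dddd KKKK) into (register, constant)."""
    if (opcode & 0xF000) != 0xE000:
        raise ValueError("not an LDI encoding")
    register = 16 + ((opcode >> 4) & 0x0F)                # dddd selects R16..R31
    constant = ((opcode >> 4) & 0xF0) | (opcode & 0x0F)   # join the two K fields
    return register, constant

print(decode_ldi(0xEF0F))  # (16, 255)
```

Only four bits are available for the register number, which is why this instruction can only target R16 to R31.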
The instructions that read or write data from the Data Memory are special cases with respect to this stage. The operands required for their execution are used to calculate the address of the position in memory from which the data will be read or to which the data will be written.
The following figure shows the simplified data-path with the components that play a role in this stage.
6.2.3. ALU Operation Execute
This stage is only present in those instructions that require an operation to be performed by the ALU. The control signals select the appropriate operation and the result is produced at the output (the ALU is a combinational circuit).
The status register is updated to reflect the conditions of the result just obtained. More precisely, the following bits in the register are changed:
- Carry Flag: if a carry has occurred in the latest operation
- Zero Flag: if the result of the latest operation is zero
- Negative Flag: if the result of the latest operation has the sign bit to one
- Two’s Complement Overflow Flag: if the overflow condition for two’s complement is satisfied, that is, if the carry into the most significant bit is different from the carry out of the most significant bit.
- Sign Bit: the exclusive or between the Negative Flag and the Two’s Complement Overflow Flag.
- Half Carry Flag: if there is a carry out of bit 3 (the fourth bit) of the result of the latest operation.
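The conditions in the list above can be made concrete for an 8-bit addition. The following is a minimal sketch, assuming both operands are already in the range 0 to 255; it models only addition, and the function name is invented for this illustration.

```python
def add8(a, b):
    """Add two bytes; return (result, flags) with each flag as 0 or 1."""
    total = a + b
    result = total & 0xFF
    carry_out_msb = total >> 8                          # carry out of bit 7
    carry_into_msb = ((a & 0x7F) + (b & 0x7F)) >> 7     # carry into bit 7
    n = result >> 7                                     # sign bit of the result
    v = carry_into_msb ^ carry_out_msb                  # two's complement overflow
    flags = {
        "C": carry_out_msb,
        "Z": int(result == 0),
        "N": n,
        "V": v,
        "S": n ^ v,
        "H": ((a & 0x0F) + (b & 0x0F)) >> 4,            # carry out of bit 3
    }
    return result, flags

print(add8(0x7F, 0x01))
print(add8(0xFF, 0x01))
```

The first call adds 1 to the largest positive two's complement byte, so the result looks negative and V is raised. The second call wraps around to zero, raising C and Z but not V.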
The following figure shows the subset of blocks in the architecture that are active in this stage.
6.2.4. Result Write Back
This stage is not present in all the instructions, only in those whose ALU result from the previous stage needs to be stored in the register file. The ALU writes the result onto the data bus. At the same time the register file selects the destination for the result, reads the value from the bus, and selects the write operation.
The following figure shows the subset of blocks in the architecture that are active in this stage.
6.2.5. Pipelined execution
To increase the overall performance of the microcontroller, the architecture has been designed following a technique called 2-stage pipelining. The idea is to execute the two main stages of two consecutive instructions in parallel to increase speed. In other words, the execution stage of an instruction (containing ROF, ALU and RWB) is done in parallel with the fetch of the next instruction in sequence. This technique is similar to the concept applied in a production line. For example, cars are typically produced in stages. A single car goes through all the stages in sequence until it is finished; however, the plant as a whole is working on as many cars as there are stages in the production line. This parallelism increases the production rate compared with the scenario in which the next car is not started until the previous one is finished.
The following figure illustrates this principle in the AVR architecture.
Pipelined Execution in the AVR Architecture
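The gain from overlapping the two stages can be estimated by counting clock cycles. The sketch below assumes that each stage takes exactly one clock cycle and that no instruction breaks the sequence (no jumps, no 32-bit instructions); under those assumptions, n instructions need 2n cycles without overlap and n + 1 cycles with it. The function names are made up for this illustration.

```python
def cycles_sequential(n_instructions):
    """Fetch and execute never overlap: two cycles per instruction."""
    return 2 * n_instructions

def cycles_pipelined(n_instructions):
    """After the first fetch, each cycle executes one instruction
    while fetching the next one."""
    return n_instructions + 1 if n_instructions > 0 else 0

def timeline(n_instructions):
    """One text row per instruction; each 3-character column is one cycle."""
    return ["   " * i + "IF EX" for i in range(n_instructions)]

for row in timeline(3):
    print(row)
print(cycles_sequential(3), "cycles without overlap,",
      cycles_pipelined(3), "with the pipeline")
```

In the printed timeline, the EX of each row lines up with the IF of the row below it, which is the overlap the text describes.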
Pipelining is now a technique that is used by most of the commercial microcontrollers and processors. However, the execution of several instructions in parallel significantly increases the complexity of the design of the overall circuit, as the control signals also need to be generated in parallel while taking into account instructions that may break the normal sequence of instructions (jumps).
6.3. The Stack
Aside from the architectural blocks described in the previous section, most of the microcontrollers offer the possibility to manipulate a basic data structure stored in memory called the stack. The stack is an area in the data memory in which data can be read and written with special instructions (aside from the normal ones). This area has a special location called the top of the stack. The processor includes two machine instructions to perform the operations of pushing data from a location to the stack and popping data from the stack to a location. In the AVR architecture, the instructions are restricted to use registers, and the data size is always 1 byte.
6.3.1. Stack Instructions
The instruction to put a byte of data on top of the stack has the format push rn (where rn is any of the general purpose registers). The effect is to store the content of the register at the top of the stack. Let us assume that the top of the stack is in address @top. The effect of the instruction push rn is:
- The value of @top is decremented by one (@top = @top - 1).
- The given register is written in the memory location now indicated by @top. The data previously stored in this memory location is lost.
The following figure shows the state of the stack before and after executing the instruction push rn.
Effect of the Instruction push rn
The push operation is similar to a write operation in the data memory but with a specific address.
The analogous instruction to remove data from the stack is pop rn (where rn is any of the general purpose registers). The effect is to store the content at the top of the stack in the register and adjust the address of the top. If we assume that the top of the stack is in address @top, the effect of pop rn is:
- The value pointed to by @top is read from the data memory and written in the given register.
- The value of @top is incremented by one (@top = @top + 1).
The following figure shows the state of the stack before and after executing the instruction pop rn.
Effect of the Instruction pop rn
The pop operation is similar to a read operation in the data memory but with a specific address. Note that the value stored in the register is still present in data memory, as it has only been read.
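The two instructions can be modelled with a small class that follows the steps in the order listed above. This is a sketch of the description in the text, not of the silicon: the class name, the memory size and the starting value of the top are arbitrary choices for this illustration, and the exact ordering of write and pointer update in the hardware should be taken from the instruction set manual.

```python
class Stack:
    def __init__(self, memory_size, top):
        self.memory = [0] * memory_size
        self.top = top                          # plays the role of @top

    def push(self, value):
        self.top -= 1                           # step 1: @top = @top - 1
        self.memory[self.top] = value & 0xFF    # step 2: overwrite that cell

    def pop(self):
        value = self.memory[self.top]           # step 1: read the cell at @top
        self.top += 1                           # step 2: @top = @top + 1
        return value

stack = Stack(memory_size=16, top=8)
stack.push(0xAB)
stack.push(0xCD)
print(hex(stack.pop()), hex(stack.pop()), stack.top)  # 0xcd 0xab 8
print(hex(stack.memory[7]))                           # 0xab, still in memory
```

The last line shows the point made above: popping only moves the top, so the byte stays in memory until a later push overwrites it.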
Some processors have machine languages that allow operands of the stack instructions other than the general purpose registers.
Aside from the stack instructions, the instructions to call and return from a subroutine also use the stack as described in the following table:
Instruction | Description |
---|---|
CALL, ICALL, RCALL | Instructions to call a subroutine. The return address is stored in two consecutive positions in the stack. The stack pointer is thus decremented by two. |
RET, RETI | Instructions to return from a subroutine. They read the return address from two positions in the stack. The stack pointer is thus incremented by two. |
6.3.2. The Stack Pointer
The stack instructions have what is known as an implicit operand. The instruction push does not specify the destination operand, and the instruction pop does not specify the source operand. In both cases, it is assumed to be the element at the top of the stack. This means that the processor must keep the address of this position in a register. This register is known as the stack pointer. In the AVR architecture this register is part of the generic Input/Output registers and the value is stored in the registers with addresses 0x3E and 0x3D. Two registers are needed because memory addresses are represented by 16 bits and each register holds only eight. The 16 bit address is stored as shown in the following table:
Register name | Address | Content |
---|---|---|
SPH | 0x3E | Most significant 8 bits of the address of the top of the stack. |
SPL | 0x3D | Least significant 8 bits of the address of the top of the stack. |
The following figure shows the effect of two stack instructions in the stack pointer.
Effect of Stack Instructions in the Stack Pointer
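Splitting the 16-bit address across the two registers in the table is a matter of shifting and masking. The sketch below models the 64 input/output registers as a list and uses the two addresses from the table; the function names are made up, and the value 0x08FF is simply the highest data address mentioned earlier.

```python
SPH_ADDRESS = 0x3E           # most significant byte of the stack pointer
SPL_ADDRESS = 0x3D           # least significant byte of the stack pointer
io_registers = [0] * 64      # the 64 input/output registers

def set_stack_pointer(address):
    """Store a 16-bit address as two separate bytes."""
    io_registers[SPH_ADDRESS] = (address >> 8) & 0xFF
    io_registers[SPL_ADDRESS] = address & 0xFF

def get_stack_pointer():
    """Rebuild the 16-bit address from the two bytes."""
    return (io_registers[SPH_ADDRESS] << 8) | io_registers[SPL_ADDRESS]

set_stack_pointer(0x08FF)
print(hex(io_registers[SPH_ADDRESS]), hex(io_registers[SPL_ADDRESS]))  # 0x8 0xff
print(hex(get_stack_pointer()))                                        # 0x8ff
```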
6.3.3. Stack Initialisation
There is one more issue that has not been addressed so far. If a microcontroller offers the space in memory and a special register to implement a stack, how is this stack initialised? This initialisation requires two steps: reserve memory specifically for the stack, and set the correct value in the stack pointer. These tasks are typically performed either by the operating system or by code not explicitly written by the user.
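In the AVR architecture the stack pointer is typically initialised at the last position in data memory. It is assumed that the stack is empty and therefore the pointer will decrease its value as the stack grows in content. Strictly speaking, the AVR hardware keeps the stack pointer at the first free position just below the top (push writes the byte and then decrements the pointer, pop increments the pointer and then reads), which is why the last position itself can hold the first byte pushed.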
The stack is used to store temporary data during short time intervals. For this reason, the space that is typically reserved is rather small. If an unusually large amount of data is pushed to the stack, its space may run outside of its bounds producing what is known as a stack overflow. This type of event typically points to an execution anomaly in a program.