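Differences Between Processor I/O Addressing Schemes and Instruction/Data Handling
Every peripheral is controlled by reading and writing its registers. Peripheral registers, also called I/O ports, usually fall into three classes: control registers, status registers, and data registers. CPUs can be divided into two groups according to how they access these registers. One group (ARM, MIPS, M68K, PowerPC, and others) treats the registers as part of memory: they share a single address space with memory and are accessed with ordinary memory instructions, so these CPUs have no dedicated device I/O instructions. This is the "I/O memory" (memory-mapped I/O) scheme. The other group (x86 is the typical example) places peripheral registers in a separate address space. Memory instructions cannot reach these registers, so dedicated instructions such as IN and OUT are provided to read and write them. This is the "I/O port" (port-mapped I/O) scheme.
The following uses one x86 implementation, the Zet processor, as an example; the project page is http://zet.aluzina.org/index.php/Zet_processor. The CPU's interface is as follows: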
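The contrast between the two schemes can be sketched with a small behavioral model. This is only an illustration, not code from any of the CPUs mentioned: the class names, the register window base `io_base`, and the sizes are hypothetical.

```python
class MemoryMappedSystem:
    """Unified address space: device registers sit at ordinary addresses."""

    def __init__(self, ram_size, io_base, io_regs):
        self.ram = bytearray(ram_size)
        self.io_base = io_base              # hypothetical start of the register window
        self.io_regs = bytearray(io_regs)   # device registers

    def load(self, addr):
        # The same load instruction reaches RAM or a device register,
        # depending only on the address.
        if addr >= self.io_base:
            return self.io_regs[addr - self.io_base]
        return self.ram[addr]

    def store(self, addr, value):
        if addr >= self.io_base:
            self.io_regs[addr - self.io_base] = value & 0xFF
        else:
            self.ram[addr] = value & 0xFF


class PortMappedSystem:
    """Separate spaces: load/store see only RAM, inp/out see only ports."""

    def __init__(self, ram_size, port_count):
        self.ram = bytearray(ram_size)
        self.ports = bytearray(port_count)

    def load(self, addr):
        return self.ram[addr]

    def store(self, addr, value):
        self.ram[addr] = value & 0xFF

    def inp(self, port):
        # Models the role of the x86 IN instruction.
        return self.ports[port]

    def out(self, port, value):
        # Models the role of the x86 OUT instruction.
        self.ports[port] = value & 0xFF
```

In the port-mapped model, address 0x10 in memory and port 0x10 are two unrelated locations, which is why a separate pair of instructions is needed to reach the second one.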
`include "defines.v"

module zet_core (
    input         clk,
    input         rst,

    // interrupts
    input         intr,
    output        inta,
    input         nmi,
    output        nmia,

    // interface
    output [19:0] cpu_adr_o,
    input  [15:0] iid_dat_i,
    input  [15:0] cpu_dat_i,
    output [15:0] cpu_dat_o,
    output        cpu_byte_o,
    input         cpu_block,
    output        cpu_mem_op,
    output        cpu_m_io,
    output        cpu_we_o,

    output [19:0] pc  // for debugging purposes
  );
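From the bus interface definition above we can see that this CPU has a separate I/O address space but no separate set of I/O bus wires. Data in, data out, address out, and write enable (cpu_dat_i, cpu_dat_o, cpu_adr_o, cpu_we_o) are shared by memory and I/O cycles, and cpu_m_io indicates whether the current bus operation targets I/O or memory. In other words, the memory bus and the I/O bus are time-multiplexed on one set of signals. The alternative is not to multiplex at all and to give each space its own set of bus signals. This project is compatible with the traditional 8086 architecture, so it multiplexes.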
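The role of cpu_m_io can be sketched as a routing decision on a single set of bus lines. This is a minimal behavioral sketch, not Zet code: the function name and the dictionaries are hypothetical, and the polarity (1 selects I/O, 0 selects memory) is an assumption that is not confirmed by the interface listing alone.

```python
def route_cycle(adr, dat, we, m_io, memory, io_ports):
    """Perform one bus cycle on the shared address/data/write-enable lines.

    Assumption: m_io == 1 selects the I/O space, m_io == 0 selects memory.
    `memory` and `io_ports` are plain dicts standing in for the two spaces.
    Returns the value read, or None for a write cycle.
    """
    target = io_ports if m_io else memory
    if we:
        target[adr] = dat & 0xFFFF   # 16-bit data bus, as in cpu_dat_o
        return None
    return target.get(adr, 0)        # unwritten locations read as 0 in this model


memory, io_ports = {}, {}
route_cycle(0x3F8, 0x0041, we=1, m_io=1, memory=memory, io_ports=io_ports)  # I/O write
route_cycle(0x3F8, 0x1234, we=1, m_io=0, memory=memory, io_ports=io_ports)  # memory write
```

The same address value appears on the bus in both cycles; only the select line decides which space receives the write.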
module zet_exec (
    input         clk,
    input         rst,

    input [`IR_SIZE-1:0] ir,
    input  [15:0] off,
    input  [15:0] imm,
    output [15:0] cs,
    output [15:0] ip,
    output        of,
    output        zf,
    output        cx_zero,
    input  [15:0] memout,

    output [15:0] wr_data,
    output [19:0] addr,
    output        we,
    output        m_io,
    output        byteop,
    input         block,
    output        div_exc,
    input         wrip0,

    output        ifl,
    output        tfl,
    output        wr_ss
  );

  // Net declarations
  wire [15:0] c;
  wire [15:0] omemalu;
  wire [ 3:0] addr_a;
  wire [ 3:0] addr_c;
  wire [ 3:0] addr_d;
  wire [ 8:0] flags;
  wire [15:0] a, b, s, alu_iflags, bus_b;
  wire [31:0] aluout;
  wire [3:0]  addr_b;
  wire [2:0]  t, func;
  wire [1:0]  addr_s;
  wire        wrfl, high, memalu, r_byte, c_byte;
  wire        wr, wr_reg;
  wire        wr_cnd;
  wire        jmp;
  wire        b_imm;
  wire [8:0]  iflags, oflags;
  wire [4:0]  logic_flags;
  wire        alu_word;
  wire        a_byte;
  wire        b_byte;
  wire        wr_high;
  wire        dive;

  // Module instances
  zet_alu alu( {c, a }, bus_b, aluout, t, func, alu_iflags, oflags,
               alu_word, s, off, clk, dive);
  zet_regfile regfile (
    a, b, c, cs, ip, {aluout[31:16], omemalu}, s, flags, wr_reg, wrfl,
    wr_high, clk, rst, addr_a, addr_b, addr_c, addr_d, addr_s, iflags,
    ~byteop, a_byte, b_byte, c_byte, cx_zero, wrip0);
  zet_jmp_cond jmp_cond (logic_flags, addr_b, addr_c[0], c, jmp);

  // Assignments
  assign addr_s = ir[1:0];
  assign addr_a = ir[5:2];
  assign addr_b = ir[9:6];
  assign addr_c = ir[13:10];
  assign addr_d = ir[17:14];
  assign wrfl   = ir[18];
  assign we     = ir[19];
  assign wr     = ir[20];
  assign wr_cnd = ir[21];
  assign high   = ir[22];
  assign t      = ir[25:23];
  assign func   = ir[28:26];
  assign byteop = ir[29];
  assign memalu = ir[30];
  assign m_io   = ir[32];
  assign b_imm  = ir[33];
  assign r_byte = ir[34];
  assign c_byte = ir[35];

  assign omemalu = memalu ? aluout[15:0] : memout;
  assign bus_b   = b_imm ? imm : b;

  assign addr    = aluout[19:0];
  assign wr_data = c;
  assign wr_reg  = (wr | (jmp & wr_cnd)) && !block && !div_exc;
  assign wr_high = high && !block && !div_exc;
  assign of      = flags[8];
  assign ifl     = flags[6];
  assign tfl     = flags[5];
  assign zf      = flags[3];

  assign iflags  = oflags;
  assign alu_iflags = { 4'b1111, flags[8:3], 1'b0, flags[2], 1'b0, flags[1],
                        1'b1, flags[0] };
  assign logic_flags = { flags[8], flags[4], flags[3], flags[1], flags[0] };

  assign alu_word = (t==3'b011) ? ~r_byte : ~byteop;
  assign a_byte   = (t==3'b011 && func[1]) ? 1'b0 : r_byte;
  assign b_byte   = r_byte;
  assign div_exc  = dive && wr;

  assign wr_ss    = (addr_d == 4'b1010) && wr;

endmodule
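This code shows that addr has a single source, aluout[19:0]. The core does not distinguish instruction addresses from data addresses, so it is a von Neumann architecture processor.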
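The control signals in zet_exec, including we and m_io, are plain bit slices of the ir input. The slicing can be mirrored in a few lines for experimentation. The bit positions below are copied from the assign statements in the listing; the table name and helper functions are hypothetical, and bit 31 is left out because the listing does not assign it.

```python
# (signal name, most significant bit, least significant bit) as sliced from ir
IR_FIELDS = [
    ("addr_s", 1, 0), ("addr_a", 5, 2), ("addr_b", 9, 6),
    ("addr_c", 13, 10), ("addr_d", 17, 14),
    ("wrfl", 18, 18), ("we", 19, 19), ("wr", 20, 20),
    ("wr_cnd", 21, 21), ("high", 22, 22),
    ("t", 25, 23), ("func", 28, 26),
    ("byteop", 29, 29), ("memalu", 30, 30),
    ("m_io", 32, 32), ("b_imm", 33, 33),
    ("r_byte", 34, 34), ("c_byte", 35, 35),
]


def bits(value, msb, lsb):
    """Return value[msb:lsb], using Verilog-style inclusive bit indices."""
    width = msb - lsb + 1
    return (value >> lsb) & ((1 << width) - 1)


def decode_ir(ir):
    """Split a control word into the named fields used by zet_exec."""
    return {name: bits(ir, msb, lsb) for name, msb, lsb in IR_FIELDS}


# Example word: m_io and we set, addr_d = 4'b1010, addr_s = 2'b10
fields = decode_ir((1 << 32) | (1 << 19) | (0b1010 << 14) | 0b10)
```

A word with both we and m_io set describes a write cycle to the I/O space; the address for that cycle still comes from the same aluout[19:0] path as every other access.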
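Next, AltOR32, a reduced OpenRISC implementation without an MMU or floating-point unit, illustrates the bus interface of a CPU with unified addressing.
(1) Von Neumann architecture: the data interface and the instruction-fetch interface are one and the same. The code is as follows: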
module altor32_lite
(
    // General
    input               clk_i /*verilator public*/,
    input               rst_i /*verilator public*/,

    // Maskable interrupt
    input               intr_i /*verilator public*/,

    // Unmaskable interrupt
    input               nmi_i /*verilator public*/,

    // Memory interface
    output reg [31:0]   mem_addr_o /*verilator public*/,
    input [31:0]        mem_dat_i /*verilator public*/,
    output reg [31:0]   mem_dat_o /*verilator public*/,
    output reg          mem_cyc_o /*verilator public*/,
    output reg          mem_stb_o /*verilator public*/,
    output reg          mem_we_o /*verilator public*/,
    output reg [3:0]    mem_sel_o /*verilator public*/,
    input               mem_ack_i/*verilator public*/
);

. . . .

//-----------------------------------------------------------------
// Next State Logic
//-----------------------------------------------------------------
reg [3:0] next_state_r;
always @ *
begin
    next_state_r = state_q;

    case (state_q)
    //-----------------------------------------
    // IDLE -
    //-----------------------------------------
    STATE_IDLE :
    begin
        if (enable_i)
            next_state_r = STATE_FETCH;
    end
    //-----------------------------------------
    // FETCH - Fetch line from memory
    //-----------------------------------------
    STATE_FETCH :
    begin
        next_state_r = STATE_FETCH_WAIT;
    end
    //-----------------------------------------
    // FETCH_WAIT - Wait for read responses
    //-----------------------------------------
    STATE_FETCH_WAIT:
    begin
        // Read from memory complete
        if (mem_ack_i)
            next_state_r = STATE_EXEC;
    end
    //-----------------------------------------
    // EXEC
    //-----------------------------------------
    STATE_EXEC :
    begin
        if (load_inst_r || store_inst_r)
            next_state_r = STATE_MEM;
        else
            next_state_r = STATE_WRITE_BACK;
    end
    //-----------------------------------------
    // MEM
    //-----------------------------------------
    STATE_MEM :
    begin
        // Read from memory complete
        if (mem_ack_i)
            next_state_r = STATE_FETCH;
    end
    //-----------------------------------------
    // WRITE_BACK
    //-----------------------------------------
    STATE_WRITE_BACK :
    begin
        if (enable_i)
            next_state_r = STATE_FETCH;
    end

    default:
        ;
    endcase
end

. . . .

//-----------------------------------------------------------------
// Memory Access / Instruction Fetch
//-----------------------------------------------------------------
always @ (posedge rst_i or posedge clk_i )
begin
    if (rst_i == 1'b1)
    begin
        mem_addr_o   <= 32'h00000000;
        mem_dat_o    <= 32'h00000000;
        mem_sel_o    <= 4'b0;
        mem_we_o     <= 1'b0;
        mem_stb_o    <= 1'b0;
        mem_cyc_o    <= 1'b0;

        opcode_q     <= 32'h00000000;
        mem_offset_q <= 2'b0;
    end
    else
    begin
        case (state_q)
        //-----------------------------------------
        // FETCH - Issue instruction fetch
        //-----------------------------------------
        STATE_FETCH :
        begin
            // Start fetch from memory
            mem_addr_o <= pc_q;
            mem_stb_o  <= 1'b1;
            mem_we_o   <= 1'b0;
            mem_cyc_o  <= 1'b1;
        end
        //-----------------------------------------
        // FETCH_WAIT - Wait for response
        //-----------------------------------------
        STATE_FETCH_WAIT :
        begin
            // Data ready from memory?
            if (mem_ack_i)
            begin
                opcode_q  <= mem_dat_i;
                mem_cyc_o <= 1'b0;
            end
        end
        //-----------------------------------------
        // EXEC - Issue read / write
        //-----------------------------------------
        STATE_EXEC :
        begin
`ifdef CONF_CORE_TRACE
            $display("%08x: Execute 0x%08x", pc_q, opcode_q);
            $display(" rA[%d] = 0x%08x", ra_w, reg_ra_r);
            $display(" rB[%d] = 0x%08x", rb_w, reg_rb_r);
`endif

            case (1'b1)
            // l.lbs l.lhs l.lws l.lbz l.lhz l.lwz
            load_inst_r:
            begin
                mem_addr_o   <= {mem_addr_r[31:2], 2'b0};
                mem_offset_q <= mem_addr_r[1:0];
                mem_dat_o    <= 32'h00000000;
                mem_sel_o    <= 4'b1111;
                mem_we_o     <= 1'b0;
                mem_stb_o    <= 1'b1;
                mem_cyc_o    <= 1'b1;

`ifdef CONF_CORE_DEBUG
                $display(" Load from 0x%08x to R%d", mem_addr_r, rd_w);
`endif
            end

            inst_sb_w: // l.sb
            begin
                mem_addr_o   <= {mem_addr_r[31:2], 2'b0};
                mem_offset_q <= mem_addr_r[1:0];
                case (mem_addr_r[1:0])
                2'b00 :
                begin
                    mem_dat_o <= {reg_rb_r[7:0],24'h000000};
                    mem_sel_o <= 4'b1000;
                    mem_we_o  <= 1'b1;
                    mem_stb_o <= 1'b1;
                    mem_cyc_o <= 1'b1;
                end
                2'b01 :
                begin
                    mem_dat_o <= {{8'h00,reg_rb_r[7:0]},16'h0000};
                    mem_sel_o <= 4'b0100;
                    mem_we_o  <= 1'b1;
                    mem_stb_o <= 1'b1;
                    mem_cyc_o <= 1'b1;
                end
                2'b10 :
                begin
                    mem_dat_o <= {{16'h0000,reg_rb_r[7:0]},8'h00};
                    mem_sel_o <= 4'b0010;
                    mem_we_o  <= 1'b1;
                    mem_stb_o <= 1'b1;
                    mem_cyc_o <= 1'b1;
                end
                2'b11 :
                begin
                    mem_dat_o <= {24'h000000,reg_rb_r[7:0]};
                    mem_sel_o <= 4'b0001;
                    mem_we_o  <= 1'b1;
                    mem_stb_o <= 1'b1;
                    mem_cyc_o <= 1'b1;
                end
                default :
                    ;
                endcase
            end

            inst_sh_w: // l.sh
            begin
                mem_addr_o   <= {mem_addr_r[31:2], 2'b0};
                mem_offset_q <= mem_addr_r[1:0];
                case (mem_addr_r[1:0])
                2'b00 :
                begin
                    mem_dat_o <= {reg_rb_r[15:0],16'h0000};
                    mem_sel_o <= 4'b1100;
                    mem_we_o  <= 1'b1;
                    mem_stb_o <= 1'b1;
                    mem_cyc_o <= 1'b1;
                end
                2'b10 :
                begin
                    mem_dat_o <= {16'h0000,reg_rb_r[15:0]};
                    mem_sel_o <= 4'b0011;
                    mem_we_o  <= 1'b1;
                    mem_stb_o <= 1'b1;
                    mem_cyc_o <= 1'b1;
                end
                default :
                    ;
                endcase
            end

            inst_sw_w: // l.sw
            begin
                mem_addr_o   <= {mem_addr_r[31:2], 2'b0};
                mem_offset_q <= mem_addr_r[1:0];
                mem_dat_o    <= reg_rb_r;
                mem_sel_o    <= 4'b1111;
                mem_we_o     <= 1'b1;
                mem_stb_o    <= 1'b1;
                mem_cyc_o    <= 1'b1;

`ifdef CONF_CORE_DEBUG
                $display(" Store R%d to 0x%08x = 0x%08x", rb_w, {mem_addr_r[31:2],2'b00}, reg_rb_r);
`endif
            end

            default:
                ;
            endcase
        end
        //-----------------------------------------
        // MEM - Wait for response
        //-----------------------------------------
        STATE_MEM :
        begin
            // Data ready from memory?
            if (mem_ack_i)
            begin
                mem_cyc_o <= 1'b0;
            end
        end

        default:
            ;
        endcase
    end
end
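This core does not overlap instructions in a pipeline. It steps each instruction through a state machine (IDLE, FETCH, FETCH_WAIT, EXEC, MEM, WRITE_BACK), as the next-state logic shows. The last block of the listing is the memory-access logic driven by that state machine: it implements both operations named in its header, Memory Access and Instruction Fetch, and both go through the single mem_* bus port. An instruction fetch in STATE_FETCH and a load or store in STATE_EXEC drive the same mem_addr_o, mem_stb_o, and mem_cyc_o signals, so the two can never happen in the same cycle.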
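To see how fetches and data accesses take turns on the one port, the next-state logic can be transcribed into a small model and traced. This is a sketch for illustration only: the transitions follow the next-state logic in the listing, while the `trace` helper and the assumption that memory acknowledges in the first wait cycle are mine.

```python
IDLE, FETCH, FETCH_WAIT, EXEC, MEM, WRITE_BACK = (
    "IDLE", "FETCH", "FETCH_WAIT", "EXEC", "MEM", "WRITE_BACK")


def next_state(state, enable=True, mem_ack=False, is_load_store=False):
    """Transcription of the altor32_lite next-state case statement."""
    if state == IDLE:
        return FETCH if enable else IDLE
    if state == FETCH:
        return FETCH_WAIT
    if state == FETCH_WAIT:
        return EXEC if mem_ack else FETCH_WAIT
    if state == EXEC:
        return MEM if is_load_store else WRITE_BACK
    if state == MEM:
        return FETCH if mem_ack else MEM
    if state == WRITE_BACK:
        return FETCH if enable else WRITE_BACK
    return state


def trace(is_load_store):
    """States visited by one instruction, from FETCH until the next FETCH.

    Assumption: memory asserts mem_ack in the first cycle of each wait state.
    """
    state, visited = FETCH, [FETCH]
    while True:
        state = next_state(state, enable=True, mem_ack=True,
                           is_load_store=is_load_store)
        if state == FETCH:
            return visited
        visited.append(state)


alu_trace = trace(is_load_store=False)
load_trace = trace(is_load_store=True)
```

In this model a load or store visits MEM in place of WRITE_BACK, and the next fetch cannot start until the data access has been acknowledged, which is the cost of sharing one port.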
(2) Harvard architecture: the data interface and the instruction-fetch interface are separate. The code is as follows:
//-----------------------------------------------------------------
// Module - AltOR32 CPU (Pipelined Wishbone Interfaces)
//-----------------------------------------------------------------
module altor32
(
    // General
    input               clk_i,
    input               rst_i,

    input               intr_i,
    input               nmi_i,
    output              fault_o,
    output              break_o,

    // Instruction memory
    output [31:0]       imem_addr_o,
    input [31:0]        imem_dat_i,
    output [2:0]        imem_cti_o,
    output              imem_cyc_o,
    output              imem_stb_o,
    input               imem_ack_i,

    // Data memory
    output [31:0]       dmem_addr_o,
    output [31:0]       dmem_dat_o,
    input [31:0]        dmem_dat_i,
    output [3:0]        dmem_sel_o,
    output [2:0]        dmem_cti_o,
    output              dmem_cyc_o,
    output              dmem_we_o,
    output              dmem_stb_o,
    input               dmem_ack_i
);

. . .

//-----------------------------------------------------------------
// Module - Wishbone fetch unit
//-----------------------------------------------------------------
always @ (posedge rst_i or posedge clk_i )
begin
    if (rst_i == 1'b1)
    begin
        wbm_addr_o   <= 32'h00000000;
        wbm_cti_o    <= 3'b0;
        wbm_stb_o    <= 1'b0;
        wbm_cyc_o    <= 1'b0;

        fetch_word_q <= {FETCH_WORDS_W{1'b0}};
        resp_word_q  <= {FETCH_WORDS_W{1'b0}};
    end
    else
    begin
        // Idle
        if (!wbm_cyc_o)
        begin
            if (fetch_i)
            begin
                if (burst_i)
                begin
                    wbm_addr_o   <= {address_i[31:FETCH_BYTES_W], {FETCH_BYTES_W{1'b0}}};
                    fetch_word_q <= {FETCH_WORDS_W{1'b0}};
                    resp_word_q  <= {FETCH_WORDS_W{1'b0}};

                    // Incrementing linear burst
                    wbm_cti_o    <= WB_CTI_BURST;
                end
                else
                begin
                    wbm_addr_o   <= address_i;
                    resp_word_q  <= address_i[FETCH_BYTES_W-1:2];

                    // Single fetch
                    wbm_cti_o    <= WB_CTI_FINAL;
                end

                // Start fetch from memory
                wbm_stb_o <= 1'b1;
                wbm_cyc_o <= 1'b1;
            end
        end
        // Access in-progress
        else
        begin
            // Command accepted
            if (~wbm_stall_i)
            begin
                // Fetch next word for line
                if (wbm_cti_o != WB_CTI_FINAL)
                begin
                    wbm_addr_o   <= {wbm_addr_o[31:FETCH_BYTES_W], next_word_w, 2'b0};
                    fetch_word_q <= next_word_w;

                    // Final word to read?
                    if (penultimate_word_w)
                        wbm_cti_o <= WB_CTI_FINAL;
                end
                // Fetch complete
                else
                    wbm_stb_o <= 1'b0;
            end

            // Response
            if (wbm_ack_i)
                resp_word_q <= resp_word_q + 1'b1;

            // Last response?
            if (final_o)
                wbm_cyc_o <= 1'b0;
        end
    end
end

. . . .

//-----------------------------------------------------------------
// Module - Data Cache Memory Interface
//-----------------------------------------------------------------
always @ *
begin
    next_state_r = state;

    case (state)
    //-----------------------------------------
    // IDLE
    //-----------------------------------------
    STATE_IDLE :
    begin
        // Perform cache evict (write)
        if (evict_i)
            next_state_r = STATE_WRITE_SETUP;
        // Perform cache fill (read)
        else if (fill_i)
            next_state_r = STATE_FETCH;
        // Read/Write single
        else if (rd_single_i | (|wr_single_i))
            next_state_r = STATE_MEM_SINGLE;
    end
    //-----------------------------------------
    // FETCH - Fetch line from memory
    //-----------------------------------------
    STATE_FETCH :
    begin
        // Line fetch complete?
        if (~mem_stall_i && request_idx == {CACHE_LINE_WORDS_IDX_MAX{1'b1}})
            next_state_r = STATE_FETCH_WAIT;
    end
    //-----------------------------------------
    // FETCH_WAIT - Wait for read responses
    //-----------------------------------------
    STATE_FETCH_WAIT:
    begin
        // Read from memory complete
        if (mem_ack_i && response_idx == {CACHE_LINE_WORDS_IDX_MAX{1'b1}})
            next_state_r = STATE_IDLE;
    end
    //-----------------------------------------
    // WRITE_SETUP - Wait for data from cache
    //-----------------------------------------
    STATE_WRITE_SETUP :
        next_state_r = STATE_WRITE;
    //-----------------------------------------
    // WRITE - Write word to memory
    //-----------------------------------------
    STATE_WRITE :
    begin
        // Line write complete?
        if (~mem_stall_i && request_idx == {CACHE_LINE_WORDS_IDX_MAX{1'b1}})
            next_state_r = STATE_WRITE_WAIT;
        // Fetch next word for line
        else if (~mem_stall_i | ~mem_stb_o)
            next_state_r = STATE_WRITE_SETUP;
    end
    //-----------------------------------------
    // WRITE_WAIT - Wait for write to complete
    //-----------------------------------------
    STATE_WRITE_WAIT:
    begin
        // Write to memory complete
        if (mem_ack_i && response_idx == {CACHE_LINE_WORDS_IDX_MAX{1'b1}})
            next_state_r = STATE_IDLE;
    end
    //-----------------------------------------
    // MEM_SINGLE - Single access to memory
    //-----------------------------------------
    STATE_MEM_SINGLE:
    begin
        // Data ready from memory?
        if (mem_ack_i)
            next_state_r = STATE_IDLE;
    end

    default:
        ;
    endcase
end
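The excerpts above come from different parts of the design. First is the CPU bus interface, which has a dedicated instruction bus (imem_*) and a dedicated data bus (dmem_*). Next is the state-machine logic of the instruction fetch unit, and last is the next-state logic that handles data reads and writes.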
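The address sequence that the fetch unit places on the instruction bus can be worked out from the listing. The sketch below rests on stated assumptions: FETCH_BYTES_W is not given in the excerpt, so the default of 5 (a 32-byte line) is hypothetical, and next_word_w is assumed to be the current word index plus one. The strings "BURST" and "FINAL" stand in for WB_CTI_BURST and WB_CTI_FINAL.

```python
def fetch_requests(address, burst, fetch_bytes_w=5):
    """List the (address, cti) pairs issued for one fetch request.

    fetch_bytes_w is a hypothetical stand-in for FETCH_BYTES_W
    (log2 of the line size in bytes); 32-bit words are assumed.
    """
    address &= 0xFFFFFFFF
    if not burst:
        # Single fetch: the address is used as given and marked final.
        return [(address, "FINAL")]

    line_mask = (1 << fetch_bytes_w) - 1
    base = address & ~line_mask            # {address_i[31:W], W zero bits}
    words = 1 << (fetch_bytes_w - 2)       # 4 bytes per word

    requests = []
    for word in range(words):
        # Only the last word of the line carries the end-of-burst tag.
        cti = "FINAL" if word == words - 1 else "BURST"
        requests.append((base | (word << 2), cti))
    return requests


line = fetch_requests(0x1234, burst=True)
single = fetch_requests(0x1234, burst=False)
```

Under these assumptions a burst starting from 0x1234 reads the whole line from 0x1220 through 0x123C. All of this traffic stays on the imem_* port, so it does not compete with data accesses on dmem_*.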