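Multi-channel AXI arbitration [2nd edition]: simple arbitration and extended DMA optimizations
Code:
NoNounknow/DMA: DMA repository containing DMAs for various use cases; the details are on cnblogs (博客园). (github.com)
DMA_Custom: a DMA without arbitration that supports defining multiple frame buffers and outputting the buffer index
DMA_Complex: the template for DMAs with arbitration
DMA_Loop: a DMA whose buffer size can be configured through input signals
DMA_Loop_PsRd: a read DMA whose buffer size can be configured through input signals and whose frame sync can be supplied by the PS side (one frame is read per frame sync)
DMA_Loop_AcpWr: an ACP write DMA whose buffer size can be configured through input signals and whose frame sync can be supplied by the PS side (the frame sync resets the write FIFO so that data from the previous frame is not written; writing starts only after the reset has completed)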
2024/02/24
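Description:
DMA (direct memory access) is an indispensable module when using DDR3, DDR4, or SDRAM on an FPGA.
To meet the needs of different scenarios, a DMA module has to be adapted case by case, but the basic requirement is always:
[Read a specified length of data from a specified region, and periodically update the stored data according to how the data is refreshed and what is needed]
Beyond this basic requirement, we also have to consider questions such as: how do multiple DMAs work together, and how do different DMAs communicate with each other?
From the FPGA's point of view, we can either arbitrate the bus signals of the DMAs or arbitrate the DMAs' requests. I think the latter is the simpler approach.
Arbitration:
0. Arbitration logic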
module Aribe_LoopPrior_State_v1 #(
    parameter AXI_ADDR_WIDTH = 32 ,
    parameter AXI_BUF_SIZE   = 3
)(
    input   wire    I_Main_CLK      ,
    input   wire    I_Rst_n         ,
    //I_CH0
    input   wire    I_CH0_WR_req    ,
    input   wire    I_CH0_RD_req    ,
    input   wire    I_CH0_WR_END    ,
    output  wire    O_CH0_WR_req    ,
    //I_CH1
    input   wire    I_CH1_WR_req    ,
    input   wire    I_CH1_RD_req    ,
    input   wire    I_CH1_WR_END    ,
    output  wire    O_CH1_WR_req    ,
    //I_CH2
    input   wire    I_CH2_WR_req    ,
    input   wire    I_CH2_RD_req    ,
    input   wire    I_CH2_WR_END    ,
    output  wire    O_CH2_WR_req    ,
    //I_CH3
    input   wire    I_CH3_WR_req    ,
    input   wire    I_CH3_RD_req    ,
    input   wire    I_CH3_WR_END    ,
    output  wire    O_CH3_WR_req
);
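Arbitration is done with a bit-masking algorithm; a state machine controls the start and end of each arbitration round and isolates the signals.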
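The bit-masking step can be sketched as pure combinational logic. This is only a minimal illustration, assuming four channels and a one-hot `base` that marks the channel with the highest priority in the current round; the module name `rr_mask_sketch` and its port names are hypothetical and do not come from the repository.

```verilog
// Hypothetical sketch of the bit-mask (rotating priority) selection.
// Assumption: base is one-hot; if base is 0 the grant is always 0.
module rr_mask_sketch (
    input  wire [3:0] req,    // one request bit per channel
    input  wire [3:0] base,   // one-hot: channel with highest priority this round
    output wire [3:0] grant   // one-hot grant, or 0 when nothing is requested
);
    // Duplicate the request vector so the search can wrap around past bit 3.
    wire [7:0] double_req = {req, req};
    // Subtracting base borrows through the zeros up to the first request at or
    // above base and clears that bit; ANDing with the complement isolates it.
    wire [7:0] masked = double_req & ~(double_req - {4'b0, base});
    // Fold the upper copy back onto the four real channels.
    assign grant = masked[3:0] | masked[7:4];
endmodule
```

For example, with `req = 4'b1001` and `base = 4'b1000` the sketch grants channel 3, while with `req = 4'b0001` and `base = 4'b0010` the search wraps around and grants channel 0.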
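1. Arbitration requests
To arbitrate, we first need to know what triggers a request. Clearly, a request starts when the amount of input data waiting to be written has risen to a certain level, or when the amount of output data waiting to be read out has fallen to a certain level. These flag signals have to be provided by the FIFOs.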
assign wr_brust_Req = (w_rd_data_count>=C_M_AXI_BURST_LEN);

generate
if (WR_CH_EN[0]==1) begin: WR_EN
    wdata_w64x1024_r64x1024 wdata_w32x4096_r64x2048 (
        .rst           ( (!M_AXI_ARESETN)|(Ext_Pose_pre_vs) ), // input wire rst
        .wr_clk        ( I_Pre_clk        ), // input wire wr_clk
        .rd_clk        ( M_AXI_ACLK       ), // input wire rd_clk
        .din           ( wr_fifo_wr_data  ), // input wire [63 : 0] din
        .wr_en         ( wr_fifo_wr_en    ), // input wire wr_en
        .rd_en         ( wr_fifo_rd_en    ), // input wire rd_en
        .dout          ( wr_fifo_rd_data  ), // output wire [63 : 0] dout
        .full          ( full_w           ), // output wire full
        .empty         ( empty_w          ), // output wire empty
        .rd_data_count ( w_rd_data_count  ), // output wire [10 : 0] rd_data_count
        .wr_data_count ( w_wr_data_count  ), // output wire [10 : 0] wr_data_count
        .wr_rst_busy   (                  ), // output wire wr_rst_busy
        .rd_rst_busy   (                  )  // output wire rd_rst_busy
    );
end
endgenerate
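The code above covers the write side only. A read-side request could be derived in a similar way from the fill level of the read FIFO. The sketch below is an assumption, not code from the repository: the module name `rd_req_sketch`, the signal `r_wr_data_count` (named by analogy with `w_rd_data_count`), and the parameter values are all hypothetical.

```verilog
// Hypothetical read-side request: ask for a burst while the read FIFO
// still has room for one complete burst.
module rd_req_sketch #(
    parameter BURST_LEN  = 64,    // beats per burst (assumed value)
    parameter FIFO_DEPTH = 1024,  // read FIFO depth in words (assumed value)
    parameter CNT_WIDTH  = 11     // width of the FIFO data counter
)(
    input  wire [CNT_WIDTH-1:0] r_wr_data_count, // words currently in the read FIFO
    output wire                 rd_brust_Req     // request one read burst
);
    // The request is raised when the stored data has dropped far enough
    // that BURST_LEN more words fit without overflowing the FIFO.
    assign rd_brust_Req = (r_wr_data_count <= FIFO_DEPTH - BURST_LEN);
endmodule
```

This sketch ignores data that is still in flight from an earlier burst, so a real design would probably need to hold the request low while a read burst is active.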
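2. Start of an operation
Once we have a request, how do we arbitrate, and which other signals does the arbitration need?
First we need flags that mark the start of a read or write operation: wr_brust_now and rd_brust_now.
The code for wr_brust_now and rd_brust_now follows; the two are structurally identical.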
always@(posedge M_AXI_ACLK)
    if(M_AXI_ARESETN == 1'b0) begin
        wr_brust_now <= 1'b0;
    end
    else if(wr_brust_end == 1'b1 && wr_brust_now == 1'b1) begin
        wr_brust_now <= 1'b0;
    end
    else if(wr_brust_start == 1'b1 && wr_brust_now == 1'b0) begin
        wr_brust_now <= 1'b1;
    end
    else begin
        wr_brust_now <= wr_brust_now;
    end

always@(posedge M_AXI_ACLK)
    if(M_AXI_ARESETN == 1'b0) begin
        rd_brust_now <= 1'b0;
    end
    else if(rd_brust_end == 1'b1 && rd_brust_now == 1'b1) begin
        rd_brust_now <= 1'b0;
    end
    else if(rd_brust_start == 1'b1 && rd_brust_now == 1'b0) begin
        rd_brust_now <= 1'b1;
    end
    else begin
        rd_brust_now <= rd_brust_now;
    end
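With these two signals we can tell the arbitration module's state machine to enter the state in which it waits for the operation to finish.
3. End of an operation
The arbiter needs to know how long a grant stays alive. Put simply, the start of a bus burst marks the grant as active, and the end of the burst marks the end of the grant, after which the next arbitration round can begin.
End flags for read and write operations: rd_brust_end and wr_brust_end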
assign wr_brust_end = (axi_wvalid==1'b1&&M_AXI_WREADY==1'b1&&wr_burst_cnt==C_M_AXI_BURST_LEN-1);
// Combinational logic, so blocking assignments are used here.
always@(*) begin
    if((M_AXI_RVALID == 1'b1)&&(axi_rready == 1'b1)&&(M_AXI_RLAST == 1'b1)) begin
        rd_brust_end = 1'b1;
    end
    else begin
        rd_brust_end = 1'b0;
    end
end
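The write end flag depends on wr_burst_cnt, which is not shown in this post. A beat counter of that kind might look like the sketch below. It assumes that one beat is transferred whenever `axi_wvalid` and `M_AXI_WREADY` are both high; the module name `wr_beat_cnt_sketch`, the 9-bit counter width, and the default burst length are my assumptions.

```verilog
// Hypothetical beat counter that produces wr_brust_end on the last beat.
module wr_beat_cnt_sketch #(
    parameter C_M_AXI_BURST_LEN = 16   // beats per burst (assumed default)
)(
    input  wire       M_AXI_ACLK,
    input  wire       M_AXI_ARESETN,
    input  wire       axi_wvalid,
    input  wire       M_AXI_WREADY,
    output reg  [8:0] wr_burst_cnt,    // beats already transferred in this burst
    output wire       wr_brust_end     // high during the last beat of the burst
);
    // A beat is transferred only when valid and ready are high together.
    wire beat = axi_wvalid && M_AXI_WREADY;

    assign wr_brust_end = beat && (wr_burst_cnt == C_M_AXI_BURST_LEN-1);

    always @(posedge M_AXI_ACLK) begin
        if (M_AXI_ARESETN == 1'b0)
            wr_burst_cnt <= 9'd0;
        else if (wr_brust_end)          // last beat: restart for the next burst
            wr_burst_cnt <= 9'd0;
        else if (beat)                  // ordinary beat: count it
            wr_burst_cnt <= wr_burst_cnt + 9'd1;
    end
endmodule
```

Because the counter advances only on accepted beats, the end flag stays correct even when the slave deasserts `M_AXI_WREADY` in the middle of a burst.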
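However the arbitration is done, the capacity of the bus is always limited, so adjusting the burst length to match demand is also part of saving bus resources.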
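One possible way to adapt the burst length is to pick it from the current FIFO fill level. The sketch below is only an assumption about how this could be done, not the repository's method: the module name `burst_len_sel_sketch`, its ports, and the thresholds 16/32/64 are hypothetical. It relies on the AXI4 convention that AxLEN encodes the number of beats minus one.

```verilog
// Hypothetical burst length selection from the write FIFO fill level.
module burst_len_sel_sketch (
    input  wire [10:0] data_count,  // words waiting in the write FIFO
    output reg  [7:0]  awlen,       // AXI4 AWLEN encoding: beats - 1
    output wire        req          // enough data for the shortest burst
);
    // Use the longest burst the buffered data can fill completely.
    always @(*) begin
        if      (data_count >= 11'd64) awlen = 8'd63;  // 64 beats
        else if (data_count >= 11'd32) awlen = 8'd31;  // 32 beats
        else                           awlen = 8'd15;  // 16 beats
    end

    // Only request the bus once the shortest burst can be filled.
    assign req = (data_count >= 11'd16);
endmodule
```

If the length is chosen this way, the end-of-burst comparison has to use the selected length instead of the constant C_M_AXI_BURST_LEN, and the start address must still be chosen so that a burst does not cross a 4 KB boundary.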
Complete code, V2:
module Aribe_LoopPrior_State_v1 (
    input   wire        I_clk       ,
    input   wire        I_Rst_n     ,
    //Port
    //ch0
    input   wire        I_ch0_req   ,
    input   wire        I_ch0_start ,
    input   wire        I_ch0_end   ,
    output  wire        O_ch0_vaild ,
    //ch1
    input   wire        I_ch1_req   ,
    input   wire        I_ch1_start ,
    input   wire        I_ch1_end   ,
    output  wire        O_ch1_vaild ,
    //ch2
    input   wire        I_ch2_req   ,
    input   wire        I_ch2_start ,
    input   wire        I_ch2_end   ,
    output  wire        O_ch2_vaild ,
    //ch3
    input   wire        I_ch3_req   ,
    input   wire        I_ch3_start ,
    input   wire        I_ch3_end   ,
    output  wire        O_ch3_vaild
);
//-----------------------------------------------------------------//
localparam state_idle  = 10'b0000_0000_01;
localparam state_aribe = 10'b0000_0000_10;
localparam state_ch0_0 = 10'b0000_0001_00;
localparam state_ch0_1 = 10'b0000_0010_00;
localparam state_ch1_0 = 10'b0000_0100_00;
localparam state_ch1_1 = 10'b0000_1000_00;
localparam state_ch2_0 = 10'b0001_0000_00;
localparam state_ch2_1 = 10'b0010_0000_00;
localparam state_ch3_0 = 10'b0100_0000_00;
localparam state_ch3_1 = 10'b1000_0000_00;
//-----------------------------------------------------------------//
//req
//step.0
wire [3:0]  single_req_Concat   ;
reg  [7:0]  double_req_Concat   ;
//step.1
reg  [7:0]  S1_req_Concat       ;
//step.2
reg  [7:0]  S2_req_Concat       ;
//step.3
wire [3:0]  S3_req_Concat       ;
//aribe
wire        aribe_start         ;
wire        aribe_step          ;
reg         aribe_cycle         ;
reg  [3:0]  aribe_value         ;
//step
reg  [3:0]  step                ;
//state
reg  [9:0]  state               ;
wire        aribe_ch0_end       ;
wire        aribe_ch1_end       ;
wire        aribe_ch2_end       ;
wire        aribe_ch3_end       ;
//req vaild
reg         reg_ch0_vaild       ;
reg         reg_ch1_vaild       ;
reg         reg_ch2_vaild       ;
reg         reg_ch3_vaild       ;
//start
reg         r1_ch0_start        ;
reg         r2_ch0_start        ;
reg         r1_ch1_start        ;
reg         r2_ch1_start        ;
reg         r1_ch2_start        ;
reg         r2_ch2_start        ;
reg         r1_ch3_start        ;
reg         r2_ch3_start        ;
//-----------------------------------------------------------------//
assign single_req_Concat = {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req};
assign aribe_start       = |single_req_Concat;
assign aribe_step        = (aribe_start == 1'b1 && aribe_cycle == 1'b0);

assign aribe_ch0_end = (I_ch0_end == 1'b1)&&(state == state_ch0_1);
assign aribe_ch1_end = (I_ch1_end == 1'b1)&&(state == state_ch1_1);
assign aribe_ch2_end = (I_ch2_end == 1'b1)&&(state == state_ch2_1);
assign aribe_ch3_end = (I_ch3_end == 1'b1)&&(state == state_ch3_1);

assign O_ch0_vaild = reg_ch0_vaild;
assign O_ch1_vaild = reg_ch1_vaild;
assign O_ch2_vaild = reg_ch2_vaild;
assign O_ch3_vaild = reg_ch3_vaild;

// step: shift register that sequences the arbitration pipeline
always @(posedge I_clk) begin
    step[3:0] <= {step[2:0],aribe_step};
end

// Pose: delay the start inputs for rising-edge detection
always @(posedge I_clk) begin
    {r2_ch0_start,r1_ch0_start} <= {r1_ch0_start,I_ch0_start};
    {r2_ch1_start,r1_ch1_start} <= {r1_ch1_start,I_ch1_start};
    {r2_ch2_start,r1_ch2_start} <= {r1_ch2_start,I_ch2_start};
    {r2_ch3_start,r1_ch3_start} <= {r1_ch3_start,I_ch3_start};
end

// aribe_cycle: high while one arbitration round is in progress
always @(posedge I_clk) begin
    if(I_Rst_n == 1'b0) begin
        aribe_cycle <= 1'b0;
    end
    else if(aribe_ch0_end|aribe_ch1_end|aribe_ch2_end|aribe_ch3_end) begin
        aribe_cycle <= 1'b0;
    end
    else if(aribe_start == 1'b1 && aribe_cycle == 1'b0 && state == state_idle) begin
        aribe_cycle <= 1'b1;
    end
    else begin
        aribe_cycle <= aribe_cycle;
    end
end

// step.0: capture the requests, duplicated for wrap-around
always @(posedge I_clk) begin
    if(I_Rst_n == 1'b0) begin
        double_req_Concat <= 'd0;
    end
    else if(aribe_step == 1'b1 && step[0] == 1'b0) begin
        double_req_Concat <= {2{I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req}};
    end
    else begin
        double_req_Concat <= double_req_Concat;
    end
end

// step.1: subtract the priority base and invert
always @(posedge I_clk) begin
    if(I_Rst_n == 1'b0) begin
        S1_req_Concat <= 'd0;
    end
    else if(step[0] == 1'b1 && step[1] == 1'b0) begin
        S1_req_Concat <= ~(double_req_Concat - {4'b0,aribe_value});
    end
    else begin
        S1_req_Concat <= S1_req_Concat;
    end
end

// step.2: mask with the requests to isolate the winner
always @(posedge I_clk) begin
    if(I_Rst_n == 1'b0) begin
        S2_req_Concat <= 'd0;
    end
    else if(step[1] == 1'b1 && step[2] == 1'b0) begin
        S2_req_Concat <= (S1_req_Concat & double_req_Concat);
    end
    else begin
        S2_req_Concat <= S2_req_Concat;
    end
end

// step.3: fold the upper half back onto the four channels
assign S3_req_Concat = ((S2_req_Concat[3:0])|(S2_req_Concat[7:4]));

// aribe_value: one-hot priority base, rotated once per arbitration round
always @(posedge I_clk) begin
    if(I_Rst_n == 1'b0) begin
        aribe_value <= {3'b0,1'b1};
    end
    else if(aribe_value[3] == 1'b1 && step[0] == 1'b1 && step[1] == 1'b0) begin
        aribe_value <= {3'b0,1'b1};
    end
    else if(step[0] == 1'b1 && step[1] == 1'b0) begin
        aribe_value <= aribe_value << 1;
    end
    else begin
        aribe_value <= aribe_value;
    end
end

//req
//ch0
always @(posedge I_clk) begin
    if(I_Rst_n == 1'b0) begin
        reg_ch0_vaild <= 1'b0;
    end
    else if(state == state_ch0_0 && (r1_ch0_start == 1'b1 && r2_ch0_start == 1'b0)) begin
        reg_ch0_vaild <= 1'b0;
    end
    else if(state == state_ch0_0 && reg_ch0_vaild == 1'b0) begin
        reg_ch0_vaild <= 1'b1;
    end
end
//ch1
always @(posedge I_clk) begin
    if(I_Rst_n == 1'b0) begin
        reg_ch1_vaild <= 1'b0;
    end
    else if(state == state_ch1_0 && (r1_ch1_start == 1'b1 && r2_ch1_start == 1'b0)) begin
        reg_ch1_vaild <= 1'b0;
    end
    else if(state == state_ch1_0 && reg_ch1_vaild == 1'b0) begin
        reg_ch1_vaild <= 1'b1;
    end
end
//ch2
always @(posedge I_clk) begin
    if(I_Rst_n == 1'b0) begin
        reg_ch2_vaild <= 1'b0;
    end
    else if(state == state_ch2_0 && (r1_ch2_start == 1'b1 && r2_ch2_start == 1'b0)) begin
        reg_ch2_vaild <= 1'b0;
    end
    else if(state == state_ch2_0 && reg_ch2_vaild == 1'b0) begin
        reg_ch2_vaild <= 1'b1;
    end
end
//ch3
always @(posedge I_clk) begin
    if(I_Rst_n == 1'b0) begin
        reg_ch3_vaild <= 1'b0;
    end
    else if(state == state_ch3_0 && (r1_ch3_start == 1'b1 && r2_ch3_start == 1'b0)) begin
        reg_ch3_vaild <= 1'b0;
    end
    else if(state == state_ch3_0 && reg_ch3_vaild == 1'b0) begin
        reg_ch3_vaild <= 1'b1;
    end
end

//state
always @(posedge I_clk) begin
    if(I_Rst_n == 1'b0) begin
        state <= state_idle;
    end
    else begin
        case (state)
            state_idle: begin
                if(aribe_start == 1'b1 && aribe_cycle == 1'b0) begin
                    state <= state_aribe;
                end
                else begin
                    state <= state_idle;
                end
            end
            state_aribe: begin
                if(step[2] == 1'b1 && step[3] == 1'b0) begin
                    case (S3_req_Concat)
                        4'b0001: begin state <= state_ch0_0; end
                        4'b0010: begin state <= state_ch1_0; end
                        4'b0100: begin state <= state_ch2_0; end
                        4'b1000: begin state <= state_ch3_0; end
                        default: state <= state_aribe;
                    endcase
                end
                else begin
                    state <= state_aribe;
                end
            end
            // state.step.0: wait for the rising edge of the channel's start
            state_ch0_0: begin
                if((r1_ch0_start == 1'b1 && r2_ch0_start == 1'b0)) begin
                    state <= state_ch0_1;
                end
                else begin
                    state <= state_ch0_0;
                end
            end
            state_ch1_0: begin
                if((r1_ch1_start == 1'b1 && r2_ch1_start == 1'b0)) begin
                    state <= state_ch1_1;
                end
                else begin
                    state <= state_ch1_0;
                end
            end
            state_ch2_0: begin
                if((r1_ch2_start == 1'b1 && r2_ch2_start == 1'b0)) begin
                    state <= state_ch2_1;
                end
                else begin
                    state <= state_ch2_0;
                end
            end
            state_ch3_0: begin
                if((r1_ch3_start == 1'b1 && r2_ch3_start == 1'b0)) begin
                    state <= state_ch3_1;
                end
                else begin
                    state <= state_ch3_0;
                end
            end
            // state.step.1: wait for the channel's end flag
            state_ch0_1: begin
                if(I_ch0_end == 1'b1) begin
                    state <= state_idle;
                end
                else begin
                    state <= state_ch0_1;
                end
            end
            state_ch1_1: begin
                if(I_ch1_end == 1'b1) begin
                    state <= state_idle;
                end
                else begin
                    state <= state_ch1_1;
                end
            end
            state_ch2_1: begin
                if(I_ch2_end == 1'b1) begin
                    state <= state_idle;
                end
                else begin
                    state <= state_ch2_1;
                end
            end
            state_ch3_1: begin
                if(I_ch3_end == 1'b1) begin
                    state <= state_idle;
                end
                else begin
                    state <= state_ch3_1;
                end
            end
            default: begin
                state <= state_idle;
            end
        endcase
    end
end
endmodule
Testbench file:
`timescale 1ns / 1ps
module tb_Aribe_LoopPrior_State_v1;
// Aribe_LoopPrior_State_v1 Parameters
parameter PERIOD = 10;
// Aribe_LoopPrior_State_v1 Inputs
reg I_clk       ;
reg I_Rst_n     ;
reg I_ch0_req   ;
reg I_ch0_start ;
reg I_ch0_end   ;
reg I_ch1_req   ;
reg I_ch1_start ;
reg I_ch1_end   ;
reg I_ch2_req   ;
reg I_ch2_start ;
reg I_ch2_end   ;
reg I_ch3_req   ;
reg I_ch3_start ;
reg I_ch3_end   ;
// Aribe_LoopPrior_State_v1 Outputs
wire O_ch0_vaild ;
wire O_ch1_vaild ;
wire O_ch2_vaild ;
wire O_ch3_vaild ;

initial begin
    I_clk = 0;
end
always #(PERIOD/2) I_clk = ~ I_clk;

initial begin
    I_Rst_n <= 1'b0;
    repeat (100) @(posedge I_clk);
    I_Rst_n <= 1'b1;
end

initial begin
    repeat (10) @(posedge I_clk);
    I_ch0_start <= 1'b0;
    I_ch0_end   <= 1'b0;
    I_ch1_start <= 1'b0;
    I_ch1_end   <= 1'b0;
    I_ch2_start <= 1'b0;
    I_ch2_end   <= 1'b0;
    I_ch3_start <= 1'b0;
    I_ch3_end   <= 1'b0;
    {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req} <= 4'b0000;
    @(posedge I_Rst_n);

    // Round 1: all channels request
    {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req} <= 4'b1111;
    repeat (1) @(posedge I_clk);
    {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req} <= 4'b0000;
    @(posedge O_ch0_vaild);
    repeat (2) @(posedge I_clk);
    I_ch0_start <= 1'b1;
    repeat (1) @(posedge I_clk);
    I_ch0_start <= 1'b0;
    repeat (64) @(posedge I_clk);
    I_ch0_end <= 1'b1;
    repeat (1) @(posedge I_clk);
    I_ch0_end <= 1'b0;
    repeat (64) @(posedge I_clk);

    // Round 2: only ch0 requests
    {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req} <= 4'b0001;
    repeat (1) @(posedge I_clk);
    {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req} <= 4'b0000;
    @(posedge O_ch0_vaild);
    repeat (2) @(posedge I_clk);
    I_ch0_start <= 1'b1;
    repeat (1) @(posedge I_clk);
    I_ch0_start <= 1'b0;
    repeat (64) @(posedge I_clk);
    I_ch0_end <= 1'b1;
    repeat (1) @(posedge I_clk);
    I_ch0_end <= 1'b0;
    repeat (64) @(posedge I_clk);

    // Round 3: only ch0 requests
    {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req} <= 4'b0001;
    repeat (1) @(posedge I_clk);
    {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req} <= 4'b0000;
    @(posedge O_ch0_vaild);
    repeat (2) @(posedge I_clk);
    I_ch0_start <= 1'b1;
    repeat (1) @(posedge I_clk);
    I_ch0_start <= 1'b0;
    repeat (64) @(posedge I_clk);
    I_ch0_end <= 1'b1;
    repeat (1) @(posedge I_clk);
    I_ch0_end <= 1'b0;
    repeat (64) @(posedge I_clk);

    // Round 4: ch3 and ch0 request
    {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req} <= 4'b1001;
    repeat (1) @(posedge I_clk);
    {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req} <= 4'b0000;
    @(posedge O_ch3_vaild);
    repeat (2) @(posedge I_clk);
    I_ch3_start <= 1'b1;
    repeat (1) @(posedge I_clk);
    I_ch3_start <= 1'b0;
    repeat (64) @(posedge I_clk);
    I_ch3_end <= 1'b1;
    repeat (1) @(posedge I_clk);
    I_ch3_end <= 1'b0;
    repeat (64) @(posedge I_clk);

    // Round 5: only ch3 requests
    {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req} <= 4'b1000;
    repeat (1) @(posedge I_clk);
    {I_ch3_req, I_ch2_req, I_ch1_req, I_ch0_req} <= 4'b0000;
    @(posedge O_ch3_vaild);
    repeat (2) @(posedge I_clk);
    I_ch3_start <= 1'b1;
    repeat (1) @(posedge I_clk);
    I_ch3_start <= 1'b0;
    repeat (64) @(posedge I_clk);
    I_ch3_end <= 1'b1;
    repeat (1) @(posedge I_clk);
    I_ch3_end <= 1'b0;
    repeat (64) @(posedge I_clk);
end

Aribe_LoopPrior_State_v1 u_Aribe_LoopPrior_State_v1 (
    .I_clk       ( I_clk       ),
    .I_Rst_n     ( I_Rst_n     ),
    .I_ch0_req   ( I_ch0_req   ),
    .I_ch0_start ( I_ch0_start ),
    .I_ch0_end   ( I_ch0_end   ),
    .I_ch1_req   ( I_ch1_req   ),
    .I_ch1_start ( I_ch1_start ),
    .I_ch1_end   ( I_ch1_end   ),
    .I_ch2_req   ( I_ch2_req   ),
    .I_ch2_start ( I_ch2_start ),
    .I_ch2_end   ( I_ch2_end   ),
    .I_ch3_req   ( I_ch3_req   ),
    .I_ch3_start ( I_ch3_start ),
    .I_ch3_end   ( I_ch3_end   ),
    .O_ch0_vaild ( O_ch0_vaild ),
    .O_ch1_vaild ( O_ch1_vaild ),
    .O_ch2_vaild ( O_ch2_vaild ),
    .O_ch3_vaild ( O_ch3_vaild )
);
endmodule