Gated Clock Conversion in Vivado Synthesis / Automatic Clock Gating Conversion

Overview

Gated clocks are a traditional and common way to reduce power consumption in ASIC designs.  By gating the clock, whole sets of registers can be kept from toggling when they are not needed.

Fig. 1: Gating a clock with an AND gate

In Fig. 1, when the "gate" signal is driven low, the clock to all of the registers stops toggling, so the registers stop switching and draw no dynamic power.
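As a minimal sketch of the kind of RTL that Fig. 1 describes, the module below gates a clock with an AND gate and uses the result to clock a bank of registers. The module and signal names (gated_regs, gate, d, q) are illustrative assumptions and are not taken from the original design.

```verilog
// Illustrative sketch of ASIC-style clock gating (all names are hypothetical).
module gated_regs (
    input  wire       clk,
    input  wire       gate,   // high = registers are clocked, low = clock held low
    input  wire [7:0] d,
    output reg  [7:0] q
);
    // The AND gate sits in the clock path: while gate is low, gclk never toggles.
    wire gclk = clk & gate;

    always @(posedge gclk)
        q <= d;
endmodule
```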

This coding style does not always port well to FPGAs.  FPGAs have dedicated clock resources that are designed to distribute clocks with optimal timing and minimal skew.  Putting a gate in the middle of this structure can interfere with those resources.  In addition, those clock resources are limited, so having many different gated clocks can cause problems in an FPGA design.

Fig. 2: LUT inside the clocking structure

One way to get around these problems is to rewrite the RTL code to remove the gates. However, this involves a lot of work, and in many cases, when a design is being prototyped in FPGAs, the RTL is not allowed to change.  Another way is to let the synthesis tool convert the gates so that the clock drives the register clock pin directly and the gating logic goes to the clock enable pin.  Vivado Synthesis supports this conversion.

Fig. 3: Same circuit with the gated clocks converted.
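Functionally, a converted circuit of this kind behaves like a register bank with a clock enable. A hand-written equivalent might look like the sketch below. The names (enabled_regs, gate, d, q) are hypothetical, and the sketch assumes an AND-style gate whose gate input is stable around the rising edge of clk.

```verilog
// Hand-written clock-enable equivalent of an AND-gated clock (names are hypothetical).
module enabled_regs (
    input  wire       clk,
    input  wire       gate,   // now used as a clock enable instead of gating the clock
    input  wire [7:0] d,
    output reg  [7:0] q
);
    // clk drives the register clock pin directly; gate maps to the CE pin.
    always @(posedge clk)
        if (gate)
            q <= d;
endmodule
```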

Note that this conversion helps the tool use the dedicated clocking resources, but each converted gate now becomes a separate clock enable.  This means more control sets in your design, which can have side effects of its own, such as making it harder to pack registers together.

In addition, simulation can also be affected. Take Fig. 2 and Fig. 3 as an example.  Suppose the clock and gate are both low, and the gate then pulses high and back low while clk stays low.  In Fig. 2, where the gate is part of the clock path, the register can see this as a clock pulse (for instance, if the LUT implements an OR); in Fig. 3, where the register is clocked directly by clk, the pulse is missed.  Care must be taken to avoid situations like this.
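The mismatch described above can be reproduced with a small testbench. The sketch below assumes an OR-style gating function, since the text does not say what the LUT implements; the testbench name and all signal names are hypothetical.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench: compares an OR-gated clock with its clock-enable form.
module tb_gate_pulse;
    reg clk         = 1'b0;
    reg gate        = 1'b0;
    reg d           = 1'b1;
    reg q_gated     = 1'b0;
    reg q_converted = 1'b0;

    // Gated version: the register sees every rising edge of (clk | gate).
    wire gclk = clk | gate;
    always @(posedge gclk)
        q_gated <= d;

    // Converted version: clk drives the register, the gate becomes an enable.
    always @(posedge clk)
        if (!gate)
            q_converted <= d;

    initial begin
        // clk stays low for the whole run; only the gate pulses.
        #10 gate = 1'b1;
        #10 gate = 1'b0;
        #10 $display("q_gated=%b q_converted=%b", q_gated, q_converted);
        $finish;
    end
endmodule
```

In this run the gated register captures d on the rising edge of the gate pulse, while the converted register never sees a clock edge and keeps its initial value, so the two outputs differ at the end of the simulation.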

Controlling gated clocks

Gated clock conversion is controlled by a combination of three items: the clock constraints in XDC files, the GATED_CLOCK synthesis attribute, and the gated_clock_conversion synthesis setting.

Clock constraints from the XDC files tell the tool how fast the clocks in the design need to operate. 

They take the following form:

create_clock -period 5 [get_ports clk]

By using constraints, the tool will know which signals can be converted to direct clocks.

The GATED_CLOCK attribute allows the user to tell the tool directly which signal in the gated logic should drive the clock input of the register.  It is placed in the RTL file.

(* gated_clock = "yes" *) input clk;  

The gated_clock_conversion option (the -gated_clock_conversion switch of synth_design) controls how synthesis handles gated clock conversion.  If it is set to "off", gated clocks are never converted.  If it is set to "on", conversion is performed on signals that have the GATED_CLOCK attribute.

If it is set to "auto", then it will do conversions when it knows, via the XDC file, which signals are the real clocks in the design. In addition, if there is more than one possible clock that could be converted, the GATED_CLOCK attribute can be used to tell the tool which one should be used.
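As an illustration of the case with more than one candidate clock, the sketch below ANDs two signals that could both be constrained as clocks and marks one of them with the attribute. The module and signal names are hypothetical, and whether the conversion actually happens still depends on the gated_clock_conversion setting and the constraints in use.

```verilog
// Hypothetical example: two candidate clocks feed the same gate.
module two_clock_gate (clk_a, clk_b, d, q);
    // Mark clk_a as the signal that should end up on the register clock pin.
    (* gated_clock = "yes" *) input clk_a;
    input  clk_b;   // intended to be treated as gating logic (clock enable)
    input  d;
    output reg q;

    wire my_clk = clk_a & clk_b;

    always @(posedge my_clk)
        q <= d;
endmodule
```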

When the tool can detect a gated clock, and the conversion feature is turned on, it will attempt to separate the clock from the rest of the logic in the gate.  If it can do so, the clock will directly drive the C pin of the register, and the rest will get assigned to the clock enable logic of the register.

One important thing to consider when doing gated clock conversion is hierarchy.  When converting a gated clock, the tool splits the clock from the rest of the logic and creates a new clock and a clock enable.  If the gated clock and the registers that the new clock will drive are in different levels of hierarchy, and something is keeping the hierarchy static (DONT_TOUCH, KEEP_HIERARCHY, and so on), the tool will not be able to convert the clock.

Fig. 4: Clock gating circuit in a different level of hierarchy
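A design with this shape could be sketched as follows, with the gate in one module and the registers in another. All module and instance names here are hypothetical; the comment marks where a hierarchy-preserving attribute would get in the way.

```verilog
// Hypothetical two-level design: the gate and the registers live in different modules.
module clk_gate_cell (input wire clk, input wire gate, output wire gclk);
    assign gclk = clk & gate;
endmodule

module reg_bank (input wire gclk, input wire [3:0] d, output reg [3:0] q);
    always @(posedge gclk)
        q <= d;
endmodule

module gate_hier_top (input wire clk, input wire gate,
                      input wire [3:0] d, output wire [3:0] q);
    wire gclk;

    // If this instance carried (* keep_hierarchy = "yes" *) or (* dont_touch = "true" *),
    // the clock and the enable could not be split across the module boundary,
    // so the conversion described in the text would not take place.
    clk_gate_cell u_gate (.clk(clk), .gate(gate), .gclk(gclk));
    reg_bank      u_regs (.gclk(gclk), .d(d), .q(q));
endmodule
```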

Basic gates

One of the more common forms of gated clock is a clock gated by basic gates (for example, AND gates).

Example RTL code:

assign my_clk = clk1 & gate1 & gate2;

This gates the clock with two different enables and the elaborated view looks like the following:

Fig. 5: Clock gating with AND gates

When this is synthesized with gated clock conversion enabled, and either gated_clock_conversion is set to "auto" with a clock constraint on clk1 or the GATED_CLOCK attribute is set on clk1, the tool connects clk1 to the C input of the register and routes gate1 and gate2 to its CE input.

Fig. 6: Previous circuit after synthesis

Conversion of OR gates will also work.

Fig. 7: Using OR gates in the clock circuit

The above circuit will get converted to the following:

Fig. 8: OR gate conversion
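For the OR case, the enable polarity is inverted compared with an AND gate. The sketch below places the gated form and a hand-written clock-enable equivalent side by side. The names are hypothetical, and the equivalence assumes gate1 only changes while clk1 is high, so that the OR output never produces an extra edge.

```verilog
// Hypothetical side-by-side view of an OR-gated clock and its clock-enable form.
module or_gated (input wire clk1, input wire gate1, input wire d,
                 output reg q_gated, output reg q_enabled);
    // Gated form: while gate1 is high, my_clk is stuck high and produces no edges.
    wire my_clk = clk1 | gate1;
    always @(posedge my_clk)
        q_gated <= d;

    // Clock-enable form: the register only updates while gate1 is low.
    always @(posedge clk1)
        if (!gate1)
            q_enabled <= d;
endmodule
```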

Vivado Synthesis can also convert more complex gates than ANDs and ORs.

 

Registered gates

Vivado can also convert gates that are registered. 

For example, the coding style below will create a register that gets used as a clock by another register:

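// reg_clk is an ordinary register clocked by clk ...
always @(posedge clk)
     reg_clk <= clk_in;
// ... whose output is then used as the clock of another register.
always @(posedge reg_clk)
     out1 <= in1;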

If there is an appropriate constraint on the first clk signal, the tool will be able to convert this type of gate as well.

For example:

create_clock -period 5 [get_ports clk]

Fig. 9: Elaborated view of a register used as a clock

This will get converted to the following:

Fig. 10: A register gate after being converted
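One way to reason about this conversion is to write the clock-enable equivalent by hand. The sketch below is not the netlist Vivado produces; it is a functional model under the assumption that in1 is stable around the rising edge of clk. The module name and the initial value of reg_clk are assumptions added for illustration.

```verilog
// Hand-written sketch of a clock-enable equivalent (not the actual Vivado netlist).
module registered_gate_ce (clk, clk_in, in1, out1);
    input  clk, clk_in, in1;
    output reg out1;

    reg reg_clk = 1'b0;   // initial value assumed so simulation does not start at X

    always @(posedge clk)
        reg_clk <= clk_in;

    // reg_clk rises on a clk edge where it is currently 0 and clk_in is 1,
    // so that is the only condition under which out1 should capture in1.
    always @(posedge clk)
        if (clk_in && !reg_clk)
            out1 <= in1;
endmodule
```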

Clock Dividers (1 bit)

Vivado can also handle more complex gating structures, such as a clock divider.

The code for this clock would look like the following:

module clock_gen (clk_in, clk_out, clk_out_div_2);
input clk_in;
output clk_out, clk_out_div_2;
reg div_2 = 1'b0;  // initial value so that simulation does not start at X

always@(posedge clk_in)
     div_2 <= ~div_2;

assign clk_out = clk_in;
assign clk_out_div_2 = div_2;

endmodule

This module takes an input clock called clk_in and creates a new clock at half the frequency called clk_out_div_2.  Both clocks are then driven out of the module, and the rest of the design uses them to clock its sequential elements.

Fig. 11: Clock divider
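A top level that uses both clocks might look like the sketch below. The module name, data ports and wire names are hypothetical. The instance name clks and the register name out1_2 are chosen so that they line up with the clks/div_2_reg/Q and out1_2_reg paths used in the constraints and timing commands in this section.

```verilog
// Hypothetical top level that consumes both outputs of clock_gen.
module divider_top (clk_in, in1, in2, out1, out1_2);
    input  clk_in, in1, in2;
    output reg out1, out1_2;

    wire clk_full, clk_half;

    clock_gen clks (
        .clk_in        (clk_in),
        .clk_out       (clk_full),
        .clk_out_div_2 (clk_half)
    );

    // Register on the full-rate clock.
    always @(posedge clk_full)
        out1 <= in1;

    // Register on the divided clock; synthesis would name this flop out1_2_reg.
    always @(posedge clk_half)
        out1_2 <= in2;
endmodule
```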

Because the divided clock does not run at the same frequency as its source, a generated clock constraint needs to be used.

create_clock -period 2.5 -name clk_in [get_ports clk_in]
create_generated_clock -source [get_ports clk_in] -divide_by 2 [get_pins clks/div_2_reg/Q]

These constraints create one clock called clk_in on the clk_in input port and give it a period of 2.5 ns.

They then create a new clock at half the frequency of clk_in.  This generated clock is defined on the Q pin of the div_2_reg flop, with the clk_in port as its master source.

When synthesis is run without converting the clocks, the timing information shows that the paths for the registers clocked by div_2_reg all use a period of 5 ns.

Example command:

report_timing -name timing -to out1_2_reg/D

The tool will return the following:

Fig. 12: Timing report for a divide-by-2 clock without gated clock conversion

Running the same design with gated clock conversion enabled exposes a problem.

The new netlist looks like the following:

Fig. 13: Gated clock conversion on a clock divider

As you can see, all of the registers are now driven by the original clock.  However, if you run report_timing on the converted registers, you now get a 2.5 ns period instead of 5 ns.  The reason is that div_2_reg/Q no longer drives the clock pins of those registers, so the generated clock constraint no longer affects their timing analysis.
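A hand-written functional model of the converted divider helps show why the 2.5 ns figure is misleading for these registers. In the sketch below (hypothetical module and port names, not the actual Vivado netlist), the register is clocked by clk_in but only enabled on every other edge, so it still updates once every 5 ns even though its clock pin sees the 2.5 ns clock.

```verilog
// Hand-written model of a divide-by-2 clock turned into a clock enable.
module div2_enabled (clk_in, in2, out1_2);
    input  clk_in, in2;
    output reg out1_2;

    reg div_2 = 1'b0;   // initial value assumed so simulation does not start at X

    always @(posedge clk_in)
        div_2 <= ~div_2;

    // div_2 rises on the clk_in edges where it is currently 0, so enabling the
    // register on !div_2 reproduces the capture edges of the divided clock.
    always @(posedge clk_in)
        if (!div_2)
            out1_2 <= in2;
endmodule
```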

To fix this, Vivado Synthesis gives you an anchor point on which to create a new generated clock constraint.  In the above figure, notice that the clock line branches off and goes through a new level of hierarchy that drives only the registers whose clocks were converted.  The synthesis log file also reports the name of this point and gives a new command to put in your XDC file so that timing comes out correct.

Fig. 14: Log file for clock dividers

Clock Dividers (multiple bits using a counter)

In addition to single-bit clock dividers, counters can also be used as clocks.  For example, a 4-bit counter can produce four different clocks: divide by 2, divide by 4, divide by 8, and divide by 16.

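module clk_gen (clk_in, clk_out);
input clk_in;
output reg [3:0] clk_out = 4'd0;  // initial value so that simulation does not start at X

// clk_out[0] toggles at clk_in/2, clk_out[1] at clk_in/4, and so on.
always @(posedge clk_in)
     clk_out <= clk_out + 1'b1;

endmodule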

Each counter bit can then be used as a clock, as follows:

Fig. 15: Using a counter as a clock generator
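In RTL, using the counter bits as clocks could look like the sketch below, where each bit of the counter clocks its own register. The top-level module name, the data ports and the instance name are hypothetical.

```verilog
// Hypothetical top level: each counter bit clocks one register.
module counter_clk_top (clk_in, d, q);
    input            clk_in;
    input      [3:0] d;
    output reg [3:0] q;

    wire [3:0] div_clk;   // bit 0 = divide by 2 ... bit 3 = divide by 16

    clk_gen clks (.clk_in(clk_in), .clk_out(div_clk));

    always @(posedge div_clk[0]) q[0] <= d[0];
    always @(posedge div_clk[1]) q[1] <= d[1];
    always @(posedge div_clk[2]) q[2] <= d[2];
    always @(posedge div_clk[3]) q[3] <= d[3];
endmodule
```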

Running this with gated clock conversion turned on results in the following:

Fig. 16: Gated clock conversion with a counter

Notice that a feedthrough anchor point has been created for each new clock.  These anchor points are also reported in the log so that they can be used on subsequent runs.

Fig. 17: Log file for a counter acting as a gated clock

Posted by 数字IC那些事