A Peer Reviewed Open Access International Journal www.ijiemr.org

### **COPY RIGHT**

2018 IJIEMR. Personal use of this material is permitted. Permission from IJIEMR must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. No reprint should be done of this paper; all copyright is authenticated to the paper authors.

IJIEMR Transactions, online available on 19<sup>th</sup> April 2018. Link: http://www.ijiemr.org/downloads.php?vol=Volume-7&issue=ISSUE-4

Title: ENERGY EFFICIENT SYNCHRONOUS SEQUENTIAL CIRCUITS DESIGN USING CLOCK GATING. Volume 07, Issue 04, Page No: 63-75.

**Paper Authors** \*SANAM JYOTHI, DONE SRIDHAR. \* Ramachandra College of Engineering, Vatluru.

# ENERGY EFFICIENT SYNCHRONOUS SEQUENTIAL CIRCUITS DESIGN USING CLOCK GATING

### \*SANAM JYOTHI, \*\*DONE SRIDHAR

\*M.Tech (VLSI), Ramachandra College of Engineering, Vatluru, Andhra Pradesh 534007.
\*\*Associate Professor, Dept of ECE, Ramachandra College of Engineering, Vatluru, Andhra Pradesh 534007.

ABSTRACT: Pulsed latches are gaining increased visibility in low-power ASIC designs. They provide an alternative sequential element with high performance and low area and power consumption, taking advantage of both latch and flip-flop features. While circuit reliability and robustness against process, voltage, and temperature (PVT) variations are considered critical issues in current technologies, no significant reliability study has been proposed for pulsed latch circuits.
In this paper, we present a study on the effect of different PVT variations on the behavior of pulsed latches, considering the effect on both the pulser and the latch. In addition, two novel design approaches are presented to enhance the reliability of pulsed latch circuits while keeping their main advantages of high performance, low power, and small area. Experiments performed using the Synopsys 28nm PDK demonstrate the ability of the proposed approaches to keep the same reliability level at different supply voltages and temperatures in the presence of process variations, with a very small area overhead of around 3%. The two proposed designs have negligible power overhead when running at nominal supply voltage, and they have higher yield per unit power when compared with the traditional design at different voltages and temperatures.

**Index Terms**: Pulsed latches, flip-flops, pulsed flip-flops, variability, process variation, voltage scaling, low power.

### I. INTRODUCTION

Flip-flops are considered the most popular sequential elements used in conventional ASIC designs. This is mainly because of the simplicity of their timing model, which makes the design and timing verification processes much easier. Master-Slave Flip-Flops (MSFFs) are the most common and traditional implementations of flip-flops, due to their stable operation and simple timing characteristics. However, because the MSFF micro-architecture is usually built from two consecutive latches, it takes up an appreciable portion of the clock period, power consumption, and area. A typical MSFF has a significant nominal timing overhead (the sum of the clock-to-Q delay and the setup time) of 6 FO4 (fanout-of-4), which can reach 10 FO4 when considering clock skew and jitter. In addition, the clock network, including the flops, often consumes one third to one half of the total dynamic power of the chip.
In addition to the overheads mentioned above, extra margins, which can reach up to 15% (depending on the sign-off methodology), are usually added to the nominal timing margins of MSFFs to ensure correct operation under different process, voltage, and temperature (PVT) variations. This, in turn, increases the already high timing and power overheads. For these reasons, MSFFs can be considered a good choice for low-to-medium performance designs, as they provide a good balance between delay, power, and ease of design and verification for chips working at a relatively low frequency.

Fig. 1. Simple diagram of a traditional transmission gate pulsed latch.

On the other hand, high performance custom designs tend to use latches due to their lower timing overhead, which can reach 2 FO4 in some designs. Although latch-based designs are typically robust to clock skew and jitter (due to the latch transparency period), latches have a complicated timing model, which, in turn, complicates the design and verification processes and increases the risk of hold time violations, especially with PVT variations.

To fill the gap between MSFFs and latches, pulsed latches (sometimes called pulsed flip-flops) have been used in some high-performance designs. Pulsed latches (PLs) are latches driven by short pulses generated from the normal clock signal using a pulse generator circuit called a *pulser*, as shown in Fig. 1. The pulser can either be embedded in the latch or be separated as a standalone circuit, as shown in Fig. 1. If the latter approach is used, a single pulser can be shared by more than one latch. It therefore saves area and power compared with the former approach, and it is the focus of our discussion in this paper. In addition, the use of a pulser can eliminate the need for some of the clock buffers in the clock tree, providing additional power and area savings.
Having only one latch between the input and the output, PLs have lower timing overhead than MSFFs. At the same time, since the driving pulse is very short, the transparency period of the latch becomes very narrow, allowing PLs to have a timing behavior close to that of MSFFs, to the extent that they are sometimes classified among flip-flop families. Also, due to the narrow transparent window of the latch, pulsed latches have an inherent tolerance to clock skew and jitter. Since they have fewer transistors triggered by the clock signal, they significantly reduce clocking power, and they consume much less leakage power than MSFFs due to their smaller area and fewer transistors.

In this paper, we present a variability analysis of one of the popular pulsed latch topologies, the Transmission Gate Pulsed Latch (TGPL), studying the effects of process, voltage, and temperature variations, and we propose design modifications that help decrease the probability of circuit failure (i.e., enhance pulsed latch reliability) at different supply voltage values. With the proposed approaches, pulsed latches present a formidable alternative to MSFFs, providing higher performance, lower area and power consumption, and higher robustness to different kinds of variations. The main contribution areas of this paper are:

- A study of the effect of PVT variations on the operation of pulsed latches in advanced technology nodes, considering the effects on both the pulser and the latches.
- Two novel pulse generator designs for the pulsed latch that can be utilized to increase the reliability of pulsed latch circuits, while keeping its main advantages of high performance and low power consumption.
- Comprehensive comparisons of reliability, power, and area between different data registers implemented using the traditional transmission gate pulsed latches and the proposed reconfigurable pulsed latches.

The remainder of the paper is organized as follows. Section II discusses some previous work in this area. Section III discusses PVT variations and their effects on pulsed latch behavior. Section IV discusses the two proposed design approaches to improve the reliability of pulsed latches under supply voltage scaling. Section V discusses the results obtained for the enhancement of circuit reliability, power, and area on a case study of a typical register. Finally, some conclusions are drawn in Section VI.

### **II. PREVIOUS WORK**

Pulsed latches have long been proposed as a way to decrease power consumption and increase performance. In one study, PLs with relatively wide pulse widths were used to allow cycle borrowing and tolerate any clock skew. In order to compensate for any hold time issues, dearer circuits were used to block any incoming fast data before the end of the pulse. In another, PLs were used as the main sequential elements to increase the performance of the Intel XScale microprocessor without consuming high clock power. Although the minimum pulse widths that ensure reliable operation across different PVT corners were used, delay buffers were still needed to decrease the risk of hold time violations. Baumann et al. proposed three options for selectively replacing some of the MSFFs by PLs to improve the performance of the ARM926 microprocessor. However, some area and power overhead was incurred due to buffer insertion. Baumann et al. also proposed replacing MSFFs by PLs in an ARM microprocessor. This was used to gain some performance improvement that was utilized as timing margin to compensate for within-die variations.
In other work, a pulse generator with a pulse width control option was presented to enhance the reliability against Bias Temperature Instability (BTI). Elsewhere, a traditional pulser and a latch formed by a tri-state inverter and a static keeper were proposed. The pulse width was chosen large enough to ensure correct operation under different PVT corners, assuming that delay cells can be used to fix any hold time issues. The impact of process variations on several pulsed flip-flops has also been presented, together with two techniques to reduce that impact. However, the effects of voltage and temperature were not studied, and the proposed techniques were not quantified under these two effects.

Dhong et al. presented a novel pulser design whose output pulse width is determined by the voltage level at an input of a NAND gate instead of the delay chain of the traditional TGPL pulser. The paper showed that the proposed pulser is less affected by the clock rise time than the traditional pulser at different supply voltages. However, the paper defined the failure criterion as the ability of the pulser to output a valid pulse, without quantifying whether the pulse width satisfies the latch transparency window needed for a successful write. In addition, the study did not quantify the effect of these variations on the design yield. Also, the studied voltage variation was limited to 10% only, while the temperature variation studied was only 20°C around 85°C.

As the previous studies show, the TGPL, which is our focus in this paper, is one of the most attractive architectures for PL circuits. However, there are still some challenges in TGPL design (and PL design in general) to ensure reliable operation under PVT variations. In addition, a more comprehensive study of the probability of failure based on both the pulser and the latch is still missing.
Although the study and the proposed architectures presented in this paper focus on the TGPL, the same approaches can be applied to any other PL topology.

### III. EFFECT OF PVT VARIATIONS ON PULSED LATCHES

The operation of PLs is based on enabling the latch for a short time using a pulse generated by the pulser circuit. Hence, to study the effect of variation on PL operation, variation effects on both the latch write time and the pulser pulse width should be studied. The study of process variations is carried out for the latch and the pulser independently, and the same study is repeated for the different voltage and temperature values of interest.

#### A. Process Variations

Due to the extreme miniaturization of device parameters in current and upcoming technology processes, even a small variation in the manufacturing process may cause parameter variations that lead to failed circuit operation. Thus, one of the significant challenges in the design phase is the ability to evaluate the effect of different sources of variation on the functionality of complex circuits and to provide circuit solutions that guarantee correct functionality under these variations. Process-dependent sources of variability, such as effective channel length variation, oxide thickness variation, Line Edge Roughness (LER), and Random Dopant Fluctuation (RDF) (for planar MOSFETs), result in variations in the threshold voltages of transistors, which in turn impact the timing and power of digital circuits.
The threshold voltage variations due to RDF (which is usually the principal source of threshold voltage variations in planar MOSFETs) are considered as zero-mean Gaussian independent random variables with a standard deviation denoted as $\sigma_{Vth}$, which is given by:

$$\sigma_{Vth} = \sigma_{Vtho} \sqrt{\frac{L_{min} W_{min}}{LW}} \tag{1}$$

where $L$ and $W$ are the transistor channel length and width, and $\sigma_{Vtho}$ is the $\sigma_{Vth}$ of a minimum-sized transistor, given by:

$$\sigma_{Vtho} = \frac{q T_{ox}}{\epsilon_{ox}} \sqrt{\frac{N_a W_d}{3 L_{min} W_{min}}} \tag{2}$$

where $q$ is the electron charge, $\epsilon_{ox}$ is the oxide permittivity, $N_a$ is the effective channel doping, $W_d$ is the depletion region width, $T_{ox}$ is the oxide thickness, and $L_{min}$ and $W_{min}$ are the minimum channel length and width, respectively.

While the scaling down of CMOS technology reduces the nominal supply voltage, the threshold voltages are not scaled by the same factor, leading to a significant reduction of the transistor's available voltage headroom (the difference between the supply voltage and the threshold voltage). Hence, even a small variation in the transistor threshold voltage can lead to a significant degradation of the circuit behavior or can even cause complete circuit failure.

Fig. 2. Sample PDFs of the latch write time and the pulser pulse width showing the region of write failure.

Studying the effect of process variations on PLs includes studying the effect of variations on both the latch write time and the pulser pulse width. As shown in Fig. 2, this is represented by the probability distribution functions (PDFs) of both the write time (Latch WR Time), calculated as the CLK-to-Q delay, and the pulse width (Pulser PW). To ensure a correct write operation, the pulse width should be larger than the required transparent window of the latch (i.e., the time needed to capture the input data and pass it through the internal nodes to the storing cross-coupled inverters).
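As a rough illustration of this write condition, the sketch below estimates the probability that the pulse width falls short of the latch write time. It assumes, purely for illustration, that both quantities are independent Gaussian random variables; the function names and all numeric values are hypothetical and are not taken from the paper's simulations. The second function instead fixes the write time at a designer-chosen k-sigma worst case and compares the pulse width against that single value.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def write_failure_probability(mu_pw: float, sigma_pw: float,
                              mu_wr: float, sigma_wr: float) -> float:
    """P(PW < WR) for independent Gaussian pulse width and write time.

    PW - WR is Gaussian with mean (mu_pw - mu_wr) and standard deviation
    sqrt(sigma_pw^2 + sigma_wr^2); a write fails when it is negative.
    """
    margin = mu_pw - mu_wr
    spread = math.hypot(sigma_pw, sigma_wr)
    return normal_cdf(-margin / spread)


def write_failure_probability_k_sigma(mu_pw: float, sigma_pw: float,
                                      mu_wr: float, sigma_wr: float,
                                      k: float) -> float:
    """P(PW < mu_wr + k * sigma_wr): pulse width below a fixed worst-case write time."""
    t_max = mu_wr + k * sigma_wr
    return normal_cdf((t_max - mu_pw) / sigma_pw)


if __name__ == "__main__":
    # Hypothetical timing values in picoseconds (assumptions, not measured data).
    p_overlap = write_failure_probability(60.0, 5.0, 40.0, 4.0)
    p_ksigma = write_failure_probability_k_sigma(60.0, 5.0, 40.0, 4.0, k=3.0)
    print(f"P(PW < WR)            = {p_overlap:.3e}")
    print(f"P(PW < 3-sigma bound) = {p_ksigma:.3e}")
```

Under these assumptions, widening the mean pulse width or reducing either standard deviation lowers both estimates, which is the lever a designer adjusts through transistor sizing.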
The area under the intersection between the two PDFs represents the failure of the write operation, since this is the region where there is a high probability that the pulse width will be smaller than the time needed by the latch. Alternatively, knowing the distribution of the latch write time, and to simplify timing analysis, a maximum value for the latch write time can be calculated for a sigma value of the designer's choice. In this case, the probability of write failure can be calculated as the probability that the width of the pulser output is smaller than this maximum value. In both cases, depending on the target yield, the designer can determine the acceptable probability of circuit failure, and the transistor dimensions can then be adjusted to reach the target yield.

Fig. 3. The effect of voltage scaling on the distributions of the latch WR time and the pulser PW at $125^{\circ}$C.

#### **B. Voltage Scaling**

Voltage scaling is a popular run-time technique used for reducing the power consumption of circuits. It significantly decreases both dynamic power (with its two components of switching power and internal power) and leakage power. On the other hand, the ability to reduce the operating supply voltage is limited by a minimum value, usually determined by timing constraints (critical path delay, for example), in addition to margins for PVT variations and, usually, a margin for aging effects. As the supply voltage is scaled down, the available voltage headroom decreases further and the transistors become more sensitive to any variations.

The effect of voltage scaling is naturally associated with an increase in the timing delays of different circuit components. While this can be handled at design time for several circuit components, the case may not be as easy for PLs.
Since PL operation depends on two different components (the pulser and the latches) with different micro-architectures, the timing of each of them is affected differently. As shown in Fig. 3, voltage scaling affects the probability distributions of the pulser and the latch differently. As shown in Fig. 4, the probability of write failure for a PL can increase by up to two orders of magnitude when the supply voltage is scaled down by around 30%. Even if the PL circuit is designed to operate reliably at an intermediate supply voltage (0.9V, for example), the reliability will still degrade significantly at lower voltages, especially at low operating temperatures. One possible solution is to design the PL circuit to operate with the needed level of reliability at the lowest possible operating voltage. However, since chips usually operate at different supply voltages in different operating modes, pulsed latches operating at a voltage higher than that minimum value will run with extra timing margin (the pulse width will be larger than the width needed to achieve the required level of reliability).

Fig. 4. The probability of failure of a traditional pulsed latch designed at nominal supply voltage, at different supply voltages and temperatures.

#### C. Temperature Effect

Studying the effect of temperature variation on the design is very important. Not only does the variation in temperature affect leakage power and performance, but it also affects the probability of having an error during circuit operation, as well as impacting the life span of different chip parts. Factors such as the increase of leakage power with technology process scaling, the non-equivalent down-scaling of the supply voltage compared to geometry scaling, and the increase in dynamic power associated with the performance required in current designs all lead to an increase of the operating temperature.
Careful study of the effect of temperature is required, especially for time-sensitive sequential elements. The study of temperature effects on pulsed latches is even more critical, since the pulser and the latches can each have a different response to temperature variation. The study done in this paper shows that both circuits become more sensitive to process variation as the temperature decreases. However, the pulser is more significantly affected by temperature variations. In addition, the entire PL would have high failure rates when operating at a lower temperature. When running at nominal supply voltage, the transistors become faster as the temperature decreases. Since the pulser is more timing sensitive than the latch, the timing margin between the latch write time and the pulser output pulse width decreases as the temperature decreases; hence, the probability of write failure is expected to increase as the temperature decreases. The standard deviation of the latch distribution doubles when the temperature decreases from 125°C to −40°C, while the standard deviation of the pulser distribution increases by more than 60%. This significant increase in the variation of the latch and pulser timing with decreasing temperature leads to the increase in the probability of write failure.

Fig. 5. A diagram showing arbitrary PDFs for the pulser and the latch when (a) operating at nominal supply voltage, (b) scaling down the supply voltage without configuring the pulser (or having a fixed pulser), and (c) scaling down the supply voltage and configuring the pulser circuit to generate a wider output pulse.

### IV. PROPOSED DESIGN APPROACHES

As described in the previous section, it is not easy to design a non-configurable pulsed latch circuit that can operate with just the needed timing margins at different supply voltages in the presence of process and temperature variations while keeping the needed level of reliability. To reach the needed reliability level, the pulser circuit should be reconfigured at run time to generate an output pulse whose width can be controlled based on the operating condition. Fig. 5(a) shows a pulsed latch circuit designed to operate correctly at nominal supply voltage with a high level of reliability. However, when the supply voltage is scaled down as shown in Fig. 5(b), the circuit becomes less reliable, with a higher probability of failure. The required level of reliability can be achieved at the lower supply voltage by increasing the width of the generated pulse. As shown in Fig. 5(c), this is equivalent to shifting the pulser probability distribution to the right, compensating for the increased variation effects at lower voltages and therefore decreasing the probability of circuit failure.

In this section, two design approaches are proposed. Both approaches depend on controlling the delay path (the delay unit and its following inverter) of the pulser circuit by using an external control signal (CTRL) to generate a controllable pulse width. The first approach splits the supply rail of the pulser circuit and applies an additional controllable level of voltage scaling to the delay path when needed. The second approach relies on using multiple delay units in the pulser circuit and choosing a certain delay unit at run time according to the operating condition. Detailed discussions of the two approaches are presented in the next two subsections.

Fig. 6. The proposed header switches-based pulser design.

#### A. First Approach

This approach is based on using a virtual supply rail for the delay path of the pulser, driven from the main supply rail used for the rest of the pulser circuit and the latches. This can be accomplished using header PMOS switches for the delay path of the pulser circuit, similar to the local power gating topology [25], as shown in Fig. 6, where turning off some of these switches lowers the supply voltage of the delay path. Since this delay path is the main part of the circuit that controls the width of the generated pulse, controlling the supply voltage of this path controls the output pulse width. Separate control signals can be used for different switches, where at least one of these switches must always be turned on (i.e., the gate of this PMOS switch should be tied to ground), giving the maximum output pulse width, while the other switches can be turned on or off to achieve the required narrowing of the pulse width. The number of these parallel switches and their sizes depend on the number and values of the virtual supply voltage levels, which correspond to the pulse widths needed to achieve the target reliability level at different operating conditions. Since the delay chain current represents only 20-30% of the total pulser current, the sizes of these PMOS switches should be reasonable, adding a small area overhead to the pulser circuit.

When scaling down the supply voltage, the needed margin for variations in the latch write time increases. By turning off part of the pulser header switches, an additional down-scaling of the virtual supply of the delay path (*VDI*) is provided; i.e., the delay unit and its following inverter run at a slightly lower supply than the rest of the pulser circuit. This additional voltage scaling of *VDI* results in a small increase in the pulser output pulse width.
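The widening effect can be illustrated with the alpha-power-law delay model, a common first-order approximation in which gate delay scales as V/(V − Vth)^α. This is a sketch under assumptions, not the paper's simulation: the threshold voltage, the exponent, the supply values, and the function names are all hypothetical, and the pulse width is simply taken to be proportional to the delay of the delay path.

```python
def alpha_power_delay(v_supply: float, v_th: float = 0.35, alpha: float = 1.3) -> float:
    """Relative gate delay under the alpha-power law (arbitrary units).

    v_th and alpha are illustrative assumptions, not values from a real PDK.
    """
    if v_supply <= v_th:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return v_supply / (v_supply - v_th) ** alpha


def pulse_width_gain(v_main: float, v_drop: float,
                     v_th: float = 0.35, alpha: float = 1.3) -> float:
    """Ratio of the pulse width with the delay path at (v_main - v_drop)
    to the pulse width with the delay path at v_main."""
    lowered = alpha_power_delay(v_main - v_drop, v_th, alpha)
    nominal = alpha_power_delay(v_main, v_th, alpha)
    return lowered / nominal


if __name__ == "__main__":
    # Same 30 mV drop on the virtual rail, at a nominal and a scaled main supply.
    for v_main in (1.0, 0.7):
        gain = pulse_width_gain(v_main, v_drop=0.03)
        print(f"main supply {v_main:.2f} V -> pulse width x{gain:.3f}")
```

In this simplified model, the same small drop on the virtual rail widens the pulse more at the scaled-down main supply than at the nominal one, because the delay path is already operating closer to the threshold voltage.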
Since the circuit is already operating with small voltage headroom at this lower supply voltage, a very small decrease in *VDI* is sufficient to produce an adequate increase in the pulse width without a significant difference between the supply voltages of the delay path and the rest of the pulser circuit. In addition, the remaining pulser circuit (the NAND gate and the output inverter) acts as a voltage level shifter, driving the latches at the same voltage level as their supply voltage.

Fig. 7. The proposed MUX-based pulser design.

#### **B. Second Approach**

Implementing multiple delay units with different delays can help in generating pulses with different widths. One important design consideration is the ability to choose between these different units post-silicon or at run time. The second proposed pulser design is shown in Fig. 7. Each delay unit represents a buffer chain that can be implemented in different ways. It can be as simple as a very small delay unit (i.e., just a wire) or as large as an even number of inverters of different sizes and/or counts. The output of the multiplexer is used to drive an odd number of inverters, whose final output is connected to the NAND gate. By selecting a longer delay chain, the latch transparency window can be increased at run time, which is required when scaling down the supply voltage. The shortest delay unit is designed such that, when operating at nominal supply voltage, the circuit is verified to run with a very low probability of failure in the presence of different process and temperature variations. The rest of the delay units are designed depending on the number and values of the supply voltage scaling levels.

### V. RESULTS

To verify the proposed approaches, test circuits of a 16-bit register were examined and three implementation choices were compared.
Each of the three implementations consists of a single pulser driving sixteen identical latches, similar to that shown in Fig. 1. The first choice is the implementation using the traditional non-configurable pulser shown in Fig. 1. The pulser was designed at nominal supply voltage to ensure the required reliability level. The second and third choices are the two proposed pulser implementations, also driving sixteen identical latches. One level of voltage scaling was applied to all circuits. An extreme value of voltage scaling, which is usually around a 30% reduction from the nominal supply value, was used to show the effectiveness of the proposed approaches. The same approaches can easily be extended to any other scaling values.

Fig. 8. Pulser used for the PL-SW based register.

Fig. 9. Simulated results of the pulser used for the PL-SW based register.

Fig. 10. Pulser used for the PL-MUX based register.

Fig. 11. Simulated results of the pulser used for the PL-MUX based register.

Hence, for designs with few voltage scaling levels, the PL-MUX design can be preferable to the PL-SW one, as it is easier to design, generates more precise pulse widths, and has reasonable power and area overheads. On the other hand, for designs with a large number of voltage scaling levels, the PL-SW design is preferred, as the area and power overheads of the PL-MUX design would be significant.

### VI. CONCLUSION

In this paper, an analysis of the effect of PVT variations on pulsed latch performance was presented. The analysis considered both the pulser and the latch to evaluate the reliability of the entire pulsed latch circuit. In addition, the benefits of having a reconfigurable pulsed latch circuit were discussed. Two novel modifications to add reconfiguration ability to TGPL circuits were proposed.
The benefits of using the proposed design approaches in enhancing the robustness of pulsed latch circuits at different supply voltages were demonstrated using 16-bit registers. Both proposed approaches were able to ensure reliable operation of the pulsed latch-based register under different supply voltages in the presence of process and temperature variations, without any unnecessary timing overhead. Both approaches have a very small area overhead of around 3% or less. In addition, the power overhead of both approaches is minimal when compared to the traditional pulsed latch-based register at the same reliability level. Both approaches are easily scalable to cover different levels of voltage scaling. In addition, they can be applied to any other pulsed latch topology that depends on a delay path to generate the output pulse.

### **REFERENCES**

- [1] D. Chinnery and K. Keutzer, *Closing the Gap Between ASIC & Custom: Tools and Techniques for High-Performance ASIC Design*. Norwell, MA, USA: Kluwer, 2002.
- [2] S. Paik, G.-J. Nam, and Y. Shin, "Implementation of pulsed-latch and pulsed-register circuits to minimize clocking power," in *Proc. IEEE/ACM Int. Conf. Comput.-Aided Design (ICCAD)*, Nov. 2011, pp. 640–646.
- [3] Y. Shin and S. Paik, "Pulsed-latch circuits: A new dimension in ASIC design," *IEEE Des. Test Comput.*, vol. 28, no. 6, pp. 50–57, Nov./Dec. 2011.
- [4] M. A. Alam, K. Roy, and C. Augustine, "Reliability- and process-variation aware design of integrated circuits: A broader perspective," in *Proc. IEEE Int. Rel. Phys. Symp. (IRPS)*, Apr. 2011, pp. 4A.1.1–4A.1.11.
- [5] E. Consoli, G. Palumbo, J. M. Rabaey, and M. Alioto, "Novel class of energy-efficient very high-speed conditional push-pull pulsed latches," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 22, no. 7, pp. 1593–1605, Jul. 2014.
- [6] J. Warnock *et al.*, "Circuit and physical design of the zEnterprise EC12 microprocessor chips and multi-chip module," *IEEE J. Solid-State Circuits*, vol. 49, no. 1, pp. 9–18, Jan. 2014.
- [7] T. Baumann *et al.*, "Performance improvement of embedded low-power microprocessor cores by selective flip flop replacement," in *Proc. 33rd Eur. Solid State Circuits Conf. (ESSCIRC)*, Sep. 2007, pp. 308–311.
- [8] L. T. Clark *et al.*, "An embedded 32-b microprocessor core for low-power and high-performance applications," *IEEE J. Solid-State Circuits*, vol. 36, no. 11, pp. 1599–1608, Nov. 2001.
- [9] M. Alioto, E. Consoli, and G. Palumbo, "Analysis and comparison in the energy-delay-area domain of nanometer CMOS flip-flops: Part I: Methodology and design strategies," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 5, pp. 725–736, May 2011.
- [10] M. Alioto, E. Consoli, and G. Palumbo, "Analysis and comparison in the energy-delay-area domain of nanometer CMOS flip-flops: Part II: Results and figures of merit," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 5, pp. 737–750, May 2011.