

EPFL STI – SEL

Phone: +4121 693 1133

ELG

Fax :

Station n° 11

E-mail: [alexandre.levisse@epfl.ch](mailto:alexandre.levisse@epfl.ch)

CH-1015 Lausanne



Advanced VLSI - 2021/2022

SEL February 2022

## Advanced VLSI D FLIP-FLOP Design

## 1. OBJECTIVES

The goal of this session is to understand the working principle of a D flip-flop (DFF) based register architecture and to design and simulate a selected 1-bit DFF, which will be used in the multi-bit DFF of the Viterbi Decoder. Flip-flops are the basic building blocks of sequential logic circuits, and each flip-flop can store one bit. The name comes from the fact that the output flips and flops between “0” and “1”. DFFs are widely used in data storage elements such as shift registers, and in many mixed-mode blocks such as PRBS (pseudorandom binary sequence) generators, memories and CDR (clock data recovery) circuits.

In this session, you will explore different DFF architectures and are free to choose among them. You are required to implement one topology, draw its layout and compare the post-layout results with what you expected from the schematic design.

You are also expected to optimize your final selected design for minimum area-delay product through appropriate transistor sizing, while preserving acceptable dynamic (and, in some cases, static) power consumption. To read more about flip-flop architectures, do a literature search and read the following references, available on Moodle.

References:

[1] D. Markovic, B. Nikolic, R. W. Brodersen, "Analysis and Design of Low-Energy Flip-Flops", in *Proceedings of the International Symposium on Low Power Electronics and Design*, Huntington Beach, CA, USA, 2001, pp. 52–55.

[2] Sung-Mo Kang and Yusuf Leblebici, "CMOS Digital Integrated Circuits: Analysis and Design", third edition, Chapter 8, pp. 330–350.

## 2. UNDERSTANDING A DFF

The D flip-flop captures the value of the D-input at a definite portion of the clock cycle (such as the rising edge of the clock). That captured value becomes the Q output. At other times, the output Q does not change.



**Figure 1:** Symbol and truth table of a rising-edge-triggered DFF with asynchronous reset.

Most D-type flip-flops in ICs have the capability to be forced to the set or reset state (which ignores the D and clock inputs).
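
The behavior described above can be sketched as a small behavioral model. This is only an illustration, not a circuit-level description: the class name `DFlipFlop`, the `step` method and the active-high reset polarity are assumptions made for the example.

```python
class DFlipFlop:
    """Behavioral model of a rising-edge-triggered DFF.

    Assumption for this sketch: the asynchronous reset is active high
    and forces Q to 0 regardless of D and CLK.
    """

    def __init__(self):
        self.q = 0          # stored bit
        self._prev_clk = 0  # last clock level seen, used to detect rising edges

    def step(self, d, clk, reset=0):
        """Apply one set of input levels and return the resulting Q."""
        if reset:
            self.q = 0                          # reset ignores D and CLK
        elif clk == 1 and self._prev_clk == 0:
            self.q = d                          # capture D on the rising edge only
        self._prev_clk = clk
        return self.q


ff = DFlipFlop()
trace = [ff.step(d, clk, r) for d, clk, r in [
    (1, 0, 0),  # clock low: Q keeps its initial value 0
    (1, 1, 0),  # rising edge: Q captures D = 1
    (0, 1, 0),  # clock high, no edge: Q holds 1
    (0, 0, 0),  # falling edge: Q holds 1
    (1, 0, 1),  # reset asserted: Q forced to 0
    (1, 1, 0),  # reset released, rising edge: Q captures D = 1
]]
print(trace)  # [0, 1, 1, 1, 0, 1]
```

The model ignores all timing (setup, hold and CLK-Q delay); it only captures the truth-table behavior.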

The easiest and most common way to design a FF is to use two latches in a master-slave configuration, in which one latch is transparent high and the other is transparent low; these are called *Master-Slave Flip-Flops*. Another simple way is to use a pulse generator that produces short pulses at the clock edges and feeds these short pulses to a latch; these are called *Pulse-Triggered Flip-Flops*.



**Figure 2:** Master-Slave and Pulse-Triggered Flip-Flops.
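
As a rough illustration of the master-slave principle, the sketch below chains two idealized, zero-delay latches. The function names `latch` and `master_slave_dff` are hypothetical, and the choice of a transparent-low master followed by a transparent-high slave (giving a rising-edge FF) is an assumption for the example.

```python
def latch(enable, d, state):
    """Level-sensitive latch: follows d while enable is true, otherwise holds state."""
    return d if enable else state


def master_slave_dff(samples):
    """Simulate a rising-edge master-slave FF over a list of (clk, d) samples."""
    master = slave = 0
    q_trace = []
    for clk, d in samples:
        master = latch(not clk, d, master)  # master is transparent while CLK is low
        slave = latch(clk, master, slave)   # slave is transparent while CLK is high
        q_trace.append(slave)
    return q_trace


q = master_slave_dff([
    (0, 1),  # CLK low: master follows D = 1, slave holds 0
    (1, 1),  # CLK high: master holds 1, slave passes it to Q
    (1, 0),  # D changes while CLK is high: master is opaque, Q stays 1
    (0, 0),  # CLK low: master follows D = 0, slave holds 1
    (1, 0),  # CLK high: slave passes the new master value 0
])
print(q)  # [0, 1, 1, 1, 0]
```

Because only one latch is transparent at any time, a change on D while CLK is high never reaches Q until the next rising edge.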

Here are some well-known examples of DFF:

- **Master-Slave:** C2MOS FF, Transmission Gate FF, Gated Master-Slave FF, Write-Port Master-Slave FF, Data-Transition Look Ahead FF, etc.
- **Pulse-Triggered:** Transmission-Gate Pulsed Latch, Semi-Dynamic FF, Hybrid Latch FF, Implicitly Push-Pull FF, Conditional Precharge FF, etc.

- **Dual-Edge-Triggered:** Transmission-Gate Latch Mux FF, Symmetric Pulse Generator FF, Static Pulsed Latch, Conditional Discharge FF, etc.
- **Differential:** Modified Sense Amplifier FF, Skew Tolerant FF, Conditional Capture FF, Variable Sample Window FF, etc.

## 3. CHARACTERIZING A DFF

In a digital system, every flip-flop needs to satisfy design specifications that are important for the overall synchronization and the reliable flow of data. The three most important parameters for FF characterization are *CLK-Q Delay, Setup Time and Hold Time*.



**Figure 3:** Typical FF Environment in a Digital System.

These three parameters determine both the overall flip-flop delay (latency) and the internal signal race immunity.

- **CLK-Q Delay**

Represents the delay measured from the active clock edge to the output. It can change dramatically depending on how close the data transition is to the clock edge. Setup and hold times are therefore often defined with respect to the corresponding change in the *CLK-Q Delay*.

- **Setup Time**

Represents the minimum time the data signal must be valid and held constant before the active clock edge, so that the input signal is sampled correctly.

- **Hold Time**

Represents the minimum time the data signal must be kept valid and constant after the active clock edge, so that the input signal is sampled correctly.

Note that when the data changes very close to the clock edge, the CLK-Q delay can increase significantly. The setup and hold times are therefore usually defined as the data-to-clock timing at which the CLK-Q delay changes by a specified maximum amount (5% for example).



**Figure 4:** Definition of Setup and Hold Times.
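
The 5% criterion can be applied to a sweep of simulated points as in the sketch below. The function name `setup_time_from_sweep`, the data format and the numbers are all hypothetical and only illustrate the idea: the setup time is taken as the smallest data-to-clock offset whose CLK-Q delay stays within the allowed degradation of the relaxed value.

```python
def setup_time_from_sweep(sweep, max_degradation=0.05):
    """Estimate the setup time from (data_to_clock_offset, clk_q_delay) pairs.

    A clk_q_delay of None marks a point where the data was not captured.
    The point with the largest offset is assumed to give the relaxed
    (nominal) CLK-Q delay.
    """
    points = sorted(sweep, key=lambda p: p[0], reverse=True)
    nominal = points[0][1]
    limit = nominal * (1 + max_degradation)
    setup = None
    for offset, delay in points:
        if delay is None or delay > limit:
            break           # degradation exceeded or capture failed
        setup = offset      # smallest offset still within the limit so far
    return setup


# Hypothetical sweep results, all values in ps.
sweep_ps = [(200, 100.0), (150, 100.5), (100, 102.0),
            (80, 104.0), (60, 109.0), (40, None)]
print(setup_time_from_sweep(sweep_ps))  # 80
```

The same approach works for the hold time if the swept offset is the time the data stays valid after the clock edge.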

In order to maintain the correct operation of the digital system the DFF has to fulfill the following requirement:

The clock period has to be greater than or equal to the sum of the worst-case CLK-Q delay, the setup time, the worst-case critical-path combinational logic delay, and the relative clock skew. In other words:

$$t_{CLK-Q} + t_{setup} + t_{logic} + t_{skew} \leq T_{CLK}$$

The internal signal race immunity ensures that the data is sampled correctly despite possible clock skew:

$$t_{CLK-Q,worst} - t_{hold} \geq t_{skew}$$
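
The two inequalities above can be checked numerically with a short sketch such as the following. The function names and the example values are hypothetical; `t_worst` stands for the worst-case term on the left-hand side of the race-immunity inequality.

```python
def min_clock_period(t_clk_q, t_setup, t_logic, t_skew):
    """Smallest clock period that satisfies the first inequality."""
    return t_clk_q + t_setup + t_logic + t_skew


def race_margin(t_worst, t_hold, t_skew):
    """Margin of the race-immunity inequality; it must be >= 0."""
    return t_worst - t_hold - t_skew


# Hypothetical values, all in ps.
t_min = min_clock_period(t_clk_q=120, t_setup=60, t_logic=800, t_skew=50)
margin = race_margin(t_worst=90, t_hold=30, t_skew=50)

print(t_min)            # 1030
print(t_min <= 10_000)  # True: a 10 ns clock period is long enough
print(margin >= 0)      # True: the race condition is avoided
```

A negative margin would mean that the clock skew can exceed the internal race immunity of the flip-flop.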

## 4. DFF ARCHITECTURES



**Figure 5:** NAND-based positive edge-triggered DFF.



**Figure 6:** C2MOS-FF (Master-Slave) DFF.



**Figure 7:** Transmission-Gate-FF (Master-Slave) DFF.



**Figure 8:** Semi-Dynamic-FF (Pulse-Triggered).



**Figure 9:** Hybrid Latch-FF (Pulse-Triggered).



**Figure 10:** True Single-Phase Clock (TSPC) Dynamic FF with asynchronous reset.

Note that using this proposed TSPC topology as-is may lead to errors. In this design, the reset input does not change the memory state; it only forces the output to a given state. The node holding $Q$ between the second and the third inverters is not reset to 0. Therefore, once $R$ is released, the DFF returns to its previous state. A small modification of the DFF can solve this problem; we leave it to you to explore.

### 5.1. SCHEMATIC ENTRY – STATIC DFF

- Create a new schematic in your **VLSI2** library. Name it **DFF**.
- **Draw the transistor-level schematic** of one of the above DFFs with **default transistor sizes**.
- When drawing the schematic, please **DO NOT** forget to connect the body connections of the MOS transistors to the respective VDD or GND nets.
- When you are finished drawing the schematic, **create the pins** then **create the symbol** for the cell.
- Then create the test bench for the schematic as shown below and name it **DFF\_TB**. Connect the **D** and **CLK** inputs to pulse generators (vpulse) and set their parameters according to Table 1. Create a VDC source and connect it to the VDD port of the symbol (define its DC voltage as a variable called VDD). Connect the output **Q<sub>+</sub>** (and also **Q** if you have such an output) directly to a capacitive load of 20 fF. You are not asked to use any additional buffers at this stage.

Write the name of the static DFF you are designing and continue with the next step.



Figure 11: Schematic of the test bench for DFF with asynchronous reset. (DFF is based on C2MOS-FF in Figure 6)

Table 1: The values of the parameters for the simulation of DFF

| Parameter       | CK              | Din                    | R                  |
|-----------------|-----------------|------------------------|--------------------|
| DC Voltage (V)  | 0               | 0                      | 0                  |
| Voltage 1 (V)   | 0.0             | 0.0                    | 0.0                |
| Voltage 2 (V)   | VDD             | VDD                    | VDD                |
| Delay Time (s)  | d1              | 0                      | T/2                |
| Rise Time (s)   | tr              | tr                     | tr                 |
| Fall Time (s)   | tf              | tf                     | tf                 |
| Pulse Width (s) | T/2 - (tr+tf)/2 | 2*T - (tr + tf)/2 - d2 | 18*T - (tr + tf)/2 |
| Period (s)      | T               | 4*T                    | 20*T               |
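
The expressions in Table 1 can be evaluated numerically before entering them in the simulator. The sketch below does so in Python, using the variable values given in Section 5.2; the function name `pulse_sources` and the dictionary keys are hypothetical and are not vpulse parameter names.

```python
def pulse_sources(T, tr, tf, d1, d2, vdd):
    """Evaluate the Table 1 expressions; all times in seconds."""
    edge = (tr + tf) / 2  # edge-time correction shared by all pulse widths
    common = {"v1": 0.0, "v2": vdd, "rise": tr, "fall": tf}
    return {
        "CK":  {**common, "delay": d1,    "width": T / 2 - edge,       "period": T},
        "Din": {**common, "delay": 0.0,   "width": 2 * T - edge - d2,  "period": 4 * T},
        "R":   {**common, "delay": T / 2, "width": 18 * T - edge,      "period": 20 * T},
    }


# Variable values from Section 5.2.
sources = pulse_sources(T=10e-9, tr=100e-12, tf=100e-12,
                        d1=400e-12, d2=0.0, vdd=1.0)

for name, params in sources.items():
    print(f"{name}: width = {params['width'] * 1e9:.2f} ns, "
          f"period = {params['period'] * 1e9:.1f} ns")
```

With these values the pulse widths come out as 4.9 ns for CK, 19.9 ns for Din and 179.9 ns for R.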

### 5.2. SIMULATING WITH DEFAULT TRANSISTOR SIZES

- Set the global variables $VDD = 1.0\,V$, $T = 10\,ns$, $d1 = 400\,ps$, $d2 = 0$ and $tr = tf = 100\,ps$.
- Create a **transient simulation** setup and run it for $250\,ns$. **Plot the signals D, CLK, R and Q.**

The output waveform should look like Figure 12.



Figure 12: Output waveform of DFF\_TB.

- To measure **setup time**: apply logic 1 (0) to the D input and decrease the delay of the clock relative to the data signal (decrease d1 in your variable list) until the Q output is no longer valid ($Q_+$ results in “0” (1) instead of “1” (0)). You can use a parametric analysis to sweep d1 from 0 to $T_{smax}$ (200 ps for example) and monitor the output of the circuit. This measurement is shown in Figure 13 (a).
- To measure **hold time**: apply logic 1 (0) to the D input and decrease the pulse width of the D input (increase d2 in your variable list) until the Q output is no longer valid.
- To measure **CLK-Q delay**: set d1 = 400 ps and d2 = 0. Place **marker A** at 0.5 V on the first rising edge of **CLK** and **marker B** at 0.5 V on the rising edge of **Q** that follows it, on a plot similar to Figure 13 (b), and measure the **CLK-Q** delay between the two markers.



Figure 13: Measurement of (a) setup time and (b) CLK-Q delay.
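
The marker-based CLK-Q measurement can also be reproduced on exported waveform samples. The following is a minimal sketch that assumes the waveforms are available as plain lists of times and voltages; the function names and the sample data are hypothetical.

```python
def rising_crossings(times, values, threshold):
    """Times at which the waveform crosses threshold upward (linear interpolation)."""
    crossings = []
    for i in range(1, len(times)):
        v0, v1 = values[i - 1], values[i]
        if v0 < threshold <= v1:
            frac = (threshold - v0) / (v1 - v0)
            crossings.append(times[i - 1] + frac * (times[i] - times[i - 1]))
    return crossings


def clk_to_q_delay(times, clk, q, threshold=0.5):
    """Delay from the first rising CLK crossing to the next rising Q crossing."""
    clk_edge = rising_crossings(times, clk, threshold)[0]
    q_edges = [t for t in rising_crossings(times, q, threshold) if t >= clk_edge]
    return q_edges[0] - clk_edge


# Hypothetical samples: times in ps, voltages in V (VDD = 1.0 V, threshold at 0.5 V).
t_ps = [0, 100, 200, 300, 400]
clk_v = [0.0, 0.0, 1.0, 1.0, 1.0]  # crosses 0.5 V at 150 ps
q_v = [0.0, 0.0, 0.0, 1.0, 1.0]    # crosses 0.5 V at 250 ps
print(clk_to_q_delay(t_ps, clk_v, q_v))  # 100.0
```

For a falling Q transition the same idea applies with the comparison reversed.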

|                                 |  |
|---------------------------------|--|
| Write the measured CLK-Q delay. |  |
| Write the measured Setup Time.  |  |
| Write the measured Hold Time.   |  |

**Checkpoint - 1** Do not close your simulation window and please call an assistant and show him/her that you have reached this point before working on further steps.

Visa

Table 2: Simulated parameters of a DFF

| Parameter  | Delay |
|------------|-------|
| CLK-Q      | ps    |
| Setup-Time | ps    |
| Hold-Time  | ps    |

- Add the transient current waveform to the output list. To do that, choose “it” in the calculator, then click on the **MINUS** port of the **VDD source** on the test bench schematic. After this step, define a new output for the average current consumption by using “average” from the special functions. When you rerun the simulation, you will see the current consumption plot and its average in the output list.

|                                                                                                                                                                   |       |    |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|----|
| Write the calculated average current consumption (Iavg)                                                                                                           | I =   | nA |
| Calculate the total power consumption and write it down                                                                                                           | P =   | nW |
| Calculate the PDP (Power Delay Product) according to the maximum calculated delay                                                                                 | PDP = | fJ |
| Before calling your assistant make sure your calculated results are consistent and logical.<br><b>Checkpoint – 2</b> Now call an assistant and show your results  | Visa  |    |
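
The power and PDP entries follow directly from the average current. A minimal sketch is given below; the function name and the numbers are hypothetical. It assumes that the power is $P = VDD \cdot I_{avg}$ and that the PDP uses the largest of the delays passed in.

```python
def power_and_pdp(i_avg, vdd, delays):
    """Return (power in W, PDP in J) from the average supply current.

    abs() is used because the sign of the measured current depends on
    which terminal of the source is probed.
    """
    power = vdd * abs(i_avg)
    pdp = power * max(delays)
    return power, pdp


# Hypothetical values: 2 uA average current, 1.0 V supply, delays in seconds.
power, pdp = power_and_pdp(i_avg=-2e-6, vdd=1.0, delays=[100e-12, 80e-12])
print(f"P = {power * 1e6:.2f} uW, PDP = {pdp * 1e15:.2f} fJ")
```

For these example numbers the power is 2 µW and the PDP is 0.2 fJ.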

### 5.3. OPTIMIZING AREA DELAY PRODUCT

- Copy the schematic and test bench you created and name the copies **DFF2** and **DFF2\_TB** respectively. Now, keeping the constant load of 20 fF at each output, find the transistor widths (and also lengths, if you do not use minimum-length transistors) that minimize the figure of merit (FOM), which in this application is the area-delay product:

$$FOM = \left(\sum_{i=1}^{n} W_i \cdot L_i\right) \cdot T_{CLK-Q}$$

In the above expression, $W_i$ and $L_i$ are the width and length of transistor $i$, $T_{CLK-Q}$ is the CLK-Q delay, and $n$ is the number of transistors used in your design.
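
The FOM can be tabulated for each sizing candidate with a few lines of code. The sketch below is only an illustration: the function name `area_delay_fom` and all sizes and delays are hypothetical and not tied to any particular technology.

```python
def area_delay_fom(transistors, t_clk_q):
    """Area-delay product: sum of W*L over all transistors, times the CLK-Q delay."""
    area = sum(w * l for w, l in transistors)
    return area * t_clk_q


# Hypothetical design with four transistors, (W, L) in nm and delay in ps.
initial = [(200, 50), (200, 50), (200, 50), (200, 50)]
resized = [(300, 50), (300, 50), (200, 50), (200, 50)]

print(area_delay_fom(initial, t_clk_q=100))  # 4000000 (nm^2 * ps)
print(area_delay_fom(resized, t_clk_q=70))   # 3500000 (nm^2 * ps)
```

In this made-up example the larger transistors increase the area by 25%, but the delay reduction more than compensates, so the FOM improves.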

- Make sure your design is still functionally correct. Write the transistor sizes in the following table.

Table 3: Optimized transistor sizes for DFF

| PMOS transistor | Wp/Lp | NMOS transistor | Wn/Ln |
|-----------------|-------|-----------------|-------|
| PM0             |       | NM0             |       |
| PM1             |       | NM1             |       |
| PM2             |       | NM2             |       |
| PM3             |       | NM3             |       |
| PM4             |       | NM4             |       |
| PM5             |       | NM5             |       |
| PM6             |       | NM6             |       |
| PM7             |       | NM7             |       |
|                 |       |                 |       |
|                 |       |                 |       |
|                 |       |                 |       |
|                 |       |                 |       |
|                 |       |                 |       |

- Write the initial and final value (after optimization) for the introduced FOM.

|                                                                                |      |  |
|--------------------------------------------------------------------------------|------|--|
| <b>Initial FOM (before optimization)</b>                                       |      |  |
| <b>Optimized FOM</b>                                                           |      |  |
| Before calling your assistant make sure the results are consistent and logical | Visa |  |
| <b>Checkpoint – 3</b> Now call an assistant and show your results              |      |  |

## 6. LAYOUT DESIGN FOR THE 1 BIT DFF CELL

In this part, you will be doing the layout of your chosen **DFF**. Remember your EDATP Design Labs while drawing your layouts.

- Draw a compact layout and consider that later you will be connecting all blocks together. Try to keep the same metal type for horizontal and vertical lines as well as for inputs and outputs. Also, keep the VDD and GND power line widths always the same.
- Run DRC and LVS.
- Run PEX and generate an extracted netlist view (set the extraction type to RC+C, i.e. RC and coupling capacitances).
- Now, rerun the simulation for propagation delays and fill the table below with the new delay values.

|                                                                                                                                                   |      |  |
|---------------------------------------------------------------------------------------------------------------------------------------------------|------|--|
| <b>Checkpoint - 4</b> Once you have a DRC and LVS clean compact layout, call an assistant and show your completed layout and DRC and LVS results. | Visa |  |

**Table 4:** Post-Layout simulated parameters of a DFF

| Parameter  | Delay |
|------------|-------|
| CLK-Q      | ps    |
| Setup-time | ps    |
| Hold-time  | ps    |

|                                                                                                                                        |      |
|----------------------------------------------------------------------------------------------------------------------------------------|------|
| <b>Compare the new delay results after parasitic extraction and the previous delay results calculated by the schematic simulation.</b> |      |
| <b>Comment about how the results could be improved.</b>                                                                                |      |
| <b>Checkpoint - 5</b> Now call an assistant and show your post-PEX simulation file and the new delay results.                          | Visa |
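
For the comparison between schematic and post-layout results, the relative change of each parameter can be computed as in the sketch below. The function name and all numbers are hypothetical placeholders, not expected results.

```python
def relative_change(schematic, post_layout):
    """Percentage change of each parameter from schematic to post-layout.

    Assumes every schematic value is non-zero.
    """
    return {
        name: 100.0 * (post_layout[name] - schematic[name]) / schematic[name]
        for name in schematic
    }


# Hypothetical delays in ps.
schematic_ps = {"CLK-Q": 100.0, "Setup-time": 50.0, "Hold-time": 20.0}
post_layout_ps = {"CLK-Q": 125.0, "Setup-time": 55.0, "Hold-time": 20.0}

for name, change in relative_change(schematic_ps, post_layout_ps).items():
    print(f"{name}: {change:+.1f} %")
```

A positive value means that the extracted parasitics slowed the corresponding parameter down.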