Investigation of Determinant Factors of Minimum Operating Voltage of Logic Gates in 65-nm CMOS

Tadashi Yasufuku¹, Satoshi Iida¹, Hiroshi Fuketa¹, Koji Hirairi², Masahiro Nomura², Makoto Takamiya¹, and Takayasu Sakurai¹
¹ Institute of Industrial Science, University of Tokyo, Japan
² Semiconductor Technology Academic Research Center (STARC), Japan
tdsh@iis.u-tokyo.ac.jp

Abstract—Determinant factors of the minimum operating voltage (V_{DDmin}) of CMOS logic gates are investigated by measurements of logic-gate chains in 65nm CMOS. V_{DDmin} consists of a systematic component (V_{DDmin(SYS)}) and a random variation component (V_{DDmin(RAND)}). V_{DDmin(SYS)} is minimized when the logic threshold voltage of the logic gates equals half the supply voltage (V_{DD}). The logic threshold voltage of each logic gate is tuned by sizing the gate widths of the nMOS and pMOS transistors. V_{DDmin(RAND)} is minimized by reducing the random threshold voltage variation, which is achieved by increasing the gate width or by forward body biasing. In addition, the temperature dependence of V_{DDmin} is measured for the first time. The temperature used for the worst-corner analysis of V_{DDmin} should be chosen according to the gate count of the logic circuit.

I. INTRODUCTION

Reducing the power supply voltage (V_{DD}) is an effective way to achieve ultra-low-power logic circuits, since active power is proportional to V_{DD}^2 and leakage power is proportional to V_{DD}. Many works have therefore been carried out on logic circuits operating at low V_{DD} [1-2]. V_{DD} scaling is, however, limited by the minimum operating voltage (V_{DDmin}) [3] of CMOS logic gates. V_{DDmin} is the minimum power supply voltage at which the circuits operate without functional errors. V_{DDmin} increases with the number of logic gates and with CMOS technology down-scaling. Thus, reducing V_{DDmin} of logic circuits is important for achieving ultra-low-voltage (V_{DD} < 0.4V) logic circuits. Previously, there were no design guidelines for reducing V_{DDmin}, since its determinant factors had not been clarified. In this paper, the determinant factors of V_{DDmin} in logic circuits are investigated, and design criteria to reduce V_{DDmin} are presented. In addition, the temperature dependence of V_{DDmin} is measured, revealing for the first time that the worst-case temperature condition for V_{DDmin} depends on the gate count of the logic circuit.

Section II discusses the design of the test chip and shows the circuit schematics of the inverter and 2-input NAND chains. Section III presents the experimental results for V_{DDmin}: its determinant factors, gate sizing and body-biasing to reduce it, and its temperature dependence. Section IV concludes this paper.

II. TEST CHIP DESIGN

Figs. 1(a) and (b) show schematic diagrams of the inverter and 2-input NAND (2NAND) chains whose V_{DDmin} is measured. The inverter chain has 10001 (10k) stages of inverters and has monitoring ports branching out from the 11th stage, the 101st stage, and so on. The NAND chain has 100001 (100k) stages of NANDs with similar ports. Fig. 1(c) shows a detailed schematic diagram of the inverters used in the inverter chain of Fig. 1(a). The body-bias voltages of both the nMOS transistors (V_{bs(nMOS)}) and the pMOS transistors (V_{bs(pMOS)}) can be controlled. Note that when V_{bs(nMOS)} is positive the nMOS transistors are forward biased, whereas when V_{bs(pMOS)} is positive the pMOS transistors are reverse biased. The gate lengths of all transistors are fixed to the minimum of the process, and $W_N$ and $W_P$ are the gate widths of the nMOS and pMOS transistors, respectively. Fig. 2 shows the layout and chip micrograph of the 2NAND chain test chip. Both the inverter chain and the 2NAND chain circuits are fabricated in a 65nm CMOS process and occupy 0.4mm x 0.6mm and 1mm x 0.8mm, respectively.

Fig.2 Test chip of 2NAND chain fabricated in 65-nm CMOS. (a) Layout. (b) Chip micrograph.

III. EXPERIMENTAL RESULTS

A. Determinant factors of $V_{DD_{min}}$

Fig. 3 shows the measured and simulated dependence of $V_{DD_{min}}$ on the number of stages in the inverter and 2NAND chains. The measured curves show the average $V_{DD_{min}}$ over 17 dies for the 2NAND chain and 20 dies for the inverter chain. The Monte Carlo SPICE simulation includes within-die random threshold voltage ($V_{TH}$) variations and uses 100 trials. Fig. 3 indicates that $V_{DD_{min}}$ increases as the number of stages increases. Note that the simulations are executed only up to 1k stages because of the simulation time constraint. Since the simulated results agree with the measurements in Fig. 3, various simulations are conducted to clarify the determinant factors of $V_{DD_{min}}$ in the rest of this paper.

A closed-form expression for $V_{DD_{min}}$ described in [4] is

$$V_{DD_{min}} = \frac{\sigma_{V_{TH}}}{a} \left[ \ln \left( \frac{N}{b} \right) + c \right], \tag{1}$$

$$\sigma_{V_{TH}} = \sqrt{\sigma_p^2 + \sigma_n^2}, \tag{2}$$

where $N$ is the number of stages, $\sigma_p$ ($\sigma_n$) is the standard deviation of the within-die $V_{TH}$ variation of pMOS (nMOS), $a$ is a constant determined by the DIBL coefficient, $b$ is a constant determined by the yield, and $c$ denotes the balance of the strength of nMOS and pMOS. The standard deviation ($\sigma$) of the $V_{TH}$ variation can be expressed as noted in [5],

$$\sigma = \frac{A_{VT}}{\sqrt{LW}}, \tag{3}$$

where $A_{VT}$ is the Pelgrom coefficient, $L$ is the gate length, and $W$ is the gate width.
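As a minimal sketch, Eqs. (1) to (3) can be evaluated numerically as follows. The function names and all numerical values ($A_{VT}$, $L$, $W$, $a$, $b$, $c$) are hypothetical and chosen only for illustration; they are not taken from the measurements or from [4]. Eq. (1) is coded exactly as it is written above.

```python
import math


def pelgrom_sigma(a_vt, length, width):
    """Eq. (3): sigma = A_VT / sqrt(L * W)."""
    return a_vt / math.sqrt(length * width)


def combined_sigma(sigma_p, sigma_n):
    """Eq. (2): sigma_VTH = sqrt(sigma_p^2 + sigma_n^2)."""
    return math.hypot(sigma_p, sigma_n)


def vddmin_eq1(sigma_vth, n_stages, a, b, c):
    """Eq. (1) as written above: (sigma_VTH / a) * (ln(N / b) + c)."""
    return sigma_vth / a * (math.log(n_stages / b) + c)


# Hypothetical device parameters, for illustration only.
A_VT = 3.0e-9   # Pelgrom coefficient in V*m (3 mV*um), assumed
L = 60e-9       # gate length in m, assumed
W_N = 120e-9    # nMOS gate width in m, assumed
W_P = 120e-9    # pMOS gate width in m, assumed

sigma_n = pelgrom_sigma(A_VT, L, W_N)
sigma_p = pelgrom_sigma(A_VT, L, W_P)
sigma_vth = combined_sigma(sigma_p, sigma_n)

# a, b, and c are placeholders; the text does not give their values.
for n in (11, 101, 1001):
    v = vddmin_eq1(sigma_vth, n, a=1.0, b=1.0, c=1.0)
    print(f"N = {n:5d}: V_DDmin = {1e3 * v:.1f} mV (hypothetical constants)")
```

With these definitions, quadrupling the gate width halves $\sigma$, and the evaluated $V_{DD_{min}}$ grows with $\ln N$.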

Using Eq. (1), the determinant factors of $V_{DD_{min}}$ are investigated in this paper. Fig. 4 shows the simulated dependence of $V_{DD_{min}}$ on the number of stages in the inverter chain with and without random $V_{TH}$ variation. $V_{DD_{min}}$ consists of the following two components: the systematic component ($V_{DD_{min}(SYS)}$) and the random variation component ($V_{DD_{min}(RAND)}$). $V_{DD_{min}(SYS)}$ is obtained by simulations without within-die $V_{TH}$ variation and is represented by the lower line, while $V_{DD_{min}(RAND)}$ is defined as the difference between the two lines in Fig. 4. While $V_{DD_{min}(RAND)}$ depends on the number of stages, $V_{DD_{min}(SYS)}$ does not. Therefore, $V_{DD_{min}(RAND)}$ corresponds to the first term in Eq. (1) and $V_{DD_{min}(SYS)}$ corresponds to the second term ($=c$) in Eq. (1). $V_{DD_{min}(RAND)}$ depends on the random $V_{TH}$ variation and the number of stages of logic gates, while $V_{DD_{min}(SYS)}$ is determined by the balance of nMOS and pMOS.
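Because Eq. (1) as written is linear in $\ln N$, the stage-dependent part can be separated from the stage-independent part by a straight-line fit of $V_{DD_{min}}$ against $\ln N$. The sketch below illustrates this with a plain least-squares fit. The data points are synthetic (generated from an assumed slope and intercept), not measured or simulated values, and the intercept lumps together all stage-independent terms, so it should not be read as $V_{DD_{min}(SYS)}$ itself.

```python
import math


def fit_vddmin_vs_ln_n(stages, vddmin):
    """Least-squares fit of vddmin = slope * ln(N) + intercept."""
    xs = [math.log(n) for n in stages]
    n_pts = len(xs)
    mean_x = sum(xs) / n_pts
    mean_y = sum(vddmin) / n_pts
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, vddmin))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic data for illustration only: an assumed slope (mV per unit ln N)
# and an assumed stage-independent offset (mV).
stages = [11, 101, 1001, 10001]
ASSUMED_SLOPE_MV = 8.0
ASSUMED_OFFSET_MV = 60.0
vddmin_mv = [ASSUMED_SLOPE_MV * math.log(n) + ASSUMED_OFFSET_MV for n in stages]

slope, intercept = fit_vddmin_vs_ln_n(stages, vddmin_mv)
print(f"slope = {slope:.2f} mV per unit ln(N), intercept = {intercept:.2f} mV")
```

In Eq. (1) the fitted slope plays the role of $\sigma_{V_{TH}}/a$, so a steeper line indicates a larger random component.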

B. Gate sizing to reduce $V_{DD_{min}}$

In order to examine the dependence of $V_{DD_{min}}$ on the balance of the drive strength of nMOS and pMOS, Monte Carlo simulations of the 101-stage inverter chain are conducted with various values of the pMOS $V_{TH}$ to change the balance. Fig. 5 shows the simulated dependence of $V_{DD_{min}}$ of the inverter chain on the $V_{TH}$ shift of pMOS ($\Delta V_{TP}$) with and without random $V_{TH}$ variation. In this simulation, $\Delta V_{TP}$ is shifted by changing the parameter in SPICE. $V_{DD_{min}(SYS)}$ depends on $\Delta V_{TP}$ and is minimum at $\Delta V_{TP} = 30$ mV, where nMOS and pMOS are balanced. On the other hand, $V_{DD_{min}(RAND)}$ is constant, because $\sigma$ of the $V_{TH}$ variation does not depend on $\Delta V_{TP}$. This result indicates that $V_{DD_{min}(SYS)}$ is determined by the balance of nMOS and pMOS, as predicted by Eq. (1).

In [6], \( V_{DDmin} \) of subthreshold circuits is minimized by tuning the logic threshold voltage to half \( V_{DD} \). To confirm the effect of this tuning, the dependence of the normalized logic threshold voltage on \( \Delta V_{TP} \) is simulated at different \( V_{DD} \) without random \( V_{TH} \) variation, as shown in Fig. 6. The definition of the logic threshold voltage is shown in the inset of Fig. 6. The logic threshold voltage is normalized by each \( V_{DD} \). The logic threshold voltage is equal to half \( V_{DD} \) at each \( V_{DD} \) when \( \Delta V_{TP} \) is 30 mV, which is where the minimum \( V_{DDmin} \) is achieved in Fig. 5. This result indicates that \( V_{DDmin} \) is minimized when the logic threshold voltage is equal to half \( V_{DD} \).
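The balance condition can be illustrated with a simplified textbook model of a subthreshold inverter, in which both transistors have an exponential current characteristic with the same slope factor and DIBL is neglected. This is not the SPICE model used in this paper, and the threshold voltages, current ratio, and slope factor below are hypothetical. In this model the \( V_{TH} \) shift of pMOS that brings the logic threshold voltage to half \( V_{DD} \) does not depend on \( V_{DD} \).

```python
import math

K_B_OVER_Q = 8.617333262e-5  # Boltzmann constant over electron charge, V/K


def thermal_voltage(temp_c):
    """Thermal voltage kT/q in volts for a temperature in degrees Celsius."""
    return K_B_OVER_Q * (temp_c + 273.15)


def logic_threshold(vdd, vtn, vtp_abs, i0_ratio, slope_factor=1.5, temp_c=25.0):
    """Input voltage at which the nMOS and pMOS subthreshold currents are equal.

    Simplified model (assumption): I_n ~ I0n * exp((Vin - vtn) / (m * vT)) and
    I_p ~ I0p * exp((vdd - Vin - vtp_abs) / (m * vT)), with i0_ratio = I0p / I0n.
    """
    half_swing = slope_factor * thermal_voltage(temp_c) / 2
    return vdd / 2 + (vtn - vtp_abs) / 2 + half_swing * math.log(i0_ratio)


def balancing_vtp_shift(vtn, vtp_abs, i0_ratio, slope_factor=1.5, temp_c=25.0):
    """Increase of |V_TP| that puts the logic threshold at vdd / 2 in this model."""
    return (vtn - vtp_abs) + slope_factor * thermal_voltage(temp_c) * math.log(i0_ratio)


# Hypothetical device values, for illustration only.
VTN, VTP_ABS, I0_RATIO = 0.40, 0.38, 1.5

shift = balancing_vtp_shift(VTN, VTP_ABS, I0_RATIO)
print(f"balancing shift of |V_TP| = {1e3 * shift:.1f} mV (hypothetical devices)")
for vdd in (0.2, 0.3, 0.4):
    vm = logic_threshold(vdd, VTN, VTP_ABS + shift, I0_RATIO)
    print(f"VDD = {vdd:.1f} V: normalized logic threshold = {vm / vdd:.3f}")
```

After the balancing shift is applied, the normalized logic threshold in this model is 0.5 at every supply voltage, which mirrors the observation that a single \( \Delta V_{TP} \) balances the inverter at each \( V_{DD} \) in Fig. 6.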

Fig. 7(a) shows the simulated \( V_{DDmin} \) of 7 types of 101-stage chains with different logic gates. \( W_P/W_N \) of all logic gates is constant and is defined as \( \alpha \) in Fig. 7(a). All logic gates operate as inverters, i.e. one of the inputs of each logic gate is connected to the output of the previous stage, and the other inputs are tied to \( V_{DD} \) in NAND gates and to \( V_{SS} \) in NOR gates. As shown in Fig. 7(a), \( V_{DDmin} \) of each logic gate is divided into \( V_{DDmin}(SYS) \) and \( V_{DDmin}(RAND) \). \( V_{DDmin}(SYS) \) increases as the number of inputs of the logic gates increases (e.g. 4NAND and 4NOR), because stacking and paralleling transistors worsen the balance of the strength of nMOS and pMOS, and hence the logic threshold voltages deviate from half \( V_{DD} \). In contrast, \( V_{DDmin}(RAND) \) decreases as the number of inputs of the logic gates increases, because stacking and paralleling transistors decrease the transistor variations.

Fig.7 Simulated \( V_{DDmin} \) of 7 types of 101-stage chains (inverter, 2NAND, 3NAND, 4NAND, 2NOR, 3NOR, 4NOR). (a) \( W_P/W_N = \alpha \) (constant). (b) Systematic component with optimized \( W_P/W_N \).

Logic threshold voltages, however, can be tuned to half \( V_{DD} \) by changing the gate size (\( W_P/W_N \)) of each logic gate at the design stage. Fig. 8 shows the simulated dependence of the normalized logic threshold voltage on the normalized \( W_P/W_N \) for 7 types of logic gates. For example, in the 4NAND gate, the logic threshold voltage is half \( V_{DD} \) when \( W_P/W_N = 0.11 \alpha \), which means that \( W_N \) of the 4NAND must be 8.8 \( W_P \) to minimize \( V_{DDmin}(SYS) \). When such a large or small \( W_P/W_N \) is not acceptable because of the area constraint, logic gates with many inputs (e.g. 4NAND and 4NOR) should not be used in the design of subthreshold logic circuits. Fig. 7(b) shows the simulated \( V_{DDmin}(SYS) \) of 7 types of 101-stage chains with different logic gates when \( W_P/W_N \) of each logic gate is optimized so that the logic threshold voltage is half \( V_{DD} \), as shown in Fig. 8. For example, in the 4NAND gate, \( V_{DDmin}(SYS) \) can be reduced from 162 mV to 102 mV by optimizing \( W_P/W_N \). The inverter gain is degraded in stacked transistors. Note that \( V_{DDmin}(SYS) \) is the same for logic gates with the same number of inputs (e.g. 2NAND and 2NOR), because the number of stacked transistors is the same.

While $V_{DDmin}(SYS)$ is minimized by optimizing $W_P/W_N$, $V_{DDmin}(RAND)$ is reduced by increasing $W_P$ and $W_N$. Fig. 9 shows the simulated dependence of $V_{DDmin}$ on the number of stages in the inverter chain for 3 gate widths. The gate widths of nMOS and pMOS in the “x2 inverter” are two times larger than those in the “x1 inverter.” $W_P/W_N$ is $\alpha$ in all the inverters, so the gate width does not affect the balance of the strength of nMOS and pMOS. Therefore, $V_{DDmin}(SYS)$ is identical for the three types of inverter chains, as illustrated in Fig. 9. On the other hand, as the gate width increases, $V_{DDmin}(RAND)$ decreases because $\sigma$ of the within-die $V_{TH}$ variation is reduced (Eq. (3)), which lowers the $\sigma_{V_{TH}}$ factor in Eq. (1). Consequently, $V_{DDmin}$ of the “x4 inverter” chain is the lowest.
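The expected size of this effect follows directly from Eq. (3). The short sketch below assumes that the gate length is fixed, that $W_P$ and $W_N$ are scaled by the same factor, and that $V_{DDmin}(RAND)$ is proportional to $\sigma_{V_{TH}}$ as in the first term of Eq. (1). The function name is hypothetical, and the result is a scaling estimate, not the simulated data of Fig. 9.

```python
import math


def relative_rand_component(width_multiple):
    """V_DDmin(RAND) relative to the x1 inverter.

    Assumes sigma is proportional to 1 / sqrt(W) (Eq. (3)) and that the
    random component is proportional to sigma (first term of Eq. (1)).
    """
    return 1.0 / math.sqrt(width_multiple)


for multiple in (1, 2, 4):
    ratio = relative_rand_component(multiple)
    print(f"x{multiple} inverter: relative V_DDmin(RAND) = {ratio:.3f}")
```

Under these assumptions the random component of the x2 inverter is about 0.707 times that of the x1 inverter, and that of the x4 inverter is 0.5 times, while the systematic component is unchanged.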

C. Body-biasing to reduce $V_{DDmin}$

Fig. 10 shows the measured dependence of $V_{DDmin}$ on the number of stages in 2NAND chains from 2 wafers at different process corners. 17 dies in wafer 1 and 42 dies in wafer 2 are measured. The upper line is the same as the line shown in Fig. 3, and the lower line is for the other wafer. The two lines differ because the balance of nMOS and pMOS differs between the two wafers due to die-to-die $V_{TH}$ variation. Body-biasing is effective in compensating for the die-to-die $V_{TH}$ variation and thereby reducing the resulting increase in $V_{DDmin}(SYS)$. In this section, the effect of body-biasing is investigated.

Fig. 11 shows measured dependence of $V_{DDmin}$ in 100001-stage 2NAND chain on body bias of nMOS and pMOS. Fig. 12 shows simulated $V_{DDmin}(SYS)$ with and without the compensation using pMOS body-biasing in 101-stage inverter chain.
-300mV. This indicates that balancing the nMOS and pMOS using body-biasing is effective to reduce $V_{DD\min(SYS)}$.

Next, the feasibility of post-fabrication compensation at unbalanced process corners is investigated with the 101-stage inverter chain. Fig. 12 shows the simulated $V_{DD\min(SYS)}$ with and without compensation using pMOS body-biasing. The unbalanced process corner conditions are SF and FS. SF means slow nMOS (= high $V_{TH}$) and fast pMOS, whereas FS means fast nMOS (= low $V_{TH}$) and slow pMOS. TT means typical nMOS and pMOS and is included as a reference. The compensation using pMOS body-biasing reduces $V_{DD\min(SYS)}$ by 88mV (from 154 mV to 66 mV) and by 40mV (from 108 mV to 68 mV) in the SF and FS conditions, respectively. The compensated $V_{DD\min(SYS)}$ values of the three process corners are almost the same, since the strengths of pMOS and nMOS are well balanced by body-biasing. Thus, the increase in $V_{DD\min(SYS)}$ due to the die-to-die $V_{TH}$ variation is compensated by post-fabrication body-biasing.
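The arithmetic behind these figures can be checked with a few lines. The snippet uses only the SF and FS values quoted above; the TT value is not stated in the text and is therefore omitted. The variable names are hypothetical.

```python
# V_DDmin(SYS) in mV before and after pMOS body-bias compensation,
# as quoted in the text for Fig. 12.
reported_mv = {"SF": (154, 66), "FS": (108, 68)}

# Reduction achieved by the compensation at each unbalanced corner.
reductions_mv = {
    corner: before - after for corner, (before, after) in reported_mv.items()
}

# Remaining difference between the two compensated corners.
compensated = [after for _, after in reported_mv.values()]
spread_mv = max(compensated) - min(compensated)

for corner, reduction in reductions_mv.items():
    print(f"{corner}: V_DDmin(SYS) reduced by {reduction} mV")
print(f"Spread between compensated SF and FS corners: {spread_mv} mV")
```

The 2 mV spread between the compensated SF and FS corners is small compared with the 46 mV difference before compensation.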

In addition, body-biasing also affects $V_{DD\min(RAND)}$, because within-die $V_{TH}$ variation is reduced by forward body biasing [7]. Tuning of both $V_{DD\min(SYS)}$ and $V_{DD\min(RAND)}$ by body-biasing is demonstrated in the measurement. Fig. 13 shows the measured dependence of $V_{DD\min}$ on the number of stages in three body-bias conditions for the 2NAND chain. Table I summarizes the body bias conditions. Zero body bias is the initial condition, in which nMOS and pMOS are not balanced. In contrast, the drive strengths of pMOS and nMOS are balanced by optimized forward body biasing or reverse body biasing, as shown in Table I. Fig. 13 indicates that $V_{DD\min}$ at 101 stages is lower with both reverse and forward body bias than with zero body bias, because $V_{DD\min(SYS)}$ is minimized by the optimal body biasing. Note that the gradients of the lines for reverse and forward body bias in Fig. 13 differ. The gradient for reverse body bias is steep, while the gradient for forward body bias is gentle, which indicates that $V_{DD\min(RAND)}$ is reduced by forward body bias because within-die $V_{TH}$ variation is reduced.

For example, compared with the initial zero body bias, measured $V_{DD\min}$ is reduced by 45mV from 193 mV to 148mV by forward body biasing at 100k stages. Thus, the optimal body-biasing minimizes $V_{DD\min(SYS)}$ and the forward body biasing decreases $V_{DD\min(RAND)}$.

D. Temperature dependence of $V_{DD\min}$

The temperature dependence of $V_{DD\min}$ is discussed in this section. This is the first work to report the temperature dependence of $V_{DD\min}$. Fig. 14 shows the measured dependence of $V_{DD\min}$ on temperature for inverter chains of various lengths. The temperature dependence of $V_{DD\min}$ varies with the number of stages of the chain. For the 11-stage chain, $V_{DD\min}$ increases by 10mV as the temperature increases from -40°C to 110°C. This is attributed to an increase in $V_{DD\min(SYS)}$, since $V_{DD\min(SYS)}$ depends on the thermal voltage and therefore rises with temperature [4]. On the other hand, for the 10001-stage chain, $V_{DD\min}$ decreases by 25mV as the temperature increases from -40°C to 110°C. This implies that $V_{DD\min(RAND)}$ decreases as the temperature increases, because $A_{VT}$ decreases with increasing temperature [8]. As a result, the worst (=highest) $V_{DD\min}$ condition of logic circuits with small gate counts is high temperature, while the worst $V_{DD\min}$ condition of logic circuits with large gate counts is low temperature. Therefore, the temperature used for the worst-corner analysis of $V_{DD\min}$ should be chosen according to the gate count of the logic circuit.
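The corner selection described above can be sketched as follows. The two shifts are the measured values quoted in the text for the 11-stage and 10001-stage inverter chains; the function names and the simple sign-based rule are assumptions made for illustration, and intermediate chain lengths would need their own measured or simulated shift. The thermal voltage at both temperature extremes is also computed, since $V_{DD\min(SYS)}$ depends on it.

```python
K_B_OVER_Q = 8.617333262e-5  # Boltzmann constant over electron charge, V/K

T_LOW_C = -40.0   # lowest measured temperature, from the text
T_HIGH_C = 110.0  # highest measured temperature, from the text


def thermal_voltage_mv(temp_c):
    """Thermal voltage kT/q in mV for a temperature in degrees Celsius."""
    return 1e3 * K_B_OVER_Q * (temp_c + 273.15)


def worst_case_temperature(shift_mv, t_low=T_LOW_C, t_high=T_HIGH_C):
    """Temperature giving the highest V_DDmin.

    shift_mv is the change in V_DDmin from t_low to t_high. A positive shift
    means V_DDmin is highest when hot; otherwise the cold corner is returned.
    """
    return t_high if shift_mv > 0 else t_low


# Measured V_DDmin change from -40 C to 110 C quoted in the text (mV).
measured_shift_mv = {11: 10.0, 10001: -25.0}

for n_stages, shift in measured_shift_mv.items():
    corner = worst_case_temperature(shift)
    print(f"{n_stages}-stage chain: worst-case temperature = {corner:.0f} C")

print(f"kT/q at {T_LOW_C:.0f} C  = {thermal_voltage_mv(T_LOW_C):.2f} mV")
print(f"kT/q at {T_HIGH_C:.0f} C = {thermal_voltage_mv(T_HIGH_C):.2f} mV")
```

The thermal voltage rises from about 20 mV to about 33 mV over this range, which is consistent with the systematic component increasing with temperature in short chains.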

IV. CONCLUSION

Determinant factors of $V_{DD\min}$ of CMOS logic gates are investigated, and a design guide to reduce $V_{DD\min}$ is presented, based on measurements and SPICE simulations of logic-gate chains in 65nm CMOS. $V_{DD\min}$ consists of $V_{DD\min(SYS)}$ and $V_{DD\min(RAND)}$. $V_{DD\min(RAND)}$ depends on the random $V_{TH}$ variation and the number of stages of logic gates, while $V_{DD\min(SYS)}$ is determined by the balance of nMOS and pMOS and is minimized when the logic threshold voltage is equal to half $V_{DD}$. Therefore, $V_{DD\min(RAND)}$ is reduced by increasing $W_N$ and $W_P$, while $V_{DD\min(SYS)}$ is minimized by optimizing $W_P/W_N$ at the design stage. Body-biasing is effective in compensating for the increase of $V_{DD\min(SYS)}$ due to the die-to-die $V_{TH}$ variation. The optimal body-biasing minimizes $V_{DD\min(SYS)}$, and forward body biasing decreases $V_{DD\min(RAND)}$. In the measurement of the 100k-stage 2NAND chain, $V_{DD\min}$ is successfully reduced by 45mV, from 193mV to 148mV, by forward body biasing. The temperature dependence of $V_{DD\min}$ is measured for the first time. The worst (=highest) $V_{DD\min}$ condition of logic circuits with small gate counts is high temperature, while the worst $V_{DD\min}$ condition of logic circuits with large gate counts is low temperature. Therefore, the temperature used for the worst-corner analysis of $V_{DD\min}$ should be chosen according to the gate count of the logic circuit.

Fig.13 Measured dependence of $V_{DD\min}$ on number of stages in three body-bias conditions in 2NAND chain.

<table>
<thead>
<tr>
<th>Table I Body bias conditions used in Fig. 13.</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>NMOS Body Bias</strong>, $V_{bs(nMOS)}$</td>
</tr>
<tr>
<td>Zero Body Bias</td>
</tr>
<tr>
<td>Forward Body Bias</td>
</tr>
<tr>
<td>Reverse Body Bias</td>
</tr>
</tbody>
</table>

Fig.14 Measured dependence of $V_{DD\min}$ on temperature in various inverter chains.

ACKNOWLEDGMENTS

This work was carried out as a part of the Extremely Low Power (ELP) project supported by the Ministry of Economy, Trade and Industry (METI) and the New Energy and Industrial Technology Development Organization (NEDO).

REFERENCES