A Phase-scaled Vernier Time-to-Digital Converter Architecture with Switchable Coarse/Fine Resolutions, Wide Range and Ultra Low Power Consumption

by

Tuoxin (Tony) Wang

A thesis submitted to the Faculty of Graduate and Postdoctoral Affairs in partial fulfillment of the requirements for the degree of

Master of Applied Science

in

Electrical and Computer Engineering

Ottawa-Carleton Institute for Electrical and Computer Engineering
Carleton University
Ottawa, Ontario

© 2016, Tuoxin (Tony) Wang
Abstract

A novel phase-scaled Vernier time-to-digital converter (TDC) architecture with switchable coarse/fine (16ps/2ps) time resolution is presented. It simultaneously achieves a large phase (time) detection range (32.7ns in 14 bits), fine time resolution (2ps), compact size and very low power consumption. The phase noise contributed by the TDC is also improved, because the architecture allows a higher reference frequency than other types of TDC architectures. A phase regulator is embedded into the traditional Vernier TDC core circuitry to separate a newly defined mandatory fine-granularity phase length (normal-length phase, 80ps) from the random phase (time) difference (up to 32ns) to be measured. The mandatory phase length is the only part measured with fine resolution (2ps); the rest of the phase length is counted in large intervals (80ps). As a result, the required number of stages of the traditional Vernier TDC core is reduced from 6250 to 40 at a fixed 2ps time resolution. Furthermore, compared to a typical Vernier ring TDC, the proposed architecture, combined with a reverse-triggered pre-logic unit and the coordinated-determination scheme, enables a much simpler and faster determination procedure. This greatly reduces the power consumption and allows the reference frequency to be increased, giving a 2~3dB improvement in phase noise performance.
Acknowledgements

I would like to thank the Microsemi Inc. Kanata site, as well as the people in the PLL department in the United States, for their generous support of my research. Microsemi provided all of the design tools I needed, and the people in the group provided the requirement specifications, valuable suggestions and feedback on my research progress over the past year. In particular, I want to thank Krste Mitric and Kobe Situ for their constant support of my work. Krste Mitric launched this research partnership and has acted as a pilot, driving the project forward. He even spent a large proportion of his vacation time reading and understanding the progress reports and the corresponding literature, providing valuable questions, suggestions and comments that made the research work more comprehensive and clear. He also pointed out flaws in the research and places in the literature that might confuse readers. Kobe helped a great deal with technical support, setting up and maintaining the simulation environments, such as the Linux server, Matlab and Simulink, and Virtuoso. Without Kobe's work I would have spent ten times as long on the working environment. I am also pleased to acknowledge Peter, who helped to recover from a Cadence library crash, saving me from a disaster.

This work would not have been possible without the support of my supervisor, John Rogers. In addition to coordinating our partnership with Microsemi, he provided essential oversight and support for me, inside and outside of the context of my thesis and publications. Furthermore, he encouraged me when I got stuck and shared his experience and ideas with a positive view. His support has been a solid and powerful force driving this research forward.

I am very grateful to every member of my family, who bring deep happiness and a feeling of satisfaction to me.
**Table of Contents**

**Abstract**

**Acknowledgements**

**Table of Contents**

**List of Tables**

**List of Figures**

**Chapter 1: Introduction**

1.1 All Digital Phase Locked Loop

1.2 Time-to-Digital Converter

1.3 TDC Key Performance Specifications Summary

1.4 TDC detection range expression

1.5 Thesis Outline

**Chapter 2: Background**

2.1 Inverter Based TDC (Traditional)

2.2 Ring Oscillator TDC

2.3 Vernier TDC

2.4 Vernier Ring TDC

2.5 Phase-scaled Vernier TDC

**Chapter 3: System Level Design and Implementations**

3.1 Architecture Illustration of Phase-Scaled Vernier TDC

3.2 System Level Design Considerations

3.3 System Specification Assumption

3.4 Phase Regulator & TDC Core

3.5 Coarse/Fine Configuration & Optimization

3.6 Evaluator

**Chapter 4: TDC Timing Considerations for Higher Reference Frequency**

4.1 General Operating-Time Procedure

4.2 Operating-Period Performance Comparison

**Chapter 5: Circuit Implementation and Simulation**

5.1 Technology Process and Basic model parameters

5.2 Vernier TDC Core Circuit

5.2.1 Slow path inverter delay chain

5.2.2 D Flip Flops

5.3 Differential RRD and Trigger Clock

5.4 Encoding Circuit

5.5 Phase Regulator and Simulation

5.6 TDC Core Circuit Simulation Results and Optimization

5.7 Evaluator Implementation and Simulation

5.7.1 Coordinated-Determination Device

5.7.2 Even-rotation Compensation and optimization

5.8 The Whole TDC Architecture and Simulation

**Chapter 6: ADPLL System Calculation and Simulation**

6.1 ADPLL System Specifications

6.2 Phase Noise Contribution and Distribution

6.3 ADPLL System Modeling and Simulation in Simulink

6.4 Comparison between different Loop types and parameters

6.5 Preferred Solution of ADPLL System Operation

**Chapter 7: Conclusion and Future Work**

7.1 Completed Work

List of Tables

Table 1: Feature of different types of TDCs
Table 2: Specifications of the TDC
Table 3: Performance Analysis at a Fixed 25MHz Reference Frequency (40ns period)
Table 4: Improvement of the Phase Noise Performance with Different Accepted Reference Frequency (1GHz of DCO output)
Table 5: Comparison of Several TDC Architectures
Table 6: D Flip Flop initial parameters configuration
Table 7: Sections, Outputs and Coding Result Index
Table 8: Examples of Final output calculation
Table 9: Truth table of updated even rotation word for alignment
Table 10: Type I and type II loop feature contrast
Table 11: Type I and type II loop feature contrast
## List of Figures

Fig. 1. ADPLL system level operation diagram.

Fig. 2. Illustration of integer and fractional counting [2].

Fig. 3. Noise Contributions of TDC and DCO in an ADPLL.

Fig. 4. Inverter Based TDC (Traditional).

Fig. 5. Operating Principle of a Traditional TDC.

Fig. 6. Ring Oscillator TDC Structure.

Fig. 7. Vernier TDC Structure.

Fig. 8. Vernier Ring TDC Structure.

Fig. 9. Simplest Illustration of the Phase-scaled Vernier TDC Structure.

Fig. 10. TDC module diagram.

Fig. 11. Phase regulator illustration in time domain.

Fig. 12. Block diagram of Phase-scaled Vernier TDC (Phase regulator and TDC core).

Fig. 13. A modified version of the phase-scaled Vernier TDC.

Fig. 14. Degree of activities of the VRTDC and Phase-scaled Vernier TDC.

Fig. 15. (a) Logical modification for rising-edge only phase counter, (b) Truth table of (a) circuitry.

Fig. 16. (a) Evaluator and Coordinated-Determination Device, (b) and their operations in time domain.

Fig. 17. Waveform illustration of reverse-triggered pre-logic unit.

Fig. 18. (a) Reverse-triggered pre-logic unit block, (b) Arbiter unit block, (c) D Flip-Flop block, (d) D Flip-Flop transistor level block.

Fig. 19. The Proposed ADPLL system operation diagram.

Fig. 20 ADPLL timing and computation example (FCW=3.25)

Fig. 21 Operating-period Performance Comparison Between (a) VRTDC and (b) Phase-Scaled Vernier TDC

Fig. 22. Propagation delay of inverters with various $W_P/W_N$ ratios and $C_{load}$

Fig. 23 Static Inverter’s DC Transfer simulation

Fig. 24 Slow path inverter delay chain circuit schematic and associated system level diagram.

Fig. 25. NAND Optimization for equal length of odd and even rotations

Fig. 26. (a) A D Flip Flop circuit and (b) illustration of non-ideal issue. [10]

Fig. 27. Differential RRD and differential fast path clock trigger

Fig. 28. (a) 8/3 priority encoding circuit and (b) its truth table

Fig. 29. Circuit diagram of an edge finder consisting of XOR Gates

Fig. 30. Vernier core circuit and phase regulator

Fig. 31. Diagram of the TDC core and coding output

Fig. 32. Ideal Simulation Conditions with REF and Phase Regulator Output

Fig. 33. Real simulation conditions with REF and phase regulator output

Fig. 34 Vernier core output

Fig. 35. Vernier Core Output in First Rotation of the Phase Regulator

Fig. 36. 8-bit 5-1 multiplexer and corresponding determination diagram

Fig. 37. (a) 5-1 multiplexer and corresponding determination diagram (b) logic of fi

Fig. 38. Simulation of Final Output of the TDC and alignment issue A

Fig. 39. Simulation of Final Output of the TDC and alignment issue B

Fig. 40. Logic implementation of alignment between even rotation word and core output.

Fig. 41. TDC system block diagram.

Fig. 42. Final Simulation of Final Output of the TDC.

Fig. 43. ADPLL System Specifications from Microsemi.

Fig. 44. Original Phase noise & RMS jitter conversion of the ADPLL.

Fig. 45. Updated Phase noise & RMS jitter conversion of the ADPLL.

Fig. 46. ADPLL phase noise contribution and distribution.

Fig. 47. ADPLL phase noise calculator and analyzer.

Fig. 48. Basic integer PLL system simulation platform.

Fig. 49. ADPLL control voltage convergence with the time.

Fig. 50. Diagram of a fractional digital type I PLL system.

Fig. 51. Diagram of a fractional digital type II PLL system.

Fig. 52. ADPLL phase noise comparison between different loop parameters.

Fig. 54. ADPLL operation with combination of type I and type II. [1]
Chapter 1: Introduction

1.1 All Digital Phase Locked Loop

An all-digital phase locked loop (ADPLL) is a PLL in which all components have digital signals at their inputs and outputs. Even the low-pass filter located between the phase (frequency) detector and the digitally-controlled oscillator (DCO) (which traditionally would have been an analog voltage-controlled oscillator) is completely digital. Compared to its analog counterpart, an ADPLL has the potential advantage of size and cost reductions that scale with process technology (say from 90nm to 45nm). An analog PLL includes an analog filter implemented with passive capacitors and resistors, which cannot shrink at smaller geometries or be fully integrated on chip [1] [2] [3].

Commonly, an ADPLL consists of a phase comparator, a type-I or type-II digital low-pass filter, a DCO and a time-to-digital converter (TDC), as shown in Fig. 1. FCW stands for Frequency-Command-Word, which indicates the desired number of DCO periods in one reference period. An integer counter counts the integer number ($i$) of DCO periods in the current reference period (or system timing period). The TDC measures the fractional part of a DCO period ($\varepsilon$), as shown in Fig. 2. When the loop has settled, the PLL system ideally drives the sum at the comparator to zero, which means:

$$\sum FCW = i + \varepsilon \quad \text{Eq.(1)}$$

1.2 Time-to-Digital Converter

To realize an ADPLL, a high performance time-to-digital converter (TDC) is crucial, because the TDC mostly dominates the in-band phase noise of the system (Fig. 3). A TDC measures and quantizes the phase (time) difference ($\Delta T_{Tot}$) between the reference clock and the DCO clock (the feedback of the ADPLL), which corresponds to $F_{Rn}$ shown in Fig. 2. The digital output of the TDC allows an accurate arithmetic operation in the comparator (Fig. 1), avoiding the noise injection associated with an analog PLL’s charge pump.

The time resolution of the TDC ($T_{res}$) is one of the most important TDC performance metrics, indicating how accurately the timing difference between the reference (REF) and DCO edges is resolved. With a linear conversion, the RMS phase error $\phi_{error}$ is:

$$\phi_{error} = 2\pi \cdot \frac{T_{res}/\sqrt{12}}{T_{DCO}} \quad \text{Eq.(2)}$$

where $T_{DCO}$ is the period of the DCO output [5]. This noise is spread from dc to the sampling frequency (REF frequency, $f_{ref}$). Thus the in-band phase noise of the TDC is:

$$L(\Delta f) = 10 \log_{10}\left[\frac{(2\pi)^2}{12} \cdot \left(\frac{T_{res}}{T_{DCO}}\right)^2 \cdot \frac{1}{f_{ref}}\right] \quad \text{Eq.(3)}$$

In order to reduce the phase noise caused by a TDC, a finer time resolution is needed [4]. In addition to achieving competitive phase-noise performance, the synthesizer (ADPLL) still needs to produce a wide frequency range of DCO outputs [5], especially for demanding wireless applications. Considering the requirements of state-of-the-art ADPLL applications in the wireless field (television broadcast, mobile phones, wireless LAN, Bluetooth, ZigBee, GPS, microwave devices/communications, etc.), the novel integrated ADPLL design considered should cover the frequency range from sub-100MHz to 20GHz (or even broader), which requires a wide phase detection range of tens of nanoseconds. This means that the TDC phase detection range needs to cover the period of the lowest DCO frequency. Therefore, a fine TDC time resolution combined with a wide phase detection range ($\Delta T_{Tot,max}$) is the most important performance requirement in most wireless applications.

Note from Eq.(3) that a higher reference frequency improves the in-band phase noise performance of the ADPLL (decreasing the phase noise value). Thus, a TDC solution that allows a higher reference frequency is desired. This topic is discussed in Chapter 4.
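
As a rough numerical illustration of Eq.(2) and Eq.(3), the following Python sketch evaluates the TDC-limited in-band phase noise for the coarse (16ps) and fine (2ps) resolutions. The 1GHz DCO output and the 25MHz reference are borrowed from the titles of Table 4 and Table 3; the other reference frequencies and all function names are assumptions chosen only for illustration.

```python
import math


def tdc_rms_phase_error(t_res, t_dco):
    """RMS phase error in radians from Eq.(2); times in seconds."""
    return 2 * math.pi * (t_res / math.sqrt(12)) / t_dco


def tdc_inband_phase_noise(t_res, t_dco, f_ref):
    """In-band phase noise in dBc/Hz from Eq.(3); f_ref in Hz."""
    return 10 * math.log10(
        (2 * math.pi) ** 2 / 12 * (t_res / t_dco) ** 2 / f_ref
    )


T_DCO = 1e-9  # assumed 1 GHz DCO output

for t_res in (16e-12, 2e-12):           # coarse and fine resolution
    for f_ref in (25e6, 50e6, 100e6):   # illustrative reference frequencies
        noise = tdc_inband_phase_noise(t_res, T_DCO, f_ref)
        print(f"T_res={t_res * 1e12:4.0f} ps  f_ref={f_ref / 1e6:5.0f} MHz  "
              f"L={noise:8.2f} dBc/Hz")
```

In this model, each doubling of the reference frequency lowers the result of Eq.(3) by $10\log_{10}2 \approx 3$dB, and each halving of $T_{res}$ lowers it by about 6dB.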

Low power consumption and small chip area are also key performance specifications, as they are needed for a relatively low-cost solution and production.

1.3 TDC Key Performance Specifications Summary

The following are the key TDC performance specifications, given the highest priority in this research:

- Time resolution, $T_{\text{res}}$ (2ps).
- Phase (time) detection range, $\Delta T_{\text{Tot,max}}$ (12.5ns).
- Highest available reference frequency, 100~200MHz.
- Low power consumption, compared to current art.
- Small chip area, compared to current art.

1.4 TDC detection range expression

The phase (time) detection range of a TDC can also be expressed digitally by the number of output bits it produces, \( X \). The relationship between \( X \) and \( \Delta T_{\text{Tot,max}} \) is:

\[
\Delta T_{\text{Tot,max}} = T_{\text{res}} \cdot (2^X - 1) \quad \text{Eq.(4)}
\]

For instance, a 10 bit TDC with 2ps resolution has a phase (time) detection range of about 2ns, while a 14 bit one can cover more than 32ns. Eq.(4) implies that, for a fixed number of digital bits, the phase detection range narrows as the TDC time resolution becomes finer, which means that a TDC design must trade off time resolution against phase detection range, or vice versa.
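
The arithmetic behind Eq.(4) can be checked with a short Python sketch. The function names are hypothetical, and integer picoseconds are used to avoid rounding issues.

```python
def detection_range_ps(t_res_ps, bits):
    """Phase detection range from Eq.(4), in picoseconds."""
    return t_res_ps * (2 ** bits - 1)


def bits_needed(t_res_ps, range_ps):
    """Smallest X for which Eq.(4) covers the requested range."""
    bits = 0
    while detection_range_ps(t_res_ps, bits) < range_ps:
        bits += 1
    return bits


print(detection_range_ps(2, 10))   # 10-bit TDC at 2 ps: 2046 ps, about 2 ns
print(detection_range_ps(2, 14))   # 14-bit TDC at 2 ps: 32766 ps, about 32.7 ns
print(bits_needed(2, 12500))       # bits needed to cover 12.5 ns at 2 ps: 13
```

Under this relation, covering the 12.5ns target range at 2ps takes 13 bits, while 14 bits give the 32.7ns range quoted in the abstract.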

1.5 Thesis Outline

Chapter 2 discusses the background literature and illustrates the progress of TDCs in addressing time resolution and phase (time) detection range. The advantages and disadvantages of each type of TDC are discussed as well. Chapter 3 introduces the phase-scaled Vernier TDC architecture and describes the system level design, including the system specifications provided and the circuit level specifications required for successful operation. A couple of advanced options for the TDC implementation are provided for various purposes. Chapter 4 discusses TDC timing considerations aimed at achieving the best phase noise performance by allowing a higher reference clock in the ADPLL system. Chapter 5 provides the circuit level implementation of the TDC core, the phase regulator and the surrounding blocks of the TDC system. The simulation results and the corresponding optimization are also presented in this chapter. Chapter 6 provides an ADPLL system implementation for simulation and evaluation of the TDC performance. All the work is done in both Matlab and Cadence platforms. Chapter 7 concludes the thesis and briefly discusses future work to verify the research at the physical level.

Chapter 2: Background

2.1 Inverter Based TDC (Traditional)

To date a number of types of TDCs have been studied, such as the basic inverter-based TDC (traditional TDC), the ring oscillator TDC, the Vernier TDC and the Vernier ring TDC.

Fig. 4 Inverter Based TDC (Traditional)

The traditional TDC (Fig. 4) has a simple structure, consisting only of an N-stage delay chain and N D flip-flops (as an arbiter set/array) [6]. In this type of TDC, the DCO signal propagates through the delay chain with a time delay ($\tau$) for each inverter stage. As shown in Fig. 5, a delayed copy of the DCO edge is produced at the output of each inverter stage. The delay at each output is proportional to the number of inverters between the DCO input and the corresponding node. Once the reference (REF) edge arrives, the values of those nodes (the data terminals of the arbiters) at that moment are stored as $Q<0:N>$. The location ($k$) where the values of $Q<0:N>$ transition quantitatively determines the phase difference ($\Delta T_{Tot}$) between the DCO edge and the REF edge by:

\[ \Delta T_{Tot} = k \cdot \tau, \quad k = 1, 2, \ldots, N \quad \text{Eq.(5)} \]

where \( N \) is the number of stages of the TDC. Note that the phase detection range (\( \Delta T_{Tot,\text{max}} \)) is determined by:

\[ \Delta T_{Tot,\text{max}} = N \cdot T_{res} = N \cdot \tau \quad \text{Eq.(6)} \]

Clearly, the phase detection range (\( \Delta T_{Tot,\text{max}} \)) increases with the propagation delay (\( \tau \)) of each inverter and the number of stages (\( N \)). As mentioned, \( \tau \) also sets the TDC time resolution, so increasing \( N \) is essentially the only way to obtain a larger phase detection range at a fixed \( T_{res} \). For instance, to detect a 1GHz DCO (\( \Delta T_{Tot,\text{max}} = 1000\text{ps} \)) with a \( T_{res} \) of 10ps, a delay chain of 100 stages is needed. Similarly, to detect an 80MHz DCO (\( \Delta T_{Tot,\text{max}} = 12.5\text{ns} \)), the required number of stages (\( N \)) is 1250. The lower the frequency of the signal produced by the DCO, the more stages the delay chain requires at a constant inverter delay ($\tau$). A large $N$ leads to large chip area and high power consumption, resulting in a less competitive or even impractical TDC solution.
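
The operation described by Eq.(5) and Eq.(6) can be sketched as an idealized behavioral model in Python. This is only an illustration under simplifying assumptions: every stage has exactly the same delay, the polarity inversion of each inverter is ignored, and the arbiters never go metastable. All names are hypothetical.

```python
def required_stages(range_ps, tau_ps):
    """Stages needed so that N * tau covers the range, per Eq.(6)."""
    return -(-range_ps // tau_ps)  # integer ceiling division


def sample_delay_chain(delta_t_ps, tau_ps, n_stages):
    """Values latched when REF arrives delta_t after the DCO edge.

    Node k has already seen the DCO edge if k * tau <= delta_t.
    """
    return [1 if k * tau_ps <= delta_t_ps else 0
            for k in range(1, n_stages + 1)]


def decode(q, tau_ps):
    """Locate the transition k and apply Eq.(5)."""
    k = sum(q)  # in a thermometer code, the transition index is the count of ones
    return k, k * tau_ps


print(required_stages(1000, 10))    # 1 GHz DCO at 10 ps: 100 stages
print(required_stages(12500, 10))   # 80 MHz DCO at 10 ps: 1250 stages

q = sample_delay_chain(delta_t_ps=137, tau_ps=10, n_stages=100)
print(decode(q, tau_ps=10))         # (13, 130): 137 ps is quantized to 130 ps
```

The example makes the trade-off visible: the quantization step equals $\tau$, so a finer step at the same range directly multiplies the number of stages.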

2.2 Ring Oscillator TDC

To avoid a very large $N$, the ring oscillator TDC was developed (Fig. 6). It reuses a small odd number of delay stages to achieve a wide phase detection range. It breaks the relationship of Eq.(6) and provides a classic solution for a large phase detection range requirement [7]. However, $T_{res}$ is still equal to $\tau$, which is typically from 10ps to 100ps in a 65nm process, so the traditional TDC and the ring oscillator TDC remain limited in time resolution for the most demanding wireless applications. The multipath gated ring oscillator (GRO) structure has been a valuable research direction for improving TDC performance [8].

Fig. 6 Ring Oscillator TDC Structure

2.3 Vernier TDC

One way to improve the time resolution is to use a Vernier type TDC [9], as shown in Fig. 7, where the time resolution is equal to:

\[ T_{\text{res}} = \Delta \tau = \tau_1 - \tau_2 \quad \text{Eq.(7)} \]

where \( \tau_1 \) and \( \tau_2 \) are the propagation delays of the inverters in the slow delay chain and the fast delay chain, which are driven by the DCO and the REF respectively. Theoretically, \( \Delta \tau \) could be reduced to almost zero, and in practice Vernier TDCs with \( T_{\text{res}} \) as low as a few ps have been implemented and verified, such as [10].

Fig. 7 Vernier TDC Structure

However, the Vernier type TDC improves the time resolution at the expense of the phase detection range or area, due to the huge number of stages needed in the delay chains. For example, for a \( T_{\text{res}} \) of 2ps and a \( \Delta T_{\text{Tot,max}} \) of 12.5ns, a Vernier TDC requires 6250 stages. The Vernier TDC has a resolution 5 times better than that of the 10ps traditional TDC example at the expense of 5 times the number of stages, which results in even worse area and power performance than the traditional TDC. In addition, it takes an extremely long time period \((\tau_1 \cdot N)\) to guarantee that all arbiter values \((Q<0:N>)\) are captured and stored correctly for determination. In the case of \(\tau_1 = 35\)ps, the capture period could be as long as \(35 \times 6250 = 218750\)ps \(= 218.75\)ns, which is impractical as well.
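
The catch-up principle of the Vernier TDC and the cost figures quoted above can be sketched in Python as follows. This is an idealized model, not the circuit itself: the slow-chain delay of 35ps comes from the text, while the fast-chain delay of 33ps is an assumption chosen to give the 2ps resolution, and all names are hypothetical.

```python
def vernier_measure(delta_t_ps, tau_slow_ps, tau_fast_ps, n_stages):
    """Return the first stage k at which the trigger catches the leader.

    The leading edge enters the slow chain at t = 0 and the triggering
    edge enters the fast chain delta_t later. Returns None if the phase
    difference exceeds the range of the chain.
    """
    for k in range(1, n_stages + 1):
        lead_arrival = k * tau_slow_ps
        trigger_arrival = delta_t_ps + k * tau_fast_ps
        if trigger_arrival <= lead_arrival:
            return k
    return None


TAU_SLOW_PS = 35   # from the text
TAU_FAST_PS = 33   # assumed, so that the resolution is 35 - 33 = 2 ps
T_RES_PS = TAU_SLOW_PS - TAU_FAST_PS

n_stages = 12500 // T_RES_PS           # 6250 stages for a 12.5 ns range
capture_ps = TAU_SLOW_PS * n_stages    # 218750 ps = 218.75 ns

k = vernier_measure(137, TAU_SLOW_PS, TAU_FAST_PS, n_stages)
print(n_stages, capture_ps, k, k * T_RES_PS)   # 6250 218750 69 138
```

The trigger gains only 2ps per stage on the leader, which is why both the stage count and the capture time grow so quickly with the required range.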

2.4 Vernier Ring TDC

By employing the ring structure of the ring oscillator TDC in a Vernier TDC, an advanced type of TDC called the Vernier Ring TDC (VRTDC) was developed, aiming at both fine time resolution and a wide phase detection range [11], [12]. The principle of the VRTDC (Fig. 8) is to connect the output terminal of each Vernier delay chain, which has only a small finite number of stages, back to its input terminal. The leading signal and the triggering signal run laps (rotations) along their own delay chains until the arbiters indicate that the triggering signal has caught up with the leading signal. The reuse of the delay chains in the VRTDC greatly reduces the number of delay stages, reducing area and power cost. Two counting units are needed to record how many times the triggering signal and the leading signal traverse the entire loop before the triggering signal catches up with the leading signal. The ring structure of the VRTDC requires several extra, demanding logic operations, which include negative pulse generation for the reference clock and the DCO clock, pre-logic operations (identifying and separately routing the two inputs for the fast ring and the slow ring), double-scaled arbiter sets for odd rotations and even rotations [13], [14], and a register bank with fast reading and decision logic. The VRTDC works correctly only if all of these considerations and the corresponding complex logic operations are carried out with very high precision. Moreover, a phase error due to the finite width of the negative pulse occasionally degrades the performance of the TDC. On the whole, the VRTDC is one of the most effective and efficient TDC solutions, offering fine time resolution, a wide phase detection range and compact size due to its small number of delay stages. On the other hand, a VRTDC also requires a large amount of high speed circuitry, due to its two-ring structure and its use of multi-capturing and multi-updating or a large register bank. Table 1 summarizes the performance contrast among the various TDC solutions.

<table>
<thead>
<tr>
<th></th>
<th>Time Resolution $T_{res} = \Delta \tau$</th>
<th>Phase Detection Range $\Delta T_{Tot,max}$</th>
<th>Stages of Delay chains $N$</th>
</tr>
</thead>
<tbody>
<tr>
<td>Traditional TDC</td>
<td>Coarse</td>
<td>Narrow</td>
<td>Large</td>
</tr>
<tr>
<td>Ring-Oscillator TDC</td>
<td>Coarse</td>
<td>Wide</td>
<td>Small</td>
</tr>
<tr>
<td>Vernier TDC</td>
<td>Fine</td>
<td>Narrow</td>
<td>Super Large</td>
</tr>
</tbody>
</table>

2.5 Phase-scaled Vernier TDC

The purpose of this thesis is to present a new TDC architecture that parallels its VRTDC counterpart in addressing the common conflict between fine time resolution and wide phase detection range. At the same time, it avoids the potential phase error and complicated logic determination of the VRTDC. Furthermore, its improved inverter-reverse frequency enables even lower power consumption and a higher possible REF frequency, giving better TDC phase noise performance than the VRTDC.

Chapter 3: System Level Design and Implementations

3.1 Architecture Illustration of Phase-Scaled Vernier TDC

As shown in Fig. 9, the simplest topology of the phase-scaled Vernier TDC can be created by employing a partial-ring connection in the slow delay chain, such that the Vernier TDC core is restarted periodically whenever the signal carried on the ring reverses. With this partial-ring structure (the phase regulator) combined into a Vernier TDC, no matter how long the phase difference $\Delta T_{Tot}$ is, it is scaled to a remarkably shorter length, $T_R$ (say <80ps), called the newly defined mandatory fine-granularity phase length, which is the only part measured with a very fine time resolution. The rest of the phase ($\Delta T_{Tot} - T_R$) does not need a fine-granularity measurement and can be measured in large granularity (80ps). Based on this scheme, a 2ps, 14 bit (32ns) TDC with only 40 stages is implemented, maintaining compact area and low power consumption and tolerating a higher possible reference frequency.

Fig. 9 Simplest Illustration of the Phase-scaled Vernier TDC Structure.
3.2 System Level Design Considerations

Usually, a TDC module is comprised of a TDC core section and a couple of peripheral sections for logical operations, decisions and calculations. As shown in Fig. 10, the proposed TDC architecture consists of a phase regulator, a fine resolution Vernier TDC core, a normal-phase counter, a pre-logic unit and an evaluator.

![Fig. 10 TDC module diagram](image)

The reference clock and DCO clock are processed by the pre-logic unit and are then fed into the phase regulator as the leading signal and the triggering signal. The purpose of the phase regulator is to measure the phase difference ($\Delta T_{Tot}$) (between the rising edge of the leading signal and the rising edge of the triggering signal) and cut it into pieces (as shown in Fig. 11). Except for the last piece ($T_R$), all the time pieces have an identical normal length ($T_{Nor}$), which should be set to less than 100ps in order to gain the most from reducing the number of stages of the delay chains. The phase difference \( \Delta T_{\text{Tot}} \) is thus broken into two parts: the last piece \( T_R \) and the rest of the phase \( \Delta T_{\text{Tot}} - T_R \).

![Diagram of phase regulator illustration in time domain](image)

**Fig. 11 Phase regulator illustration in time domain**

The last piece, \( T_R \), being shorter than \( T_{\text{Nor}} \) (a fractional part of \( T_{\text{Nor}} \)), is the only part that requires a very fine time resolution measurement. It is quantized as \( Q_R \) by an \( N \)-stage fine-resolution Vernier TDC core. Note that \( N \) should be chosen so that the fine-resolution Vernier TDC just covers the normal-length phase \( T_{\text{Nor}} \) (i.e. the newly defined fine-granularity phase detection range). That means:

\[
N = \frac{T_{\text{Nor}}}{T_{res}} = \frac{T_{\text{Nor}}}{\Delta\tau} \quad \text{Eq.(8)}
\]
For instance, if $T_{Nor}$ is set to 60ps and the desired $T_{res}$ equals 5ps, $N$ (the number of stages) should be 12. The output of the Vernier TDC ($Q_R$) is determined by:

$$Q_R = T_R/\Delta\tau \quad \text{Eq.(9)}$$

where the value ($f$) of $Q_R$ lies between 0 and $N$.

The rest of the phase ($\Delta T_{Tot} - T_R$) does not need a fine-granularity measurement, so it is measured in large granularity (80ps). The normal-phase counter records how many normal-length phase pieces are contained in the phase difference between the leading and triggering signals, and outputs a binary number ($M$) to the evaluator. Thus, the phase difference is given by:

$$\Delta T_{Tot} = M \cdot T_{Nor} + T_R \quad \text{Eq.(10)}$$

Substituting Eq.(8) and Eq.(9) into Eq.(10) gives:

$$\Delta T_{Tot} = M \cdot N \cdot \Delta\tau + Q_R \cdot \Delta\tau = (M \cdot N + Q_R) \cdot \Delta\tau \quad \text{Eq.(11)}$$

The definition of the final output of TDC ($Q_{Fin}$) therefore is:
\[ Q_{Fin} = \frac{\Delta T_{Tot}}{\Delta \tau} = M \cdot N + Q_R \quad \text{Eq.}(12) \]

The result of \( Q_{Fin} \) is computed in the evaluator based on the given \( M \) (from the normal-phase counter), \( N \) (the number of the fine TDC core stages) and \( Q_R \) (from the N-stage fine-resolution Vernier TDC).
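
The relations in Eq.(8) to Eq.(12) can be checked with a short behavioural sketch. It is not the circuit itself: the function name `phase_scaled_quantize` is hypothetical, all times are assumed to be integers in picoseconds, and the Vernier core is idealized as a floor quantizer with no mismatch or noise.

```python
def phase_scaled_quantize(delta_t_ps, t_nor_ps=80, t_res_ps=2):
    """Idealized model of Eq.(8)-(12); every time is an integer in ps."""
    if t_nor_ps % t_res_ps:
        raise ValueError("T_Nor must be an integer multiple of the resolution")
    n = t_nor_ps // t_res_ps               # Eq.(8): stages of the fine core
    m, t_r = divmod(delta_t_ps, t_nor_ps)  # Eq.(10): M full pieces, remainder T_R
    q_r = t_r // t_res_ps                  # Eq.(9): ideal floor quantization of T_R
    q_fin = m * n + q_r                    # Eq.(12): final TDC output
    return m, n, q_r, q_fin


if __name__ == "__main__":
    # Example: a 1234 ps phase difference with T_Nor = 80 ps and 2 ps resolution.
    print(phase_scaled_quantize(1234))
```

Under these assumptions the final code equals the phase difference divided by the fine resolution, even though only the remainder $T_R$ is measured by the fine core.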

Note from the comparison between Eq.(6) and Eq.(8), the number of stages in the phase-scaled Vernier TDC can be flexibly set as desired, unlike the traditional (Vernier) TDC solutions, in which the number of stages of the delay chains is proportional to the maximum period of the applied DCO clock. To what degree the stages of the former can be shrunk over the latter is determined by dividing Eq.(6) by Eq.(8):

\[
\frac{N_{\text{common law}}}{N_{\text{this work}}} = \frac{\Delta T_{Tot,max} / T_{res}}{T_{Nor} / T_{res}} = \frac{\Delta T_{Tot,max}}{T_{Nor}} \quad \text{Eq.(13)}
\]

For instance, to target a time resolution of 5ps over a phase difference of 12.5ns, a traditional Vernier TDC needs 2500 stages, while a phase-scaled TDC needs only 16 stages (with \( T_{Nor} \) set to 80ps). In fact, the TDC core in Fig. 10 can also be implemented in another fine-resolution style (such as a time-amplifier TDC), provided its time resolution performance is sufficiently good.
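
The stage reduction of Eq.(13) for this example can be reproduced numerically. The helper name `stage_counts` is hypothetical and the sketch assumes integer picosecond inputs.

```python
def stage_counts(range_ps, t_res_ps, t_nor_ps):
    """Return (traditional stages, phase-scaled stages, reduction ratio)."""
    n_traditional = range_ps // t_res_ps    # full range at fine resolution
    n_phase_scaled = t_nor_ps // t_res_ps   # only T_Nor at fine resolution
    return n_traditional, n_phase_scaled, n_traditional / n_phase_scaled


if __name__ == "__main__":
    # 12.5 ns range, 5 ps resolution, T_Nor = 80 ps
    print(stage_counts(12_500, 5, 80))
```

The ratio returned is the same as $\Delta T_{Tot,max}/T_{Nor}$, since the resolution cancels out.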

Theoretically, the normal-length phase set in the phase regulator can be as short as desired, resulting in a smaller TDC core. However, making this length shorter than 60ps is not cost-efficient, because (as shown in Eq.(10)) a shorter $T_{Nor}$ also increases the number of normal-length phase pieces (M), resulting in a larger number of binary bits in the phase counter, which in turn consumes more area and power. In this thesis, choosing M up to 256 (8 bits) covers a phase detection range of 20ns with $T_{Nor}=80\text{ps}$, which is acceptable for an ADPLL in wireless applications from below one hundred MHz to tens of GHz.
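
The trade-off between core size and counter width can be sketched as a small sweep. The function name `core_and_counter_bits` is hypothetical, the 12.5ns range and 2ps resolution are taken from the specification used in this thesis, and the counter width is assumed to be the minimum number of bits able to hold the largest value of M.

```python
def core_and_counter_bits(t_nor_ps, range_ps=12_500, t_res_ps=2):
    """Return (fine-core stages, counter bits) for a given T_Nor in ps."""
    n_core = t_nor_ps // t_res_ps      # stages needed to cover T_Nor
    m_max = range_ps // t_nor_ps       # largest count of full T_Nor pieces
    return n_core, m_max.bit_length()  # minimum bits that can hold m_max


if __name__ == "__main__":
    for t_nor in (20, 40, 60, 80, 100):
        print(t_nor, core_and_counter_bits(t_nor))
```

Halving $T_{Nor}$ halves the core but adds roughly one counter bit, which illustrates why very short normal lengths bring diminishing returns.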

### 3.3 System Specification Assumption

Considering the requirements of state-of-the-art ADPLL applications, a set of specifications was chosen for the design of the proposed phase-scaled Vernier TDC and is presented in Table 2.

Table 2: Specifications of the TDC

<table>
<thead>
<tr>
<th>Specification</th>
<th>Symbol</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>TDC Time Resolution</td>
<td>$T_{res}(\Delta \tau)$</td>
<td>2ps</td>
</tr>
<tr>
<td>DCO Output Frequency (Period)</td>
<td>$f_{DCO}$ ($P_{DCO}$)</td>
<td>80MHz–20GHz (50ps–12.5ns)</td>
</tr>
<tr>
<td>Phase-Detection Range</td>
<td>$\Delta T_{Tot,\text{max}}$ or $Max(P_{DCO})$</td>
<td>&gt;12.5ns</td>
</tr>
<tr>
<td>Reference Frequency (Period)</td>
<td>$f_{REF}$</td>
<td>10–50MHz (the higher the better)</td>
</tr>
<tr>
<td>Large-Size-Inverter Delay</td>
<td>$\tau_{2,\text{fast}}$</td>
<td>15ps</td>
</tr>
<tr>
<td>Normal-Inverter Delay</td>
<td>$\tau_{2,\text{normal}}$</td>
<td>30ps</td>
</tr>
</tbody>
</table>

Note that a shorter inverter delay requires larger transistors, which considerably increases the layout area when the number of stages is large.
3.4 Phase Regulator & TDC Core

Fig. 12 shows a specific implementation of the phase-scaled Vernier TDC architecture and highlights the phase regulator part and the fine time resolution TDC core part. The phase regulator in this thesis is realized by a ring oscillator of a few inverter stages, which are also reused as the front part of the slow-path delay chain of the Vernier TDC core.

The rising edge of the leading signal (REF) enables the gate NAND1, launching the leading signal along the ring oscillator (note that the other input terminal of NAND1 is already high). From this moment, the leading signal travels around the loop of the ring oscillator and triggers the normal-phase counter each time it completes a rotation (passes the last stage of the loop). The travelling on the ring does not stop until the triggering signal (RRD) arrives. The counter then tells how many rotations (M) the leading signal has completed along this ring structure. Note that the time period of a single rotation around the ring is the normal-length phase \( T_{Nor} \) mentioned above, which is set to 80ps in this case. The number of stages of the phase regulator can be 3, 5 or 7 (odd) to make the ring oscillator work correctly and efficiently. In this case the number of stages of the ring oscillator \( N_{ring} \) is set to 5, thus the propagation delay of each inverter stage in the phase regulator \( \Delta t_{ring} \) is determined by:

\[
\Delta t_{ring} = \frac{T_{Nor}}{N_{ring}} = \frac{80\text{ps}}{5} = 16\text{ps} = \tau_2 \quad \text{Eq.}(14)
\]

Note that \( \tau_2 \) equals \( \Delta t_{ring} \).

As mentioned, the Vernier core is idle during most of the phase detection operation and is only used to measure the last fractional piece, \( T_R \). The arrival of the triggering signal (RRD) is used not only to launch the RRD signal along the fast-path delay chain but also to activate the sixth inverter stage in the slow-path delay chain (RRD control gate NAND2). The triggering signal then starts chasing the leading signal and, as in the common Vernier type TDC, the position where the triggering signal has just caught up with the leading signal is indicated by the transition of the arbiters’ output \( (Q_R) \). The number of stages of the Vernier TDC core \( N_{core} \) is determined (Eq.(8)) by the desired normal-length phase and the desired resolution:

\[
N_{core} = \frac{T_{Nor}}{\Delta \tau} = \frac{80\text{ps}}{2\text{ps}} = 40 \quad \text{Eq.(15)}
\]

The slow-path inverter delay \( (\tau_2) \) and the fast-path inverter delay \( (\tau_1) \) should be equal to 16ps (as in Eq.(14)) and 14ps \( (\tau_1 = \tau_2 - \Delta \tau = 16\text{ps} - 2\text{ps}) \) respectively. Therefore, the final TDC output $Q_{Fin}$ can be determined from the acquired M, N=40 and $Q_R$ by Eq.(12). Compared to the Vernier Ring TDC solution, where two arrays of arbiters are needed for odd and even rotations respectively [13], this solution needs only one array of arbiters thanks to the ring-less structure of the fast path. This means that the complexity of a 40-stage Vernier core in this solution is equivalent to that of a 20-stage Vernier ring solution.

The completion of an odd rotation corresponds to a rising edge of the input signal of the normal-phase counter, and the completion of an even rotation corresponds to a falling edge. The phase counter should therefore be a both-edge triggered counter in order to record each rotation of the signal. According to the phase detection range of 12.5ns in the specifications of Table 2 and a $T_{Nor}$ of 80ps, the phase counter may record a maximum of 156 rotations. Thus, an 8-bit normal-phase counter (256) is sufficient.
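
As a quick numerical check of Eq.(14), Eq.(15) and the counter sizing above, the design parameters of the Fig. 12 implementation can be derived in a few lines. The function name `fig12_parameters` and the dictionary keys are hypothetical; the values come from the text.

```python
def fig12_parameters(t_nor_ps=80, n_ring=5, t_res_ps=2, range_ps=12_500):
    """Derive the delay and sizing parameters used in Section 3.4 (times in ps)."""
    tau2 = t_nor_ps // n_ring         # Eq.(14): slow-path / ring stage delay
    tau1 = tau2 - t_res_ps            # fast-path delay, tau1 = tau2 - delta_tau
    n_core = t_nor_ps // t_res_ps     # Eq.(15): fine Vernier core stages
    max_rotations = range_ps // t_nor_ps
    return {
        "tau2_ps": tau2,
        "tau1_ps": tau1,
        "n_core": n_core,
        "max_rotations": max_rotations,
        "fits_8_bit_counter": max_rotations < 2 ** 8,
    }


if __name__ == "__main__":
    print(fig12_parameters())
```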

Fig. 13. A modified version of the phase-scaled Vernier TDC.

Fig. 13 shows a modified version of the phase-scaled Vernier TDC solution. It decouples the phase regulator circuitry from the fine TDC core circuitry of the solution shown in Fig. 12, making the implementation easier. For example, in the former architecture, \( \tau_2 \) has to be equal to \( \Delta t_{ring} \) of 16ps, while in the latter, \( \tau_2 \) can be set to 30ps in order to reduce the transistor area in the TDC core circuit.

Table 3: Performance Analysis at a Fixed 25MHz Reference Frequency (40ns period)

<table>
<thead>
<tr>
<th>Specification</th>
<th>Symbol</th>
<th>Vernier</th>
<th>15-stage Vernier Ring</th>
<th>Phase-scaled Vernier</th>
</tr>
</thead>
<tbody>
<tr>
<td>TDC Time Resolution</td>
<td>$T_{res}(\Delta \tau)$</td>
<td>2ps</td>
<td>2ps</td>
<td>2ps</td>
</tr>
<tr>
<td>Phase-Detection Range</td>
<td>$\Delta T_{Tot,max}$</td>
<td>12.5ns</td>
<td>12.5ns</td>
<td>12.5ns</td>
</tr>
<tr>
<td>Normal-Inverter Delay</td>
<td>$\tau_2$ &amp; $\tau_1$</td>
<td>30ps&amp;28ps</td>
<td>30ps&amp;28ps</td>
<td>30ps&amp;28ps</td>
</tr>
<tr>
<td>Mandatory Fine Resolution Phase Length</td>
<td>$T_{NOR}$</td>
<td>12.5ns</td>
<td>$(2N-1)\cdot\tau_2$ = 870ps</td>
<td>80ps</td>
</tr>
<tr>
<td>Effective N stages</td>
<td>$N_e$</td>
<td>6250</td>
<td>30</td>
<td>40</td>
</tr>
<tr>
<td>Max Capture Period</td>
<td>$P_{C,max}$</td>
<td>6250*28ps=175ns</td>
<td>(870ps/2ps)*28ps=12ns</td>
<td>40*28ps=1ns</td>
</tr>
<tr>
<td>Inverter output transitions in the Vernier TDC core in one reference period (fewer transitions mean lower power consumption)</td>
<td>$F_{\text{Power}}$</td>
<td>6250*2=12500</td>
<td>435*2=870</td>
<td>40*2=80</td>
</tr>
<tr>
<td>Operating period, including an assumed 4ns for the evaluation period and pre-logic circuit</td>
<td>$P_E + P_C$</td>
<td>175+4=179ns, exceeds one REF period (40ns)</td>
<td>12+4=16ns &lt;40ns</td>
<td>1+4=5ns &lt;40ns</td>
</tr>
<tr>
<td>Conclusion</td>
<td></td>
<td>Impractical</td>
<td>OK for time window, but medium power consumption</td>
<td>Good</td>
</tr>
</tbody>
</table>

3.5 Coarse/Fine Configuration & Optimization

TDCs with switchable coarse/fine functions are popular among researchers as a way to achieve optimal performance [15], [16]. Similarly, as shown in Fig. 12 and Fig. 13, both phase-scaled Vernier solutions provide a switchable coarse/fine time resolution function by adding an arbiter array (5 D flip-flops) to the phase regulator. The refined phase regulator thus also functions as a TDC with a coarse time resolution of 16ps (80ps/5). By replacing the seventh inverter (in the slow path) and the first inverter (in the fast path) with NAND gates (whose other input terminal is driven by the RRD signal), the time window in which the fine-resolution Vernier TDC core is active can be easily managed. The circuitry is meant to be active only while the last fractional phase piece is passing through.
In practice, the 2ps fine-resolution TDC core can be turned off in most less demanding scenarios. For a demanding requirement, the fine-resolution TDC core can still remain off during the acquisition and coarse-resolution tracking period, until the ADPLL requests it to turn on (Fig. 14). Even during fine-resolution tracking, the actual working period of the phase-scaled Vernier TDC is only around 1ns (Table 3), 12 times shorter than that of a 15-stage VRTDC. This leads to a large difference in the number of inverter transitions between the VRTDC (870 transitions) and the phase-scaled TDC (80 transitions), which allows the phase-scaled Vernier TDC to consume over 10 times less power than its Vernier Ring counterpart. Therefore, the two-step (coarse/fine resolution) phase-scaled TDC scheme provides a very low power consumption solution without trading off other crucial performance parameters.

Note that the introduction of the 5 arbiters in the phase regulator also enables a smart determination scheme, called the coordinated-determination device, which remarkably simplifies the circuits in the evaluation unit; it is described later. The output of the 5 arbiters $Q_M < 1:5 >$ can also be used to optimize the counter circuit and up-scale the phase detection range. Specifically, the both-edge triggered counter records every rotation on the ring oscillator; however, it can be simplified to a one-edge triggered counter which records only every other rotation, in order to relax the demand on the response speed of the counter circuitry. In that case, the LSB of the 8-bit counter corresponds to 160ps instead of 80ps, and for an even number of rotations the final result needs an 80ps compensation. In addition, this modification doubles the phase detection range, from 20ns (256*80ps) to 40ns (256*160ps) at a fixed 8 bits. This optimization can be realized by the following steps (Fig. 15):

1. The outputs of 5 D Flip-Flops $Q_M < 1:5 >$ are processed by a set of XOR gates to determine the position of the transition $Q'_M < 1:5 >$.

2. A simple encoder produces a binary number (B) based on $Q'_M < 1:5 >$.

3. The state of $Q_M < 1 >$ indicates whether the number of rotations is even or not. A is used to correct the final output $B'$ by a value of 5 (corresponding to one more rotation of 80ps) in the case of an even number of rotations.

4. $B'$ is the sum of A and B.
Fig. 15. (a) Logical modification for rising-edge only phase counter, (b) Truth table of (a) circuitry.
3.6 Evaluator

Functionally, as shown in Eq.(12), the evaluator combines the output of the normal-phase counter, $M$ (the number of normal-length phase pieces within the phase difference), with the output of the Vernier core unit in thermometer-code format ($Q_r<1:40>$), which is determined by Eq.(9). Thus $Q_{Fin}$ for a 40-stage fine TDC core is given by:

$$Q_{Fin} = M \cdot 40 + Q_r \quad \text{Eq.(16)}$$

Usually a 40/6-bit thermometer-to-binary encoder may be employed to convert $Q_r$ from thermometer code to binary code. The complexity of a thermometer-to-binary encoder increases exponentially with the number of digits of the thermometer code, although it is already much simpler than that of a priority-type decoder, which is commonly used in a Vernier Ring TDC because multiple fake transitions may be present. In this thesis, however, a 40/6-bit thermometer-to-binary encoder is still viewed as a complicated conversion that deserves further simplification into an 8/3-bit simple encoder by applying the coordinated-determination device in the evaluator. In Fig. 16, $Q_M<1:5>$ is the output of the arbiters located in the phase regulator (5-stage ring oscillator), which can be regarded as the arbiters of a ring-oscillator TDC. As mentioned, the $T_{Nor}$ (normal-length phase) of 80ps is covered by a 5-stage ring oscillator with a propagation delay of 16ps per stage (coarse resolution), and also by a 40-stage Vernier TDC core with 2ps resolution (fine resolution). $Q_M<1:5>$ and $Q_r<1:40>$ give the quantized value of $T_r$ with 16ps and 2ps resolution respectively, and each bit in $Q_M<1:5>$ corresponds in sequence to eight bits in $Q_r<1:40>$ (Fig. 16(b)). In the coordinated-determination device (Fig. 16(a)), the exact position of the transition is determined by the following steps:
a) First, $Q_M<1:5>$ passes through an array of XOR gates (Fig. 15 (a)) to find the position of its value transition (converting the data from thermometer code format to simple code format $Q'_M<1:5>$). $Q_R<1:40>$ can be divided into five sections, which correspond to $Q'_M<1> \sim Q'_M<5>$.

b) Second, the single “high” bit in $Q'_M<1:5>$ is used to select the corresponding section of $Q'_r<1:40>$, which is the section that contains the position of the edge transition in higher precision (labeled $Q'_r<1:8>$).

c) Third, $Q'_M<1:5>$ is converted by a simple encoder (8/3) into a three-bit output, $B<3:5>$, which represents the three most significant bits of a 6-bit binary number giving the effective remaining phase difference ($T_R$) in the coarse resolution of 16ps.

d) Meanwhile, $Q'_r<1:8>$ is converted by an edge finder and a simple encoder (8/3) into a three-bit output, $B<0:2>$, which represents the three least significant bits of the above-mentioned 6-bit binary number and gives the fine part of $T_R$ in 2ps resolution.

e) The key point is that the 3-bit output of the coarse detection ($B<3:5>$) from the arbiters of the phase regulator can simply be stacked onto the 3 bits of the fine detector ($B<0:2>$), avoiding additional computation and conversion, provided that $N_{core}/N_{ring} = 2^K$ (where K is a positive integer).
Note that the coordinated-determination device reduces the number of processed bits by a factor of 5 (from 40 bits in $Q_R<1:40>$ to 8 bits in $Q_R<1:8>$), which not only simplifies the logic circuitry in the evaluator but also removes the possibility of errors due to fake transitions appearing in other sections.
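
The bit-stacking idea in steps a) to e) can be illustrated with an idealized behavioural sketch. It does not model the XOR array, the edge finder or the arbiters: the coarse and fine indices are simply assumed to be ideal floor quantizations of $T_R$, and the function name `coordinated_determination` is hypothetical.

```python
def coordinated_determination(t_r_ps, tau_ring_ps=16, t_res_ps=2):
    """Stack coarse bits B<3:5> onto fine bits B<0:2> (times in ps)."""
    fine_per_coarse = tau_ring_ps // t_res_ps      # 8 fine bits per coarse bit
    k = fine_per_coarse.bit_length() - 1
    if (1 << k) != fine_per_coarse:
        raise ValueError("N_core / N_ring must be a power of two")
    coarse = t_r_ps // tau_ring_ps                 # selected ring section, B<3:5>
    fine = (t_r_ps % tau_ring_ps) // t_res_ps      # position in section, B<0:2>
    return (coarse << k) | fine                    # plain concatenation of bits


if __name__ == "__main__":
    # The stacked code matches a direct 2 ps quantization over the whole T_Nor.
    print(all(coordinated_determination(t) == t // 2 for t in range(80)))
```

Because 16ps is exactly $2^3$ times 2ps, concatenating the two 3-bit fields needs no adder, which is the condition stated in step e).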

The multiplication ($M \cdot N_{core}$) is implemented as an addition of two bit-shifted copies of M, as shown in Fig. 16(a). Because 40 is expressed as 101000 in binary format, M is shifted by 3 bits and by 5 bits respectively, and the two terms are added to obtain a 13-bit binary number $M \cdot 40$. Finally, $Q_{Fin}$ is determined by adding $M \cdot 40$ to $Q_R$.
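
The shift-and-add multiplication can be verified with a minimal sketch. The function names `times_40` and `evaluator_output` are hypothetical and only model the arithmetic, not the hardware of Fig. 16(a).

```python
def times_40(m):
    """Multiply by 40 = 0b101000 using two shifts and one addition."""
    return (m << 5) + (m << 3)   # 32*M + 8*M


def evaluator_output(m, q_r):
    """Eq.(16): Q_Fin = M*40 + Q_R."""
    return times_40(m) + q_r


if __name__ == "__main__":
    print(times_40(156), times_40(156).bit_length())
    print(evaluator_output(15, 17))
```

For the maximum of 156 rotations given by the 12.5ns specification, the product fits in 13 bits, in line with the text.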
Fig. 16. (a) Evaluator and Coordinated-Determination Device, (b) and their operations in time domain.

3.7 Pre-logic Circuit with Reversed-Trigger Solution

Generally, a reference signal is used as the triggering signal of the arbiters to latch the instantaneous state of each node of the delay chain along which the DCO signal is travelling. The reference signal and the DCO signal cannot be exchanged, because the triggering and latching of the arbiters should be performed only once per period of the reference clock. Multiple transitions of the triggering signal would lead to an error, especially for the Vernier type TDC. In a common Vernier Ring TDC, a pre-logic circuit with pulse generators needs to be embedded into the front end of the circuit to suppress transitions of the leading signal and the triggering signal during the rest of the TDC determination cycle. However, a phase error may be present when the phase difference is narrower than the width of the pulse, even though the width is designed to be a tiny value such as 10ps. The pre-logic circuit is also used to identify the REF signal and the DCO signal and distribute them into the slow path and the fast path respectively, as the leading signal and the triggering signal. This operation also occasionally faces a meta-stability issue when the two edges of the REF and DCO are very close to each other, causing an extra unknown phase error.

In the present solution, a very concise pre-logic circuit, named the reversed-trigger pre-logic unit, is introduced to steer the whole phase-scaled Vernier TDC for accurate operation with low power consumption. In Fig. 18(a), the reference signal is oversampled by the high-rate DCO signal, yielding a “reference-retimed-by-DCO” signal (RRD). As shown in Fig. 17, the rising edge of the RRD is aligned in the time domain with the first DCO rising edge that follows the rising edge of the reference signal. Contrary to the traditional triggering strategy, the reference signal in this case is regarded as the leading signal and is injected into the TDC directly, launching the TDC operation, while the RRD signal is used as the triggering signal to trigger the arbiters of the TDC (Fig. 18(b)), marking the end of the phase difference. The main advantage of the reversed-trigger solution is that it eliminates the multiple transitions (reversals) of the high-rate DCO signal, avoiding the pulse generator, which leads to a phase error as mentioned above.

Fig. 17. Waveform illustration of reversed-triggered pre-logic unit.

![Diagram of Reverse-Triggered Pre-logic Unit and Arbiter Unit](image)

Fig. 18. (a) Reverse-triggered pre-logic unit block, (b) Arbiter unit block, (c) D Flip-Flop block, (d) D Flip-Flop transistor level block.
Additionally, the reverse-triggered pre-logic circuit incorporates a phase-offset cancellation function. In Fig. 18(c), the arbiter is a D flip-flop which consists of two cascaded latches. When the clock triggers the D flip-flop, the arbiter does not capture the state of the REF but the state of the X node, and X is a copy of the REF delayed by $\Delta T_A$ due to the propagation of latch A. That means the arbiters always capture the state that the reference had a short time ($\Delta T_A$) before they are triggered by the DCO signal, owing to the propagation delay of the transistors of latch A. This time offset causes a constant phase error during the quantization measurement in many popular TDC solutions. Fig. 17 shows that there is another time-offset delay between the RRD and the corresponding DCO rising edges ($\Delta T_B$), which is caused by latch B of the reversed-trigger pre-logic unit (Fig. 18(b)(c)). Note that the real phase difference ($\Delta T'_{Tot}$) is determined by:

$$\Delta T'_{Tot} = \Delta T_{Tot} + \Delta T_{Bprelogic} - \Delta T_{Aarbiter} \quad \text{Eq.}(17)$$

If one of the offsets can be set (by choosing proper parameters of the transistors) to be equal to the other one (i.e. $\Delta T_{Bprelogic} = \Delta T_{Aarbiter}$), the phase offsets will be canceled out, making the triggering signal capture the current state of the REF signal with no phase error (i.e. $\Delta T'_{Tot} = \Delta T_{Tot}$).
The third advantage of the reverse-triggered pre-logic unit is that it perfectly matches the oversampling-time strategy of the ADPLL system, which means that both the system and the TDC use RRD as the synchronous trigger (Fig. 19). The reference signal is introduced directly into the leading signal path of the TDC, in contrast to the complicated processing of the pre-logic circuit in the common Vernier Ring TDC. In the VRTDC, if the REF edge and the DCO edge rise simultaneously, at least one of the outputs of the pre-logic unit will enter a meta-stable state, resulting in an extra phase offset between the leading signal and the triggering signal. In the phase-scaled Vernier TDC, as shown in Fig. 17, when the transitions of the REF and DCO happen simultaneously in the reversed-trigger pre-logic circuit, the second DCO rising edge resolves the meta-stability and sets the RRD to the “1” state instead. In this situation, the detected phase difference will be roughly one DCO period instead of close to zero, which means that a slip with the length of one DCO period occurs. Fortunately, this is not a concern, because the slip is tolerated without any additional computation by applying the proposed ADPLL system solution in Fig. 19. Fig. 20 shows an example of the ADPLL system-level operation in the time domain based on the principle in Fig. 19. Note that the REF edge and the DCO edge rise simultaneously at the thirteenth period of the DCO and a slip occurs. However, the DCO period count and the TDC measurement are extended by the same value, so the output of the SUM is still zero, which means the slip has no impact on the aggregate operation at the system level.

$FCW = T_{REF}/T_{DCO} = 3.25$; $i$ is the number of elapsed DCO periods. All edge positions are given in units of $T_{DCO}$. The last two columns show the fifth REF edge without and with the slip.

<table>
<thead>
<tr>
<th>Quantity</th>
<th>Edge 1</th>
<th>Edge 2</th>
<th>Edge 3</th>
<th>Edge 4</th>
<th>Edge 5 (no slip)</th>
<th>Edge 5 (slip)</th>
</tr>
</thead>
<tbody>
<tr>
<td>REF R-Edge (Leading Signal)</td>
<td>0</td>
<td>$3\frac{1}{4}$</td>
<td>$6\frac{1}{2}$</td>
<td>$9\frac{3}{4}$</td>
<td>13</td>
<td>13</td>
</tr>
<tr>
<td>RRD R-Edge (Triggering Signal)</td>
<td>0</td>
<td>4</td>
<td>7</td>
<td>10</td>
<td>13</td>
<td>14</td>
</tr>
<tr>
<td>∑FCW (+)</td>
<td>0</td>
<td>$+3\frac{1}{4}$</td>
<td>$+6\frac{1}{2}$</td>
<td>$+9\frac{3}{4}$</td>
<td>+13</td>
<td>+13</td>
</tr>
<tr>
<td>$TDC/T_{DCO}$ (+)</td>
<td></td>
<td>$+\frac{3}{4}$</td>
<td>$+\frac{1}{2}$</td>
<td>$+\frac{1}{4}$</td>
<td>0</td>
<td>+1</td>
</tr>
<tr>
<td>i (−)</td>
<td>0</td>
<td>−4</td>
<td>−7</td>
<td>−10</td>
<td>−13</td>
<td>−14</td>
</tr>
<tr>
<td>SUM</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>

Fig. 20 ADPLL timing and computation example (FCW=3.25)
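
The slip tolerance illustrated by Fig. 20 can be reproduced with exact rational arithmetic. This is only a sketch of the bookkeeping, under the assumption that the loop sums the accumulated FCW and the normalized TDC output and subtracts the DCO period count at the RRD edge; the function names `rrd_edge` and `loop_terms` and the `slip` flag are hypothetical.

```python
import math
from fractions import Fraction


def rrd_edge(ref_edge, slip):
    """DCO period index of the RRD rising edge for a REF edge (units of T_DCO)."""
    if ref_edge.denominator == 1:              # REF and DCO edges coincide
        return int(ref_edge) + (1 if slip else 0)
    return math.ceil(ref_edge)                 # first DCO edge after the REF edge


def loop_terms(fcw, k, slip=False):
    """Return (sum of FCW, DCO count i, TDC/T_DCO, SUM) at the k-th REF edge."""
    ref_edge = fcw * k                         # accumulated FCW
    i = rrd_edge(ref_edge, slip)               # counted DCO periods
    tdc = i - ref_edge                         # measured phase difference
    return ref_edge, i, tdc, ref_edge + tdc - i


if __name__ == "__main__":
    fcw = Fraction(13, 4)                      # FCW = 3.25
    for k in range(1, 5):
        print(k, loop_terms(fcw, k, slip=(k == 4)))
```

When the slip occurs, the DCO count and the TDC term both grow by one DCO period, so the sum stays at zero.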
Chapter 4: TDC Timing Considerations for Higher Reference Frequency

TDC operating-period performance is seldom discussed, because most applications run at a low REF frequency and have sufficient time margins. However, a high REF frequency is becoming increasingly popular as a way to achieve better phase noise performance. Therefore, the operating-period performance will play a more important role in TDC research aimed at an optimal solution, especially in scenarios which need a high reference frequency (>50MHz). The operating period ($P_L$) is defined as the time period that starts at the rising edge of the triggering signal and ends at the guaranteed moment when the TDC has produced a final output for the ADPLL system computation. The purpose of the operating-period analysis is primarily to estimate whether a certain kind of TDC solution is feasible for a specific reference frequency. It can also predict the upper limit of the reference frequency usable with this TDC solution. Moreover, knowing the time margin of a TDC solution enables a comprehensive optimization, not only of the TDC but also of the end-to-end synthesizer design.
4.1 General Operating-Time Procedure

Prior to a detailed discussion of the operating-period performance analysis, a general operating procedure of a TDC is described as follows (Fig. 21):

a) $P_A$: The active period starts with the rising edge of the leading signal and stops at the moment of the rising edge of the triggering signal. The active period corresponds to the phase difference of a TDC.

b) $P_C$: The capturing period is regarded as the time period that a TDC takes to ensure all necessary states of arbiters have been stored.

c) $P_E$: The evaluation period indicates how long a TDC spends to process the output of the arbiters and produce a final quantitative binary number which shows the measured phase difference. Usually this period is relatively stable and less than a few ns in a variety of TDC solutions, unless the number of the arbiters is too large (more than 200) or the computations are complicated. 4ns of $P_E$ is assumed for all TDCs for comparison in this thesis.

d) $P_M$: The margin period is the remaining time length after subtracting $P_C$ and $P_E$ from the REF period $P_{REF}$.

e) $P_S$: The period for ADPLL system computation and operation. $P_S$ is expected to occupy the next REF period.
Therefore the REF period, $P_{REF}$ and the maximum value of $f_{REF}$, $f_{REF,\text{MAX}}$ can be expressed by:

$$P_{REF} = P_C + P_E + P_M \quad \text{Eq.(18)}$$

$$f_{REF,\text{MAX}} = 1/(P_C + P_E) \quad \text{Eq.(19)}$$

Fig. 21 Operating-period Performance Comparison Between (a) VRTDC and (b) Phase-Scaled Vernier TDC.

4.2 Operating-Period Performance Comparison
First, a quantitative operating period analysis is performed for the presented phase-scaled Vernier TDC. According to the principle of the phase-scaled Vernier TDC, the mandatory fine-resolution phase length (i.e. normal length, $T_{nor}$) is 80ps in this case and the Vernier TDC core will need 40 stages with each stage having a resolution of 2ps (30ps-28ps). The guaranteed capturing period is determined by:

$$P_C = \left(\frac{T_{nor}}{\Delta \tau}\right) \cdot \tau_1 = \left(\frac{80\text{ps}}{2\text{ps}}\right) \cdot 28\text{ps} \approx 1.1\text{ns} \quad \text{Eq.(20)}$$

The length of the evaluating period depends on the complexity and the parameters of the evaluator. Choosing 4ns for the evaluation period will be adequate in most modern CMOS processes for a moderate number of inverter stages. Thus, for a 25MHz REF, the margin period is:

$$P_M = P_{REF} - P_C - P_E = 40\text{ns} - 1.1\text{ns} - 4\text{ns} \approx 35\text{ns} \quad \text{Eq.(21)}$$

Also note that the highest accepted reference frequency is:

$$f_{REF,MAX} = \frac{1}{P_C + P_E} \approx 200\text{MHz} \quad \text{Eq.(22)}$$

As for a 15-stage Vernier Ring TDC, the mandatory fine-resolution phase length (i.e. normal length, $T_{nor}$) is $(2N-1)\cdot \tau_2=870\text{ps}$. The guaranteed capturing period is determined by:

\[
P_C = \left( \frac{T_{nor}}{\Delta \tau} \right) \cdot \tau_1 = \left( \frac{870\text{ps}}{2\text{ps}} \right) \cdot 28\text{ps} \approx 12\text{ns} \quad \text{Eq.(23)}
\]

Similarly, for a 25MHz REF, the margin period is only 24ns and the highest accepted reference frequency is 62MHz. The above results are summarized in Table 4, which also includes the result of a traditional Vernier TDC. Note from Table 3 and Table 4 that a traditional Vernier TDC may need an extremely large number of delay stages under demanding specifications (fine resolution and wide phase-detection range), resulting in an impractical solution. Thus, for the Vernier type, the phase-detection range and the corresponding number of delay stages are a tough limitation, although it can achieve a fine resolution. Compared to the VRTDC, the phase-scaled Vernier TDC can accept an \( f_{REF,MAX} \) about 3.2 times higher (200MHz/62MHz), which can improve the phase noise performance (caused by the TDC time resolution) by over 5 dB. In fact, thanks to the ring-less structure of its triggering path, the phase-scaled Vernier TDC needs only a one-off latching action, whereas the ring structure of the triggering path in a Vernier ring TDC reuses the delay chains and arbiters and therefore needs a multi-latching and updating function, which demands a longer operating period.
Table 4: Improvement of the Phase Noise Performance with Different Accepted Reference Frequency (1GHz of DCO output)

<table>
<thead>
<tr>
<th>Specification</th>
<th>Symbol</th>
<th>Vernier</th>
<th>15-stage Vernier Ring</th>
<th>Phase-scaled Vernier</th>
</tr>
</thead>
<tbody>
<tr>
<td>TDC Time Resolution</td>
<td>$T_{res}(\Delta \tau)$</td>
<td>2ps</td>
<td>2ps</td>
<td>2ps</td>
</tr>
<tr>
<td>Normal-Inverter Delay</td>
<td>$\tau_2 &amp; \tau_1$</td>
<td>30ps&amp;28ps</td>
<td>30ps&amp;28ps</td>
<td>30ps&amp;28ps</td>
</tr>
<tr>
<td>Mandatory Fine Resolution Phase</td>
<td>$T_{NOR}$</td>
<td>NO</td>
<td>(2*N-1)*$\tau_2$=870ps</td>
<td>3*$\tau_2=90$ps</td>
</tr>
<tr>
<td>Effective N stages</td>
<td>N_e</td>
<td>500</td>
<td>30</td>
<td>40</td>
</tr>
<tr>
<td>Max Capture Period</td>
<td>$P_{C,max}$</td>
<td>500*28ps=14ns</td>
<td>(870ps/2ps)*28ps=12ns</td>
<td>40*28ps=1ns</td>
</tr>
<tr>
<td>Operating-period</td>
<td>$P_E + P_C$</td>
<td>14n+4n=18ns</td>
<td>12n+4n=16ns</td>
<td>1n+4n=5ns</td>
</tr>
<tr>
<td>Highest REF Frequency</td>
<td>$REF_{max}$</td>
<td>1/(14n+4n)=55MHz</td>
<td>1/(12n+4n)=62MHz</td>
<td>1/(1n+4n)=200MHz</td>
</tr>
<tr>
<td>Phase noise performance Improvement</td>
<td></td>
<td>0</td>
<td>0.5dB</td>
<td>5.6dB</td>
</tr>
</tbody>
</table>
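
The arithmetic behind Table 4 can be reproduced with a short script. This is a minimal sketch, assuming the rounded capture periods listed in the table (14ns, 12ns and 1ns), the 4ns evaluation period, and that the phase noise improvement scales as 10·log10 of the reference-frequency ratio relative to the traditional Vernier TDC. The function and variable names are illustrative and do not come from the thesis.

```python
import math

P_E_NS = 4.0  # evaluation period chosen in the text (ns)

# Rounded maximum capture periods as listed in Table 4 (ns)
CAPTURE_PERIOD_NS = {
    "Vernier": 14.0,
    "15-stage Vernier Ring": 12.0,
    "Phase-scaled Vernier": 1.0,
}


def max_ref_frequency_mhz(capture_ns, evaluate_ns=P_E_NS):
    """Highest accepted reference frequency, 1 / (P_C + P_E), in MHz."""
    return 1e3 / (capture_ns + evaluate_ns)


def improvement_db(f_ref_mhz, f_baseline_mhz):
    """Assumed scaling: improvement equals 10*log10 of the f_REF ratio."""
    return 10.0 * math.log10(f_ref_mhz / f_baseline_mhz)


baseline = max_ref_frequency_mhz(CAPTURE_PERIOD_NS["Vernier"])
summary = {}
for name, p_c in CAPTURE_PERIOD_NS.items():
    f_max = max_ref_frequency_mhz(p_c)
    summary[name] = (f_max, improvement_db(f_max, baseline))
    print(f"{name}: f_REF,max = {f_max:.1f} MHz, "
          f"improvement = {summary[name][1]:.1f} dB")
```

With these inputs the script gives improvements of about 0.5dB and 5.6dB for the Vernier ring and phase-scaled Vernier columns, in line with the last row of the table.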
Table 5: Comparison of Several TDC Architectures

<table>
<thead>
<tr>
<th>Specification</th>
<th>Traditional TDC</th>
<th>Ring-Osci TDC (9-stage)</th>
<th>Vernier TDC</th>
<th>Vernier Ring TDC (15-stage) [13]</th>
<th>Phase-scaled Vernier TDC</th>
</tr>
</thead>
<tbody>
<tr>
<td>Resolution</td>
<td>10ps</td>
<td>10ps</td>
<td>2ps*</td>
<td>2ps</td>
<td>2ps</td>
</tr>
<tr>
<td>Effective stages</td>
<td>12.5n/10p=1250</td>
<td>9</td>
<td>6250</td>
<td>30*</td>
<td>40</td>
</tr>
<tr>
<td>Operating Period</td>
<td>4ns</td>
<td>4ns</td>
<td>183ns</td>
<td>16ns</td>
<td>5ns</td>
</tr>
<tr>
<td>Highest f_REF</td>
<td>250MHz</td>
<td>250MHz</td>
<td>5.5MHz</td>
<td>62MHz</td>
<td>200MHz</td>
</tr>
<tr>
<td>Phase Noise***</td>
<td>0dB</td>
<td>0dB</td>
<td>2.6dB worse</td>
<td>7.9dB better</td>
<td>13dB better</td>
</tr>
<tr>
<td>Area Cost</td>
<td>High</td>
<td>Low</td>
<td>Super high</td>
<td>Medium</td>
<td>Medium</td>
</tr>
<tr>
<td>Power Consumption</td>
<td>High</td>
<td>Low</td>
<td>High</td>
<td>Medium</td>
<td>Low</td>
</tr>
<tr>
<td>Applied field</td>
<td>Wide range,</td>
<td>Narrow range,</td>
<td>Wide range,</td>
<td>Wide range,</td>
<td>Wide range,</td>
</tr>
<tr>
<td></td>
<td>coarse resolution,</td>
<td>high resolution,</td>
<td>high resolution,</td>
<td>medium cost</td>
<td>fine resolution,</td>
</tr>
<tr>
<td></td>
<td>low cost</td>
<td>medium cost</td>
<td>low cost</td>
<td>low cost</td>
<td>low cost</td>
</tr>
</tbody>
</table>

*: effective stages doubles due to ring type of triggering path completing a rotation every other rotation.
Chapter 5: Circuit Implementation and Simulation

5.1 Technology Process and Basic model parameters

The most important aspect of implementing a TDC is reducing the propagation delay of the transistors, especially those in the delay chain, because the required time resolution is generally on the order of picoseconds. The technology used in this research is a 65nm CMOS process with low-$V_T$ deep-well NMOS and PMOS transistors. The delay unit of the delay chain is an inverter, which consists of a PMOS transistor and an NMOS transistor. The propagation delay of an inverter depends on the output capacitance plus the load capacitance, the voltage change and the drain current [17], as shown below (Eq.(24)):

$$t \propto \frac{C}{I} \Delta V \qquad \text{Eq.(24)}$$

Thus, to shrink the delay time of a logic gate, the following parameters could be tuned:

1. Decrease the swing (supply) voltage, $\Delta V$.
2. Reduce loaded capacitance ($C_{Load}$ is a part of C in Eq.(24)).
3. Increase current by increasing the channel width-to-length ratio. In this case, because the length of the transistors has been fixed at 60nm, the width of channel could be increased to reduce the influence of parasitic or load capacitance and decrease the delay.
4. The ratio of PMOS width ($W_P$) and NMOS width ($W_N$) influences the transition response of the inverter as well.
Fig. 22 shows that, at a constant supply voltage and $W/L$ value, the transition response of the inverters slows down with a low $W_P/W_N$ ratio and with a large capacitive load ($C_{load}$). In order to get a sharp transition in the time domain, a large $W_P/W_N$ ratio and small loading are desired.

Fig. 22. Propagation delay of inverters with various $W_P/W_N$ ratios and $C_{load}$. 
Fig. 23 shows an inverter’s DC transfer simulation sweeping the ratio $W_P/W_N$. A ratio of 3:1 for the width of PMOS to NMOS is a good starting point for the size of inverters because with this choice, a 600mV input corresponds to a 600mV output.

5.2 Vernier TDC Core Circuit

As mentioned in the previous chapter, a Vernier TDC core consists of a slow path inverter delay chain, a fast path inverter delay chain, and a set of D Flip-Flops.

5.2.1 Slow path inverter delay chain

The implementation of an inverter delay chain as shown in Fig. 24 should follow these guidelines:

- Identical propagation delay time in each stage.
- Rise time and fall time should be identical, so that the delay chain behaves the same in odd rotations as in even rotations.
- Reduce the effect of the load from the D Flip-Flop as much as possible. Do this by making the stage transistors at least three times larger than the D Flip-Flop buffer transistors. For example, assuming a W/L of 200nm/60nm for the D Flip-Flop buffer transistors, the stage should have a W/L >= 600nm/60nm to keep the impact of the load on the inverter delay chain reasonably small.

Because the first functional inverter is actually a NAND gate, its finger number, width and $W_P/W_N$ ratio need to be tuned to make sure this stage behaves the same as the other inverters.

Fig. 25 compares the behavior before and after optimization. Initially the odd rotation (70ps) and the even rotation (90ps) are unequal due to the impact of the NAND gate; after optimization all rotations are fixed at 80ps, i.e. the normal-length phase.
Fig. 25. NAND Optimization for equal length of odd and even rotations.

5.2.2 D Flip Flops

Fig 26(a) shows a dynamic inverting flip-flop built from two back-to-back dynamic latches [18]. To reduce the delay of the circuit, there is no buffer at node Q. Flip-flops usually use a single clock $\varnothing$ and generate its complement $\overline{\varnothing}$ locally. Inevitably, $\overline{\varnothing}$ is delayed relative to $\varnothing$, which leaves both the clock and its complement in an intermediate state for a short time (Fig 26(b)). Consequently, both latches will be transparent or meta-stable for a certain period. To avoid this issue, a differential clock pair is needed in this research and will be introduced in the next section.


Fig 26. (a) A D Flip Flop circuit and (b) illustration of non-ideal issue. [18]

In this research, two types of D Flip-Flops are needed, one for the arbiter array and the other for RRD generation. The arbiter array consists of 40 D Flip-Flops which are used to capture the output state of each stage of the slow path inverter delay chain. Thus, those D Flip-Flops are the load of the delay chains and the value of their input capacitance impacts the speed of the circuit response. The clock load should be kept as small as possible aiming at fast speed and low power consumption. Table 6 indicates the initial configuration of the design parameters.

<table>
<thead>
<tr>
<th colspan="2">Static Transistors Width</th>
<th colspan="2">Transmission Gate Width</th>
<th rowspan="2">Fingers</th>
</tr>
<tr>
<th>PMOS</th>
<th>NMOS</th>
<th>PMOS</th>
<th>NMOS</th>
</tr>
</thead>
<tbody>
<tr>
<td>600nm</td>
<td>200nm</td>
<td>300nm</td>
<td>300nm</td>
<td>1</td>
</tr>
</tbody>
</table>

Table 6: D Flip Flop initial parameters configuration.

The RRD generator’s D Flip-Flops should be fast and heavy duty because they drive the fast path delay chain, which in turn drives the clock load of the arbiter array.

5.3 Differential RRD and Trigger Clock

As discussed in 5.2.2, a clock with a synchronous complement removes the possibility of errors and intermediate states when capturing the state of the D Flip-Flops. To obtain a simultaneous clock and complement, internal conversion by an inverter is no longer used, because of the delay from $\varnothing$ to $\overline{\varnothing}$. Instead, a differential trigger system has been set up. As shown in Fig. 27, RRD and its complement are generated respectively by two identical fast D Flip-Flops, one taking the reference signal and the other its complement as input. Thus, $RRD$ and $\overline{RRD}$ together drive the differential fast path delay chain for simultaneous generation of $\varnothing_i$ and $\overline{\varnothing}_i$.

5.4 Encoding Circuit

As described in section 3.6, a coordinated-determination device distributes the 40 outputs into 5 sections sequentially, and only one of the 5 sections is useful and valid at any moment. This simplifies the encoding operation from a 40/6 priority encoder to an 8/3 priority encoder, which considerably reduces the circuit complexity. Fig. 28 shows the schematic circuit (a) and the truth table (b) of an 8/3 priority encoder.

Fig. 27. Differential RRD and differential fast path clock trigger.
Fig. 28. (a) 8/3 priority encoding circuit and (b) its truth table.

Note that using a set of XOR gates to implement the edge finder (Fig. 29) can further simplify the priority encoder.
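
The behavior of an XOR-based edge finder followed by an 8/3 priority encoder can be sketched in software. This is only a behavioral illustration, assuming that each 8-output section produces a thermometer-style pattern (a run of ones followed by zeros) and that the encoder reports the position of the last stage that fired; the function names and the bit ordering are assumptions, not the circuit of Fig. 28 or Fig. 29.

```python
def edge_finder(bits):
    """XOR each arbiter output with its neighbour to mark the 1 -> 0 transition.

    `bits` is an 8-element thermometer-style section output,
    e.g. [1, 1, 1, 0, 0, 0, 0, 0]. A trailing 0 is appended so that an
    all-ones section marks its last position.
    """
    padded = list(bits) + [0]
    return [padded[i] ^ padded[i + 1] for i in range(len(bits))]


def priority_encode_8to3(inputs):
    """Return (valid, code), where code is the index of the highest set input."""
    for index in range(7, -1, -1):
        if inputs[index]:
            return 1, index
    return 0, 0  # no input set: valid flag is 0


section_output = [1, 1, 1, 0, 0, 0, 0, 0]
edges = edge_finder(section_output)
valid, code = priority_encode_8to3(edges)
print(edges, valid, format(code, "03b"))
```

Because the edge finder leaves at most one bit set for a clean thermometer pattern, the priority logic after it only has to locate a single active input, which is where the simplification comes from.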
5.5 Phase Regulator and Simulation

The circuit structures and parameters of the slow delay chain, the arbiter array, the fast path delay chains and the differential RRD generator have been discussed and determined. In this section the Vernier TDC core is combined with the phase regulator circuit. As mentioned, the core idea of this research is to employ a phase regulator to convert the random phase (time) difference into normal-length phase pieces, in this case 80ps long. The previous chapter introduced a couple of options for realizing the phase regulator: a) partially reuse the slow delay chain (Fig. 12), or b) insert an independent ring oscillator in front of the Vernier core circuit (Fig. 13). Option b) is chosen here for the simulation because it makes the decoupling between the phase regulator circuit and the core circuit easy to achieve.
Fig. 30 illustrates the combined schematic circuit of the phase regulator and the Vernier Core, and in Fig. 31 edge finders and priority encoders are introduced to get the 5 section outputs in 3 bits and one carry out bit.


Fig. 30. Vernier core circuit and phase regulator.
Fig. 31. Diagram of the TDC core and coding output.

Fig. 32 indicates the ideal initial conditions of the simulation, which include the reference signal rising at 30ps and the phase regulator reversing every 80ps. This configuration sets rotations with an 80ps period. When the DCO edge arrives at any given time, ideally the Vernier TDC core can quantify the last fractional phase piece accurately. However, in practice this is not the case. Fig. 33 shows the simulated real waveforms of REF and the phase regulator output. Except for the first transition, which is driven by REF, all transitions are triggered by the ring-structure feedback. By optimizing the parameters of the ring circuit of the phase regulator, a transition period of 80ps can be achieved (measured from one mid-voltage crossing to the next), except for the first transition period, which is about 86ps. This difference is due to the offset between the REF trigger and the ring-structure feedback trigger of the NAND gate. This difference of around 6ps can therefore be regarded as an offset of the phase measurement, which can be canceled out later in the final calculation procedure. Alternatively, this phase offset can be removed by tuning the parameters of the pre-logic timing circuit as discussed in 3.7.

Fig. 32. Ideal Simulation Conditions with REF and Phase Regulator Output.
5.6 TDC Core Circuit Simulation Results and Optimization

Now the simulation of the TDC core circuit is complete. In Fig. 34, the DCO arrival time is swept along the x axis over the second and third rotations (2*80ps=160ps). The y axis shows the output of the 40-stage Vernier core circuit as 5 curves, one per section, each of which corresponds to eight outputs as shown in Table 7. The two rising ramps (each covering 0 to 39) in the simulation result present the output over an 80ps range, which illustrates that the 40 outputs resolve an 80ps phase piece in an almost linear trend. Therefore, the concept of the phase-scaled Vernier TDC and its implementation have been set up and verified.

The first rotation of the phase regulator deserves special attention, because it is the only rotation triggered by the REF signal rather than by the ring-structure feedback signal. As discussed, this rotation is 6ps longer than the normal length (80ps) due to the phase offset between the REF trigger and the ring-structure feedback trigger of the NAND gate.

Fig. 35 presents how the Vernier core output changes with different DCO arrival times during the first rotation, i.e. from 0ps to 86ps after the REF signal edge. According to the simulation result, the plot can be separated into 3 regions:

1. The linear region is from 16ps to 86ps away from the REF edge, which is the region where the circuit is able to indicate the result of the measurement accurately.

2. The offset region is from 0ps to 6ps away from the REF edge. This 6ps offset will be cancelled out in the calculation process.

3. The error zone is from 7ps to 16ps away from the REF edge. This error is due to the impact of the meta-stability caused by the edge of the DCO signal being very close to that of REF. Note that the slip issue caused by this meta-stability has already been considered and dealt with as described in Chapter 4.

The operation of the ADPLL system can tolerate this error without significant performance degradation due to the following two reasons:

- The possibility of the occurrence of this error is very low, in addition there could be some way to try to push the DCO edge away from this tiny region or identify this occurrence then avoid counting it.

- The attenuation factor in DLF will almost relieve the impact from this one-time small error.
Fig. 35. Vernier Core Output in First Rotation of the Phase Regulator.

Now the combination of the phase regulator and the Vernier TDC core has been implemented, and the simulation result shows that it works as expected. Only one of the 5 sections has a non-zero output at any time, and the overall trend of the outputs of the 5 sections rises linearly. The next step is the implementation of the evaluator unit, which will be designed to achieve the following functions: a) Coordinated-determination device (5-1 multiplexer and corresponding determination); b) Odd-even rotation compensation determination; c) Normal-length phase counter determination; d) TDC final output calculation.
5.7 Evaluator Implementation and Simulation

5.7.1 Coordinated-Determination Device

In this section, a part of evaluator called the coordinated-determination device will be discussed and implemented. As discussed in section 3.6, in order to simplify the complexity of the priority encoder from 40/6 bits to 8/3 bits, $Q_R$ should be separated into 5 sections, each of which covers 16ps with 8 outputs (corresponding to the lower 3 bits of the core output, $B<0:2>$) (Fig. 36). The value of the section number is used in a multiplexer to decide which of 5 inputs will become the output; as well this number corresponds to the higher 3 bits of the core output ($B<3:5>$). Fig. 37 shows the logic implementation of the coordinated-determination device. Note that the advantage of the implementation over the initial diagram (Fig. 36) is to use $Q_R$ itself rather than $Q_M$ to determine the multiplexer and the higher 3 bit core output, by taking advantage of the exclusive feature of the active section. The OR gate $f_1$ operation (Fig. 37(b)) is used to distinguish which section is active (non zero).
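
The selection performed by the coordinated-determination device can be illustrated with a small behavioral model. This is a minimal sketch, assuming each of the 5 sections delivers a (valid, code) pair in which `valid` plays the role of the OR-gate flag and `code` is the 3-bit encoder result; the function name and data layout are hypothetical and do not reproduce the logic of Fig. 37.

```python
def coordinated_determination(section_codes):
    """Combine five 3-bit section codes into one 6-bit core output.

    `section_codes` holds (valid, code) pairs for sections 0..4. At most one
    section is expected to be active at any time. If no section is active the
    function returns 0, which this simple model cannot distinguish from
    section 0 with code 0.
    """
    active = [i for i, (valid, _) in enumerate(section_codes) if valid]
    if not active:
        return 0
    if len(active) > 1:
        raise ValueError("more than one active section")
    section = active[0]
    low_bits = section_codes[section][1]  # lower 3 bits from the 8/3 encoder
    return (section << 3) | low_bits      # section number gives the upper 3 bits


codes = [(0, 0), (0, 0), (1, 5), (0, 0), (0, 0)]
print(coordinated_determination(codes))  # section 2, code 5
```

The active section index acts as both the multiplexer select and the upper bits of the result, so no separate 40-input encoder is needed.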
5.7.2 Even-rotation Compensation and optimization

In section 3.5, it was shown that the both-edge triggered counter, which records every rotation of the ring oscillator, can be optimized to a one-edge (rising) triggered counter that only records every other rotation. This loosens the demand on the response speed of the counter circuitry and doubles the phase detection range of the TDC. Therefore, the updated phase counter counts every 160ps instead of every 80ps. To implement this solution, as shown in Fig. 15, an even-rotation compensation operation is needed. The examples in Table 8 show how the final TDC result is calculated from the odd/even rotation word and the normal phase counter word. Thus, the evaluator unit should be modified as shown in Fig. 38, and the circuit realization of the evaluator follows this version. The even-rotation compensation word is zero during odd rotations and high during even rotations, when it forwards the value 40 (in binary) to the final result. Also, every time the double-phase counter is triggered, one more “80” is added to the final result.

Table 8: Examples of Final output calculation

<table>
<thead>
<tr>
<th></th>
<th>Example1</th>
<th>Example2</th>
<th>Example3</th>
</tr>
</thead>
<tbody>
<tr>
<td>Core output (0~40)</td>
<td>15</td>
<td>15</td>
<td>15</td>
</tr>
<tr>
<td>Original Rotation Number</td>
<td>1</td>
<td>2</td>
<td>6</td>
</tr>
<tr>
<td>Odd/Even Rotation (0/1)</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>Normal Phase Counter (0~255)</td>
<td>0</td>
<td>0</td>
<td>2</td>
</tr>
<tr>
<td>Final output</td>
<td>15+0*40+0*80=15</td>
<td>15+1*40+0*80=55</td>
<td>15+1*40+2*80=215</td>
</tr>
<tr>
<td>Phase(time) Difference</td>
<td>2ps*15=30ps</td>
<td>2ps*55=110ps</td>
<td>2ps*215=430ps</td>
</tr>
</tbody>
</table>
Fig. 38. Diagram of the modified TDC evaluator.
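
The final output calculation illustrated by the examples in Table 8 can be written as a short function. This is a minimal sketch, assuming the final code is the core output plus 40 for an even rotation plus 80 per count of the double-phase counter, as inferred from the final values in the table (15, 55 and 215); the function and constant names are illustrative.

```python
FINE_RESOLUTION_PS = 2    # fine time resolution of the TDC
CODES_PER_ROTATION = 40   # one 80ps normal-length phase equals 40 fine steps


def tdc_final_output(core_output, even_rotation_word, phase_counter):
    """Combine the core output, the odd/even rotation word and the counter."""
    return (core_output
            + even_rotation_word * CODES_PER_ROTATION
            + phase_counter * 2 * CODES_PER_ROTATION)


def phase_difference_ps(final_output):
    """Convert the final TDC code into a time difference in picoseconds."""
    return final_output * FINE_RESOLUTION_PS


# (core output, odd/even rotation word, normal phase counter) from Table 8
for example in [(15, 0, 0), (15, 1, 0), (15, 1, 2)]:
    code = tdc_final_output(*example)
    print(example, code, phase_difference_ps(code))
```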

Fig. 39 and Fig. 40 present the simulation result of the final output of the TDC based on the new evaluator solution. The trend of the final output curve is as expected except for two unwanted spurs which happen around the places of the transition between two adjacent rotations. Note that the spurs that happen in Fig. 39 are due to the occurrence of the even rotation word not being aligned in the time domain with that of the TDC core output. Similarly, the spur in Fig. 40 is due to non-alignment in time domain of the double-phase counter, even rotation word and the TDC core output. In other words, ideally the decision of the rotation edge by the phase regulator should happen at the time when the decision of the completion of the fifth section by the Vernier TDC core is made, however practically they are not aligned due to the difference in time delays between the different determination circuits.
Fig. 39. Simulation of the Final Output of the TDC and the alignment issue.

One way to tackle the alignment issue is to introduce a logic determination circuit that links the three events and forces their output actions to be simultaneous. Notice that the occurrence of the rotation word always precedes the completion of section 5. Therefore, it sometimes wrongly starts the compensation early. If this situation can be predicted, the compensation can be prevented from being added to the output. Two detection conditions are therefore introduced: a) whether the compensation control word arrived within the last 20ps (edge word); b) whether the Vernier TDC core circuit is in the fifth section and has not yet completed it (f5 word). If both conditions are met, the even compensation word should not be considered and the output is its complement ($\overline{W_{loop}}$).

The truth table of these logical determinations is shown in Table 9. By taking advantage of a Karnaugh map, a sum of products expression can be figured out as:

$$W_{evenrotation} = W_{loop} \cdot \overline{edge} + W_{loop} \cdot \overline{f5} + f5 \cdot edge \cdot \overline{W_{loop}} \quad \text{Eq.(25)}$$

Fig. 41 is the logic implementation of Eq.(25), in which looppuls stands for the $W_{loop}$.

<table>
<thead>
<tr>
<th>Original rotation word, $W_{loop}$</th>
<th>During Edge zone? ($\overline{edge}$)</th>
<th>During section 5? ($\overline{f5}$) (see Fig. 37(b))</th>
<th>Updated even rotation word, $W_{evenrotation}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>X</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>X</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
</tbody>
</table>
Fig. 41. The logic implementation of the alignment between the even rotation word and the core output.
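
Eq.(25) can be checked exhaustively against the rows of the truth table. This is a minimal sketch, assuming the three inputs are single bits and that the second and third table columns hold the values of edge and f5 themselves; the function name is illustrative.

```python
def even_rotation_word(w_loop, edge, f5):
    """Eq.(25): W_loop*~edge + W_loop*~f5 + f5*edge*~W_loop, inputs are 0 or 1."""
    def invert(bit):
        return 1 - bit
    return ((w_loop & invert(edge))
            | (w_loop & invert(f5))
            | (f5 & edge & invert(w_loop)))


# Rows of the truth table: (W_loop, edge, f5) -> W_evenrotation.
# None marks a don't-care (X) entry, which is expanded to both 0 and 1.
TRUTH_TABLE = [
    ((0, 0, None), 0),
    ((1, 1, 0), 1),
    ((1, 1, 1), 0),
    ((1, 0, None), 1),
    ((0, 1, 0), 0),
    ((0, 1, 1), 1),
]


def matches_truth_table():
    """Return True if Eq.(25) reproduces every row of the truth table."""
    for (w_loop, edge, f5), expected in TRUTH_TABLE:
        for f5_value in ((0, 1) if f5 is None else (f5,)):
            if even_rotation_word(w_loop, edge, f5_value) != expected:
                return False
    return True


print(matches_truth_table())
```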

5.8 The Whole TDC Architecture and Simulation

Fig. 42 is the optimized TDC system circuit diagram, which includes the Vernier TDC core and phase regulator, the edge finder and priority encoders, the coordinated-determination device (5-1 multiplexer), the even-rotation alignment and compensation device, the loop counter alignment and compensation device and the final TDC output calculation unit. Fig. 43 illustrates the optimal simulation result, in which the final TDC output curve is as expected.

Fig. 42. TDC system block diagram.
Fig. 43. Final simulation of the final output of the TDC.
Chapter 6: ADPLL System Calculation and Simulation

6.1 ADPLL System Specifications

Based on the future potential requirement of the wireless communication industry, the collaborating company has provided a demanding specification (Fig. 44) of ADPLL performance requirements. In order to achieve this, the ADPLL specification needs to be broken down into a couple of sub-system specifications for the desired TDC, DCO and DLF.

Fig. 44. ADPLL System Specifications.

Fig. 45 shows the conversion from phase noise performance to RMS jitter performance. It shows that the desired phase noise performance can meet 60fs of RMS jitter. However, a common capability of a VCO’s phase noise at 10MHz offset is around -155dBc/Hz for high frequency (>10GHz) applications. A reasonable thermal noise floor is -165dBc/Hz. Secondly, note that the difference between 1MHz offset and 10MHz offset is 30dB, exceeding the typical value of -20dB/decade slope for phase noise.
Therefore, the phase noise performance has been modified to -150dBc/Hz@10MHz offset and -165dBc/Hz for the noise floor. Fig. 46 updates the contrast between phase noise and RMS jitter performances and it turns out that RMS phase Jitter is worse at 63fs.

**Phase Noise & RMS Jitter @ 3GHz**

Fig. 45. Original Phase noise & RMS jitter conversion of the ADPLL
Fig. 46. Updated Phase noise & RMS jitter conversion of the ADPLL

6.2 Phase Noise Contribution and Distribution

The total phase noise is contributed by the raw VCO noise, TDC time resolution noise and DCO quantization noise (Eq.(26)).

\[ P_{\text{Total}} = P_{\text{rawVCO}} + P_{\text{TDC Resolution}} + P_{\text{DCO Quantization}} \qquad \text{Eq.(26)} \]

In general, the TDC noise is supposed to be equal to VCO noise at the corner frequency. The TDC noise is flat in-band and rolls off at 20dB/decade out of band. The noise of VCO in-band has an extra suppression which relies on the type of DLF employed. Fig. 47 shows the contribution of the various elements at various frequency offsets and the total phase noise that can be achieved by integrating all of them. If we work this calculation backwards, it can guide us to a distribution of the phase noise of each element by knowing the system phase noise performance.
In order to achieve the above specifications, the following guidelines will be fulfilled:

- Ensure raw VCO phase noise between 0.1 MHz and 20MHz (out of band) at least 3dB better than the overall phase noise due to the lack of out of band suppression of the loop filter to raw VCO phase noise (High pass filter characteristic).
- Ensure a sufficiently fine DCO frequency resolution ($\Delta f_{\text{res}}$) so that the quantization noise is appreciably smaller than the VCO's raw phase noise in the out-of-band region and makes a negligible contribution. The DCO quantization phase noise contribution to the system is:

$$L\{\Delta f\} = \frac{1}{12} \left(\frac{\Delta f_{\text{res}}}{\Delta f}\right)^2 \frac{1}{f_R} \qquad \text{Eq.(27)}$$

- Ensure the in-band DCO quantization phase noise is 3dB better than the overall phase noise (-112dB), according to the equation below:

$$L_{\text{DCO-IN}} = \frac{(2\pi)^2}{12} \left(\frac{\Delta f_{\text{res}}}{f_R}\right)^2 \frac{1}{f_R} \frac{1}{\alpha^2} \qquad \text{Eq.(28)}$$

- Ensure the in-band TDC resolution phase noise is 3dB better than the overall phase noise (-112dB), according to the equation below:

$$L_{\text{TDC-IN}} = \frac{(2\pi)^2}{12} \left(\frac{\Delta t_{\text{res}}}{T_{\text{DCO}}}\right)^2 \frac{1}{f_R} \qquad \text{Eq.(29)}$$

- For optimal in-band phase noise performance, a second-order DLF is needed to reduce the impact of the raw VCO phase noise contribution in band due to nearly 40dB/decade of suppression from the corner frequency to DC of the VCO noise by the loop filter.

- The optimal values for Alpha and Rho aiming at achieving the proper loop bandwidth must be determined so that the phase noise of the TDC is equal to the phase noise of the DCO in-band.
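
Eq.(27) and Eq.(29) are simple enough to evaluate numerically. The sketch below is illustrative only: the 50MHz reference, 2ps TDC resolution and 1kHz DCO resolution come from the text, while the 2.5GHz DCO frequency is an assumption (it is not stated in this section) chosen because it reproduces the in-band TDC figure of about -117.84 dBc/Hz. Function names are hypothetical.

```python
import math


def tdc_inband_phase_noise_dbc(dt_res_s, t_dco_s, f_ref_hz):
    """Eq.(29): in-band phase noise due to TDC resolution, in dBc/Hz."""
    linear = (2 * math.pi) ** 2 / 12 * (dt_res_s / t_dco_s) ** 2 / f_ref_hz
    return 10 * math.log10(linear)


def dco_quantization_phase_noise_dbc(df_res_hz, offset_hz, f_ref_hz):
    """Eq.(27): DCO quantization phase noise at a given offset, in dBc/Hz."""
    linear = (1 / 12) * (df_res_hz / offset_hz) ** 2 / f_ref_hz
    return 10 * math.log10(linear)


F_REF = 50e6        # reference frequency (Hz)
DT_RES = 2e-12      # TDC time resolution (s)
T_DCO = 1 / 2.5e9   # assumed DCO period for a 2.5GHz output (s)
DF_RES = 1e3        # DCO frequency resolution (Hz)

tdc_pn = tdc_inband_phase_noise_dbc(DT_RES, T_DCO, F_REF)
dco_pn = dco_quantization_phase_noise_dbc(DF_RES, 100e3, F_REF)
print(f"In-band TDC phase noise: {tdc_pn:.2f} dBc/Hz")
print(f"DCO quantization noise at 0.1MHz offset: {dco_pn:.2f} dBc/Hz")
```

Under these assumptions the DCO quantization term at 0.1MHz offset comes out near -127.78 dBc/Hz, and it falls by 20dB for every decade of offset frequency because of the squared offset term in Eq.(27).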
Fig. 48 illustrates an ADPLL performance calculation and analysis in which the performance of an ADPLL can be determined by inputting the required parameters and specifications. The yellow-filled elements stand for the desired input parameters. Some of the green-filled elements are the performance specifications that an ADPLL can achieve, the others are the requirements which need to be realized in the implementation such as Alpha and Rho. According to the calculator a 2ps time resolution for the TDC can meet the system phase noise performance requirement as long as the DCO resolution can meet the 1kHz requirement. Fig. 48 also indicates the requirement of the VCO and Alpha Rho values, etc.

<table>
<thead>
<tr>
<th colspan="2">PLL Frequency performance Calculation</th>
</tr>
</thead>
<tbody>
<tr>
<td>Fref (MHz)</td>
<td>50</td>
</tr>
<tr>
<td>Period (ns)</td>
<td>20</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>DCO WF Dithering wordlength</th>
<th>DCO WI wordlength</th>
<th>DCO rough range wordlength</th>
<th>DCO freq fine-tuning range</th>
<th>DCO freq whole tuning range</th>
</tr>
</thead>
<tbody>
<tr>
<td>6</td>
<td>18</td>
<td>4</td>
<td>16.777216 GHz</td>
<td>268.435456</td>
</tr>
</tbody>
</table>

**In-band TDC PN**

-117.839 dBc/Hz

**DCO phase noise of quantization**

-114.74 dBc/Hz in-band
-127.792 dBc/Hz 0.1 MHz offset
-147.792 dBc/Hz 1 MHz offset
-167.792 dBc/Hz 10 MHz offset
-187.792 dBc/Hz 100 MHz offset

**RAW VCO Noise**

<table>
<thead>
<tr>
<th>Q factor of VCO</th>
<th>Power of Carrier dBm</th>
<th>Noise Factor of Osc dBc/Hz</th>
<th>Noise Floor dBc/Hz</th>
<th>Freq Offset kHz</th>
<th>RAW VCO Noise dBc/Hz</th>
</tr>
</thead>
<tbody>
<tr>
<td>15</td>
<td>0</td>
<td>1</td>
<td>-155</td>
<td>100</td>
<td>-118.4529714</td>
</tr>
</tbody>
</table>

**RAW VCO Noise Slope of Curve**

-155 10 MHz offset 20 dB/Decade
Fig. 48. ADPLL phase noise calculator and analyzer.

6.3 ADPLL System Modeling and Simulation in Simulink

Building up a Simulink simulation platform is necessary and important in order to estimate and evaluate a certain type of TDC and to verify the performance of a certain TDC and/or PLL. In addition, it is an effective and efficient way to improve the performance of a PLL or a TDC.

Fig. 49 shows a diagram of a basic integer PLL system operation platform. Fig. 50 indicates that the control voltage of the DCO is working as expected. This simulation structure will be improved and refined step by step to a type II all digital PLL.

Fig. 49. Basic integer PLL system simulation platform.
Fig. 50. ADPLL control voltage convergence with time.

Fig. 51 shows a diagram of a PLL system in which a TDC unit and a type I digital filter have been employed based on the basic PLL structure. Fig. 52 illustrates a type II ADPLL with a second pole achieving better noise rejection in band (a further 20dB/decade) compared to the type I counterpart. In addition, a type II filter removes flicker noise and VCO noise. Another advantage of the type II loop is that it removes steady-state frequency error.
Fig. 51. Diagram of a fractional digital type I PLL system.

Fig. 52. Diagram of a fractional digital type II PLL system.

6.4 Comparison between different Loop types and parameters

Several simulations based on the previous type I and type II ADPLL structures have been performed and compared, as shown in Fig. 53. For a type-I ADPLL, the in-band phase noise stays at a high level because the filter does not suppress the VCO in-band phase noise. It also increases as the value of Alpha decreases. Compared with Type-I, a Type-II ADPLL yields lower in-band phase noise due to the extra rejection from the second-order loop. Note that the phase noise at 2kHz offset is due to the bandwidth resolution limitation of the FFT calculation.

Table 10 summarizes the advantages and disadvantages of type I and type II ADPLLs. The next section discusses the preferred mode of operation of an ADPLL based on the conclusions from this table.


**Fig. 53.** ADPLL phase noise comparison between different loop parameters.

**Table 10: Type I and type II loop feature contrast**

<table>
<thead>
<tr>
<th></th>
<th>Type I Loop [20]</th>
<th>Type II Loop</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Advantage</strong></td>
<td>Fast frequency/phase acquisition</td>
<td>No frequency/phase error</td>
</tr>
<tr>
<td></td>
<td></td>
<td>Better phase noise performance</td>
</tr>
<tr>
<td><strong>Disadvantage</strong></td>
<td>Steady-state frequency error</td>
<td>Slow due to ripple of control word of DCO</td>
</tr>
<tr>
<td></td>
<td>Poor in-band phase noise</td>
<td></td>
</tr>
</tbody>
</table>

6.5 Preferred Solution of ADPLL System Operation

According to the analysis of the different types of ADPLLs in section 6.4, a combination of type I and type II ADPLL loop operation is the optimal solution to achieve better phase noise performance as well as a reasonable acquisition time. Specifically, the operation of an ADPLL can be separated into three periods:

1. Acquisition mode period: Use a type I with high gain for fast acquisition. It requires a wide loop bandwidth.
2. Tracking mode period: Use a Type I with low gain for tracking. It requires a narrow bandwidth.
3. Performance mode period: Use a Type II with Proportional-integral controller for the best phase noise performance.
Fig. 54. ADPLL operation with combination of type I and type II. [1]

Table 11: Type I and type II loop feature contrast

<table>
<thead>
<tr>
<th></th>
<th>Alpha</th>
<th>Rho</th>
<th>Note</th>
</tr>
</thead>
<tbody>
<tr>
<td>Acquisition mode</td>
<td>Large (0.1)</td>
<td>0</td>
<td>Type I Loop</td>
</tr>
<tr>
<td>Tracking mode</td>
<td>Small (0.01)</td>
<td>0</td>
<td>Type I Loop</td>
</tr>
<tr>
<td>Performance mode period</td>
<td>Small (0.01)</td>
<td>Non-zero Rho</td>
<td>Type II Loop</td>
</tr>
</tbody>
</table>
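
The three operation periods can be sketched as a small discrete-time simulation, with one step per reference cycle. This is only an illustrative model under stated assumptions, not the implementation used in this work: all quantities are normalized to the reference frequency, the DCO gain is assumed to be perfectly known, and the hand-over between modes (freezing the accumulated correction and restarting the phase error from zero) is an assumption of the sketch. The switching instants and the non-zero Rho value are hypothetical; the Alpha values follow Table 11.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopMode:
    name: str
    alpha: float  # proportional gain
    rho: float    # integral gain (0 means a Type I loop)


# Alpha values follow Table 11; the non-zero rho is a hypothetical choice.
MODES = (
    LoopMode("acquisition", alpha=0.1, rho=0.0),
    LoopMode("tracking", alpha=0.01, rho=0.0),
    LoopMode("performance", alpha=0.01, rho=2.5e-5),
)


def simulate(freq_offset, switch_cycles=(200, 400), n_cycles=4000):
    """Return (mode name, phase error, frequency error) for each reference cycle.

    freq_offset is the initial DCO frequency error, normalized to the
    reference frequency. switch_cycles are hypothetical mode-switch instants.
    """
    base = 0.0       # correction frozen at previous mode switches
    integ = 0.0      # integral path of the current mode
    phase_err = 0.0  # accumulated phase error
    mode_idx = 0
    history = []
    for k in range(n_cycles):
        if mode_idx < len(switch_cycles) and k == switch_cycles[mode_idx]:
            # Assumed hand-over: keep the correction found so far and
            # restart the phase error so the new gain causes no jump.
            old = MODES[mode_idx]
            base += old.alpha * phase_err + integ
            phase_err = 0.0
            integ = 0.0
            mode_idx += 1
        mode = MODES[mode_idx]
        integ += mode.rho * phase_err
        tuning = base + mode.alpha * phase_err + integ
        freq_err = freq_offset - tuning
        history.append((mode.name, phase_err, freq_err))
        phase_err += freq_err  # phase error integrates the frequency error
    return history


history = simulate(freq_offset=2.0)
for k in (0, 50, 199, 200, 399, 400, 3999):
    name, phase_err, freq_err = history[k]
    print(f"cycle {k:4d}  {name:11s}  phase_err={phase_err:+.3e}  freq_err={freq_err:+.3e}")
```

In this model the high-gain type I loop removes most of the initial frequency error within a few tens of reference cycles, after which the low-gain type I loop and then the type II loop take over with a narrower bandwidth.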

Chapter 7: Conclusion and Future Work

7.1 Completed Work

The work completed in this thesis comprises an evaluation of TDC architectures and technologies (Chapter 2), TDC architecture analysis and design (Chapters 3 and 4), TDC implementation, simulation and optimization (Chapter 5), and an analysis of the ADPLL system specifications and their distribution down to the TDC sub-block level (Chapter 6). This research has verified that:

- The proposed TDC architecture meets the specifications derived from the ADPLL requirements, as shown by system-level behavioral simulation in Simulink.
- The proposed TDC architecture offers higher performance, lower power consumption and smaller chip area than the other solutions.
- The proposed TDC can be implemented at the physical level (Virtuoso schematic), and its characteristics are as good as expected after optimization.

7.2 Future Work

From the perspective of the end-to-end chip design process, the design should be fabricated so that a physical chip can be measured and evaluated to verify its functionality. While some of the preliminary work and simulation for this has been done, it is far from complete. Future work should therefore include physical implementation, system verification and optimization. Specifically, a layout and tape-out of the chip should be completed and measurements performed. In addition, because the TDC is part of an ADPLL system, it would be better to set up a circuit-level ADPLL simulation platform for virtual verification. Finally, the simulation results in Section 5.8 show that the linearity of the TDC output has room for optimization, and differential D flip-flop circuits could be investigated for higher time accuracy.

7.3 Contributions and Conclusions

The phase-scaled Vernier TDC design and results were submitted to and approved by the Microsemi Patent Council in March 2016. The corresponding paper is currently on hold and will be submitted once the patent file number has been received from the US Patent and Trademark Office.

The phase-scaled Vernier TDC achieves time resolution and phase detection range comparable to those of the Vernier ring TDC, but is a few times better in core circuit power consumption and improves phase noise performance owing to the higher frequency of the applied reference clock. The TDC time resolution can be improved further at the expense of circuit scale and power consumption.

References


