CCC Workshop on Physics & Engineering Issues in Adiabatic/Reversible Classical Computing


TECHNICAL SESSION II: DEVICE & CIRCUIT TECHNOLOGIES

Device & Circuit Technologies for Reversible Computing: An Introduction







Tuesday, October 6<sup>th</sup>, 2020

Michael P. Frank, Center for Computing Research



Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.

Approved for public release, SAND2020-10309 C

# Workshop Overview

|                        | Day 1 (Mon. 10/5)                           | Day 2 (Tue. 10/6)                                         | Day 3 (Wed. 10/7)                                             | Day 4 (Thu. 10/8)                | Day 5 (Fri. 10/9)                 |
|------------------------|---------------------------------------------|-----------------------------------------------------------|---------------------------------------------------------------|----------------------------------|-----------------------------------|
| Start Time<br>(US PDT) | TECHNICAL SESSION I:<br>FUNDAMENTAL PHYSICS | TECHNICAL SESSION II:<br>DEVICE & CIRCUIT<br>TECHNOLOGIES | TECHNICAL SESSION III:<br>ARCHITECTURE &<br>HIGH-LEVEL TOPICS | FIRST DAY OF<br>WORKING MEETINGS | Second Day of<br>Working Meetings |
| 8:30 am                | Workshop Intro                              | Keynote: Ed Fredkin                                       | M. Frank,                                                     | (9a.) Day 4 Intro.               | (9a.) Day 5 Intro.                |
| 9:20 am                | Mike Frank                                  | Mike Frank<br>Sarah Frost-Murphy                          | G. Snider,<br>N. Yoshikawa, H.                                | Outbriefs from<br>Breakouts      | Outbriefs from<br>Re-Breakouts    |
| 10:00 am               | Norm Margolus                               | Jie Ren                                                   | Thapliyal, R. Wille                                           |                                  |                                   |
| 10:20 am               | Early Break                                 | Early Break                                               | Early Break                                                   | Early Break                      | Early Break                       |
| 10:50 am               | Neal Anderson                               | Kevin Osborn                                              | Erik Demaine                                                  |                                  |                                   |
| 11:10 am               | Subhash Pidaparthi                          | Ralph Merkle                                              | Robert Glück                                                  | Concordance                      | Concordance                       |
| 11:30 am               | Karpur Shukla                               | Joe Friedman                                              | Erik DeBenedictis                                             | Discussion #1                    | Discussion #2                     |
| 11:50 am               | Panel / Q&A                                 | Panel / Q&A                                               | Panel / Q&A                                                   |                                  |                                   |
| 12:10 pm               | Late Break                                  | Late Break                                                | Late Break                                                    | Late Break                       | Late Break                        |
| 12:40 pm<br>until      | Physics Breakout                            | Techno. Breakout                                          | Arch./HL Breakouts                                            | Re-Breakouts                     | Final Breakout &<br>Concordance   |

## Abstract Text

Over the decades, a wide variety of concepts for the physical engineering implementation of reversible computing have been proposed.

To date, the most highly developed approaches are based on *adiabatically* driven microelectronic switching circuits using either semiconducting or superconducting technologies.

Less well-developed, but emerging, are approaches based on the *ballistic* propagation and elastic interaction of localized information-bearing degrees of freedom.

In this talk, I survey various approaches, discuss their pros and cons, and suggest general requirements that novel approaches should try to meet.

## Outline of Talk

Device & Circuit Technologies for Reversible Computing: An Introduction

- I. Motivation & Brief History
- II. Adiabatic Approaches
  - Adiabatic CMOS
  - (Superconducting & nanomechanical approaches to be covered by other speakers)
- III. Ballistic Approaches
  - Brief overview of fluxon-based reversible computing approaches
  - Ballistic Asynchronous Reversible Computing in Superconductors (BARCS)
- IV. Looking Ahead:
  - What kinds of advances in device & circuit technologies are needed?
  - What are some key metrics determining success of any new reversible technology?
  - What are some key requirements that any new candidate reversible technology *must* meet?



## Section I. Motivation & History


#### Why are we here?

- Progress in the energy-efficiency of the conventional (non-reversible) computing paradigm is approaching hard limits, which ultimately trace back to fundamental thermodynamic issues.
  - Industry is already struggling to continue to advance along the traditional scaling path.
- Energy efficiency is a fundamental limiting factor on the economic utility of computing.
  - Without energy efficiency gains, there are diminishing returns from optimizing *every* other aspect of computing.
- Transitioning to the unconventional computing paradigm known as *reversible computing* provides the only physically possible alternative scaling path for allowing the energy efficiency of general digital computing to continue improving indefinitely...
  - And, so far, no fundamental limit to the (even practically) achievable efficiency is known.
- The overall economy is becoming increasingly dependent on computing, as a larger and larger share of economic activity takes place in the cyber realm...
  - Making reversible computing practical thus has the potential to expand *the total future economic value of civilization* (for any given amount of available energy resources) by *indefinitely many* orders of magnitude.

## Semiconductor Roadmap is Ending...


Thermal noise on gate electrodes of minimum-width segments of FET gates leads to significant channel PES fluctuations when $E_g \lesssim 1\text{--}2~\text{eV}$

- Increases leakage, impairs practical device performance
  - Thus, roadmap has minimum gate energy asymptoting to $\sim 2~\text{eV}$

Also, real logic circuits incur many *compounding* overhead factors *multiplying* this limit:

- Transistor width 10-20× minimum width in fast logic.
- Parasitic (junction, etc.) transistor capacitances (~2×).
- Multiple (~2) transistors fed by each input to a given logic gate.
- Fan-out of each gate to a few (~3) downstream logic gates.
- Parasitic wire capacitance (~2×).

Due to all these overhead factors, the energy of <u>each logic bit in real logic circuits is many times larger</u> than the minimum-width gate energy!

- 375-600× (!) larger in ITRS'15.
  - ∴ <u>Practical</u> bit energy for irreversible logic asymptotes to <u>~1 keV</u>!

Practical, real-world logic circuit designs can't just magically cross this ~500× architectural gap!

- ∴ Thermodynamic limits imply much *larger* practical limits!
  - The end is near!



Only reversible computing can take us from ~1 keV at the end of the CMOS roadmap, all the way down to  $\ll kT$ .

## Reversible computing to the rescue!

We reviewed the fundamental physical arguments for reversible computing yesterday:

- Landauer's Principle (when properly understood) fundamentally limits the energy efficiency of conventional, non-reversible approaches to general digital computing.
  - As we discussed, the various critics of this statement simply had a basic conceptual misunderstanding.
- Physical mechanisms for computing that are *logically* reversible can in principle also approach *physical* reversibility, circumventing all limits to the energy efficiency of general digital computing.
  - But, how can we *actually implement* reversible computing in a highly efficient and practical way?
    - This presents a significant challenge for the fields of device physics and device & circuit engineering.
- Some early history of physical implementation concepts:
  - Landauer (1961) described physical implementations of reversible computational operations abstractly, in terms of manipulations of bistable potential energy wells.
  - <u>Bennett (1973)</u> described logically reversible computations *abstractly* (as Turing machines) and pointed out that biomolecular processes (e.g., DNA transcription) can be understood as computational processes that **operate stochastically** and approach thermodynamic reversibility given appropriate chemical potentials.
  - <u>Likharev (1977)</u> described his Parametric Quantron (PQ) Josephson junction circuit, which could implement reversible transformations of bistable potential energy wells in **superconducting circuits**.
  - Fredkin & Toffoli (1980) described an idealized, ballistic *billiard ball model* (BBM) of reversible computation.
  - <u>Bennett (1982)</u> described a (very slow!) macro-scale mechanical implementation of his reversible Turing machine that could operate by Brownian motion.

The rest of this talk (and this session) will focus on more modern approaches.

• But, we'll see that many of the same concepts introduced in the early years still apply!



# Existing Dissipation-Delay Products (DdP) —Non-reversible Semiconductor Circuits

#### Conventional (non-reversible) CMOS Technology:

- Recent roadmaps (e.g., IRDS '17) show Dissipation-delay Product (DdP) decreasing by only <~10× from now to the end of the roadmap (~2033).
  - Note the typical dissipation (per logic bit) at end-of-roadmap is projected to be  $\sim 0.8 \text{ fJ} = 800 \text{ aJ} = \sim 5,000 \text{ eV}.$
- Optimistically, let's suppose that ways might be found to lower dissipation by an additional 10× beyond even that point.
  - That still puts us at 80 aJ =  $\sim$ 500 eV per bit.
- We need at least ~1 eV  $\approx 40 \ kT$  electrostatic energy at a minimum-sized transistor gate to maintain reasonably low leakage despite thermal noise,
  - And, typical *structural* overhead factors *compounding* this within fast random logic circuits are roughly 500×,
  - so,  $\sim 500 \text{ eV}$  is *indeed* probably about the practical limit.
    - At least, this is a reasonable order-of-magnitude estimate.
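The unit conversions behind these figures can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: it assumes a room temperature of 300 K (not stated on the slide) and uses the exact SI values of the Boltzmann constant and the electronvolt; all variable names are illustrative.

```python
# Bookkeeping check of the energy figures quoted above.
# Assumption: "room temperature" is taken as 300 K.
K_B = 1.380649e-23      # Boltzmann constant in J/K (exact SI value)
EV = 1.602176634e-19    # one electronvolt in joules (exact SI value)
T_ROOM = 300.0          # assumed operating temperature in kelvin


def joules_to_ev(energy_j: float) -> float:
    """Convert an energy from joules to electronvolts."""
    return energy_j / EV


# End-of-roadmap dissipation per logic bit: 0.8 fJ = 800 aJ.
end_of_roadmap_ev = joules_to_ev(0.8e-15)

# A further (optimistic) 10x reduction: 80 aJ.
optimistic_ev = joules_to_ev(80e-18)

# Gate energy of about 40 kT at the assumed temperature.
gate_energy_ev = joules_to_ev(40 * K_B * T_ROOM)

# Roughly 1 eV at the gate times a ~500x structural overhead factor.
practical_limit_ev = 1.0 * 500

print(f"0.8 fJ          = {end_of_roadmap_ev:7.0f} eV")
print(f"80 aJ           = {optimistic_ev:7.0f} eV")
print(f"40 kT at 300 K  = {gate_energy_ev:7.2f} eV")
print(f"1 eV x 500      = {practical_limit_ev:7.0f} eV")
```

The two routes to the practical limit (roadmap value divided by 10, and gate energy times overhead) land at the same order of magnitude, which is the point of the slide.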



#### Existing Dissipation-Delay Products (DdP)— Adiabatic Reversible Superconducting Circuits

Reversible adiabatic superconductor logic:

- State-of-the-art is the **RQFP** (Reversible Quantum Flux Parametron) technology from Yokohama National University in Japan.
  - Chips were fabricated, function validated.


- Circuit simulations predict DdP is  $>1,000 \times lower$  than even end-of-roadmap CMOS.
  - Dissipation extends far below the 300K Landauer limit (and even below the Landauer limit at 4K).
  - DdP is *still* better than CMOS even after adjusting by a conservative factor for large-scale cooling overhead  $(1,000\times)$ .
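For reference, the Landauer limit $kT \ln 2$ at the two temperatures mentioned above can be evaluated directly. This is a minimal sketch using the exact SI Boltzmann constant; the function name is illustrative, and nothing here models the RQFP circuits themselves.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant in J/K (exact SI value)
EV = 1.602176634e-19    # one electronvolt in joules (exact SI value)


def landauer_limit_j(temperature_k: float) -> float:
    """Return kT ln 2 in joules at the given temperature."""
    return K_B * temperature_k * math.log(2)


limit_300k = landauer_limit_j(300.0)
limit_4k = landauer_limit_j(4.0)

for label, value in (("300 K", limit_300k), ("4 K", limit_4k)):
    print(f"kT ln 2 at {label}: {value:.3e} J = {value / EV:.3e} eV")

# The limit is proportional to T, so the ratio is simply 300 / 4.
print(f"ratio: {limit_300k / limit_4k:.0f}x")
```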

#### **Question:** Could some *other* reversible technology do even better than this?

- We have a project at Sandia exploring one possible superconductor-based approach for this (more later)...
  - But, what are the *fundamental* (technology-independent) limits, if any?




## Section II. Adiabatic Approaches


# Physical Implementations of Reversible Computing using Adiabatic Processes

Most of the existing approaches to the physical implementation of reversible computing exhibit this "adiabatic" character.

- Later, we will discuss other approaches that instead emphasize different (*ballistic* and *elastic*) aspects of physical processes.
  - There is a certain degree of overlap between all of these concepts, though.

#### **Definitions.** The word "adiabatic" has a long (>135-year!) history in physics...

- Derives etymologically from the Greek *adiabatos* (ἀδιάβατος), "impassable,"
  - In the sense "not to be passed through;" from ἀ- (not) + διά (through) + βατός (passable)
- In practice, in the context of thermodynamics, the word is often used to mean something roughly like:
  - "No [free] energy may *pass through* the boundary of the system so as to become dissipated out into the system's external environment as heat"
- For our purposes, we can take it as effectively being synonymous with *isentropic*
  - Meaning, with the same (i.e., unchanging) entropy
  - Since, note, entropy increase implies that part of the system's energy is crossing over an *abstract* boundary from a known/controlled to unknown/uncontrolled state

## Some Requirements for Adiabaticity

For a process to be adiabatic generally requires that the active energy associated with the known/controlled degrees of freedom in the system is *well isolated* from the system's thermal environment, which implies:

- The process does not happen so quickly that uncontrolled modes become excited
  - Rate of the process should be *slow* compared to the system's relaxation timescale
- But also, it does not happen so <u>*slowly*</u> that the known/controlled energy in the system can leak out from the system to its environment via equilibration processes
  - Time for the process should be *fast* compared to the time for the non-equilibrium aspects of the system to equilibrate with the system's thermal environment

We can design adiabatic mechanisms that (as they are further refined) increasingly well satisfy *both* of these requirements simultaneously, by

- Decreasing the relaxation timescale (increase generalized "stiffness" of mechanism)
- Increasing the equilibration timescale (decrease the rate of energy "leakage")
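The two requirements together define a window of acceptable process times. Writing $t$ for the duration of the process, and introducing $\tau_{\rm relax}$ and $\tau_{\rm equil}$ here as shorthand for the relaxation and equilibration timescales (notation assumed for this summary, not taken from the slides):

$$\tau_{\rm relax} \ll t \ll \tau_{\rm equil}$$

Refining the mechanism widens this window from both ends: a stiffer mechanism lowers $\tau_{\rm relax}$, and better isolation raises $\tau_{\rm equil}$.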

## Adiabatic Processes: A classic example

Adiabatic compression (or expansion) of an ideal gas under control of a piston in a thermally insulated cylinder...



- Note the compression/expansion must be carried out *slowly* enough so as not to excite pressure waves in the gas...
  - Since any energy imparted to such waves would quickly degrade to heat
  - Require: Speed of piston movement << speed of sound in the gas
- And, the compression/expansion must be also done *quickly* enough so there isn't enough time for heat to be conducted into or out of the cylinder
  - Note that, by the ideal gas law, the temperature inside the cylinder would typically be changing as the gas compresses/expands, and thus is in general not always the same as the environment temperature.
    - And, any heat conducted from higher  $\rightarrow$  lower temperature yields an entropy increase
  - Require: Time for piston movement << time constant of thermal equilibration

Analogous requirements also apply in adiabatic *electronic* processes!

## Adiabatic Circuits in CMOS: A Brief History

A selection of some early papers:

Fredkin and Toffoli, 1978 (DOI:10.1007/978-1-4471-0129-1\_2)

- Unfinished circuit concept based on idealized capacitors and inductors
  - How to control switches to do logic was left unspecified
  - Large design overhead: roughly one inductor per gate

Seitz et al., 1985 (CaltechCSTR:1985.5177-tr-85)

- Realistic MOSFET switches; more compact integration (off-chip L)
- Not yet known to be general-purpose; required careful tuning

Koller and Athas, 1992 (DOI:10.1109/PHYCMP.1992.615554)

- Not yet fully-reversible technique; limited efficiency
- Combinational only; conjectured reversible sequential logic impossible

Hall, 1992; Merkle, 1992 (DOIs:10.1109/PHYCMP.1992.615549; 10.1109/PHYCMP.1992.615546)

- General-purpose reversible methods, but for combinational logic only

Younis & Knight, 1993 (http://dl.acm.org/citation.cfm?id=163468)

- First fully-reversible, fully-adiabatic sequential circuit technique (CRL)



Figure reproduced with permission

### Adiabatic Circuits in CMOS: History, cont.

Younis & Knight, 1994

- Simplified 3-level adiabatic CMOS design family (SCRL)
  - However, the original version of SCRL contained a small non-adiabaticity bug which I discovered in 1997
    - This problem is easily fixed, however

#### Subsequent work at MIT, 1995-99

- Myself and fellow students
  - Various chips designed using SCRL
  - Reversible processor architectures

Substantial literature throughout the late 90s / early 2000s...

- Too many different papers / groups to list them all here!
  - Most of the proposed schemes were not truly/fully adiabatic, though

#### Researchers recently active in adiabatic circuits include:

- A couple I know in the US:
  - Greg Snider (Notre Dame)
  - Himanshu Thapliyal (U. Kentucky)
- Also some groups in Europe, India, China, Japan...
- My group at Sandia (new work reported on slide #18)



# Adiabatic Charge Transfer: Energy Dissipation Analysis

Consider passing a total quantity of charge $Q$ through a resistive element of resistance $R$ over a timespan $t$ by means of a constant current, $I = Q/t$.

- The power dissipation (rate of energy dissipation) in such a current flow is given by $P = IV$, where $V = IR$ (Ohm's Law) is the voltage drop across the resistor.

The total energy dissipated over time t is therefore:

$$E_{\text{diss}} = P \cdot t = IVt = I^2Rt = (Q/t)^2Rt = \frac{Q^2R}{t}$$


• Note the inverse scaling with the time *t* taken for the charge transfer!

If the function of the charge transfer is to charge a linear capacitance C up to the voltage level V, then the quantity of charge transferred is Q = CV, and so the total energy dissipated in the charge transfer can be expressed as:

$$E_{\rm diss} = \frac{(CV)^2 R}{t} = \frac{C^2 V^2 R}{t} = CV^2 \frac{RC}{t}$$

## Conventional vs. Adiabatic Charging

For charging a capacitive load C through a voltage swing V

#### Conventional charging:

• Constant *voltage* source



• Energy dissipated:

$$E_{\rm diss}^{\rm conv} = \frac{1}{2}CV^2$$

#### Ideal adiabatic charging:

• Constant *current* source



• Energy dissipated:

$$E_{\rm diss}^{\rm adia} = I^2 R t = \frac{Q^2 R}{t} = C V^2 \frac{RC}{t}$$

**Note:** Adiabatic charging beats the energy efficiency of conventional charging by an advantage factor $A$:

$$A = \frac{E_{\rm diss}^{\rm conv}}{E_{\rm diss}^{\rm adia}} = \frac{1}{2} \frac{t}{RC}$$
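As a numerical illustration of the two dissipation formulas and the advantage factor, the sketch below plugs in a set of component values. The values (1 fF load, 1 V swing, 10 kΩ path resistance, 10 ns charging time) are purely hypothetical and chosen only to give round numbers; they do not describe any particular circuit from the talk.

```python
def conventional_dissipation(c: float, v: float) -> float:
    """Energy dissipated charging C to V from a constant-voltage source."""
    return 0.5 * c * v**2


def adiabatic_dissipation(c: float, v: float, r: float, t: float) -> float:
    """Energy dissipated charging C to V with a constant current over time t."""
    return c * v**2 * (r * c / t)


def advantage_factor(r: float, c: float, t: float) -> float:
    """Ratio of conventional to ideal adiabatic dissipation: (1/2) t / (RC)."""
    return 0.5 * t / (r * c)


# Hypothetical example values (assumptions, not taken from the slides).
C = 1e-15       # load capacitance: 1 fF
V = 1.0         # voltage swing: 1 V
R = 10e3        # resistance of the charging path: 10 kOhm
T_RAMP = 10e-9  # charging time: 10 ns (here t / RC = 1000)

e_conv = conventional_dissipation(C, V)
e_adia = adiabatic_dissipation(C, V, R, T_RAMP)

print(f"conventional: {e_conv:.2e} J")
print(f"adiabatic:    {e_adia:.2e} J")
print(f"advantage A:  {advantage_factor(R, C, T_RAMP):.0f}x")
```

Doubling `T_RAMP` halves the adiabatic dissipation, while the conventional figure does not depend on the charging time at all.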

## Adiabatic Charging via MOSFETs

A simple voltage ramp can *approximate* an ideal constant-current source.

- Note that the load gets charged up *conditionally*, if the MOSFET is turned on (gate voltage  $V_g \gtrsim V + V_t$ ) during ramp.
  - $V_t$  is the transistor's threshold, typically <  $\frac{1}{2}$  volt

Can discharge the load later using a similar ramp.

• Either through the same path, or a different path.

$$t \gg RC \;\Rightarrow\; E_{\rm diss} \rightarrow CV^2 \frac{RC}{t}$$

$$t \ll RC \;\Rightarrow\; E_{\rm diss} \rightarrow \frac{1}{2}CV^2$$

Exact formula for linear ramps: $E_{\rm diss} = s \left[ 1 + s \left( e^{-1/s} - 1 \right) \right] CV^2$, given speed fraction $s = RC/t$.
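The exact linear-ramp expression can be evaluated numerically to confirm that it reproduces both limiting cases. The sketch below is a direct transcription of the formula in units of $CV^2$; the function name is illustrative, and `math.expm1` is used only to evaluate $e^{-1/s} - 1$ with less rounding error.

```python
import math


def ramp_dissipation_fraction(s: float) -> float:
    """E_diss / (C V^2) for a linear voltage ramp, with speed fraction s = RC / t."""
    # math.expm1(x) computes exp(x) - 1 accurately for small |x|.
    return s * (1.0 + s * math.expm1(-1.0 / s))


for s in (1e-3, 1e-2, 1e-1, 1.0, 10.0, 1e3):
    print(f"s = {s:8.3g}   E_diss / CV^2 = {ramp_dissipation_fraction(s):.6f}")

# Slow ramp (t >> RC, s -> 0): the fraction approaches s, i.e. CV^2 * RC / t.
slow = ramp_dissipation_fraction(1e-3)

# Fast ramp (t << RC, s -> infinity): the fraction approaches 1/2.
fast = ramp_dissipation_fraction(1e3)
```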

The (ideal) operation of this circuit approaches *physical reversibility* ( $E_{diss} \rightarrow 0$ ) in the limit  $t \rightarrow \infty$ , but *only* if a certain *precondition* on the initial state is met (namely,  $V_g \gtrsim V_{max} + V_t$ )

- How does the possible physical reversibility of this circuit relate to its *computational* function, and to some *appropriate* concept of logical reversibility?
  - Traditional (Landauer/Fredkin/Toffoli) reversible computing theory does <u>not</u> adequately address this question, so, we need a more powerful theory!
  - The theory of *Generalized Reversible Computing* (GRC; mentioned briefly yesterday) meets this need.

See <u>arxiv:1806.10183</u> for the full GRC model.

#### Perfectly Adiabatic Reversible Computing in CMOS

To approach ideal reversible computing in CMOS...

We must aggressively eliminate *all* sources of non-adiabatic dissipation, including:

- Diodes in charging path, "sparking," "squelching"
  - Eliminated by "truly, fully adiabatic" design. (*E.g.*, CRL, 2LAL).
    - Suffices to get to a few aJ (10s of eV) in 180 nm before voltage optimization.
- Voltage level mismatches that dynamically arise on floating nodes before reconnection.
  - Eliminated by static, "perfectly adiabatic" design. (*E.g.*, S2LAL).

We must also aggressively minimize standby power dissipation from leakage, including:

- Subthreshold channel currents
  - Low-T operation helps with this
- Gate oxide tunneling
  - *E.g.*, use thicker gate oxides

**Note:** (Conditional) logical reversibility *follows from* perfect adiabaticity.

#### Shift Register Structure and Timing in 2LAL

2LAL test chip taped out at Sandia, Aug. '20



#### Shift Register Structure and Timing in S2LAL



# General Combinational and Sequential Logic in Reversible Adiabatic Circuits

A general picture of how to combine arbitrary combinational and sequential logic in reversible, adiabatic logic designs:

1. Initially, input $x$ is in the register at the left.
2. Evaluate top combinational stages, in sequence, to produce output $y = f(x)$ in the register at the right.
   - Hall's "retractile cascade" method, a.k.a. "Bennett clocking"
3. Latch output $y$ in place.
4. Unroll evaluation of top stages, decomputing intermediate results.

If $f$ is an <u>invertible</u> function, we can then *decompute* the input $x$ as follows:

5. Evaluate bottom stages, following arrows, to compute a copy of $x = f^{-1}(y)$,
6. Unlatch (connect) $x$ to the presented copy,
7. Unroll evaluation of bottom stages, decomputing those intermediate results.

At the conclusion of this entire process, information has *moved and transformed* from x to y.

• Can then pass the information through *further* stages of sequential processing, and meanwhile, start to process a new wave of input in this stage (pipelining)...
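The order of operations in steps 1-7 can be traced with a small bookkeeping model. This is only a sketch of the sequencing, not of the circuit physics: the list pops merely stand in for retractile decomputation, and the example function (add 3 modulo 16, then XOR with 0b0101) and all names are hypothetical choices made for illustration.

```python
from typing import Callable, List

Stage = Callable[[int], int]


def run_stage(x: int, forward: List[Stage], inverse: List[Stage]) -> int:
    """Trace steps 1-7 for one pipeline stage computing y = f(x), f invertible."""
    # Steps 1-2: x is held in the input register; evaluate the top stages in
    # sequence, holding every intermediate result (retractile cascade).
    trail = [x]
    for stage in forward:
        trail.append(stage(trail[-1]))

    # Step 3: latch the output y in place.
    y = trail[-1]

    # Step 4: unroll the top stages in reverse order.
    while len(trail) > 1:
        trail.pop()

    # Step 5: evaluate the bottom stages to present a copy of x = f^-1(y).
    back = [y]
    for stage in inverse:
        back.append(stage(back[-1]))

    # Step 6: the input register may only be unlatched against a matching copy.
    if back[-1] != x:
        raise ValueError("inverse stages do not reproduce the input")

    # Step 7: unroll the bottom stages in reverse order.
    while len(back) > 1:
        back.pop()

    # Only y remains: the information has moved and transformed from x to y.
    return y


# Hypothetical invertible 4-bit function, split into two stages.
FORWARD = [lambda v: (v + 3) % 16, lambda v: v ^ 0b0101]
INVERSE = [lambda v: v ^ 0b0101, lambda v: (v - 3) % 16]

outputs = [run_stage(x, FORWARD, INVERSE) for x in range(16)]
print(outputs)
```

Because the example function is a bijection on 4-bit values, the 16 outputs are all distinct, which is what allows the input to be decomputed in steps 5-7.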

#### One Stage of Reversible Sequential Logic



## Adiabatic Reversible Computing in Superconducting Circuits

Work along this general line has roots that go all the way back to Likharev, 1977.

Most active group at present is Prof. Yoshikawa's group at Yokohama National University in Japan.

Logic style called *Reversible Quantum Flux Parametron* (RQFP).

Shown at right is a 3-output reversible majority gate.

Full adder circuits have also been built and tested.

Simulations indicate that RQFP circuits can dissipate  $< kT \ln 2$  even at T = 4K, at speeds on the order of 10 MHz.

More in tomorrow's talk!





## Section III. Ballistic Approaches


## Can dissipation scale better than linearly with speed?

(Additional detail on this can be found in Subhash's talk yesterday.)

Some observations from Pidaparthi & Lent (2018) suggest Yes!

- Landau-Zener (1932) formula for quantum transitions in e.g. scattering processes with a missed level crossing...
  - Probability of exciting the high-energy state, $P_{\rm D} = e^{-2\pi\Gamma}$ (which then decays dissipatively), scales down exponentially as the transition is slowed ($\Gamma$ is inversely proportional to the sweep rate)...
    - This scaling is commonly seen in many quantum systems!
- Thus, dissipation-delay product may have no lower bound for quantum adiabatic transitions, <u>if</u> this kind of scaling can actually be realized in practice.
  - I.e., in the context of a complete engineered system.
- Question: Will unmodeled details (e.g., in the driving system) fundamentally prevent this, or not?
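To see why exponential scaling would remove the lower bound on the dissipation-delay product, the sketch below compares it with the linear ($RC/t$-style) scaling of classical adiabatic charging. It is a toy comparison under stated assumptions: both curves are normalized to a single hypothetical characteristic time `tau`, and the relation $\Gamma = t/\tau$ is assumed for illustration rather than taken from the cited paper.

```python
import math


def linear_fraction(t_over_tau: float) -> float:
    """Classical adiabatic scaling: dissipated fraction falls as tau / t."""
    return 1.0 / t_over_tau


def landau_zener_fraction(t_over_tau: float) -> float:
    """Landau-Zener-style scaling exp(-2*pi*Gamma), ASSUMING Gamma = t / tau."""
    return math.exp(-2.0 * math.pi * t_over_tau)


rows = []
for t_over_tau in (1.0, 2.0, 5.0, 10.0):
    # Normalized dissipation-delay product = dissipated fraction * (t / tau).
    ddp_linear = linear_fraction(t_over_tau) * t_over_tau
    ddp_lz = landau_zener_fraction(t_over_tau) * t_over_tau
    rows.append((t_over_tau, ddp_linear, ddp_lz))
    print(f"t/tau = {t_over_tau:5.1f}   DdP linear = {ddp_linear:.3f}   "
          f"DdP exponential = {ddp_lz:.3e}")
```

With linear scaling the product stays constant no matter how slowly the operation runs; with exponential scaling it keeps shrinking as the delay grows.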

Subhash S. Pidaparthi and Craig S. Lent (Department of Electrical Engineering, University of Notre Dame), "Exponentially Adiabatic Switching in Quantum-Dot Cellular Automata," J. Low Power Electron. Appl. 2018, 8(3), 30; https://doi.org/10.3390/jlpea8030030 (open access; published 7 September 2018).



### Ballistic Reversible Computing

Can we envision reversible computing as a *deterministic* elastic interaction process?

Historical origin of this concept:

- Fredkin & Toffoli's *Billiard Ball Model* of computation ("Conservative Logic," IJTP 1982).
  - Based on elastic collisions between moving objects.
  - Spawned a subfield of "collision-based computing."
    - Using localized pulses/solitons in various media.

No power-clock driving signals needed!

- Devices operate when data signals arrive.
- The operation energy is carried by the signal itself.
  - Most of the signal energy is preserved in outgoing signals.

However, all (or almost all) of the existing design concepts for ballistic computing invoke implicitly *synchronized* arrivals of ballistically-propagating signals...

- Making this work in reality presents some serious difficulties:
  - Unrealistic in practice to assume precise alignment of signal arrival times.
    - Thermal fluctuations & quantum uncertainty, at minimum, are always present.
  - Any relative timing uncertainty leads to chaotic dynamics when signals interact.
    - Exponentially-increasing uncertainties in the dynamical trajectory.
  - Deliberate *re*synchronization of signals whose timing relationship is uncertain incurs an inevitable energy cost.

Can we come up with a new ballistic model that avoids these problems?

[Figures: billiard-ball interaction gate with labeled outputs (e.g., A·B, Ā·B); book cover, Andrew Adamatzky (Ed.), Springer]

## Ballistic Asynchronous Reversible Computing (BARC)





#### Example BARC device functions

[Figure panels: Rotary (Circulator); Toggled Barrier]



Problem: Conservative (dissipationless) dynamical systems generally tend to exhibit chaotic behavior...

- This results from direct nonlinear interactions between multiple continuous dynamical degrees of freedom (DOFs), which amplify uncertainties, exponentially compounding them over time...
  - *E.g.*, positions/velocities of ballistically-propagating "balls"
  - Or more generally, any localized, cohesive, momentum-bearing entity: Particles, pulses, quasiparticles, solitons...

**Core insight:** In principle, we can greatly reduce or eliminate this tendency towards dynamical chaos...

We can do this by avoiding any direct interaction between continuous DOFs of different ballistically-propagating entities

Require localized pulses to arrive *asynchronously*—and furthermore, at clearly distinct, *non*overlapping times

- Device's dynamical trajectory then becomes *independent* of the precise (absolute *and* relative) pulse arrival times
  - As a result, timing uncertainty per logic stage can now accumulate only *linearly*, not exponentially!
    - Only relatively occasional re-synchronization will be needed
- For devices to still be capable of doing logic, they must now maintain an internal discrete (digitally-precise) state variable: a stable (or at least metastable) stationary state, e.g., a ground state of a well

No power-clock signals, unlike in adiabatic designs!

- Devices simply operate whenever data pulses arrive
- The operation energy is carried by the pulse itself
  - Most of the energy is preserved in outgoing pulses
    - Signal restoration can be carried out incrementally

Goal of current effort at Sandia (BARCS effort): Demonstrate BARC principles in an implementation based on fluxon dynamics in <u>superconducting electronics</u> (SCE)

## Simplest Fluxon-Based (bipolarized) BARC Function

One of our early tasks: Characterize the simplest nontrivial ABRC device functionalities, given a few simple design constraints applying to an SCE-based implementation, such as:

• (1) Bits encoded in fluxon polarity; (2) Bounded planar circuit conserving flux; (3) Physical symmetry.

Determined through theoretical hand-analysis that the simplest such function is the *1-Bit, 1-Port Reversible Memory Cell (RM)*:

• Due to its simplicity, this was then the preferred target for our subsequent detailed circuit design efforts...



#### **RM Transition Table**

| Input Syndrome |   | Output Syndrome |
|----------------|---|-----------------|
| +1(+1)         | → | (+1)+1          |
| +1(-1)         | → | (+1)-1          |
| -1(+1)         | → | (-1)+1          |
| -1(-1)         | → | (-1)-1          |

Some planar, unbiased, reactive SCE circuit w. a continuous superconducting boundary

- Only contains L's, M's, C's, and unshunted JJs
  - Junctions should mostly be *subcritical* (avoids $R_N$)
- Conserves total flux, approximately nondissipative

Desired circuit behavior (NOTE: conserves flux, respects T symmetry & logical reversibility):

- If polarities are opposite, they are swapped (shown)
- If polarities are identical, input fluxon reflects back out with no change in polarity (not shown)
- (Deterministic) elastic 'scattering' type interaction: Input fluxon kinetic energy is (nearly) preserved in output fluxon
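The desired behavior can be written as a small transition function and checked against the three properties noted above. This is a logic-level sketch only (no circuit dynamics); the function name and the tuple ordering `(new stored polarity, outgoing polarity)`, which mirrors the `(stored)outgoing` notation of the transition table, are choices made here for illustration.

```python
from itertools import product
from typing import Dict, Tuple

POLARITIES = (+1, -1)


def rm_transition(fluxon: int, stored: int) -> Tuple[int, int]:
    """Map (incoming fluxon, stored flux) to (new stored flux, outgoing fluxon)."""
    if fluxon != stored:
        # Opposite polarities are swapped: the incoming fluxon is stored,
        # and the previously stored polarity leaves as the outgoing fluxon.
        return fluxon, stored
    # Identical polarities: the fluxon reflects, and nothing changes.
    return stored, fluxon


table: Dict[Tuple[int, int], Tuple[int, int]] = {
    (f, s): rm_transition(f, s) for f, s in product(POLARITIES, repeat=2)
}

for (f, s), (new_s, out) in table.items():
    print(f"{f:+d}({s:+d}) -> ({new_s:+d}){out:+d}")

# Logical reversibility: the four input syndromes map to four distinct outputs.
is_bijective = len(set(table.values())) == len(table)

# Flux conservation: incoming + stored polarity equals stored + outgoing.
conserves_flux = all(f + s == new_s + out for (f, s), (new_s, out) in table.items())
```

Note that, at the logic level, both cases amount to exchanging the incoming and stored polarities, since exchanging two identical polarities changes nothing.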

## RM—First working (in simulation) implementation!

Erik DeBenedictis: "Try just strapping a JJ across that loop."

• This actually works!

- "Entrance" JJ sized to equal about 5 LJJ unit cells ($\sim 1/2$ pulse width)
- I first tried it twice as large, & the fluxons annihilated instead...
  - "If a 15 $\mu$A JJ rotates by $2\pi$, maybe $\frac{1}{2}$ that will rotate by $4\pi$"

Loop inductor sized so ±1 SFQ will fit in the loop (but not ±2)

- JJ is sitting a bit below critical with ±1

WRspice simulations with  $\pm 1$  fluxon initially in the loop

- Uses `ic` parameter, & `uic` option to `.tran` command
  - Produces initial ringing due to overly-constricted initial flux
    - Can damp w. small shunt G



Polarity match → Reflect (= Exchange)





### Resettable version of RM cell—Designed & Fabricated!

Apply current pulse of appropriate sign to flush the stored flux (the pulse here flushes out positive flux)
To flush either polarity → Do both (±) resets in succession





Fabrication at SeeQC with support from ACI



RM Cell & SQUID





## Section IV. Looking Ahead


## What kinds of technology advances are needed?

More specifically, what kind of improvements in device- and circuit-level characteristics of reversible computing technologies would give a big practical boost to the field?

#### Needed are:

- Basic theoretical research in the *fundamental physics of reversible computing* that illuminates how exotic quantum phenomena (*e.g.*, decoherence-free subspaces, topological invariants, quantum Zeno effects, *etc.*) might be harnessed to improve device- and circuit-level characteristics of technologies for reversible computing.
  - Barely any work at all has been done on this so far!
- Device- and circuit-level technology concepts that exhibit improved practical characteristics, particularly in terms of energy dissipation per operation  $E_{diss,op}$ , as a function of high-level parameters such as:
  - The time delay  $t_d = t_{end} - t_{start}$  to carry out the given operation (from start to finish).
  - The operating temperature T of the unit (device/circuit that can perform a given operation).
  - The volume v of physical space occupied by the unit. (Its shape—*e.g.*, area, thickness—may also matter in some contexts.)
  - The (real, total, gravitating) mass *m* of the unit (or for the whole system, amortized over the number of units).
  - The active energy  $E_{in}$  invested in the operation of the unit (if reversible, most of it will be reused repeatedly for multiple operations).
  - The economic cost *c* (per unit) to build and deploy the unit, in the context of a complete working computing system.
    - This one is of course difficult to analyze, but is fundamentally essential for any kind of practical success of the technology.

### Key Requirements for Any Reversible Computing Technology

First, any viable technology must provide workable solutions for *all* of the basic requirements that apply universally to *any* kind of scheme for general digital computational hardware, such as:

• Control of timing, compositionality of operations, parallelizability, signal-level restoration, digital stability, reliability, etc.

The technology description must be *self-contained* (*i.e.*, *fully analyzed* including any driving/controlling apparatus).

- It is cheating to invoke any kind of external control or driving force without *fully* analyzing the entire, *closed* larger system.
  - E.g., an adiabatic technology that does this might not save any energy at all, and may instead just sweep all of the energy dissipation "under the rug!"

The technology must support either an adiabatic or ballistic physical model of reversible computation.

- Or more precisely, some blend of the two (most complete reversible technologies will include both aspects to some degree).
  - Or, another model besides these two, *if* others are possible.

The technology must exhibit the ability to perform a universal set of *logically reversible* (including *conditionally* reversible) computational operations in a close to thermodynamically optimal way.

• Meaning, with close to minimal ejection of (reduced) computational entropy to non-computational form.

... and the ability to perform logically *irreversible* computational operations in a close to thermodynamically optimal way.

• Meaning, with close to minimal ejection of (reduced) computational entropy to non-computational form.
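
For scale, the standard Landauer bound gives the minimum heat released when one bit of computational entropy is ejected into an environment at temperature T, namely k_B·T·ln 2. The short sketch below evaluates it at two temperatures chosen here for illustration (300 K for room-temperature CMOS, 4.2 K for liquid-helium-cooled superconducting circuits). The function name `landauer_bound` is an assumption, not notation from the talk.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K (exact in the SI)
EV = 1.602176634e-19      # one electronvolt, in J (exact in the SI)


def landauer_bound(temperature_k: float, bits: float = 1.0) -> float:
    """Minimum dissipation, in joules, for ejecting `bits` of entropy at T."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return bits * K_B * temperature_k * math.log(2.0)


for t in (300.0, 4.2):  # illustrative temperatures, in K
    e = landauer_bound(t)
    print(f"T = {t:g} K: {e:.3e} J per bit ({e / EV * 1e3:.4f} meV)")
```

At 300 K this comes to about 2.87e-21 J per bit. An operation is close to thermodynamically optimal in the sense above when its dissipation approaches this bound for the entropy it actually ejects.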

**Optional:** Ability to also perform *stochastic* computational operations in a close to thermodynamically optimal way.

- Meaning, the stochasticity of the operation enables a maximal amount of entropy to be moved from non-computational to computational form (temporarily).
  - Optional because it is not clear if this feature actually strictly improves computational functionality or performance on any practical problem.

**Stretch goal:** Ability to perform quantum computational operations in a close to thermodynamically optimal way.

- Meaning, approaching minimal dissipation for unitary operations (lower limit here seems larger than for classical reversible).
- Desirable, but a very challenging long-term goal.

## Conclusion

The field of device and circuit technologies for reversible computing is ripe for further advancement!

Demonstrated circuit techniques exist based on both semiconducting (CMOS) and superconducting (JJ-based) technology platforms.

• Superconducting techniques appear to outperform CMOS in terms of dissipation-delay product.

However, the existing device and circuit technologies are likely still very far from the limits of what could be achieved with RC if more intensive R&D work were done to develop innovative new technologies.

• Possibly leveraging exotic quantum phenomena

New conceptual models for the physical implementation of reversible computing (GRC, ABRC) have recently been described, expanding the space of possible solutions...

The remaining talks in this session will go into more detail on various existing and proposed implementation techniques...