

### PhD thesis

Michael Kirkedal Carøe

# Design of Reversible Computing Systems

Logic, Languages, and Circuits



Academic advisor: Robert Glück Submitted: July 7, 2012

## Design of Reversible Computing Systems

Logic, Languages, and Circuits

### Michael Kirkedal Carøe

DIKU, Department of Computer Science, University of Copenhagen, Denmark

July 7, 2012

### PhD Thesis

This thesis has been submitted to the PhD School of Science, Faculty of Science, University of Copenhagen, Denmark

| Author:           | Michael Kirkedal Carøe                                                     |
|-------------------|----------------------------------------------------------------------------|
| Born / pen name:  | Michael Kirkedal Thomsen                                                   |
| Affiliation:      | DIKU, Department of Computer Science,<br>University of Copenhagen, Denmark |
| Title:            | Design of Reversible Computing Systems / Logic, Languages, and Circuits    |
| Academic advisor: | Robert Glück                                                               |
| Submitted:        | July 7, 2012                                                               |

#### Short abstract

This thesis investigates garbage-free reversible computing systems from abstract design to physical gate-level implementation. Designed in reversible logic, we propose a ripple-block carry adder and work towards a reversible circuit for general multiplication. At a higher-level, abstract designs are proposed for reversible systems, such as a small von Neumann architecture that can execute programs written in a simple reversible two-address instruction set, a novel reversible arithmetic logic unit, and a linear cosine transform. To aid the design of reversible logic circuits we have designed two reversible functional hardware description languages: a linear-typed higher-level language and a gate-level point-free combinator language. We suggest a garbage-free design flow, where circuits are described in the higher-level language and then translated to the combinator language, from which methods to place-and-route of CMOS gates can be applied. We have also made standard cell layouts of the reversible gates in complementary pass-gate CMOS logic and used these to fabricate the ALU design.

In total, this thesis has shown that it is possible to design nontrivial reversible computing systems without garbage and that support from languages (computer aided design) can make this process easier. For Maja and Augusta

#### Abstract

Reversible computing spans computational models that are both forward and backward deterministic. These models have applications in program inversion and bidirectional computing, and are also interesting as a study of theoretical properties. The motivation for reversible computing comes, however, often from the fact that these models are information preserving. Landauer's principle links information theory and thermodynamics; all information has some physical representation, so a loss of information must cause a thermodynamical entropy decrease, which then leads to heat dissipation to obey the law thermodynamics. A reversible computation does, thus, not have to use energy, though this is impossible to avoid in practice, due to the way computers are build.

It is, however, not always obvious how to implement reversible computing systems. The restriction to avoid information loss, imposes new design criteria that need to be incorporated into the design; criteria that do not follow directly from conventional models. The result is that, today, many implementations is simple reversible embeddings of conventional solutions. This is not a desirable approach, because these simple embeddings always generate garbage, which then leads to erasure of information.

In this thesis I investigate garbage-free reversible computing systems from abstract design to physical gate-level implementation. Arithmetic operations are a basis for many computing systems, so a proposed the design of a ripple-block carry adder and work towards a reversible circuit for general multiplication are important new circuits. Such arithmetic circuits have then been used in the design of two larger reversible computing systems. The first is a small von Neumann architecture, called Bob, that can execute programs written in a simple reversible two-address instruction set. A central part of the architecture is a novel design of a reversible arithmetic logic unit. The second system is an implementation of the linear cosine transform used in the H.264/ACV encoding standard.

Designing reversible logic circuits on paper can become very complex, so I have designed two reversible functional hardware description languages that can simplify the implementation process. One language is a linear-typed higher-level language, while the other is a gate-level point-free combinator language. These two languages can be used in a garbage-free design flow, where circuits are described in the higherlevel language and then translated to the combinator language. From the gate-level language, methods of place-and-route of CMOS gates can be applied. To facilitate this last step, I have also made standard cell layouts of the reversible gates in complementary pass-gate CMOS logic and, as a test, these cells have been used to fabricate the ALU design.

In total, this thesis has shown that it is possible to design non-trivial reversible computing systems without garbage and that support from languages (computer aided design) can make this process easier. However, there is often still a need to rethink both the problem and the solution to accommodate the no-garbage approach.

#### Dansk Resumé

Reversible beregninger dækker over beregningsmodeller der er både forlæns og baglæns deterministiske. Disse modeller finder anvendelse indenfor program inversion og bidirektionel beregning, men er også interessante som et studie af deres teoretiske egenskaber. Motivationen bag reversible beregning kommer dog ofte fra det faktum at disse modeller er informationsbevarende. Landauers princip kæder informationsteori sammen med termodynamikken; al informations har en fysisk repræsentation, så tab af information må medføre en reduktion af termodynamisk entropi, som dermed fører til varmeafgivelse for at overholde termodynamikkens love. En reversibel beregning vil dermed ikke have et varmetab, dog er det i praksis umuligt at undgå pga. datamatens opbygning.

Det er dog ikke altid oplagt hvordan man kan implementere reversible beregningssystemer. Restriktionen om at undgå informationstab, opstiller nye designkriterier, som er nødvendigt at inddrage i designet – kriterier som ikke følger direkte fra konventionelle beregningsmodeller. Resultatet er at mange implementationer i dag er simple reversible indlejringer af konventionelle løsninger. Dette er ikke en ønskelig fremgangsmåde, da disse simple indlejringer altid vil generere "affald", som efterfølgende medfører informationstab.

I denne afhandling undersøges affaldsfrie reversible beregningssystemer fra abstrakt design ned til implementation på fysisk port-niveau. Aritmetiske operationer er grundlaget for mange beregningssystemer, så der foreslås et design til et "ripple-block carry" additionskredsløb og foreløbigt arbejde mod et reversibelt kredsløb for general multiplikation er vigtige nye kredsløb. Sådanne aritmetiske kredsløb er brugt i designet af to større reversible beregningssystemer. Det første er en lille von Neumann arkitektur, kaldet Bob (eng. for lod), som kan udføre programmer skrevet i et simpelt reversibelt to-adresse instruktionssæt. En central del af arkitekturen er et nyt design af en reversibel aritmetisk logisk enhed (ALU). Det andet system er en implementation af en lineær cosinus transformation, som bruges i H.264/AVC videokodningsstandarden.

Design af reversibel logik på papir kan nemt blive meget komplekst, så jeg har udviklet to reversible funktionelle hardwarebeskrivelsessprog, som kan simplificere implementationsforløbet. Det ene er et lineærttypet høj-niveau sprog, mens det andet et 'logisk port'-niveau punkt-frit kombinatorsprog. Fra port-niveau sproget kan så benyttes metoder til placering-og-rutning af CMOS porte. For at facilitere det sidste skridt, har vi også lavet standard celle layout af de reversible porte i komplementær passér-port CMOS logik, og, som en test, har disse celler været brugt til at fabrikere ALU designet.

Alt i alt har denne afhandling vist at det er muligt at designe ikketrivielle reversible beregningssystemer uden affald, og at hjælp fra programmeringssprog (datamat støttet design) kan gøre dette forløb nemmere. Der er dog stadig ofte et behov for at gentænke både problemet og løsningen for at akkommodere *intet-affald fremgangsmåden*.

## Contents

| Preface                                                                                                                                                                                                                                            |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Introduction1.1Information and the Limit of Computation1.2Foundations of Reversible Computing1.3Reversible Logic and Quantum Computing1.4Reversible Programming Languages and Transformation1.5Towards Reversible Computer Organization and Design | $\begin{array}{c} 2\\ 3\\ 4\end{array}$                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| Design of Reversible Computing Systems                                                                                                                                                                                                             | 7                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| •                                                                                                                                                                                                                                                  |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|                                                                                                                                                                                                                                                    |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| 2.3 Multimedia Transformation                                                                                                                                                                                                                      | . 10                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| Computer Aided Design of Reversible Circuits                                                                                                                                                                                                       | 11                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| 3.1 Reversible Logic Synthesis and Optimization                                                                                                                                                                                                    | 11                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| 3.2 Hardware Description Languages                                                                                                                                                                                                                 | 12                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| Realization of Reversible Circuits                                                                                                                                                                                                                 | <b>14</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| 4.1 Adiabatic Switching and Charge Recovery                                                                                                                                                                                                        | . 14                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| 4.2 Embedding in Static CMOS                                                                                                                                                                                                                       | 15                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| Conclusions and Perspectives                                                                                                                                                                                                                       | 16                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| 5.1 Future Work                                                                                                                                                                                                                                    | . 17                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| Bibliography                                                                                                                                                                                                                                       |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| Papers on Gate-Level Designs of Arithmetic Functions                                                                                                                                                                                               | 33                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| A1 Optimized Reversible Binary-Coded Decimal Adders                                                                                                                                                                                                | 35                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| A2 Parallelization of Reversible Ripple-Carry Adders                                                                                                                                                                                               | 45                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| A3 Garbage-Free Integer Multiplication with Constants                                                                                                                                                                                              | 63                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| Papers on Reversible Architectures                                                                                                                                                                                                                 | 71                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| B1 Reversible Arithmetic Logic Unit for Quantum Arithmetic                                                                                                                                                                                         | 73                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| B2 A Reversible Processor Architecture and its Reversible Logic<br>Design                                                                                                                                                                          |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| ;;                                                                                                                                                                                                                                                 | Introduction         1.1       Information and the Limit of Computation         1.2       Foundations of Reversible Computing         1.3       Reversible Logic and Quantum Computing         1.4       Reversible Programming Languages and Transformation         1.5       Towards Reversible Computer Organization and Design         1.5       Towards Reversible Computing Systems         2.1       Arithmetic Logic Circuits         2.2       Computing Architectures and Instruction Sets         2.3       Multimedia Transformation         2.4       Reversible Logic Synthesis and Optimization         3.2       Hardware Description Languages         3.4       Reversible Circuits         3.1       Reversible Logic Synthesis and Optimization         3.2       Hardware Description Languages         4.1       Adiabatic Switching and Charge Recovery         4.2       Embedding in Static CMOS         4.3       Embedding in Static CMOS         5.1       Future Work         5.1       Future Work         5.1       Future Work         5.2       Parallelization of Reversible Binary-Coded Decimal Adders         5.3       Garbage-Free Integer Multiplication with Constants         5.4       Parallelization of Rever |

| $\mathbf{C}$ | C Papers on Implementation of Reversible Linear Transforms |                                                                  |     |
|--------------|------------------------------------------------------------|------------------------------------------------------------------|-----|
|              | C1                                                         | Reversible Implementation of a Discrete Integer Linear Trans-    |     |
|              |                                                            | form                                                             | 99  |
|              | C2                                                         | Garbageless Reversible Implementation of Integer Linear Trans-   |     |
|              |                                                            | formations                                                       | 111 |
| D            | Рар                                                        | ers on Design Languages for Reversible Logic                     | 123 |
|              | D1                                                         | Describing and Optimising Reversible Logic using a Functional    |     |
|              |                                                            | Language                                                         | 125 |
|              | D2                                                         | A Functional Language for Describing Reversible Logic            |     |
| $\mathbf{E}$ | E Papers on Engineering of Reversible Circuits 14          |                                                                  | 149 |
|              | E1                                                         | Design of Reversible Logic Circuits using Standard Cells / Stan- |     |
|              |                                                            | dard Cells and Functional Programming                            | 151 |
|              | E2                                                         | Interfacing Reversible Pass-Transistor CMOS Chips with Con-      |     |
|              |                                                            | ventional Restoring CMOS Circuits                                | 179 |

### Preface

This thesis has been submitted to the PhD School of Science, Faculty of Science, University of Copenhagen in partial fulfillment of the requirements for a PhD degree at DIKU, Department of Computer Science, University of Copenhagen, Denmark.

The thesis is written as a synopsis with 11 annexed research papers. The first chapter of the synopsis contains a short introduction to reversible computation and a description of the topic and objectives of the thesis. Following this, are three chapters that detail the contributions of my research and its relation to existing knowledge. The synopsis ends with conclusions and perspectives for future research. References to related work are given throughout the synopsis. Five appendices contains the 11 research papers; of these there are (at the time of writing) 6 published journal and conference papers, 2 pre-print conference papers (accepted for publication), 2 workshop papers, and 1 technical report.

This thesis is not the result of one person's lone fight against the world and it's reckless handling of information. Therefore, I would like to thank my advisor Robert Glück, who introduced me to the subject, and my co-author, office mate, and friend Holger Bock Axelsen. I also thank my other co-authors (the Flemish connection) Alexis De Vos, Stéphane Burignat, Kenneth Vermeirsh, Michał Klimczak, and Mariusz Olczak. The collaboration with Oticon A/S has been important to the direction of the research and, from there, I would especially like to thank Kenneth Branth and Kim Poulsen. Many other people have had an influence on this thesis, so in arbitrary (alphabetic) order I would like thank Jesper Louis Andersen, Patrick Bahr, Poul Johannis Clementsen, Fritz Henglein, Mathias Horn, Susan Nasirumbi Ipsen, Mathias Lehnfeld, Lars Valdemar Mogensen, Torben Æ. Mogensen, Kenji Moriyama, Jette Giovanni Møller, Thomas Pécseli, Claus Rørbech, Mary Sheeran, Jens Sparsø, Christen Artagnan Sørensen, Robert Wille, Tetsuo Yokoyama, and rest of APL group at DIKU. Also, I would like to thank the Danish Strategy Research Council for funding the MicroPower research project and, thus, my PhD stipend.

Finally, I thank my family and friends for their support during this period. Without this, I would not have been able to do all the work and travel.

> Michael Kirkedal Carøe (née Thomsen)

### Introduction

# 1

*Reversible computing* was introduced by Bennett [17] and concerns (universal) computation models where a result can not only be computed, but also uncomputed. We also define these as models that are both forward and backward deterministic. Though reversible computation models can compute all *injective computable functions*, injectivity is not enough to characterize a reversible computing model; we must also require that each computation step is bijective.

This important requirement provides the connection to preservation of information, which is a key motivation for research in reversible computing. A motivation that has its foundation in 1961 with a principle defined by Landauer [79]; a principle that was experimentally verified very recently [20].

#### **1.1** Information and the Limit of Computation

The search to understand the computation process and its limitations is older that computers themselves. Here, we do not think of the algorithmic bounds (which is a very interesting topic by itself), but the limitations that are imposed by the physical world. All computers are situated in the physical world, so the laws of physics, thus, also apply for computers including their circuits and memory. The question is what impact do the laws of physics have on the computation process?

During the 1950's (only shortly after the invention of the modern electronic digital computer [28]) the assumption arose that a logic operation requires an energy dissipation of  $kT \ln 2$ , where k is Boltzmann's constant and T is the temperature at which the operation is performed<sup>1</sup>. Von Neumann has later been credited for saying that this amount of energy is dissipated by two different sources: namely "per elementary decision of a two-way alternative and per elementary transmittal of one unit of information" [153]<sup>2</sup>. From the first source, it *is* apparent that von Neumann meant something less than all computations and it sounds a lot like a conditional, which we today know to be a problem. Today, we also know that the second source does not necessitate energy dissipation (*cf.* [80]).

The breakthrough came in 1961 when Landauer [79] argued that all logical operations are associated with a physical operation and because physical irreversibility requires heat generation, then so does logical irreversibility. It is,

<sup>&</sup>lt;sup>1</sup>This assumption followed from earlier work by Szilard [136] and Shannon [125] that argued that communicating one bit required this dissipation (cf. [81]).

 $<sup>^{2}</sup>$ John von Neumann died in 1957; four years before Landauer's seminal paper. The paper was finished by Burks and published in 1966, but the work still dates back to the 1950's.

thus, not the computation process itself that necessitates energy dissipation, but the process of deleting information. Today, we refer to this as *Landauer's principle*, and the dissipation of  $kT \ln 2$  Joules per deleted bit of information is called *Landauer's* limit<sup>3</sup>. Landauer founded his principle on a thought experiment in which each bit of information is encoded in a single particle. This is extremely hard to implement in the physical world, nonetheless, Landauer's principle was experimentally verified very recently [20].

At the time of Landauer's paper (and in the following decade) it was, however, believed that erasing information was an unavoidable consequence of a computation process. Landauer had realized that all irreversible operations can be embedded in a reversible operation and, as an example, he embedded the AND gate in the reversible gate that we today call a *Toffoli gate*. He, however, only imagined that these *Landauer embeddings* could be used to temporally store the inputs of each gate, which then had to be deleted later, thus, leading to irreversibility.

#### **1.2** Foundations of Reversible Computing

The second breakthrough came in 1973 with Bennett's seminal paper [17], where he defined the first universal reversible computation model; he constrained the conventional (irreversible) Turing machines (TMs) to define the reversible Turing machines (RTMs)<sup>4</sup>. In this paper Bennett also demonstrates how to embed an irreversible TM in an RTM using a history tape (similar to Landauer's embedding) and then run this RTM with a compute-copyuncompute method (today we call this *Bennett's method*) such that the overall result is only the input and the calculated output. This is a significant improvement over the use of a general trace, but often we are only interested in the result of a calculation and not both the input and output. Bennett later showed [18] how, for injective functions, this input can be uncomputed by using more time, viz. adding an extra compute-copy-uncompute phase. This is a very important result. In our research, we seek here to completely avoid garbage, because we need to know what is possible within the computational models. If needed, for a practical implementation perspective, it is easier to relax this criteria than it is strengthen it. Further research in the tradeoff between time and space on one hand, and erasure of information (garbage) on the other have been performed by Bennett, Levine, Sherman, Vitanyi, and more [19, 30, 86, 152, 162].

After Bennett's seminal paper, focus in reversible computation was directed towards more practical models (see Sections 1.3 and 1.4). But at the beginning of the 1990's, interest in the theoretical aspects increased [18, 67, 86, 119] and since then *computability and complexity* of the reversible model has developed into a research area by itself; trying to find the place within the *Complexity Zoo* [1] for the reversible complexity classes. This has lead to research both on RTMs [7, 10, 11, 25, 35, 83] and different models of *reversible automata* [9, 77, 107–110]. An interesting result being that reversible space equals deterministic space [83].

 $<sup>^{3}</sup>$ In practice the dissipation per bit is proportional to the signal energy used to represent the bit [58] and the actual dissipation is, thus, expected to be higher.

<sup>&</sup>lt;sup>4</sup>The first mention of reversible Turing machines can be dated back to Lecerf in 1963 [84].

#### **1.3** Reversible Logic and Quantum Computing

From the beginning, logic has had a central place in the ideas of reversible computing; *e.g.* Landauer's ideas for the reversible gates were designed as a method to reduce the heat dissipation of logic circuits.

Inspired by Landauer's and Bennett's work, Fredkin and Toffoli had, in 1978, been working to design a reversible computer that should be based on conservative logic  $[55]^5$ . In conservative logic all logic gates must be both reversible and parity preserving; *i.e.* the number of TRUE values must also be preserved over the gate. For this they introduced the *Fredkin gate*, which can be characterized as a controlled-swap  $gate^{6}$ . The model was developed to reflect fundamental physical principles and they developed a billiard ball model with a physical representation of the Fredkin gate. Conservative logic is, however, a stricter model than reversible logic, so in 1980 Toffoli presented the n-bit controlled-not gates [144]. This is the most widely used class of reversible gates today, because the gates have a simple mathematical definition, which makes reversible logic synthesis easier (see Section 3). The class covers the not, Feynman, and Toffoli gates. To follow the idea of the Fredkin gate, Toffoli also presented ingenious physical designs of these gates [145] based on 'differential gears' to implement the XOR and a 'rotating cam' to implement the AND in the Toffoli gate.

At the same time as Fredkin and Toffoli's work, other people were developing ideas for another radically new computer design. In 1980 Benioff presented his paper on how to design a (classical) computer from quantum components [16], which shortly after (in 1982) was followed by Feynman's paper on a computer that can simulate quantum physics [47]. In 1985 Deutsch presented his *universal quantum computer* [44] and the new field was born.

In the quantum model, it was easy to include Fredkin and Toffoli's gates and in 1985 Feynman, with his flair for intuitive graphical descriptions, introduced the diagram notation that is used today [48]. In this paper, Feynman also introduced the very first reversible arithmetic circuits; these circuits include a full-adder implementation with four reversible gates. More (reversible) quantum gates (e.g. [116]) and different notation were introduced, so in 1995 Barenco, Bennett, Shor, and others worked as a 'standardization committee' and decided on the notation and a set of universal quantum gates [15].

From a historical perspective, the idea of reversible logic circuits are even older than Fredkin and Toffoli's work. In 1959, two years before Landauer's paper, Huffman looked at *information-lossless* finite state machines (FSMs) [66]. He was interested in signal transformation (both for encoding and cryptography) and for these applications information-lossless FSMs are perfect; by constructing the encoding machine you also get the decoding machine. To design these FSMs, he defined information-lossless gates similar to reversible gates. Huffman also showed that adding a control signal calculated by an irreversible function to a reversible gate, does not break reversibility of the gate. The circuits are, however, not reversible according to our definition, but this was also not his purpose.

 $<sup>^5 {\</sup>rm Fredkin}$  and Toffoli's paper was published in 1982, but the paper was based on internal papers and an MIT course from 1978; see [144, References]

 $<sup>^{6}</sup>$ The gate that Fredkin and Toffoli presented swapped the two input-values if the control is FALSE. The Fredkin gate that we use today swaps if the control is TRUE.

#### 1.4 Reversible Programming Languages and Transformation

Another track in the history of reversible computing begins in 1986. At this time, Lutz, after a brief meeting with Landauer, sent him a letter about some work he did with Derby, about four years earlier, on a reversible imperative language called Janus [90]. Their work arose from an interest to investigate if it was possible to implement such a language and before 1986 Lutz and Derby did not know about Landauer's principle. The language was 'rediscovered' after the turn of the century and has since then been formalized and further developed at DIKU [32, 161, 164]; here, also student projects that implement more advanced algorithms (*e.g.* matrix multiplication) have been made [70,91]. Other simple reversible imperative languages have been developed, *e.g.* Frank developed R [52] generate instruction code for the Pendulum processor and Matos [96] made a language for linear programs.

Though the first reversible programming language was imperative, reversible functional languages have lately received the most interest. This development started in 1996 when Huelsbergen presented SECD-H [65], an evaluator for the lambda calculus that extended Landin's SECD evaluator [82] with a history tape. (Kluge [73] similarly extended a machine that can reduce a program term to normal-form and back again.) This was followed by Mu *et al.* who, with applications in bidirectional computing in mind, presented a reversible relational language [111]. More recently, work towards general purpose functional programming languages was presented independently by Yokoyama, Axelsen, Glück [163] and James, Sabry [68].

Also, a variety of languages for modeling quantum computations have been designed in almost all different paradigms. We will not detail these here, but refer to Gay's survey of the area [57].

On a related topic, transformation of reversible languages have also received some interest lately (mainly at DIKU). Though the first compiler between two reversible language was made by Frank [52] (between his language R and a reversible instruction set called PISA), it was Axelsen who devised techniques for a compiler that could perform clean translation [8]. His translation was between Janus and PISA and was clean in the sense that the compiled program did not have more than a constant memory overhead over the original Janus program. It is likely that the PISA program will use more temporary memory/registers. Mogensen's also did work on partial evaluation of Janus [105, 106].

We would also like to mention (automatic) program inversion. Inverting a reversible program is often easy, but if the program is implemented in a irreversible language it is harder. Program inversion has a direct impact on program maintainability and reliability of the inverse programs, which are otherwise hard to find if the program to be inverted either is formulated in a conventional programming language [97,133], or have to be deduced by static program analysis techniques [59] or interpretation [2]. We should also mention semi-inversion, where the inverted program not necessarily is a mapping from output to input, but a combination of the original inputs and outputs, cf. Mogensen [103, 104].

Work on reversible instruction sets is covered in Section 2, while design language for reversible logic is discussed in Section 3.2.



Figure 1.1: Tower of reversible computing system [12].

#### 1.5 Towards Reversible Computer Organization and Design

In his 1961 paper, Landauer wrote that "we shall label a machine as being logical reversible, if and only if all its individual steps are logically reversible" [79]. This is a very grand challenge and we know from Bennett (and later work) that, theoretically, such machines *do* exist – even when we add the requirement that the final result must not include garbage. But is it possible to realize such machines in practice and can it be done with the fabrication technology that exists today? And will we actually be able to achieve the expected reduced heat dissipation?

The *MicroPower* research project [12], which started in 2009, has as objective to develop a *proof-of-concept reversible computing system* and the computer science theory behind it. To do this all parts of the *reversible computation tower* (Figure 1.1) must be investigated. More specifically, the project investigates whether reversible computing can be applied in a power-limited application (specifically hearing aids) with the future hope to either reduce power consumption or increase functionality.

This dissertation is part of the MicroPower research project. Within this project, my thesis is that making a garbage-free reversible computing system is not only feasible, but does not necessarily require much more effort than making a conventional computing system. We will *not* address the questions of the actual energy consumption of the circuits and the theoretical implications; these two fundamental questions are investigated by other parts of the project.

To answer our thesis we must investigate the bottom part of the computing tower (Figure 1.1) from the machine code level down to the physical implementation. We investigate and design basic reversible logic circuits with an emphasis on arithmetic (Section 2.1). These are important basic operations in all computing systems and will, therefore, give the foundation for the reversible systems. Improvements here will improve all other parts of the tower. We also look at reversible circuits from a higher abstraction, namely in terms of logic designs of reversible computer architectures (Section 2.2) and multimedia transforms (Section 2.3). These two applications have the potential to be the first practical use of reversible circuits: architectures to create very small stand-alone systems (*e.g.* sensors) and multimedia transforms, which can be embedded in an irreversible system.

We investigate different *computer aided design* (CAD) approaches for reversible circuits (Section 3). Logic circuits designed by hand are often efficient, but it is very time consuming and verification of the designs are not practical. The purpose of a hardware description language is to raise this abstraction. Finally, we investigate how to implement reversible circuits in CMOS (Section 4). Here we desire implementations that can be used with the CAD process, but still have the prospect of reduced energy consumption. We will also look into how reversible CMOS circuits can be embedded in 'conventional' static CMOS circuits. Finally, we will look at future research topics (Section 5).

### Design of Reversible Computing Systems

# 2

To avoid the heat dissipation from Landauer's principle the entire computing system must be fully reversible. In this chapter we will look at designs of reversible computing systems at and close to the logic gate level. We first focus on design of arithmetic logic circuits. Then, based on these, we look at reversible computing architectures and designs of multimedia transforms.

#### 2.1 Arithmetic Logic Circuits

Arithmetic operations lie at the foundation of most computing systems and good logical implementations of these are important. Improvements to arithmetic circuits can result in improvements to the entire computing system. In a garbage-free reversible computing system it is especially important that the arithmetic circuits are also garbage-free, but how to do this is not always obvious, and history shows that rethinking our current knowledge can be necessary.

#### Addition

An immediate problem for reversible adder implementation is that addition is *not* an injective function in itself: given just the value of the sum A + B, one can not determine A and B uniquely. We can, however, redefine the problem by using *reversible updates* [13] and to avoid overflow we use *n*-bit modular addition,  $(A, B) \mapsto (A, B + A \mod 2^n)$ , to define reversible addition.

The adder that Feynman proposed [48] was a reversible *embedding* of the ripple-carry adder. Though addition is an injective function if one of the inputs is kept, the conventional ripple-carry structure is *not* reversible. The problem lies in the use of the full-adder circuit, because it is not possible to calculate both the sum and the carry without copying one of the inputs. You can say that there is an overlap in the information contained in the two results and this results in a garbage bit. Several reversible adder designs used this ripple structure; *e.g.* reversible *binary-coded decimal* adders have received some interest [142] (Paper A1).

The solution to this problem was presented in 1996 by Vedral *et al.* [148]. They observed that, to do an addition it is not necessary to calculate both the carry and the sum at the same time. If we first calculate the carry in a normal ripple, we can then make a *backwards ripple* where we both uncompute the carry and calculate the sum. This was the first example that shows that it really pays to rethink the problems that we want to solve. However, the cost of not creating garbage was an increase in logic depth. Vedral's *V-shaped adder* was a huge improvement (produces no garbage), but it was not optimal with respect to ancillae and logic depth. In 2005 two papers were published that suggested different improvements to the adder design (in terms of ancilla, gate count and logic depth) [36,146]. See [140] for a detailed description of the adder circuits.

The V-shaped adder is a ripple-carry adder and therefore has large logic depth. We know from conventional logic that it is possible to implement a logarithmic-depth adder at the cost of more logic gates and a more complicated circuit. The first reversible implementations of these adders were embeddings of the carry-lookahead adder [43, 72] and the parallel prefix adder [46], but they all suffered from garbage generation. Based on the techniques from the carry-lookahead adder, Draper *et al.* [45] in 2008 presented a garbage-free logarithmic-depth adder (QCLA)<sup>7</sup>, which added an extra 'look-behind' phase.

We presented another approach to a faster-than-linear reversible adder, the year after, named *ripple-block carry-adder* (RBCA) [140] (**Paper A2**). Basically it is a ripple-carry adder but instead of calculating and rippling one bit at a time, the addition is divided into several smaller block-additions that is performed in parallel. During computation a *carry-correction* phase is added and it is this phase that contains a ripple. When choosing the optimal block-size in relation to the input-size, the adder has a logic depth that is the square-root on the input-size.

When comparing the two, the QCLA in faster than the RBCA, but only significantly when the input is larger than 32 bits. In terms of gate-count the two adders are comparable, but the RBCA uses less ancilla bits. Also the RBCA has a better 'locality' (the gate uses wires that, in the diagram notation, are closer to each other), which can have an impact when the circuit is implemented in the target technology (*e.g.* quantum computer or CMOS).

#### Multiplication

Addition has an intuitive reversible formulation using modular arithmetic and reversible updates. Multiplication, on the other hand, is more difficult to define; mainly because the inverse operation is division. A simple way to define it is to take the embedding from Bennett and save both the multiplicand and the multiplier while still producing the product. This is the trick used by Kawada *et al.* in their (garbage-free) reversible logic implementation of the Karatsuba algorithm [75].

This approach is, however, not satisfying because it expands the amount of information; on the other hand, it is not in general possible to only update the multiplicand and save the multiplier. A possibility (also suggested by Ressler [122]) is to add a remainder, such that mult(A, M, R) = (A\*M+R, M), where  $0 \leq R < M$ . We have started the work towards a reversible logic implementation [14] (**Paper A3**). So far, it works for a certain class of constant multipliers that are equal to  $2^k \pm 1$  for  $k \in \mathbb{N}$ . We expect that this can be extended to general multiplication.

 $<sup>^7\</sup>mathrm{The}$  adder was designed with a focus on quantum computing and, thus, the 'Q' in the name. However, it is only implemented with reversible gates.

#### 2.2 Computing Architectures and Instruction Sets

As early as 1965, Reilly and Federighi presented a small instructions set, designed for a one-address machine with accumulator and multiplier-quotient, with the purpose of implementing reversible subroutines [121]. The set is, however, not reversible with respect to our definition; *e.g.* it includes a clearaccumulator instruction, which does not delete information from the overall system but still dissipates energy due to the clear process. The motivation was also not energy efficiency, but a desire to enable code sharing of the implemented subroutines.

The work by Ressler [122] from MIT in 1981 was, however, motivated by energy efficiency. Supervised by Fredkin, he made the design of a conservative logic computer based on a two-address von Neumann architecture. The instruction set contains the basic reversible arithmetic/logic instructions (*e.g.* add, subtract, roll-left and right), and the work contained the idea of using paired-branch instructions and to have both condition and assertion in the conditional. On the other hand, memory could be irreversibly updated and (perhaps most importantly from a program inversion perspective) there was no possibility for inverse execution (uncall).

In 1987 Briggs [23] described a system to control an electronic cricket scoreboard. An important feature of the system was the ability to 'undo' the operations that were performed; a nice feature if the operator made an error. The reverse execution was based on a trace that contained only the minimal information to do the backtracking. Based on this idea of a minimal trace, Cezzar [29] in 1991 presented (and patented) a two-address 'general purpose' machine that can reverse its forward execution.

Having a trace (which would be as long as the number of executed instructions plus amount of deleted information) is, however, a very unsatisfactory solution for a reversible computer. While Cezzar was mainly interested in a computer that could backtrack, Hall [64] designed his improved architecture with reversibility in mind. This ISA does not use a trace, but instead jumps are handled by come-from instructions, which makes it possible to break reversibility.

At MIT, Knight and Younis [165] had been working on a energy efficient logic family (see Section 4) and this implementation technology revived the interest in reversible architectures. Over a few years (ending in 1999) Vieri and Frank developed the reversible von Neumann architecture Pendulum [52,150,151] (formalized in [13]). This architecture was a big step forward. Instead of using only a single special-purpose register for program control (the program counter), the address calculation of the reversible abstract machine relies on three special-purpose registers: program counter, branch register, and *direction bit.* The calculation of the next program counter is then only dependent on the branch register and the direction bit, and the individual instructions cannot directly alter the program counter, but instead only update the branch register and direction bit. Though Pendulum has been fabricated, there is no full detailed description of the actual logic implementation and there is some indication that it is not fully reversible. Firstly, the implementation technology [165] (see Section 4.1 for more details) does not implement the reversible gates, but uses Bennett's method, which results in both the input and output. Secondly, there is no use of the garbage-free V-shaped adder or other arithmetic circuits; Vedral *et al.* [148] is not mentioned by Frank and Vieri, and the rest of the research is from after 1999. Thus, although the abstract architecture of Pendulum (as described by Frank) is reversible, it is likely that the logic implementation is not.

Inspired by the Pendulum architecture and it instruction set, we have designed a fully reversible and garbage-free two-address von Neumann architecture called Bob [141] (Paper B2). It features a locally-invertible instruction set, and the design, including areversible control logic and address calculation, is simple enough to be directly implemented in reversible logic. A central part of the processor design is the arithmetic logic unit (ALU). The conventional ALU has an inherently irreversible functionallity so we have suggested a novel alternative design for a reversible ALU [143] (Paper B1). The design of the ALU in based on the V-shaped adders and follows a strategy that puts all logical operations in sequence and then uses controls to ensure that only the desired operation changes the input. To our knowledge, this is the first garbage-free ALU.

#### 2.3 Multimedia Transformation

Multimedia transforms are an interesting application area for reversible circuits. In small battery-powered devices (*e.g.* smartphones and mp3-players) they are often included as part of an ASIC to reduce power consumption and a key property of many such transforms is that they are information-lossless (and thus invertible). There exist many application areas of such transforms and even the earliest quantum algorithms (including Shor's factorization algorithm [129]) make use of a quantum implementation of the Fourier transform. Also a implementation of the fast Fourier transform in reversible logic has been investigated [130].

Our contributions in this area have focused on implementation of wavelet transforms in reversible logic. Wavelet transforms have also been implemented in quantum computing [50], but our work builds on a paper by Bruekers and van den Enden [24]. Here, they showed a new network structure (the so-called *lifting scheme*) that can be used for perfect inversion and reconstruction of the inputs. This is desired in many transforms and the properties are also a perfect match for reversible implementations. Daubechies and Sweldens show how to factorize wavelet transforms into a lifting scheme [37] and we use this to find the lifting scheme for the linear transform of the *H.264 video encoding* [40] (**Paper C1**). This implementation, however, generates garbage that is caused by a multiplication-by-5.<sup>8</sup> In the latter work we examine other linear transforms that only have multiplications-by-2 [27] (**Paper C2**) and find the associated lifting scheme. (The papers also contain work on CMOS fabrication and testing of the transform from [40]. We shall discuss this in Section 4.)

 $<sup>^{8}</sup>$ The design was made before our own garbage-free constant multiplication circuit [14], which solves the problem with multiplication-by-5.

### Computer Aided Design of Reversible Circuits

In Section 2 we discussed the design of reversible logic circuits. All presented circuits (adders, multipliers, and transforms) have a very regular structure, which makes it a lot easier for humans to reason about them. Also, much of the development comes from novel design ideas based on a theoretical insight to the problem. However, not all problems have these properties, so in order to make good realizations of more complex circuits possible, efficient description languages, logic synthesis, and optimization techniques are being developed. Computer aided design of Boolean circuits has been (and is still being) developed (cf. [100]), so in this section we will focus on methods to aid the design of reversible circuits.

#### 3.1 Reversible Logic Synthesis and Optimization

The first approach to reversible logic synthesis is actually a very beautiful example of how mathematics can be related to reversible circuits. Based on work by Rayner and Newman [120], Storme and De Vos [135] used that reversible gates and cascading of these by serial composition forms a group, with the result that it is possible to use the known methods from group theory. Specifically, one of these methods can be used to decompose an arbitrary reversible circuit into a cascade of simpler circuits that only updates one input wire [39, 42]. Each of these simpler circuits can then be interpreted as an *exclusive-or-sum-of-products* (ESOP), which basically is a cascade of controlled-not gates with the number of controls of each gate equals to the number of variables in the products.

Reversible logic synthesis have since been much researched and often with an interest to also apply the methods to quantum circuits. Many of the approaches is based on techniques known from Boolean logic synthesis. Perkowski *et al.* [3,117] made a method for hierarchical decomposition using three known decomposition techniques and Shende *et al.* [128] implemented a brute-force algorithm with memorization, that can find optimal circuits. Maslov, Dueck, and Miller [92,93] implemented heuristic methods based on truth tables that also use the Fredkin gate. Other approaches are based on positive-polarity Reed-Muller expansion [62], or Reed-Muller spectra [94], and finally optimal synthesis based on satisfiability of Boolean formulas [61, 157]. There exists many more, but common to these are that they are based on truth tables as input and that they actually try to solve an NP-hard problem. They can, therefore, find good or optimal decompositions of reversible circuits with small width (4, perhaps 5 wires), but for more complex circuits with a large amount of wires, they have to give up. Heuristic methods using more compact function representations (*e.g.* decision diagrams) have been suggested to make synthesis of larger functions possible [131, 155]. These, however, add many extra lines to the design, which are often left as garbage.

In parallel with the development in synthesis, methods for optimization of reversible circuits have been developed. One such method is called *template matching* [92,101]. Based on a large set of identity circuits, the method matches a subpart of the identity circuits with subparts of the circuit to be optimized. If a larger part of the identity circuit is found it can be replaced by the smaller part without changing the functionality of the circuit. Other methods are optimizations based on ESOP minimization [134,147] and reducing logic depth by expanding the logic width (adding more lines) [95,102].

#### 3.2 Hardware Description Languages

Domain specific languages (DSLs) to describe computing systems and circuits have been extensively studied, *e.g.* resulting in DSLs such as VHDL, Verilog, and SystemC. Taking inspiration from this, Wille *et al.* implemented the reversible design language SyReC [156, 158]. The language builds on Janus and, therefore, it has the same properties of being sequential and imperative. This makes it less suitable for describing logic with an inherent concurrent structure. Therefore, SyReC is mainly used for creating the logic-level data structures used in synthesis. It has been used to implement different circuits from multiplication to a simple architecture [112, 113, 160]. Also, a student project at DIKU investigated a simple imperative language to describe reversible logic circuits [85].

Our work focuses on using functional languages to describe reversible logic. Using functional languages to describe logic dates back to the 1980s (see 'non-survey' by Sheeran [127]), but although research has continued, these languages are not widely used in industry. This, however, did not stop us. So we have designed two languages with the hope that description of reversible logic combined with functional languages can be a success.

The first language is a point-free combinator-style language and it is designed to be close to the reversible logic gate-level [137] (Paper D1). The language is inspired by Sheeran's  $\mu$ FP [126] but it is also related to other languages and models. A first example is Ruby [71]<sup>9</sup> that, even thought it is made to describe conventional circuits, also has algebraic laws for inverse composition. From reversible computing, a computation model designed by Green and Altenkirch [60] to study the relation between reversible and irreversible computations, use some similar basic combinators and some of the algebraic laws that is also used in our work. James and Sabry's  $\Pi^O$  calculus [68,69] is a point-free language with a similar type system with product and sum of wires. Finally, Coecke and Duncan's ZX-calculus [33,34] is a graphical calculus used to simulate quantum computations and uses some of the compositions (plus some special quantum compositions) and rewriting with using algebraic laws.

 $<sup>^{9}\</sup>mathrm{Here},$  we refer to the hardware description language and *not* the later dynamic-typed object-oriented language by Matsumoto.

The second language is a linear typed higher-level functional language [139] (**Paper D2**) with constructs such as conditionals and a let-in statements for local wire updates, which uses size-change termination to ensure termination of recursions. The language has some similarities with the previously mentioned reversible functional languages but also conventional languages like Lava [21] and Park and Im's linear lambda calculus  $(l\lambda)$  [115].

The **Paper D2** also shows ideas for a design flow that can be used garbagefree translation to reversible circuits, by using on the combinator language [137] as an intermediate language.

### Realization of Reversible Circuits

In Landauer's seminal paper [79], he sets out to identify the possible sources of errors (or heat generation) in a physical computer. He identified three, where one of these was the dissipation of heat due to irreversibility, which then became the major topic of the paper. The other two sources he identified as (1) incomplete switching due to fast switching time and (2) decay of stored information. This is a bit simplified but gives the overall picture. Landauer also knew that these two sources, in the implementation technology of his time (as well as today's CMOS), is much higher than Landauer's limit. So using reversible logic CMOS gates alone will, however, not be sufficient.

#### 4.1 Adiabatic Switching and Charge Recovery

The dissipative source relating to incomplete switching is what *adiabatic switch*ing tries to overcome. The basic concept is to achieve asymptotically zero energy loss when the switching time goes towards infinite (hence the name adiabatic).<sup>10</sup> The concept has been studied throughout the 1980's by Fredkin and Toffoli [54], Mead [98], Feynman [49], and Seitz et al. [124], but it was Athas, Koller, and Svensson [6,74] that first used the term adiabatic and applied adiabatic switching successfully to reversible logic by reusing the signal energy (sometime also called *charge-recovery*). Reusing the input signals to generate the outputs is the key concept in Fredkin and Toffoli's conservative logic; you have the same number of billiard balls at input and output. Reversible logic can easily be converted to conservative logic by using complementary signals: each bit is represented by both its value and its negated value. The combination of adiabatic switching and charge recovery is the key concept behind the designed (and fabricated) logic families that followed. We will here sketch two different families (perhaps the two most influential families to reversible logic, so far), but there exist others, e.g. Kramer et al. [76], Vetuli et al. [149], and Amirante et al. [4].

Split-level charge recovery (SLCR) logic was presented in 1994 by Younis and Knight [165] with some later improvements by Frank [52]. Inspired by

<sup>&</sup>lt;sup>10</sup>It is a common misunderstanding (and a often used reason for rejection of the entire field) that reversible and adiabatic circuits will lead to CMOS circuits that consume no energy at all. This is not true (it is not practical to use infinite time for a computation) and there is no claim of this in the literature. It is, however, a well-established fact (which to some extent also governs today's chip design of multi-core processors) that there is a tradeoff between energy consumption and switching time [166].

work by Hall [63], the gates resemble those of static CMOS logic, but with the significant difference that the constant voltage-source and voltage-drain are exchanged with trapezoid-shaped clock-signals. Full reversibility can, therefore, *not* be achieved at gate level, but instead Bennett's compute-copy-uncompute method is applied to ensure charge recovery. The Pendulum processor was implemented in this logic family [151].

Complementary pass-transistor (CPT) logic was developed by De Vos [38, 39] a few years after SLCR and has some similarities with work by Seitz et al. [124] and Merkle [99]. Here, the reversible gates are implemented by pass-gates, which work as switches that 'guide' the dual-line input signals to the desired output lines. The adiabatic switching is achieved using signals that gradually change from 'undefined' to either TRUE or FALSE and back (often using a triangular, trapezoidal, or sine-wave signal), and the pass-gates, thus, switch only when there is little voltage across the gate. As pass-gates are not *ideal* switches, this logic family has also been called semi-adiabatic [53]. An advantage of using CPT logic is, however, that the circuits can directly be used in both directions, showing reversibility directly.

SPICE simulations of reversible CPT circuits have shown that such implementations have the potential to reduce energy consumption by about a factor of ten [41] using 0.35  $\mu$ m CMOS. Similar results (with measurements showing a factor of about 5) have been presented by Amirante *et al.* [5] for their adiabatic logic family in 0.13  $\mu$ m CMOS. In both cases with a 'clock frequency' of about 10 MHz.

Our contribution has been to take the CPL logic family and implement the reversible gates applying the *standard cells methodology* [138] (**Paper E1**) with the goal of using them in a future design flow. In 2003, Frank had the same goal with a generalized version of SLCR logic [53], but no results of this work have ever been published. Blotti and Saletti [22] have also looked at semicustom designs for *positive feedback adiabatic logic* (PFA logic), the family first presented by Vetuli *et al.* [149].

#### 4.2 Embedding in Static CMOS

We do not expect to see fully reversible systems commercially available in the near future. From this perspective it is interesting to consider hybrid systems, where reversible CMOS circuits are embedded within static CMOS.

The first to look at this were Amirante *et al.* [5,51] for PFA logic. Though the gate designs of this family are similar to SLCR logic, a trapezoidal signal is used to switch the transistors. This result in the problem of converting between a trapezoidal and a digital signal; a problem we also have with CPT logic. Amirante *et al.* use a two-stage memory to synchronize the digital input with desired clock and two 2-to-1 switches to generate the non-inverted and inverted trapezoidal signals.

We have taken a similar approach [26] (**Paper E2**), but a major difference is that we also want to be able to use the reversible circuits in both directions. We solved this by an extra array of parallel switches that is controlled by a direction bit. In the implemented design, all signals (inputs, direction bit, trapezoidal signals) is generated by an FPGA.

### Conclusions and Perspectives

5

In this thesis, we have investigated the feasibility of designing and implementing garbage-free reversible computing systems. We have found that this, to a large extent, is possible with the knowledge we have today, but there are still many non-trivial barriers that need to be overcome. Experience and 'expert knowledge' about reversible computing is definitely an advantage when making these designs, but this is, of course, also the case with many other areas of computer science.

More specifically, we have developed new garbage-free circuits for addition and are working towards a general multiplication circuit. We have also combined multiple operations together to implement a reversible arithmetic logic unit. With these and other garbage-free arithmetic circuits it is possible to design larger reversible computing systems. As an example, we have implemented discrete lossless transforms by redesigning these with a lifting scheme. We have also shown the design of a reversible computing architecture and implemented this using only reversible logic gates. While, these are still small systems, with further development it should be possible to use similar strategies to implement even larger systems.

From our own design experience, we know that designing logic gate-level circuits quickly becomes complicated when the functionality and number of wires involved are increased. To make the design process easier, we have developed two hardware description languages. Using examples from known reversible circuits, we have shown that circuits can be described reasonably concisely. These are, however, still small examples and we need to implement a larger system to show the usefulness of the languages.

There are basically two different gains that are advocated for the use of reversible systems. The first is reduced energy consumption both due to Landauer's principle and a change to a adiabatic CMOS logic family or another future technology. The second gain is functionality due to the fact that the circuits (and programs) can be used in both directions. The advantage is that the same implementation can be used at multiple places (*e.g.* the same design for both the fast fourier transform (FFT) and inverse FFT) or that the same physical circuit can be used for multiple purposes (*e.g.* if the FFT is little used the same circuits can also be used for the inverse FFT). While realizing the first gain is left for future development, we believe that, with our design experience, the second gain is possible to achieve today, with benefit to future reversible computing systems.

#### 5.1 Future Work

Though the foundations for reversible computing were laid fifty years ago, and interest have increased considerably in recent years, in many ways the area is still young. In this PhD thesis, some of the unknown land was covered, but there is still plenty to explore for the future.

#### Arithmetic Logic Circuits

For conventional logic circuits there exist much research, even whole books, dedicated to the design and implementation of computer arithmetic. This is definitely not the case for reversible logic. The constraint that the circuits must be garbage-free is what makes it an interesting research problem, but most proposed designs (both hand-made and CAD generated) still implement the conventional algorithms *with* garbage. They use the reversible gates, but as their sole goal is to reduce logic size or number of garbage bits for a specific fixed-size circuit, very little knowledge is actually gained from this approach.

However, arithmetic functions often have some inherent properties that can be exploited to make a very regular circuit design. A good example is the ripple-carry adder, where only a redesign gave the garbage-free V-shaped adder; a redesign that none of the automatic approaches can find. In many cases the arithmetic function itself must also be redefined, such that it can be expressed reversibly. Here our current work on multiplication is the obvious example. With this in mind, we need more design work on good garbage-free implementations of reversible circuits.

#### Computer Aided Design

A lot of research has focused on reversible (or quantum) logic synthesis, resulting in optimal circuits for small input sizes. There has, however, been very little research on how to describe and design reversible systems; often a truth table (a permutation of the input vectors) is used.

Recently, SyReC [158] and my two languages [137,139] have been presented, but these are still only initial steps and far from 'full' description languages. More work on these languages is needed to make them better to use for descriptions of reversible circuits. A good way to acquire experience is by implementing reversible systems in these languages.

Also, algorithms for optimization have only been designed for gate-level descriptions. We must move this to a higher abstraction level and possible methods for this could be term rewriting or partial evaluation. However, there exist many other conventional optimization methods (*e.g.* from compiler technology) and some of these might also apply.

We know that there is a tradeoff between ancilla lines (logic width), logic depth, and the number of gates; *e.g.* adding one ancilla line allows a linear-depth adders with few gates, while adding n ancilla lines allows a logarithmic-depth adder. If we can find a more exact relations between these resources we can use this in synthesis. The descriptions in reversible HDLs have some degree of modularity, so we could also use approaches for trading logic depth with adding [95, 102] or removing [159] ancilla lines, depending on the ancillae lines already available.

Feynman's widely used diagram notation has an inherent 1-dimensional structure, in the sense that gates only operate vertically with computations proceeding from left to right. This 1-dimensional structure is a good description of many quantum architectures, but in recent years new architectures have been suggested, which have a 2 or even 3-dimensional structure. This has led to research in quantum circuit that use more dimensions [31,118,123]. In quantum circuits design there is a 'nearest-neighbor'-approach coming from quantum architecture models, in which qubits can only interact with its neighboring qubits. For CMOS circuit there is no nearest-neighbor problem, but the circuit are 2-dimensional and methods for placing and routing these circuits have been used for many years. There is, however, an important difference between these quantum architectures and CMOS circuits: qubits are represented with a single object (in some quantum architectures the qubits are even fixed in a placed) and the operations interact between them, while in CMOS the gates (operations) are places and the wires (bits) are routed in-between. It would, however, still be interesting to see if part of the place-and-route methods can be applied to quantum circuits also.

#### Implementation of Circuits

Only very recently was Landauer's principle experimentally verified [20], but it is still any open question how (or if) this can be used for energy reduction in a 'real' computer. Initial simulations of adiabatic switched CMOS circuits show a possible energy reduction [5,41], but there is still no experimental evidence for a whole system. Very recently Orlov *et al.* [114] showed that reversibly modifying (with copy and uncopy operations) a simple memory element (a capacitor and a resistor) with adiabatic switching can be done with lower heat dissipation than Landauer's limit at up to about 15 MHz. It is, however, a possibility that CMOS technology will never switch efficiently enough for implementations of reversible gates to be viable, if reduction of energy consumption is the main goal. It could be that a completely new implementation technology is needed [154]. Some potential technologies are nanoelectronic devices [56], nanomagnets [78], and superconductor electronics [87]. This work I will, however, leave to qualified engineers and physicists.

#### **Asynchronous Circuits**

Asynchronous circuits [88,132] have a long history as a technology that show promising results with respect to speed and energy consumption, but it is also a technology that, so far, have had little influence outside the research environments.<sup>11</sup> In this sense it has a comparable history to using functional hardware description languages, and to some part also adiabatic (reversible) circuits. It would be very interesting to investigate if a combination of asynchronous and reversible (adiabatic) circuits would be a good match. Describing and synthesizing asynchronous circuits is hard (there exist some CSP-based approaches), so perhaps a combination with functional languages is even possible.

<sup>&</sup>lt;sup>11</sup>Asynchronous circuits have been used for commercially produced network switches, which have also lead to interest from Intel [89].

### Bibliography

- [1] Aaronson, S. Complexity zoo. http://www.complexityzoo.com/, 2012.
- [2] Abramov, S., and Robert, G. The universal resolving algorithm and its correctness: inverse computation in a functional language. *Science of Computer Programming* 43, 23 (2002), 193–229. Mathematics of Program Construction (MPC 2000).
- [3] Al-Rabadi, A. N. Reversible Logic Synthesis: From Fundamentals to Quantum Computing. Springer-Verlag, 2004.
- [4] Amirante, E., Bargagli-Stoffi, A., Fischer, J., Iannaccone, G., and Schmitt-Landsiedel, D. Variations of the power dissipation in adiabatic logic gates. In 11th International Workshop on Power and Timing Modeling (2002), p. D1.1.
- [5] Amirante, E., Fischer, J., Lang, M., Bargagli-Stoffi, A., Berthold, J., Heer, C., and Schmitt-Landsiedel, D. An ultra low-power adiabatic adder embedded in a standard 0.13μm CMOS environment. In *Solid-State Circuits Conference, 2003. ESSCIRC '03. Proceedings* (2003), IEEE, pp. 599–602.
- [6] Athas, W. C., and Svensson, L. J. Reversible logic issues in adiabatic CMOS. In Workshop on Physics and Computation, 1994. PhysComp '94, Proceedings (1994), IEEE, pp. 111–118.
- [7] Axelsen, H. Time complexity of tape reduction for reversible turing machines. In *Reversible Computation, RC2011. Revised Selected Papers*, A. De Vos and R. Wille, Eds., vol. 7165 of *LNCS*. Springer-Verlag, 2012, pp. 1–13.
- [8] Axelsen, H. B. Clean translation of an imperative reversible programming language. In *Compiler Construction. Proceedings* (2011), J. Knoop, Ed., vol. 6601 of *LNCS*, Springer-Verlag, pp. 142–161.
- [9] Axelsen, H. B. Reversible multi-head finite automata characterize reversible logarithmic space. In *Language and Automata Theory and Applications. Proceedings* (2012), A.-H. Dediu and C. Martín-Vide, Eds., vol. 7183 of *LNCS*, Springer-Verlag, pp. 95–105.

- [10] Axelsen, H. B., and Glück, R. A simple and efficient universal reversible Turing machine. In *Language and Automata Theory and Applications*. *Proceedings* (2011), A.-H. Dediu, S. Inenaga, and C. Martín-Vide, Eds., vol. 6638 of *LNCS*, Springer-Verlag, pp. 117–128.
- [11] Axelsen, H. B., and Glück, R. What do reversible programs compute? In FOSSACS (2011), M. Hofmann, Ed., vol. 6604 of LNCS, Springer-Verlag, pp. 42–56.
- [12] Axelsen, H. B., Glück, R., De Vos, A., and Thomsen, M. K. MicroPower: Towards low-power microprocessors with reversible computing. *ERCIM News* 79, 1 (2009), 20–21.
- [13] Axelsen, H. B., Glück, R., and Yokoyama, T. Reversible machine code and its abstract processor architecture. In CSR (2007), V. Diekert, M. V. Volkov, and A. Voronkov, Eds., vol. 4649 of LNCS, Springer-Verlag, pp. 56–69.
- [14] Axelsen, H. B., and Thomsen, M. K. Garbage-free integer multiplication with constants. In 4th Workshop on Reversible Computing, Preliminary Proceedings (2012), R. Glück and T. Yokoyama, Eds., pp. 198–204.
- [15] Barenco, A., Bennett, C. H., Cleve, R., DiVincenzo, D. P., Margolus, N., Shor, P., Sleator, T., Smolin, J. A., and Weinfurter, H. Elementary gates for quantum computation. *Physical Review A* 52, 5 (1995), 3457–3467.
- [16] Benioff, P. The computer as a physical system: A microscopic quantum mechanical Hamiltonian model of computers as represented by turing machines. *Journal of Statistical Physics 22*, 5 (1980), 563–591.
- [17] Bennett, C. H. Logical reversibility of computation. IBM Journal of Research and Development 17, 6 (1973), 525–532.
- [18] Bennett, C. H. Time/Space Trade-Offs for reversible computation. SIAM Journal on Computing 18, 4 (1989), 766–776.
- [19] Bennett, C. H., Gács, P., Li, M., Vitányi, P. M. B., and Zurek, W. H. Thermodynamics of computation and information distance. In *Proceed*ings of the twenty-fifth annual ACM symposium on Theory of computing (1993), STOC '93, ACM, pp. 21–30.
- [20] Bérut, A., Arakelyan, A., Petrosyan, A., Ciliberto, S., Dillenschneider, R., and Lutz, E. Experimental verification of Landauer's principle linking information and thermodynamics. *Nature* 483, 7388 (2012), 187–189.
- [21] Bjesse, P., Claessen, K., Sheeran, M., and Singh, S. Lava: hardware design in Haskell. In *Proceedings of the third ACM SIGPLAN International Conference on Functional programming* (1998), ICFP '98, ACM, pp. 174–184.
- [22] Blotti, A., and Saletti, R. Ultralow-power adiabatic circuit semi-custom design. Very Large Scale Integration (VLSI) Systems, IEEE Transactions on 12, 11 (2004), 1248–1253.

- [23] Briggs, J. S. Generating reversible programs. Software: Practice and Experience 17, 7 (1987), 439–453.
- [24] Bruekers, F., and van den Enden, A. New networks for perfect inversion and perfect reconstruction. Selected Areas in Communications, IEEE Journal on 10, 1 (1992), 129–137.
- [25] Buhrman, H., Tromp, J., and Vitányi, P. Time and space bounds for reversible simulation. Journal of Physics A: Mathematical and General 34, 35 (2001), 6821–6830.
- [26] Burignat, S., Thomsen, M. K., Klimczak, M., Olczak, M., and De Vos, A. Interfacing reversible pass-transistor CMOS chips with conventional restoring CMOS circuits. In *Reversible Computation, RC 2011. Revised Papers* (2012), A. De Vos and R. Wille, Eds., vol. 7165 of *LNCS*, Springer-Verlag, pp. 112–122.
- [27] Burignat, S., Vermeirsch, K., De Vos, A., and Thomsen, M. K. Garbageless reversible implementation of integer linear transformations. In 4th Workshop on Reversible Computing, Preliminary Proceedings (2012), R. Glück and T. Yokoyama, Eds., pp. 187–197.
- [28] Burks, A. W., Goldstine, H. H., and von Neumann, J. Preliminary discussion of the logical design of an electronic computing instrument. Tech. rep., Institute of Advanced Study, U.S. Army, 1947.
- [29] Cezzar, R. The design of a processor architecture capable of forward and reverse execution. In *IEEE Proceedings of the SOUTHEASTCON '91* (1991), vol. 2, IEEE, pp. 885–890.
- [30] Chau, H. F., and Lo, H. K. One-way functions in reversible computations. *Cryptologia* 21, 2 (1997), 139.
- [31] Choi, B.-S., and Van Meter, R. On the effect of quantum interaction distance on quantum addition circuits. *Journal of Emerging Technology* and Computing Systems 7, 3 (2011), 11:1–11:17.
- [32] Clementsen, P. J., Axelsen, H. B., and Glück, R. Reversible coroutines. In Nordic Workshop in Programming Theory '10, Proceedings (2010). Extended Abstract.
- [33] Coecke, B., and Duncan, R. Interacting quantum observables. In Automata, Languages and Programming, L. Aceto, I. Damgård, L. Goldberg, M. Halldórsson, A. Ingólfsdóttir, and I. Walukiewicz, Eds., vol. 5126 of LNCS. Springer-Verlag, 2008, pp. 298–310.
- [34] Coecke, B., and Duncan, R. Interacting quantum observables: categorical algebra and diagrammatics. *New Journal of Physics 13*, 4 (2011), 043016.
- [35] Crescenzi, P., and Papadimitriou, C. H. Reversible simulation of spacebounded computations. *Theoretical Computer Science* 143, 1 (1995), 159–165.

- [36] Cuccaro, S. A., Draper, T. G., Kutin, S. A., and Moulton, D. P. A new quantum ripple-carry addition circuit. arXiv:quant-ph/0410184v1 (2005).
- [37] Daubechies, I., and Sweldens, W. Factoring wavelet transforms into lifting steps. Journal of Fourier Analysis and Applications 4, 3 (1998), 247–269.
- [38] De Vos, A. Reversible computing. Progress in Quantum Electronics 23, 1 (1999), 1–49.
- [39] De Vos, A. Reversible Computing: Fundamentals, Quantum Computing and Applications. WILEY, 2010.
- [40] De Vos, A., Burignat, S., and Thomsen, M. K. Reversible implementation of a discrete integer linear transform. *Journal of Multiple-Valued Logic* and Soft Computing, Special Issue: Reversible Computation 18, 1 (2012), 25–35.
- [41] De Vos, A., and Van Rentergem, Y. Energy dissipation in reversible logic addressed by a ramp voltage. In *Proceedings of the 15th International* Workshop PATMOS (2005), pp. 207–216.
- [42] De Vos, A., Van Rentergem, Y., and De Keyser, K. The decomposition of an arbitrary reversible logic circuit. *Journal of Physics A: Mathematical* and General 39 (2006), 5015–5035.
- [43] Desoete, B., and De Vos, A. A reversible carry-look-ahead adder using control gates. *Integration, the VLSI Journal 33*, 1-2 (2002), 89–104.
- [44] Deutsch, D. Quantum theory, the Church-Turing principle and the universal quantum computer. Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences 400, 1818 (1985), 97–117.
- [45] Draper, T. G., Kutin, S. A., Rains, E. M., and Svore, K. M. A logarithmic-depth quantum carry-lookahead adder. arXiv (2008).
- [46] Feinstein, D., Thornton, M., and Nair, V. Prefix parallel adder virtual implementation in reversible logic. In *Region 5 Technical Conference* (2007), IEEE, pp. 74–80.
- [47] Feynman, R. Simulating physics with computers. International Journal of Theoretical Physics 21, 6 (1982), 467–488.
- [48] Feynman, R. P. Quantum mechanical computers. Optics News 11 (1985), 11–20.
- [49] Feynman, R. P. Feynman Lectures on Computation. Addison-Wesley, 1996.
- [50] Fijany, A., and Williams, C. Quantum wavelet transforms: Fast algorithms and complete circuits. In *Quantum Computing and Quantum Communications* (1999), C. Williams, Ed., vol. 1509 of *LNCS*, Springer-Verlag, pp. 10–33.

- [51] Fischer, J., Amirante, E., Bargagli-Stoffi, A., and Schmitt-Landsiedel, D. Adiabatic circuits: converter for static cmos signals. *Advances in Radio Science* 1 (2003), 247–251.
- [52] Frank, M. P. Reversibility for Efficient Computing. PhD thesis, MIT, EECS, 1999.
- [53] Frank, M. P. Common mistakes in adiabatic logic design and how to avoid them. In *Proceedings of the International Conference on Embedded* Systems and Applications. ESA'03 (2003), H. Arabnia and L. Yang, Eds., CSREA Press, pp. 216–222.
- [54] Fredkin, E., and Toffoli, T. Design principles for achieving highperformance submicron digital technologies. Tech. rep., Proposal to DARPA, 1978.
- [55] Fredkin, E., and Toffoli, T. Conservative logic. International Journal of Theoretical Physics 21, 3-4 (1982), 219–253.
- [56] Galatsis, K., Khitun, A., Ostroumov, R., Wang, K. L., Dichtel, W. R., Plummer, E., Stoddart, J. F., Zink, J. I., Lee, J. Y., Xie, Y.-H., and Kim, K. W. Alternate state variables for emerging nanoelectronic devices. *IRE Transactions on Nanotechnology 8* (2008), 66–75.
- [57] Gay, S. J. Quantum programming languages: survey and bibliography. Mathematical Structures in Computer Science 16, 04 (2006), 581–600.
- [58] Gershenfeld, N. Signal entropy and the thermodynamics of computation. IBM Systems Journal 35, 3.4 (1996), 577–586.
- [59] Glück, R., and Kawabe, M. A method for automatic program inversion based on LR(0) parsing. *Fundamenta Informaticae* 66, 4 (2005), 367–395.
- [60] Green, A. S., and Altenkirch, T. From reversible to irreversible computations. *Electronic Notes in Theoretical Computer Science 210* (2008), 65–74. Proceedings of the 4th International Workshop on Quantum Programming Languages (QPL 2006).
- [61] Große, D., Chen, X., Dueck, G. W., and Drechsler, R. Exact SAT-based Toffoli network synthesis. In *Proceedings of the 17th ACM Great Lakes* symposium on VLSI (2007), GLSVLSI '07, ACM, pp. 96–101.
- [62] Gupta, P., Agrawal, A., and Jha, N. K. An algorithm for synthesis of reversible logic circuits. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on 25, 11 (2006), 2317–2330.
- [63] Hall, J. S. An electroid switching model for reversible computer architectures. In In Proc. of the Workshop on Physics and Computation (1993), IEEE Press, pp. 237–247.
- [64] Hall, J. S. A reversible instruction set architecture and algorithms. In Proceedings Workshop on Physics and Computation. PhysComp '94 (1994), pp. 128–134.

- [65] Huelsbergen, L. A logically reversible evaluator for call-by-name lambda calculus. In Workshop on Physics and Computation, 1996. PhysComp '96, Proceedings (1996), T. Toffoli, M. Biafore, and L. J., Eds., IEEE.
- [66] Huffman, D. A. Canonical forms for information-lossless finite-state logical machines. *IRE Transactions on Information Theory* 5, 5 (1959), 41–59.
- [67] Jacopini, G., Mentrasti, P., and Sontacchi, G. Reversible Turing machines and polynomial time reversible computable functions. SIAM Journal of Discrete Mathematics 3, 2 (1990), 241–254.
- [68] James, R. P., and Sabry, A. Information effects. In Proceedings of the 39th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages (2012), POPL '12, ACM, pp. 73–84.
- [69] James, R. P., and Sabry, A. Isomorphic interpreters from logically reversible abstract machines. In 4th Workshop on Reversible Computing, Preliminary Proceedings (2012), R. Glück and T. Yokoyama, Eds., pp. 63–75.
- [70] Jensen, K. B., and Jensen, S. B. Reversible sorting algorithms in janus. DIKU student report, 2011.
- [71] Jones, G., and Sheeran, M. Circuit design in Ruby. In *Formal Methods* for VLSI Design (1990), Elsevier Science Publishers.
- [72] Khan, M. H., and Perkowski, M. A. Quantum ternary parallel adder/subtractor with partially-look-ahead carry. *Journal of Systems Architecture* 53 (2007), 453–464.
- [73] Kluge, W. A reversible SE(M)CD machine. In Implementation of Functional Languages, P. Koopman and T. Clack, Chrisffoli, Eds., vol. 1868 of LNCS. Springer-Verlag, 2000, pp. 95–113.
- [74] Koller, J., and Athas, W. Adiabatic switching, low energy computing, and the physics of storing and erasing information. In Workshop on Physics and Computation. PhysComp '92. (1992), pp. 267–270.
- [75] Kowada, L. A. B., Portugal, R., and Figueiredo, C. M. H. Reversible Karatsuba's algorithm. *Journal of Universal Computer Science* 12, 5 (2008), 499–511.
- [76] Kramer, A., Denker, J. S., Flower, B., and Moroney, J. 2nd order adiabatic computation with 2n-2p and 2n-2n2p logic circuits. In *Proceed*ings of the 1995 international symposium on Low power design (1995), ISLPED '95, ACM, pp. 191–196.
- [77] Kutrib, M., and Malcher, A. Reversible pushdown automata. In Language and Automata Theory and Applications, A. Dediu, H. Fernau, and C. Martín-Vide, Eds., vol. 6031 of LNCS. Springer-Verlag, 2010, pp. 368– 379.

- [78] Lambson, B., Carlton, D., and Bokor, J. Exploring the thermodynamic limits of computation in integrated systems: Magnetic memory, nanomagnetic logic, and the Landauer limit. *Phys. Rev. Lett.* 107 (2011), 010604.
- [79] Landauer, R. Irreversibility and heat generation in the computing process. IBM Journal of Research and Development 5, 3 (1961), 183–191.
- [80] Landauer, R. Information is physical. *Physics Today* 44, 5 (1991), 23–29.
- [81] Landauer, R. Zig-zag path to understanding. In Workshop on Physics and Computation, 1994. PhysComp '94, Proceedings (1994), IEEE, pp. 54–59.
- [82] Landin, P. J. The mechanical evaluation of expressions. The Computer Journal 6, 4 (1964), 308–320.
- [83] Lange, K., McKenzie, P., and Tapp, A. Reversible space equals deterministic space. Journal of Computer and System Sciences 60, 2 (2000), 354–367.
- [84] Lecerf, Y. Machines de Turing réversible. Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences 257 (1963), 2597–2600.
- [85] Lehnfeld, M. Design and translation of a description langauge for reversible hardware. DIKU student report, 2010.
- [86] Levine, R. Y., and Sherman, A. T. A note on Bennett's time space tradeoff for reversible computation. SIAM Journal on Computing 19, 4 (1990), 673–677.
- [87] Likharev, K. K. Superconductor digital electronics. *Physica C: Super*conductivity (2012). Accepted manuscript.
- [88] Lines, A. Pipelined asynchronous circuits. Tech. Rep. CaltechCSTR:1998.cs-tr-95-21, California Institute of Technology, 1998.
- [89] Lines, A. Asynchronous interconnect for synchronous SoC design. Micro, IEEE 24, 1 (2004), 32–41.
- [90] Lutz, C., and Derby, H. Janus: A time-reversible language. A letter to R. Landauer. http://tetsuo.jp/ref/janus.pdf, 1986.
- [91] Madsen, F. M., and Poulsen, D. R. Reversible matrix multiplication in janus. DIKU student report, 2011.
- [92] Maslov, D., Dueck, G. W., and Miller, D. M. Fredkin/Toffoli templates for reversible logic synthesis. In *Proceedings of the 2003 IEEE/ACM International Conference on Computer Aided Design* (2003), ACM Press, pp. 256–261.
- [93] Maslov, D., Dueck, G. W., and Miller, D. M. Synthesis of Fredkin-Toffoli reversible networks. *IEEE Transactions on Very Large Scale Integration* (VLSI) Systems 13, 6 (2005), 765–769.

- [94] Maslov, D., Dueck, G. W., and Miller, D. M. Techniques for the synthesis of reversible Toffoli networks. ACM Trans. Des. Autom. Electron. Syst. 12, 4 (2007).
- [95] Maslov, D., and Saeedi, M. Reversible circuit optimization via leaving the boolean domain. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on 30, 6 (2011), 806–816.
- [96] Matos, A. B. Linear programs in a simple reversible language. Theoretical Computer Science 290, 3 (2003), 2063–2074.
- [97] Matsuda, K., Mu, S.-C., Hu, Z., and Takeichi, M. A grammar-based approach to invertible programs. In *Programming Languages and Systems*, A. Gordon, Ed., vol. 6012 of *LNCS*. Springer-Verlag, 2010, pp. 448–467.
- [98] Mead, C., and Conway, L. Introduction to VLSI Systems, second ed. Addison-Wesley, 1980.
- [99] Merkle, R. C. Reversible electronic logic using switches. Nanotechnology 4, 1 (1993), 21–40.
- [100] Micheli, G. D. Synthesis and Optimization of Digital Circuits. McGraw-Hill Higher Education, 1994.
- [101] Miller, D. M., Maslov, D., and Dueck, G. W. A transformation based algorithm for reversible logic synthesis. In *Design Automation Conference*, 2003. Proceedings (2003), pp. 318–323.
- [102] Miller, D. M., Wille, R., and Drechsler, R. Reducing reversible circuit cost by adding lines. In *Multiple-Valued Logic (ISMVL)*, 2010 40th IEEE International Symposium on (2010), pp. 217–222.
- [103] Mogensen, T. Æ. Semi-inversion of guarded equations. In *Generative Programming and Component Engineering*, R. Glück and M. Lowry, Eds., vol. 3676 of *LNCS*. Springer-Verlag, 2005, pp. 189–204.
- [104] Mogensen, T. Æ. Semi-inversion of functional parameters. In Proceedings of the 2008 ACM SIGPLAN symposium on Partial evaluation and semantics-based program manipulation (2008), PEPM '08, ACM, pp. 21– 29.
- [105] Mogensen, T. Æ. Partial evaluation of the reversible language janus. In Proceedings of the 20th ACM SIGPLAN workshop on Partial evaluation and program manipulation (2011), PEPM '11, ACM, pp. 23–32.
- [106] Mogensen, T. Æ. Partial evaluation of janus part 2: Assertions and procedures. In *Perspectives of Systems Informatics*, E. Clarke, I. Virbitskaite, and A. Voronkov, Eds., vol. 7162 of *LNCS*. Springer-Verlag, 2012, pp. 289–301.
- [107] Morita, K. A simple universal logic element and cellular automata for reversible computing. In *Third International Conference on Machines*, *Computations, and Universality. MCU 2001* (2001), M. Margenstern and Y. Rogozhin, Eds., vol. 2055 of *LNCS*, Springer-Verlag, pp. 102–113.

- [108] Morita, K. Reversible computing and cellular automata-a survey. Theoretical Computer Science 395, 1 (2008), 101–131.
- [109] Morita, K., Ogiro, T., Tanaka, K., and Kato, H. Classification and universality of reversible logic elements with one-bit memory. *Lecture Notes in Computer Science* 3354 (2005), 245–256.
- [110] Morita, K., and Yamaguchi, Y. A universal reversible Turing machine. In 5th International Conference on Machines, Computations, and Universality, MCU 2007 (2007), vol. 4664 of LNCS, pp. 90–98.
- [111] Mu, S.-C., Hu, Z., and Takeichi, M. An injective language for reversible computation. In *Mathematics of Program Construction* (2004), LNCS, Springer-Verlag, pp. 289–313.
- [112] Offermann, S., Wille, R., and Drechsler, R. Efficient realization of control logic in reversible circuits. In *Specification & Design Languages*, FDL 2011. Forum on (2011), pp. 1–7.
- [113] Offermann, S., Wille, R., Dueck, G. W., and Drechsler, R. Synthesizing multipliers in reversible logic. In 13th IEEE Symposium on Design and Diagnostics of Electronic Circuits and Systems (2010), IEEE, pp. 335– 340.
- [114] Orlov, A. O., Lent, C. S., Thorpe, C. C., Boechler, G. P., and Snider, G. L. Experimental test of Landauer's principle at the sub-k<sub>b</sub>t level. Japanese Journal of Applied Physics 51 (2012), 06FE10.
- [115] Park, S., and Im, H. A calculus for hardware description. Journal of Functional Programming 21, 01 (2011), 21–58.
- [116] Peres, A. Reversible logic and quantum computing. *Physical Review A* 32, 6 (1985), 3266–3276.
- [117] Perkowski, M., Jozwiak, L., Kerntopf, P., Mishchenko, A., Al-Rabadi, A., Coppola, A., Buller, A., Song, X., Khan, M. H., Yanushkevich, S., Shmerko, V. P., and Chrzanowska-Jeske, M. A general decomposition for reversible logic. *Proc. Int'l Workshop on Applications of Reed-Muller Expansion in Circuit Design* (2001), 119–138.
- [118] Pham, P., and Svore, K. M. A 2D nearest-neighbor quantum arithmetic for factoring. In 4th Workshop on Reversible Computing, Preliminary Proceedings (2012), R. Glück and T. Yokoyama, Eds., pp. 158–170.
- [119] Pin, J. On reversible automata. In LATIN '92 (1992), I. Simon, Ed., vol. 583 of LNCS, Springer-Verlag, pp. 401–416.
- [120] Rayner, M. R., and Newman, D. J. On the symmetry of logic. Journal of Physics A: Mathematical and General 28, 19 (1995), 5623.
- [121] Reilly, Jr., E. D., and Federighi, F. D. On reversible subroutines and computers that run backwards. *Communications of the ACM 8*, 9 (1965), 557–558.

- [122] Ressler, A. L. The design of a conservative logic computer and a graphical editor simulator. Master's thesis, MIT, EECS, 1981.
- [123] Rosenbaum, D. J. Optimal quantum circuits for nearest-neighbor architectures. arXiv:1205.0036v2 (2012).
- [124] Seitz, C. L., Frey, A. H., Mattisson, S., Rabin, S. D., Speck, D. A., and van de Snepscheut, J. L. A. Hot clock nmos. Tech. rep., California Institute of Technology, 1985.
- [125] Shannon, C. E. A mathematical theory of communication. The Bell System Technical Journal 27 (1948), 379–423.
- [126] Sheeran, M. muFP, a language for VLSI design. In Proceedings of the 1984 ACM Symposium on LISP and functional programming (1984), LFP '84, ACM, pp. 104–112.
- [127] Sheeran, M. Hardware design and functional programming: a perfect match. Journal of Universal Computer Science 11, 7 (2005), 1135–1158.
- [128] Shende, V. V., Prasad, A. K., Markov, I. L., and Hayes, J. P. Reversible logic circuit synthesis. In *Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design* (2002), ICCAD '02, ACM, pp. 353–360.
- [129] Shor, P. W. Algorithms for quantum computation: discrete logarithms and factoring. In Foundations of Computer Science, 1994 Proceedings., 35th Annual Symposium on (1994), IEEE, pp. 124–134.
- [130] Skoneczny, M., Van Rentergem, Y., and De Vos, A. Reversible Fourier transform chip. In Proceedings of the 15th International Conference on Mixed Design of Integrated Circuits and Systems (2008), IEEE, pp. 281– 286.
- [131] Soeken, M., Wille, R., and Drechsler, R. Hierarchical synthesis of reversible circuits using positive and negative Davio decomposition. In *Design and Test Workshop (IDT), 2010 5th International* (2010), pp. 143– 148.
- [132] Sparsø, J. Asynchronous circuit design A tutorial. Technical University of Denmark, 2006.
- [133] Srivastava, S., Gulwani, S., Chaudhuri, S., and Foster, J. S. Path-based inductive synthesis for program inversion. In *Proceedings of the 32nd* ACM SIGPLAN conference on Programming language design and implementation (2011), PLDI '11, ACM, pp. 492–503.
- [134] Stergiou, S., Daskalakis, K., and Papakonstantinou, G. A fast and efficient heuristic ESOP minimization algorithm. In *Proceedings of the 14th* ACM Great Lakes symposium on VLSI (2004), pp. 78–81.
- [135] Storme, L., De Vos, A., and Jacobs, G. Group theoretical aspects of reversible logic gates. *Journal of Universal Computer Science* 5, 5 (1999), 307–321.

- [136] Szilard, L. Über die Entropieverminderung in einem thermodynamischen System bei Eingriffen intelligenter Wesen. Zeitschrift für Physik A Hadrons and Nuclei 53 (1929), 840–856. 10.1007/BF01341281.
- [137] Thomsen, M. K. Describing and optimizing reversible logic using a functional language. In *Implementation and Application of Functional Languages, 23rd International Workshop, IFL 2012* (2012), A. Gill and J. Hage, Eds., LNCS. To appear.
- [138] Thomsen, M. K. Design of reversible logic circuits using standard cells / standard cells and functional programming. Tech. Rep. 2012-03, DIKU, Department of Computing Science, University of Copenhagen, 2012.
- [139] Thomsen, M. K. A functional language for describing reversible logic. In Specification & Design Languages, FDL 2012. Forum on (2012). To appear.
- [140] Thomsen, M. K., and Axelsen, H. B. Parallelization of reversible ripplecarry adders. *Parallel Processing Letters* 19, 1 (2009), 205–222.
- [141] Thomsen, M. K., Axelsen, H. B., and Glück, R. A reversible processor architecture and its reversible logic design. In *Reversible Computation*, *RC 2011. Revised Selected Papers* (2012), A. De Vos and R. Wille, Eds., vol. 7165 of *LNCS*, Springer-Verlag, pp. 30–42.
- [142] Thomsen, M. K., and Glück, R. Optimized reversible binary-coded decimal adders. *Journal of Systems Architecture* 54, 7 (2008), 697–706.
- [143] Thomsen, M. K., Glück, R., and Axelsen, H. B. Reversible arithmetic logic unit for quantum arithmetic. *Journal of Physics A: Mathematical* and Theoretical 43, 38 (2010), 382002.
- [144] Toffoli, T. Reversible computing. In ICALP (1980), J. W. de Bakker and J. van Leeuwen, Eds., vol. 85 of LNCS, Springer-Verlag, pp. 632–644.
- [145] Toffoli, T. Bicontinuous extensions of invertible combinatorial functions. Theory of Computing Systems 14 (1981), 13–23. 10.1007/BF01752388.
- [146] Van Rentergem, Y., and De Vos, A. Optimal design of a reversible full adder. International Journal of Unconventional Computing 1, 4 (2005), 339–355.
- [147] Van Rentergem, Y., and De Vos, A. Synthesis and optimization of reversible circuits. In *Proceedings of the Reed-Muller Workshop 2007* (2007), Reed-Muller, Ed., pp. 67–75.
- [148] Vedral, V., Barenco, A., and Ekert, A. Quantum networks for elementary arithmetic operations. *Physical Review A* 54, 1 (1996), 147–153.
- [149] Vetuli, A., Pascoli, S., and Reyneri, L. Positive feedback in adiabatic logic. *Electronics Letters 32*, 20 (1996), 1867–1869.
- [150] Vieri, C. J. Pendulum: A reversible computer architecture. Master's thesis, MIT, EECS, 1995.

- [151] Vieri, C. J. Reversible Computer Engineering and Architecture. PhD thesis, MIT, EECS, 1999.
- [152] Vitányi, P. Time, space, and energy in reversible computing. In CF '05: Proceedings of the 2nd conference on Computing Frontiers (2005), ACM, pp. 435–444.
- [153] Von Neumann, J. Theory of Self-Reproducing Automata. University of Illinois Press, 1966.
- [154] Welser, J., Bourianoff, G., Zhirnov, V., and Cavin, R. The quest for the next information processing technology. *Journal of Nanoparticle Research* 10 (2008), 1–10.
- [155] Wille, R., and Drechsler, R. BDD-based synthesis of reversible logic for large functions. In *Design Automation Conference*, 2009. DAC '09. 46th ACM/IEEE (2009), pp. 270–275.
- [156] Wille, R., and Drechsler, R. Towards a Design Flow for Reversible Logic. Springer Science, 2010.
- [157] Wille, R., Le, H. M., Dueck, G. W., and Große, D. Quantified synthesis of reversible logic. In DATE '08: Proceedings of the conference on Design, automation and test in Europe (2008), ACM, pp. 1015–1020.
- [158] Wille, R., Offermann, S., and Drechsler, R. SyReC: A programming language for synthesis of reversible circuits. In *Specification & Design Languages, FDL 2010. Forum on* (2010), IET, pp. 1–6.
- [159] Wille, R., Soeken, M., and Drechsler, R. Reducing the number of lines in reversible circuits. In *Proceedings of the 47th Design Automation Conference* (2010), DAC '10, ACM, pp. 647–652.
- [160] Wille, R., Soeken, M., Große, D., Schönborn, E., and Drechsler, R. Designing a RISC CPU in reversible logic. In 41st IEEE International Symposium on Multiple-Valued Logic (ISMVL) (2011), IEEE, pp. 170–175.
- [161] Yokoyama, T., Axelsen, H. B., and Glück, R. Principles of a reversible programming language. In *Conference on Computing Frontiers. Proceed*ings (2008), ACM Press, pp. 43–54.
- [162] Yokoyama, T., Axelsen, H. B., and Glück, R. Optimizing reversible simulation of injective functions. *Multiple-Valued Logic and Soft Computing* (2011).
- [163] Yokoyama, T., Axelsen, H. B., and Glück, R. Towards a reversible functional language. In *RC 2011. Revised Selected Papers* (2012), A. De Vos and R. Wille, Eds., LNCS, Springer-Verlag.
- [164] Yokoyama, T., and Glück, R. A reversible programming language and its invertible self-interpreter. In *Partial Evaluation and Program Manip*ulation. Proceedings (2007), ACM Press, pp. 144–153.

- [165] Younis, S. G., and Knight, J. T. F. Asymptotically zero energy computing split-level charge recovery logic. *International Workshop on Low Power Design* (1994), 177–182.
- [166] Zhirnov, V. V., Cavin, R. K. I., Hutchby, J. A., and Bourianoff, G. I. Limits to binary logic switch scaling - a gedanken model. *Proceedings of the IEEE 91*, 11 (2003), 1934–1939.

### Papers on Gate-Level Designs of Arithmetic Functions



This appendix contains three papers relating to reversible logic design of arithmetic circuits.

- Paper A1: Thomsen, M.K., Glück, R.: Optimized Reversible Binary-Coded Decimal Adders. In: Journal of Systems Architecture, vol. 54, issue 7, pp. 697–706, 2008. © Elsevier B.V. 2008
- Paper A2: Thomsen, M.K., Axelsen, H.B.: Parallelization of Reversible Ripple-Carry Adders. In: Parallel Processing Letters 19(2), pp. 205–222, 2009.
  © 2009 World Scientific Publishing Company.

Preprint version.

Paper A3: Axelsen, H.B., Thomsen, M.K.: Garbage-Free Integer Multiplication with Constants. In: Glück, R., Yokoyama, T. (eds.) 4th Workshop on Reversible Computation (RC), Preliminary Proceedings, pp. 198–204, 2012.

#### Papers on Reversible Architectures

Β

This appendix contains two papers relating to design of reversible architectures.

- Paper B1: Thomsen, M.K., Glück, R., Axelsen, H.B.: Reversible arithmetic logic unit for quantum arithmetic. Journal of Physics A: Mathematical and Theoretical 43(38), 382002(10pp), 16 Aug 2010. © 2010 IOP Publishing Ltd.
- Paper B2: Thomsen, M.K., Axelsen, H.B., Glück, R.: A reversible processor architecture and its reversible logic design. In: De Vos, A., Wille, R. (eds.) Reversible Computation, RC2011, Revised Papers. LNCS, vol. 7165, pp. 30–42, 2012. © Springer-Verlag Berlin Heidelberg 2012.

## Papers on Implementation of Reversible Linear Transforms

# C

This appendix contains two papers relating to design of linear transforms in reversible logic.

Paper C1: De Vos, A., Burignat, S., Thomsen, M.K.: Reversible Implementation of a Discrete Integer Linear Transform. In: Journal of Multiple-Valued Logic and Soft Computing, Special Issue: Reversible Computation, vol. 18, issue 1, pp. 25–35, 2012. © Old City Publishing

Preprint version. This paper is the journal version of the paper from the 2nd Workshop on Reversible Computation, RC2010.

Paper C2: Burignat, S., Vermeirsch, K., De Vos, A., Thomsen, M.K.: Garbageless Reversible Implementation of Integer Linear Transformations. In: Glück, R., Yokoyama, T. (eds.) 4th Workshop on Reversible Computation, Preliminary Proceedings, pp. 198–204, 2012.

# Papers on Design Languages for Reversible Logic

# D

This appendix contains two papers on description languages for reversible logic.

Paper D1: Thomsen, M.K.: Describing and Optimising Reversible Logic using a Functional Language. In: Gill, A., Hage, J. (eds.) Implementation and Application of Functional Language, IFL2011. LNCS, 2012.
© Springer-Verlag Berlin Heidelberg 2012.

Preprint version.

Paper D2: Thomsen, M.K.: A Functional Language for Describing Reversible Logic. In: Forum on Specification & Design Languages, 2012 Close-to-preprint version.

### Papers on Engineering of Reversible Circuits

This appendix contains two papers on engineering of reversible circuits.

E

- Paper D1: Thomsen, M.K.: Design of Reversible Logic Circuits using Standard Cells / Standard Cells and Functional Programming. DIKU technical report. No. 2012-03, ISSN 0107-8283, 2012.
- Paper D2: Burignat, S., Thomsen, M.K., Klimczak, M., Olczak, M., De Vos, A.: Interfacing Reversible Pass-Transistor CMOS Chips with Conventional Restoring CMOS Circuits. In: De Vos, A., Wille, R. (eds.) Reversible Computation, RC2011, Revised Papers. LNCS, vol. 7165, pp. 112–122, 2012. © Springer-Verlag Berlin Heidelberg 2012.