• Open Access

Practical Quantum Error Mitigation for Near-Future Applications

Suguru Endo1, Simon C. Benjamin1, and Ying Li2,1,*

  • 1Department of Materials, University of Oxford, Oxford OX1 3PH, United Kingdom
  • 2Graduate School of China Academy of Engineering Physics, Beijing 100193, China

  • *ying.li.phys@gmail.com

Phys. Rev. X 8, 031027 – Published 26 July, 2018

DOI: https://doi.org/10.1103/PhysRevX.8.031027

Abstract

It is vital to minimize the impact of errors for near-future quantum devices that will lack the resources for full fault tolerance. Two quantum error mitigation (QEM) techniques have been introduced recently, namely, error extrapolation [Y. Li and S. C. Benjamin, Phys. Rev. X 7, 021050 (2017); K. Temme et al., Phys. Rev. Lett. 119, 180509 (2017)] and quasiprobability decomposition [K. Temme et al., Phys. Rev. Lett. 119, 180509 (2017)]. To enable practical implementation of these ideas, here we account for the inevitable imperfections in the experimentalist’s knowledge of the error model itself. We describe a protocol for systematically measuring the effect of errors so as to design efficient QEM circuits. We find that the effect of localized Markovian errors can be fully eliminated by inserting or replacing some gates with certain single-qubit Clifford gates and measurements. Finally, having introduced an exponential variant of the extrapolation method we contrast the QEM techniques using exact numerical simulation of up to 19 qubits in the context of a “swap” test circuit. Our optimized methods dramatically reduce the circuit’s output error without increasing the qubit count.

Physics Subject Headings (PhySH)

Popular Summary

Article Text

I. INTRODUCTION

Controlling noise in quantum systems is crucial for the development of practical technologies. Such noise can occur due to unwanted interactions of a passive qubit with the environment, or due to imperfections in the use of circuit elements that compose the algorithm (qubit initialization, gates, and measurement). In all cases the result is errors occurring at the level of physical qubits. The theory of quantum fault tolerance (QFT) reveals that the introduction of logical qubits, composed of numerous physical qubits, can allow one to detect and correct errors at the physical level; however, this capacity comes at an enormous multiplicative cost in resources. A recent estimate suggests that a Shor algorithm operating on a few thousand logical qubits would require several million physical qubits . While it is encouraging to know that such techniques exist, hardware on this scale is probably at least a decade away. The timely (indeed, urgent) question is, to what extent can we control the impact of errors in computing devices that are too small to support full QFT?

It may prove to be the case that deep quantum algorithms, such as Shor’s factoring algorithm and Grover’s search algorithm, cannot be successfully executed on classically intractable problems without the support of QFT. However, fortunately there are other algorithms of potential practical significance that focus on shallow circuits, with the output typically being fed into a classical supervising algorithm so as to form a hybrid system. Such approaches have been proposed for the simulation to aid discovery in chemistry and materials science; see Refs.  for examples. Hybrid systems may be capable of yielding significant results, surpassing conventional computers, even when finite error rates are present because of their error resilience . In order to achieve this it is desirable to suppress or mitigate errors to the greatest extent possible while keeping the qubit count ideally unchanged, or increasing only modestly compared to the high cost of full QFT.

Recently two techniques were introduced for quantum error mitigation (QEM) in generic hybrid quantum algorithms where the expected value of an observable—say, a -basis measurement of a given qubit—is the quantity of interest. The goal is to estimate the value that this observable would take given an error-free circuit, despite the reality that the real experimental system cannot perform operations with less than a certain error rate. Reference  introduced a hybrid algorithm simulating quantum dynamics, which featured an active error minimization technique involving extrapolation. The experimentalist would execute the circuit with all errors at their minimum achievable severity, obtain the expected value of the observable, and then repeat the exercise having deliberately increased the physical error rate (or having applied additional quantum gates to achieve the same effect). By noting the effect of the increased errors on the observable, the experimentalist would be able to make an extrapolated estimate of the zero-error value, presuming that the error sources had scaled proportionately. The technique was found to be very advantageous in the numerical simulations of few-qubit experiments presented in that paper (see, e.g., Fig. 5 in Ref. ).

A paper that appeared online at almost the same time was Ref.  by the IBM-based team of Temme, Bravyi, and Gambetta. That Letter presented a comprehensive analysis of the extrapolation technique, which the authors had independently conceived, and moreover, it introduced a second technique with using (what we will call) a “quasiprobability” formalism. The authors explained that by replacing operations in the quantum circuit and assigning parity to each operation following a certain probability distribution dependent on the noise, an experimentalist can obtain the unbiased estimator, at the cost of an increase in the variance. Their method was shown to be applicable to specific noise types, including homogeneous depolarizing errors and damping errors. The authors found both methods to be promising in few-qubit numerical simulations (see, e.g., Fig. 2 in Ref. ).

As exciting as these studies were, open questions remained to be answered before these two techniques could be considered to be fully practical. First, both techniques rely on the full knowledge of the error model, whereas an experimentalist will have imperfect knowledge and the real noise will generally differ from the canonical types considered in these first papers. Second, we need an explicit method to derive the QEM circuits, i.e., a specification of how to algorithmically increase the error rate in the error extrapolation or how to sample circuits in the quasiprobability decomposition. In this paper, we solve these two problems. We find that gate set tomography (GST) provides sufficient information to enable full elimination of the impact of localized Markovian errors. As with other process tomography protocols, GST cannot determine the exact physical error model due to noise associated with state preparation and measurement. However, we determine that preparation and measurement noise in GST is not harmful to the overall QEM approach. We also find that single-qubit Clifford gates and measurements are universal in computing expected values. Each quantum operation is a linear map, and single-qubit Clifford gates and measurements yield a complete set of linearly independent maps (quantum operations). Therefore, any error can be simulated or subtracted by decomposing the error using the complete operation set, which is the standard linear decomposition. We prove that, by combining GST and the complete set decomposition, any localized and Markovian errors in the quantum computer can be systemically mitigated, so that the error in the final computational output is only due to unbiased statistical fluctuation.

For the quasiprobability method, we provide an upper bound of the cost in QEM, and we describe the utility of “twirling” operations in minimizing this cost. For the extrapolation method, which is a relatively straightforward technique, our optimization is to observe that typically for the classes of noise most common in experiments it is appropriate to assume that the expected value of the observable will decay exponentially with the severity of the circuit noise. Adopting this underlying assumption, rather than a polynomial (e.g., linear) fit, proves to be quite advantageous.

Having thus optimized both the quasiprobability and the extrapolation techniques, we make a series of numerical simulations to study their efficacy. We opt for a specific circuit, a realization of the “swap test” that is often employed in quantum algorithms as a means for estimating the similarity of quantum states . Our swap test operates on qubits, and we simulate a total of 15 qubits over a comprehensive set of cases as well as 19 qubits for specific cases. We numerically simulate the actions of the experimentalist, who must perform many circuit trials in order to make a single estimate of the observable (we choose trials). But in order to evaluate our QEM techniques we must then repeat this entire process to determine the distribution of values that the experimentalist might obtain. We perform at least repetitions so that the distribution becomes clear; thus, at least individual numerical experiments are performed for each of the curves that we presently report.

II. ERROR MITIGATION

In this paper, we focus on computing the expected value of an observable in a state (the final state of a quantum circuit) using a quantum computer. It is typical of a number of quantum algorithms and subroutines that the desired output is the expected value of a qubit or qubits—the swap test itself, which is a component of algorithms including the recently introduced autoencoder , and several proposed hybrid algorithms for simulating chemical or materials systems .

Without using QEM as shown in Fig. , the quantum circuit is repeated for many times, and the measurement outcome of each time is collected. Then, we can calculate the average as our best estimate of the expected value. Given that the number of repetitions is finite, the value of is a random variable with an associated distribution. Because the implementation of the quantum circuit is imperfect, it is likely that the distribution of is not even centered at the ideal value, i.e., the exact expected value when the quantum circuit is perfectly implemented without error.

FIG. 1.

Quantum computing of the expected value of an observable (a) without quantum error mitigation (QEM) and (b) with QEM. In QEM circuit (b), each operation (including the memory operation) in the original circuit (a) is replaced by an operation depending on the corresponding random numbers [see Fig. ].

When we use QEM as shown in Fig. , instead of the original quantum circuit, we implement a set of modified circuits. The scheme depicted in the figure is relevant to the quasiprobability method for QEM, but can also apply to the extrapolation method as a means to deliberately boost errors. Each modified circuit is determined by a set of random numbers . The distribution of random numbers, i.e., modified circuits, depends on the error model, which is measured using GST before the quantum computing. In each run of the quantum experiment, firstly the random number set is generated, then depending on a specific circuit is implemented, and finally the measurement outcome is collected. Rather than calculating the average , we use both and to calculate the average of an effective outcome , which is given explicitly later. If QEM is successful, the distribution of is centered at the ideal value, but the distribution is wider than because of the factor , which is greater than 1 and can be efficiently computed. Thus, only error due to the statistical fluctuation remains, although it is amplified. By repeating the quantum experiment enough times, we can obtain an accurate computing result of the expected value.

In Sec. , we explicitly give the effective outcome and the factor . Modified circuits and their distribution are given in Sec. .

III. NOTATION FOR STATES, OPERATORS, AND OPERATIONS

We use the notation commonly used in quantum tomography (e.g., in Refs. ).

In quantum theory, a quantum state is usually represented by a density matrix , and an observable is represented by a Hermitian operator . The expected value of the observable quantity in the state is . An operation is a map on the space of states, , expressed in the Kraus form.

Because an operation is a linear map, we can always express the operation as a square matrix, e.g., using the Pauli transfer matrix representation, acting on the state expressed as a column vector . Similarly, an observable can be expressed as a row vector , and the expected value is . Throughout this paper, we use the Pauli transfer matrix representation; see Appendix  for details. In quantum tomography, usually we focus on observables that are positive-operator valued measure (POVM) operators, which is not necessary here.

In the Pauli transfer matrix representation, vectors representing states or observables and matrices representing operations are all real. For qubits, vectors and matrices are -dimensional. The expected value of the observable in the state going through a sequence of operations reads as follows:

IV. QUANTUM COMPUTING BY SAMPLING CIRCUITS

We suppose that the initial state is , which goes through a sequence of operations , and in the final state the observable is measured. Each time the experimentalist implements this circuit, the measurement returns an eigenvalue of , and the probability distribution of eigenstates is determined by the final state. By repeating such a circuit for many times, we can estimate the expected value , where , and is the measurement outcome. Generally in this paper we use the superscript 0 to denote the ideal noise-free realization of a state, operation, or observable quantity.

In the case that the quantum computation has errors, the actual initial state is , actual operations are , and the actually measured observable is . As a result, the estimation of the expected value converges to rather than . Here, , and we have assumed that errors are Markovian.

The central idea introduced by the IBM team in Ref.  is that one can exactly compensate for the effect of errors by sampling from a set of (real, error-burdened) circuits, each labeled for , provided that their outputs satisfy

Reference  describes how the real numbers , which represent quasiprobabilities, can be efficiently derived given specific error models, assuming that the experimentalist has full knowledge of the model. Note that each denotes the total operation composed by a sequence of operations in the th circuit.

We can use the Monte Carlo method to compute . We note that , where is the measurement outcome in the th circuit. Then . To compute , we randomly choose a circuit to implement, and the th circuit is chosen with the probability , where . Then, the computing result is given by the expected value of effective measurement outcomes, i.e., , where the effective outcome is if the th circuit is chosen to be implemented, and is the outcome directly obtained in the th circuit.

V. PER-OPERATION ERROR CORRECTION

We can correct errors in each operation using the quasiprobability method, which is the primary focus for the following several sections. We suppose that we have a set of initial states satisfying , and a set of operations satisfying for each error-free operation , and a set of observables satisfying . Then, computing with error mitigation can be expressed as

(1)

When we sample circuits to compute , the initial state is with probability , the th operation is with probability , and the observable is with probability . Here, , and accordingly. To calculate , we use .

VI. VARIANCE AMPLIFICATION IN QUASI-PROBABILITY DECOMPOSITION

The presence of quasiprobabilities taking negative values amplifies the variance of the expected value of the observable. We consider the case that is a Pauli operator (maybe with error) and the measurement reports two kinds of outcomes denoted by , respectively. In this case, the distribution is binomial. The standard deviation of the average of outcomes in the Monte Carlo calculation is . Here, is the total number of samples; i.e., the total number of circuits of all kinds which the experimentalist performs is . We compare this to the error-free computing; i.e., the ideal original circuit is repeated for times to estimate . For the error-free computing, the standard deviation is given by . Therefore, to achieve the same accuracy, i.e., , the error-mitigated computation needs times more samples than the error-free computation. Here, we have used the fact that the error-mitigated computation and the error-free computation should converge to the same value of ; i.e., .

In order to limit the standard deviation to be , we can choose . Therefore, if the factor is larger, the computing takes longer.

Because if errors are corrected for each operation, we call the cost for mitigating error in the corresponding operation. The overall cost therefore increases with the number of operations; thus it is important to reduce the operation number. For example, in a quantum computer with qubits fully connected , operations for communication are not required, which may significantly reduce the cost.

VII. UNIVERSAL OPERATION SET

The set of operations including measurement and single-qubit Clifford gates is universal in computing expected values of observables. The relevant measurement operation reads , which projects a qubit to the state . Here, denotes a superoperator. Such a nondestructive measurement can be realized using a destructive measurement followed by initializing the qubit in the state . Single-qubit Clifford gates include the Hadamard gate , the phase gate , and all other single-qubit Clifford gates can be derived from these two.

The measurement superoperator [ ] also means postselection; i.e., if the outcome of the measurement corresponding to [ ] (which is not the final measurement on the observable ) is in a trial, the value of the observable is noted as , but the trial is counted in the total number of samples in the Monte Carlo calculation. If has two values , we can estimate the value of by calculating . Here, we have supposed that the circuit is implemented for total times; for times, the circuit does not pass postselections (i.e., ), and for times, the circuit passes all postselections and reports (i.e., ). It is the same when we compute using the Monte Carlo method. If the effect outcome is with the probability , then the standard deviation of the Monte Carlo calculation becomes .

In Table , we list 16 linearly independent operations that can be derived from the minimum universal operation set . In the following, we use to denote these 16 operations. Because they are linearly independent, any single-qubit operation , which is a real matrix, can be expressed as a linear combination of 16 basis operations; i.e., . Similarly, multiqubit operations can be expressed as a linear combination of tensor products of basis operations. Using the quasiprobability method, any computation of expected values of observables can be realized using this operation set.

TABLE I.

Sixteen basis operations. Gates and can be derived from [ ] and [ ], and other operations can be derived from [ ], , and .

1 (no operation)
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

Note that these basis operations are universal, as one can verify by constructing a non-Clifford gate or an entangling gate: We can decompose the gate using our basis operations as (see Appendix  for controlled-not as a second example). However, this construction would not be used in practice—it is not an efficient means to actually implement a desired in the basic circuit, since the corresponding cost would imply an unacceptably steep exponential in the time overhead, as one would expect from, e.g., Refs. . Instead we rely on the assumption that the experimental system can directly implement a universal set of gates (including entangling and non-Clifford gates) with a reasonably high fidelity. Then, rather than fully synthesizing any of the basic gates using our basis, we need only compensate for slight imperfections. The cost for doing so, for each imperfect gate, is then , as we presently discuss.

Having obtained the complete operation set in Table  we can use it in deriving the protocol that will compensate for errors. In this paper, we focus on the case that errors are localized: An (error-free) operation that is applied on a set of qubits is a -dimensional real matrix, then the corresponding operation in real (i.e., error-burdened) is also a -dimensional real matrix acting on the same set of qubits. The overall operation on the entire system can be expressed as , where is the identity acting on all other qubits. It is similar for the initialization and measurement. If each qubit is initialized individually, the overall initial state is , where is a two-dimensional real vector representing the th qubit’s initial state. Similarly, individual measurement of qubits implies that the overall measured observable is , where is a two-dimensional real vector representing the measured observable for the th qubit. In this case, a single-qubit operation with error can still be expressed using a real matrix. We suppose that for a qubit, 16 basis operations with errors are , which are all real matrices. When errors are not significant, these 16 bases should still be linearly independent; i.e., the set of basis operations with errors is still universal.

To make this statement more precise, we consider the real matrix

(2)

Here, denotes the th column of the matrix of the basis operation . Sixteen basis operations are linearly independent if the matrix is invertible. We use as the measure of the error severity in basis operations. When , is always invertible (see Appendix ). We remark that even if exceeds the threshold, basis operations are still likely to be linearly independent.

VIII. ERROR MITIGATION USING BASIS OPERATIONS

Given an operation with error , we can use 16 basis operations to correct the error, i.e., realize the operation without error . There are two ways for correcting the error.

Compensation method.—The operation is close to . Therefore, we can keep the correct component of and only decompose the error component using basis operations. We decompose the operation without error as , where is an arbitrary real number. If basis operations are linearly independent, the decomposition always exists, and there is only one solution of coefficients when is determined.

Inverse method.—If the matrix is invertible, we can express as followed by a noise operation, i.e., , where the noise operation . In order to correct the error, we can decompose the inverse of the noise as . By applying the inverse of the noise after the operation , we can realize the operation without error, i.e., . Similar to the compensation method, if basis operations are linearly independent, the decomposition always exists, and there is only one solution of coefficients . However, the inverse method can be applied only if the matrix is invertible.

For multiqubit operations, the decomposition is performed using tensor products of basis operations, as described explicitly in Appendix . Although basis operations are not entangling, we can use basis operations to efficiently mitigate multiqubit errors and errors that can entangle qubits. As an example, we show how to decompose the controlled-not gate only using basis operations in Appendix , which suffices to imply that any error in the form of the controlled-not gate can be mitigated using basis operations.

Initialization and measurement errors can also be corrected using basis operations. Taking first the case of initialization errors: If is the error-burdened initial state, and it is a nonzero vector, we can always find a transformation that satisfies , where is the error-free initial state. Thus, by decomposing using basis operations and applying it after the initialization, we can prepare the initial state without error. Actually, given an initial state that is close to , we can generate a complete set of linearly independent vectors using basis operations. With these vectors, we can decompose the initial state without error as .

A similar approach yields the corresponding result for measurement: For an observable there will be some where is the error-free quantity. If an observable is close to , then a linearly independent set can be generated; then the error-free observable .

Circuits for QEM are shown in Fig. . Given quasiprobabilities, we can compute the corresponding probability in sampling circuits as shown in Sec. . More details of QEM using basis operations are given in Appendix .

FIG. 2.

(a) Error mitigation circuits. The choice of a basis operation is determined by the corresponding random number , , or . Original gate that is identity (memory operation) also has to be error mitigated, unless memory error is negligible. In the compensation method, either the original gate or basis operations are applied depending on the random number. (b) The schematic of the linear extrapolation (orange curve) and exponential extrapolation (green curve).

Using the same technique, we can also increase the error in an operation, as required by the alternative error extrapolation method for QEM. Instead of decomposing the error-free operation using and basis operations, we can also decompose the error-boosted operation ( ) using and basis operations. It is similar for initial states and observables. We have noted that in the decomposition of an error-free operation, there are always some negative quasiprobabilities, i.e., the factor is greater than 1, which leads to greater time costs. But fortunately, when we merely wish to decompose an error-boosted operation we can do so without introducing negative quasiprobability, e.g., by boosting Pauli errors using Pauli gates .

IX. QUANTUM GATE SET TOMOGRAPHY

We can measure a set of initial states , observables , and operations (including basis operations) using GST . These vectors and matrices with the bar notation describe the actual physical system. Because there are errors in both initial states and observables, and initialization and measurement errors cannot be distinguished, we may not obtain exactly these vectors and matrices describing the actual physical system. Instead, the vectors and matrices obtained using GST are , , and , which are estimations of , , and , respectively.

If we know , , and because the physical system is well understood, we can directly use them in QEM. If our knowledge about the physical system is not enough, we can use GST to obtain , , and . We show that, although the estimations may not be exact, we can exactly correct errors by using these estimations in QEM.

Using the protocol in Refs.  (also see Appendix ), the estimation of an operation and the actual physical operation are similar matrices, i.e., , where is a matrix determined by initial states (i.e., ), and is an arbitrary invertible matrix. We note that and are independent of the operation , and cannot be determined by GST. By choosing , we can obtain different estimations of the operation set. Similarly, and .

All operations are transformed by the same similarity transformation, and initial states and observables are also transformed accordingly. As a result, these estimations obtained by GST can exactly predict the expected value of an observable; i.e., . Therefore, we can directly use these estimations in QEM, and the similarity transformation does not lead to any computing error.

Using GST estimations in QEM, the actual operations realized in this way differ from operations without error, but the computing result is correct. To correctly obtain , we decompose the initial state, the observable, and the operation using , , and , respectively. Here, , , , and GST estimations are all known to us. The decompositions are , , and . Accordingly, we actually realize , , and in the physical system. We have , , and . Therefore, the physical system gives the computing result , i.e., the desired error-free output. The cost of this adaption lies in the potential increase to the number of samples required, as shown in Fig.  and discussed in the caption.

FIG. 3.

Cost ( ) for correcting errors. We consider a universal set of operations, including the initialization, measurement, single-qubit Clifford gates, a single-qubit non-Clifford gate, and a two-qubit entangling Clifford gate. Sixteen basis operations of each qubit can be generated using these operations. Every operation in the set has error, and the memory error is also included. We assume that qualities of the initialization and single-qubit gates are 10 times better than the measurement and two-qubit gates, and also that the quality of the memory operation is 100 times better. Details of the model are given in Appendix . The cost for correcting error in each operation in the universal set is calculated, and the maximum cost over all operations is plotted. (a) For the depolarizing error model, the cost is lower if we directly use actual operations to correct errors, and the cost is higher if we use gate set tomography (GST) estimations to correct errors. (b) For the overrotation model, the cost is higher without using the Pauli twirling, and the cost is lower when the Pauli twirling is used. In (a) and (b), solid curves correspond to the compensation method (with the optimized ), and dashed curves correspond to the inverse method. (c) The cost as a function of the distance between operations with errors and operations without error. For the operation with error and the operation without error , the distance is . The axis illustrates the maximum distance over all operations in the universal set. In (b) and (c), we always use GST estimations. In (c), Pauli twirling and the inverse method are used for all the data. Pauli twirling is applied to the measurement and two-qubit gate, and the inverse method is applied only to the two-qubit gate, while errors in other operations are corrected using the compensation method. We remark that usually the maximum distance and the maximum cost are given by the two-qubit gate.

We remark that, when errors in actual operations are small, errors in estimations of operations are also small. If we take a proper strategy for choosing , and errors in initial states and observables are small, the estimation of an operation is close to the operation without error when the actual operation is close to . It is similar for estimations of initial states and observables. See Appendix  for details.

X. ESTIMATION OF THE COST

In general, when the error in an operation is more significant, there is a higher cost for mitigating the error (to a given level of suppression). We take as the measure of the error severity in the operation, where ( ) is the -qubit operation with (without) error. An upper bound of the cost for correcting error in is

(3)

Here, is the maximum error in all basis operations for all qubits, and . Similar upper bounds can be obtained for correcting errors in initial states and observables. See Appendix  for details.

There are several ways for reducing the cost. The upper bound of the cost is obtained using the compensation method and taking . In general, we can optimize the value of or use the inverse method to minimize the cost. For example, for the depolarizing error model (see Appendix ), the cost of using the inverse method is lower than using the compensation method [see Fig. ]. We remark that, to obtain data for the compensation method in Fig. , we have optimized the value of . If we use estimations obtained from GST to correct errors, we can optimize the matrix to minimize the cost. In Fig. , we can find that, without optimizing matrices, the cost using estimations obtained from GST is higher than using actual operations. If we choose the matrix in the form , where is a four-dimensional real matrix corresponding to the th qubit, there are total parameters to be optimized for a -qubit quantum computer, which is a nontrivial task. Under some reasonable conditions, we can also use the Pauli twirling to reduce the cost.

A. Pauli twirling

In many quantum computing systems, e.g., superconducting qubits and ion traps , the fidelity of single-qubit gates is much better than the fidelity of two-qubit gates, and usually a state can be initialized with a high fidelity while the fidelity of measurement is worse. In this section, we focus on the case that error rates of initialization and single-qubit gates are much lower than error rates of two-qubit gates and measurement.

If the error rate of initialization is low (much lower than the error rate of measurement), we know how to choose so that the estimation of an operation obtained from GST is close to the actual operation. We cannot exactly estimate operations using GST, because we cannot distinguish initialization and measurement errors. If we treat all errors in the initialization and measurement as measurement error (which corresponds to in Appendix ), the difference between the estimation and the actual operation is determined only by the initialization error. Therefore, if the initialization is high fidelity, the estimation obtained in this way and the actual operation are close.

Because the set of basis operations includes Pauli gates, it is easy to use basis operations to correct Pauli errors. By using the Pauli twirling, we can convert the error in a two-qubit entangling Clifford gate to Pauli error , which is achieved by applying Pauli gates before and after the two-qubit gate. This treatment of the error is feasible only if the fidelity of Pauli gates is much better than the two-qubit gate, otherwise Pauli gates cause significant new errors, which may not be Pauli error, on the two-qubit gate. In Fig. , we can find that the cost can be significantly reduced by using the Pauli twirling for the overrotation error model (see Appendix ).

In Fig. , costs of different error models are compared, including the depolarizing model, pure-dephasing model, amplitude-damping model, and the overrotation model. We also randomly generated many other error models; see Appendix  for details of these error models. For a random-operation model, we randomly generate an operation close to the ideal error-free operation, and we find that the cost is approximately the cost of the depolarizing model. For a random-field model, we randomly generate a Hamiltonian that drives the erroneous evolution, and the cost is between the depolarizing model and overrotation model.

From Fig.  we see that the cost of quantum error mitigation varies according to the error model but is generally upper bounded by the case of depolarizing noise, over the range of noise levels shown here. (Note that other models can exceed the cost of the depolarising model if we use even lower fidelity gates.) For the depolarizing model, the cost for mitigating error in a two-qubit entangling gate is , where is the error rate and the factor is between 2 and 3 [see Fig. ]. If errors in initialization and single-qubit gates are negligible, or if the matrix is optimized to minimize the cost, the factor can approach 2. Accepting the depolarizing model as an approximate upper bound, we can estimate the overall cost in a quantum algorithm. Suppose the total number of gates in a quantum algorithm is , the overall amplification of the standard deviation (uncertainty of the computing result) is . Therefore, times more repetitions of the experiment are required in order to reduce the standard deviation. We are interested in the case that is large but is small; therefore, . As a rule of thumb we might take as a limit for acceptable scenarios, since then . However, larger overhead factors may be acceptable depending on the speed of the quantum computer.

XI. NUMERICAL SIMULATION

In our numerical simulation, we apply QEM to the swap-test circuit shown in Fig. , in which we realize each controlled-swap gate using Toffoli gates and realize each Toffoli gate using gates, gates, Hadamard gates, and controlled-not gates . We note with interest that very recently the implementation of a swap test using shallow circuit has been proposed . However, for present purposes, it is not essential to use an optimized realization of the swap circuit; its role is simply to act as a real test case for our technique, and indeed the considerable depth of our nonoptimal circuit is helpful here. The number of gates scales as , where is the number of qubits (e.g., in Fig. ). Without error, the expected value of the observable ( of the probe qubit) in the swap-test circuit in Fig.  is 0.5.

FIG. 4.

swap-test circuit. The first qubit (denoted black) is a probe qubit, and the expected value of gives the overlap between states of two groups of qubits (denoted green and orange, respectively). Green qubits are prepared in the Greenberger-Horne-Zeilinger (GHZ) state , and orange qubits are prepared in . Therefore, the ideal expected value of is 0.5.

We consider error models according to which the same noise is applied after the initialization to the state , before the measurement, and before and after each gate. For the controlled-not gate, the noise applied is on two qubits. We remark that basis operations are also affected by noise likewise. We consider two types of noise: inhomogeneous Pauli error and leakage error, which can be, respectively, described as

and

where is the probability of the error , and is the probability of the leakage error from the state . It is worth mentioning that the leakage error is a non-trace-preserving error. In our simulations, we set , , and . Thus, in both models the total error rate is 0.08% for initialization and measurement, 0.16% for single-qubit gates, and 0.32% for two-qubit gates, which is achievable with two-qubit gates in ion traps and can be far surpassed for one-qubit gates . Moreover, with these numbers the expected total number of error events in circuits of the depth and breadth that we consider here is approximately unity; this is a challenging domain for error mitigation.

In addition to quasiprobability decomposition (see Appendix  for an instruction of the implementation), we also study the extrapolation technique introduced in Ref. . The expected value of obtained by running the swap-test circuit in a quantum computer with noise depends on the error rate; i.e., it is a function that can be denoted as , where is the overall error rate. For our first set of numerical experiments we consider linear extrapolation to the error-free value as follows: We obtain the expected value with the lowest attainable error rate , and by increasing error rate to with , we obtain another expected value . Using these two values, we can infer , as shown in Fig. , which is the final estimation of . Here, we set .

The first set of numerical results is shown in Fig. . We assume that the experimentalist makes their overall estimate of the after they perform individual experiments. We take this number of runs as a fixed constraint (effectively, we are constraining their overall time resource), and they may choose to employ those runs using one of three alternative approaches: no error correction, linear extrapolation, and quasiprobability decomposition (using basis operations and incorporating GST). In each experiment the swap-test circuit or its variant for the purpose of QEM is implemented. Because of the finite number of samples, the estimation is stochastic. Therefore, in our numerical simulation we perform the appropriate series of experiments, mirroring the actions of the experimentalist, and then we repeat times in order to determine the distribution of final estimations that may be obtained. The distribution for each case is plotted in Fig. .

FIG. 5.

Histograms of the estimation of using quantum computers with inhomogeneous Pauli error and leakage error. For the inhomogeneous Pauli error model, the swap-test circuit involving 19 qubits is simulated: one qubit is the probe qubit, and each group has 9 qubits. For the leakage error model, the swap-test circuit involving fewer qubits (15 qubits) is simulated, because in the numerical simulation we need to use an additional qubit to introduce the leakage process. The ideal value of is marked by the red arrow.

We can observe that both QEM approaches can improve the result; i.e., the corresponding distributions are shifted closer to the ideal value 0.5 compared to the approach without QEM. For the inhomogeneous Pauli error model, the means of distributions are at 0.1961, 0.3415, and 0.5011 for the three approaches, respectively. The distribution of the quasiprobability approach is centered at the ideal value, which clearly shows its desirable property of completely removing any systematic bias. However, the distribution is wider (as we expected) compared to the other two approaches. A fairer metric would be the expected absolute error versus ideal value (i.e., ). Given an ideal error-free computer and trials, this metric would evaluate to 0.006910. Using the error-prone computer with our three protocols the three corresponding values are 0.3039, 0.1853, and 0.0491. Similarly, for the leakage error model, the means for three approaches now lie at 0.3819, 0.4710, and 0.5007, while the expected absolute error evaluates to 0.1181, 0.0294, and 0.0434.

From these results it may appear that (given a large but reasonable number of samples) the quasiprobability technique outperforms the extrapolation method, with the latter unable to approach the mean of the error-free circuit. However, here the extrapolation method was limited to linear interpolation whereas the physical error rates are high enough that the linear assumption is poor. One could fit a higher-order polynomial using more data points (here, we have only used two: one derived from the actual lowest possible error rate and one boosted to twice the error rate); however, since we are limiting the total number of experimental runs to this would lead to greater noise in each data point. Moreover, as we now argue, the underlying trend is likely to be well approximated by an exponential decay rather than a polynomial one (i.e., the expected value of the observable falls exponentially with the physical error rate) and two data points will suffice to estimate the zero-error observable under that assumption.

In Fig. , we show the results when the experimentalist indeed assumes that the expected value changes exponentially with respect to the error rate and converges to 0 in the limit of . Then they will infer the error-free value as

Here we take .

FIG. 6.

Comparison of optimized quantum error mitigation techniques. The green outlines correspond to the quasiprobability technique while solid histograms correspond to the extrapolation technique using a presumption of an underlying linear (blue) or exponential (red) dependence. For the inhomogeneous Pauli error model, the swap-test circuit involving 19 qubits is simulated. For the leakage error model, the swap-test circuit involving fewer qubits (15 qubits) is simulated. The horizontal axis is the estimate of that an experimentalist who performs experiments will obtain. Ideally the circuit produces . Panel (a) corresponds to physical errors of the inhomogeneous Pauli type, while panel (b) corresponds to physical leakage errors. Note that the horizontal scale differs between the two panels; a gray bar showing the scale from 0.4 to 0.6 appears in both panels to facilitate comparison. For either type of noise, it is clear that exponential extrapolation mitigates noise more than linear extrapolation.

As shown in Fig. , the distribution of the final result using the exponential extrapolation approaches the ideal value of (which is 0.5 for the swap-test circuit) much better than the linear extrapolation. Given the same experimental runs, the mean of the experimentalist’s estimate is now 0.5111 for the inhomogeneous Pauli error model and 0.4986 for the leakage error model. These numbers almost rival those of the quasiprobability technique but do so with a smaller variance. The expected absolute error for inhomogeneous Pauli error and leakage error are 0.065 01 and 0.018 82, respectively. For the latter, the expected absolute error comes within a factor of 3 of the shot-noise limit that would be achieved by error-free ideal hardware (0.00691). This is despite the fact that our error-burdened circuits have error rates corresponding to at least one error event per circuit. We emphasize that this suppression results purely from the QEM protocol; i.e., it is achieved at no cost in terms of the qubit count or the total number of runs (constrained to ).

Because of the limited power of the classical computer we utilized, our exact numerical simulations did not involve more than 19 qubits. However, it is of course very interesting to assess the relevance of our techniques to quantum computing using over 50 qubits, which is in the so-called “quantum supremacy” regime. Therefore, we estimate the cost of quantum error mitigation in the swap-test circuit, using the same error models in our numerical simulation and error rates achievable in ion trap experiments ; i.e., the error rate of the two-qubit gate is 0.1% and error rates of single-qubit operations are 0.01%. Take for example the swap test with qubits (the number of gates is 1152). For the inhomogeneous Pauli error model, the overall cost is , which implies that we can attain the same computing precision as the ideal case if we have times more repetitions of the experiment, which is experimentally feasible. For the leakage error model, the cost for the 51-qubit swap test is , which means times more repetitions. A plot showing how the cost scales versus qubit count is shown in Fig. .

We also evaluate for a fully paralleled circuit, whose circuit depth is , and each layer has single-qubit gates and controlled-not gates, which means the quantum circuit has single-qubit gates and controlled-not gates. As single-qubit gates, we use the gate, gate, and Hadamard gate, because these gates plus a controlled-not gate constitute a universal gate set, and we equally assign the number of qubits to these three types of single-qubit gates. We plot versus the number of qubits for the swap-test circuit and the fully paralleled circuit in Fig. . We observe that for the swap-test circuit, it is feasible to venture into the supremacy regime with today’s best fidelities; for the more demanding case of full parallelism (so that the gate count scales as ), we see that today’s error rates would not suffice much beyond 50 qubits, but that error rates 10 times lower would easily suffice for 80 qubits and beyond.

FIG. 7.

The graphs show the cost of matching the performance of an ideal noiseless circuit with a noisy circuit, using the quasiprobability method. The vertical axis ( ) is a multiplicative factor indicating how many more repetitions of the circuit execution are requited. In each graph the upper pair of lines correspond to error rates achievable in ion trap experiments ; i.e., the error rate of two-qubit gate is 0.1% and error rates of single-qubit operations are 0.01%. The lower pair of lines indicate the result of reducing these error rates by a factor of 10. Panel (a) corresponds to the swap-test circuit. Panel (b) corresponds to a circuit where every qubit is actively gated in every time step, and the number of steps equals to the number of qubits.

XII. INTUITION FOR EXPONENTIAL EXTRAPOLATION

Intuitively, the explanation for the success of the exponential extrapolation is as follows. We express the th noise event occurring in the quantum circuit as

(4)

where is the error component. The only assumption is that the error component only weakly depends on the error rate (see Appendix ). Now, for simplification we ignore the computing operations, which do not affect our general argument. The total noise that the entire quantum circuit experiences is

(5)

Here, is the total number of the noise-burdened operations. Expanding the overall noise, we get

(6)

where

(7)

Note that the coefficient of in the overall noise corresponds to a binomial distribution, which can be approximated by the Poisson distribution. We have

(8)

We can find that the impact of the overall noise on the expected value of some observable is proportional to , which implies that exponential extrapolation works better than linear extrapolation.

XIII. CONCLUSIONS

We demonstrate that, following our protocol step by step, an experimentalist can derive an algorithm to run on a noisy quantum computer so as to estimate an output observable with zero bias versus the ideal observable. The experimentalist does not require any prior knowledge of the physical property of the noise, and the only condition is that the noise is localized and Markovian. For this purpose, we show that quantum gate set tomography is a perfect tool for measuring the noise in a quantum computer, if the aim is only to compensate the effect of the noise in quantum computing, and we also show that single-qubit Clifford gates and measurement can derive a complete set of operations that can compensate any noise in quantum computing.

The price of using such a systematic method to negate computing errors is that the quantum computation needs to run for a longer time than an error-free system. We verify the protocol with numerical simulations of up to 19 qubits, in which an alternative method, i.e., exponential error extrapolation, is introduced and studied. We find that the estimation using exponential error extrapolation is also very accurate, while the computing time could be shorter. An approach combining two methods may optimize both accuracy and efficiency.

In Appendix , we describe in detail the steps that an experimentalist would take in order to realize the quasiprobability method. We hope that this compact summary, presented in a single section, will indeed be useful to researchers who are interested in demonstrating the QEM technique with their hardware.

Our general conclusion is that these quantum error mitigation techniques can dramatically enhance the performance of quantum computers, especially at the small-to-medium scale where full code-based quantum error correction is impossible. Our simulations consider circuits up to 19 qubits, but with error rates considerably worse than the state of the art. Extrapolating from the trends that we observe in these smaller systems, we anticipate that hybrid algorithms involving qubits, i.e., beyond the reach of classical emulation, will benefit from QEM techniques if the hardware fidelity matches today’s state-of-the-art error or modestly improves upon it.

This work was supported by the EPSRC National Quantum Technology Hub in Networked Quantum Information Technologies. S. E. is supported by Japan Student Services Organization (JASSO) Student Exchange Support Program (Graduate Scholarship for Degree Seeking Students). Y. L. is also supported by NSAF (Grant No. U1730449). Numerical simulations were performed using QuEST, the Quantum Exact Simulation Toolkit . The authors would like to acknowledge the use of the University of Oxford Advanced Research Computing (ARC) facility in carrying out this work .

APPENDIX A: PAULI TRANSFER MATRIX

A state can be expressed as a real column vector

(A1)

where the vector element is

(A2)

is a Pauli operator, and is the dimension of the Hilbert space. Similarly, an observable (i.e., Hermitian operator) can be expressed as a real row vector,

(A3)

where the vector element is

(A4)

Here, we use notations and to denote real row and column vectors, respectively. A physical operation [i.e., ] can be expressed as a real square matrix,

(A5)

where are Pauli operators. If , we have .

APPENDIX B: DECOMPOSITION OF CONTROLLED-not GATE USING BASIS OPERATIONS

The controlled-not gate reads

(B1)

The controlled-not gate can be decomposed as

(B2)

Then, the corresponding cost is given by .

APPENDIX C: ERROR THRESHOLD OF BASIS OPERATIONS

For two real matrices and and a nonzero real vector , we have

(C1)

where is the minimum singular value of . We also have

(C2)

Therefore,

(C3)

If , is always positive (nonzero); i.e., is invertible.

Now, is the matrix formed by basis operations with error as defined in Eq. , and is the matrix formed by basis operations without error. Because , is invertible; i.e., basis operations without error are linearly independent. The minimum singular value is . Because , the matrix is invertible if .

APPENDIX D: DECOMPOSITION USING BASIS OPERATIONS

We consider the -qubit operation . For each qubit, there is a set of basis operations , where is the label of the qubit. For each set of basis operations, there is a matrix as defined in Eq. . We use to denote the matrix of the th qubit.

The operation is decomposed as

(D1)

Coefficients form a -dimensional vector:

(D2)

Therefore, the decomposition is given by , where is a -dimensional vector corresponding to .

We choose the order of Pauli operators, i.e., the order of bases of Pauli transfer matrices , as , , , and (which are also denoted as , , , and , respectively). Then, to be consistent with , we have

(D3)

Here, ( , , , ) is a Pauli operator of the th qubit.

The state of a qubit is represented by a four-dimensional real vector. To decompose the initial state of a qubit without error , we need four linearly independent initial states. If the qubit can be initialized in the state , we can choose the set of four states as . These four states can be obtained by applying basis-adjusting operations (Clifford gates) on the initial state . Because of the error in the state and errors in basis-adjusting operations, the prepared four states are not exactly states . When the overall error is small, states are still linearly independent. We introduce the matrix , and is the matrix corresponding to . States are linearly independent if is invertible. Similar to the analysis of the linear independence of basis operations (i.e., the invertibility of the matrix ; see Appendix ), we have that is always invertible if . The initial state without error is decomposed as . Coefficients form a four-dimensional column vector . The decomposition is given by .

Similarly, an observable of a qubit is also represented by a four-dimensional real vector. To decompose the observable of a qubit without error , we need four linearly independent observables. If can be measured, we can choose the set of four observables as Pauli operators . The operator denotes a trivial measurement; i.e., the outcome is always . Measurements of the other three Pauli operators can be obtained by applying basis-adjusting operations (Clifford gates) before the measurement of . Because of the error in the measurement of and errors in basis-adjusting operations, the measured observables are not exactly . When the overall error is small, observables are still linearly independent. We introduce the matrix , and is the matrix corresponding to . Observables are linearly independent if is invertible. We have that is always invertible if . The initial state without error is decomposed as . Coefficients form a four-dimensional row vector . The decomposition is given by .

APPENDIX E: QUANTUM GATE SET TOMOGRAPHY

To measure a set of operations on qubits using GST, we need to choose a set of linearly independent initial states and a set of linearly independent observables . Given these initial states and observables, we measure expected values

(E1)

Here, is one of operations .

The matrix is equivalent to up to a transformation. Because and (the sum is taken over all Pauli operators), we have

(E2)

where and are matrices defined as and . We remark that the initialization error and measurement error are included in and , respectively. We cannot measure matrices and independently; therefore, we cannot determine using GST. By taking as the identity operation (i.e., ) in Eq. , we can measure

(E3)

The estimation of is given by

(E4)

Here, and are obtained by measuring the expected values of observables, and is an arbitrary invertible matrix. If and are different, is different from , but they are always similar matrices. The estimations of states and observables are given by

(E5)
(E6)

Here, ( ) denotes the th column ( th row) of the matrix .

We introduce matrices and defined as and , respectively. Then and .

For a sequence of operations , because and are similar matrices up to the same transformation independent of the operation (i.e., the index ), we have

(E7)

Therefore, although estimations may be different from their correspondences , they can always provide the correct prediction for the expected value of an observable in an initial state going through a sequence of operations.

APPENDIX F: STABILITY OF THE QUANTUM GATE SET TOMOGRAPHY

We define

(F1)
(F2)
(F3)

which describe severities of the initialization error, measurement error, and operation error, respectively. Here, and are matrices corresponding to the th qubit. The overall matrices of qubits are and .

Similar to the analysis of the linear independence of basis operations (i.e., the invertibility of the matrix ; see Appendix ), we have that and are always invertible, i.e., is always invertible, if and . Choosing and as in Appendix , we have and .

We choose , then and , where and are matrices corresponding to the th qubit. The severity of errors in estimations of initial states is

(F4)

and the severity of errors in estimations of observables is

(F5)

Here, we have used that

(F6)

Choosing and as in Appendix , we have , , and .

The severity of the error in the estimation of an -qubit operation is

(F7)

as we show next. Here,

(F8)

For an invertible matrix , we have

(F9)

Then, using the inequality Eq. , we have

(F10)

We have the expression

(F11)

First, we have . Second, using

(F12)

we have

(F13)

Third, using the inequality Eq. , we have

(F14)

We remark that for a -dimensional matrix , .

APPENDIX G: UPPER BOUND OF THE COST

We consider the compensation method and take ; i.e., the -qubit operation without error is realized as , where is decomposed using basis operations as shown in Eq. . Then the cost for correcting the error in is determined by

(G1)

Decomposition coefficients are determined by , where and are defined in Eqs.  and . Here, and are -dimensional vectors, and is a -dimensional matrix. Therefore, for each element of ,

(G2)

Here, we have used that the maximum absolute value of an element of is . Because , we have . Using the inequality Eq. , we have

(G3)

Here, . There are in total decomposition coefficients; therefore,

(G4)

Here, we have assumed that .

We consider using the set of initial states with errors to realize the initial state , which is in the set of initial states without error . The initial state can be decomposed as (see Appendix ). We use as the measure of the error severity in initial states. Using and , we have the cost for correcting errors in initial states:

(G5)

It is similar for observables. We consider using the set of observables with errors to realize the observable , which is in the set of observables without error . We use as the measure of the error severity in observables. Then, the cost for correcting errors in measured observables is

(G6)

APPENDIX H: ERROR MODELS

We consider a quantum computer with the following operations. The initialization , which prepares the state , the projective measurement [ ], single-qubit Clifford gates and , the single-qubit non-Clifford gate [ ], where , and two-qubit maximally entangling gate [ ], where , which is equivalent to the controlled-not gate and controlled-phase gate up to single-qubit gates. The 16 basis operations can be realized as shown in Table . In order to perform GST, we choose initial states and observables as in Appendix , and we choose the invertible matrix for qubits.

For the initialization, the state prepared is rather than . We can always express the initialization operation with error as , where .

A POVM is defined by a set of operators satisfying . In a POVM, the state is mapped to when the outcome is . When the measurement has error, we may not be able to obtain all the information . Usually there are only two outcomes corresponding to and , respectively. In this case, maybe several values correspond to the same outcome , 1. Therefore, we model the projective measurement [ ] with error as , where is the set of corresponding to the measurement outcome .

For a gate without error , the gate with error can be expressed as . Any noisy gate can be expressed in this form: Because is invertible, we can always take and .

We suppose that time costs of the measurement [ ] and the two-qubit gate [ ] are the same, and time costs of single-qubit gates are negligible.

We distinguish the identity operation and the memory operation. Without error, both of them are the same operation . In any case, the identity operation is , which means that the next operation is performed immediately, so it takes no time and there is not any memory error. When the memory operation is performed, the qubit waits for the next operation, so memory errors may occur on it. We apply the identity operation for measuring the matrix (see Appendix ). In the basis operation set, the operation is replaced by the memory operation.

We set the cycle time of the computing as the time cost of the measurement and the two-qubit gate. In one cycle, only one operation is performed on a qubit. If the operation is a single-qubit gate, the gate is performed at the middle of the cycle; i.e., the overall operation is , where denotes memory noise. If no gate or measurement is performed in the cycle, the overall operation is , which is the error version of the operation in the basis operation set.

We suppose that the single-qubit noise is described by the superoperator , and the two-qubit noise is described by the superoperator . Here, is a parameter describing the intensity of the noise. Then, the initialization noise is , and the measurement with noise is . For single-qubit gates, . For the two-qubit gate, . For the memory operation, .

1. Depolarizing error

The single-qubit depolarizing noise is

(H1)

where , , , and correspond to , , , and , respectively. The two-qubit depolarizing noise is

(H2)

The axis (error rate) in Fig.  is of the two-qubit gate.

2. Dephasing error

The single-qubit dephasing noise is

(H3)

The two-qubit dephasing noise is

(H4)

3. Damping error

The single-qubit damping noise is

(H5)

The two-qubit damping noise is

(H6)

4. Overrotation error

Noise is gate dependent. Initialization, measurement, and memory operation are perfect; i.e., for these operations. Only gates have noise. For gate , . For gate , . For gate , . For gate , . The axis (overrotation) in Fig.  is of the two-qubit gate.

5. Random-field error

Noise is gate dependent. For each operation, the noise is determined by a Hamiltonian. Here, , and each element of is randomly generated with a uniform distribution in the unit circle. We remark that the noise is time independent; i.e., the noise is the same for the same gate implemented at different times.

6. Random-operation error

The operation without noise is . The operation with noise is , which depends on the error parameter. As the same as other models, the error parameter is operation dependent. Each operation can be expressed using a matrix . We suppose the matrix corresponding to is , and the matrix corresponding to is . To generate , first, we generate a Hermitian matrix around , which is , where is generated as the same as the random Hamiltonian. Second, if is not measurement, should be trace preserving. However, may correspond to a non-trace-preserving operation. In this case, we project to the subspace in the matrix space that corresponds to trace-preserving operations; i.e., is the matrix closest to and corresponds to a trace-preserving operation. If is measurement, . Third, may not be positive semidefinite. Therefore, we take if the minimum eigenvalue of is negative, otherwise . Finally, , where the factor makes sure that the operation is still trace preserving and the maximum eigenvalue of is smaller than 1.

APPENDIX I: INSTRUCTION OF THE IMPLEMENTATION OF THE QUASIPROBABILITY METHOD

This section is a self-contained description of how to implement QEM using the quasiprobability decomposition. There are three steps: first, implement GST; second, compute the quasiprobability decomposition; third, implement the quasiprobability decomposition using the Monte Carlo approach.

1. Implementation of gate set tomography

General discussions of GST are given in the main text and Appendix ; therefore, here we describe GST in a more concrete way. GST is implemented to measure all gates used in the quantum computation. We discuss how to measure single-qubit gates first and two-qubit gates afterwards.

To measure a single-qubit gate using GST, we prepare initial states , , , and , where and are the eigenstates of Pauli operators and with the eigenvalue , respectively. We denote these states , , , and , respectively. These initial states can be noisy states, which is the essential advantage of GST; i.e., GST can tolerate state preparation and measurement errors. Then, we apply the gate that we want to measure, for instance, Hadamard gate, gate, and gate in the swap-test circuit. Here we use (superoperator acting on a reduced density matrix) to denote the gate to be measured, which has noise. Subsequently, we measure expectation values for four observables, , , , and , respectively. Here, is a trivial observable whose measurement outcome is always . We denote these observables as , , , , and measurements of these observables can also be noisy. Then, by repeating the experiment to compute the mean value of observables, we can construct the matrix , and matrix elements are

(I1)

Similarly, we can obtain the matrix by choosing not to apply any gate to the initial state, so that the matrix elements are

(I2)

This process is implemented for each qubit and each type of single-qubit gate (including all basis operations).

One will find that there is a freedom in the specification of the gate . Legitimate variants can be obtained as

(I3)

where is an invertible matrix. The matrix can be different for different qubits but must be the same for all gates on the same qubit. We can choose to minimize the cost in QEM. In the case that the error rate of preparing initial states is low, we can take

(I4)

which approximately minimize the cost according to our experience.

Estimations of the initial state and the observable are, respectively,

(I5)
(I6)

Here, ( ) denotes the th column ( th row) of the matrix .

To measure a two-qubit gate using GST, the procedure is basically the same. The only difference is that there are 16 initial states and 16 observables to be measured. Initial states are the tensor products of single-qubit initial states, i.e., , and observables are tensor products of single-qubit observables, i.e., . Here, the superscript is the label of the qubit. Accordingly, the matrix , which is the tensor product of matrices of two qubits, and similarly the matrix , which is the tensor product of matrices of two qubits. We need to implement two-qubit gate GST for each pair of qubits that the two-qubit gate may be performed on.

2. Quasiprobability decomposition

Using results obtained from GST, we can compute the quasiprobability decomposition. From GST we obtain estimations of initial states, observables to be measured, and gates (including basis operations), and they are

These estimations are utilized to compute the quasiprobability decomposition.

Now, we focus on the inverse method. We use to denote the Pauli transfer matrix of the ideal gate (without error). The estimation of the Pauli transfer matrix of the actual gate with noise (i.e., ) is , which is obtained in GST. To compute the decomposition, first, we compute the ideal matrix ; second, we compute the inverse of the noise,

(I7)

and finally, we solve the equation (for the single-qubit gate),

(I8)

to determine quasiprobabilities of the gate . We need to compute quasiprobabilities for each qubit and each gate. For example, for the swap-test circuit, we need to compute the decomposition for the Hadamard gate, gate, and gate of each qubit, and the controlled-not gate of each pair of qubits that the controlled-not gate may be performed on.

For two-qubit gates, the procedure is the same but tensor products of single-qubit basis operations, i.e., (where the superscript is the label of the qubit), are used to decompose the inverse of the noise .

In order to mitigate errors in initial states and measurements of observables, we should solve the following equations for the quantities and :

(I9)
(I10)

for each qubit. Here, is the label of the qubit, is the column vector representing the ideal initial state , and is the row vector representing the ideal observable .

Before implementing the quasiprobability decomposition on a quantum computer, we compute

(I11)
(I12)

for each qubit and

(I13)

for each gate.

3. Monte Carlo implementation of the quasiprobability decomposition

It is vital to note that we use estimations to decompose the inverse of the noise, but usually there is a difference between and the actual basis operation . This difference does not cause any error in the final computing result, because the computing result is invariant under a similarity transformation, as we explained in the main text.

Now, we describe how to implement the quasiprobability decomposition on a quantum computer. We suppose the circuit is sequentially performing gates , on the initial state , and the first qubit is measured in the basis to read the computing result. The procedure can be generalized to the case of measuring multiple qubits.

First, we generate a set of random integers: for each qubit , we randomly select an integer such that each integer would be selected with corresponding probability ; similarly for each gate , we generate random integer with corresponding probability ; and finally we generate random integer with the probability .

Second, on the quantum computer, we implement the following quantum computing for once: we initialize the qubit in the state ; then, we sequentially perform gates , , , , ; finally, we measure the observable . The measurement outcome is .

Third, we compute the effective measurement outcome

(I14)

By repeating these three steps, we can obtain the mean of effective outcomes . The final computing result is , where

(I15)

APPENDIX J: ERROR COMPONENT OF PAULI ERROR AND LEAKAGE ERROR

Taking , it is obvious that in the inhomogeneous Pauli error model, the error component does not depend on the error rate; i.e., . We remark that ratios do not change with .

For the leakage error model, we take . Then the error component is

(J1)

where and . Therefore, varies slowly with .

References (33)

  1. J. O’Gorman and E. T. Campbell, Quantum Computation with Realistic Magic State Factories, Phys. Rev. A 95, 032338 (2017).
  2. Y. Li and S. C. Benjamin, Efficient Variational Quantum Simulator Incorporating Active Error Minimization, Phys. Rev. X 7, 021050 (2017).
  3. A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, A Variational Eigenvalue Solver on a Photonic Quantum Processor, Nat. Commun. 5, 4213 (2014).
  4. D. Wecker, M. B. Hastings, and M. Troyer, Progress Towards Practical Quantum Variational Algorithms, Phys. Rev. A 92, 042303 (2015).
  5. J. R. McClean, J. Romero, R. Babbush, and A. Aspuru-Guzik, The Theory of Variational Hybrid Quantum-Classical Algorithms, New J. Phys. 18, 023023 (2016).
  6. B. Bauer, D. Wecker, A. J. Millis, M. B. Hastings, and M. Troyer, Hybrid Quantum-Classical Approach to Correlated Materials, Phys. Rev. X 6, 031045 (2016).
  7. J. M. Kreula, S. R. Clark, and D. Jaksch, Non-Linear Quantum-Classical Scheme to Simulate Non-Equilibrium Strongly Correlated Fermionic Many-Body Dynamics, Sci. Rep. 6, 32940 (2016).
  8. J. M. Kreula, L. García-Álvarez, L. Lamata, S. R. Clark, E. Solano, and D. Jaksch, Few-Qubit Quantum-Classical Simulation of Strongly Correlated Lattice Fermions, EPJ Quantum Techno. 3, 11 (2016).
  9. Y. Shen, X. Zhang, S. Zhang, J.-N. Zhang, M.-H. Yung, and K. Kim, Quantum Implementation of Unitary Coupled Cluster for Simulating Molecular Electronic Structure, Phys. Rev. A 95, 020501 (2017).
  10. P. J. J. O’Malley et al., Scalable Quantum Simulation of Molecular Energies, Phys. Rev. X 6, 031007 (2016).
  11. A. Kandala, A. Mezzacapo, K. Temme, M. Takita, M. Brink, J. M. Chow, and J. M. Gambetta, Hardware-Efficient Variational Quantum Eigensolver for Small Molecules and Quantum Magnets, Nature (London) 549, 242 (2017).
  12. J. R. McClean, M. E. Kimchi-Schwartz, J. Carter, and W. A. de Jong, Hybrid Quantum-Classical Hierarchy for Mitigation of Decoherence and Determination of Excited States, Phys. Rev. A 95, 042308 (2017).
  13. J. I. Colless, V. V. Ramasesh, D. Dahlen, M. S. Blok, M. S. Kimchi-Schwartz, J. R. McClean, J. Carter, W. A. de Jong, and I. Siddiqi, Computation of Molecular Spectra on a Quantum Processor with an Error-Resilient Algorithm, Phys. Rev. X 8, 011021 (2018).
  14. K. Temme, S. Bravyi, and J. M. Gambetta, Error Mitigation for Short-Depth Quantum Circuits, Phys. Rev. Lett. 119, 180509 (2017).
  15. S. T. Merkel, J. M. Gambetta, J. A. Smolin, S. Poletto, A. D. Córcoles, B. R. Johnson, C. A. Ryan, and M. Steffen, Self-Consistent Quantum Process Tomography, Phys. Rev. A 87, 062119 (2013).
  16. D. Greenbaum, Introduction to Quantum Gate Set Tomography, arXiv:1509.02921.
  17. E. Knill, Fault-Tolerant Postselected Quantum Computation: Threshold Analysis, arXiv:quant-ph/0404104.
  18. J. J. Wallman and J. Emerson, Noise Tailoring for Scalable Quantum Computation via Randomized Compiling, Phys. Rev. A 94, 052325 (2016).
  19. J. O’Gorman, N. H. Nickerson, P. Ross, J. J. L. Morton, and S. C. Benjamin, A Silicon-Based Surface Code Quantum Computer, npj Quantum Inf. 2, 15019 (2016).
  20. A. K. Ekert, C. M. Alves, D. K. L. Oi, M. Horodecki, P. Horodecki, and L. C. Kwek, Direct Estimations of Linear and Nonlinear Functionals of a Quantum State, Phys. Rev. Lett. 88, 217901 (2002).
  21. L. Cincio, Y. Subaşi, A. T. Sornborger, and P. J. Coles, Learning the Quantum Algorithm for State Overlap, arXiv:1803.04114.
  22. J. Romero, J. P. Olson, and A. Aspuru-Guzik, Quantum Autoencoders for Efficient Compression of Quantum Data, Quantum Sci. Technol. 2, 045001 (2017).
  23. C. Song, K. Xu, W. Liu, C. Yang, S.-B. Zheng, H. Deng, Q. Xie, K. Huang, Q. Guo, L. Zhang, P. Zhang, D. Xu, D. Zheng, X. Zhu, H. Wang, Y.-A. Chen, C.-Y. Lu, S. Han, and J.-W. Pan, 10-Qubit Entanglement and Parallel Logic Operations with a Superconducting Circuit, Phys. Rev. Lett. 119, 180511 (2017).
  24. S. Bravyi and D. Gosset, Improved Classical Simulation of Quantum Circuits Dominated by Clifford Gates, Phys. Rev. Lett. 116, 250501 (2016).
  25. S. Bravyi, G. Smith, and J. A. Smolin, Trading Classical and Quantum Computational Resources, Phys. Rev. X 6, 021043 (2016).
  26. M. Howard and E. Campbell, Application of a Resource Theory for Magic States to Fault-Tolerant Quantum Computing, Phys. Rev. Lett. 118, 090501 (2017).
  27. R. Barends et al., Superconducting Quantum Circuits at the Surface Code Threshold for Fault Tolerance, Nature (London) 508, 500 (2014).
  28. T. P. Harty, D. T. C. Allcock, C. J. Ballance, L. Guidoni, H. A. Janacek, N. M. Linke, D. N. Stacey, and D. M. Lucas, High-Fidelity Preparation, Gates, Memory, and Readout of a Trapped-Ion Quantum Bit, Phys. Rev. Lett. 113, 220501 (2014).
  29. C. J. Ballance, T. P. Harty, N. M. Linke, M. A. Sepiol, and D. M. Lucas, High-Fidelity Quantum Logic Gates Using Trapped-Ion Hyperfine Qubits, Phys. Rev. Lett. 117, 060504 (2016).
  30. J. P. Gaebler, T. R. Tan, Y. Lin, Y. Wan, R. Bowler, A. C. Keith, S. Glancy, K. Coakley, E. Knill, D. Leibfried, and D. J. Wineland, High-Fidelity Universal Gate Set for Ion Qubits, Phys. Rev. Lett. 117, 060505 (2016).
  31. M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, Cambridge, England, 2010).
  32. T. Jones, A. Brown, I. Bush, and S. Benjamin, QuEST and High Performance Simulation of Quantum Computers, arXiv:1802.08032.
  33. http://dx.doi.org/10.5281/zenodo.22558.