{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# This cell is added by sphinx-gallery\n# It can be customized to whatever you like\n%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optimizing noisy circuits with Cirq\n===================================\n\n*Author: PennyLane dev team. Posted: 1 June 2020. Last updated: 16 Jun\n2021.*\n\n![](../demonstrations/noisy_circuit_optimization/noisy_qubit.png){width=\"90%\"}\n\nUntil we have fault-tolerant quantum computers, we will have to learn to\nlive with noise. There are lots of exciting ideas and algorithms in\nquantum computing and quantum machine learning, but how well do they\nsurvive the reality of today's noisy devices?\n\nBackground\n----------\n\nQuantum pure-state simulators are great and readily available in a\nnumber of quantum software packages. They allow us to experiment,\nprototype, test, and validate algorithms and research ideas---up to a\ncertain number of qubits, at least.\n\nBut present-day hardware is not ideal. We're forced to confront\ndecoherence, bit flips, amplitude damping, and so on. Does the presence\nof noise in near-term devices impact their use in, for example,\nvariational quantum algorithms </glossary/variational\\_circuit>?\nWon't our models, trained so carefully in simulators, fall apart when we\nrun on noisy devices?\n\nIn fact, there is some optimism that variational algorithms may be the\nbest type of algorithms on near-term devices, and could have an in-built\nadaptability to noise that more rigid textbook algorithms do not\npossess. Variational algorithms are somewhat robust against the fact\nthat the device they are run on may not be ideal. Being variational in\nnature, they can be tuned to \"work around\" noise to some extent.\n\nQuantum machine learning leverages a lot of tools from its classical\ncounterpart. Fortunately, there is great evidence that machine learning\nalgorithms can not only be robust to noise, but can even benefit from\nit! 
Examples include the use of [reduced-precision arithmetic in deep\nlearning](https://dl.acm.org/doi/abs/10.5555/3045118.3045303), the\nstrong performance of [stochastic gradient\ndescent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent), and\nthe use of \"dropout\" noise to [prevent\noverfitting](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf).\n\nWith this evidence to motivate us, we can still hope to find, extract,\nand work with the underlying quantum \"signal\" that is influenced by a\ndevice's inherent noise.\n\nNoisy circuits: creating a Bell state\n-------------------------------------\n\nLet's consider a simple quantum circuit which performs a standard\nquantum information task: the creation of an entangled state and the\nmeasurement of a [Bell\ninequality](https://en.wikipedia.org/wiki/Bell%27s_theorem) (also known\nas the [CHSH\ninequality](https://en.wikipedia.org/wiki/CHSH_inequality)).\n\nSince we'll be dealing with noise, we'll need to use a simulator that\nsupports noise and density-matrix representations of quantum states (in\ncontrast to many simulators, which use a pure-state representation).\n\nFortunately, [Cirq](https://cirq.readthedocs.io) provides mixed-state\nsimulators and noisy operations natively, so we can use the\n[PennyLane-Cirq plugin](https://pennylane-cirq.readthedocs.io) to carry\nout our noisy simulations.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import pennylane as qml\nfrom pennylane import numpy as np\nimport matplotlib.pyplot as plt\n\ndev = qml.device(\"cirq.mixedsimulator\", wires=2, shots=1000)\n\n# CHSH observables\nA1 = qml.PauliZ(0)\nA2 = qml.PauliX(0)\nB1 = qml.Hermitian(np.array([[1, 1], [1, -1]]) / np.sqrt(2), wires=1)\nB2 = qml.Hermitian(np.array([[1, -1], [-1, -1]]) / np.sqrt(2), wires=1)\nCHSH_observables = [A1 @ B1, A1 @ B2, A2 @ B1, A2 @ B2]\n\n# subcircuit for creating an entangled pair of qubits\ndef bell_pair():\n qml.Hadamard(wires=0)\n qml.CNOT(wires=[0, 1])\n\n# circuits for measuring each distinct observable\n@qml.qnode(dev)\ndef measure_A1B1():\n bell_pair()\n return qml.expval(A1 @ B1)\n\n@qml.qnode(dev)\ndef measure_A1B2():\n bell_pair()\n return qml.expval(A1 @ B2)\n\n@qml.qnode(dev)\ndef measure_A2B1():\n bell_pair()\n return qml.expval(A2 @ B1)\n\n@qml.qnode(dev)\ndef measure_A2B2():\n bell_pair()\n return qml.expval(A2 @ B2)\n\ncircuits = qml.QNodeCollection([measure_A1B1,\n measure_A1B2,\n measure_A2B1,\n measure_A2B2])\n\n# now we measure each circuit and construct the CHSH inequality\nexpvals = circuits()\n\n# The CHSH operator is A1 @ B1 + A1 @ B2 + A2 @ B1 - A2 @ B2\nCHSH_expval = np.sum(expvals[:3]) - expvals[3]\nprint(CHSH_expval)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The output here is $2\\sqrt{2}$, which is the maximal value of the CHSH\ninequality. States which have a value $\\langle CHSH \\rangle \\geq 2$ can\nsafely be considered \"quantum\".\n\n> **note**\n>\n> In this situation \"quantum\" means that there is\n>\n> : no [local hidden variable\n> theory](https://en.wikipedia.org/wiki/Local_hidden-variable_theory)\n> which could produce these measurement outcomes. It does not\n> strictly mean the presence of entanglement.\n>\nNow let's turn up the noise! \ud83d\udce2 \ud83d\udce2 \ud83d\udce2\n\nCirq provides a number of noisy channels that are not part of PennyLane\ncore. This is no issue, as the\n[PennyLane-Cirq](https://pennylane-cirq.readthedocs.io) plugin provides\nthese and allows them to be used directly in PennyLane circuit\ndeclarations.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from pennylane_cirq import ops as cirq_ops\n# Note that the 'Operation' op is a generic base class\n# from PennyLane core.\n# All other ops are provided by Cirq.\navailable_ops = [op for op in dir(cirq_ops) if not op.startswith('_')]\nprint(\"\\n\".join(available_ops))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"PennyLane operations and external framework-specific operations can be\ninterwoven freely in circuits that use that plugin's device for\nexecution. In this case, the Cirq-provided channels can be used with\nCirq's mixed-state simulator.\n\nWe'll use the `BitFlip` channel, which has the effect of randomly\nflipping the qubits in the computational basis.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"noise_vals = np.linspace(0, 1, 25)\n\nCHSH_vals = []\nnoisy_expvals = []\n\nfor p in noise_vals:\n # we overwrite the bell_pair() subcircuit to add\n # extra noisy channels after the entangled state is created\n def bell_pair():\n qml.Hadamard(wires=0)\n qml.CNOT(wires=[0, 1])\n cirq_ops.BitFlip(p, wires=0)\n cirq_ops.BitFlip(p, wires=1)\n # measuring the circuits will now use the new noisy bell_pair() function\n expvals = circuits()\n noisy_expvals.append(expvals)\nnoisy_expvals = np.array(noisy_expvals)\nCHSH_expvals = np.sum(noisy_expvals[:,:3], axis=1) - noisy_expvals[:,3]\n\n# Plot the individual observables\nplt.plot(noise_vals, noisy_expvals[:, 0], 'D',\n label = r\"$\\hat{A1}\\otimes \\hat{B1}$\", markersize=5)\nplt.plot(noise_vals, noisy_expvals[:, 1], 'x',\n label = r\"$\\hat{A1}\\otimes \\hat{B2}$\", markersize=12)\nplt.plot(noise_vals, noisy_expvals[:, 2], '+',\n label = r\"$\\hat{A2}\\otimes \\hat{B1}$\", markersize=10)\nplt.plot(noise_vals, noisy_expvals[:, 3], 'v',\n label = r\"$\\hat{A2}\\otimes \\hat{B2}$\", markersize=10)\nplt.xlabel('Noise parameter')\nplt.ylabel(r'Expectation value $\\langle \\hat{A}_i\\otimes\\hat{B}_j\\rangle$')\nplt.legend()\nplt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"By adding the bit-flip noise, we have degraded the value of the CHSH\nobservable. The first two observables $\\hat{A}_1\\otimes \\hat{B}_1$ and\n$\\hat{A}_1\\otimes \\hat{B}_2$ are sensitive to this noise parameter.\nTheir value is weakened when the noise parameter is not 0 or 1 (note\nthat the the CHSH operator is symmetric with respect to bit flips).\n\nThe latter two observables, on the other hand, are seemingly unaffected\nby the noise at all.\n\nWe can see that even when noise is present, there may still be subspaces\nor observables which are minimally affected or unaffected. This gives us\nsome hope that variational algorithms can learn to find and exploit such\nnoise-free substructures on otherwise noisy devices.\n\nWe can also plot the CHSH observable in the noisy case. Remember, values\ngreater than 2 can safely be considered \"quantum\".\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"plt.plot(noise_vals, CHSH_expvals, label=\"CHSH\")\nplt.plot(noise_vals, 2 * np.ones_like(noise_vals),\n label=\"Quantum-classical boundary\")\nplt.xlabel('Noise parameter')\nplt.ylabel('CHSH Expectation value')\nplt.legend()\nplt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Too much noise (around 0.2 in this example), and we lose the quantumness\nwe created in our circuit. But if we only have a little noise, the\nquantumness undeniably remains. So there is still hope that quantum\nalgorithms can do something useful, even on noisy near-term devices, so\nlong as the noise is not high.\n\n> **note**\n>\n> In Google's quantum supremacy paper \\[\\#arute2019\\]\\_, they were able\n> to show that some small signature of quantumness remained in their\n> computations, even after a deep many-qubit noisy circuit was executed.\n"
]
},
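{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough cross-check on where the CHSH curve crosses the classical bound, the noisy value can be worked out by hand.\n",
"The sketch below assumes the bit-flip channel acts as $\\rho \\to (1-p)\\rho + pX\\rho X$ on each qubit. This convention is an assumption and is not checked against the plugin.\n",
"Under it, $\\hat{Z}$ is rescaled by $(1-2p)$ while $\\hat{X}$ is left unchanged, which is consistent with only the two $\\hat{A}_1$ observables degrading.\n",
"The names `chsh_bitflip` and `crossover` are illustrative only."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import math\n",
"\n",
"def chsh_bitflip(p):\n",
"    # Analytic CHSH value for a Bell pair with bit-flip probability p on each qubit.\n",
"    # Assumed channel convention: rho -> (1 - p) * rho + p * X rho X.\n",
"    zz = (1 - 2 * p) ** 2  # <Z x Z> is damped once per qubit\n",
"    xx = 1.0               # <X x X> is untouched by bit flips\n",
"    # A1B1 and A1B2 each equal zz / sqrt(2); A2B1 equals xx / sqrt(2);\n",
"    # A2B2 equals -xx / sqrt(2) and enters the CHSH sum with a minus sign.\n",
"    return (2 * zz + 2 * xx) / math.sqrt(2)\n",
"\n",
"# noise strength at which the CHSH value drops to the classical bound of 2\n",
"crossover = (1 - math.sqrt(math.sqrt(2) - 1)) / 2\n",
"\n",
"print(chsh_bitflip(0.0))\n",
"print(crossover)\n",
"print(chsh_bitflip(crossover))"
]
},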
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optimizing noisy circuits\n=========================\n\nNow, how does noise affect the ability to optimize or train a\nvariational circuit?\n\nLet's consider an analog of the basic\nqubit rotation tutorial </demos/tutorial\\_qubit\\_rotation>, but\nwhere we add an extra noise channel after the gates.\n\n> **note**\n>\n> We model the noise process as the application of ideal noise-free\n>\n> : gates, followed by the action of a noisy channel. This is a common\n> technique for modelling noise, but may not be appropriate for\n> all situations.\n>\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"@qml.qnode(dev)\ndef circuit(gate_params, noise_param=0.0):\n qml.RX(gate_params[0], wires=0)\n qml.RY(gate_params[1], wires=0)\n cirq_ops.Depolarize(noise_param, wires=0)\n return qml.expval(qml.PauliZ(0))\n\ngate_pars = [0.54, 0.12]\nprint(\"Expectation value:\", circuit(gate_pars))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this case, the depolarizing channel degrades the qubit's density\nmatrix $\\rho$ towards the state\n\n$$\\rho' = \\tfrac{1}{3}\\left[X\\rho X + Y\\rho Y + Z\\rho Z\\right]$$\n\n(at the value $p=\\frac{3}{4}$, it passes through the maximally mixed\nstate). We can see this in our circuit by looking at how the final\n\\~pennylane.ops.PauliZ expectation value changes as a function of the\nnoise strength.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"noise_vals = np.linspace(0., 1., 20)\nexpvals = [circuit(gate_pars, noise_param=p) for p in noise_vals]\n\nplt.plot(noise_vals, expvals, label=\"Expectation value\")\nplt.plot(noise_vals, np.ones_like(noise_vals), '--', label=\"Highest possible\")\nplt.plot(noise_vals, -np.ones_like(noise_vals), '--', label=\"Lowest possible\")\nplt.ylabel(r\"Expectation value $\\langle \\hat{Z} \\rangle$\")\nplt.xlabel(r\"Noise strength $p$\")\nplt.legend()\nplt.show()"
]
},
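{
"cell_type": "markdown",
"metadata": {},
"source": [
"The shape of this curve can be reproduced by hand.\n",
"The sketch below assumes the single-qubit depolarizing channel is $\\rho \\to (1-p)\\rho + \\tfrac{p}{3}(X\\rho X + Y\\rho Y + Z\\rho Z)$, which shrinks every Bloch-vector component by the factor $1 - \\tfrac{4p}{3}$. This convention is an assumption and is not checked against the plugin.\n",
"For `RX` followed by `RY` acting on $|0\\rangle$, the noise-free value is $\\cos\\theta_1\\cos\\theta_2$.\n",
"The function name `noisy_z` is illustrative."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import math\n",
"\n",
"def noisy_z(theta1, theta2, p):\n",
"    # noise-free <Z> after RX(theta1) then RY(theta2) on |0>\n",
"    ideal = math.cos(theta1) * math.cos(theta2)\n",
"    # assumed depolarizing attenuation of the Bloch vector\n",
"    return (1 - 4 * p / 3) * ideal\n",
"\n",
"# the value vanishes at p = 3/4 and changes sign beyond it\n",
"for p in (0.0, 0.3, 0.75, 1.0):\n",
"    print(p, noisy_z(0.54, 0.12, p))"
]
},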
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's fix the noise parameter and see how the noise affects the\noptimization of our circuit. The goal is the same as the\nqubit rotation tutorial </demos/tutorial\\_qubit\\_rotation>, i.e.,\nto tune the qubit state until it has a `PauliZ` expectation value of\n$-1$ (the lowest possible).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# declare the cost functions to be optimized\ndef cost(x):\n return circuit(x, noise_param=0.0)\n\ndef noisy_cost(x):\n return circuit(x, noise_param=0.3)\n\n# initialize the optimizer\nopt = qml.GradientDescentOptimizer(stepsize=0.4)\n\n# set the number of steps\nsteps = 100\n# set the initial parameter values\ninit_params = np.array([0.011, 0.055])\nnoisy_circuit_params = init_params\nparams = init_params\n\nfor i in range(steps):\n # update the circuit parameters\n # we can optimize both in the same training loop\n params = opt.step(cost, params)\n noisy_circuit_params = opt.step(noisy_cost, noisy_circuit_params)\n\n if (i + 1) % 5 == 0:\n print(\"Step {:5d}. Cost: {: .7f}; Noisy Cost: {: .7f}\".\n format(i + 1,\n cost(params),\n noisy_cost(noisy_circuit_params)))\n\nprint(\"\\nOptimized rotation angles (noise-free case):\")\nprint(\"({: .7f}, {: .7f})\".format(*params))\nprint(\"Optimized rotation angles (noisy case):\")\nprint(\"({: .7f}, {: .7f})\".format(*noisy_circuit_params))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are a couple interesting observations here:\n\ni) The noisy circuit isn't able to achieve the same final cost function\n value as the ideal circuit. This is because the noise causes the\n state to become irreversibly mixed. Mixed states can't achieve the\n same extremal expectation values as pure states.\nii) However, both circuits still converge to the *same parameter values*\n $(0,\\pi)$, despite having different final states.\n\nIt could have been the case that noisy devices irreparably damage the\noptimization of variational circuits, steering us towards parameter\nvalues which are not at all useful. Luckily, at least for the simple\nexample above, this is not the case. *Optimizations on noisy devices can\nstill lead to similar parameter values as when we run on ideal devices.*\n"
]
},
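{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see why both runs can end at the same angles, here is a small sketch that repeats the gradient-descent update on a closed-form stand-in for the cost, $c\\,\\cos\\theta_1\\cos\\theta_2$, where $c = 1 - \\tfrac{4p}{3}$ is an assumed depolarizing attenuation factor.\n",
"The closed form and the helper `descend` are assumptions made for illustration and are not the simulated circuit itself.\n",
"In this model the noise only rescales the gradient, so the minimizer does not move, while the attainable cost shrinks from $-1$ to $-c$."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import math\n",
"\n",
"def descend(c, theta=(0.011, 0.055), stepsize=0.4, steps=100):\n",
"    # plain gradient descent on the assumed cost c * cos(t1) * cos(t2)\n",
"    t1, t2 = theta\n",
"    for _ in range(steps):\n",
"        g1 = -c * math.sin(t1) * math.cos(t2)  # d(cost)/d(t1)\n",
"        g2 = -c * math.cos(t1) * math.sin(t2)  # d(cost)/d(t2)\n",
"        t1, t2 = t1 - stepsize * g1, t2 - stepsize * g2\n",
"    return t1, t2, c * math.cos(t1) * math.cos(t2)\n",
"\n",
"# noise-free case (c = 1) and noisy case (p = 0.3)\n",
"print(descend(1.0))\n",
"print(descend(1 - 4 * 0.3 / 3))"
]
},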
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Understanding the effect of noisy channels\n==========================================\n\nLet's dig a bit into the underlying quantum information theory to\nunderstand better what's happening \\[\\#meyer2020\\]\\_. Expectation values\nof variational circuits </glossary/variational\\_circuit>, like the\none we are measuring, are composed of three pieces:\n\ni) an initial quantum state $\\rho$ (usually the zero state);\nii) a parameterized unitary transformation $U(\\theta)$); and\niii) measurement of a final observable $\\hat{B}$.\n\nThe equation for the expectation value is given by the [Born\nrule](https://en.wikipedia.org/wiki/Born_rule):\n\n$$\\langle \\hat{B} \\rangle (\\theta) =\n \\mathrm{Tr}(\\hat{B}U(\\theta)\\rho U^\\dagger(\\theta)).$$\n\nWhen optimizing, we can compute gradients of many common gates using the\nparameter-shift rule </glossary/parameter\\_shift>:\n\n$$\\nabla_\\theta\\langle \\hat{B} \\rangle(\\theta)\n = \\frac{1}{2}\n \\left[\n \\langle \\hat{B} \\rangle\\left(\\theta + \\frac{\\pi}{2}\\right)\n - \\langle \\hat{B} \\rangle\\left(\\theta - \\frac{\\pi}{2}\\right)\n \\right].$$\n\nIn our example, the parametrized unitary $U(\\theta)$ is split into two\ngates, $U = U_2 U_1$, where $U_1=R_X$ and $U_1=R_Y$, and each takes an\nindependent parameter $\\theta_i$.\n\nWhat happens when we apply a noisy channel $\\Lambda$ after the gates? 
In\nthis case, the expectation value is now taken with respect to the noisy\ncircuit:\n\n$$\\langle \\hat{B} \\rangle (\\theta) =\n \\mathrm{Tr}\\left(\\hat{B}\\Lambda\\left[\n U(\\theta)\\rho U^\\dagger(\\theta)\n \\right]\\right).$$\n\nThus, we can treat it as the expectation value of the same observable,\nbut with respect to a different state\n$\\rho' = \\Lambda\\left[U(\\theta)\\rho U^\\dagger(\\theta)\\right]$.\n\nAlternatively, using the Heisenberg picture, we can transfer the channel\n$\\Lambda$ acting on the state $U(\\theta)\\rho U^\\dagger(\\theta)$ into the\n*adjoint channel* $\\Lambda^\\dagger$ acting on the observable $\\hat{B}$,\ntransforming it to a new observable\n$\\hat{B}' = \\Lambda^\\dagger[\\hat{B}]$.\n\nWith the channel present, the expectation value can be interpreted as if\nwe had the same variational state, but measured a different observable:\n\n$$\\langle \\hat{B} \\rangle (\\theta) =\n \\mathrm{Tr}(\\hat{B}'U(\\theta)\\rho U^\\dagger(\\theta)) =\n \\langle \\hat{B}' \\rangle (\\theta).$$\n\nThis has immediate consequences for the parameter-shift rule. With the\nchannel present, we have simply\n\n$$\\nabla_\\theta\\langle \\hat{B} \\rangle(\\theta)\n = \\frac{1}{2}\n \\left[\n \\langle \\hat{B}' \\rangle\\left(\\theta + \\frac{\\pi}{2}\\right)\n - \\langle \\hat{B}' \\rangle\\left(\\theta - \\frac{\\pi}{2}\\right)\n \\right].$$\n\nIn other words, the parameter-shift rule continues to hold for all\ngates, even when we have additional noise!\n\n> **note**\n>\n> In the above derivation, we implicitly assumed that the channel\n> does not depend on the variational circuit's parameters. If the\n> channel depended on the particular state, or if it depended on the\n> parameters $\\theta$, we would need to be more careful.\n\nLet's confirm the above derivation with an example.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"angles = np.linspace(0, 2 * np.pi, 50)\ntheta2 = np.pi / 4\n\ndef param_shift(theta1):\n return 0.5 * (noisy_cost([theta1 + np.pi / 2, theta2]) - \\\n noisy_cost([theta1 - np.pi / 2, theta2]))\n\nnoisy_expvals = [noisy_cost([theta1, theta2]) for theta1 in angles]\nnoisy_param_shift = [param_shift(theta1) for theta1 in angles]\n\nplt.plot(angles, noisy_expvals,\n label=\"Expectation value\") # looks like 0.4 * cos(phi)\nplt.plot(angles, noisy_param_shift,\n label=\"Parameter-shift value\") # looks like -0.4 * sin(phi)\nplt.ylabel(r\"Expectation value $\\langle \\hat{Z} \\rangle$\")\nplt.xlabel(r\"Angle $\\theta_1$\")\nplt.legend()\nplt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"By inspecting the two curves, we can see that the parameter-shift rule\ngives the correct gradient of the expectation value, even with the\npresence of the noisy channel!\n\nIn this example, the influence of the channel is to attenuate the\nmaximal amplitude that the qubit state can achieve ($\\approx 0.4$). But\neven though the qubit's amplitude is attenuated, the gradient computed\nby the parameter-shift rule still \"points in the right direction\".\n\nThis backs up the observation we made earlier that the result of the\noptimization gave the correct values for the angle parameters, but the\nvalue of the final cost function was lower than the noise-free case.\n"
]
},
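{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same agreement can be checked without a simulator, under a stated assumption: for a parameter-independent depolarizing channel with strength $p$, the noisy expectation value is modelled here as $c\\,\\cos\\theta_1\\cos\\theta_2$ with $c = 1 - \\tfrac{4p}{3}$.\n",
"The sketch compares the parameter-shift estimate with a central finite difference of that closed form.\n",
"The names `expval`, `param_shift` and `finite_diff` are illustrative, and the closed form stands in for the circuit rather than reproducing it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import math\n",
"\n",
"C = 1 - 4 * 0.3 / 3   # assumed attenuation for p = 0.3\n",
"THETA2 = math.pi / 4  # second angle held fixed\n",
"\n",
"def expval(theta1):\n",
"    # closed-form stand-in for the noisy <Z>\n",
"    return C * math.cos(theta1) * math.cos(THETA2)\n",
"\n",
"def param_shift(theta1):\n",
"    # parameter-shift estimate of d<Z>/d(theta1)\n",
"    return 0.5 * (expval(theta1 + math.pi / 2) - expval(theta1 - math.pi / 2))\n",
"\n",
"def finite_diff(theta1, h=1e-6):\n",
"    # central finite difference used as an independent reference\n",
"    return (expval(theta1 + h) - expval(theta1 - h)) / (2 * h)\n",
"\n",
"for theta1 in (0.0, 0.7, 2.0, 4.5):\n",
"    print(theta1, param_shift(theta1), finite_diff(theta1))"
]
},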
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Interpreting noisy circuit optimizations\n========================================\n\nDespite the observations that we can compute gradients for noisy\nchannels, and that optimization may lead to the same parameter values\nfor both noise-free and noisy circuits, we must still remain cautious in\nhow we interpret the results.\n\nWe can evaluate the correct gradient for the expectation value\n\n$$\\langle \\hat{B} \\rangle =\n \\mathrm{Tr}\\left(\\hat{B}\\Lambda\\left[\n U(\\theta)\\rho U^\\dagger(\\theta)\n \\right]\\right),$$\n\nbut, because the noisy channel is present, *this expectation value may\nnot reflect the actual expectation value we wanted to compute*. This is\nimportant to keep in mind for certain algorithms that have a physical\ninterpretation for the variational circuit.\n\nFor example, in the\nvariational quantum eigensolver </demos/tutorial\\_vqe>, we want to\nfind the ground-state energy of a physical system. If there is an\nappreciable amount of noise present, the state we are optimizing will\nnecessarily become mixed, and we should be careful interpreting the\noptimum value as the exact ground-state energy.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"References\n==========\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.2"
}
},
"nbformat": 4,
"nbformat_minor": 0
}