Quantum Computing and Elliptic Curve Cryptography

In the last post, we have learned what an elliptic curve is, how it is used in cryptography and we saw an example of how Alice can encrypt her message to Bob using Bob’s public key and her own private key. We also saw how Bob is able to decrypt this message by multiplying Alice’s public key with his private key and subtracting the result from the encrypted message to get the clear text message.

An eavesdropper, by the name of Eve, has seen the exchanges between Alice and Bob and she wants to be able to decrypt the message of Alice. However, she only sees the public keys of Alice and Bob, the encrypted message, the elliptic curve equation used and the modulus. How can she decrypt the message?

Discrete Logarithm Problem

Eve knows both the public key of Alice and Bob. If she can somehow compute the private key, she will be able to decrypt the message. The public key of Alice is

\text{Alice Public Key} = P_a = e_a B

Where e_a is the private key of Alice and B is the base point of the elliptic curve agreed before hand by Alice and Bob. Using a group theoretic notation, we can also write this as

P_a = B^{e_a}

where the group “multiplication” operation is defined at the addition of points defined in the last blog post, that is,

B^n = \underbrace{B+B+B}_{\text{n times}}

and B^0 = O, the identity element (point at infinity).

If P_a = B^{e_a}, then

e_a = \log_{B} P_a

Solving for e_a is called the discrete logarithm problem.

Quantum Algorithm

If Eve has access to a quantum computer, she can use Shor’s Algorithm to find the value of e_a. The trick to using Shor’s algorithm is to create a periodic function f such that the period of f is e_a. To do this, define

f(a,b) = B^{a-e_ab}

If (a_1,b_1) and (a_2,b_2), notice that f(a_2,b_2) = f(a_1, b_1) if (a_2, b_2) = (a_1, b_1) + k(e_a,1). To see this,

\begin{array}{rl}  f(a_2,b_2) &= B^{a_2-e_ab_2} = B^{(a_1+ke_a) - e_a(b_1+k)}\\  &= B^{a_1+ke_a - e_ab_1 - ke_a}\\  &= B^{a_1 - e_ab_1}\\  &= f(a_1,b_1)  \end{array}

Eve will prepare 2 input registers and prepare the quantum state to be a superposition of all values of the basis vectors \left|a\right\rangle\left|b\right\rangle

\displaystyle \frac{1}{p}\sum_{a=0}^{p-1}\sum_{b=0}^{p-1} \left|a\right\rangle\left|b\right\rangle\otimes\left|0\right\rangle

She will then apply an operator \mathbf{U}_f defined as

\mathbf{U}_f \left|a\right\rangle \left|b\right\rangle\left|0\right\rangle = \left|a\right\rangle \left|b\right\rangle\left|f(a,b)\right\rangle

to get

\displaystyle \frac{1}{p} \sum_{a=0}^{p-1}\sum_{b=0}^{p-1} \left|a\right\rangle\left|b\right\rangle\left|f(a,b)\right\rangle

where f is the periodic function defined as above.

A measurement of the output register system will then yield a specific value f, which corresponds to m values of a and b because of the periodic nature of the function:

\displaystyle \frac{1}{\sqrt{m}} \sum_{k=0}^{m-1} \left|a_0 + ke_a\right\rangle\left|b_0+k\right\rangle

To solve for e_a, we apply the Quantum Fourier Transform as defined here:

\begin{array}{lr}  \displaystyle \mathbf{U}_{\mathbf{FT}}\otimes\mathbf{U}_{\mathbf{FT}}\frac{1}{\sqrt{m}} \sum_{k=0}^{m-1} \left|a_0 + ke_a\right\rangle\left|b_0+k\right\rangle\\  = \displaystyle \frac{1}{\sqrt{m}} \sum_{k=0}^{m-1} \mathbf{U}_{\mathbf{FT}}\left|a_0 + ke_a\right\rangle\otimes \mathbf{U}_{\mathbf{FT}}\left|b_0+k\right\rangle\\  = \displaystyle \frac{1}{(p-1)\sqrt{m}} \sum_{k=0}^{m-1} \sum_{x=0}^{p-2}\sum_{y=0}^{p-2}e^{2\pi i (a_0+ke_a)x/(p-1)} e^{2\pi i (b_0+k)y/(p-1)}\left|x\right\rangle\left|y\right\rangle\\  = \displaystyle \frac{1}{(p-1)\sqrt{m}} \sum_{k=0}^{m-1} \sum_{x=0}^{p-2}\sum_{y=0}^{p-2}e^{2\pi i/(p-1) [(a_0x + b_0y) +k(e_ax +y)]}\left|x\right\rangle\left|y\right\rangle\\  = \displaystyle \frac{1}{(p-1)\sqrt{m}} \sum_{x=0}^{p-2}\sum_{y=0}^{p-2}e^{2\pi i (a_0x + b_0y)/(p-1)} \sum_{k=0}^{m-1} e^{2\pi i k(e_ax +y)/(p-1)}\left|x\right\rangle\left|y\right\rangle\\  \end{array}

The probability of getting a specify x and y is given by the square of the amplitudes:

\displaystyle \frac{1}{(p-1)^2m}\underbrace{\left|\sum_{k=0}^{m-1} e^{2\pi i k(e_ax +y)/(p-1)}\right|}

If we let

\displaystyle f = \frac{e_ax +y}{p-1}

we can write the quantity in underbrace as

\displaystyle \frac{\sin^2(\pi fm)}{\sin^2(\pi f)}

which is maximum when f is an integer, that is,

e_ax + y = 0 \mod p-1

After a measurement of the input registers, we get a specific value of x and y. We can then compute the private key using:

e_a = -y x^{-1} \mod p-1

where x^{-1} is the inverse of x \mod p-1.


Suppose after we measure the input registers, we get the values x=51, y=44. We can compute for the private key as follows:

\begin{array}{rl}  e_a &= -y x^{-1} \mod p-1\\  &= -44(51)^{-1} \mod 70\\  &= -44(11) \mod 70\\  &= 6  \end{array}

Decrypting the Message

Suppose we are given the following parameters: The elliptic curve equation is y^2 = x^3 - 3x +3 \mod 71 , the base point is (0,28), Alice’s public key is (61,58), Bob’s public key is (30,2) and the encrypted message is (23,12).

Eve uses the Quantum Computer to compute Alice private key to get e_a = 6. She can then multiply Bob’s public key with Alice’s private key to get 6\times (30,2) = (10,60). Subtracting this from the encrypted text, we get (23,12) + (10,-60) = (17,54), which is the decrypted message.


Elliptic Curve Cryptography

Let y^2 = x^3 + ax + b be the equation of the elliptic curve E. Any line L that intersects the curve will intersect the curve in at most 3 points. Below are examples of how these look like.

A mathematical structure can defined on E. We can “add” two points on E to get a third point. First, let’s define the negative of a point. A point Q is the negative of a point P if the point Q is the mirror image of the point P with respect to the x-axis that is,

Q = - P

From the relation above, we can define O to be

P + (-P) = O

The point O is the “point at infinity”.

Let L be a line that intersects E at points P, Q and R. We define the sum

P + Q + R = O

From this definition, we see that the the sum of any of the two points is the negative of the third point, that is, P + Q = -R.

So how do we actually add 2 points given their coordinates? It turns out that if we are given the x and y coordinates of 2 points, we can easily compute the third point. To see this, let y = mx +c be the equation of the line that intersects the elliptic curve y^2 = x^3 + ax + b at points P, Q and R. Solving for the x-coordinates of the intersection, we get

\begin{array}{rl}  \displaystyle x_1 &= \displaystyle \frac{m^2}{3} + \frac{2^{1/3}}{3} A - \frac{B}{3}\\  \displaystyle x_2 &= \displaystyle \frac{m^2}{3} - \frac{(1+i\sqrt{3})}{3 2^{2/3}}A + (1-i\sqrt{3})\frac{B}{6}\\  \displaystyle x_3 &= \displaystyle \frac{m^2}{3} - \frac{(1-i\sqrt{3})}{3 2^{2/3}}A + (1+i\sqrt{3})\frac{B}{6}  \end{array}

A = \frac{\left(3 a-6 c m-m^4\right)}{  \sqrt[3]{\sqrt{\left(9 a m^2+27 b-27 c^2-18 c m^3-2 m^6\right)^2+4 \left(3 a-6 c  m-m^4\right)^3}+9 a m^2+27 b-27 c^2-18 c m^3-2 m^6}}

B = \frac{\sqrt[3]{\sqrt{\left(9am^2+27b-27c^2-18cm^3-2m^6\right)^2+4\left(3a-6  cm-m^4\right)^3}+9am^2+27b-27c^2-18cm^3-2m^6}}{  \sqrt[3]{2}}

From this we see that

x_1 + x_2 + x_3 = m^2

Therefore, if we have x_1 and x_2, we can get x_3 from the formula:

x_3 = m^2 -x_1 -x_2

Once we get x_3, we can get y_3 from the formula of the slope:

\begin{array}{rl}  \displaystyle \frac{y_3-y_1}{x_3-x_1} &= m\\  y_3-y_1 &= m(x_3 - x_1)\\  y_3 &= m(x_3 - x_1) + y_1  \end{array}

The variable m is the slope of the line. If the line is tangent to the curve E at point P, the slope m is calculated using the derivative of E at the P:

\begin{array}{rl}  d(y^2) &= d(x^3 + ax + b) \\  2y dy &= (3x^2 + a) dx\\  \displaystyle \frac{dy}{dx} &= m = \displaystyle \frac{3x^2 +a}{2y}  \end{array}

Addition Formula

Now that we have seen how to add two points, let’s summarize the formula here:

Given points P=(x_1,y_1) and Q=(x_2,y_2), the third point R=(x_3, -y_3) can be computed from

x_3 = m^2 -x_1 -x_2

y_3 = m(x_3 - x_1) + y_1


m = \begin{cases}  \displaystyle \frac{y_1 - y_2}{x_1 - x_2} & P \ne Q\\  \displaystyle \frac{3x^2 +a}{2y} & P = Q  \end{cases}

Important! Take note that R=(x_3,-y_3). The value of y_3 should be the negative of the computed value.

Example 1

Let the elliptic curve be y^2 = x^3 -3 + 3, then points P=(-2,1) and Q=(0, 1.732051) lie on the curve. The third point can be computed as follows:

\begin{array}{rl}  m &= \displaystyle \frac{y_1 - y_2}{x_1 - x_2}\\  &= \displaystyle \frac{1-1.732051}{-2 - 0}\\  &= 0.3660255  \end{array}

\begin{array}{rl}  x_3 &= m^2 - x_1 - x_2\\  &= (0.3660255)^2 - (-2) - 0\\  &= 2.133975  \end{array}

\begin{array}{rl}  y_3 &= m(x_3 - x_1) + y_1\\  &= 0.3660255 ( 2.133975 - (-2)) + 1\\  &= 2.51314  \end{array}

Therefore R=(x_3,-y_3) = (2.133975, -2.51314).

Example 2

If P=Q=(-2,1), we have:

\begin{array}{rl}  m &= \displaystyle \frac{3x^2 +a}{2y}\\  &= \displaystyle \frac{3(-2)^2 -3}{2(1)}\\  &= 4.5  \end{array}

\begin{array}{rl}  x_3 &= m^2 - x_1 - x_2\\  &= (4.5)^2 - (-2) - (-2)\\  &= 24.25  \end{array}

\begin{array}{rl}  y_3 &= m(x_3 - x_1) + y_1 \\  &= 4.5 ( 24.25 - (-2)) + 1\\  &= 119.125  \end{array}

Therefore, R=(x_3,-y_3) = (24.5, -119.125)

Group Structure

The set of points of E together with the point at infinity, E\cup \{O\}, forms a group under the addition operation defined above.

  • It is associative: (P+Q) + R = P + (Q+R)
  • Every element has an inverse: P + (-P) = O
  • It contains the identity element. From the inverse we have P - P = 0. Therefore, P + O = P, which makes O the identity element.
  • It is closed: For every P, Q, P + Q \in E\cup \{O\}

Modulo Arithmetic

The addition formula also works for modular arithmetic. Let p be a prime number, then given two points P and Q, the third point can be computed as

x_3 = m^2 -x_1 -x_2 \mod p

y_3 = m(x_3 - x_1) + y_1 \mod p


m = \begin{cases}  \displaystyle \frac{y_1 - y_2}{x_1 - x_2} \mod p & P \ne Q\\  \displaystyle \frac{3x^2 +a}{2y} \mod p & P = Q  \end{cases}


Suppose p=71, P =(0,28) and Q = (-2, 1), then we compute the third point R as follows:

\begin{array}{rl}  m &= \displaystyle \frac{y_1 - y_2}{x_1 - x_2} \mod 71\\  &= \displaystyle \frac{28-1}{0 - (-2)} \mod 71\\  &= 27(36) \mod 71, \text{ where 36 is the inverse of 2 modulo 71}\\  &= 49  \end{array}

\begin{array}{rl}  x_3 &= m^2 - x_1 - x_2 \mod 71\\  &= 49^2 - 0 - (-2) \mod 71\\  &= 60  \end{array}

\begin{array}{rl}  y_3 &= m(x_3 - x_1) + y_1 \mod 71\\  &= 49(60 - 0) + 28\\  &= 57  \end{array}

Therefore, P+Q = R = (x_3,-y_3) = (60, -57) \mod 71 = (60, 14)

N-fold Point Addition

Let B be a point in E. If n is an integer greater than 1, the point nB is defined as

nB = \underbrace{B + B + \ldots + B}_{\text{n times}}


Let y^2 = x^3 - 3x + 3 \mod 71 . Let B=(0,28), then using the formula for x_3 and y_3:

\begin{array}{rl}  3B &= (0,28) + (0,28) + (0,28) \mod 71\\  &= (37,63) \mod 71  \end{array}

Elliptic Curve Cryptography

Alice and Bob want to exchange information using Elliptic Curve Cryptography. They agree on an elliptic curve equation and a base point B. Alice and Bob both choose a random number between 1 and 71. Let e_a and e_b be the secret numbers chosen by Alice and Bob (respectively) and publishes the public keys e_aB and e_bB. Alice wants to send the message encoded in a point P. She multiplies Bob’s public key and her private key and adds it to the message P to get P+ e_ae_bB and sends the pair \{e_aB, P + e_ae_bB\}.

Bob receives the message multiplies his private key to Alice’s public key to get e_be_aB. He then subtracts this to the message P+ e_ae_bB to get the original message P.

Assume Bob and Alice agree on the equation y^2 = x^3 - 3x + 3, the base point B=(0,28) and the modulus p=71. Suppose Alice’s private key is 4 and Bob’s private key is 5. Alice will publish her public key as e_aB = 4B = (42,57). Likewise, Bob will publish his public key as e_bB = 5B=(30,2) .

Alice wants to send Bob a message encoded in the point P=c(2,17) of E. First, Alice will multiply Bob’s public key with her private key to get e_ae_bB = 4\cdot (30,2) = (18,32). She will then add this to P to get encrypted message P + e_ae_bB = (2,17) + (18,32) = (10,11). She will send (10,11) to Bob.

Bob then receives the message (10,11). The first thing he will do is to multiply Alice’s public key e_aB = (42,57) with his private key to get e_be_aB = 5\cdot(42,57) = (18,32). Next, he will subtract this from the encrypted message to get (10,11) + (18,-32) = (2,17), which is the unencrypted message of Alice!

Quantum Searching

Imagine a shuffled deck of 52 cards, and you are asked to find the ace of spades. The most natural thing to do is to take the top most card, see if it’s the ace of spades. If not, put it aside, take the top most and repeat the process until you find the ace of spades. If you are lucky, the ace of spades is the top most then we’re done. If you’re not so lucky, the ace of spades is at the bottom and it will take you 52 peeks before you find the card.

If we scatter the cards face down on the floor and randomly pick a card, then on the average, we will need 52/2 peeks before we can find the ace of spades.

With quantum computing we can do even better!

For the sake of demonstration, let’s say we have 8 cards as shown in the figure below. We want to find the ace of spades.

Here are the steps:

1. Label each card from 0 to 7 in random order. The positions of the cards will represent the states of cards. The state \left|6\right\rangle will therefore represent the ace of spades.

2. Let \left|\phi\right\rangle be the superposition of states:

\displaystyle \left|\phi\right\rangle =  \frac{1}{2^{n/2}} \sum_{k=0}^{2^n-1} \left|x\right\rangle = \frac{1}{2^{n/2}} \Big( \left|0\right\rangle + \ldots + \left|7\right\rangle \Big) = \frac{1}{\sqrt{8}} \left[ \begin{array}{c}  1\\  1\\  1\\  1\\  1\\  1\\  1\\  1  \end{array} \right]

3. Let f be a function such that

f(x) = \begin{cases}  1 & x \text{ corresponds to ace of spades}\\  0 & \text{ otherwise}  \end{cases}

Define the operator \mathbf{V}

\mathbf{V} = \mathbf{I} - 2\left|a\right\rangle \left\langle a\right|

where a is the unique value where

f(a) = 1

In our example, the value of a is 6. Therefore,

\mathbf{V} = \begin{bmatrix}    1.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\     0.00 & 1.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\     0.00 & 0.00 & 1.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\     0.00 & 0.00 & 0.00 & 1.00 & 0.00 & 0.00 & 0.00 & 0.00 \\     0.00 & 0.00 & 0.00 & 0.00 & 1.00 & 0.00 & 0.00 & 0.00 \\     0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 1.00 & 0.00 & 0.00 \\     0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & -1.00 & 0.00 \\     0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 1.00 \\     \end{bmatrix}

Observe that the matrix element \mathbf{V}_{6,6} = -1

4. Define the operator W:

\mathbf{W} = 2\left|\phi\right\rangle \left\langle\phi\right| - \mathbf{I} = \begin{bmatrix}    -0.75 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 \\     0.25 & -0.75 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 \\     0.25 & 0.25 & -0.75 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 \\     0.25 & 0.25 & 0.25 & -0.75 & 0.25 & 0.25 & 0.25 & 0.25 \\     0.25 & 0.25 & 0.25 & 0.25 & -0.75 & 0.25 & 0.25 & 0.25 \\     0.25 & 0.25 & 0.25 & 0.25 & 0.25 & -0.75 & 0.25 & 0.25 \\     0.25 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 & -0.75 & 0.25 \\     0.25 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 & -0.75 \\     \end{bmatrix}

5. Compute the operator \mathbf{WV}:

\mathbf{WV} = \begin{bmatrix}    -0.75 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 & -0.25 & 0.25 \\     0.25 & -0.75 & 0.25 & 0.25 & 0.25 & 0.25 & -0.25 & 0.25 \\     0.25 & 0.25 & -0.75 & 0.25 & 0.25 & 0.25 & -0.25 & 0.25 \\     0.25 & 0.25 & 0.25 & -0.75 & 0.25 & 0.25 & -0.25 & 0.25 \\     0.25 & 0.25 & 0.25 & 0.25 & -0.75 & 0.25 & -0.25 & 0.25 \\     0.25 & 0.25 & 0.25 & 0.25 & 0.25 & -0.75 & -0.25 & 0.25 \\     0.25 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 & 0.75 & 0.25 \\     0.25 & 0.25 & 0.25 & 0.25 & 0.25 & 0.25 & -0.25 & -0.75 \\     \end{bmatrix}

6. Multiply \mathbf{WV} to \left|\phi\right\rangle a number of times, about \pi/4 \cdot 2^{n/2} = \pi/4 \cdot \sqrt{8} = 2.2 \approx 2:

\mathbf{WV}\left|\phi\right\rangle = \begin{bmatrix}    0.18 \\     0.18 \\     0.18 \\     0.18 \\     0.18 \\     0.18 \\     0.88 \\     0.18 \\     \end{bmatrix}, \mathbf{(WV)^2}\left|\phi\right\rangle = \begin{bmatrix}    -0.09 \\     -0.09 \\     -0.09 \\     -0.09 \\     -0.09 \\     -0.09 \\     0.97 \\     -0.09 \\     \end{bmatrix}

As you can see, the probability of getting the state \left|6\right\rangle becomes very close to 1.

Making a measurement of the input register at this point will give us the state \left|6\right\rangle with a probability very close to 1.

We have just demonstrated the quantum search algorithm!

Why it works

Let f be a function such that

f(x) = \begin{cases}  1 & x \text{ the item we are looking for}\\  0 & \text{ otherwise}  \end{cases}

Define the operator \mathbf{U}_f whose action on an n qubit register and 1 qubit output register is

\mathbf{U}_f(\left|x\right\rangle \otimes \left|y\right\rangle) = \left|x\right\rangle\otimes\left|y\oplus f(x)\right\rangle

Prepare the n qubit input register and 1 qubit output register in the following state:

\underbrace{\left|0\right\rangle\ldots \left|0\right\rangle}_{\text{n times}} \otimes \left|1\right\rangle

Applying the Hadamard operator on the input and output qubits gives us a superposition of N=2^n states:

\begin{array}{rl}  \displaystyle \mathbf{H}^{\otimes n}\otimes\mathbf{H}(\left|0\ldots 0\right\rangle\otimes \left|1\right\rangle )&= \mathbf{H}^{\otimes n}\left|0\ldots 0\right\rangle\otimes \mathbf{H}\left|1\right\rangle\\  &= \displaystyle \frac{1}{2^{n/2}}\sum_{x=0}^{2^n-1} \left|x\right\rangle \otimes \frac{1}{\sqrt{2}} \left(\left|0\right\rangle - \left|1\right\rangle\right)    \end{array}

Next apply the operator \mathbf{U}_f to get

\begin{array}{rl}  \displaystyle \mathbf{U}_f \left[\frac{1}{2^{n/2}}\sum_{x=0}^{2^n-1} \left|x\right\rangle \otimes \frac{1}{\sqrt{2}} \left(\left|0\right\rangle - \left|1\right\rangle\right)\right] &= \displaystyle \frac{1}{\sqrt{2}} \frac{1}{2^{n/2}}\left[\sum_{x=0}^{2^n-1} \mathbf{U}_f \left|x\right\rangle \otimes  \left|0\right\rangle - \sum_{x=0}^{2^n-1} \mathbf{U}_f \left|x\right\rangle \otimes \left|1\right\rangle\right]  \\  &= \displaystyle \frac{1}{\sqrt{2}} \frac{1}{2^{n/2}}\left[\underbrace{\sum_{x=0}^{2^n-1} \left|x\right\rangle \otimes  \left|0\oplus f(x) \right\rangle }_{\text{first underbrace}}  - \underbrace{\sum_{x=0}^{2^n-1} \left|x\right\rangle \otimes \left|1\oplus f(x) \right\rangle}_{\text{second underbrace}}\right]  \end{array}

When f(x) = 1 for x=a, the first underbrace will expand to

\displaystyle \sum_{x=0}^{2^n-1} \left|x\right\rangle \otimes  \left|0\oplus f(x) \right\rangle = \left|0\right\rangle\otimes\left|0\right\rangle + \ldots + \underbrace{\left|a\right\rangle\otimes\left|1\right\rangle} + \ldots + \left|2^n-1\right\rangle \otimes \left|0\right\rangle

The second underbrace will expand to

\displaystyle \sum_{x=0}^{2^n-1} \left|x\right\rangle \otimes  \left|1\oplus f(x) \right\rangle = \left|0\right\rangle\otimes\left|1\right\rangle + \ldots + \underbrace{\left|a\right\rangle\otimes\left|0\right\rangle} + \ldots + \left|2^n-1\right\rangle \otimes \left|1\right\rangle\\

The underbraces in the above summations can be swapped so that we get

\displaystyle \frac{1}{\sqrt{2}} \frac{1}{2^{n/2}} \left[ \sum_{x=0}^{2^n-1} (-1)^{f(x)} \left|x\right\rangle \otimes  \left|0\right\rangle - \sum_{x=0}^{2^n-1} (-1)^{f(x)} \left|x\right\rangle \otimes \left|1\right\rangle\right]

\begin{array}{rl}  &= \displaystyle \frac{1}{2^{n/2}} \sum_{x=0}^{2^n-1} (-1)^{f(x)} \left|x\right\rangle \otimes \frac{1}{\sqrt{2}} \left( \left|0\right\rangle - \left|1\right\rangle\right)\\  &= \displaystyle \frac{1}{2^{n/2}} \sum_{x=0}^{2^n-1} (-1)^{f(x)} \left|x\right\rangle \otimes \mathbf{H}\left|1\right\rangle  \end{array}

This means that the action of \mathbf{U}_f leaves the output qubit unentangled with the input qubit. We can just ignore this moving forward and focus our attention to the input qubits.

We view the current state of the input qubit as the result of some operator \mathbf{V} defined by

\mathbf{V} \left|\phi\right\rangle = \displaystyle \frac{1}{2^{n/2}} \sum_{x=0}^{2^n-1} (-1)^{f(x)} \left|x\right\rangle


\left|\phi\right\rangle = \displaystyle \frac{1}{2^{n/2}} \sum_{x=0}^{2^n-1} \left|x\right\rangle

We can derive the expression of \mathbf{V} by expanding \mathbf{V} \left|\phi\right\rangle, noting that at x=a, f(x) = 1:

\begin{array}{rl}  \mathbf{V} \left|\phi\right\rangle &= \displaystyle \frac{1}{2^{n/2}} \sum_{x=0}^{2^n-1} (-1)^{f(x)} \left|x\right\rangle\\  &= \displaystyle \frac{1}{2^{n/2}} \left[\left|0\right\rangle + \ldots - \left|a\right\rangle +\ldots + \left|2^n-1\right\rangle\right]\\  &= \displaystyle \frac{1}{2^{n/2}} \left[\left|0\right\rangle + \ldots + \left|a\right\rangle +\ldots + \left|2^n-1\right\rangle - 2\left|a\right\rangle\right]\\  &= \displaystyle \frac{1}{2^{n/2}} \sum_{x=0}^{2^n-1} \left|x\right\rangle - \frac{2}{2^{n/2}} \left|a\right\rangle\\  &= \left|\phi\right\rangle - 2\left\langle a|\phi\right\rangle \left|a\right\rangle\\  &= \Big[\mathbf{I} - 2\left|a\right\rangle \left\langle a\right|\Big] \left|\phi\right\rangle\\  \end{array}

which means

\mathbf{V} = \mathbf{I} - 2\left|a\right\rangle \left\langle a\right|

The two vectors \left|\phi\right\rangle and \left|a\right\rangle determine a plane P. Let \left|a_{\perp}\right\rangle be the vector in the plane perpendicular to \left|a\right\rangle. We can write \left|\phi\right\rangle as

\left|\phi\right\rangle = \phi_1 \left|a_\perp\right\rangle + \phi_2\left|a\right\rangle

Applying \mathbf{V} to \left|\phi\right\rangle gives us:

\begin{array}{rl}  \mathbf{V}\left|\phi\right\rangle &= \left(\mathbf{I} - 2\left|a\right\rangle \left\langle a\right| \right) \left(  \phi_1 \left|a_\perp\right\rangle + \phi_2\left|a\right\rangle \right)\\  &= \phi_1 \left|a_\perp\right\rangle + \phi_2\left|a\right\rangle - 2\phi_2\left|a\right\rangle\\  &= \phi_1 \left|a_\perp\right\rangle - \phi_2\left|a\right\rangle  \end{array}

The effect of the operator \mathbf{V} is therefore to reflect the vector \left|\phi\right\rangle with respect to the vector \left|a_\perp\right\rangle.

Now, we want to reflect this vector \mathbf{V}\left|\phi\right\rangle with respect to \left|\phi\right\rangle. To accomplish this, we define the operator \mathbf{W} by

\mathbf{W} = 2\left|\phi\right\rangle \left\langle\phi\right| - \mathbf{I}

Apply this operator to \mathbf{V}\left|\phi\right\rangle:

\begin{array}{rl}  \mathbf{WV}\left|\phi\right\rangle &= \Big[ 2\left|\phi\right\rangle \left\langle\phi\right| - \mathbf{I} \Big]\mathbf{V}\left|\phi\right\rangle\\  &= 2\left|\phi\right\rangle \left\langle\phi\right|\mathbf{V}\left|\phi\right\rangle -\mathbf{V}\left|\phi\right\rangle  \end{array}

If we express \mathbf{V}\left|\phi\right\rangle as a linear combination of \left|\phi\right\rangle and a vector \left|\phi_\perp\right\rangle in the plane P,

\mathbf{V}\left|\phi\right\rangle = \alpha \left|\phi\right\rangle + \beta \left|\phi_\perp\right\rangle

We have,

\begin{array}{rl}  \mathbf{WV}\left|\phi\right\rangle &= \Big[ 2\left|\phi\right\rangle \left\langle\phi\right| - \mathbf{I} \Big]\mathbf{V}\left|\phi\right\rangle\\  &= 2\left|\phi\right\rangle \left\langle\phi\right|\mathbf{V}\left|\phi\right\rangle -\mathbf{V}\left|\phi\right\rangle\\  &= 2\alpha\left|\phi\right\rangle  - \alpha \left|\phi\right\rangle - \beta \left|\phi_\perp\right\rangle\\  &= \alpha\left|\phi\right\rangle  - \beta \left|\phi_\perp\right\rangle  \end{array}

which demonstrates that \mathbf{W} reflects \mathbf{V}\left|\phi\right\rangle with respect to \left|\phi\right\rangle.

Therefore, the effect of \mathbf{WV} on \left|\phi\right\rangle is to rotate it by an angle \gamma counter-clockwise.

We can compute this \gamma by getting the inner product of \mathbf{WV}\left|\phi\right\rangle and \left|\phi\right\rangle. First, let’s find the expression of \mathbf{WV}\left|\phi\right\rangle:

\begin{array}{rl}  \mathbf{WV}\left|\phi\right\rangle &= \mathbf{W} \left( \mathbf{I} - 2\left|a\right\rangle \langle a |\right) \left|\phi\right\rangle\\  &= \mathbf{W} \left(\left|\phi\right\rangle - 2 \underbrace{\langle a| \phi \rangle} \left|a\right\rangle \right)  \end{array}

The quantity \langle a| \phi \rangle is

\langle a| \phi \rangle = \displaystyle \langle a| \left( \frac{1}{2^{n/2}} \sum_{k=0}^{2^n-1} \left|x\right\rangle \right) = \frac{1}{2^{n/2}} = \cos \theta

where \theta is the angle between \left|a\right\rangle and \left|\phi\right\rangle.

The complementary angle, \rho = 90-\theta is the angle between \left|\phi\right\rangle and \left|a_\perp\right\rangle. Using a well-known trigonometric identity, we can compute for the angle of \rho:

\cos \theta = \sin \rho = \displaystyle \frac{1}{2^{n/2}}

Since \displaystyle \frac{1}{2^{n/2}} is very small if n is large,

\sin \rho = \displaystyle \frac{1}{2^{n/2}} \approx \rho


\begin{array}{rl}  \mathbf{WV}\left|\phi\right\rangle &= \mathbf{W} \left( \mathbf{I} - 2\left|a\right\rangle \langle a |\right) \left|\phi\right\rangle\\  &= \mathbf{W} \left(\left|\phi\right\rangle - 2 \cos\rho \left|a\right\rangle \right)\\  &= (2\left|\phi\right\rangle \langle \phi| - \mathbf{I}) \left(\left|\phi\right\rangle - 2 \cos\rho \left|a\right\rangle \right)\\  &= 2\left|\phi\right\rangle - \left|\phi\right\rangle - 2\cos\rho \left|\phi\right\rangle \langle \phi |a\rangle + 2 \cos\rho \left|a\right\rangle\\  &= \left|\phi\right\rangle - 4\cos^2\rho \left|\phi\right\rangle + 2 \cos\rho \left|a\right\rangle\\  &= (1- 4\cos^2\rho) \left|\phi\right\rangle + 2 \cos\rho \left|a\right\rangle  \end{array}

The inner product of \mathbf{WV}\left|\phi\right\rangle and \left|\phi\right\rangle is given by

\begin{array}{rl}  \langle\phi|\mathbf{WV}\left|\phi\right\rangle &= \langle\phi|\left[  (1- 4\cos^2\rho) \left|\phi\right\rangle + 2 \cos\rho \left|a\right\rangle\right]\\  &= (1- 4\cos^2\rho) + 2\cos^2\rho\\  &= 1- 2\cos^2\rho\\  &= \cos 2\rho  \end{array}

Therefore, the angle between these two vectors is 2\rho = \displaystyle \frac{1}{2^{n/2}}. How many times do we have to apply the operator \mathbf{WV} to get to \pi/2?

\begin{array}{rl}  m\cdot 2\rho &= \displaystyle \frac{\pi}{2}\\  \displaystyle m \frac{2}{2^{n/2}}&= \displaystyle \frac{\pi}{2}\\  m &=\displaystyle \frac{\pi\cdot 2^{n/2}}{4}  \end{array}

Therefore, we need to apply the operator \mathbf{WV} O(\sqrt{N}) times to find our value (where N=2^n).

RSA Encryption and Quantum Computing

We are going to crack the RSA by finding the period of the ciphertext. Let m be the message, c the ciphertext, e the encoding number and N=pq such that

c=m^e \mod N

From Period Finding and the RSA post, the period of a ciphertext is defined as the smallest number r such that

c^r \equiv \mod N

First we prepare n=2n_0 QBits at the state \left|0\right\rangle where n_0 is the number of bits of N=pq . The number of bits can be computed using the formula:

\lceil \log_2(N) \rceil

We also prepare n_0 QBits for the output register.

So initially we have the following setup:

\underbrace{\left|00\ldots 0\right\rangle}_{n\text{ times}} \otimes \underbrace{\left|00\ldots0\right\rangle}_{n_0\text{ times}}

The output register will contain the output of the operator:

\mathbf{U}_f \left|x\right\rangle\left|0\right\rangle = \left|x\right\rangle\left|f(x)\right\rangle


f(x) = c^x \mod N

We now apply the Hadarmard gate to the input register and apply the operator \mathbf{U}_f:

\displaystyle \frac{1}{2^{n/2}} \sum_{x=0}^{2^n-1} \left|x\right\rangle\left|f(x)\right\rangle

The Hadamard gates transform the state into a superposition of all possible values of the period while the operator \mathbf{U}_f transforms the output QBits as superposition of all possible values of c^x\mod N.

Next we make a measurement of the output QBits. This will give us a number f_0 which corresponds to all states whose value of \left|x\right\rangle is of the form x_0+kr since

\begin{array}{rl}  c^{x_0+kr} & \equiv c^{x_0}c^{kr} \mod N \\  & \equiv c^{x_0}(c^{r})^k \mod N \\  & \equiv c^{x_0} \mod N  \end{array}

since (c^{r})^k \equiv 1 \mod N by definition of a period.

Therefore, after the measurement, the input state is now

\displaystyle \left|\Psi\right\rangle = \frac{1}{\sqrt{m}} \sum_{k=0}^{m-1} \left|x_0 + kr\right\rangle

where m is the smallest integer such that x_0 + mr \ge 2^{n}.

We will then apply the quantum Fourier transform to state \left|\Psi\right\rangle . The Quantum Fourier Transform is defined by its action on an arbitrary \left|x\right\rangle

\displaystyle \mathbf{U}_{\mathbf{FT}}\left|x\right\rangle = \frac{1}{2^{n/2}} \sum_{y=0}^{2^n-1} e^{2\pi ixy/2^n}\left|y\right\rangle

Applying the Fourier Transform to the state \left|\Psi\right\rangle ,

\begin{array}{rl}  \displaystyle \mathbf{U}_{\mathbf{FT}} \frac{1}{\sqrt{m}} \sum_{k=0}^{m-1}\left|x_0 + kr\right\rangle &= \displaystyle \frac{1}{\sqrt{m}} \sum_{k=0}^{m-1}\mathbf{U}_{\mathbf{FT}} \left|x_0 + kr\right\rangle\\  &= \displaystyle \frac{1}{\sqrt{m}} \sum_{k=0}^{m-1} \frac{1}{2^{n/2}} \sum_{y=0}^{2^n-1} e^{2\pi i(x_0+kr)y/2^n}\left|y\right\rangle\\  &= \displaystyle \frac{1}{\sqrt{m}} \sum_{k=0}^{m-1} \frac{1}{2^{n/2}} \sum_{y=0}^{2^n-1} e^{2\pi ix_0y/2^n}e^{2\pi ikry/2^n}\left|y\right\rangle\\  &= \displaystyle \sum_{y=0}^{2^n-1} \underbrace{e^{2\pi ix_0y/2^n} \frac{1}{\sqrt{2^n m}} \sum_{k=0}^{m-1} e^{2\pi ikry/2^n}}_{\text{probability amplitude}}\left|y\right\rangle  \end{array}

Now if we make a measurement of this new state, the probability of getting a value of y is the probability amplitude multiplied by it’s complex conjugate:

\displaystyle \Big[\underbrace{e^{2\pi ix_0y/2^n} } \frac{1}{\sqrt{2^n m}} \sum_{k=0}^{m-1} e^{2\pi ikry/2^n}\Big]\Big[\underbrace{e^{-2\pi ix_0y/2^n} } \frac{1}{\sqrt{2^n m}} \sum_{k=0}^{m-1} e^{-2\pi ikry/2^n}\Big]

The factors in underbrace evaluates to 1 when multiplied and we’re left with

\displaystyle \frac{1}{2^n m}\Big| \underbrace{\sum_{k=0}^{m-1} e^{2\pi ikry/2^n}}\Big|^2

Using the formula

e^{i\theta} = \cos(\theta) + i\sin(\theta)

we can write the summation in underbrace as

\displaystyle \Big(\sum_{k=0}^{m-1} \cos(2\pi kry/2^n)\Big)^2 + \Big(\sum_{k=0}^{m-1} \sin(2\pi kry/2^n)\Big)^2

We’d like to know at what value of y the expression above is maximum. We can use the following formula to simplify above summations:

\displaystyle \sum_{k=0}^{m-1} \cos(2\pi fk) = \frac{\cos(\pi f(m-1)) \sin(\pi fm)}{\sin(\pi f)}\\  \displaystyle \sum_{k=0}^{m-1} \sin(2\pi fk) = \frac{\sin(\pi f(m-1)) \sin(\pi fm)}{\sin(\pi f)}

If we let

f = ry/2^n

this reduces the summations to:

\displaystyle \Big(\sum_{k=0}^{m-1} \cos(2\pi kry/2^n)\Big)^2 + \Big(\sum_{k=0}^{m-1} \sin(2\pi kry/2^n)\Big)^2

\begin{array}{rl}  &=\displaystyle \frac{\cos^2(\pi f(m-1)) \sin^2(\pi fm)}{\sin^2(\pi f)} + \frac{\sin^2(\pi f(m-1))\sin^2(\pi fm))}{\sin^2(\pi f)}\\  &= \displaystyle \frac{\sin^2(\pi fm)}{\sin^2(\pi f)}\Big[\underbrace{\cos^2(\pi f(m-1)) + \sin^2(\pi f(m-1))}_{\text{= 1}}\Big]\\  &= \displaystyle \frac{\sin^2(\pi fm)}{\sin^2(\pi f)}  \end{array}

Therefore, the probability of getting a specific y is

p(y) = \displaystyle \frac{1}{2^nm} \displaystyle \frac{\sin^2(\pi fm)}{\sin^2(\pi f)}

Here is a plot of the function \sin^2(\pi fm)/\sin^2(\pi f) for m=4:

Notice that the graph attains its maximums when the values of f are integer values, that is when

\displaystyle f = j= \frac{ry}{2^n} \text{ where } j= 1, 2, 3, 4, \ldots, r

Since y is an integer value, this means that the value of y must be close to j2^n/r. We can use a theorem in Number theory which states that if p/q is a rational number that approximates a real number x and

\displaystyle \Big|\frac{p}{q} - x\Big| \le \frac{1}{2q}

we can approximate x using continued fraction expansion of p/q. In fact, if we find a y that is within 1/2 of one of the j2^n/r values, we have

\displaystyle \left| y-j\frac{2^n}{r}\right| \le \displaystyle \frac{1}{2}\\

and dividing both sides by 2^n, we get

\displaystyle \left| \frac{y}{2^n}-\frac{j}{r}\right| \le \displaystyle \frac{1}{2^{n+1}}

which satisfies the theorem. We can therefore expand y/2^n via continued fraction expansion and continue until we find a denominator r that is less that 2^{n_0}. We then test whether r could be our period if it satisfies

\displaystyle c^r \equiv 1

If it satisfies the above, then r is the period. It’s just a matter of computing d^\prime, the inverse of e modulo r satisfying

\displaystyle ed^\prime \equiv 1 \mod r

Having computed d^\prime, we can then decrypt the ciphertext using

m = c^{d^\prime} \mod N

In our example in period finding, let’s suppose that the quantum computer gave us the following number


We can expand the fraction y/2^n = 1446311/16777216 as a continued fractions as follows:

\displaystyle \frac{y}{2^n} = \frac{1446311}{16777216} = \frac{1}{11 + \displaystyle \frac{1}{1+\displaystyle\frac{1}{1+\displaystyle\frac{1}{1}}}} = \frac{5}{58}


\displaystyle \frac{y}{2^n} = \frac{1446311}{16777216} = \frac{1}{11 + \displaystyle \frac{1}{1+\displaystyle\frac{1}{1+\displaystyle\frac{1}{\displaystyle\frac{1}{6886}}}}} = \frac{34433}{399423}

We can stop until q=58 since q=58 < 2^{n_0} = 2^{12} = 4096 . So we use the first expansion where q=58 and testing it

\begin{array}{rl}  \displaystyle c^{58} \mod 3127 &= \displaystyle 794^{58} \mod 3127\\  &= \displaystyle 1 \mod 3127    \end{array}

which confirms that q=58 is a period. Using r=58 in the equation

\displaystyle ed^\prime = m\times r + 1

and solving for d^\prime (which is the inverse of e modulo 58) we get

\displaystyle d^\prime = 25

We can now get the original message using

\displaystyle 0794^{25} \mod 3127 = 1907

which agrees with our original message.

How lucky do we need to get in order for the state to collapse to one of these y values close to j2^n/r? Let’s evaluate the probability for

y_j=\displaystyle j\frac{2^n}{r} \pm \delta_j

where \delta_j is the distance of the closest y to j2^n/r. Since f = ry/2^n, we have

\begin{array}{rl}  p(y_j) = \displaystyle \frac{1}{2^nm}\frac{\sin^2(\pi fm)}{\sin^2(\pi f)} &= \displaystyle \frac{1}{2^nm} \frac{\sin^2(\pi mr\displaystyle\frac{y}{2^n})}{\sin^2(\pi r\displaystyle\frac{y}{2^n}) }\\  &= \displaystyle \frac{1}{2^nm} \frac{\sin^2\Big[\displaystyle\frac{\pi mr}{2^n}(j\frac{2^n}{r} \pm \delta_j)\Big]}{\sin^2\Big[\displaystyle\frac{\pi r}{2^n}(j\frac{2^n}{r} \pm \delta_j)\Big] }\\  &= \displaystyle \frac{1}{2^nm} \frac{\sin^2\Big[\displaystyle j\pi m \pm \frac{\pi m r\delta_j}{2^n})\Big]}{\sin^2\Big[\displaystyle j\pi \pm \frac{\pi r\delta_j}{2^n}\Big] }\\  &= \displaystyle \frac{1}{2^nm} \frac{\sin^2\Big[\displaystyle \frac{\pi m r\delta_j}{2^n})\Big]}{\sin^2\Big[\displaystyle \frac{\pi r\delta_j}{2^n}\Big] }  \end{array}

We can approximate the denominator using the small angle approximation of sine:

\sin(x) \approx x

to get

\displaystyle p(y_j) = \displaystyle \frac{1}{2^nm} \frac{\sin^2\Big[\displaystyle \frac{\pi m r\delta_j}{2^n})\Big]}{\left[\displaystyle \frac{\pi r\delta_j}{2^n}\right]^2}

Since m is the smallest integer such that x_0 + mr \ge 2^{n}, then

m = \displaystyle \left\lceil\frac{2^n}{r}\right\rceil

We can use this to approximate

\displaystyle m\frac{r}{2^n} \approx 1

to get

p(y_j) = \displaystyle \frac{1}{2^nm} \frac{\sin^2(\displaystyle \pi\delta_j)}{\left(\displaystyle \frac{\pi\delta_j}{m}\right)^2}

Since 0 \le \delta_j \le 1/2, we can see from the graph below that the line joining (0,0) and (\pi/2,1), whose equation is 2x/\pi is less that the value of the sine function at that interval, that is,

\displaystyle \frac{2x}{\pi} \le \sin(x)

Using this inequality, we can get a lower bound of the probability of getting a specific y value to be

\begin{array}{rl}  p(y_j) = \displaystyle \frac{1}{2^nm} \frac{\sin^2(\displaystyle \pi\delta_j)}{\left(\displaystyle \frac{\pi \delta_j}{m}\right)^2} &\ge \displaystyle \frac{1}{2^nm} \left(\frac{\displaystyle \frac{2\pi\delta_j}{\pi}}{\displaystyle \frac{\pi \delta_j}{m}}\right)^2\\  &= \displaystyle \frac{1}{2^nm} \left( \frac{2m}{\pi}\right)\\  &= \displaystyle \frac{m}{2^n}\frac{4}{\pi^2}\\  &= \displaystyle \frac{1}{r}\frac{4}{\pi^2}  \end{array}

Since there are r of these y_j‘s, the probability of getting any one of these y_j‘s is therefore:

p(y) = \displaystyle \frac{4}{\pi^2} \approx 0.405

which means that there is 40% probability of the state to collapse to a value of y that will give us the correct period.

We have just seen how a quantum computer can be used to crack the RSA. We have exploited the fact that the measurement will give us a value of y that is near j2^n/r with high probability. We then computed for the period using the continued fraction expansion of y/2^n. With the period computed, it’s becomes straightforward to decrypt the ciphertext.

Period Finding and the RSA

In the previous post, we learned how to decrypt RSA by getting the factors of the big number N and computing for the inverse of e (the encoding number) modulo N. There is also another way to decrypt an RSA encrypted message. This is when you are able to get the period of the ciphertext. If c is the ciphertext, the period r is the smallest integer that satisfies:

c^r \equiv 1 \mod N

Once we get the period, we compute for d^\prime , the inverse of e modulo r:

ed^\prime \equiv 1 \mod r

The inverse can then be used to decrypt the ciphertext:


In our previous example, we encrypted the message


using public key p=53, q=59, N=pq=3127 and e=7 and private key d=431. The “plain text” is

1907 0818 2608 1826 0026 1804 0217 0419 2612 0418 1800 0604

and the ciphertext is:

0794 1832 1403 2474 1231 1453 0268 2223 0678 0540 0773 1095

Let’s compute the period of the first block of our ciphertext:

0794^r \equiv 1 \mod 3127

Using the python script below, we can compute the period

for r in range(1,100):
   if p == 1:
     print "%d %d" % (r,p)

The result of running the above program gives r=58. We can then compute d^\prime using the following equation:

ed^\prime = m\times r + 1

The above equation is satisfied when m=3 and d^\prime = 25 . Using this value of d^\prime , we can compute for

\begin{array}{rl}  m&=0794^{25} \mod 3127 \\  &= 1907  \end{array}

which gives us the original message!

However, unlike using the private key, you need to compute the period r and d^\prime for every block of the ciphertext (unless the ciphertext is composed of only one block). However, that should not stop a cracker from deciphering all the blocks.

How RSA Encryption Works

Alice and Bob live in different parts of the world. They want to communicate with each other but they don’t want anyone to know the messages they exchange. In order to protect the message, they need to encrypt their message. Bob comes up with a secret key that will allow both of them to encrypt and decrypt their messages so that when they send it via email they will be confident that no one can read the message if ever someone (like Eve) intercepted the message. However, they have a dilemma. How will Bob send the encryption key to Alice ? If he sends the key and Eve intercepts it, then Eve will be able to decrypt the messages both Alice and Bob exchange and know what they are up to.

Bob should be able to send the key to Alice encrypted so that Eve will not be able to read it. In order to do this, Bob will have to create a second key to encrypt the key he wants to send to Alice. The problem now is how will Alice decrypt the message (which is the encrypted key) if she does not have the second key?

This is where RSA encryption is used. Suppose we want to encrypt a message using RSA, what we’ll do is find 2 large prime numbers p and q and get their product N = pq. We will need another number e, which we will use to encode the message into a ciphertext. The set of numbers N and e is called the public key which Alice can send to Bob via email. Bob will use these numbers to encrypt the secret key before sending to Alice.

We will represent a textual message like “THIS IS A SECRET MESSAGE” into numbers. To accomplish this, we need to map letters into numbers like the following:

\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}  \hline  A & B & C & D & E & F & G & H & I & J & K & L & M & N & O & P & Q & R & S & T & U & V & W & X & Y & Z & \\\hline  0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10& 11& 12& 13& 14& 15& 16& 17& 18& 19& 20& 21& 22& 23& 24& 25& 26\\  \hline  \end{tabular}

Using the above mapping, we can write ‘THIS IS A SECRET MESSAGE’ as:

19 7 8 18 26 8 18 26 0 26 18 4 2 17 4 19 26 12 4 18 18 0 6 4

If a number is less than 10, we pad it with a zero to the left. The message then becomes:

19 07 08 18 26 08 18 26 00 26 18 04 02 17 04 19 26 12 04 18 18 00 06 04

To conserve some space, we can group the numbers into groups of 4:

1907 0818 2608 1826 0026 1804 0217 0419 2612 0418 1800 0604

Now for each M number above, we will encode it using the formula:

\displaystyle C = M^e \mod N

What is this mod operation? The above says that we raise M to the exponent e, divide the result by N and get the remainder. For example, if M=10, e=7 and N=17 we have


Now divide the result by 17 and get the remainder:

10000000 / 7 = 588235 \text { Remainder } 5


10^7 \mod 17 = 5

In python we can get the answer using the pow function:

>>> pow(10,7,17)

Let’s say we choose p = 53, q=59 and e = 7. This gives us N=pq = 53\times 59 = 3127. To encode 1907, we do

\displaystyle C=1907^7 \mod 3127 = 0794

The number 0794 is now the ciphertext. It is the number we give to the recipient of the message. We can use python to generate the ciphertext above.

for n in ("1907","0818","2608","1826","0026","1804","0217","0419","2612","0418","1800","0604"):
   print pow(int(n),7,3127)                                                                

Doing this for all numbers we get:

0794 1832 1403 2474 1231 1453 0268 2223 0678 0540 0773 1095

When the recipient gets this message, she can decipher it using a key which she keeps private to herself. The key d is the inverse of e modulo (p-1)(q-1). The key d is called the private key. We can retrieve the original message using the formula:

\displaystyle M = C^d

The number d can be calculated using the following formula:

ed \mod (p-1)(q-1) \equiv 1

Using python we can compute for d using the following program:

>>> p=53
>>> q=59
>>> e=7
>>> NN = (p-1)*(q-1)
>>> for d in range(1,NN):
...   x = e*d % NN
...   if x == 1:
...     print 'd = ', d
d =  431

Using d = 431 and applying the decipher formula to the first block, we get

\displaystyle M = 0794^{431} \mod 3127 = 1907

which is our original message!

We now apply this to the entire ciphertext

0794 1832 1403 2474 1231 1453 0268 2223 0678 0540 0773 1095

using the python program below:

for n in ("0794","1832","1403","2474","1231","1453","0268","2223","0678","0540","0773","1095"):
   print pow(int(n),431,N)

we get

1907 0818 2608 1826 0026 1804 0217 0419 2612 0418 1800 0604

which is our original full message! It’s just a matter of mapping these numbers back to letters to get the message text.

Using this mechanism, Alice will send 2 numbers N and e to Bob which he will use to encrypt the secret key and send to Alice. When Alice receives the encrypted secret key, she will use her private key d to decrypt it and get the secret key. After that, they can now start using the secret key to encrypt messages between them.

Parallel Computing in the Small: An Introduction to Quantum Computing

As I watched my one-month old baby sleeping, a thought ran through my mind. I wonder what technology would look like when she’s my age. There’s a lot of things going on simultaneously in technology today. I remember doing Artificial Intelligence/Machine Learning about 15 years ago. They used to be done in a university setting. They are now the new normal. Some technologies are quite recent like Big Data and Blockchain but they have become mainstream very quickly. I wonder what it will look like when my daughter grows up.

There is a technology I’m currently watching. It’s still in it’s infancy but this technology is really fascinating. It is a technology based on a remarkably counter-intuitive theory of the universe that governs the atoms and sub-atomic particles. It is called Quantum Computing. Given the high speed at which science fiction become realities, it’s possible that my baby girl will someday be programming on a quantum computer. But for now, we can only hope.

I have not seen a Quantum Computer and based on what I read so far, it’s still an engineering challenge. It might probably be a few more years before they are mass-produced but it is something to look forward to. But the good thing is, we can already talk about quantum algorithms! So let’s sharpen our pencils and get a few sheets of paper for we are going to talk about this new way of computing.

In classical computing, a bit has a value that is either one or zero. In quantum computing, a bit can now have a value that is a superposition of 1 and 0. We call this a QBit. The values 1 and 0 are now symbolized as \left| 1\right\rangle and \left| 0\right\rangle respectively. These are unit vectors in what is known as a Hilbert space. They are also known as kets, derived from the word bracket which we’ll talk more of later. An example of a vector space that you are familiar with is the color wheel. Each color is actually a combination of red, blue and green. For example, the color yellow is a combination of red and green. It’s also the same as with the qubit. A general QBit vector is of the form

\displaystyle \left| \Psi \right\rangle = a\left| 0\right\rangle + b\left| 1\right\rangle

such that \mid a\mid^2 + \mid b\mid^2 = 1. The numbers a and b are complex numbers and the symbol \mid a\mid^2 is defined as

\displaystyle \mid a\mid^2 = a^*a

where a^* is the complex conjugate of a. A QBit vector represents the state of the QBit. We shall use the term vector and state interchangeably moving forward.

However, you won’t be able to see them in superposition state. If you want to know the state of a qubit, you will have to measure it. But when you measure it, it will collapse to one of either \left| 0\right\rangle or \left| 1\right\rangle but you will have a clue as to what that state will be. The coefficients a and b will give you the probability of the result of the measurement: it will be \mid a\mid^2 in state \left| a\right\rangle and \mid b\mid^2 in state \left| b\right\rangle.

If the state of a QBit is a superposition of basis states \left| 0\right\rangle and \left| 1\right\rangle, does this mean our computation is also random? Does it mean we get different answers every time ?

It turns out that we can get a definite answer! Let’s illustrate this by a simple quantum computation.

Suppose a I have a number between 0 and 1 million. What is my number? You can guess my number and I will only tell you if you got it correctly or not. I won’t tell you if my number is higher or lower than your guess. How many tries do you think you need in order to guess my number? I’m sure our tries will be many. In the worst case, you’ll need about 1 million tries.

For a quantum computer, you only need one try to guess it and that’s what we’re going to see. However, bear with me since we have to build up our arsenal of tools first before we can even talk about it. Let’s start with how a QBit looks like as a matrix.

The basis QBits \left| 0\right\rangle and \left| 1\right\rangle can be written as 1×2 matrices:

\displaystyle \left| 0 \right\rangle =  \begin{bmatrix}  1\\  0  \end{bmatrix}

\displaystyle \left| 1 \right\rangle =  \begin{bmatrix}  0\\  1  \end{bmatrix}

Using this representation, we can write a general QBit as:

\displaystyle \mid\Psi\rangle = a\mid 0\rangle + b\mid 1\rangle = a  \begin{bmatrix}  1\\  0  \end{bmatrix}  + b  \begin{bmatrix}  0\\  1  \end{bmatrix}  = \begin{bmatrix}  a\\  b  \end{bmatrix}

So now we know that every QBit is a linear combination of ket vectors \left| 0\right\rangle and \left| 1\right\rangle. These ket vectors have a corresponding “bra” vectors \langle 0 \mid and \langle 1\mid. In matrix representation, the bra basis vectors are defined as:

\displaystyle \langle 0\mid  = \begin{bmatrix}  1 & 0  \end{bmatrix}


\displaystyle \langle 1\mid =  \begin{bmatrix}  0 & 1  \end{bmatrix}

For a general QBit \mid\Psi\rangle = a\mid 0\rangle + b\mid 1\rangle, the corresponding bra vector is

\displaystyle  \langle \Psi \mid = a^*\langle 0\mid + b^*\langle 1\mid =  \begin{bmatrix}  a^* & b^*  \end{bmatrix}

There is an operation between bra and ket vectors of the basis states called inner product which is defined as:

\begin{array}{rl}  \displaystyle \langle 0\mid 0 \rangle &= 1\\  \displaystyle \langle 1\mid 1 \rangle &= 1\\  \displaystyle \langle 0\mid 1 \rangle &= \langle 1\mid 0\rangle = 0  \end{array}

Using the above rules, we can define the inner product of any two QBits \mid \Psi\rangle = a\mid 0\rangle + b\mid 1\rangle and \mid\Phi\rangle = c\mid 0\rangle + d\mid 1\rangle as

\begin{array}{rl}  \displaystyle \langle \Psi\mid\Phi\rangle &= \big(a^*\langle 0\mid + b^*\langle 1\mid\big)\big(c\mid 0\rangle + d\mid 1\rangle\big)\\  &= \displaystyle a^*c \langle 0\mid 0\rangle + a^*\langle 0\mid 1\rangle + b^*c\langle 1\mid 0\rangle + b^*d\langle 1\mid 1\rangle\\  &= a^*c + b^*d  \end{array}

Using the above definition of an inner product, the norm of a vector is defined as

\displaystyle \mid\left|\Psi\right\rangle\mid = \displaystyle \sqrt{\langle \Psi \mid \Psi \rangle} =\displaystyle \sqrt{a^*a + b^*b}

Geometrically, the norm is the length of the ket \mid \Psi\rangle.

Quantum computations are done using quantum operators. They are also called quantum gates. Operators are linear mappings that take a state to another state. That is, an operator \mathbf{A} acts on a\left| 0 \right\rangle + b\left| 1\right\rangle to give another vector c\left| 0 \right\rangle + d\left| 1\right\rangle. We write this as

\displaystyle \mathbf{A}(a\left| 0\right\rangle + b\left| 1\right\rangle) = c\left| 0\right\rangle + d\left| 1\right\rangle

Since the basis vectors \left| 0\right\rangle and \left| 1\right\rangle are also vectors, operators can also operate on them. An example of a very important operator and it’s action on the basis vectors is the Hadamard operator:

\begin{array}{rl}  \displaystyle \mathbf{H} \left| 0 \right\rangle &= \displaystyle \frac{1}{\sqrt{2}}(\left| 0 \right\rangle + \left| 1\right\rangle)\\  \displaystyle \mathbf{H} \left| 1 \right\rangle &= \displaystyle \frac{1}{\sqrt{2}}(\left| 0 \right\rangle - \left| 1\right\rangle)  \end{array}

By knowing the action of an operator on the basis vectors, we can determine the matrix representation of an operator. For the Hadamard operator, the matrix representations are:

\displaystyle \mathbf{H} \left| 0 \right\rangle = \displaystyle \frac{1}{\sqrt{2}}(\left| 0 \right\rangle + \left| 1\right\rangle) = \frac{1}{\sqrt{2}}  \begin{bmatrix}  1\\  1  \end{bmatrix}


\displaystyle \mathbf{H} \left| 1 \right\rangle = \displaystyle \frac{1}{\sqrt{2}}(\left| 0 \right\rangle - \left| 1\right\rangle) = \frac{1}{\sqrt{2}}  \begin{bmatrix}  1\\  -1  \end{bmatrix}

Therefore the matrix representation of the Hadamard operator is

\displaystyle \mathbf{H}=  \frac{1}{\sqrt{2}}  \begin{bmatrix}  1 & 1 \\  1 & -1  \end{bmatrix}

The identity operator is defined as the operator that maps a vector into itself and is symbolized as \mathbf{I}:

\displaystyle \mathbf{I}\left|\Psi\right\rangle = \left|\Psi\right\rangle

The NOT operator is defined as

\begin{array}{rl}  \mathbf{X}\left|0\right\rangle &= \left|1\right\rangle\\  \mathbf{X}\left|1\right\rangle &= \left|0\right\rangle  \end{array}

What it does is transform the \left|0\right\rangle to \left|1\right\rangle and vice versa, just like the classical bits.

Operators are linear. We can use linearity to derive the resulting vector when an operator acts on a general QBit:

\displaystyle \mathbf{A}(a\left| 0\right\rangle + b\left| 1\right\rangle) = a\mathbf{A} \left| 0\right\rangle + b\mathbf{A} \left| 1\right\rangle)

If \mathbf{A} is the Hadamard operator, that is, \mathbf{A} = \mathbf{H}, then the above operation becomes:

\displaystyle \mathbf{H}(a\left| 0\right\rangle + b\left| 1\right\rangle) = a\mathbf{H} \left| 0\right\rangle + b\mathbf{H} \left| 1\right\rangle)

Since we know that:

\begin{array}{rl}  \displaystyle \mathbf{H} \left| 0 \right\rangle &= \displaystyle \frac{1}{\sqrt{2}}(\left| 0 \right\rangle + \left| 1\right\rangle)\\  \displaystyle \mathbf{H} \left| 1 \right\rangle &= \displaystyle \frac{1}{\sqrt{2}}(\left| 0 \right\rangle - \left| 1\right\rangle)  \end{array}

We can then write this as:

\begin{array}{rl}  \displaystyle \mathbf{H}(a\left| 0\right\rangle + b\left| 1\right\rangle) &= \displaystyle a\mathbf{H} \left| 0\right\rangle + b\mathbf{H} \left| 1\right\rangle)\\  &= \displaystyle a\frac{1}{\sqrt{2}}\big(\left| 0\right\rangle + \left| 1\right\rangle\big) + b\frac{1}{\sqrt{2}}\big(\left| 0\right\rangle - \left| 1\right\rangle\big)\\  &= \displaystyle \frac{a+b}{\sqrt{2}} \left| 0\right\rangle + \frac{a-b}{\sqrt{2}} \left| 1\right\rangle  \end{array}

Let see what the operators look like when they operate on bra vectors. Let’s start with the Hadamard operator acting on a general QBit above.

\begin{array}{rl}  \displaystyle \mathbf{H} \mid 0 \rangle &= \displaystyle \frac{1}{\sqrt{2}}(\mid 0 \rangle + \mid 1\rangle) \rightarrow \langle 0\mid \mathbf{H^\dagger} = \displaystyle \frac{1}{\sqrt{2}}(\langle 0 \mid + \langle 1\mid)\\  \displaystyle \mathbf{H} \mid 1 \rangle &= \displaystyle \frac{1}{\sqrt{2}}(\mid 0 \rangle - \mid 1\rangle) \rightarrow \langle 1\mid \mathbf{H^\dagger} = \displaystyle \frac{1}{\sqrt{2}}(\langle 0 \mid - \langle 1\mid)  \end{array}

where we symbolize \mathbf{H^\dagger} as the operator corresponding to the Hadamard operator acting on bras.

The matrix representation of the Hadamard operator acting on bras is therefore:

\mathbf{H^\dagger} = \frac{1}{\sqrt{2}}  \begin{bmatrix}  1 & 1\\  1 & -1  \end{bmatrix} = \mathbf{H}

In general, the matrix representation of \mathbf{H^\dagger} is the conjugate transpose of the matrix of \mathbf{H}. In simple terms, this means take the transpose of \mathbf{H} and apply the complex conjugates to each element. It’s amazing that the Hadamard operator is equal to its conjugate transpose. The types of operators that behave that way are called Hermitian operators.

Let \mathbf{A} be an operator and \left| \Psi\right\rangle a vector and let \Phi = \mathbf{A}\left|\Psi\right\rangle, then the norm of \Phi is

\displaystyle \langle\Phi\mid\Phi\rangle = \langle \Psi\mid \mathbf{A^\dagger}\mathbf{A}\mid \Psi\rangle = \langle\Psi\mid\Psi\rangle

This means that

\displaystyle \mathbf{A^\dagger}\mathbf{A} = \mathbf{I}

The kind of operator that satisfies the above relationship is called Unitary.

Now that we have gained knowledge of some of the tools, we can now describe how a quantum computation is set up. We need QBits for the input and QBits for the output. The simplest setup is a one QBit input and one QBit output. Let \left| x\right\rangle and \left| y\right\rangle be the input and output QBits,respectively. The general state of this quantum system is a superposition of the following states:

\left| 0 \right\rangle \otimes \left| 0 \right\rangle, \left| 0 \right\rangle \otimes \left| 1 \right\rangle, \left| 1 \right\rangle \otimes \left| 0 \right\rangle, \left| 1 \right\rangle \otimes \left| 1 \right\rangle

which could also be written as

\left|00\right\rangle, \left|01\right\rangle, \left|10\right\rangle, \left|11\right\rangle,

or, if we take them as binary representation of numbers, we can write them concisely as

\left|0\right\rangle, \left|1\right\rangle,\left|2\right\rangle,\left|3\right\rangle,

Using the above basis vectors, we can write a general 2-QBit system as:

\displaystyle \left|\Psi\right\rangle = c_1\left|0\right\rangle + c_1\left|1\right\rangle + c_1\left|2\right\rangle + c_1\left|3\right\rangle = \sum_{i=0}^{2^n-1} c_i\left|i\right\rangle

where the numbers c_i are complex and n=2. The summation is up to 2^n-1 since the maximum number n QBits can represent is 2^n-1. The summation on the right also gives us the general form of a QBit for an n-QBit system.

As we saw earlier, the quantum operators (or gates) are the ones used to compute from a given QBit inputs. Let \mathbf{U}_f represent a unitary operator implementing the function f. The function f is a function that takes the numbers from the domain \{0,1\} to \{0,1\}, that is,

\displaystyle f: \{0,1\}\rightarrow \{0,1\}

For a two-QBit system, the form of quantum computation is

\displaystyle \mathbf{U}_f\left|x\right\rangle \left|y\right\rangle = \left|x\right\rangle \left|x\oplus f(x)\right\rangle

where the symbol \oplus represent bitwise OR operation. The table below will give you the results of the bitwise OR operation for every combination of values in \{0,1\}:

\begin{array}{cc}  0 \oplus 0 &= 0\\  0 \oplus 1 &= 1\\  1 \oplus 0 &= 1\\  1 \oplus 1 &= 0  \end{array}

A simple but important example of a unitary operator acting on two QBits is the \mathbf{C_{\text{NOT}}} gate, defined as:

\displaystyle \mathbf{C_{\text{NOT}}}\left|x\right\rangle\left|y\right\rangle = \left|x\right\rangle\left|y \oplus x\right\rangle

Here is the complete action of the \mathbf{C_{\text{NOT}}}:

\begin{tabular}{c|c|c}  \text{Input} & \text{Output} & \text{Result}\\  0 & 0 & 0\\  0 & 1 & 1\\  1 & 0 & 1\\  1 & 1 & 0  \end{tabular}

If you study the table above, you’ll notice that if the input QBit is in the state \left|0\right\rangle, the output QBit stays the same. However, if the input QBit is in \left|1\right\rangle, the output QBit is flipped. The input QBit in this case controls the flipping of the second QBit. The input QBit is said to be the control bit. That’s why it’s called a \mathbf{C_{\text{NOT}}} which stands for Controlled-NOT gate.

Let’s now go back to our original problem. There is a positive number a that is less than 2^n . We want to find this number using quantum computation. We are given a black box operator \mathbf{U}_f whose action on a bra is

\displaystyle \mathbf{U}_f \left|x\right\rangle\left|y\right\rangle = \left|x\right\rangle\left| y \oplus f(x)\right\rangle


\displaystyle f(x) = x_0 a_0 \oplus x_1 a_1 \oplus \cdots x_{n-1}a_{n-1} = x \cdot a

where a is the unknown number we are going to guess and x_ia_i are the i-th bits of x and a.

So how many times do we have to apply \mathbf{U}_f in order to guess the number a?

To solve this, we need n input QBits and 1 output QBit. We will initialize our QBits to \left|0\right\rangle. So the initial configuration is

\displaystyle \underbrace{\left|000...0\right\rangle}_{\text{n number of zeroes}}\otimes\left|1\right\rangle

Notice how we separate the input and output qubits.

Next we apply the Hadamard operator to each QBit:

\displaystyle \mathbf{H}^{\otimes n}\otimes \mathbf{H} \left|000...0\right\rangle\left|1\right\rangle = \mathbf{H}^{\otimes n} \left|000...0\right\rangle\otimes \mathbf{H}\left|1\right\rangle

We have seen the Hadamard operator earlier. For the sake of convenience we’ll repeat it here:

\begin{array}{rl}  \displaystyle \mathbf{H} \left| 0 \right\rangle &= \displaystyle \frac{1}{\sqrt{2}}(\left| 0 \right\rangle + \left| 1 \right\rangle)\\  \displaystyle \mathbf{H} \left| 1 \right\rangle &= \displaystyle \frac{1}{\sqrt{2}}(\left| 0 \right\rangle - \left| 1\right\rangle)  \end{array}

We can generalize this by letting x be either 0 or 1:

\begin{array}{rl}  \displaystyle \mathbf{H}\left| x\right\rangle &= \frac{1}{\sqrt{2}}\big(\left|0\right\rangle + (-1)^x \left|1\right\rangle  \end{array}

We can use this to compute the action on two QBits and generalize:

\begin{array}{rl}  \displaystyle \mathbf{H}^{\otimes 2}\left| x\right\rangle &= \mathbf{H}^{\otimes 2}\left| x_0x_1\right\rangle = \mathbf{H}\left| x_0\right\rangle \otimes \mathbf{H}\left| x_1\right\rangle\\  &= \displaystyle \frac{1}{\sqrt{2}}\big(\left|0\right\rangle + (-1)^{x_0} \left|1\right\rangle \otimes \frac{1}{\sqrt{2}}\big(\left|0\right\rangle + (-1)^{x_1} \left|1\right\rangle\\  &= \displaystyle \big(\frac{1}{\sqrt{2}}\big)^2 \big( \left|00\right\rangle + (-1)^{x_0} \left|01\right\rangle + (-1)^{x_1}\left|10\right\rangle + (-1)^{x_1 + x_0}\left|11\right\rangle\\  &= \displaystyle \big(\frac{1}{\sqrt{2}}\big)^2 \big( (-1)^{0\cdot x_1 + 0\cdot x_0} \left|00\right\rangle + (-1)^{0\cdot x_1 + 1\cdot x_0} \left|01\right\rangle + (-1)^{1\cdot x_1 + 0\cdot x_0}\left|10\right\rangle + (-1)^{1\cdot x_1 + 1\cdot x_0}\left|11\right\rangle\\  &= \displaystyle \big(\frac{1}{\sqrt{2}}\big)^2 \big( (-1)^{0\cdot x_1 + 0\cdot x_0} \left|0\right\rangle + (-1)^{0\cdot x_1 + 1\cdot x_0} \left|1\right\rangle + (-1)^{1\cdot x_1 + 0\cdot x_0}\left|2\right\rangle + (-1)^{1\cdot x_1 + 1\cdot x_0}\left|3\right\rangle\\  &= \displaystyle \big(\frac{1}{\sqrt{2}}\big)^2 \sum_{y=0}^{2^2-1} (-1)^{x\cdot y}\left|y\right\rangle  \end{array}

In general,

\displaystyle \mathbf{H}^{\otimes n}\left| x\right\rangle = \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} (-1)^{x\cdot y}\left|y\right\rangle

Using the above formula, the action of \mathbf{H}^{\otimes n} on \left|0\right\rangle is

\displaystyle \mathbf{H}^{\otimes n} \left|0\right\rangle = \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} \left|y\right\rangle

The next step is to apply our operator \mathbf{U}_{f} which was defined above:

\begin{array}{rl}  \displaystyle \mathbf{U}_f\mathbf{H}^{\otimes n}\left| 0\right\rangle \otimes \frac{1}{\sqrt{2}} \big( \left|0\right\rangle - \left|1\right\rangle\big) &= \displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} \mathbf{U}_f\left|y\right\rangle \otimes \frac{1}{\sqrt{2}} \big( \left|0\right\rangle - \left|1\right\rangle\big)\\  &= \displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} \frac{1}{\sqrt{2}} \big( \mathbf{U}_f\left|y\right\rangle \otimes  \left|0\right\rangle - \mathbf{U}_f\left|y\right\rangle \otimes \left|1\right\rangle\big)\\  &= \displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} \frac{1}{\sqrt{2}} \big( \left|y\right\rangle \otimes  \left|0\oplus f(y)\right\rangle - \left|y\right\rangle \otimes \left|1\oplus f(y)\right\rangle\big)\\  \end{array}

Let’s break this down a bit. Since the value of f(y) is either 0 or 1, we have 2 cases:

When f(y) = 0,

\displaystyle \left|y\right\rangle \otimes  \left|0\oplus f(y)\right\rangle = \left|y\right\rangle \otimes  \left|0\right\rangle
\displaystyle \left|y\right\rangle \otimes  \left|1\oplus f(y)\right\rangle = \left|y\right\rangle \otimes  \left|1\right\rangle

which means

\displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} \frac{1}{\sqrt{2}} \big( \left|y\right\rangle \otimes  \left|0\oplus f(y)\right\rangle - \left|y\right\rangle \otimes \left|1\oplus f(y)\right\rangle\big) = \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} \left|y\right\rangle \otimes \frac{1}{\sqrt{2}} \big(  \left|0\right\rangle - \left|1\right\rangle\big)

When f(y) = 1,

\displaystyle \left|y\right\rangle \otimes  \left|0\oplus f(y)\right\rangle = \left|y\right\rangle \otimes  \left|1\right\rangle
\displaystyle \left|y\right\rangle \otimes  \left|1\oplus f(y)\right\rangle = \left|y\right\rangle \otimes  \left|0\right\rangle

which means

\displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} \frac{1}{\sqrt{2}} \big( \left|y\right\rangle \otimes  \left|0\oplus f(y)\right\rangle - \left|y\right\rangle \otimes \left|1\oplus f(y)\right\rangle\big) = \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} (-1) \left|y\right\rangle \otimes \frac{1}{\sqrt{2}} \big(  \left|0\right\rangle - \left|1\right\rangle\big)

Combining these 2 we have

\begin{array}{rl}  \displaystyle \mathbf{U}_f\mathbf{H}^{\otimes n}\left| 0\right\rangle \otimes \frac{1}{\sqrt{2}} \big( \left|0\right\rangle - \left|1\right\rangle\big)  &= \displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} \frac{1}{\sqrt{2}} \big( \left|y\right\rangle \otimes  \left|0\oplus f(y)\right\rangle - \left|y\right\rangle \otimes \left|1\oplus f(y)\right\rangle\big)\\  &= \displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} (-1)^{f(y)} \left|y\right\rangle \otimes \frac{1}{\sqrt{2}} \big(  \left|0\right\rangle - \left|1\right\rangle\big)  \end{array}

Finally, getting the Hadamard action on this gives:

\begin{array}{rl}  \displaystyle \mathbf{H}^{\otimes n}\otimes\mathbf{H}\Big[\mathbf{U}_f\mathbf{H}^{\otimes n}\left| 0\right\rangle \otimes \frac{1}{\sqrt{2}} \big( \left|0\right\rangle - \left|1\right\rangle\big)\Big] &=\mathbf{H}^{\otimes n} \mathbf{U}_f\mathbf{H}^{\otimes n}\left| 0\right\rangle \otimes \mathbf{H}\Big[\frac{1}{\sqrt{2}} \big( \left|0\right\rangle - \left|1\right\rangle\big)\Big]\\  &= \underbrace{\mathbf{H}^{\otimes n} \Big[\displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} (-1)^{f(y)} \left|y\right\rangle \Big]}_{A} \otimes \underbrace{\mathbf{H}\Big[\frac{1}{\sqrt{2}} \big(  \left|0\right\rangle - \left|1\right\rangle\big)\Big]}_{B}  \end{array}

Expanding underbrace B gives us:

\begin{array}{rl}  \displaystyle \mathbf{H}\Big[\frac{1}{\sqrt{2}} \big(  \left|0\right\rangle - \left|1\right\rangle\big)\Big] &= \displaystyle \frac{1}{\sqrt{2}}\big(\mathbf{H}\left|0\right\rangle - \mathbf{H}\left|1\right\rangle\big)\\  &= \displaystyle \frac{1}{\sqrt{2}}\Big[\frac{1}{\sqrt{2}} \big( \left|0\right\rangle + \left|1\right\rangle \big) - \frac{1}{\sqrt{2}}\big( \left|0\right\rangle - \left|1\right\rangle\big) \Big]\\  &= \displaystyle \frac{1}{2}\Big[ 2\left|1\right\rangle \Big]\\  &= \left|1\right\rangle  \end{array}

Expanding underbrace A above:

\begin{array}{rl}  \mathbf{H}^{\otimes n} \Big[\displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} (-1)^{f(y)} \left|y\right\rangle \Big] &= \displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} (-1)^{f(y)}\mathbf{H}^{\otimes n} \left|y\right\rangle  \end{array}

We have found earlier that

\displaystyle \mathbf{H}^{\otimes n}\left| y\right\rangle = \frac{1}{2^{n/2}}\sum_{x=0}^{2^n-1} (-1)^{x\cdot y}\left|x\right\rangle

Substituting this to the above and noting that

\displaystyle f(y) = y\cdot a = \sum_{i=0}^{n-1} y_ia_i \mod 2,

we get:

\begin{array}{rl}  \mathbf{H}^{\otimes n} \Big[\displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} (-1)^{f(y)} \left|y\right\rangle \Big] &= \displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} (-1)^{f(y)}\mathbf{H}^{\otimes n} \left|y\right\rangle \\  &= \displaystyle \frac{1}{2^{n}}\sum_{y=0}^{2^n-1} \sum_{x=0}^{2^n-1} (-1)^{f(y)}(-1)^{y\cdot x}\left|x\right\rangle\\  &= \displaystyle \frac{1}{2^{n}}\sum_{y=0}^{2^n-1} \sum_{x=0}^{2^n-1} (-1)^{y\cdot a}(-1)^{y\cdot x}\left|x\right\rangle\\  &= \displaystyle \frac{1}{2^{n}}\sum_{x=0}^{2^n-1} \underbrace{\sum_{y=0}^{2^n-1} (-1)^{(a+x)\cdot y}}\left|x\right\rangle\\  \end{array}

where we have interchanged the order of the summation. To simplify the term with the underbrace, notice that

\begin{array}{rl}  \displaystyle \Big[1+ (-1)^{z_0}\Big]\Big[1+(-1)^{z_1}\Big] &= 1 + (-1)^{z_0} + (-1)^{z_1} + (-1)^{z_0+z_1}\\  &= \displaystyle (-1)^{0\cdot z_0 + 0\cdot z_1} + (-1)^{1\cdot z_0 + 0\cdot z_1} + (-1)^{0\cdot z_0 + 1\cdot z_1} + (-1)^{1\cdot z_0+ 1\cdot z_1}\\  &= \displaystyle(-1)^{0\cdot z} + (-1)^{1\cdot z} + (-1)^{2\cdot z} + (-1)^{3\cdot z}\\  &= \displaystyle \sum_{y=0}^{2^2-1} (-1)^{y\cdot z}  \end{array}

We can therefore write the factor in underbrace as

\begin{array}{rl}  \mathbf{H}^{\otimes n} \Big[\displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} (-1)^{f(y)} \left|y\right\rangle \Big] &= \displaystyle \frac{1}{2^{n}}\sum_{x=0}^{2^n-1} \underbrace{\sum_{y=0}^{2^n-1} (-1)^{(a+x)\cdot y}}\left|x\right\rangle\\  &= \displaystyle \frac{1}{2^{n}}\sum_{x=0}^{2^n-1} \prod_{i=0}^{n-1} \Big[ \underbrace{1 +(-1)^{(a_i+x_i)}}\Big] \left|x\right\rangle\\  \end{array}

Notice that

\displaystyle 1 +(-1)^{(a_i+x_i)} = \begin{cases}  0 & x\ne a\\  2 & x=a  \end{cases}

that is, the product \prod_{i=0}^{n-1} \Big[1 + (-1)^{(a_i+x_i)}\Big] = 0 unless x=a. Therefore

\begin{array}{rl}  \mathbf{H}^{\otimes n} \Big[\displaystyle \frac{1}{2^{n/2}}\sum_{y=0}^{2^n-1} (-1)^{f(y)} \left|y\right\rangle \Big]  &= \displaystyle \frac{1}{2^{n}}\sum_{x=0}^{2^n-1} \prod_{i=0}^{n-1} \Big[ \underbrace{1 +(-1)^{(a_i+x_i)}}\Big] \left|x\right\rangle\\  &= \displaystyle \frac{1}{2^{n}} 2^n \left|a\right\rangle\\  &= \left|a\right\rangle  \end{array}

Therefore, the quantum computation gives the answer in one application of \mathbf{U}_f:

\displaystyle \mathbf{H}^{\otimes n+1}\mathbf{U}_f \mathbf{H}^{\otimes n+1} \left|0\right\rangle = \left| a\right\rangle \left| 1\right\rangle

So what just happened? We began with a superposition of states and finished with the correct answer in one application of the black box operator \mathbf{U}_f. That’s the power of Quantum Computing!