## Think of a number: How to Program a Quantum Computer

I have always loved figuring out unknowns. That’s why, as a child, I loved “think of a number” games.

There is a quantum version of this game. First, let’s define an operation called “dot product” of two numbers:
$f(x) = a\cdot x = a_0x_0\oplus a_1x_1\oplus\ldots\oplus a_{n-1}x_{n-1}$

where $a_i$ is the $i$th bit of the binary representation of $a$, $x_i$ is the $i$th bit of $x$, and the symbol $\oplus$ is the XOR operator defined as:

$\begin{array}{rl} 1\oplus 1 &= 0\\ 1\oplus 0 &= 1\\ 0\oplus 1 &= 1\\ 0\oplus 0 &= 0 \end{array}$

For example, suppose $a=5$ and $x=6$. In binary, $5=101_2$ and $6=110_2$. Then

$\begin{array}{rl} f(6) &= 1\cdot 1 \oplus 0\cdot 1 \oplus 1\cdot 0\\ &= 1\oplus 0 \oplus 0\\ &= 1 \end{array}$

If you were to guess my number $a$, how would you go about it?

Well, the easiest way to find my number is to take the dot product of my number with the powers of 2. For example,

$\begin{array}{rl} f(2^0) &= f(1) = 101_2 \cdot 001_2 = 1\\ f(2^1) &= f(2) = 101_2 \cdot 010_2 = 0\\ f(2^2) &= f(4) = 101_2 \cdot 100_2 = 1 \end{array}$

So for an n-bit number, it takes you $n$ invocations of the dot product to determine my hidden number.
The bigger the number, the larger the number of invocations.
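Classically, the whole game can be sketched in a few lines of Python (assuming, for illustration, a hidden number of $a=5$):

```python
# A classical sketch of the game. The hidden number a = 5 is chosen for illustration.
def f(x, a=5):
    # Dot product a.x: XOR of the bitwise products, i.e. the parity of the
    # number of 1-bits that a and x have in common.
    return bin(a & x).count("1") % 2

print(f(6))  # 1, matching the worked example above

# Recovering a classically takes n calls of f, one per power of two.
n = 3
a_guess = sum(f(1 << i) << i for i in range(n))
print(a_guess)  # 5
```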

Using a quantum computer, a single invocation of the function is enough to determine the number $a$!

How is it done? You can find out how it works here. In this blog post, we will show how it is implemented using IBM’s Quantum Information Science Kit (QISKit).

As we dive into coding, let’s also talk about Quantum Circuits. This is the foundation of programming the quantum computer.

## Introducing QISKit

IBM has an open-source framework for programming a real quantum computer, which you can also use to simulate the computation. It’s called QISKit, short for Quantum Information Science Kit. It has excellent documentation, which you can refer to for installation instructions.

Assuming we have now installed QISKit, we can start coding our quantum programs. We start by creating some qubits:

# Import the Qiskit SDK
from qiskit import QuantumCircuit, ClassicalRegister, QuantumRegister
from qiskit import execute, Aer
from qiskit.tools.visualization import circuit_drawer, plot_histogram

# Create an input Quantum Register with 3 qubits.
qinput = QuantumRegister(3)
# Create an output Quantum Register with 1 qubit.
qoutput = QuantumRegister(1)
# Create a Classical Register with 3 bits.
c = ClassicalRegister(3)
# Create a Quantum Circuit
qc = QuantumCircuit(qinput, qoutput, c)


The most important class in the above code is QuantumCircuit. This class allows us to perform quantum gate operations on the qubits we pass to it at initialization, as well as to measure the qubits and write the results to the classical bits.

## Quantum Circuits

A circuit is composed of qubits and gates. Gates are operators that act on qubits, changing their state or the state of an output qubit. An example of how we visualize circuits is shown below.

You can see 3 input qubits, 1 output qubit, and 3 classical bits. The “meter” icon in the drawing indicates that we are measuring the state of the qubit at that point in time and saving the result to a classical bit.

The code for the above circuit is shown below:

from qiskit import QuantumCircuit, ClassicalRegister, QuantumRegister
from qiskit import execute, Aer
from qiskit.tools.visualization import circuit_drawer, plot_histogram

# Create an input Quantum Register with 3 qubits.
qinput = QuantumRegister(3,name="in")

# Create an output Quantum Register with 1 qubit.
qoutput= QuantumRegister(1,name="out")

# Create a Classical Register with 3 bits.
c = ClassicalRegister(3, name="c")

# Create a Quantum Circuit
qc = QuantumCircuit(qinput, qoutput, c)


The circuit drawing we saw above was easily generated using the following code:

from qiskit.tools.visualization import circuit_drawer

circuit_drawer(qc)


We repeat the output here for convenience:

### Measuring the State of the Qubits

Below is the code to measure the input qubits, write the results to the classical bits, execute the circuit 1000 times (shots = 1000), and plot the histogram.

qc.measure(qinput, c)

# We will use a simulator
backend_sim = Aer.get_backend('qasm_simulator')
shots = 1000

job_sim = execute(qc, backend=backend_sim, shots=shots)
result_sim = job_sim.result()

# Plot the histogram of the measurement counts
plot_histogram(result_sim.get_counts(qc))


The resulting histogram shows that our qubits are in the initial state of $|000\rangle$ with probability equal to 1.

## NOT Gate

By default, our qubits are in the $|0\rangle$ state. To convert a qubit to the $|1\rangle$ state, we use the NOT gate:

How do we know that the qubit is indeed in the $|1\rangle$ state? When we measure it, the probability of getting the state $|001\rangle$ is 1. This means if we do the experiment many times, the state will be in $|001\rangle$ all the time!

Below is the result of the measurement:
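Independent of Qiskit, the effect of the NOT gate is easy to sketch on a bare state vector, a two-entry list holding the amplitudes of $|0\rangle$ and $|1\rangle$ (an illustrative sketch, not the Qiskit API):

```python
def x_gate(state):
    # The NOT (Pauli-X) gate swaps the amplitudes of |0> and |1>.
    a0, a1 = state
    return [a1, a0]

qubit = [1.0, 0.0]     # the default |0> state
qubit = x_gate(qubit)
print(qubit)           # [0.0, 1.0]: measuring now gives |1> with probability 1
```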

Another interesting gate, and in my opinion the “Hello World” of quantum computing, is the Hadamard gate. The Hadamard gate transforms the basis qubits according to the rule:

$\begin{array}{rl} \mathbf{H}|0\rangle &= \frac{1}{\sqrt{2}} \Big(|0\rangle + |1\rangle\Big)\\ \mathbf{H}|1\rangle &= \frac{1}{\sqrt{2}} \Big(|0\rangle - |1\rangle\Big)\\ \end{array}$

What makes the Hadamard gate very interesting is that when you apply it to the initial state of

$|\underbrace{000\ldots 0}_{\text{n bit string}}\rangle$,

it transforms the initial state into a superposition of all states $|x\rangle$

$\displaystyle \mathbf{H}^{\otimes n} |0\rangle = \displaystyle \frac{1}{\sqrt{2}^n} \sum_{x=0}^{2^{n}-1} |x\rangle$

where

$|x\rangle = |x_{n-1}\rangle\otimes\ldots\otimes|x_2\rangle \otimes|x_1\rangle\otimes|x_0\rangle$ and $x_i \in \{0,1\}$ for $i=0\ldots n-1$

Therefore, for $n=3$, we have a superposition

$|x\rangle = \displaystyle \frac{1}{\sqrt{2}^3} \Big(|0\rangle + |1\rangle + |2\rangle + |3\rangle + |4\rangle + |5\rangle + |6\rangle + |7\rangle\Big)$

Below is a circuit that implements the above superposition:

If we measure the value of $|x\rangle$, we can see that every state is equally likely to be the value of $|x\rangle$.
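As a sanity check, we can apply the Hadamard rule to a plain Python state vector, independently of Qiskit, and confirm that all 8 basis states end up with equal amplitude:

```python
import math

def apply_h(state, q):
    # Apply a Hadamard gate to qubit q of a list of basis-state amplitudes.
    r = math.sqrt(0.5)
    out = state[:]
    for i in range(len(state)):
        if i & (1 << q) == 0:      # i has qubit q = 0; pair it with j = i + 2^q
            j = i | (1 << q)
            out[i] = r * (state[i] + state[j])
            out[j] = r * (state[i] - state[j])
    return out

state = [0.0] * 8
state[0] = 1.0                     # start in |000>
for q in range(3):
    state = apply_h(state, q)
print([round(a, 4) for a in state])  # all 8 amplitudes equal 1/sqrt(8), about 0.3536
```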

## CNOT Gate

The last basic gate we will introduce is the $C_{\mathrm{NOT}}$ gate ($C_{\mathrm{NOT}}$ stands for Controlled-NOT). The $C_{\mathrm{NOT}}$ gate flips the value of the output qubit depending on the value of the input qubit. It is defined as:

$C_{\mathrm{NOT}}|x\rangle|y\rangle = |x\rangle|y\oplus x\rangle$

Below is an example of a $C_{\mathrm{NOT}}$ circuit.

Here, we flip the value of the first bit using a NOT gate then we apply a $C_{\mathrm{NOT}}$ gate between the first bit and the third bit. Since the first bit is now equal to $|1\rangle$, the $C_{\mathrm{NOT}}$ will flip the value of the third bit from $|0\rangle$ to $|1\rangle$. As expected, when we measure the value of the qubits, we find that the state is in $|101\rangle$ with probability equal to 1.
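On basis states, the $C_{\mathrm{NOT}}$ rule $|x\rangle|y\rangle \mapsto |x\rangle|y\oplus x\rangle$ is just a classical XOR, which we can tabulate in a couple of lines of Python:

```python
def cnot(x, y):
    # CNOT on basis states: the target bit y is flipped iff the control x is 1.
    return (x, y ^ x)

for x in (0, 1):
    for y in (0, 1):
        print((x, y), "->", cnot(x, y))
# (1, 0) -> (1, 1): a control of 1 flips the target, as in the circuit above
```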

## Implementing the Dot Product Circuit

We now have the ingredients that we can use to implement the Dot Product circuit. In this circuit, we have 3 input qubits, 1 output qubit and 3 classical bits. If our hidden number is $a=5$, this is represented in binary as $101$. To compute the dot product of any 3 bit number $x$ with $a=5=101_2$, we draw 3 lines corresponding to the binary representation of $x$. We also draw the line for the output qubit.

Since $a=101_2$, the 0th bit of $a$ is 1. Let $x_0$ be the 0th bit of $x$.

$x_0 \cdot 1 = \begin{cases} 1, \text{ if } x_0 = 1\\ 0, \text{ if } x_0 = 0 \end{cases}$

When $x_0 = 1$, the term $x_0\cdot 1 = 1$ and it flips the output qubit. Otherwise, $x_0\cdot 1 = 0$ and the output qubit is left as is.

This suggests that we need a $C_\mathrm{NOT}$ gate that connects $|\mathrm{in}_0\rangle$ and $|\mathrm{out}_0\rangle$.

Therefore,

qc.cx(qinput[0], qoutput[0])


The same is true for $a_2$:

qc.cx(qinput[2],qoutput[0])


With this in mind, we can program the Dot Product circuit:

from qiskit import QuantumCircuit, ClassicalRegister, QuantumRegister
from qiskit import execute, Aer
from qiskit.tools.visualization import circuit_drawer, plot_histogram

# Create an input Quantum Register with 3 qubits.
qinput = QuantumRegister(3, "in")
# Create an output Quantum Register with 1 qubit.
qoutput= QuantumRegister(1,"out")

# Create a Quantum Circuit
qc = QuantumCircuit(qinput, qoutput)
qc.barrier()
# Add a CX (CNOT) gate on control qubit qinput[0] and target qubit qoutput[0]
qc.cx(qinput[0], qoutput[0])
# Add a CX (CNOT) gate on control qubit qinput[2] and target qubit qoutput[0]
qc.cx(qinput[2],qoutput[0])
qc.barrier()


## Finally, Computing the Unknown Number

The trick to finding the unknown number is to first flip the output qubit using the NOT gate and then apply Hadamards to all qubits:

$\mathbf{H}^{\otimes n+1}\mathbf{U}_f\mathbf{H}^{\otimes n+1}\Big[|0\rangle^{\otimes n}|1\rangle\Big] = |a\rangle|1\rangle$

See this post for a more detailed explanation.

The circuit corresponding to this equation is the diagram below where we also show barriers that separate the input, the black-box function and the measurement.

The code to do the above is shown below:

from qiskit import QuantumCircuit, ClassicalRegister, QuantumRegister
from qiskit import execute, Aer
from qiskit.tools.visualization import circuit_drawer, plot_histogram

# Create an input Quantum Register with 3 qubits.
qinput = QuantumRegister(3)
# Create an output Quantum Register with 1 qubit.
qoutput= QuantumRegister(1)
# Create a Classical Register with 3 bits.
c = ClassicalRegister(3)
# Create a Quantum Circuit
qc = QuantumCircuit(qinput, qoutput, c)

# Apply Hadamard gate to the input qubits
qc.h(qinput[0])
qc.h(qinput[1])
qc.h(qinput[2])

# Flip the output qubit and apply Hadamard gate.
qc.x(qoutput[0])
qc.h(qoutput[0])

qc.barrier()
# Add a CX (CNOT) gate on control qubit 0 and the output qubit.
qc.cx(qinput[0], qoutput[0])

# Add a CX (CNOT) gate on control qubit 2 and the output qubit.
qc.cx(qinput[2],qoutput[0])

qc.barrier()

# Apply Hadamard gates to all input and output qubits.
qc.h(qinput[0])
qc.h(qinput[1])
qc.h(qinput[2])
qc.h(qoutput[0])

# Add a Measure gate to see the state.
qc.barrier()
qc.measure(qinput, c)

# Compile and use a simulator
backend_sim = Aer.get_backend('qasm_simulator')
shots = 1000

job_sim = execute(qc, backend=backend_sim, shots=shots)
result_sim = job_sim.result()

# Draw the circuit
from qiskit.tools.visualization import circuit_drawer
my_style = {'plotbarrier': True}
circuit_drawer(qc,style=my_style)


Measuring the state of the input qubits after the computation, we get with probability 1 our hidden number $a=5=101_2$.
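To see the whole algorithm work without Qiskit, here is a minimal pure-Python state-vector simulation of the same circuit, a sketch under the convention that qubit $k$ is bit $k$ of the basis-state index and the output qubit is qubit 3:

```python
import math

# 3 input qubits (0..2) plus 1 output qubit (3); amplitudes indexed by basis state.
state = [0.0] * 16
state[0] = 1.0                       # start in |0000>

def apply_x(state, q):
    # NOT gate: swap the amplitudes of each pair of states differing only in qubit q.
    for i in range(len(state)):
        if i & (1 << q) == 0:
            j = i | (1 << q)
            state[i], state[j] = state[j], state[i]

def apply_h(state, q):
    # Hadamard on qubit q.
    r = math.sqrt(0.5)
    for i in range(len(state)):
        if i & (1 << q) == 0:
            j = i | (1 << q)
            a, b = state[i], state[j]
            state[i], state[j] = r * (a + b), r * (a - b)

def apply_cx(state, control, target):
    # CNOT: flip the target bit of every basis state whose control bit is 1.
    for i in range(len(state)):
        if i & (1 << control) and not i & (1 << target):
            j = i | (1 << target)
            state[i], state[j] = state[j], state[i]

apply_x(state, 3)                    # flip the output qubit
for q in range(4):                   # Hadamard all qubits
    apply_h(state, q)
apply_cx(state, 0, 3)                # the oracle for a = 101: CNOTs on bits 0 and 2
apply_cx(state, 2, 3)
for q in range(4):                   # Hadamard all qubits again
    apply_h(state, q)

# Probabilities of the 3-bit input register: only a = 5 = 101 survives.
probs = {i & 0b111: abs(s) ** 2 for i, s in enumerate(state) if abs(s) > 1e-9}
print(probs)
```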

## Coloring the Circuits

You might have noticed that our circuit drawings have colors while the code above only produces a black-and-white drawing. To color our circuits, we use the following code:

from qiskit.tools.visualization import matplotlib_circuit_drawer as drawer, qx_color_scheme
my_scheme=qx_color_scheme()
my_scheme['plotbarrier'] = True
drawer(qc, style=my_scheme)


## Conclusion

It’s now fairly easy to write quantum programs using IBM’s QISKit. In the future it will be even easier, as we will simply compose high-level libraries rather than work with the algorithms at a low level. Just as AI is now accessible to a lot of people, quantum computing will have its day as well.

## Implementing Parallel Algorithms Part 1

Now we know that parallel algorithms allow us to make our programs run faster. So how do we implement them?

I have used mpich before, but that was more than a decade ago. Recently, I found myself looking for new ways of doing parallel programming. I discovered a very nice parallel programming language called Chapel. This is what we’ll use to implement parallel algorithms.

## Algorithm 1: Parallel sum of consecutive numbers from 1 to N

Getting the sum of the numbers from 1 to N, where N is some integer, is easy. There is a formula for that:

$\displaystyle \sum_{i=1}^N i = \frac{N(N+1)}{2}$

However for the sake of illustration, we are going to compute the sum of 1 to N using a cluster of machines. Here is the Chapel code to accomplish it inside the file add_parallel.chpl.

config var N:int = 1000;
var D:domain(1) = {0..numLocales-1};
var s:[D] int;
var sum:int = 0;
var bs = N/numLocales;
coforall loc in Locales {
  on loc {
    var i = loc.id;
    var start = i*bs + 1;
    var end = start + bs - 1;
    var _sum:int = 0;
    for j in start..end {
      _sum += j;
    }
    writeln("i= " + i + ", start= " + start + ", end=" + end + ", sum = " + _sum);
    s[i] = _sum;
  }
}

sum = + reduce s;
writeln("sum: " + sum);


This program is compiled using the command:

chpl add_parallel.chpl -o add_parallel


where add_parallel.chpl is the filename of the program and -o add_parallel specifies the filename of the binary produced after compilation.

On line 1, we defined the default value of N to be 1000. This can be overridden on the command line by specifying the --N parameter. The number of machines we are going to use is also specified on the command line, using the -nl parameter. A sample invocation of this program is the following:

./add_parallel -nl 3 --N=120


The above command means that we want to run the add_parallel binary using 3 machines with the value of N=120.

Executing the above command will give us:

./add_parallel -nl 3 --N=120
i= 0, start= 1, end=40, sum = 820
i= 1, start= 41, end=80, sum = 2420
i= 2, start= 81, end=120, sum = 4020
sum: 7260


## How the program works

The program partitions N into 3 blocks. This is done on line 5, where we divide N by numLocales to get the block size. The numLocales variable contains the value of the -nl parameter, which in this example is 3.

The code on line 6 tells Chapel to execute a parallel for-loop, running the code inside the on loc block on each locale. A locale has an id starting from 0. Each locale determines its id, computes the starting and ending numbers to sum, and stores the partial sum in the variable _sum.

coforall loc in Locales {
  on loc {
    var i = loc.id;
    var start = i*bs + 1;
    var end = start + bs - 1;
    var _sum:int = 0;
    for j in start..end {
      _sum += j;
    }
    writeln("i= " + i + ", start= " + start + ", end=" + end + ", sum = " + _sum);
    s[i] = _sum;
  }
}


This _sum is stored in the array s. We then take the sum of the entries of s using the reduce keyword, specifying + as the reduction operator. Finally, we print the total sum across the machines.

sum = + reduce s;
writeln("sum: " + sum);
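The same block-partition-and-reduce pattern can be sketched in Python, with threads standing in for Chapel’s locales (an illustration only; Chapel’s locales are separate machines, not threads):

```python
from concurrent.futures import ThreadPoolExecutor

def block_sum(i, bs):
    # Each "locale" sums its own block of bs consecutive integers.
    start = i * bs + 1
    end = start + bs - 1
    return sum(range(start, end + 1))

N, num_locales = 120, 3
bs = N // num_locales
with ThreadPoolExecutor(max_workers=num_locales) as ex:
    s = list(ex.map(lambda i: block_sum(i, bs), range(num_locales)))

total = sum(s)       # the "+ reduce" step
print(s, total)      # [820, 2420, 4020] 7260, matching N*(N+1)/2
```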


## Conclusion

We have seen that we can implement parallel programs using the Chapel programming language. In part 2, we will show how to do parallel matrix multiplication using Chapel.

## Counting the Fruits of the Tree, Part 2

Algorithm Complexity

In the last post, we computed the probability distribution of the set of events that would occur when the first letter of the search term is pressed, given a set of 100K English words. Let’s review the algorithm we are trying to analyze:

input = [];   // array of words
matches = [];
while (incremental_search) {
  for (i = 0; i < input.length; i++) {
    if (input[i].indexOf(search_term) > -1) {
      matches.push(input[i]);
    }
  }
  input = matches;
  matches = [];
}


When the first keystroke is pressed, we search through the entire array, comparing each element with the keystroke and collecting all matching words into the matches array. After the loop is finished, we set the input array to the contents of the matches array and empty the matches array. When the user presses the next keystroke, we search the new input array, this time matching all words in it against the word fragment formed by the first and second letters. We do this until the user is presented with a small set of words that matches his/her search term. We can see that the performance of this algorithm depends on the length of the input array at each iteration.
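For concreteness, here is a small Python sketch of that loop, re-filtering the previous matches as each keystroke extends the search fragment:

```python
def incremental_search(words, keystrokes):
    # Filter the word list as each keystroke extends the search fragment.
    matches = list(words)
    fragment = ""
    for key in keystrokes:
        fragment += key
        # Keep only the words that contain the fragment anywhere.
        matches = [w for w in matches if fragment in w]
    return matches

words = ["far", "for", "force", "forced", "forces", "force field"]
print(incremental_search(words, "for"))
# ['for', 'force', 'forced', 'forces', 'force field']
```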

Suppose we take as our input a random subset of our list of words with size 1000, with no word repeated. We want to know how our algorithm performs on the average for any such subset. To figure this out, we need to know how many words would match the input array when we press the first letter. Since this input array could be any subset of 1000 words from the original list of 109582 words, we want to know the average number of words that match the first keystroke. Take, for example, the letter ‘a’. What is the average number of words that we would likely see when we press the letter ‘a’?

Random Variables

In a deterministic world, we can compute the behaviour of things using a function, and we would get the same answer for the same input. But the world is anything but deterministic, and functions do not always assume the same value for the same input. We call such a function from a set $U$ to the set of real numbers, whose value depends on a probability distribution, a Random Variable. For example, when we toss a coin, it can either land heads or tails. If we assign the value 1 to heads and 0 to tails, then the coin toss can be thought of as a random variable whose domain is the set $\{\text{head},\text{tail}\}$ and whose range is the set $\{1,0\}$. Its value is 1 with probability 1/2 and 0 with probability 1/2. If we toss the coin many times and count the number of heads that come up, we would find that about 50% of the time it comes up heads. We also say that “on the average”, heads come up half of the time. This “on the average” is also known as the Expected Value of the random variable and is defined by the formula:

$\displaystyle E[X] = \sum_j j\cdot Pr(X = j)$

where $j$ is a value that the random variable assumes and $Pr(X=j)$ is the probability of the random variable $X$ assuming the value $j$. The summation is over all values that $X$ assumes. In our coin toss example, there are 2 possible values of $X$: 1 (with probability 1/2) and 0 (with probability 1/2). Therefore, the expected value of the coin toss random variable is

$\displaystyle E[X] = 1\cdot Pr(X=1) + 0\cdot Pr(X=0)$
$\displaystyle = 1\cdot \frac{1}{2} + 0\cdot \frac{1}{2} = \frac{1}{2}$

In the same way, given any word from our list, we can define a function that takes the value 1 if the word contains the letter ‘a’ and 0 otherwise. From the previous post, we computed the probability of getting an ‘a’ to be 0.0752. Let’s call this function $X_\alpha$; it is defined as:

$\displaystyle X_\alpha = \Big\{\begin{array}{ll} 1 & \text{if the word contains an a}\\ 0 & \text{otherwise} \end{array}$

In general, if $p$ is the probability of getting a letter ‘a’, then $1-p$ is the probability of getting something else. The expected value of this random variable can be computed using the formula:

$\displaystyle \begin{array}{ll} E[X_\alpha] &= 1 \cdot Pr(X_\alpha = 1) + 0 \cdot Pr(X_\alpha = 0)\\ &= 1 \cdot p + 0 \cdot (1-p)\\ &= p \end{array}$

Imagine we have a set of 1000 words. If we apply this function to each word, we get an array of 1’s and 0’s, and the total number of 1’s indicates the number of words that contain the letter ‘a’. In other words, the function

$\displaystyle X = \sum_{\alpha=1}^{1000} X_\alpha$

is a random variable that gives us the number of words that contain a letter ‘a’. We can compute the expected value of $X$ using the formula:

$\displaystyle E[X] = E\Big[\sum_{\alpha=1}^{1000} X_\alpha\Big]$
$\displaystyle = \sum_{\alpha=1}^{1000} E[X_\alpha]$
Since $E[X_\alpha] = p$, the above computation leads to

$\displaystyle E[X] = \sum_{\alpha=1}^{1000} p = 1000 p$

The result tells us that the expected number of words containing the letter ‘a’ is $1000p = 1000\cdot 0.0752$ or approximately 75 words. This is a good thing because in the second loop, we will only need to iterate on an input with an average length of 75 words. If we do the same computation for all letters, we find the expected number of words per letter for every thousand to be

   key Expected #Words Matching
1    '               0.02624589
2    a              75.22991402
3    b              22.05573578
4    c              43.26766611
5    d              39.77565011
6    e              99.04150001
7    f              14.90897924
8    g              31.76540371
9    h              25.38502724
10   i              80.46859417
11   j               2.27158200
12   k              10.22408743
13   l              54.74761950
14   m              30.32581651
15   n              67.64353879
16   o              59.99679800
17   p              30.62108280
18   q               2.24402381
19   r              74.48190608
20   s              80.93445876
21   t              65.87062875
22   u              37.54737384
23   v              12.34738014
24   w              10.11910386
25   x               3.54188320
26   y              20.14765939
27   z               5.01034088
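We can double-check the linearity-of-expectation argument with a quick Monte Carlo simulation in Python, modeling each of the 1000 words as a Bernoulli trial with $p = 0.0752$:

```python
import random

random.seed(42)                       # fixed seed for reproducibility
p, n_words, trials = 0.0752, 1000, 2000
counts = []
for _ in range(trials):
    # X_alpha for each of the 1000 words: 1 if the word contains an 'a'.
    counts.append(sum(1 for _ in range(n_words) if random.random() < p))
average = sum(counts) / trials
print(round(average, 2))              # close to the predicted 1000 * 0.0752 = 75.2
```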


If each letter is equally likely to be the first letter typed, then on the average we have 37 words matching the first letter, reducing our 1000-word input to 37.

Since the figure we computed was the average, it means that the length of the next input array can be more than this average. In the next part of the series, we will derive some bounds on the length of the next input array. We will also compute the length of the next input array when we type the next letter of the search term.

## Counting the Fruits of the Tree, Part 1

In the last post, we described a way to filter a set of words using a Trie. The other way to filter is to search the array for matching substrings on each keystroke and return all words that match. Let $W$ be an array of words. If we type the letter “f”, we search through the entire array and return those words that contain the letter f. Here’s a sample of words containing the letter f.

"folios"          "graffiti"        "frowner"         "affirmativeness"
"confounds"       "nitrifies"       "thrifts"         "decaffeinated"
"undefiled"       "shiftily"        "finises"         "midwiferies"
"firepan"         "wafered"         "purifies"        "flatly"
"scarfing"        "preform"


Typing the letter “i” will further reduce this list to

"graffiti"        "affirmativeness" "nitrifies"       "undefiled"
"finises"         "firepan"         "purifies"        "scarfing"


Typing the letter “e” will reduce this list to

"nitrifies" "purifies"


Analysis: Data Preparation

Let’s get a feel for how this algorithm performs on a set of English words published at http://www-01.sil.org/linguistics/wordlists/english/. The word list can be downloaded from this link. We’ll also use our favorite tool to do data manipulation and statistical computations.

x=read.csv("wordsEn.txt", header=F)


After loading, the data is now referenced via the variable $x$. The number of words loaded is 109582. We use the head command to take a peek at the data.

> head(x)
V1
1        a
2      aah
3    aahed
4   aahing
5     aahs
6 aardvark


For every letter in the english alphabet, we want to determine how many words in the list $W$ contain that letter. To determine this, we will split each word into its constituent letters and get the unique letters in each word.

xvec=as.vector(x[,1])
xsplit=strsplit(xvec, "")
> head(xsplit)
[[1]]
[1] "a"

[[2]]
[1] "a" "a" "h"

[[3]]
[1] "a" "a" "h" "e" "d"

[[4]]
[1] "a" "a" "h" "i" "n" "g"

[[5]]
[1] "a" "a" "h" "s"

[[6]]
[1] "a" "a" "r" "d" "v" "a" "r" "k"


To get the unique letters we apply the unique function to all words in the array:

tmp=lapply(xsplit, FUN=unique)
> head(tmp)
[[1]]
[1] "a"

[[2]]
[1] "a" "h"

[[3]]
[1] "a" "h" "e" "d"

[[4]]
[1] "a" "h" "i" "n" "g"

[[5]]
[1] "a" "h" "s"

[[6]]
[1] "a" "r" "d" "v" "k"


Now that we have a list of unique letters of each word, we can count the number of words that contain each letter. We do this by concatenating the list of unique letters of each word into one big list and tabulating the occurrences of each letter.

tmp2=unlist(tmp)
> table(tmp2)
tmp2
'     a     b     c     d     e     f     g
20 57327 16807 32971 30310 75472 11361 24206
h     i     j     k     l     m     n     o
19344 61319  1731  7791 41719 23109 51546 45719
p     q     r     s     t     u     v     w
23334  1710 56757 61674 50195 28612  9409  7711
x     y     z
2699 15353  3818

# In terms of percentage, rounded to 2 decimal places...
> round(table(tmp2)/length(tmp2)*100,2)
tmp2
'    a    b    c    d    e    f    g    h    i
0.00 7.52 2.21 4.33 3.98 9.90 1.49 3.18 2.54 8.05
j    k    l    m    n    o    p    q    r    s
0.23 1.02 5.47 3.03 6.76 6.00 3.06 0.22 7.45 8.09
t    u    v    w    x    y    z
6.59 3.75 1.23 1.01 0.35 2.01 0.50


We can visualize the numbers above by plotting the frequency as a bar chart against the letters.

plot(table(tmp2)/length(tmp2), ylab="probability", xlab="letters")


The results show that the 5 most common letters in the English dictionary are “e”, “s”, “i”, “a”, and “r”.

> f=data.frame(round(table(tmp2)/length(tmp2)*100,2))
> colnames(f)=c("letter", "frequency")
> head(f[order(-f$frequency),], 5)
   letter frequency
6       e      9.90
20      s      8.09
10      i      8.05
2       a      7.52
19      r      7.45


Recap

What we have done is create a frequency distribution of the occurrence of each letter of the English alphabet, based on a list of 100K words. Now that we have our data prepared, we will analyze the performance of the algorithm, given the input $W$, in the next post.

## Growing Words on Tries

Give me a place to stand and I will move the Earth.

Thus says Archimedes. Leverage. Brute force can only get you so far, but true power is in leverage. Leverage does not come easily, though. It involves a lot of thinking and training. In this post, we are going to explore how to use leverage to solve a common programming problem.

Suppose I have an array of words or phrases (strings, as they are called in programming) that I want to search. The search is incremental in the sense that whenever I press the first few letters of the word, I am given an incremental search result. These first few letters can occur in any part of the string. I want to develop an algorithm to search the array of strings in an efficient manner.

Let’s say we have an array of words: “far”, “for”, “force”, “forced”, “forces”, “force field”. Suppose I want to search for the word “force”. What I do is type in sequence “f”, “o”, “r”,… When I type “f”, the algorithm will then search for all strings containing “f”.

When I type “o”, it will search for all strings containing “fo”. When I type “r”, it will search for all strings containing “for”, and so on.

The first thing that comes to mind is to match each string in the array against the fragment formed by each keystroke we type. All those that match are collected and used as the new array to search when the next letter is pressed. If there are a million strings in the array, we will have to search the entire array when the first letter is pressed. Hopefully, we will get a small subset that will help us quickly find what we are looking for. This is a brute-force approach because it has to do a million comparisons just to come up with the set of matches.

Leverage

Nature has always been a source of innovation for science and engineering. We turn to the birds to learn how planes can fly. We turn to the fish to help our submarines dive. We turn to the ants to help us find the shortest path to food. For this problem, we turn to trees.

Imagine representing the array of words as leaves of a tree as shown in the diagram below.

In the diagram, we have 6 words, namely “far”, “for”, “force”, “forced”, “forces”, and “force field”, arranged in a tree structure. Each non-leaf node contains a letter, and the leaf nodes contain the words. Typing the first letter “f” already focuses on all words containing “f”, preventing the entire array from being traversed. After forming the sequence “for”, we immediately focus on the words “force”, “forced”, “forces”, and “force field”. The more letters we type, the more focused we are on the item we are searching for. This is a very efficient search structure. Thanks to the trees for the inspiration.
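Here is a minimal Python sketch of such a tree, a prefix trie in which each node records the words reachable through it (an illustration of the idea, not production code):

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # letter -> child node
        self.words = []      # words whose spelling passes through this node

def insert(root, word):
    node = root
    for letter in word:
        node = node.children.setdefault(letter, TrieNode())
        node.words.append(word)

def search(root, fragment):
    # Follow the fragment letter by letter; a dead end means no match.
    node = root
    for letter in fragment:
        if letter not in node.children:
            return []
        node = node.children[letter]
    return node.words

root = TrieNode()
for w in ["far", "for", "force", "forced", "forces", "force field"]:
    insert(root, w)
print(search(root, "for"))
# ['for', 'force', 'forced', 'forces', 'force field']
```

Storing the word list at every node trades memory for speed: each lookup costs only as many steps as there are letters in the fragment, regardless of how many words are stored.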

To appreciate this structure, you can find below an example that uses this structure to power the word search. In the text box below, type the word “fr”, without quotes, and you should see a filtering of the words.


This is an example of what is known as a typeahead, which is very convenient when you are searching for a specific word or phrase in a long list. Thanks also to javascript technology, we are able to see our algorithm in action immediately.

In my younger years as a novice programmer, I can comfortably sit in a corner and program the day away. Life was so focused back then. You arrive at nine and without even noticing it, it’s time to go home with a great sense of achievement, perhaps because you were able to solve a major bug or develop an elegant framework. Alas, those were the days! Multitasking is now the norm and not the exception.

These days, you come in at 9 AM doing all sorts of work and handling all sorts of interruptions. By six o’clock you feel you haven’t accomplished anything, what with all those meetings and interruptions every now and then. How do you measure your productivity, then, with all these tasks you need to do every day? Some of these tasks are not even done in a day or two, as you wait for data or input to proceed. The good thing is that there is already workflow technology to help you and management keep track of the time spent on your tasks, without you having to consciously record when you started and finished each one.

Suppose a certain employee is multitasking between 3 projects A, B, and C. Each project is composed of one or more tasks. To represent Task 1 of Project A, we write A.1, and likewise for the other tasks and projects. When the employee works on a task, the system records the start time and the end time of the task. Most of the time, the employee switches between tasks, especially if one of the tasks is more urgent, and later switches back to the previous task. We can represent this scenario using the diagram below:

In this example, you can see that the employee multitasked between tasks 1 and 2 of project A. We observe that the end time of task A.1 is greater than the start time of task A.2, which means that they overlap. The numbers 1, 2, and 1.5 represent the duration in hours of each segment. The question we want to answer is: what is the total time spent by the employee on each task? Since the tasks overlap for a duration of 2 hours, we can assume, for the sake of simplicity, that the employee did half of the work on task A.1 and half on task A.2 (as to the validity of this assumption, that is another story). Therefore, the total time allotted to each task is

$A.1 = 1 + 2/2 = 2$
$A.2 = 2/2 + 1.5 = 2.5$

Well, that was easy. Now, if the employee’s multitasking graph looks like the one below, how will you compute the duration of each task?

The diagram shows 5 overlaps. The challenge is to create a program that will take as input the start and end times of each task and compute the total time of each task on the assumption that for every time interval that overlaps, the duration assigned to each task is divided equally among the overlapping tasks.

The algorithm

Label the start times and end times of each task like in the figure below.

Sort these times from earliest to latest to get $t_1 < t_5 < t_3 < t_2 < t_6 < t_4$. Let us call the segment between any two consecutive $t_j$ a TimeSegment. Create two structures named Project and TimeSegment. The Project structure holds the name of the project and its start and end times. The TimeSegment structure holds the duration between two timestamps and the projects that intersect that time segment. We can then compute the duration of each task using the algorithm below:

times = [t1, t5, t3, t2, t6, t4]
times = sorted(times)
t = times[0]
for tj in times[1:]:
    duration = tj - t
    timesegment = TimeSegment(duration)
    # iterate over all projects and determine which intersect this segment
    for p in projectList:
        if p.start <= t and p.end >= tj:
            timesegment.projects.append(p)
    segmentList.append(timesegment)
    t = tj

for ts in segmentList:
    for p in ts.projects:
        theduration = ts.duration / float(len(ts.projects))
        p.duration += theduration

for p in projectList:
    print(p.name + " : " + str(p.duration))


I created a Python script to implement this algorithm; its output for a sample input is shown below:


project P0: start = 0, end = 3
project P1: start = 2, end = 5
project P2: start = 1, end = 4

Duration of Each Project:

P0 : 1.83333333333
P1 : 1.83333333333
P2 : 1.33333333333
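To make the computation concrete, here is a self-contained Python sketch of the algorithm. The class definitions mirror the Project and TimeSegment structures described above; the names and constructors are my own illustration, not the original script.

```python
class Project:
    def __init__(self, name, start, end):
        self.name = name
        self.start = start
        self.end = end
        self.duration = 0.0

class TimeSegment:
    def __init__(self, start, end):
        self.duration = float(end - start)
        self.projects = []

def compute_durations(projects):
    # the segment boundaries are the distinct start/end times, sorted
    times = sorted({t for p in projects for t in (p.start, p.end)})
    segments = []
    for t, tj in zip(times, times[1:]):
        seg = TimeSegment(t, tj)
        # a project intersects the segment [t, tj] if their intervals overlap
        seg.projects = [p for p in projects if p.start < tj and p.end > t]
        segments.append(seg)
    # split each segment's duration equally among its projects
    for seg in segments:
        for p in seg.projects:
            p.duration += seg.duration / len(seg.projects)

projects = [Project("P0", 0, 3), Project("P1", 2, 5), Project("P2", 1, 4)]
compute_durations(projects)
for p in projects:
    print("%s : %s" % (p.name, p.duration))
```

Running this on the three sample projects reproduces the durations listed above.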



In the next article, I'll change the output of this program to a CSV file and create an R script that will automatically plot the output to clearly show the overlaps.

## Finding the cliques of my Facebook friends

The availability of the Facebook Graph API has made studying social interactions much easier. I have seen some social experiments using data from Facebook. For example, you can verify the small-world effect, also known as the six degrees of separation.

I also did my own experiment using the Facebook API: I computed the cliques among my friends. I ran the experiment back in August 2009, and only now have I been able to document it. At that time, I was connected to only 82 friends.

What is a clique? A clique is a group of people who are mutual friends with each other. In the Facebook setting, if 5 people are all friends with one another, they form a clique. In this experiment, I computed the maximal cliques contained in my set of friends. A maximal clique is a clique that cannot be enlarged by adding another mutually connected friend. Why study cliques? Cliques allow you to infer the interests of a person just by studying the groups he/she belongs to. As the saying goes, birds of a feather flock together.

To gather my data, I used the Facebook API to get the list of my friends, and for each friend I queried the Facebook function friends.areFriends, passing it a list of Facebook ids corresponding to my friends. The output is an array of zeroes and ones: a 1 indicates that two people are mutual friends, and a 0 indicates that they are not.

Once I gathered my data, I put it in a format that can be read by a program to compute the cliques. Here is a sample of the input file.

0 1
0 2
0 4
0 7
0 10
0 13
0 14
0 15
1 2
1 3
1 7
1 8
1 9
1 10
1 11
1 12
1 14
1 15
1 19
1 22
1 25
2 3
2 4
2 7
2 8
2 10
2 12
2 13
2 14
2 16


Let me explain the format. The numbers are person ids, and each row contains two of them. For example, the row “0 1” reads: person 0 is a friend of person 1. The row “2 14” reads: person 2 is a friend of person 14. I did this for all my friends.
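A minimal Python sketch of a parser for this format (the `read_edges` function and the in-memory adjacency representation are my own illustration, not the original script):

```python
def read_edges(lines):
    """Parse 'a b' edge lines into an adjacency map of sets."""
    adj = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        a, b = map(int, line.split())
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)  # friendship is mutual
    return adj

adj = read_edges(["# sample", "0 1", "0 2", "1 2"])
print(sorted(adj[0]))  # [1, 2]
```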

The algorithm I applied to generate the maximal cliques is taken from the book by Donald Kreher and Douglas Stinson entitled “Combinatorial Algorithms: Generation, Enumeration and Search”. The algorithm can be found on page 112 of the book. Below is the Java program I wrote to implement it.

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Set;
import java.util.StringTokenizer;
import java.util.TreeSet;

public class AllClique {

    long startTime = System.currentTimeMillis();

    HashMap<Integer, Node> nodeMap = new HashMap<Integer, Node>();
    TreeSet<Integer> nodeSet = new TreeSet<Integer>();

    public void allClique(Set NLminusOne, Set CLminusOne, int x, String v) {
        if (null == NLminusOne) {
            System.out.println("[]");
        } else {
            v += x + " ";
        }

        // N(l) = N(x) intersected with N(l-1)
        Set NL = null;
        if (null == NLminusOne) {
            NL = (Set) nodeSet.clone();
        } else {
            Node n = nodeMap.get(x);
            if (null != n) {
                NL = (Set) n.connections.clone();
                intersect(NL, NLminusOne);
            }
        }

        // no common neighbor left: the current vector is a maximal clique
        if (null == NL || NL.size() == 0) {
            v += "*";
            System.err.println(v);
        }

        System.out.println(v);

        // C(l) = the candidates that can still extend the current clique
        Set CL = null;
        if (null == CLminusOne) {
            CL = (Set) nodeSet.clone();
        } else {
            Node n = nodeMap.get(x);
            if (null != n) {
                CL = (Set) n.connections.clone();
                Set BLminusOne = getBL(x);
                intersect(CL, BLminusOne, CLminusOne);
            }
        }

        if (null != CL) {
            if (v.length() > 0) {
                char c = v.charAt(v.length() - 1);
                if (c == '*') {
                    v = v.substring(0, v.length() - 1);
                }
            }

            for (Iterator iter = CL.iterator(); iter.hasNext();) {
                allClique(NL, CL, (Integer) iter.next(), v);
            }
        }
    }

    // B(l) = { l+1, ..., n-1 }: the vertices numbered after l
    // (assumes the vertex ids are 0..n-1)
    public Set getBL(int l) {
        int max = nodeSet.size();
        TreeSet set = new TreeSet();
        for (int i = l + 1; i < max; i++) {
            set.add(i);
        }
        return set;
    }

    public void printVector(int l) {
        String str = "[";
        for (int i = 0; i < l; i++) {
            str += i + " ";
        }
        str += "]";
        System.out.println(str);
    }

    public void printVector(Integer[] X) {
        String str = "[";
        for (Integer i : X) {
            str += i + " ";
        }
        str += "]";
        System.out.println(str);
    }

    // a = a intersect b intersect c
    public boolean intersect(Set a, Set b, Set c) {
        boolean bool = a.retainAll(b);
        bool = a.retainAll(c) || bool;
        return bool;
    }

    // a = a intersect b
    public boolean intersect(Set a, Set b) {
        return a.retainAll(b);
    }

    public void readInput(File f) throws Exception {
        String line = null;
        BufferedReader r = null;
        try {
            r = new BufferedReader(new FileReader(f));
            while (null != (line = r.readLine())) {
                line = line.trim();
                if (line.length() == 0 || line.startsWith("#")) continue;
                StringTokenizer st = new StringTokenizer(line, " ");
                String sFrom = st.nextToken();
                String sTo = st.nextToken();
                createConnections(sFrom.trim(), sTo.trim());
            }
        } finally {
            try {
                if (null != r) {
                    r.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    public void createConnections(String sFrom, String sTo) {
        System.out.println(sFrom + "--" + sTo);
        Node n1 = nodeMap.get(new Integer(sFrom));
        if (null == n1) {
            n1 = new Node();
            n1.id = new Integer(sFrom);
            n1.connections = new TreeSet();
            nodeMap.put(n1.id, n1);
        }

        Node n2 = nodeMap.get(new Integer(sTo));
        if (null == n2) {
            n2 = new Node();
            n2.id = new Integer(sTo);
            n2.connections = new TreeSet();
            nodeMap.put(n2.id, n2);
        }

        // friendship is mutual: record the edge in both directions
        n1.connections.add(n2.id);
        n2.connections.add(n1.id);
        nodeSet.add(n1.id);
        nodeSet.add(n2.id);
    }

    class Node {
        Integer id;
        TreeSet connections;
    }

    public static void main(String[] args) throws Exception {
        final AllClique app = new AllClique();
        app.readInput(new File(args[0]));
        for (Iterator iter = app.nodeSet.iterator(); iter.hasNext();) {
            System.out.println("=> " + iter.next());
        }

        app.allClique(null, null, 0, "");
    }
}
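As a cross-check on the Java program, the same maximal cliques can be enumerated with the classic Bron–Kerbosch backtracking, a different algorithm from the Kreher–Stinson one used above. A minimal Python sketch on a toy graph (the edge list here is illustrative, not my Facebook data):

```python
def bron_kerbosch(R, P, X, adj, cliques):
    # R: current clique, P: candidate vertices, X: already-explored vertices
    if not P and not X:
        cliques.append(sorted(R))  # R cannot be extended: it is maximal
        return
    for v in list(P):
        bron_kerbosch(R | {v}, P & adj[v], X & adj[v], adj, cliques)
        P.remove(v)
        X.add(v)

# build an adjacency map from an undirected edge list
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

cliques = []
bron_kerbosch(set(), set(adj), set(), adj, cliques)
print(cliques)
```

For this toy graph the maximal cliques are {0, 1, 2} and {2, 3}.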



Running the program with my data, I was able to generate the cliques. To visualize the result, the graph-drawing application graphviz came in very handy. Here is the result of the clique computation:

Click on the graphic so that you can zoom on it.
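For reference, graphviz consumes graphs in the DOT language, so producing a drawing amounts to emitting a small text file. A sketch that writes one (the graph name and edge list are illustrative, not the original script):

```python
# emit an undirected graph in graphviz DOT format
edges = [(0, 1), (0, 2), (1, 2)]
lines = ["graph friends {"]
for a, b in edges:
    lines.append("  %d -- %d;" % (a, b))  # '--' marks an undirected edge
lines.append("}")
dot = "\n".join(lines)
print(dot)
```

Saving this text as, say, friends.dot and running graphviz's `dot` or `neato` on it produces the drawing.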