When Average Is Not Enough: Thoughts on Designing for Capacity

Designing a system from scratch to handle a workload you don’t know is a challenge. If you put to much hardware, you might be wasting money. You put little, then your users will complain of how slow the system is.

If you’re given only a rate, like 6000 hits/hour, you don’t know how these are distributed in a minute by minute or per second interval. We can make a guess and say that there are about 100 hits per minute or 1.67 hits/sec. If hits come uniformly at that rate, then we can design a system that can handle 2 hits/sec and all users will be happy since all requests will be served quickly and no queueing of requests. But we know it’s not going to happen. There will be some interval where the number of hits is less than 3 and some more than 3.

Theoretically, requests to our server come randomly. Let’s imagine 60 bins represented by seconds in one minute. We also imagine that requests are like balls we throw into the bins. Each bin is equally likely to be landed by a ball. It’s possible that all balls land on only one bin!

throw_balls_random

After throwing the balls into bins, let’s see what we have.

simulation

As you can see, some bins have more than 2 balls (which is the average number of balls in a bin). Therefore if we design our system based on the average, 50% of our users will have a great experience while the other 50% will have a bad experience. Therefore we need to find how many requests per second our server needs to handle so that our users will have a good experience (without overspending).

To determine how many requests per second we need to support, we need to get the probability of getting 4, 5, 6 or more request per second. We will compute the probability starting from 3 requests per second and increment by one until we can get a low enough probability. If we design the system for a rate that has a low probability, we are going to spend money for something that rarely occurs.

Computing the Probability Distribution

We can view the distribution of balls into bins in another way. Imagine labeling each ball with a number from 1 to 60. Each number has an equal chance to be picked. The meaning of this labeling is this: the number that was assigned to the ball is the bin (time bucket) it belongs to. After labeling all balls, what you have is a distribution of balls into bins.

label_balls
Since each ball can be labeled in 60 different ways and there are 100 balls, the number of ways we can label 100 different balls is therefore

\displaystyle 60^{100}

Pick a number from 1-60. Say number 1. Assume 2 balls out of 100 are labeled with number 1. In how many ways can you do this ? Choose the first ball to label. There are 100 ways to choose the ball. Choose the second ball. Now there are 99 ways to choose the second ball. We therefore have 990 ways to select 2 balls and label them 1. Since we don’t really care in what order we picked the ball, we divide 990 with the number of possible arrangements of ball 1 and ball 2, which is 2! (where the exclamation mark stands for “factorial”). So far, the number of ways to label 2 balls with the same number is

\displaystyle \frac{100 \times 99}{2!}

Since these are the only balls with label 1, the third ball can be labeled anything except number 1. In that case, there are 59 ways to label ball 3. In the same way, there are 59 ways to label ball 4. Continuing this reasoning until ball 100, the total ways we can label 2 balls with number 1 and the rest with anything else is therefore:

\displaystyle \frac{100 \times 99}{2!} \times 59^{98}

Notice that the exponent of 59 is 98 since there are 98 balls starting from ball 3 to ball 100.

Therefore, the probability of having two balls in the same bin is

\displaystyle \frac{100 \times 99}{2!} \times \frac{59^{98}}{60^{100}} = 0.2648

We can also write this as

\displaystyle \frac{100!}{2! \times 98!} \times \frac{(60-1)^{98}}{60^{100}} = \binom{100}{2} \frac{(60-1)^{98}}{60^{100}}

In general, if m is the number of balls, n the number of bins and k the number of balls with the same label, then the probability of having k balls within the same bin is given by

\displaystyle \binom{m}{k} \frac{(n-1)^{m-k}}{n^{m}}

,

where

\displaystyle \binom{m}{k} = \frac{m!}{k!(m-k)!}

is the binomial coefficient.

It turns out that this is a probability distribution since the sum of all probabilities from k=0 to k=m is equal to 1. that is

\displaystyle \sum_{k=0}^{n} \binom{m}{k} \frac{(n-1)^{m-k}}{n^{m}} = 1

To see this, recall from the Binomial Theorem that

\displaystyle \big( x + y \big)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k}y^k

If we let x=n-1 and y=1, we can write the above equation as

\displaystyle  \begin{array}{ll}  \displaystyle \sum_{k=0}^{m} \binom{m}{k} \frac{(n-1)^{m-k}}{n^{m}} &= \displaystyle \sum_{k=0}^{m} \binom{m}{k} \frac{(n-1)^{m-k}\cdot 1^k}{n^{m}}\\  &= \displaystyle\frac{(n-1+1)^m}{n^{m}}\\  &= \displaystyle\frac{n^m}{n^m}\\  &= \displaystyle 1  \end{array}

Here is a graph of this probability distribution.

probability_distribution

Here’s the plot data:


probability
1 0.315663315854
2 0.264836171776
3 0.146632456689
4 0.060268424995
5 0.019612775592
6 0.005263315484
7 0.001197945897
8 0.000236035950
9 0.000040895118
10 0.000006307552

We can see that for k=9, the probability of it occurring is .004%. Anything beyond that we can call rare and no need to spend money with.

Just For Fun

What’s the probability that a given bin is empty, that is, there are no balls in it?

Other Probability Distributions

Our computation above was based on a uniform probability distribution. However, there are other distributions that are more suitable for arrival of requests. One of the most widely used is called the Poisson Distribution where you can read from here.

R Code

The R code to generate the simulation:

par(mfrow=c(4,4))
f=function(){
  x=c()
  for(i in 1:100){
    x=c(x,sample(seq(1,60),1,replace=T))
  }
  plot(tabulate(x),type="h", ylab="tx", xlab="secs")
}

for(i in 1:16){
 f()
}

The R code to generate the probability distribution:

p=function(m,n,s){
 prod(seq(m,m-s+1))/factorial(s)*(n-1)^(m-s)
}

tt=c()
for(i in 1:10){
 tt=c(tt,p(100,60,i)/60^100)
}
plot(tt,type="h",xlab="Number of Balls",ylab="Probability")

Dedication

This post is dedicated to my friend Ernesto Adorio, a mathematician. He loves combinatorial mathematics.

Rest in peace my friend! I miss bouncing ideas with you.

New Blog Location

I migrated my blog from a hosted linux server which I maintain into wordpress.com hosting service. Times are changing. I used to have more time before to maintain my site, upgrade my linux applications to the latest security patches, configure my own DNS server, email server, http server, what have you… No I just don’t have the luxury of time. It’s good that we have this hosting providers doing the job for you for free!!!

I thought that migrating my old wordpress blog from the old site to the new site would be a challenge. I was wrong. Thank you wordpress for making my migration painless.

I just followed the instructions from here to export and import my blog:

http://en.support.wordpress.com/import/

and it worked like a charm!

A Tribute to My Manager

janlategahn.jpgI have been under many managers in my IT career which spans over 11 years. They come in all shapes and sizes. But one manager stands out and this is a tribute to him.

My manager’s name is Jan Lategahn. He was my manager for one year in an IT project. But it was one of the most productive years of my life. He was different from all my previous managers because of the following qualities which I will enumerate below.

My manager inspires his subordinates. A manager who does not know how to do the work of his subordinates will find it hard to inspire. The metaphor that usually comes to my mind is that of a Shogun; the most skillful swordsman in his territory. Everybody treats him with respect. The same is true for a manager who is the best programmer in his team.

My manager is an expert programmer. He is an expert in SAP and UNIX. Being a programmer he knows that programming is a creative process and sometimes it is just difficult to predict when you can finish a piece of code. He doesn’t give unrealistic deadlines but if the deadline can’t be moved, he knows how to adjust the scope or to give you all the help you need to succeed.

Once my manager went on a vacation and I was so swamped with work. When he returned to work and found me working till 12 midnight that week he was so kind enough to offer his assistance. I was surprised when he approached my desk and asked me “what can I do to help you?” I asked him to install the server in the uat environment which he did in just two hours. After that he returned and asked for more tasks. I’ve never had any other manager ask me for tasks (which is not surprising; they don’t know how to do it).

When I went to Singapore for my vacation I brought my laptop just to make sure I can do support when extremely needed. I was expecting my manager to call me anytime and ruin my vacation. But instead I got a text message telling me about places I can go and enjoy in Singapore. After that he let me alone in my vacation until I returned to work. Later on he told me he had a rough time during those days and that he must have aged six years when I was as away. I really appreciate what he did and feel so lucky he can do without me.

My manager is not afraid to show his humanity. He is not afraid to show to his subordinates that he sometimes feels tired; that he dislikes unproductive meetings and chatterboxes. He would rather code something than waste his time in meetings. Other managers look like supermen who don’t seem to feel tired and never complain. My manager gives us an example that it’s okay to be human.

Once we had a visit wih the big bosses from onshore. And we had to behave the whole week because they were sitting near our cubicles. On the last day when the big bosses finally left, my manager was not ashamed to show a sigh of relief and say “it was very hard to breathe with them around” or so along those lines. We were laughing, but I really appreciated that reaction. It showed me that my boss is also human and whatever ordinary things we feel, he feels it too.

My manager has a great personality. He always greets people when he arrives in the morning and always finds time to do a quick chat about anything under the sun. I have learned so much from him about electronics, WI-FI hacking, etc.; things not directly related to work. In fact when he’s absent from work due to some illness, I find the team spirit not so happy. He brings life to the team. In my previous work we feel elated when our managers are not around, in my manager’s case the opposite is true. I even have to send a text message to him to ask how he is doing.

And most of all, my manager is a friend to his men. Other manager’s talk about open door policy but always maintain the distance from their subordinates. Not so with my manager. I can always speak my thoughts to him without reservation because he has become a friend.

To my manager, mentor and friend, I wish you only the best in life!

Do It Yourself Supercomputing in Linux Part 1

If you recently purchased a laptop or desktop computer, chances are you have a dual core system. There does not seem to be any indication anymore of us going back to single core unless some technological breakthrough will break the power barrier of CPUs. This means that more and more people will have a high powered computer in their homes without any idea how to harness such power.

What does it mean to have a dual processor? On first impulse, you probably will think it will speed up the execution of your programs. You would probably perceive a significant difference between the response time of your programs in a single core versus a dual core. But why do programs run faster in dual processor systems? For one thing, the speed of each processor is faster as compared to older single processors. The other thing is due to symmetric multiprocessing. An analogy of the dual processor system is a bank with 2 tellers and having a single line for customers. When a teller is done with one customer, it will get the next customer from the line. You can read about the performance of multiprocessor systems in this article.

There is another way you can make use of multiprocessors to make your programs run faster. This is through parallel programming. Parallel computing has been around for a long time already but are usually confined to university laboratories or supercomputing facilities. It has not caught the interest of the common people because they have no access to such machines. However, the future is already about multi-processing. More and more of these machines will reach the masses. This means that highly talented members of the masses get to experiment with parallel computing on a day-to-day basis. We are in for another revolution in computing. Are you ready for this revolution?

Continue reading Do It Yourself Supercomputing in Linux Part 1

The Best Place in Singapore

I was fortunate enough to go to Singapore in October of last year. We arrived in Changi airport about lunch time and it was raining. I was expecting it as some of my friends told me it rains every now and then in Singapore. I was immediately impressed by the airport. Although I have several big airports before when I was in San Franciso, USA and Dubai, this airport was refreshing. We took a cab from the airport to our hotel (Peninsula Excelsior). The scenery was really beautiful as we drove from the airport to the city. The city itself was nothing compared to anything in the Philippines. The streets were so clean and it was very hard for us to find any cigarette butt.

Our first destination was the Zoo. I’m not good with remembering names so I can’t remember the name of the Zoo. But it was the first time I entered a zoo where I saw many animals for the first time. Well, I don’t really go to the zoo, even here in the Philippines. It’s one of those places that are low in my priority. However, I was not disappointed when I went to Singapore zoo.

Our hotel was very near the place of our conference so we just walk it on the way home, utilizing the underground mall that connects Suntec to a train station near our hotel. I can’t remember the name of that underground mall, or if there is a name to it. It was just a pleasant walk. You can shop as you go to the hotel.

We stayed in Singapore for 6 days. Before we left for the Philippines, my wife and I went to Sentosa. It was the place in the top of her list to go. The place was really great. We went to this underwater museum where you can see esoteric fishes and crustaceans. There is already one here in the Philippines but I haven’t gone there yet. We went to Sentosa by train and went out of it by cable car. The view was really great!

On the first day, my wife asked me if I would consider going to Singapore next time for vacation. I told her the city was beautiful, very modern and clean but that there was really no compelling reason to go there. I can live with what we have in the Philippines.

However, all that changed when I went to Borders bookstore and Kinokuniya. There is nothing of such beauty and magnitude in the Philippines as those bookstores. I can spend my entire day just browsing through all the books. And that would be just one row of bookshelves! So the best place in Singapore for me are the bookstores and if I have my way, then the reason I will return to Singapore is to visit the bookstores.

Browsing Books at Borders

John Wheeler, A great physicist!

John Archibald Wheeler is one of my most admired physicist. He just died recently and I want to make a small tribute to him to did so much in General Relativity. Wheeler is the person who coined the word “black hole”, a concept which captivated my imagination when I was yet a kid and a concept which I eventually studied on my own in the University. He wrote one of the best books in General Relativity entitled “Gravitation”. This is a thick and heavy book full of physics and mathematics. It introduces advanced mathematical concepts as you progress in your study of Relativity in a very geometric way. This book has full of illustrations that really whets your appetite for studying advanced physics.

John Archibald Wheeleer

I have bought many books authored by Wheeler. One of them is “Exploring Black Holes“, with Taylor as co-author. I bought this book in Borders bookstore in Singapore. Unfortunately, we don’t have these kinds of books in the Philippines. A friend of mine also lent me a layman’s book written by Wheeler entitled “Geons, Black Holes & Quantum Foam”. This book gave accounts on Wheeler’s contribution to the Manhattan Project and the other great people whom he worked with.

I have a professor before who took his PhD in US who told me that whenever Wheeler gives a lecture at 7 am in the morning, he will always attend it no matter how cold it is in New York. That’s how big an impact Wheeler had on him.

To the great man Wheeler, thank you for inspiring us to study one of the greatest theories in the 20th century!

Robots and Emotions

I just came across this article entitled “The rise of the Emotional Robot” which talks about owners of Roomba robot dressing them up and even giving them names and gender. People are now getting attached to their robots as if they were part of the family. What I find funny is that we put a lot of value to these human creations forgetting that there are real humans out there who are of greater value by virtue of being human and are living in wretched conditions. I don’t mean to say that attaching ourselves to our robot is a bad thing but that when we do these things we should also be aware of real people out there who also need the same care and attention.