Performance Arithmetic

When analyzing performance issues in computer systems, it is always a good thing to have a feel for the overall picture and be able to make back-of-the-envelope calculations. In this article, we are going to familiarize ourselves with the various performance metrics and how they relate to one another. We then apply these concepts to be able to compute the maximum throughput and minimum response time of a given system.
Performance can be viewed from different perspectives. For a manager, the most important metric is the throughput. How many transactions can be completed in an hour. The greater the throughput, the more money for the business. On the other hand, for a customer, the most important metric is the response time. How fast can I get the response? Do i have to wait forever to get the next page?

Throughput is defined as the number of completed transactions per unit of time. This unit of time is usually in “seconds”. For a website, throughput can be the number of requests in a day. If a site can complete one million transactions per day, then the thoughput is 1,000,000/(24*3600), since there are 24 hours in a day and 3600 seconds in an hour. We denote throughput by X and define it as:

X_i = C_i/T

where C_i is the number of completed transactions of resource i and T is the observation time.

A system can be composed of more than one resource connected with one another. The overall system throughput is denoted by X_0 and defined similarly as above:

X_0 = \frac{C_0}{T}

where C_0 is the number of transactions completed by the system.

Throughput depends on how fast a resource can complete a transaction. The time it takes for a server to complete a transaction is called the service time S_i. If a CPU takes on the average 30 msecs to complete a computation, then the service time for that type of transaction is 30 msecs.

In a typical 8 hour day, an employee does not really consume the whole 8 hours working. There are periods where the employee is idle, or is not doing useful work. The number of hours he/she is working on the average is less than 8 hours. If we denote the number of hours busy as B, then the Utilization is defined as


where T is the total working hours. If in a one hour observation period, the CPU is utilization is 40%, then it was busy 0.40*(3600 s) = 1440 seconds and idle the rest of the time.

There is an interesting relationship between the utilization, service demand, and throughput. This is known as the Utilization Law:


for a given resource i. This can be seen easily by dividing the numerator and denominator of the utilization by \displaystyle C_i:

\displaystyle U_i =\frac{B_i}{T}
\displaystyle =\frac{B_i/C_i}{T/C_i}

But B_i/C_i is average time per transaction completed, which is precisely the service time. Therefore

\displaystyle U_i =\frac{S_i}{T/C_i}
\displaystyle =\frac{S_i}{1/X_i}
\displaystyle = X_iS_i

If a website should process 1,000,000 requests in a day, how fast should the CPU process each request in order to maintain a utilization of 15%? Using the utilization law we have,
\displaystyle S_i =U_i/X_i
\displaystyle =15/(1000000/(24*3600))
\displaystyle = 0.01296
\displaystyle = 12.96\text{msec}

Consider a web server that is composed of a CPU and 2 disks. A typical transaction may use the CPU and both disks one or more times before completing. Let us denote the number of times a transaction visits a resource i as V_i. Then the total time a transaction spends on a resource i is V_iS_i where $S_i$ is the service time of resource i. This is called the Service Demand D_i of resource i:

\displaystyle D_i = V_i S_i.

There is an interesting relationship between the number of times a transaction visits a resource i and the system throughput. If V_i is the number of times a transaction visits resource i, then

\displaystyle X_i = V_i X_0
\displaystyle V_i = \frac{X_i}{X_0}
\displaystyle = \frac{C_i/T}{C_0/T}
\displaystyle = \frac{C_i}{C_0}

By definition of the service demand:

\displaystyle D_i =V_i S_i
\displaystyle = \frac{C_i}{C_0}S_i
\displaystyle = \frac{C_iSi/T}{C_0/T}
\displaystyle =\frac{U_i}{X_0}

The above result is called the Service Demand Law:

\displaystyle D_i= \frac{U_i}{X_0}

The service demand is a very important quantity. It enables us the compute the maximum throughput of the system. Since X_0=U_i/D_i, as the utilization increase to 100%, the throughput also increases. However, when U_i=100%, the throughput cannot be increased any further. Therefore, the throughput is bounded above by


Therefore, in a given system, the resource with the highest service demand limits the throughput and is therefore the bottleneck.

Let’s have an example. Suppose that the utilizations of the CPU,disk1 and disk2 are measured and given as 15%, 35% and 30% respectively. Furthermore, suppose that the current throughput is 500,000 transactions per day. Calculate the maximum throughput of this configuration.

The system throughput is X_0=500000/(24*3600)=5.79. The service demand of the CPU is

\displaystyle D_\text{CPU} = \frac{U_\text{CPU}}{X_0}
\displaystyle = \frac{0.15}{5.79}
\displaystyle =0.026

Similar computations for disk1 and disk2 yeild:

\displaystyle D_\text{disk1} = 0.06
\displaystyle D_\text{disk2}=0.052

Since disk1 has the highest service demand, this resource limits the throughput. The maximum throughput this configuration can attain is:

\displaystyle X_0 \le \frac{1}{D_\text{disk1}}
\displaystyle = \frac{1}{0.06}
\displaystyle = 16.67

In one day, the maximum number of transactions is therefore

\displaystyle 16.67\times 24\times 3600=1,440,000

In the next article, we are going to learn the minimum response time this system will attain. It can never do faster than that minimum.


Published by

Bobby Corpus

Is an IT Architect.