Traceability Matrix

I work in Information Technology, and part of my job is to make sure that my team and I can determine the impact of introducing changes in our code. Systems and components do not live in isolation. Most of the time, systems are composed of simple components strung together to make up a complicated piece of software. By the same token, some complex systems are themselves composed of other systems. Introducing a change in one of the basic parts of a system can affect other components or systems that depend on it.

You can picture these dependencies by drawing a dependency graph: a set of circles, one per component, connected by arrows. The direction of each arrow shows which component depends on which. In the figure below, component A depends on components B and C, while component D depends on component C. Nothing prevents two components from depending on each other, either. For example, component C also depends on component D.

If a change is introduced in component F, we can see that components B, C, A, and D are affected. In the real world, we usually don’t draw these diagrams, because for very large systems the drawing becomes cluttered and it is very hard to figure out the dependencies just by looking at the graph. In practice, we create a two-dimensional matrix that depicts the dependencies among components. Several Enterprise Architecture tools already exist to do this automatically. However, if you are on a tight budget, you still have some tools at your disposal. This article demonstrates how to create the dependency matrix diagram using a tool called R.

Create a csv file that contains the dependency information as shown in the diagram above, one dependency per line. The contents of the csv file will look like this:

a,b
a,c
b,f
c,d
c,f
d,c
d,e
e,g
f,g
g,e

The first two lines of the file above simply mean that A depends on both B and C. The remaining lines read the same way. To convert this into a dependency matrix diagram, we create an R script that reads this csv file and outputs an image:

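x <- read.csv("traceability.csv", header = FALSE)   # one dependency pair per row, no header
z <- table(x)                                        # cross-tabulate the pairs into a 0/1 table
image(1:ncol(z), 1:nrow(z), z = t(as.matrix(z)), col = c("white", "red"), xaxt = "n", yaxt = "n", xlab = "Component B", ylab = "Component A")
abline(h = 1:nrow(z) + 0.5, v = 1:ncol(z) + 0.5, col = "black")   # grid lines between cells
axis(1, at = 1:ncol(z), labels = colnames(z))        # x axis: components depended on
axis(2, at = 1:nrow(z), labels = rownames(z))        # y axis: dependent components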
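
If you want the result as a file rather than on the screen, the same steps can be wrapped in a function. This is only a sketch: the helper name save_traceability_plot is made up for illustration, the data frame is typed in directly (using the pairs implied by the table of z printed in this article) instead of being read from traceability.csv, and pdf() is used as the output device. You could swap in png() if your R build supports it.

```r
# Dependency pairs typed in directly; each row reads "V1 depends on V2".
# Assumption: these are the pairs implied by the table of z in this article.
x <- data.frame(
  V1 = c("a", "a", "b", "c", "c", "d", "d", "e", "f", "g"),
  V2 = c("b", "c", "f", "d", "f", "c", "e", "g", "g", "e")
)

# Hypothetical helper: draw the dependency matrix into a PDF file.
save_traceability_plot <- function(pairs, file) {
  z <- table(pairs)
  pdf(file)              # open a file device instead of the screen
  on.exit(dev.off())     # close the device when the function exits
  image(1:ncol(z), 1:nrow(z), z = t(as.matrix(z)),
        col = c("white", "red"), xaxt = "n", yaxt = "n",
        xlab = "Component B", ylab = "Component A")
  abline(h = 1:nrow(z) + 0.5, v = 1:ncol(z) + 0.5, col = "black")
  axis(1, at = 1:ncol(z), labels = colnames(z))
  axis(2, at = 1:nrow(z), labels = rownames(z))
  invisible(file)
}

out_file <- file.path(tempdir(), "traceability.pdf")
save_traceability_plot(x, out_file)
```

The plotting calls are the same as in the script above; only the device handling is new.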

How It Works

Line 1 above reads the csv file. The output is a data frame that is assigned to the variable x. The next line creates a contingency table of the data in x. Here is a dump of the value of z:

V1  b c d e f g
  a 1 1 0 0 0 0
  b 0 0 0 0 1 0
  c 0 0 1 0 1 0
  d 0 1 0 1 0 0
  e 0 0 0 0 0 1
  f 0 0 0 0 0 1
  g 0 0 0 1 0 0

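It is a matrix filled with 1’s and 0’s. A 1 means that the component in that row depends on the component in that column. For example, the 1 in row a, column b says that a depends on b. In the image, the rows run along the y axis and the columns along the x axis.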
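
To make the row and column reading concrete, here is a small sketch that looks things up in z directly. It assumes the csv holds exactly the ten pairs implied by the table above, and it builds the data frame inline rather than reading traceability.csv.

```r
# Dependency pairs implied by the table above; each row reads "V1 depends on V2".
x <- data.frame(
  V1 = c("a", "a", "b", "c", "c", "d", "d", "e", "f", "g"),
  V2 = c("b", "c", "f", "d", "f", "c", "e", "g", "g", "e")
)
z <- table(x)

# Read along a row: which components does "c" depend on?
depends_on_c <- names(which(z["c", ] == 1))
print(depends_on_c)    # "d" "f"

# Read down a column: which components depend directly on "e"?
dependents_of_e <- names(which(z[, "e"] == 1))
print(dependents_of_e) # "d" "g"
```

Reading down a column is the lookup you need for impact analysis, because it lists the components that are directly affected when that column's component changes.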

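The third line of the code creates an image of the matrix, where the color of each cell is determined by the corresponding value in t(as.matrix(z)). The matrix is transposed because image() lays the rows of its input along the x axis and the columns along the y axis, and we want the columns of z on the x axis. A value of 0 is drawn in white, while a value of 1 is drawn in red.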

The resulting image does not really look nice on its own. The remaining three lines of code add grid lines and axis labels to give an image like the one below:

How to Read It

Reading the traceability matrix is not as straightforward as one would hope. This is because a change in one component can impact a second component, which in turn impacts the components that depend on it. This is called a transitive dependency. For example, a change in component E will directly impact G and D (find E on the x axis and read up its column). However, a change in G will impact F and E, which in turn impact B, C and D, and B and C impact A. In spite of the transitive dependencies, a tool like this is very useful when doing impact analysis.
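
Following transitive dependencies by eye gets tedious, so here is a minimal sketch of how the full set of impacted components could be computed instead. It is not part of the original script: the helper name impacted_by is made up for illustration, and the edge list is typed in from the pairs implied by the table of z rather than read from traceability.csv. The function repeatedly collects the components that depend on anything already known to be affected, and stops when no new ones turn up.

```r
# Dependency pairs implied by the table of z; each row reads "from depends on to".
edges <- data.frame(
  from = c("a", "a", "b", "c", "c", "d", "d", "e", "f", "g"),
  to   = c("b", "c", "f", "d", "f", "c", "e", "g", "g", "e"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: every component affected, directly or transitively,
# by a change in `changed` (the changed component itself is left out).
impacted_by <- function(edges, changed) {
  seen <- changed
  frontier <- changed
  while (length(frontier) > 0) {
    # Components that depend directly on anything in the current frontier.
    dependents <- unique(edges$from[edges$to %in% frontier])
    # Keep only those not visited yet, so cycles such as c <-> d terminate.
    frontier <- setdiff(dependents, seen)
    seen <- union(seen, frontier)
  }
  sort(setdiff(seen, changed))
}

impact_f <- impacted_by(edges, "f")
print(impact_f)   # "a" "b" "c" "d"

impact_g <- impacted_by(edges, "g")
print(impact_g)   # "a" "b" "c" "d" "e" "f"
```

The result for F matches the earlier observation that a change in F affects B, C, A, and D, and the result for G shows that a change there eventually reaches every other component.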