Traceability: Background

Over the course of my career, I have often been asked to investigate software issues. The typical tools for such an investigation are log files, screenshots, database queries, and perhaps debuggers. Each of these has severe limitations, however. Logs may contain pre-state, post-state, and perhaps intermediate data, but are limited to what a developer predicted would be useful. Screenshots are highly dependent on timeliness, and typically show only the end result of a service call. Database queries typically return only post-state data (and perhaps pre-state data, if a specific table has audit records). Debuggers are useful for viewing intermediate states, but paradoxically their view is often very narrow while still making the specific data being used hard to find (in short, the debugger's interface is still severely limited).

While troubleshooting a particularly complex set of calculations, I came to the conclusion that the standard troubleshooting tools were simply not enough. What I needed was a solution that would show me all of the functions that were called, every branch decision that was made, and every value used in those functions. In short, I wanted to see what every line of code was doing.

In this paper, I will lay out my specific goals, the overall design concept, and ways to implement the design. The paper assumes that the reader has a moderate amount of programming experience and a good understanding of inheritance and polymorphic behavior, with specific emphasis on the decorator and factory design patterns. In addition to describing how to use these design patterns, I will touch upon other technologies that make Traceability easier to develop and maintain; however, in-depth discussion of these enabling technologies is beyond the scope of this paper. Finally, I will discuss other elements that need to be addressed when designing Traceability into a system; these include, but are not limited to, security and regulatory concerns.

It should be noted that logging using decorator classes is not a new concept. In fact, all Traceability does is take the same idea and use it to extract a hyper-detailed level of information.

Traceability: Objectives

What I really desired was the ability to log every value used in the service call, as well as the result of every function call made by the service call. However, that level of logging would likely have a huge impact on performance and take up a significant amount of disk space, and the vast majority of the time that detail would not be needed.

Keeping in mind that full-blown logging was not feasible, I specified the following requirements:

  1. The name and value of each variable would be recorded;
  2. The name of each function called would be recorded;
  3. The name and value of each parameter passed into a function would be recorded;
  4. The result from each function call would be recorded;
  5. The additional information gathering would not impact performance;
  6. The data could be requested on demand;
  7. The amount of work to maintain the traceability would be minimized;
  8. Safety measures would exist to ensure that the traceability was not accidentally used in a customer-facing production system.

The solution I came up with consisted of dividing the code into two layers: the Functional Layer and the Traceable Layer. The Functional Layer would have all the standard trappings of a typical service, while the Traceable Layer would contain all the tracing code. These separate layers allowed me to address requirements 5 and 8 at the same time. Since the layers were separate, I was able to put each layer in its own library. Only the library for the Functional Layer would be deployed on customer-facing systems, thus ensuring no performance impact there. The Traceable Layer library would be deployed only on systems that support personnel have access to. Having separate support systems also fulfills requirement 6. Finally, keeping the Traceable Layer on a non-customer-facing system prevents the accidental disclosure of system or code information to a customer.
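
As a rough illustration of the two-layer split, the following is a minimal sketch only. The class names PriceCalculator and TraceablePriceCalculator, the ApplyDiscount method, and the plain List<string> used as a trace sink are all hypothetical names chosen for this example, and the sketch assumes the Traceable class wraps the Functional class by subclassing and overriding virtual methods. Unlike the pseudo-C# used elsewhere, it is written to compile, but it is not the actual implementation.

```csharp
using System.Collections.Generic;

// Functional Layer (hypothetical example): plain business logic with no
// tracing code. In the layout described above, this class would live in
// the library deployed to customer-facing systems.
public class PriceCalculator
{
    // Declared virtual so that a Traceable subclass can wrap the call.
    public virtual decimal ApplyDiscount(decimal price, decimal percent)
    {
        return price - (price * percent / 100m);
    }
}

// Traceable Layer (hypothetical example): would live in a separate library
// deployed only to support systems. It records the function name, the
// parameter names and values, and the result, then defers to the
// Functional Layer for the actual work.
public class TraceablePriceCalculator : PriceCalculator
{
    // Assumption: a simple in-memory list stands in for a real trace sink.
    private readonly List<string> _trace;

    public TraceablePriceCalculator(List<string> trace)
    {
        _trace = trace;
    }

    public override decimal ApplyDiscount(decimal price, decimal percent)
    {
        _trace.Add($"ApplyDiscount(price={price}, percent={percent})");
        decimal result = base.ApplyDiscount(price, percent);
        _trace.Add($"ApplyDiscount => {result}");
        return result;
    }
}
```

Calling code that holds a PriceCalculator reference behaves identically whether it is handed the Functional or the Traceable object; only the Traceable one fills the trace list. Because the Functional class contains no tracing code, leaving the Traceable library off a customer-facing system removes the tracing entirely.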

While my original goal was to gain a highly detailed perspective of a system for the purposes of troubleshooting, I found that I had gained much more. During coding and testing, I was able to use the Traceability to determine exactly how the code was processing the data. I could also show the data to the QA personnel and discuss with them how the code was working; we could then compare those results against the specifications and truly determine whether the code was functioning as desired.

Coding Languages

My original code development for Traceability used C#, and thus the examples in this paper are in pseudo-C#. While the paper's own examples are not meant to compile as written, they are designed to convey the concepts under discussion. It should be noted that these techniques may be applied to any language that implements inheritance and polymorphism.

The differences between languages should have little impact on the overall concept used to implement Traceability. For example, while C# has properties and Java does not, you only need to realize that a C# property is really syntactic sugar for a getter function and a setter function that share a single name.
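
To make the property comparison concrete, here is a small sketch. The class names AccountWithProperty and AccountWithMethods and the Balance member are hypothetical, chosen only for illustration. The first class uses a C# property; the second expresses the same thing with explicit getter and setter functions, the style a language without properties would use.

```csharp
// C# style: one name (Balance) with a get accessor and a set accessor.
public class AccountWithProperty
{
    private decimal _balance;

    public decimal Balance
    {
        get { return _balance; }
        set { _balance = value; }
    }
}

// Equivalent getter/setter style: two separately named functions.
public class AccountWithMethods
{
    private decimal _balance;

    public decimal GetBalance() { return _balance; }
    public void SetBalance(decimal value) { _balance = value; }
}
```

For tracing purposes the two forms can be treated the same way: each accessor is just a function whose call, parameter, and result can be recorded.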

Another aspect to consider is whether the language supports automatic code generation and reflection. As will be shown later, both of these tools are extremely useful for maintaining the Traceable Layer. If your language of choice lacks built-in automatic code generation, then you will need to write your own.
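
As a hedged sketch of why reflection helps here, the following uses the standard System.Reflection API to list the public virtual methods a class declares, which is the kind of information a code generator would need in order to emit matching overrides. The OrderService class and the OverrideLister helper are hypothetical names invented for this example; this is not the generator discussed later in the paper.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical Functional Layer class used only as reflection input.
public class OrderService
{
    public virtual decimal Total(decimal price, int quantity) { return price * quantity; }
    public virtual string Describe(int orderId) { return "Order " + orderId; }
    public int NotOverridable() { return 0; }
}

// Hypothetical helper: collects a signature string for each method that a
// Traceable subclass would be able to override.
public static class OverrideLister
{
    public static List<string> ListOverridable(Type type)
    {
        var signatures = new List<string>();

        // DeclaredOnly skips members inherited from System.Object.
        MethodInfo[] methods = type.GetMethods(
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly);

        foreach (MethodInfo method in methods)
        {
            // Only virtual, non-sealed methods can be overridden.
            if (!method.IsVirtual || method.IsFinal) continue;

            var parameters = new List<string>();
            foreach (ParameterInfo p in method.GetParameters())
            {
                parameters.Add(p.ParameterType.Name + " " + p.Name);
            }

            signatures.Add(method.ReturnType.Name + " " + method.Name
                + "(" + string.Join(", ", parameters) + ")");
        }
        return signatures;
    }
}
```

Calling OverrideLister.ListOverridable(typeof(OrderService)) yields one entry each for Total and Describe and omits NotOverridable. The order of the entries is not guaranteed by GetMethods, so a generator should not depend on it.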

Table of Contents

Design Concepts: The overarching design for implementing Traceability.
Code Organization: A high-level overview of how to organize the code so as to ease maintenance and more easily visualize how the various parts are related.
System Organization: A high-level overview of how Functional and Traceable code should be placed into the production and testing environments.
Functional Layer: Design requirements in the Functional Layer which allow Traceability to be attached.
Traceable Layer: Discusses the requirements to implement the Traceable Layer.
Traceable Layer Return Types: Discusses additional work needed so that a Traceable class may return Traceable objects from functions.
Formatting Traceable Objects: Discusses ways to represent trace data so that it is more useful.
Tracer: An example API used to record the trace data.
Factories: How to use the factory class pattern to ease maintenance.
Factories, Initializing: Continues the discussion on factories that contain the object initialization code.
Factories For Returned Types: Discusses the use of factories for generating return types.
Daos: How to connect Data Access Objects to both the Functional Layer and Traceable Layer.
Meta-Factories: Discusses how to ensure Traceable coverage and simplify maintenance.
Code Generation: Gives an overview of how to use reflection and automated code generation to greatly reduce maintenance.
Code Analysis: Covers how to interpret trace data as well as how to use it to analyze workload.
Tool Design: Covers ways to set up tools to safely retrieve the trace data, as well as other considerations to keep in mind.
Legal Aspects: Covers some considerations to keep in mind when handling sensitive data.
Conclusion
Source Links