ILogger Interface

The ILogger interface allows us to abstract away the actual logging mechanism. This in turn means even the Logging Layer does not specifically care where the logged data is being sent to. This gives us the ability to completely change between different storage solutions for the logged information very quickly.

One use of this flexibility would be to move from writing the logs to a file system, to sending them to a database or microservice. Eventually, if someone comes up with a better / faster storage solution, we would be able to switch over to the new one with significantly reduced effort.

Logging Context

Beyond simply writing log lines and recording metrics, we can also use the Logging Layer to keep track of specific data that should be recorded every time a log line is written. This can easily be implemented by creating a context object that contains the data, and than have that context passed into the logged class during construction. Using the context object also gives us an advantage in that we do not have to potentially pass fields around just in case an error does occur. Instead, we store them in the context, and if an error does occur, then the logging system has access to them. Below are list of fields that should be included in the logging.

CustomerID: By including the CustomerID, we are able to easily identify which log lines were generated by which customer's activities. If we search a centralized logging system for a specific customer, we are able to quickly find all of the log lines related to that customer. This may help us locate a specific issue the customer was having, or perhaps lets us gather comprehensive data about the customer over a period of time.

To extend this even further, if we are investigating an issue for a customer, and find a specific error in the log files, we are able to search for the error, and then get back a complete list of all impacted customers.

I cannot stress enough how useful the CustomerID is when troubleshooting an issue.

RequestID: was first mentioned in Request / Response Logging. We can retrieve this value when the request first comes into the service call, and then store it in the context. After that point, we automatically write the RequestID to any logs. This in turn lets us locate any logs generated as a result of a specific service call.

Service Name: If you are using a centralized logging system, then be sure to include the name of the service generating the logs. On the other hand, if the logs are being written to a file on the server that the Service resides upon; we should instead clearly name the log and file path such that service Name is clearly indicated.

Server Name: If you are using a centralized logging system, then be sure to include the name of the Server generating the logs. This is particularly important if only one server is having issues, since we want to quickly locate and investigate it. If the logs are being written to a file on the server, then we can ignore this, since we know exactly which server generated the logs.

Module Name: If you are using a centralized logging system, then the module name may be used to quickly identify a sub-component of the service that has written the logs. IF the logs are being written to a file on the harddrive, then the module name would be specified by the log's file name.

Formatting Logs

Since we are able to control the data that is written to the logs, it is a good idea to setup a standard format for the information being written to the log. For example, if writing to a file you may want to have the data laid out like this: yyyy-mm-dd hh:mm:ss.sss LogLevel ServiceCall=Request Account=1234567890 RequestID=ABC123 ...

By creating formatting libraries that is used by the Logging Layer, we are able to create a very consistent data format across many services. This in turn gives us the ability to more easily search and parse the logged information.

Additional Logging Considerations

This section covers more general logging concepts. These would actually be implemented in the Functional Layer, since they include data that is created by the Functional Layer code. I am including them here as an additional element to take into consideration when writing error handling logic.

Date/Time: since it very useful to know when the logged event occurred. If the logging system is writing to a file located on the server, then we need to make sure that the date and time are automatically written as well; if it is not automatically written, then we need to make sure we do include it explicitly.

If we are using a logging service or database to record the log lines, then we must explicitly include the date / time from the service generating the logged data. There may be a temptation to simply use a date/time generated by the database that the logging service stores the log lines in; however I have personally seen situations where the queue where the logged info was sent to has backed up, and thus the database insert time was significantly off.

Error Data: If something causes an error to occur, we need to log more than the simple fact that an error occurred. We need to log the value that caused the error as well as why that value caused an error. For example, I had to deal with a service that logged an error that simply said "Invalid date was rejected". This log line is effectively useless when trying to troubleshoot the issue. What it needed to show was the date value that failed, as well as the valid date range that would have been accepted.

Default Instantiation: in an error causes a default value to be used, that fact should be logged as well. For example, if a service call, and thus a specific value is not retrieved, and so the system uses a default, we should have log lines that look something like this: Call to ServiceName failed due to error: errorExplanation Using default value of {defaultValue} for {fieldName}

If an entire object is being used as a default value, then we need to specify the exact object type that was used. Specifying the exact object type is especially important when polymorphic code is being used. If we simply state the base type was being created, the log may have imprecise information in it, which will delay proper investigation of an issue, or even worse, the investigation results in a completely wrong conclusion.

Be Precise in describing the error. Never have an error message that indicates multiple possible reasons for an error. All that does is slow down investigation since multiple potential issues have to be investigated. If there are three reasons that a specific error might be generated, there should be three different error messages, and only the appropriate one is actually written to the log.

Next: Factories