The sprawling landscape of cyber threats and the growing regulatory and statutory requirements have seen many businesses scrambling to find better approaches to system logging. Whether it’s routine security monitoring or a need to investigate a problem, logs provide valuable information that indicates the severity of the issue and insights as to what could be the underlying cause.
According to Mezmo, logging helps you trace activity to specific sessions, users, and requests, all of which ultimately leads to faster and more accurate troubleshooting. Irrespective of the size and complexity of your enterprise systems, useful logging comes down to three critical principles—collecting, customizing, and centralizing the logs (or ‘the Triple Cs of Logging’).
Log collection is where it all begins. Managing log collection across a wide range of systems has been a stubborn headache for system administrators. To get the most out of your log data collection, applying certain tried and tested best practices is vital.
First, the clock on every system you collect logs from must be synchronized to a single, reliable time source. If your clocks aren’t synchronized, it’s virtually impossible to correlate events taking place across multiple systems correctly. You have several options for enforcing this. The most expensive is setting up a dedicated time server on your network that uses radio time signals to synchronize with reference clocks. The cheapest is developing or installing an application that polls a web-based Network Time Protocol (NTP) server like time.windows.com.
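As a sketch of the cheap option, a minimal SNTP client can be built with the Python standard library alone. The packet layout follows the SNTP specification (RFC 4330); the server name below is just an example, and in practice you would compute an offset from several samples rather than trust one reply:

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800

def build_sntp_request() -> bytes:
    # First byte: LI = 0, Version = 4, Mode = 3 (client); remaining 47 bytes zeroed.
    return b"\x23" + 47 * b"\x00"

def parse_transmit_time(packet: bytes) -> float:
    # The Transmit Timestamp sits in bytes 40-47: 32-bit seconds + 32-bit fraction.
    secs, frac = struct.unpack("!II", packet[40:48])
    return secs - NTP_EPOCH_OFFSET + frac / 2**32

def query_ntp(server: str = "pool.ntp.org", timeout: float = 5.0) -> float:
    """Return the server's transmit time as a Unix timestamp."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_sntp_request(), (server, 123))
        packet, _ = sock.recvfrom(48)
    return parse_transmit_time(packet)
```

A scheduled job could call `query_ntp()` periodically and nudge the local clock, though on production systems a proper NTP daemon is the more robust choice.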
Second, regularly analyze the log data you collect to ensure it’s still the data you need to fulfill your compliance, security, and performance obligations. That is especially important because, as this article from Logit.io points out, an organization’s network, server, application, and data environment is never static. Whenever you add or reconfigure resources in your technology ecosystem, evaluate what impact the changes may have on your log collection strategy.
Third, compute how much disk storage you’ll require for the logs you collect. To do that, study the average volume of data you receive daily as well as how long you need to retain it (in line with relevant data retention laws). If your business is subject to retention laws, you may have to move specific data to long-term storage after a while and hold it there until it exceeds the retention window defined by applicable regulations. Ergo, when computing the storage needed, take short-, medium-, and long-term storage into account.
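A back-of-the-envelope calculation along these lines might look like the following sketch. The tier names, the retention split, and the compression ratio for archived logs are illustrative assumptions, not prescriptions:

```python
def storage_required_gb(daily_gb: float,
                        hot_days: int,
                        warm_days: int,
                        retention_days: int,
                        compression_ratio: float = 5.0) -> dict:
    """Estimate per-tier log storage in GB.

    Assumes logs sit uncompressed in the hot and warm tiers and are
    compressed by `compression_ratio` once moved to cold (archive) storage.
    """
    hot = daily_gb * hot_days
    warm = daily_gb * warm_days
    cold_days = max(retention_days - hot_days - warm_days, 0)
    cold = daily_gb * cold_days / compression_ratio
    return {"hot_gb": hot, "warm_gb": warm, "cold_gb": cold,
            "total_gb": hot + warm + cold}
```

For example, 10 GB of logs a day with a one-year retention obligation, a week in hot storage, and a month in warm storage works out to roughly a terabyte overall once the archive tier is compressed.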
Fourth, schedule frequent checks of your log collection to ensure the logs are captured correctly. There are few things more deflating than turning to your log files only to discover that they don’t capture the events you most needed to see to resolve a security incident. Given the sheer volume of log data you’ll typically be dealing with, this routine quality check works best when done through a series of automated tests.
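One such automated test might look like this minimal sketch, which checks that a collected log file exists, is still being written to, and actually contains events from the sources you expect. The marker strings and freshness threshold are hypothetical values you would tailor to your own environment:

```python
import os
import time

def check_log_health(path: str,
                     max_age_seconds: int = 900,
                     required_markers: tuple = ("sshd", "kernel")) -> list:
    """Return a list of problems found with a collected log file.

    An empty list means the file looks healthy: it exists, was written to
    recently, and contains at least one event from each required source.
    """
    if not os.path.exists(path):
        return [f"{path}: file missing"]
    problems = []
    age = time.time() - os.path.getmtime(path)
    if age > max_age_seconds:
        problems.append(f"{path}: stale (last write {age:.0f}s ago)")
    with open(path, errors="replace") as fh:
        contents = fh.read()
    for marker in required_markers:
        if marker not in contents:
            problems.append(f"{path}: no '{marker}' events captured")
    return problems
```

Run from cron or a monitoring agent, a check like this turns "the logs silently stopped" from a post-incident discovery into an alert.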
Collecting log information is one thing—applying it to the investigation and resolution of system problems is another. That’s why one of the biggest challenges for administrators is ensuring system logs provide as much information as possible for getting to the bottom of security events but without capturing data that serves no practical purpose. Proper logging pulls out necessary log data from an overwhelming sea of low priority and low-risk log information.
That being said, even in the best-thought-out logging system, the overwhelming majority of log events will be unnecessary noise. The difference between the best logging systems and the mediocre ones is in how the logging tool allows administrators to filter past the noise and focus on actionable log entries.
For this reason, a logging system should come with a predefined set of alerts but also with features allowing administrators to create custom alerts that are most appropriate for their organization. Customization should give priority to tracking event records that warrant an immediate warning and active investigation: those pointing to a substantial likelihood of malicious action, persistent excessive system activity, an abrupt drop in activity, or a notable failure or deterioration of a mission-critical system’s performance.
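As an illustration, a custom rule layered on top of predefined event matching might look like this sketch. The event names, the record shape, and the threshold are all hypothetical, standing in for whatever your logging tool exposes:

```python
from collections import Counter

# Hypothetical predefined high-priority event types.
ALERT_EVENTS = {"auth_failure", "service_crash", "disk_failure"}

def custom_alerts(events: list, auth_failure_threshold: int = 5) -> list:
    """Flag events matching predefined patterns, plus one custom rule.

    Single auth failures are routine noise, so they only alert in aggregate:
    repeated failures for one user suggest a brute-force attempt.
    """
    # Predefined rules: crashes and hardware failures alert individually.
    alerts = [e for e in events
              if e["event"] in ALERT_EVENTS and e["event"] != "auth_failure"]
    # Custom rate-based rule over the noisy auth_failure events.
    failures = Counter(e["user"] for e in events if e["event"] == "auth_failure")
    for user, count in failures.items():
        if count >= auth_failure_threshold:
            alerts.append({"event": "possible_brute_force",
                           "user": user, "count": count})
    return alerts
```

The design point is the split: individually meaningful events alert on sight, while high-volume events only alert once an aggregate crosses a threshold.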
Of course, any logging system that needs an excessive amount of customization to capture the most critical events is probably not worth it. A good system should come pre-built with the overwhelming majority of alerts administrators would be interested in and require relatively little customization to realize the desired logging goals.
To be sure that your logs are secure and available, use a dedicated log collection server with strict access control rules and where only a small set of authorized users can log in. The value of sending log files to such a central database cannot be overemphasized. How do you do that, though?
Most systems can be configured to send their log files to a central server. However, you’ll almost always realize more versatility if you employ a third-party agent (such as this C# log parser by Papertrail.com) that specializes in collecting and sending event data to the central database. There are dozens of free utilities that can do the job, but you will usually get more extensive functionality, reliability, and support with commercial offerings. Centralized log management vendors provide either software-only or appliance-based options, with appliance-based tools generally offering better performance.
Pick the solution that is most effective at securely collecting data from diverse sources—you want to avoid sending log data over your network in plain text. The tool should aggregate the data, convert it to a standard format (i.e., normalize it), identify and alert on anomalous log entries, and facilitate querying the log data. Answers to the queries you run should come back in a reasonably short time. If you have to wait 60 seconds for a query result to show up, you’ll probably use the system less over time and thus defeat the purpose of log centralization. The system should also make efficient use of network resources and not clog communication channels during peak traffic times.
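To make the normalization step concrete, here is a minimal sketch that maps a BSD-syslog-style line (with an RFC 3164 priority prefix) into a common schema. The schema field names are illustrative, and a real pipeline would parse the timestamp out of the message rather than stamping arrival time:

```python
import re
from datetime import datetime, timezone

# RFC 3164 lines start with a priority value in angle brackets, e.g. "<34>...".
SYSLOG_RE = re.compile(r"^<(?P<pri>\d+)>(?P<msg>.*)$")

def normalize_syslog(line: str, host: str) -> dict:
    """Normalize a raw syslog line into a common record.

    The priority value encodes facility * 8 + severity, so severity is
    the low three bits and facility the remaining high bits.
    """
    m = SYSLOG_RE.match(line)
    if not m:
        raise ValueError(f"unrecognized line: {line!r}")
    pri = int(m.group("pri"))
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # arrival time
        "host": host,
        "severity": pri & 0x07,   # 0 = emergency ... 7 = debug
        "facility": pri >> 3,
        "message": m.group("msg").strip(),
    }
```

Once every source lands in one schema, querying and anomaly detection can treat a firewall event and an application event as the same kind of record.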
Both seasoned administrators with decades of experience in log management and novices who are just getting started will find these three principles effective in maintaining both high-level oversight and low-level visibility of their systems.