In recent years, machine learning has swept across the world of software delivery and is changing the way applications are built, shipped, monitored, and secured. And log monitoring is one of the industries that keeps evolving with new capabilities afforded with machine learning.
Machine learning is the process of using algorithms and computer intelligence to analyze and make sense of large quantities of complex data that would otherwise be difficult or impossible to do by a human security analyst. There are many forms of machine learning from algorithms that can be trained to replicate human decision making at scale to algorithms that take an open-ended approach to find interesting pieces of data with little input or guidance. Differences aside, machine learning looks to get more value out of data in a way that humans can't do manually. This is critical for security, which deals with large quantities of data, and often misses the important data, or catches it too late.
Machine learning is made possible by the power of cloud computing and how it makes crunching big data cheaper and more powerful. Analyzing large quantities of complex data takes a lot of computing power, readily available memory, and fast networking that's optimized for scale. Cloud vendors today provide GPU (graphical processing unit) instances that are particularly well suited for machine learning. The alternate method is to use numerous cheap servers and a distributed approach to analyze the data at scale. This is possible with the advances in distributed computing over the past few years. Additionally, cloud storage with fast I/O speeds is necessary for complex queries to be executed within a short period of time. AWS itself offers multiple storage solutions like Amazon EFS, EBS, and S3. Each of these serve different purposes and are ideal for different types of data workloads. The cloud has stepped up in terms of compute, memory, storage and overall tooling available to support machine learning.
The primary use case for machine learning in log analysis has to do with security. There are many forms of security attacks today, which range in complexity. Email phishing attacks, promo code abuse, credit card theft, account takeover, and data breaches are some of the security risks that log analysis can help protect against. According to the Nilson Report, these types of security attacks cost organizations a whopping $21.8 billion each year. And that doesn’t even include the intangible costs associated with losses in trust, customer relationships, brand value, and more that stem from a security attack. Security threats are real, and log analysis can and should be used to counter them. Attackers are becoming more sophisticated as they look to new technology and tools to carry out their attacks. Indeed, criminals themselves are early adopters of big data, automation tools and more. They use masking software to hide their tracks, bots to conduct attacks at scale, and in some cases have armies of humans assisting in the attack. With such a coordinated effort, they can easily break through the weak defenses of most organizations. That's why you see some of the biggest web companies, the most highly secured government institutions, and the fastest growing startups all fall victim to these attackers, who can breach their defense easily.
Since much of security is conducted by humans, or at most by tools that use rules to authorize or restrict access, attackers eventually understand the rules and find ways to breach them. They may find a limit to the number of requests a gatekeeper tool can handle per second, and then bombard the tool with more requests than the limit. Or they may find unsecured IoT devices on the edge of the network that can be easily compromised and taken control of. This was the case with the famous Dyn DDoS attack which leveraged unsecured DVRs and then bombarded the Dyn network, taking down with it a large percentage of the internet's top websites that relied on Dyn for DNS services. The point is that manual security reviews, and even rule-based security, doesn't scale and is not enough to secure systems against the most sophisticated attackers. What's needed is a machine learning approach to security, and one that leverages logs to detect and stop attacks before they escalate.
Many security risks occur at the periphery of the system, so it's essential to keep a close watch on all possible entry points. End users access the system from outside, and can sometimes knowingly or unknowingly compromise the system. You should be able to spot a malicious user from the smallest of triggers; for example, an IP address or a geo location that is known to be suspicious should be investigated. Login attempts are another giveaway that your system may be under attack. Frequent unsuccessful login attempts are a bad sign and need to be further investigated. Access logs need to be watched closely to spot these triggers, and it is best done by a machine learning algorithm. While manual and rule-based review can work to a certain point, increasingly sophisticated attacks are best thwarted by using machine learning. You may need to crawl external third-party data to identify fraudulent activity, and it helps to look not just inside, but outside of your organization for data that can shed light on suspicious activity. But with growing data sets to analyze, you need more than basic analytics -- you need the scale and power of machine learning to spot patterns, and find the needle in the haystack. Correlating your internal log data with external data sets is a challenge, but it can be done with machine learning algorithms that look for patterns in large quantities of unstructured data.
Machine learning can go further in spotting suspicious patterns from multiple pieces of data. It can look at two different pieces of data, sometimes not obviously associated with each other, and highlight a meaningful pattern. For example, if a new user accesses parts of the system that are sensitive, or tries to gain access to confidential data, a machine learning algorithm can spot this from looking at their browsing patterns or the requests they make. It can decipher that this user is likely looking to breach the system and may be dangerous. Highlighting this behavior an hour or two in advance can potentially prevent the breach from occurring. To do this, machine learning needs to look at the logs showing how the user accesses and moves through the application. The devil is in the details, and logs contain the details. But often, the details are so hidden that human eyes can't spot them; this is where machine learning can step in and augment what's missing in a human review.In today's cloud-native environment, applications are deeply integrated with each other -- no application is an island on its own. This being the case, many attacks occur from neighboring apps which may have escalated their privileges. It's easy to read the news about data breaches and find cases where organizations blame their partner organizations or an integrated third-party app for a security disaster. Monitoring your own system is hard enough, and it takes much more effort and sophistication to monitor outside applications that interact with yours. Whereas humans may overlook the details when monitoring a large number of integrated applications and APIs, a machine learning algorithm can monitor every API call log, every network request that's logged, and every kind of resource accessed by third-party applications. It can identify normal patterns as well as suspicious ones. For example, if an application utilizes a large percentage of available memory and compute for a long period of time, it is a clear trigger. A human may notice this after a few minutes or hours of it occurring, but a machine learning algorithm can spot the anomaly in the first few seconds, and bring it to your attention. Similarly, it can highlight a spike in requests from any single application quickly, and highlight that this may be a threat.
Machine learning algorithms are especially good at analyzing unstructured or semi-structured data like text documents or lines of text. Logs are full of text data that need to be analyzed, and traditional analytics tools like SQL databases are not ideally suited for log analysis. This is why newer tools like Elasticsearch have sprung up to help make sense of log data at scale. Machine learning algorithms work along with these full-text search engines to spot patterns that are suspicious or concerning. It can derive this insight from the large quantities of log data being generated by applications. In today's containerized world, the amount of log data to be analyzed is on the rise, and only with the power of machine learning can you get the most insight in the quickest time.
As you look to get more out of your log data, you need an intelligent logging solution like LogDNA that leverages machine learning to give you insight in a proactive manner. Algorithms are more efficient and faster than humans at reading data, and they should be used to preempt attacks by identifying triggers in log data. As you assess a logging solution, do look at its machine learning features. Similarly, as you plan your logging strategy, ensure machine learning is a key component of your plans, and that you rely not just on traditional manual human review, but leverage the power of machine learning algorithms.