

Building on the stock ES "Excessive DNS Queries" search to look for suspicious volumes of DNS traffic

Starting from the assumption that a host suddenly spewing a ton of DNS queries can be an indicator of either compromise or misconfiguration, we need a mechanism to tell us when this happens. Here we will look at a method to find suspicious volumes of DNS activity while trying to account for normal activity.

Splunk ES comes with an "Excessive DNS Queries" search out of the box, and it's a good starting point. However, the stock search only looks for hosts making more than 100 queries in an hour. For most large organizations with busy users, 100 DNS queries in an hour is an easy threshold to break. Throw in some server systems doing backups of remote hosts or MFPs trying to send scans to user machines, and we suddenly have thousands of machines breaking that 100/hour limit for DNS activity. This makes the search not only excessively noisy, but also very time consuming to tune into something an analyst can act on or even want to look at.
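For reference, the idea behind that fixed-threshold approach looks roughly like the sketch below. This is a simplified approximation, not the exact correlation search that ships with ES; it assumes DNS events mapped to the CIM Network_Resolution data model.

    | tstats count from datamodel=Network_Resolution where nodename=DNS by DNS.src _time span=1h
    | rename DNS.src as src
    | where count > 100

Any host crossing the fixed count gets flagged, regardless of how that host normally behaves.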
What we really need is a way to look at how a machine typically behaves during its normal activity, and then alert us when that machine suddenly deviates from its average. The solution here is actually pretty simple once it's written out in Splunk SPL (search processing language):

Determine a working time window to use in calculating the average.
Establish a baseline for a machine's DNS activity.
Compare the established average against individual slices of the whole window.

Oddly enough, there are three 8-hour chunks in a day, so an eight-hour window is a safe one to use as a first draft. This window can be adjusted to better suit the needs of any specific environment: make it wider for a less sensitive alert, narrower for a more sensitive alert. What we need to do is look at that eight-hour span and get a count of DNS events per host, per hour.

Search Part 1: Pulling The Basic Building Blocks

In this first part of our search, we are pulling our basic building blocks (a rough sketch follows the list), including:

Total count for that query src within that hour.
Original sourcetype of the DNS events (useful for later drilldown searching).
The queries themselves, gathered as a multi-value field for that src and hour (used in the drilldown later).
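The sketch below shows what this first part can look like. It is my own approximation rather than the article's exact search; it again assumes the CIM Network_Resolution data model, and the field names src, query, and orig_sourcetype are simply the names chosen here.

    | tstats count values(DNS.query) as query values(sourcetype) as orig_sourcetype
        from datamodel=Network_Resolution where nodename=DNS
        by DNS.src _time span=1h
    | rename DNS.src as src

Run over the last eight hours, this returns one row per src per hour, carrying the hourly count, the queries seen, and the original sourcetype along for the later steps.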

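The later parts of the search then establish the average and compare each hourly slice of the window against it, per the three steps above. A minimal sketch of that idea, continuing from the fields in the previous sketch (the 2x multiplier is just an arbitrary starting point, which you can widen or narrow as discussed earlier):

    | eventstats avg(count) as avg_count by src
    | where count > (avg_count * 2)

eventstats keeps every hourly row while attaching that host's own average, so each slice can be judged against the machine's baseline rather than a one-size-fits-all limit.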
From there, to dig into the queries themselves, we do an mvexpand to split our previous multi-value query field into one line item per query. Then we use the truncate_domain macro to get a clean query domain without the URL characters. We also drop out Amazon and the Alexa top sites list to lessen the typical noise. The more you can lower your noise floor, the more you can focus on the signals. Now, we split the query up at the dots so we can see each URL segment. And then we look for any segment that's over 25 characters. Finally, we drop back in all of our fields, and we see that there were two domains out of the ~2100 results from this host that meet the long domain segment criteria. You can of course adjust that threshold to something wider or narrower to fit your particular environment.

Just a final detail on the search – as this uses tstats out of the datamodel, then does rather basic manipulations at search time, it's a fairly lightweight cost to your ES environment.

Detecting anomalies is a popular use case for Splunk. Standard deviation, however, isn't always the best solution despite being commonly used. In this tutorial we will consider different methods for anomaly detection, including standard deviation and MLTK. I will also walk you through the use of streamstats to detect anomalies by calculating how far a numerical value is from its neighbors.

Standard deviation measures the amount of spread in a dataset using each value's distance from the mean. Using standard deviation to find outliers is generally recommended for data that is normally distributed. In security contexts, however, user behavior most often follows an exponential distribution: low values are common and high values are rare. Standard deviation can still be used to find outliers in such data, but a certain percentage of the data will always be flagged as outliers. This means more data equals more outliers equals more alerts.

One example would be if we were looking for users logging in from an anomalous number of sources in an hour. The distribution of source count is an exponential distribution. Even after requiring more than 15 data points, there are still 14,298 results. It gets even worse because events can be buried by the high counts seen during certain hours of the day.

Splunk's Machine Learning Toolkit (MLTK) adds machine learning capabilities to Splunk. One of the included algorithms for anomaly detection, DensityFunction, is meant to detect outliers in exactly this kind of data.
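To make that login example concrete, here is a hedged sketch of how DensityFunction could be applied with MLTK installed. The index and field names (action, src, user) and the 0.01 threshold are assumptions for illustration rather than details from the tutorial; grouping by hour of day is one way to keep the busy hours from burying the quiet ones.

    index=auth action=success
    | bin _time span=1h
    | stats dc(src) as source_count by user _time
    | eval HourOfDay=strftime(_time, "%H")
    | fit DensityFunction source_count by "HourOfDay" threshold=0.01
    | where 'IsOutlier(source_count)'=1

DensityFunction fits a probability density to the data and flags values that land in its low-probability tail, which suits the exponential distributions described above better than a fixed number of standard deviations does.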

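Finally, the streamstats approach mentioned in the tutorial intro compares each value against its recent neighbors rather than against one global average. A small sketch of that idea, applied to the hourly DNS counts from earlier purely as an illustration (the 24-hour window and the 3x cutoff are arbitrary choices):

    | tstats count from datamodel=Network_Resolution where nodename=DNS by DNS.src _time span=1h
    | rename DNS.src as src
    | streamstats window=24 current=false avg(count) as recent_avg stdev(count) as recent_stdev by src
    | where recent_stdev > 0 AND abs(count - recent_avg) > (3 * recent_stdev)

Because each host is compared with its own recent history, a host that is always chatty won't alert simply for being chatty; only a sudden jump relative to its neighboring hours will.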