Enterprise Security

Continued from Part 1…
Adding alert on file system modification events
Let’s set up an alert that will email the administrator whenever an executable script is modified on the web server within a user’s file system space. We will run a scheduled search every 5 minutes to scan the last 5 minutes’ worth of modifications. […]
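As a rough sketch of such an alert, assuming file system change events are already being indexed (for example via Splunk’s file system change monitor), and treating the index name, paths and extensions below as placeholders you would adjust to your environment:

```spl
index=fschange action=modify path="/home/*"
    (path="*.php" OR path="*.pl" OR path="*.cgi" OR path="*.sh")
| table _time, path, action
```

Saved as a scheduled search over `earliest=-5m`, running every 5 minutes (cron `*/5 * * * *`), with an email alert action that fires when the result count is greater than zero.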

Real time detection and automated root cause analysis of web malware, exploits and backdoors with Splunk. Part 2: Detection and alerting.
2015-11-03 | By Gleb | Enterprise Security, Malware, Splunk, WordPress

One of my enterprise clients observed that a certain class of attacks has a number of distinctive characteristics: an attacker who possesses valid user account credentials won’t engage in malicious behavior right away.

However, in many cases such pre-fraudulent activity still carried unusual behavioral marks: session velocity (average number of seconds per hit), session density (number of hits within the session), or both exceeded the normal baseline session patterns typical of the average client application user.

Abnormally high session velocity is also a typical pattern of an automated, script-driven session, which both fraudsters and competitors were using to siphon data from the client’s servers.

One possible way to detect these activities would be to calculate average session velocity and density, and then use those values to trigger alerts when session metrics exceed the thresholds.

The issue here is that, due to the specifics of the client’s business, these averages vary greatly with the time of day, the day of the week, and even the month of the year.

So hard-coding fixed, “guessed” threshold values won’t work: it will either generate lots of false positives or miss many suspicious sessions. […]
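To illustrate the idea of time-aware baselines, here is a hedged SPL sketch: it computes per-session velocity and density, builds a baseline per hour of day and day of week, and flags sessions that deviate by more than two standard deviations. The index and field names (`logs`, `JSESSIONID`) are assumptions for illustration:

```spl
index=logs earliest=-24h
| stats count AS hits, range(_time) AS duration,
        first(date_hour) AS hour, first(date_wday) AS wday BY JSESSIONID
| eval secs_per_hit = duration / hits
| eventstats avg(secs_per_hit) AS base_spv, stdev(secs_per_hit) AS sd_spv,
             avg(hits) AS base_hits, stdev(hits) AS sd_hits BY hour, wday
| where secs_per_hit < base_spv - 2 * sd_spv OR hits > base_hits + 2 * sd_hits
```

Because the baseline is grouped by `hour` and `wday`, each session is compared only against sessions from comparable time slots, which is exactly what fixed thresholds cannot do.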

Back in my days at the IBM T.J. Watson Research Center, where we worked on techniques to detect known and unknown malware, the fast-growing challenge was malware’s rising ability to become polymorphic.

Malicious snippets of code encrypted themselves, making it very difficult to apply conventional signature-based detection techniques.

We developed a tiny virtual machine in C that could load malware code in real time and analyze its behavior without needing to figure out how to decrypt it. Score metrics were assigned to key points and function calls, and logic was put in place to trigger an alert if the “risk” score exceeded a heuristic threshold.

That technique allowed us to deliver a top-quality enterprise security solution (later purchased by Symantec) capable of detecting previously unknown threats. That was more than 15 years ago.

Working with financial clients and technology companies today, I can see that good old behavior-pattern analysis remains as strong as ever, helping enterprises discover new types of suspicious behavior and investigate malicious activity with previously unknown patterns from previously unknown sources.

Industry leaders seem to agree that some of the recent high-profile breaches could have been thwarted with a properly configured behavior-analysis SIEM system in place. […]

In the final part of this writeup I’ll show you the actual query that does it all and explain how it works.

As a reminder, this is the challenge we want to solve:

Detect and alert when a class C IP subnet tries to access at least 5 different accounts within an hour, and at least 75% of the accounts touched have never been accessed from this subnet *and* this USER_AGENT within the last 45 days.

And, as you may remember from Part 1, here’s the basic logic that we need to implement to make it happen:

1. Scan the last hour of access log data and find the list of subnets that tried to access multiple accounts within that hour.
2. For each of those accounts, take the username, IP and USER_AGENT, and scan the previous 45 days of traffic history to find usernames that have never been touched by this IP/USER_AGENT combo.
3. Alert if the number of such accounts is above the threshold.

I’ve spent quite a bit of effort coming up with a single query that does all of the above, and in a pretty efficient manner.

The biggest part of the challenge is that the query needs to find events (#1 above) and then run a very custom search for each event against the summary index we’ve created (#2 above). The icing on the cake is that the query needs to return results only if there are *no matches* for the second part of the search.

This quickly gets mind-boggling and it is a rather interesting puzzle to solve with SPL.

The way I solved it is with a combination of macros and an advanced subsearch. Instead of returning traditional results, the subsearch returns a new, custom-crafted Splunk search query to be executed by the outer search.
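A heavily simplified sketch of that trick (not the final query): in SPL, when a subsearch returns a field literally named `search`, its value is substituted verbatim into the outer search, so the inner search can manufacture the query the outer search will run. The index and field names below (`logs`, `summary_logins`, `clientip`, `username`, `http_user_agent`) are assumptions, and quoting/escaping is simplified:

```spl
index=summary_logins earliest=-45d
    [ search index=logs earliest=-1h
      | eval subnet = replace(clientip, "\.\d+$", "")
      | stats dc(username) AS accounts BY subnet, http_user_agent
      | where accounts >= 5
      | eval search = "clientip=" . subnet . ".* http_user_agent=\"" . http_user_agent . "\""
      | fields search ]
| stats dc(username) AS previously_seen BY http_user_agent
```

Each row the subsearch emits becomes a search clause in the outer query, which is what lets one outer search run a custom 45-day lookup per suspicious subnet/USER_AGENT pair.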

Summary indexing is a great way to speed up Splunk searches by pre-creating a subset of only the necessary data for a specific purpose. In our case, we need to filter only the login events out of all available web traffic data. This gives us a very fast, much smaller data subset with all the information we need to reference when matching against new, suspicious login events.

To proceed with building the summary index, we need to make a set of assumptions; they underpin the query and all other elements of the solution. You can substitute your own names later if needed.

Let’s assume you already have your web logs, with all the event data, indexed in Splunk.
All web events are located within an index named: logs.
Field names (or aliases):
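With assumptions like those in place, the summary index of login events could be populated by a scheduled search along these lines; the sourcetype, URI, field names and summary index name are placeholders for illustration:

```spl
index=logs sourcetype=access_combined uri_path="/login" status=200
| fields _time, username, clientip, http_user_agent
| collect index=summary_logins
```

Scheduled, say, every hour over the previous hour (`earliest=-1h@h latest=@h`), so the summary index stays current without reprocessing old data.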

In this series of posts I’ll cover the complete strategy of using Splunk Enterprise to detect customer account takeover fraud, as well as setting up automated alerts when such activity is detected.

While I helped implement these measures for a large financial firm, the same approach can be applied to any online enterprise that needs to protect customer accounts, quickly detect suspicious activity, and act to prevent monetary and business losses.

The techniques I am going to describe generate a pretty low level of false positives and include efficient ways to adjust trigger thresholds across multiple metrics for specific business needs. In addition, they are tested and work really well for portals that generate up to 3,000,000–5,000,000 hits a day.

The specific use case covered in these posts applies to the situation where the credentials of multiple clients (sometimes thousands or more) end up in the hands of attackers who try to exploit them for monetary, competitive or political gain. With the help of Splunk, an enterprise can quickly and automatically detect such a situation and take the necessary measures to protect the business and its clients.

I won’t get into the details of the many possible ways customer credentials may be compromised, but the end result is the ability of unauthorized persons to access multiple customer accounts and cause significant damage to customers and to the business, including large monetary losses.

The worst way the enterprise can learn about cyberattack on their own customers is from CNN.

Splunk gives us all the necessary tools to quickly detect such attacks and stay on top of the game. […]

Once you have all the rich traffic data exported from TeaLeaf and imported into Splunk, the possibilities for building powerful search, cross-referencing analytics and security investigation tools are virtually endless.

During my consulting work for a major financial services firm, I built a specialized Splunk app that uses a single dashboard to execute multiple searches and visualize results and trends from TeaLeaf data.

In addition, a number of custom searches and alerts were created to build summary indexes and to automatically detect and alert on possible malware infections, suspicious activity patterns, and out-of-bounds activity.

Before sharing more details about visualizing trends and malicious traffic patterns, here are a few tips on the general design of custom security analytics apps and dashboards with Splunk.

While I cannot share image snapshots of the actual dashboard visuals used within the financial firm, due to obvious client security concerns, I can describe the general approach we took to overcome TeaLeaf’s limitations with Splunk, as well as a number of important points on how to take the most advantage of Splunk as a security research tool. […]

Let’s get our hands dirty. The first step in building a fraud investigation and security analytics platform with TeaLeaf is making TeaLeaf’s data available to Splunk. Then Splunk will take care of all the deep security queries and specialized investigative dashboarding.

Disclaimer: all data you see on this site was auto-generated for demonstration purposes. It illustrates concepts and ideas but does not show any real names, IP addresses or other information matching real-world events.

TeaLeaf comes with the cxConnect for Data Analysis component.

TeaLeaf cxConnect

“Tealeaf cxConnect for Data Analysis is an application that enables the transfer of data from your Tealeaf CX datastore to external reporting environments. Tealeaf cxConnect for Data Analysis can deliver data in real-time to external systems such as event processing systems or enable that data to be retrieved in a batch mode. Extraction of customer interaction data into log files, SAS, Microsoft SQL Server or Oracle databases are supported. Data extraction jobs can be run on a scheduled or ad-hoc basis. Flexible filters and controls can be used to include or exclude any sessions or parts of sessions, according to your business reporting needs”.
Source: IBM TeaLeaf.

IBM TeaLeaf cxConnect hourly log extraction task

Although in my experience the “real-time” claim is a long shot (at least I didn’t find a way to accomplish the above in real time), I did manage to set up quite successful regular, hourly, detailed TeaLeaf log exports.

cxConnect Daily extraction task

If you try to use cxConnect for log exports right off the bat and select all the default options, you’ll end up with a humongous set of files containing mountains of data you don’t really need, wasting your disk space. It took me quite a while to configure cxConnect to export the data I need and exclude the data I don’t.

Within cxConnect’s “Configured Tasks” menu you can create scheduled tasks. For our purpose I created two: one hourly and one daily. […]
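Once cxConnect drops the extracted log files onto disk, ingesting them into Splunk can be as simple as a monitor stanza; the path, sourcetype and index below are hypothetical placeholders, not the firm’s actual configuration:

```
# inputs.conf -- hypothetical path, sourcetype and index
[monitor:///data/tealeaf/exports/*.log]
sourcetype = tealeaf:cxconnect
index = logs
disabled = false
```

Splunk will then pick up each hourly export as it appears, tailing new files automatically.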

IBM TeaLeaf is one of the leading customer experience management platforms.

IBM TeaLeaf is a set of tools that allows enterprises to record all customer interactions with their web application portals, with the further capability of visual session replay. IBM TeaLeaf also offers a set of interfaces for designing custom events, alerts, dashboards and visual reports.

TeaLeaf lets you define custom reporting dimensions that can be very specific to given business needs.
Tracking clicks, conversions and customer struggles, optimizing sales funnels, analyzing mobile experiences, and presenting any kind of data in a visually appealing way are only a few of the many benefits TeaLeaf offers.

TeaLeaf generally offers two ways to search for information: via its browser interface (the “Search” menu option), or via RealiTea Viewer (RTV), a separate desktop application that runs raw searches through direct connections to the TeaLeaf data repository (data canisters).
The main advantage of RTV is that it can run searches on currently active sessions (in other words, with almost no delay in obtaining fresh traffic data), and it is quite fast. […]

As a professional consultant building client-specific fraud detection solutions, I often witness product pitches from different vendors in the security and fraud detection space.

The recent wave of successful high-profile cyberattacks and disastrous data leaks has added a new level of activity to the search for the perfect fraud detection and early-alerting solution.

With attackers changing their activity vectors, attack patterns and techniques on a daily basis, many legacy fraud detection tools lose their effectiveness and become outdated very quickly.

In the never-ending quest to protect enterprises against fraud losses, the ideas of automated anomaly detection are picking up steam.

The way it generally works: the anomaly detection system establishes a baseline for certain predefined dimensions and then monitors (often in real time) for deviations from that baseline. Once a sufficiently abnormal condition is detected, an alert is issued. Such a system can operate largely automatically and learn from historical and present data, constantly updating its baselines as well as its trigger thresholds. […]
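A minimal sketch of that idea in SPL, assuming web events live in an index named logs: hourly hit counts are compared against a rolling one-week baseline (168 hourly buckets), so the baseline keeps updating itself as new data arrives:

```spl
index=logs earliest=-30d
| timechart span=1h count AS hits
| streamstats window=168 avg(hits) AS baseline, stdev(hits) AS sd
| where hits > baseline + 3 * sd
```

The surviving rows are the hours whose traffic exceeded the self-learned baseline by more than three standard deviations; a real deployment would tune the window and multiplier to the business.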