Tuning Concepts for Splunk ES with a Risk Based Alerting Approach (RBA Part 2)

Author: Ben Marrable
Release Date: 02/02/2022

Once RBA is set up and running, you may find you are getting too many alerts, or hardly any at all. Either way, some tuning is to be expected. Tuning can be carried out in several areas, and it is important to understand the effect of each mechanism. The tuning options available are:

1. Tune the detection search itself (the Risk Rule).
2. Tune the amount of risk associated with the Risk Rule, based on either the impact or the confidence of the alert.
3. Tune the risk modifiers and/or the risk framework you are working from, i.e. alter the amount of risk that each impact or confidence level maps to.
4. Tune the Risk Indicator Rules and the thresholds at which you want to alert.

Options 1 and 2 are classed as local tuning mechanisms, whereas options 3 and 4 are global tuning mechanisms, i.e. they affect all the Risk Rules in the system.

The appropriate tuning mechanism varies with the situation. The starting point is to analyse the number of events generated by each risk rule and the corresponding quantity of risk. The following search assists in that process. For each risk rule (Risk Rule) it shows the total amount of risk generated (Total Risk Score Attributed), the average amount of risk per trigger (Average Risk Score Applied), the distinct number of entities it has triggered against (Number of Objects Triggered Against) and the total number of times it triggered (Number of Times the Risk Rule Triggered):

| tstats summariesonly=true sum(All_Risk.risk_score) as risk_score, avg(All_Risk.risk_score) as avg_risk, dc(All_Risk.risk_object) as risk_objects, count from datamodel=Risk.All_Risk where (All_Risk.risk_object="*" OR risk_object="*") by source
| sort 1000 - risk_score, count
| rename source as "Risk Rule", risk_score as "Total Risk Score Attributed", avg_risk as "Average Risk Score Applied", risk_objects as "Number of Objects Triggered Against", count as "Number of Times the Risk Rule Triggered"

Any rule producing a significant amount of risk generally highlights an area to tune. Conversely, a rule producing only a small amount of risk may offer scope for an increase. The tuning mechanisms that apply here are 1 and 2 in the list above.

Before we talk about false positives, let's be clear about what a false positive means in risk based alerting. False positives are very rare in the world of Risk Based Alerting: any behaviour in the environment that meets the criteria for the risk rule is considered a positive event, regardless of whether it is malicious or not. This is because Risk Based Alerting is all about connecting behaviours across a number of different risk rules.

However, if there are clear and distinct false positives, they can be excluded from the risk rule itself. For example, if a vulnerability scanner is acting legitimately, consider excluding it from the risk rule. When doing this, be as specific as possible, i.e. include the user, host and timeframe as part of the exclusion. An alternative is to add the vulnerability scanner as a risk modifier, and have that modifier significantly reduce the risk assigned to it. Excluding events is one example of using method 1 for tuning, by filtering out events at source. A second example of method 1 applies to risk rules that involve aggregation, for example a risk rule that fires when a user connects to X machines in a day. If that threshold cannot be made dynamic based on the user's role, it could be raised when the rule is producing many alerts per day.
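To make "be as specific as possible" concrete, here is a minimal sketch in Python rather than SPL, purely to illustrate the logic. The account name, host name and scan window are hypothetical values, not taken from any real environment. The point is that an event is only dropped when user, host and timeframe all match, so the same scanner account acting outside its window still generates risk.

```python
from datetime import datetime, time

# Hypothetical exclusion: every value here is illustrative only.
SCANNER_EXCLUSION = {
    "user": "svc_vulnscan",
    "host": "scanner01",
    "window_start": time(1, 0),  # scheduled scan window opens at 01:00
    "window_end": time(3, 0),    # and closes at 03:00
}


def is_excluded(event, exclusion=SCANNER_EXCLUSION):
    """Return True only when user, host AND time window all match."""
    event_time = event["time"].time()
    in_window = exclusion["window_start"] <= event_time < exclusion["window_end"]
    return (
        event.get("user") == exclusion["user"]
        and event.get("host") == exclusion["host"]
        and in_window
    )


def filter_events(events, exclusion=SCANNER_EXCLUSION):
    """Keep every event that the exclusion does not fully match."""
    return [e for e in events if not is_excluded(e, exclusion)]


in_scan_window = {"user": "svc_vulnscan", "host": "scanner01",
                  "time": datetime(2022, 2, 2, 1, 30)}
outside_window = {"user": "svc_vulnscan", "host": "scanner01",
                  "time": datetime(2022, 2, 2, 14, 0)}
other_user = {"user": "alice", "host": "scanner01",
              "time": datetime(2022, 2, 2, 1, 30)}

kept = filter_events([in_scan_window, outside_window, other_user])
```

Only the first event is filtered out here. A broad exclusion on the user alone would also have hidden the 14:00 activity, which is exactly the kind of behaviour worth keeping visible.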

Once there are no clear repetitive false positives, the next step is to assess the level of risk assigned to the risk rule. If the risk rule is clearly producing a significant amount of risk but is still very useful to know about, consider reducing the risk score it generates. In certain situations this could even be reduced to 0, where you want to know about a specific event and count it towards the number of MITRE tactics/techniques involved, but not alert purely on any number of those events.

From the getting started guide, you should have a matrix (table) of severity, confidence and risk score factors calculated for each risk rule. Consider further tuning your confidence levels and the risk multipliers aligned to them, with the goal of more refined base risk scores.
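As a rough illustration of such a matrix, the sketch below derives a base risk score from an impact level and a confidence level. It is written in Python only to show the arithmetic. The level names, the impact scores and the confidence multipliers are all assumed values, so substitute the figures from your own framework.

```python
# Hypothetical framework values; replace with your own matrix.
IMPACT_SCORE = {"low": 20, "medium": 40, "high": 60, "critical": 100}
CONFIDENCE_FACTOR = {"low": 0.3, "medium": 0.6, "high": 1.0}


def base_risk_score(impact, confidence):
    """Base risk score = impact score scaled by the confidence multiplier."""
    return round(IMPACT_SCORE[impact] * CONFIDENCE_FACTOR[confidence])


def risk_matrix():
    """Build the full impact x confidence table of base risk scores."""
    return {
        (impact, confidence): base_risk_score(impact, confidence)
        for impact in IMPACT_SCORE
        for confidence in CONFIDENCE_FACTOR
    }


matrix = risk_matrix()
high_medium = matrix[("high", "medium")]  # 60 * 0.6
```

Changing a single entry in CONFIDENCE_FACTOR moves the base score of every rule at that confidence level, which is why adjusting the framework is a global tuning mechanism rather than a local one.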

The second area is to review your rules in the context of the risk modifiers. The following search shows the count of risk rule matches for both additive and multiplicative risk modifiers. Be wary of modifiers that match very commonly, and consider making them more specific if possible.

| from datamodel:Risk
| fillnull risk_factor_mult_matched risk_factor_add_matched value="No Risk Factor Match"
| stats values(risk_factor_mult) as "Multiplier Value" values(risk_factor_add) as "Additive Value" count as "Count" by risk_factor_mult_matched, risk_factor_add_matched
| rename risk_factor_mult_matched as "Multiplier Risk Factors Matched" risk_factor_add_matched as "Additive Risk Factors Matched"
| sort - Count

The final area, once all of the above tuning has been carried out as far as possible, is to tune the risk indicator rules themselves. The first port of call is to identify why these rules are triggering, or not triggering, i.e. which thresholds are and are not being met. The default search covers a 24 hour period and alerts when an object accumulates more than 100 points of risk. This threshold could be increased, or alternatively consider some of the following options:

If you are seeing many alerts created from a single risk rule, consider adding a threshold of source_count >= 2.
Or, if you only wish to be alerted when more than one MITRE technique is involved, add mitre_technique_id_count >= 2 to the threshold in the search.
You could also be very selective and use a threshold such as:
| where risk_score > 100 AND ((mitre_tactic_id_count >= 2 AND source_count >= 2) OR isnotnull(mvfind('annotations.mitre_attack', "^T1098")))

Experiment with other threshold values to find what works best for your environment.
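One way to experiment before changing a live search is to replay sample risk events through the threshold logic. The sketch below does this in Python as an illustration of the selective threshold described above. The event structure, the rule names and the scores are hypothetical; only the condition itself follows the article.

```python
def summarise_object(events):
    """Aggregate one risk object's events over the alerting window."""
    return {
        "risk_score": sum(e["risk_score"] for e in events),
        "source_count": len({e["source"] for e in events}),
        "mitre_tactic_id_count": len(
            {t for e in events for t in e.get("tactics", [])}
        ),
        "techniques": {t for e in events for t in e.get("techniques", [])},
    }


def should_alert(summary, score_threshold=100):
    """risk_score > threshold AND ((tactics >= 2 AND sources >= 2) OR T1098*)."""
    multi_signal = (
        summary["mitre_tactic_id_count"] >= 2 and summary["source_count"] >= 2
    )
    t1098_seen = any(t.startswith("T1098") for t in summary["techniques"])
    return summary["risk_score"] > score_threshold and (multi_signal or t1098_seen)


# One noisy rule firing five times: plenty of risk, but a single source.
noisy = [{"source": "Rule A", "risk_score": 30, "tactics": ["TA0006"],
          "techniques": ["T1110"]}] * 5

# Two rules and two tactics; the second rule contributes a score of 0.
mixed = [
    {"source": "Rule A", "risk_score": 110, "tactics": ["TA0001"],
     "techniques": ["T1566"]},
    {"source": "Rule B", "risk_score": 0, "tactics": ["TA0003"],
     "techniques": ["T1136"]},
]

# A single rule, but it maps to a T1098 sub-technique.
account_change = [{"source": "Rule C", "risk_score": 120,
                   "tactics": ["TA0003"], "techniques": ["T1098.001"]}]

noisy_alerts = should_alert(summarise_object(noisy))
mixed_alerts = should_alert(summarise_object(mixed))
account_change_alerts = should_alert(summarise_object(account_change))
```

The noisy single-rule case stays quiet despite 150 points of risk, while the mixed case alerts. The mixed case also shows a zero-score rule still counting towards the tactic and source counts, as discussed earlier for rules reduced to a score of 0.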

Good luck with your tuning process in getting the most out of Risk Based Alerting.
