Open TIDE
This work is relentlessly challenging. Threats evolve faster than many teams can adapt. Mountains of data
obscure the faint signals of intrusion. And skilled engineers, those who understand the intricacies of both
security and analysis, are in desperately short supply.
Keeping pace with constantly emerging threats is difficult. So is sifting through the terabytes of data a security team faces to isolate what is actually relevant to a threat. And the shortage of qualified detection engineers makes it hard for organizations to find the talent they need.
Despite these challenges, detection engineering is a critical process that can help organizations protect themselves from a wide range of cyber threats. By investing in detection engineering, organizations can detect attacks faster and more reliably, and reduce the damage they cause.
Cybersecurity is a war of attrition, and the odds often favor our adversaries. Yet, in detection engineering,
we find the potential to shift the balance. It's the unglamorous work of building the digital tripwires that
alert us to danger – the unsexy but absolutely critical act of knowing when we're under attack.
This paper provides a comprehensive overview of detection engineering. It covers the key concepts of
detection engineering, the challenges associated with detection engineering, and the best practices for
developing and maintaining effective detection systems.
Let this paper be the start of your journey to transform detection from a reactive scramble into a proactive
strength. Because in this digital arms race, the best way to prevent tomorrow's breach is to detect the
signs of it today.
Modern DE teams face several challenges, but one that came back frequently in interviews with peers about SOC management practices is the lack of common, actionable frameworks to build from. According to this recurring feedback, teams end up carrying the effort and complexity of building in-house frameworks, with varying levels of investment given the work required and the frequent pressure to address immediate detection requirements. In general, much of a DE's effort is spent on meta-tasks before analysis even starts: writing knowledge articles, aligning stakeholders, gathering intelligence from different sources and parsing it, and so on. Furthermore, as teams scale, knowledge gets spread across different systems and people and is stored in different forms, making it increasingly hard to keep the data under control.
The European Commission CSOC, through internal reorganization, defined a new service vertical
dedicated to Threat Hunting and Detection Engineering, with emphasis on being Threat-Driven. The new
team, CATCH (Cyber Analytics, Trending, Correlations and Hunting), was in the right environment to
develop a top-down framework, leveraging as-code and DevOps principles, expanding automation
capabilities, and emphasizing threat and detection modelling. This framework and platform grew into OpenTIDE, and after two years of continuous adoption and further development, we now propose this framework to the open SOC and cyber communities as currently the only open blueprint to structure Detection Engineering and create a consistent system of exchangeable and actionable data.
2.2 DetectionOps
OpenTIDE is best thought of as a Detection Operations engine. It contains a data schema, automations,
concepts and principles that aim to be deployed in a detection engineering team as a central platform.
The core idea of the framework is that a Detection Engineering team can express all its artifacts as code, following common, extensible data schemas, in a continuous flow from threat to detection – defining the steps to create a detection and avoiding manual overhead.
DE work can be codified into structured YAML files (a well-known, human-first alternative to JSON) called models, increasing automation opportunities and exposing actionable data, such as more immediate coverage metrics. This loop, from threat analysis to detection release and new intelligence ingestion, is DetectionOps.
A more top-down approach is MITRE ATT&CK, the largest and most useful taxonomy of known cyber threats. While MITRE ATT&CK provides many insights on threats, it also normalizes discrete procedures into more generalist (sub-)techniques. For DE, the procedural level, or an even more technical intel level, is often required; ATT&CK can provide a general direction but can also fall short of the more granular technical needs of a DE team.
Thus, for DE purposes, existing frameworks and methodologies are often insufficient, and engineers supplement this gap with deep dives into particular TTPs or with accompanying frameworks like the Atomic Red Team tool chain. OpenTIDE structures this process: intel can be broken into suitably sized chunks and ingested as threat vector models, indexed and metadata-rich. Still connected to ATT&CK, OpenTIDE aims at being a bridge between DE needs and other forms of intelligence or taxonomies which are initially limiting, removing the layer of abstraction coming from existing frameworks and directly providing DE with a low-level, inter-related and actionable technical data format.
(1) https://d3fend.mitre.org/resources/D3FEND.pdf
Community input to grow. Maps to domains. Active community and project evolution. Can integrate to TIDEMEC. Actual innovation within the space of cyber security.
(2) https://d3fend.mitre.org/resources/D3FEND.pdf
Enables CTI teams to track threat actors and their work in a mostly structured way, focused from the threat perspective. Used or created malware artifacts and similar artifacts can inform detection modelling and engineering efforts. Great documentation of threat actors, intrusion sets and campaigns, and metadata around these. Relationships can be modelled between SDOs.
Palantir ADS. An ADS framework is a set of documentation templates, processes, and conventions concerning the design, implementation, and roll-out of ADS. Prior to agreeing on such a framework, we faced major challenges with the implementation of alerting strategies: the lack of rigor, documentation, peer-review, and an overall quality bar allowed the deployment of low-quality alerts to production systems. This frequently led to alerting apathy or additional engineering time and resources to fix and update alerts.
Strengths: strong focus on enabling 1st level responder actions/determination; includes alert testing (validation); analyzes blind spots and false positive expectations.
Limitations: fully depends on ATT&CK to prioritize coverage gaps; no focus on making CTI actionable; not created to structure DE processes or support with automation.
Assessment: strong and capable focus on providing first level analysts the information/knowledge needed to make alerts actionable. We're working on either eventually bringing this level of capabilities into OpenTIDE or documenting how we achieve the actionability for first level responders using other processes and tooling.
SpectreOps blog series 'On Detection'. The blog series introduces (and, we guess, partially also supports) arguments around the complexities of the science of detection engineering and the need for granularity of threat descriptions.
Strengths: argues that MITRE ATT&CK lacks granularity (possibly only indirectly) for DE purposes; first-ever attempt to describe the levels of complexity that are not handled by any framework today (only the Atomic Red Team framework reaches the functional level similar to the blog series); defines 2 levels below the procedural, which drives detection coverage discussion progress and enables future detection coverage maturity efforts; argues that visualizing threats and how they/systems work is an advantage; in general just fantastic reading.
Limitations: no implementation follows/accompanies the structured thinking.
Assessment: a must for anyone doing CTI, red teaming, purple teaming, blue teaming and of course detection engineering.
Indicators of Behavior - OCA (opencybersecurityalliance.org)
2.3.1 Part-conclusion on ontologies/topologies and detection coverage framework topology
All of the listed frameworks aim to reduce the complexity of the global threat landscape in order to
inform the efforts of cyber security teams. There is a need to improve the capabilities of defensive
teams, or attackers will forever remain at an advantage (3). ATT&CK contributed the first-ever structured effort to reduce the complexity of the threat landscape with a 4-tiered pyramidal structure, and D3FEND adds countermeasures to it, including detection types/categories.
DeTTECT adds the first attempt at mapping data sources into a graph containing detection
implementations and threat actors.
Atomic Red Team adds unit tests for detection implementations at the functional level.
The SpectreOps blog series adds 2 more levels to the MITRE ATT&CK pyramid, the levels of functional
and literal. It also attempts to provide argumentation for where and how to implement detections at the
right level of granularity of detection modelling and engineering.
MAGMA adds a 3-tier structure, where for the first time detection objectives are mapped to business
objectives, drives and stakeholders. MAGMA also adds the first elements of detection lifecycle
management and a structured maturity approach using CMMI (4). MAGMA adds metadata to measure detection coverage, effectiveness, weight, and potential improvement value. Additionally, MAGMA adds the first-ever attempt at creating a structured detection coverage framework via L1 models mapped to L2, then to L3, with L4 being the detection rule layer.
Of all the frameworks listed and evaluated, MAGMA is by far the most capable for a detection
engineering team, but still falls short of the target.
(3) https://github.com/JohnLaTwC/Shared/blob/master/Defenders%20think%20in%20lists.%20Attackers%20think%20in%20graphs.%20As%20long%20as%20this%20is%20true%2C%20attackers%20win.md
(4) https://www.betaalvereniging.nl/wp-content/uploads/FI-ISAC-Use-Case-Framework-Full-Documentation.pdf
(5) https://link.springer.com/content/pdf/10.1007/s11227-022-04808-6.pdf?pdf=button
(6) https://d3fend.mitre.org/resources/D3FEND.pdf
3. A semantic approach to improving machine readability of a large-scale attack graph (7)
a. The paper cites previous research on modelling attack graphs and explains the need to
understand attack graphs and paths including attacker state transitions. Proposes an
ontology to make attack graphs understandable.
4. SANS 2022 ATT&CK® and D3FEND™ Report: Incorporating Frameworks
into Your Analysis and Intelligence (8)
a. The paper describes the value of D3FEND in informing blue teams about ways to
mitigate threats by using case studies, and specifically, for the 'detect' part of D3FEND,
shows an example of counting implemented detections per D3FEND 'detect' category.
5. How To Improve Security Monitoring With Detection Engineering Program (9) by Oracle SCS
team
Important points from this article, which is very on-point for the challenges presented in this
white paper:
a. The three main components of the SCS program which drive success are 1) having a
formalized detection creation process, 2) using a joint, cross-functional security team,
and 3) using a shared repository to store detection information.
b. We manage the Oracle SaaS Detection Engineering as a joint team, within SCS. It
includes Security Operations, Incident Response, Threat Intelligence, Software
Engineering, technology-specific subject matter experts, and business representatives.
The joint team concept improves collaboration, common goal setting, and, most
importantly, exposes multiple teams to the process and gives them a way to add to the
threat detection requirements. The goal of the team is to develop and manage
technology-solution-agnostic detection capabilities based on proactive threat research
and, reactively, based on security incidents.
c. For each use case, we track data points such as: MITRE ATT&CK alignment, Detection
technology (SIEM, EDR), Log sources, Functionality description, Deployment status,
Sigma rule
d. Input and Concept Phase
In this phase, which fits into the DevSecOps planning phase, we gather and prioritize use
case requests. Use cases can be inspired from many sources, such as internal research,
forensic investigations, or open source data sets which contain detection rules, such as
the Sigma GitHub repository owned by Florian Roth. We use the MaGMA Framework to
help provide control over the security monitoring process and align security monitoring
to business and compliance needs. In this phase, use cases describe follow-up actions
(incident response) and are tied with business drivers to show how security monitoring
reduces risk in the organization.
(7) https://link.springer.com/content/pdf/10.1007/s11227-018-2394-6.pdf?pdf=button
(8) https://fs.hubspotusercontent00.net/hubfs/2617658/Gated%20Assets/Q123%20SANS%20Mitre%20D3FEND%20and%20ATT%26CK%20Report.pdf
(9) https://blogs.oracle.com/cloudsecurity/post/how-to-improve-security-monitoring-with-detection-engineering-program
e. Deployment Phase
The third phase in our use case lifecycle encompasses the release and deploy phases
found in the DevSecOps framework. When a use case is deployed, the team adds it to
the Cyber Analytics Repository (CAR). Within CAR, we use either Sigma or YARA
languages, and we map all the content to the MITRE ATT&CK matrix and assess any
gaps in detection capability. The DeTT&CT framework is an open source project that
"aims to assist blue teams using ATT&CK to score and compare data log source quality,
visibility coverage, detection coverage, and threat actor behaviours" (GitHub, n.d.).
6. Capturing Detection Ideas to Improve Their Impact by Florian Roth (10)
Salient elements from this article, which is also very on-point for this white paper:
(10) https://cyb3rops.medium.com/capturing-detection-ideas-to-improve-their-impact-311cf4e1c7a8
(11) https://medium.com/brexeng/building-the-threat-detection-ecosystem-at-brex-215e98b2f1bc
c. Include a prioritization score from the risk of the system(s) and the likelihood of the
attack scenario.
d. Produce low alert noise and be tuned quickly if creating alert fatigue for the on-call
team.
e. Trigger automation workflows.
f. Be unit tested and regularly reviewed by the team.
OpenTIDE has developed the concept of DetectionOps as a practice of creating as-code objects that represent the entire Detection Engineering process, currently often an ad hoc one, in a repeatable and codified workflow.
During the development of the framework, we noticed that DE time is most often spent away from detection-related work and takes on a more bureaucratic character: carefully documenting new intel and defining future improvements. Much of this work exists because of the lack of a common way to express and work with DE-related data. It is also explained by the general difficulty a DE has in getting the right data – intel comes in many forms, and not always one that makes sense when creating detections (too high-level, too hyper-specific, untechnical, or not focusing on uncovered areas…), and this intel is documented in whatever ends up being available to the DE, which is often an internal wiki or notebook.
OpenTIDE's approach is to break down the DE lifecycle into a threat-driven loop and to deliver artifacts at every milestone. Threats are analyzed into objects called Threat Vector Models (TVMs), detection objectives related to those TVMs are captured as Cyber Detection Models (CDMs), and detection rules fulfilling those objectives as Managed Detection Rules (MDRs). This way, a DE team is always assigned meaningful R&D with a clear outcome, and spends less effort finding the right form of deliverable; the framework provides a straightforward path.
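To make the loop concrete, a minimal sketch of how the three object types could be represented and linked as code is shown below; field names and values are simplified assumptions for illustration, not the actual OpenTIDE schema.

```python
# A minimal sketch of the three OpenTIDE object types and how they reference each other.
# Field names and values are simplified assumptions for illustration only.
from dataclasses import dataclass, field

@dataclass
class ThreatVectorModel:                 # TVM: an analyzed threat
    uuid: str
    name: str
    att_and_ck: list[str] = field(default_factory=list)

@dataclass
class CyberDetectionModel:               # CDM: a detection objective for one or more TVMs
    uuid: str
    objective: str
    tvm_refs: list[str] = field(default_factory=list)

@dataclass
class ManagedDetectionRule:              # MDR: a deployable rule fulfilling a CDM
    uuid: str
    cdm_ref: str
    systems: dict = field(default_factory=dict)   # per-platform configuration

tvm = ThreatVectorModel("TVM-0001", "Malicious command execution over SMB", ["T1021.002"])
cdm = CyberDetectionModel("CDM-0001", "Detect remote service execution over SMB", [tvm.uuid])
mdr = ManagedDetectionRule("MDR-0001", cdm.uuid, {"splunk": {"status": "STAGING"}})
```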
1. Data Ingestion
2. Backlog Prioritization
3. Object Creation
4. Object Review
5. Object Approval
6. Object Merging
As an example implementation:
However, the reality for DE teams is that they often do not have sufficient visibility of their detection coverage, nor of the threat landscape they should cover. Improving situational awareness is complex, as it requires keeping tabs on many offensive and defensive topics, and DE teams often do not have the procedures or tooling to make rapid, efficient decisions. OpenTIDE embraces and develops the concept of a prioritized detection backlog: since threats are now stored as objects, they can be linked to project management procedures and prioritized based on the data they contain.
Measuring threat coverage against detection capabilities is key in the OpenTIDE DetectionOps loop, as it drives every prioritization procedure, which happens in a top-down manner, starting from documented threats and trickling down to detection work.
Prioritization is done by analyzing the incoming intelligence, weighing it against known organization key targets and vulnerabilities, and comparing it against the current detection coverage. The goal of a threat backlog is to efficiently and continuously assign work capacity to the most critical detections to engineer or refine, in other words those that will have the most impact on detection coverage.
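As a rough illustration of such backlog scoring, the sketch below weighs threat severity, organizational relevance and the current coverage gap into a single priority value; the weights and field names are assumptions, not values defined by OpenTIDE.

```python
# A minimal sketch of threat backlog prioritization: weigh incoming intelligence against
# organizational relevance and current coverage. Weights and scales are illustrative only.
def priority_score(threat_severity: float, target_relevance: float, coverage: float) -> float:
    """All inputs in [0, 1]; higher result means the threat should be worked on sooner.
    coverage is the fraction of the threat already covered by existing detections."""
    coverage_gap = 1.0 - coverage
    return round(0.4 * threat_severity + 0.3 * target_relevance + 0.3 * coverage_gap, 3)

backlog = [
    {"tvm": "TVM-0001", "severity": 0.9, "relevance": 0.8, "coverage": 0.2},
    {"tvm": "TVM-0002", "severity": 0.6, "relevance": 0.4, "coverage": 0.7},
]
backlog.sort(key=lambda t: priority_score(t["severity"], t["relevance"], t["coverage"]),
             reverse=True)   # most impactful detection work first
```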
Threat vector Context: Analysis of a chain of events an adversary took to execute malicious commands over SMB.
Chaining allows clearly defined and reusable concepts while still reconstructing a realistic attack path, without ending up with a very complex TVM that would be hard to consume and effectively only useful once.
Threat vector Context: A legitimate DLL (zipfldr.dll) executed by rundll32 to perform malicious operations.
Without chaining, similar TVMs would have to be created for each exploitable DLL. If the attacker uses something other than rundll32, the entire exercise has to be redone, either generalized or duplicated for the new executable. With chaining, the key concepts can be extracted and then related.
With the concept of vector chaining, you inherently and intuitively reach the point of combining various chained threat vectors into more complex detection ideas. Taking this one step further with parallel threat vectors, you end up with detection objectives that can look something like a AND b AND NOT c, combined with d OR e OR f.
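A minimal sketch of how such a composite objective could be evaluated, using invented vector names loosely based on the examples above:

```python
# A minimal sketch of combining chained and parallel threat vectors into a composite
# detection objective such as (a AND b AND NOT c) OR d. Vector names are invented.
observed = {"rundll32_spawn": True, "zipfldr_dll_loaded": True,
            "signed_binary_allowlisted": False, "smb_remote_exec": False}

def objective_met(o: dict) -> bool:
    """Evaluate the boolean combination of threat vectors making up the objective."""
    chained = o["rundll32_spawn"] and o["zipfldr_dll_loaded"] and not o["signed_binary_allowlisted"]
    parallel = o["smb_remote_exec"]
    return chained or parallel

print(objective_met(observed))  # True: the chained path matches
```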
In the optimal situation, all the threat vector data already available globally is catalogued and associated with detection objectives, turning DE efforts into a simpler process of weighing threat vectors and attack paths against each other and focusing effort where the most value is to be gained.
Threat Graph approach to support DE practices
In the last few years, researchers in the global infosec community (13) have attempted to move away from the thinking in 'lists' described by John Lambert of Microsoft in his seminal post (14). OpenTIDE enables defenders to think in graphs, and with OpenTIDE, each object type needs to be described as a TLP:CLEAR object only once for the entire global community to have access to it.
OpenTIDE enables researchers and vendors to share their work (public or private) in a common, normalized format which is directly actionable by receiving parties. For example, an intelligence report describing new threats by an actor could be broken down into TVMs, detection ideas as
(12) https://www.youtube.com/watch?v=I0pbNlzZQtQ
(13) As an example, https://center-for-threat-informed-defense.github.io/attack-flow/introduction/
(14) https://github.com/JohnLaTwC/Shared/blob/master/Defenders%20think%20in%20lists.%20Attackers%20think%20in%20graphs.%20As%20long%20as%20this%20is%20true%2C%20attackers%20win.md
linked CDMs, and some example rules as MDRs. This entire package can be ingested, reviewed and prioritized without further work on the data itself.
Public repositories of detection content could not only be directly deployable detection-as-code by adopting the MDR format, but also be fully mapped to threat data and detection objectives, providing rich context and allowing DE teams to expand into new objects relevant to their organization.
5 Detection Deployment
The OpenTIDE methodology and processes are meant to guide Detection Engineering, and once the relevant objects have been created, there should be clear detection objectives alongside well-defined threats. The last key objects of the framework are Managed Detection Rules, or MDRs, which are detection-as-code objects containing the detection rule configurations.
There are different detection-as-code schemas published, such as Microsoft Sentinel, Elastic, or Splunk. Generally, these wrap the platform's API endpoint and are used to publish detection content in a public repository.
Due to this, there is little to no interoperability between platforms. This may still be functional for teams with a single central SIEM product, but does not work as efficiently when there are many tools (IDS, NDR, EDR, Cloud Detection, Container Security, Mobile Agents, Email Gateway Rules, etc.). Most detection teams create an in-house schema, and there is no known standardization for detection-as-code rules. Sigma is the closest equivalent, and even normalizes the actual content of the query, which can then be translated by different backends into actual queries. It has limitations in its capabilities: since it needs to normalize the features it exposes across many platforms, complex rules with platform-specific techniques often can't be used, and it cannot exploit the full potential of platform features. While very efficient for sharing rules with the community, especially for well-defined environments like Windows, it cannot fully scale to DE teams' needs and is as such rarely operationalized.
OpenTIDE's strategy is to provide a common schema for the basic data, alongside platform-level subschemas. A single MDR can contain the detection rule for a single platform, or many. This is especially useful when the same detection rule is implemented on different SIEMs, or when part of the environment's endpoints are monitored by the SIEM and another part is covered by EDR, as is often the case in large organizations. A layer of processing also takes place to create an easier developer experience – for example, instead of having to write a cron expression, the developer can use a simple time expression (3h) and the conversion is done in the backend.
Each of those subschemas contains rich configuration options, so the full platform features can be used. Queries are written in the platform's native language.
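As an illustration of that processing layer, here is a sketch of a simple time expression conversion; the exact mapping used by OpenTIDE may differ.

```python
import re

# A minimal sketch of the developer-experience layer: a simple time expression like "3h"
# is converted to a cron schedule in the backend (the exact mapping is an assumption).
def to_cron(expression: str) -> str:
    """Translate '15m', '3h' or '1d' style expressions into a cron string."""
    match = re.fullmatch(r"(\d+)([mhd])", expression.strip())
    if not match:
        raise ValueError(f"Unsupported time expression: {expression!r}")
    value, unit = int(match.group(1)), match.group(2)
    if unit == "m":
        return f"*/{value} * * * *"      # every N minutes
    if unit == "h":
        return f"0 */{value} * * *"      # every N hours, on the hour
    return f"0 0 */{value} * *"          # every N days, at midnight

print(to_cron("3h"))   # -> "0 */3 * * *"
```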
5.2 Complex Platform Deployments
We emphasize functionality and expressivity in the MDR Deployer Schemas: platforms are implemented with the full capability of what their API exposes, codifying different features and configuration combinations into conditional schemas where required, with strong validation and, where possible, autocompletion. On open-ended platforms such as Splunk, where there can be a large number of actions, we take a progressive approach: we fully implement the base functionality displayed in the Saved Search editing GUI, adding the popular notable and risk framework actions on top. Actions can be further conditionally enabled or disabled at the configuration file level, and even be conditionally enabled based on the status of the MDR.
Status-based configuration allows advanced users to override parameters based on the status of the MDR (more precisely, the status of each configuration, as an MDR can have multiple configurations that will trigger multiple deployers). This is particularly useful for enabling staging-like deployments: in SecOps, there is rarely a concept of staging for detection tooling. A staging SIEM, for example, is often technically impossible, as the data backend and detection engine are often coupled, or it would require sampling data into another instance, which would make it unusable for detection engineering. Similarly, most teams have one EDR installation, one NDR/IDS console, etc. Detection-as-code is missing a staging deployment concept, and by introducing conditional staging parameters we can enable new types of workflow – for example, changing the severity or disabling notable events outside of the PRODUCTION status.
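A minimal sketch of what such status-based overrides could look like, with illustrative parameter names and statuses:

```python
# A minimal sketch of status-based configuration: parameters are overridden depending on
# the MDR configuration status, enabling staging-like behaviour on the same production
# SIEM. Field names and statuses are illustrative assumptions.
BASE = {"severity": "high", "notable_event": True, "risk_scoring": True}

STATUS_OVERRIDES = {
    "STAGING": {"severity": "informational", "notable_event": False},
    "PRODUCTION": {},     # production keeps the base parameters
}

def effective_config(status: str) -> dict:
    """Apply the overrides registered for the given status on top of the base parameters."""
    return {**BASE, **STATUS_OVERRIDES.get(status, {})}

print(effective_config("STAGING"))
# {'severity': 'informational', 'notable_event': False, 'risk_scoring': True}
```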
By creating tailored system schemas, and tying them with a backend deployment logic, we can expose to
Detection Engineers highly expressive and productive ways to build detections, while ensuring continuous
validation.
During a Merge Request, the pipeline will identify the MDRs to deploy and then check whether their statuses are Staging-grade – if so, it will proceed; if not, it will block deployment. Once merged on main, an automated job will promote the MDR and assign it a default Production-grade status before deploying it to production.
This approach allows a Detection Engineering team to work fully as code, including during the earlier stages of development, and reduces the chances of disrupting production: where several manual steps would otherwise have been needed, a bad commit to the wrong file in a Merge Request is simply blocked unless the status has been downgraded.
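A sketch of the corresponding pipeline logic, with assumed status names for the staging-grade set:

```python
# A minimal sketch of the Merge Request gate and promotion job described above.
# The grading sets and status names are assumptions for illustration.
STAGING_GRADE = {"STAGING", "TESTING"}
PRODUCTION_DEFAULT = "PRODUCTION"

def gate_merge_request(mdr_statuses: dict[str, str]) -> bool:
    """Block the MR pipeline if any MDR touched in the MR is not Staging-grade."""
    offenders = {uuid: s for uuid, s in mdr_statuses.items() if s not in STAGING_GRADE}
    if offenders:
        print(f"Deployment blocked, non staging-grade MDRs: {offenders}")
        return False
    return True

def promote_on_main(mdr_statuses: dict[str, str]) -> dict[str, str]:
    """Once merged to main, assign the default Production-grade status before deployment."""
    return {uuid: PRODUCTION_DEFAULT for uuid in mdr_statuses}

gate_merge_request({"MDR-0001": "STAGING"})      # passes
gate_merge_request({"MDR-0002": "PRODUCTION"})   # blocked in the MR pipeline
```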
Doing so also avoids the confusion with Staging and Production concepts in IT compared to SecOps – we
rarely have spare SIEM instances considered Staging – instead we deploy our Staging or Production rules
to the same production instance – Staging is here considered under the Incident Response lens, i.e.
alerts we can ignore.
1. Pull: The platform is responsible for fetching the current state of the repository, then rectifying its current state against the target. This is a pattern used by Kubernetes, for example, with the Git Operator.
2. Push (Full): Every commit to a dedicated branch (for example, main) triggers a complete re-deployment or re-alignment from a CI/CD pipeline automation; the state of the branch is the state of the platform. Different branches can represent different environments.
3. Push (Incremental): Every commit to a branch triggers a localized deployment. This is adequate when units of deployment can be compartmentalized, with little to no interaction across them. This strategy allows for smaller deployment units across various branches. This is the approach that OpenTIDE uses.
Detection Engineering teams leverage SIEM, EDR, NDR, XDR, IDS and raw telemetry database engines. The data load on these systems is so high that a staging/development instance would not make sense: it would not be able to replicate the data, and most of the tools do not allow decoupling the data backend from the components that perform the threat detections. As a result, SecOps teams do not have two separate SIEM (or other tool) instances that are functional for detection testing. They develop and "release" production-grade detections on the same instance most of the time. A few SIEMs support forms of data replay or sampling, but during detection development it is strongly beneficial to have access to as much data as possible, so these features are rarely ever used in the context of Detection Engineering.
This means that the target of Staging and Production deployments, in the context of detection-as-code, is actually the same infrastructure; we deploy Staging and Production rules onto the same SIEM instance, EDR, etc. This creates a conflict with the trunk-based development philosophy, as we can't deploy the entire branch into a staging or production environment without conflicting with changes being made in parallel in many Merge Requests – we need to deploy only the MDRs that have been created or modified, without impacting other ongoing development or overwriting data that is on the main branch. To resolve this technical challenge, our solution was to allow every commit on any MR to trigger a CI/CD pipeline, which contains a job to calculate a git diff, filter it to isolate the MDRs that can be deployed, and then pass them on to the deployment engines. The git diff is performed differently on main and on feature branches – on main, the calculation is between the commit the pipeline is running on and the previous one; on feature branches, the root of the branch is found by iterating over every commit until we find one with two parents, which is the start of the branch. This allows developers to keep pushing changes to their branch and MR, and the deployment to cover the entire change content with all accumulated changes.
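A simplified sketch of that diff calculation, using plain git commands; the file paths and the branch-root heuristic are illustrative, and the production implementation may differ.

```python
import subprocess

def run(cmd: str) -> str:
    """Run a git command and return its stdout."""
    return subprocess.run(cmd.split(), capture_output=True, text=True, check=True).stdout

def branch_root(ref: str = "HEAD") -> str:
    """Walk the branch history until a commit with two parents is found,
    which we treat here as the point where the feature branch started."""
    for line in run(f"git rev-list --parents {ref}").splitlines():
        sha, *parents = line.split()
        if len(parents) >= 2:          # merge commit: branch root candidate
            return sha
    return run(f"git rev-list --max-parents=0 {ref}").strip()  # fall back to repo root

def changed_mdr_files(base: str, head: str = "HEAD") -> list[str]:
    """git diff between base and head, filtered to MDR files only (path is an assumption)."""
    diff = run(f"git diff --name-only --diff-filter=AM {base} {head}")
    return [p for p in diff.splitlines() if p.startswith("Models/MDR/") and p.endswith(".yaml")]

# On a feature branch: deploy everything changed since the branch root.
# On main: compare the merge commit against its first parent instead.
if __name__ == "__main__":
    print(changed_mdr_files(branch_root()))
```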
6.1 Metaschemas
A specialized YAML format has been developed to allow dynamic generation of JSON Schemas and YAML templates from a single source of truth. The structure of the metaschema format strictly follows the JSON Schema syntax (json-schema.org) and adds supplementary namespaced keys that are used during target schema generation to automate certain data transformations (such as fetching a list of possible values from other files, called vocabularies, or importing smaller common schemas called definitions) or formatting (for example, hiding keys in the template to avoid clutter while keeping them in the schema).
Metaschemas build on top of JSON Schema with custom tooling and functionality
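A minimal sketch of how such namespaced keys could be resolved into a plain JSON Schema; the key names (tide.vocabulary, tide.template-hide) are hypothetical and only illustrate the mechanism.

```python
import yaml  # PyYAML

# A minimal sketch of metaschema resolution with hypothetical namespaced key names.
def resolve_metaschema(node, vocabularies: dict):
    """Recursively replace namespaced helper keys with standard JSON Schema keywords."""
    if isinstance(node, dict):
        resolved = {}
        for key, value in node.items():
            if key == "tide.vocabulary":          # hypothetical helper key
                voc = vocabularies[value]
                resolved["enum"] = [entry["id"] for entry in voc["entries"]]
                resolved["markdownDescription"] = voc.get("description", "")
            elif key == "tide.template-hide":     # hypothetical formatting hint, dropped here
                continue
            else:
                resolved[key] = resolve_metaschema(value, vocabularies)
        return resolved
    if isinstance(node, list):
        return [resolve_metaschema(item, vocabularies) for item in node]
    return node

metaschema = yaml.safe_load("""
type: object
properties:
  killchain:
    tide.vocabulary: att&ck-tactics
""")
vocabularies = {"att&ck-tactics": {"description": "ATT&CK tactics",
                                   "entries": [{"id": "TA0001"}, {"id": "TA0002"}]}}
print(resolve_metaschema(metaschema, vocabularies))
```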
To further support the MDR plugin architecture, each system is represented by a standalone subschema
file, which gets dynamically inserted into the final JSON Schema if the system is enabled at the instance
configuration level.
System Subschemas encapsulate the data modelling part of the detection rule deployment algorithms and make OpenTIDE modular. New system supports can be built and plugged into the framework.
Schemas are the base of OpenTIDE – they define all the allowed data structures and ensure a high standard of consistency and quality
6.5 Vocabularies
OpenTIDE objects often use lists of allowed values as part of their schema. It would be unmaintainable to hardcode those lists in the schemas themselves, and it would prevent reusability. To solve this issue, dedicated YAML files with a specific schema, called vocabularies, contain such data. Vocabularies support several features used to enrich the schemas, for example assigning entries to a stage so that the metaschema can point only to certain stages. During JSON Schema generation, the algorithm dynamically seeks, retrieves, and pivots vocabularies before inserting them into the final schema.
Indexing was introduced in OpenTIDE to resolve those performance issues and to facilitate overall data management and access during development. The entire repository gets indexed only once, including vocabularies, schemas, etc. stored in the injected CoreTIDE repo, and is stored in an object in memory. This object is optimized for access speed, notably by using object IDs as keys.
A JSON representation can also be produced, and it is for example leveraged when creating the wiki on
Gitlab Wiki – the index is generated from the main repo, then injected into the wiki repo. Then,
automation picks up the JSON and generates the documentation without more involved steps.
The JSON Index allows complex data operations to be executed in seconds, even on very large
repositories
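A minimal sketch of the indexing idea; the repository layout and identifier key are assumptions for illustration.

```python
# A minimal sketch of the single-pass, in-memory index keyed by object identifiers.
# The repository layout and the "uuid" key are illustrative assumptions.
import json
from pathlib import Path

import yaml  # PyYAML

def build_index(repo_root: str) -> dict:
    """Walk the repository once and index every YAML object by its UUID for O(1) lookups."""
    index = {}
    for path in Path(repo_root).rglob("*.yaml"):
        data = yaml.safe_load(path.read_text())
        if isinstance(data, dict) and "uuid" in data:
            index[data["uuid"]] = data
    return index

index = build_index(".")
# The same structure can be dumped to JSON, as is done for the GitLab wiki generation.
Path("index.json").write_text(json.dumps(index, indent=2, default=str))
```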
6.7 Configs
Configuration files are written in TOML, to differentiate them from the extensive YAML usage elsewhere. They contain variables commonly used across automation modules and allow central management of features. There are two layers of configuration, which are identical: core-side and instance-side. When a setting is modified on the instance side, it takes precedence over the core side, allowing default configurations to be retained on the core while full customization remains possible from the OpenTIDE instance. This setup is especially useful when introducing new features and related configurations without having to update the instance side manually. On the technical side, this is accomplished at indexation time, using a deep merge resolution. This is encapsulated and can be called by automation modules that need access to resolved configuration before indexation can occur.
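A minimal sketch of the core/instance resolution using a deep merge; the file names are assumptions for illustration.

```python
# A minimal sketch of core/instance configuration resolution with a deep merge
# (TOML file names are illustrative assumptions).
import tomllib  # Python 3.11+; use the "tomli" package on older versions

def deep_merge(core: dict, instance: dict) -> dict:
    """Recursively merge instance settings over core defaults:
    instance values win, nested tables are merged key by key."""
    merged = dict(core)
    for key, value in instance.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

with open("core/configurations/deployment.toml", "rb") as f:
    core_config = tomllib.load(f)
with open("configurations/deployment.toml", "rb") as f:
    instance_config = tomllib.load(f)

config = deep_merge(core_config, instance_config)  # instance-side takes precedence
```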
OpenTIDE configuration files allow a large amount of customization and expressivity, pushing Detection-
as-Code further
Secrets are supported with a specific syntax: if a value starts with a dollar sign ($), a function will look the key up in the environment variables. When using CI systems, environment variables are used to inject secrets into the CI run environment, making it possible to manipulate secrets without hardcoding them in repository files.
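A minimal sketch of that convention; the variable and key names are illustrative.

```python
import os

# A minimal sketch of the "$" secret convention: values starting with a dollar sign
# are looked up in the environment (variable names here are illustrative).
def resolve_value(value):
    """Return the environment variable named after the value when it starts with '$'."""
    if isinstance(value, str) and value.startswith("$"):
        return os.environ.get(value[1:], "")   # empty string if the variable is absent
    return value

config = {"splunk": {"token": "$SPLUNK_TOKEN", "host": "splunk.example.org"}}
resolved = {k: {kk: resolve_value(vv) for kk, vv in v.items()} for k, v in config.items()}
```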
OpenTIDE was developed under the lens of an active SOC environment, where not all Detection contributors will have a background in software development and may struggle to adopt the key concepts around Git versioning, file-based configurations, DevOps workflows, distributed development strategies and other powerful, but often hard to adopt, as-code concepts. Thanks to a strong operational feedback loop, the framework features were aimed at facilitating as-code adoption as much as possible, so that one DevOps champion is enough to support the rest of the team in migrating to the new way of working, with tooling at the forefront to provide quality-of-life features.
A crucial feature of OpenTIDE is the IDE experience: during the framework generation job, part of the platform automation pipeline, templates are generated and turned into VSCode snippets. Developers can quickly use those snippets to start working, while getting full file validation with JSON Schema alongside autocompletion. Auto-completion and validation greatly reduce onboarding time and were crucial for the platform to be adopted in production scenarios. Specific features allow searching a vocabulary that contains only IDs by full name, for example when searching the ATT&CK vocabulary or when trying to reference a model.
Rich auto-completion and documentation
Self-documenting schemas
7.3 Pipelines
When a Detection Engineer commits to their development branch in a Merge Request, or when the MR is merged to main and creates a merge commit, an automation pipeline is triggered. DEs only need to observe whether the pipeline passes – if validation jobs fail, they review the log output, which has been designed so that what went wrong can be parsed immediately.
Rich purpose-designed logs give Detection Engineers immediate feedback
Pipelines are used to support many other functionalities, and logs contain a lot of clear information about what happened.
The instance wiki, containing all the documentation, provides rich pages with links to related objects and visualizations of the knowledge graph, allowing teams to identify detection coverage gaps and better grasp the current detection posture.
Complex documentation exposes the rich internal graph-like data structure through powerful visualizations
PluginTide is part of the plugin architecture used by OpenTIDE to address Detection-as-Code complexity. Most DaC frameworks are hardcoded to only support a handful of platforms, generally those used by the SecOps team who developed them. OpenTIDE's ambition is to be the de facto platform for DE teams, and as such the DaC mechanisms were designed from the start to be modular. New deployment modules can be added, and if a TOML configuration file and a subschema are present, the platform correlates it all and correctly orchestrates the deployment. PluginTide is the interface exposing active plugins on the platform - DeployTide.mdr["splunk"].deploy(deployment=list_of_mdr_uuids_to_deploy) will, for example, trigger the deployment routing for Splunk across the selected set of MDRs.
DataTide exposes all the required hookups to the data of the OpenTIDE repo. When called, it will index
the repo only once and directly return the data on all subsequent calls.
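A minimal sketch of the plugin dispatch and the index-once behaviour; class and module internals are simplified assumptions rather than the actual engine code.

```python
# A minimal sketch of plugin dispatch and single indexation.
# Class names mirror the concepts in the text; internals are simplified assumptions.
from functools import lru_cache

class SplunkDeployer:
    def deploy(self, deployment: list[str]) -> None:
        """Deploy the selected MDR UUIDs to Splunk (placeholder for the real API calls)."""
        for uuid in deployment:
            print(f"Deploying MDR {uuid} to Splunk...")

class DeployTide:
    """Exposes the active deployment plugins, keyed by system name."""
    mdr = {"splunk": SplunkDeployer()}

@lru_cache(maxsize=1)
def data_tide() -> dict:
    """Mirrors DataTide: index the repository on the first call only; later calls
    return the cached object."""
    return {"models": {}, "vocabularies": {}}   # placeholder for the real repository index

DeployTide.mdr["splunk"].deploy(deployment=["uuid-1", "uuid-2"])
```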
The Plugin Architecture of OpenTIDE allows modules to be imported, initialized, and exposed to the codebase for more complex operations.
In Deployment modules, a developer can then point to test files and keep rerunning the deployment against them locally to check whether everything works, instead of having to trigger the pipeline by pushing to the repository in a Merge Request. Secrets are often required for this, and to avoid having to touch the configuration files or manually set the environment variables, the Engines/modules/local_secret.py file can be used to add those variables; in DEBUG mode they will be injected. Missing environment variables are ignored when resolving the configuration files. local_secret.py is part of the .gitignore file, so it won't be pushed to the repo, but local secrets should always be handled with caution.
OpenTIDE custom tooling takes into account many specifics of Detection-as-Code problematics
8 Collaborative detection engineering: OpenTIDE as the first enabler
of Detection Engineering sharing
OpenTIDE artifacts can only be shared manually at the time of release. Pending later releases, or
community contributions conforming with the intended spec, it will become possible to automate the
sharing of DE artifacts.
As of STIX 2.1, the objects that exist to capture data in STIX format do not support representation of CTI data in a way that is targeted towards consumption by DE teams. As OpenTIDE allows CSOC teams to obtain the benefits of data normalization, OpenTIDE objects could be translated into STIX SDOs to construct a new layer of DE-friendly data, sitting between traditional CTI data and detections.
8.3 TIDeX
The sharing of OpenTIDE artifacts could also be automated by implementing a dedicated OpenTIDE exchange network of servers. Similar to the MISP implementation, the TLP assignment to models would drive which artifacts an entity would exchange with another.