Revolutionizing Cybersecurity: Implementing Large Language Models as Dynamic SOAR Tools
Abstract
This research explores the potential of Large Language Models (LLMs), specifically using
ChatGPT Actions, as dynamic SOAR tools to address evolving cybersecurity threats.
Traditional SOAR systems, though effective, demand significant time and resources for
development and maintenance. The study evaluates the ability of LLMs to autonomously detect,
analyze, and respond to threats by integrating them into a controlled environment and
simulating various cybersecurity incidents. Findings reveal that LLM-driven SOAR
tools reduce development time, enhance response effectiveness, and improve
communication clarity. However, challenges such as the need for continuous model updates and
staff training were noted. This research provides a framework for implementing LLM-driven
SOAR tools, highlighting their transformative potential in cybersecurity operations and
suggesting areas for further study.
1. Introduction
In today’s digital age, the pace at which cybersecurity threats evolve demands
equally dynamic defense mechanisms. Security Orchestration, Automation, and
Response (SOAR) systems are crucial in managing these threats by automating complex
workflows and responses. Despite their efficacy, traditional SOAR tools are often
resource-intensive, requiring significant time and expertise to develop and maintain
effective playbooks. This poses a particular challenge for organizations with limited
resources.
2. Research Method
SOAR platforms. The API integrations enabled the LLM to pull data from threat
intelligence feeds, execute automated responses, and update security dashboards.
Figure 2 shows a sample of the configuration for the VirusTotal integration:
This approach ensured that the CustomGPT remained practical and up-to-date with
cybersecurity trends and threats.
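As a hedged illustration of the kind of enrichment the VirusTotal integration described above performs (not the configuration shown in Figure 2), a VirusTotal v3 file-report lookup can be sketched in Python. The environment variable, timeout, and example usage below are assumptions for the sketch only.

    import os
    import requests

    VT_BASE = "https://www.virustotal.com/api/v3"

    def lookup_file_hash(file_hash: str) -> dict:
        """Fetch the latest VirusTotal analysis stats for a file hash (illustrative only)."""
        headers = {"x-apikey": os.environ["VT_API_KEY"]}  # assumed to hold a valid API key
        resp = requests.get(f"{VT_BASE}/files/{file_hash}", headers=headers, timeout=30)
        resp.raise_for_status()
        stats = resp.json()["data"]["attributes"]["last_analysis_stats"]
        return {
            "hash": file_hash,
            "malicious": stats.get("malicious", 0),
            "suspicious": stats.get("suspicious", 0),
            "harmless": stats.get("harmless", 0),
        }

    # Usage (EICAR test-file MD5, safe for testing):
    # print(lookup_file_hash("44d88612fea8a8f36de82e1278abb02f"))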
1. Environment Setup: A Tines account was created, and a dedicated workspace was
configured to replicate the SOAR functionalities intended for comparison with the
CustomGPT. This included setting up data feeds, security tools, and integrations
necessary for incident response and threat management.
2. Automation Configuration: Similar to the CustomGPT, various automations were
created within Tines to handle tasks such as threat detection, analysis, and response.
These automations were designed to mirror the capabilities of the LLM-driven SOAR
tools, providing a direct comparison of performance and efficiency.
3. Validation and Testing: The Tines setup underwent a validation phase where the
configured automations were tested against the same scenarios used for the
CustomGPT. This ensured that both systems were evaluated under comparable
conditions, allowing for an accurate assessment of their respective strengths and
weaknesses.
4. Data Collection and Analysis: Data from the Tines SOAR system was collected
and analyzed in parallel with the data from the CustomGPT, and key performance
metrics were recorded for both systems to support a side-by-side comparison, as
sketched below.
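A minimal sketch of how such side-by-side records could be captured is shown here; the field names and CSV layout are assumptions for illustration, not the study's actual schema.

    import csv
    from dataclasses import asdict, dataclass

    @dataclass
    class TrialRecord:
        scenario: str          # e.g. "phishing", "malware", "network"
        system: str            # "CustomGPT" or "Tines"
        detected: bool         # did the system flag the threat?
        verdict_correct: bool  # did the verdict match ground truth?
        response_seconds: float

    def save_records(records, path="comparison.csv"):
        """Write records for both systems into one file for later analysis."""
        with open(path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(TrialRecord.__dataclass_fields__))
            writer.writeheader()
            writer.writerows(asdict(r) for r in records)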
By setting up both the CustomGPT and a traditional SOAR system with Tines,
the research aimed to provide a comprehensive and comparative analysis of the two
approaches. This dual setup allowed for a robust evaluation of LLM-driven SOAR tools’
potential benefits and limitations in real-world cybersecurity operations.
These diverse scenarios were selected to challenge the LLM's capabilities across
different cybersecurity incidents and to comprehensively assess its effectiveness and
adaptability.
During the simulation phase, the LLM was allowed to autonomously detect,
analyze, and respond to the introduced threats. The execution of these tasks was
monitored closely throughout each scenario.
This detailed monitoring provided critical insights into the LLM’s operational
performance and potential to function as a dynamic SOAR tool.
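One way to picture this instrumentation is to wrap each step so that its duration and outcome are logged for later review. The wrapper and log format below are assumptions for illustration, not the study's actual tooling.

    import functools
    import json
    import logging
    import time

    logging.basicConfig(level=logging.INFO)

    def monitored(step_name: str):
        """Log the duration of each detection, analysis, or response step."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                result = func(*args, **kwargs)
                elapsed = round(time.perf_counter() - start, 3)
                logging.info(json.dumps({"step": step_name, "seconds": elapsed}))
                return result
            return wrapper
        return decorator

    @monitored("detect")
    def detect_threat(payload: str):
        ...  # hand the payload to the CustomGPT (or the Tines workflow) here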
1. Threat Detection and Analysis: Assessing the effectiveness of the LLM in identifying
and analyzing cybersecurity threats, including metrics such as detection accuracy, false
positive rates, and false negative rates (a computation sketch follows this list).
2. Response Actions: Evaluating the LLM’s ability to determine and execute appropriate
response measures, focusing on the success rate of automated actions and their
alignment with predefined security protocols.
3. Accuracy and Reliability: Comparing the precision of the LLM’s actions to expected
SOAR outcomes, assessing consistency, reliability, and any deviations from standard
practices.
4. Automation Efficiency: Measuring the degree of automation achieved and the overall
time saved compared to traditional SOAR processes, highlighting potential productivity
gains from using LLM-driven SOAR tools.
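For reference, the rates in criterion 1 and the time savings in criterion 4 reduce to simple arithmetic over the recorded outcomes. The sketch below shows that computation; the counts in the usage comment are placeholders only.

    def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
        """Accuracy, false positive rate, and false negative rate from raw outcome counts."""
        total = tp + fp + tn + fn
        return {
            "accuracy": (tp + tn) / total,
            "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
            "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
        }

    def fraction_of_time_saved(baseline_seconds: float, automated_seconds: float) -> float:
        """Share of handling time eliminated relative to the slower baseline workflow."""
        return 1.0 - automated_seconds / baseline_seconds

    # Placeholder counts, for illustration only:
    # print(detection_metrics(tp=45, fp=1, tn=50, fn=4))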
The findings were compiled into a detailed report summarizing the feasibility
and effectiveness of using LLMs as dynamic SOAR tools. This report aims to provide a
comprehensive overview of the study’s results, offering valuable insights for
cybersecurity professionals and researchers.
By following this systematic and robust approach, the research ensured that the
study’s findings are reliable, applicable, and beneficial to real-world cybersecurity
operations. This methodology provides a replicable framework for assessing the
potential of LLMs using ChatGPT Actions to function as dynamic SOAR tools, paving
the way for more adaptive, efficient, and effective incident response strategies.
The LLM promptly summarizes the event clearly and understandably. It begins
by extracting relevant data points from the .eml file, ensuring that it gathers all necessary
information for a thorough evaluation. It even uses the VirusTotal integration to enrich
the relevant indicators found within the file. Figure 4 shows the result:
The LLM can contextualize the attack and provide potential motives, past
correlation, and remediation actions. It formats a functional TLDR code block that
could be easily shared or added to an analyst’s case management platform.
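The extraction step described above can be approximated with Python's standard email parser. The header choices and URL pattern below are assumptions for illustration, not the CustomGPT's internal logic.

    import re
    from email import policy
    from email.parser import BytesParser

    URL_RE = re.compile(r"https?://[^\s\"'>]+")

    def extract_email_indicators(eml_path: str) -> dict:
        """Pull the fields from an .eml file that an analyst (or the LLM) would review."""
        with open(eml_path, "rb") as f:
            msg = BytesParser(policy=policy.default).parse(f)
        body = msg.get_body(preferencelist=("plain", "html"))
        text = body.get_content() if body else ""
        return {
            "from": msg.get("From", ""),
            "reply_to": msg.get("Reply-To", ""),
            "subject": msg.get("Subject", ""),
            "auth_results": msg.get("Authentication-Results", ""),
            "urls": sorted(set(URL_RE.findall(text))),
        }

The extracted URLs or attachment hashes could then be passed to a lookup such as the VirusTotal sketch shown earlier for enrichment.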
For the second example, a malicious code block is used. Here is the sample
payload submitted to the LLM:
For the third example, the CustomGPT was supplied with a reasonably simple
network scanning log. Even with this small amount of data, it was able to provide
relevant and valuable analysis. Figure 7 shows the network payload and
initial analysis:
Figure 8 shows how the LLM provides actionable remediation strategies along
with the functional TLDR summary block:
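As a purely hypothetical illustration of the kind of pre-processing that could accompany this analysis (the nmap-style line format, port pattern, and service list are assumptions, not the payload from Figure 7), a small scan log can be summarized as follows.

    import re

    # Matches nmap-style lines such as "3389/tcp open  ms-wbt-server" (assumed format).
    PORT_RE = re.compile(r"^(\d+)/(tcp|udp)\s+open\s+(\S+)", re.MULTILINE)

    RISKY_SERVICES = {"telnet", "ftp", "ms-wbt-server", "vnc", "microsoft-ds"}

    def summarize_scan(log_text: str) -> dict:
        """List open ports from a scan log and flag services that usually warrant remediation."""
        open_ports = [(int(port), proto, svc) for port, proto, svc in PORT_RE.findall(log_text)]
        flagged = [entry for entry in open_ports if entry[2] in RISKY_SERVICES]
        return {"open_ports": open_ports, "flagged_for_remediation": flagged}

    sample_log = "22/tcp   open  ssh\n3389/tcp open  ms-wbt-server\n"
    print(summarize_scan(sample_log))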
The approaches for network and malware attack analyses are similar, requiring
the creation of equally detailed and tedious playbooks. Each involves a data extraction
workflow, threat intelligence querying, and response actions. The outputs of these
playbooks are severely limited by the integrations available, and even with integrations,
they lack advanced capabilities such as code interpretation, summarization, and
enhanced communication.
For these reasons, only the phishing email example is shown. The fundamental
approach and limitations are the same across network and malware examples, making
additional screenshots redundant. Here’s an example of the phishing playbook output:
The provided image shows a phishing email analysis output using the Tines
platform. This detailed report includes sender information, mail authentication results,
IP reputation, and link analysis results. However, it lacks the additional communication
that would help users interpret the results and understand the implications
without further investigation. Moreover, the enrichment details often require opening
external links and reading through additional data to gain a complete picture,
highlighting the limitation of traditional SOAR systems in providing immediate
actionable insights.
One of the most remarkable findings from the study is the LLM’s ability to
improvise and adapt to various cybersecurity scenarios. Unlike traditional SOAR
systems, which rely heavily on predefined playbooks, LLMs can generate contextually
appropriate responses in real-time, even when faced with unfamiliar or evolving threats.
Key observations include:
1. Dynamic Threat Detection: The LLM demonstrated superior performance in
identifying new and complex threats not explicitly defined in its training data. For
example, when presented with novel phishing tactics, the LLM was able to analyze
email patterns, identify suspicious elements, and flag potential threats effectively.
2. Adaptive Response Strategies: The LLM’s ability to adapt its response strategies
based on real-time analysis was evident in scenarios involving rapidly changing
threat landscapes. In a simulated ransomware attack, the LLM detected the initial
breach and adjusted its response as the attack evolved, implementing containment
measures and initiating system recovery protocols.
These findings underscore the LLM’s potential to enhance cybersecurity
operations by providing a flexible and responsive defense mechanism capable of
handling a wide range of incidents with minimal predefined instructions.
1. Speed of Response: The LLM was able to detect and respond to threats within seconds,
whereas traditional SOAR systems, constrained by static playbooks and manual
interventions, took considerably longer.
2. Accuracy and Reliability: The accuracy of the LLM in identifying and mitigating
threats was notably higher. Traditional SOAR systems exhibited higher false
positive and false negative rates, whereas the LLM maintained a lower error margin,
ensuring more reliable threat management.
Metric                             LLM-driven SOAR    Traditional SOAR
False positive rate (Network)      2%                 11%
False positive rate (Phishing)     1%                 9%
False negative rate (Network)      0.8%               6%
False negative rate (Phishing)     0.2%               4%
In addition, the LLM-driven tool provides detailed, clear incident reports, whereas the
traditional SOAR system generates raw data points needing interpretation.
The insights gained from this experiment are crucial for cybersecurity
professionals considering the implementation of LLM-driven SOAR tools and for
researchers aiming to advance this field. Given the significant potential demonstrated by
LLMs in automating and enhancing SOAR functions, it is essential to translate these
findings into practical steps and identify areas that require further investigation.
Monitoring that flags anomalies or unexpected behaviors in the LLM's actions allows for
timely human intervention when necessary.
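A minimal sketch of such a guardrail is given below, assuming a hypothetical allowlist of permitted response actions and a simple confidence threshold; neither is drawn from the study's configuration.

    # Hypothetical allowlist of response actions the LLM may trigger without sign-off.
    ALLOWED_ACTIONS = {"quarantine_email", "block_ip", "isolate_host", "open_ticket"}

    def triage_proposed_actions(proposed: list[dict]) -> tuple[list[dict], list[dict]]:
        """Split LLM-proposed actions into auto-approved ones and anomalies for human review."""
        approved, needs_review = [], []
        for action in proposed:
            name = action.get("action", "")
            confident = action.get("confidence", 0.0) >= 0.8  # assumed threshold
            if name in ALLOWED_ACTIONS and confident:
                approved.append(action)
            else:
                needs_review.append(action)  # unexpected or low confidence: escalate to an analyst
        return approved, needs_review

    # Example with hypothetical LLM output:
    # approved, flagged = triage_proposed_actions([
    #     {"action": "block_ip", "target": "203.0.113.7", "confidence": 0.93},
    #     {"action": "wipe_host", "target": "srv-01", "confidence": 0.70},
    # ])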
5. Conclusion
This study explored the potential of Large Language Models (LLMs) using
ChatGPT Actions to function as dynamic Security Orchestration, Automation, and
Response (SOAR) tools. The primary challenge addressed was whether LLMs could
effectively replace traditional SOAR systems, thereby reducing the resource burdens
associated with developing and maintaining SOAR playbooks and enhancing response
effectiveness. This research found that LLMs significantly decreased the time required
to create and update SOAR playbooks, making advanced security automation more
accessible to organizations with limited resources. Additionally, LLMs provided more
accurate and context-aware responses to cybersecurity threats than traditional SOAR
systems. Moving forward, it is essential to focus on enhancing error detection and
correction mechanisms, continuous model training, and user education to fully realize
the benefits of LLMs in SOAR roles. The ability of LLMs to autonomously manage
SOAR functions with high efficiency and accuracy represents a significant advancement
in cybersecurity operations. Organizations should consider implementing LLM-driven
SOAR tools to improve their security posture, making their incident response more
adaptive, efficient, and effective.