Security_Data_Visualization_Graphical_Techniques_f..._----_(10_Creating_a_Security_Visualization_System)
10
Creating a Security Visualization System
Conti, G. (2007). Security data visualization : Graphical techniques for network analysis. No Starch Press, Incorporated.
Created from inflibnet-ebooks on 2024-01-06 09:25:22.
Many of the ideas in this chapter use concepts from the field of human-
centered design, which seeks to consider strengths and weaknesses of users
at all stages of the design process. By definition, humans are integral to the
visualization system. The best systems analyze users and their tasks and then
use these insights to guide the entire development process. System design
is an iterative process that regularly takes user feedback into account
(see Figure 10-1). By continually taking users and tasks into account,
project designers reach the goals of reduced training time, more efficient
task completion, reduced error rate, and increased system adoption. Your aim
should be an efficient and effective system that is satisfying, even
pleasurable, to use.
[Figure 10-1 labels: Feedback; Visualization, interaction, and dataflow design]
direction. You may be an expert and know specifically the issues you need to
address, but more likely you will gain very valuable insight by talking to the
people you anticipate will use your tool. A common design failing is that the
developer assumes he or she is a typical end user; this is rarely the case.
If real users participate in your initial design process, you will gain the
added benefit of increased adoption and reduced resistance to future
changes. Ultimately, users will decide whether your project is a success or
failure, so their opinion matters.
If you are unable to gain access to actual users, try to find similar
people in similar fields. The closer the match, the more likely you will
acquire accurate information. Users will tell you, often bluntly, what they
need help with, but sometimes they won’t know exactly what to ask for.
A way to get around this problem is to observe them at work and carefully
study how they do their jobs. Don’t forget to consider that your users may
not be operating in isolation and, where appropriate, seek opportunities
to facilitate collaboration and communication. In particular, consider the
consumers of your visualizations (clients, other analysts, managers, and
potential customers, for example) and take into account their requirements
as well.
Potential Data Sources
• Antivirus logs
• Application layer decodes
• Domain Name System (DNS) data
• Extrusion detection system logs
• Filesystem data
• Firewall logs (network- and host-based)
• Honeynet logs
• Host process data
• Intrusion detection alert signatures
• Intrusion detection system logs
• Intrusion prevention system logs
• NetFlow data
• Network port data
• Operating systems in use and passive operating system fingerprinting data
• Packet captures
• Packet interarrival times
• Physical locations and IP geolocation data
• Proprietary network sensors and other appliances
• Protocol compliance and noncompliance
• Reconnaissance tool output, such as Nmap output
• Router logs
• System memory
• Unassigned or illegal IP addresses
• Unix syslog
• Virus definition files
• Vulnerability assessment tool output, such as Nessus output
Copyright © 2007. No Starch Press, Incorporated. All rights reserved.
We’ve seen how researchers can create systems based on a single source
of data, such as network packet captures or intrusion detection alerts. While
that is effective, I believe the future lies in combining multiple sources of
data into carefully crafted visualizations. As you examine the preceding list,
consider the data sources you have available and decide which seem well-
suited to solving your particular problems.
Key Questions for Review of Data Sources
All of these questions require a good deal of thought; some have clear
solutions, while others only lead to more questions. If this all seems a bit
daunting, try starting with something simple. However, as you progress to
more advanced systems, I believe that time spent thinking deeply about
these questions will be time well spent. Regardless of whether you are
designing a simple or advanced system, it is worthwhile to do some trial
and error and just play around with the data. The better you understand
the data, the better you will be able to create an appropriate visualization
system.
Key Questions for Technology Assessment
• How much RAM and hard drive space is available on the typical user’s
workstation?
• How large is the typical user’s monitor?
• Will you be using centralized projectors or other very large displays?
• What display resolutions can typical client machines handle?
• How much processing power is available on client machines?
• What upgrades might be necessary to handle graphically intensive operations
and to provide quick response times?
• Is there an operational impact when accessing data sources, such as slowing
down important databases?
• Is network bandwidth adequate to transfer the projected data in a timely
fashion?
• Will you be able to process the data at sufficient rates, or will you need to
sample the data to reduce the burden?
• What other critical tasks are the users’ machines performing?
• What operating systems and applications are in use?
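One of the questions above asks whether you will need to sample the data to reduce the processing burden. The following is a minimal sketch of one way to do that, assuming records arrive as an iterable of unknown length. The function name `reservoir_sample` is hypothetical, and reservoir sampling is simply one reasonable technique, not one this chapter prescribes.

```python
import random

def reservoir_sample(records, k, seed=None):
    """Return up to k records chosen uniformly at random from an iterable.

    The iterable is read once and never held in memory as a whole, so this
    works on streams that are too large to store.
    """
    rng = random.Random(seed)
    sample = []
    for index, record in enumerate(records):
        if index < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Replace an existing entry with probability k / (index + 1).
            slot = rng.randint(0, index)  # inclusive on both ends
            if slot < k:
                sample[slot] = record
    return sample
```

Uniform sampling keeps the overall shape of the traffic but can drop rare events, so it may be a poor fit when the goal is to spot a single anomalous packet.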
[Figure 10-2 labels: Live sensors; Archived data; Semantic data; Memory buffer; Filter database; Parsing; Filtering; Encoding; Graphics engine; User interface; Visualization 1, 2, 3, ..., n]
Figure 10-2: A typical visualization system. By breaking down the system into
components, it is easier to understand, design, evaluate, and optimize it.
Dataflow Design
The flow of data from collection to presentation to retention is a critical
aspect of information visualization design. By this point, you’ve chosen some
number of datastreams to drive your system. These datastreams will flow
through a variety of subsystems, as you’ve seen in Figure 10-2. For example,
you may combine the inputs from several sensors, extract the fields you
desire, and then pass the data to a graphics engine for display. You can use
the flow of data as a way to learn how new collection and processing systems
work, and you can use dataflow analysis to help build or optimize your
system.
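Combining the inputs from several sensors can be sketched as a time-ordered merge. This is a minimal illustration, assuming each source yields dictionaries that are already sorted by a `time` field. The field names and the `merge_sources` helper are hypothetical and not taken from any particular tool.

```python
import heapq

def _tag(stream, name):
    """Yield copies of each record labeled with the source it came from."""
    for record in stream:
        yield dict(record, source=name)

def merge_sources(**sources):
    """Merge several time-sorted record streams into one time-ordered stream.

    Each keyword argument names a source (for example, firewall=...).
    Every stream must already be sorted by its "time" field.
    """
    tagged = [_tag(stream, name) for name, stream in sources.items()]
    return heapq.merge(*tagged, key=lambda record: record["time"])
```

Because `heapq.merge` is lazy, the merged stream can be consumed record by record without loading every source into memory first.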
Figure 10-2 illustrates the primary components of a typical security
visualization system. Notice that the data flows from real-time and archived
sources through a series of intermediate processing steps and then to a
number of visualization displays. There could be a single data source or
many. Typically, the first processing step involves breaking down the datasets
into individual records and fields, a process called parsing. Recall that I had
data, but this approach is limited to about 100,000 packets because the
amount of RAM in typical workstations is insufficient to hold more. In the
future, I anticipate using a RAM buffer to provide high-speed access for a
given window of data elements that will be updated by slower, but far larger,
hard drive storage.
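The in-memory window described above can be sketched with a bounded queue. This is a minimal illustration under stated assumptions: the `PacketWindow` class is hypothetical, the default capacity simply echoes the 100,000-packet figure mentioned above, and the slower disk-backed store is left out entirely, so the oldest packets are discarded rather than archived.

```python
from collections import deque

class PacketWindow:
    """Keep only the most recent packets in RAM."""

    def __init__(self, capacity=100_000):
        # A deque with maxlen silently drops the oldest item when full.
        self._buffer = deque(maxlen=capacity)
        self.total_seen = 0

    def add(self, packet):
        self._buffer.append(packet)
        self.total_seen += 1

    def __len__(self):
        return len(self._buffer)

    def latest(self, count):
        """Return the count most recent packets, oldest first."""
        if count <= 0:
            return []
        return list(self._buffer)[-count:]
```

A fuller design would write evicted packets to disk and reload them when the user scrolls back past the window.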
The next step is filtering. Note that filtering usually occurs early in the
processing pipeline in order to reduce unnecessary processing downstream.
After filtering the data, the system passes the remaining elements to the
encoding subsystem, where items of interest are flagged for special display
characteristics. Finally, the resulting annotated elements are passed to a
graphics engine that drives the visualization displays. Throughout this process,
the user influences the pipeline using the system’s interface.
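The parse, filter, encode, and display stages can be sketched as a chain of small functions. This is only an illustration, assuming a made-up input format of `src,dst,port` text lines. Every function name here is hypothetical, and `render` returns strings as a stand-in for a real graphics engine.

```python
def parse(lines):
    """Break raw text lines into records with named fields."""
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) != 3:
            continue  # skip malformed lines rather than crash
        src, dst, port = fields
        try:
            yield {"src": src, "dst": dst, "port": int(port)}
        except ValueError:
            continue  # port was not a number

def filter_records(records, keep):
    """Drop records early so later stages do less work."""
    return (record for record in records if keep(record))

def encode(records, is_interesting):
    """Flag items of interest for special display characteristics."""
    for record in records:
        color = "red" if is_interesting(record) else "gray"
        yield dict(record, color=color)

def render(records):
    """Stand-in for a graphics engine: describe what would be drawn."""
    return [
        f"{r['color']}: {r['src']} -> {r['dst']}:{r['port']}"
        for r in records
    ]

lines = ["10.0.0.1,10.0.0.2,80", "bad line", "10.0.0.3,10.0.0.2,23"]
to_server = filter_records(parse(lines), lambda r: r["dst"] == "10.0.0.2")
output = render(encode(to_server, lambda r: r["port"] == 23))
```

In an interactive system the `keep` and `is_interesting` callables would be supplied by the user interface, which is how the user influences the pipeline while it runs.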
As I discussed in Chapter 9, security visualization systems are likely targets
for attack. As you design your system, consider how external or internal
threats could inject malicious data and harm your system. Depending on the
data you are using to drive your visualization system, inserting malicious data
may be very easy to do. One powerful way to help resist attack is to create
different visualization windows into the data. While it might be easy to attack
one visualization technique, it is often very difficult for an attacker to
influence all of them.
NOTE A single developer (or even a small team of them) will have a difficult time adding
new features to a system. If you can tap into the creativity of your user base, you can
help extend functionality for everyone’s benefit. Tools such as Wireshark have successfully
done so by allowing users to create and share new protocol parsers. (As a result,
Wireshark can process over 700 different protocols.) Virtually every component of a
visualization system could benefit from this approach. Prime targets include parsers,
encoding schemes, filters, GUI components, signatures, semantic information (such as
lists of illegal IP addresses), and, perhaps most importantly, new visualizations. Imagine
if your users could easily drop in different visualization windows to customize the
system to their particular needs. The sharing of these extensions could be accomplished
in several ways, such as with plug-ins, skinnable interfaces, or flat text files.
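One way to let users contribute parsers is a simple registry that maps a name to a function. This is a minimal sketch, not how Wireshark or any other tool implements extensions. The `register_parser` decorator, the `PARSERS` table, and the `csv_flow` format are all hypothetical.

```python
PARSERS = {}

def register_parser(name):
    """Decorator that adds a parsing function to the shared registry."""
    def decorator(func):
        PARSERS[name] = func
        return func
    return decorator

@register_parser("csv_flow")
def parse_csv_flow(line):
    """Example user-contributed parser for 'src,dst,port' lines."""
    src, dst, port = line.strip().split(",")
    return {"src": src, "dst": dst, "port": int(port)}

def parse_with(name, line):
    """Look up a parser by name and apply it to one line of input."""
    try:
        parser = PARSERS[name]
    except KeyError:
        raise ValueError(f"no parser registered for {name!r}") from None
    return parser(line)
```

The same pattern could cover filters, encoding schemes, or visualization windows. Keep in mind that loading code written by others carries its own security risk, which matters for a security tool.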
Visualization Design
The best visualization systems help the user find what is interesting
and minimize the noise of what is not. For me, this is the most fun and
exciting (but sometimes most difficult) part of designing visualization
systems. Your goal is to map the available data to the visual display in a
way that provides the insights you desire; in other words, you are creating
useful windows into the data. Inspiration is everywhere; I usually carry a
notebook to sketch out ideas when they strike. A prime source for ideas is
the wide variety of techniques found in the traditional information visualization
community, as many have not been tried by security researchers. But
beyond academic research, fresh ideas can be found in art, science fiction,
nature, cartography, and even in those highly inaccurate hacker movies.
As you go through this process, remember that you do not need to constrain
yourself to a single type of visualization; you may find that multiple
perspectives on the data will best accomplish your goals. This has certainly
been my experience with RUMINT, which allows seven different simultaneous
views. Multiple perspectives may include big-picture context views as
well as detail views, both graphical and textual. Remember the mantra from
Chapter 1, and plan accordingly.
After you’ve considered the problem at hand and sketched out some
prototype visualizations, seek feedback from experts and typical users. Your
results will be worth the effort. After you’ve carefully thought about which
visualizations would be most useful, start to consider how the user will best
interact with them. We’ll look at user interaction in the next section.
NOTE Do not forget to consider reporting requirements for your system. How will your
visualizations and supporting analysis be shared with others? One approach might be to
allow analysts to annotate the images, add supporting commentary, and then save the
result as an HTML document. This page could also include summary statistics about
the data, the filters used, and any color coding employed.
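The HTML report idea in this note can be sketched as follows. This is a minimal illustration, assuming the visualization has already been saved as an image file. The `build_report` function and its parameters are hypothetical, and all user-supplied text is escaped so that analyst commentary cannot inject markup into the page.

```python
from html import escape

def build_report(title, image_path, commentary, stats, filters, color_key):
    """Return a standalone HTML page describing one saved visualization."""
    def table_rows(mapping):
        return "\n".join(
            f"<tr><td>{escape(str(key))}</td><td>{escape(str(value))}</td></tr>"
            for key, value in mapping.items()
        )

    filter_items = "\n".join(f"<li>{escape(rule)}</li>" for rule in filters)
    return f"""<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>{escape(title)}</title></head>
<body>
<h1>{escape(title)}</h1>
<img src="{escape(image_path)}" alt="Annotated visualization">
<p>{escape(commentary)}</p>
<h2>Summary statistics</h2>
<table>
{table_rows(stats)}
</table>
<h2>Filters used</h2>
<ul>
{filter_items}
</ul>
<h2>Color coding</h2>
<table>
{table_rows(color_key)}
</table>
</body>
</html>
"""
```

Recording the filters and color key alongside the image lets a reader who was not present during the analysis interpret the picture correctly.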
Interaction Design
How the user interacts with the system is a key factor in its overall success.
Remember, your goal is to design the interface and interaction metaphors
that best support your users’ tasks. As I described in Chapter 1,
Ben Shneiderman’s InfoVis mantra (overview first, zoom and filter, details on
demand) is a time-tested rule you can use to help guide your development.
For example, if you are building a system to monitor network traffic, you
might begin with an overview of the network, then allow the user to precisely
filter unwanted flows, zoom in on areas of interest, and easily call up
specific packet or flow details on demand. The key is that you aren’t just
creating static images to analyze; you are facilitating the users’ exploration of
the data. Interaction allows the users to establish goals, execute a number of
commands, and quickly observe the effect. By repeating these steps, the user
will home in on the areas that most interest him or her. In addition, by gaining
more familiarity with the dataset, users will gain unexpected insights.
The interface of your application is the mechanism by which users
dictate their goals to the computer. Interfaces come in many forms, with
the command-line and graphical user interfaces being the most common.
While the command-line interface is a powerful tool beloved by security
experts, it probably isn’t the best choice for building an interactive security
visualization system. It’s not that it can’t be done; it’s just that I haven’t
come across a visualization system that has done it well. Chances are, you
will require a graphical user interface. However, you can emulate the performance
of the command line by employing keyboard shortcuts to help speed
up user interaction. You may also find a way to allow users to create scripts
(textual listings of commands) that allow portions of the analysis to be
automated and customized.
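User scripts of this kind can be as simple as one command per line. The following is a minimal sketch, assuming the application exposes its actions as ordinary functions. The `run_script` helper and the idea of passing a command table are hypothetical and not drawn from RUMINT or any other tool.

```python
import shlex

def run_script(text, commands):
    """Run one command per line and return the list of results.

    Blank lines and anything after a # are ignored. Each command name is
    looked up in the commands mapping and called with the remaining words.
    """
    results = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        tokens = shlex.split(line, comments=True)
        if not tokens:
            continue
        name, *args = tokens
        if name not in commands:
            raise ValueError(f"line {line_number}: unknown command {name!r}")
        results.append(commands[name](*args))
    return results
```

Limiting scripts to a fixed table of named commands, rather than evaluating arbitrary code, keeps shared scripts from becoming an attack path.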
Interface design is a tricky process, and many good books and papers
have been written on the subject. One key concept is the use of prototypes.
These can be as simple as paper sketches or as complex as mockups created
with a tool such as Microsoft Visual Studio. By asking potential users to evaluate
these early designs, you can avoid costly changes after you’ve written
significant amounts of code. As you build your interfaces, don’t expect to
get them 100 percent correct with your first attempt. While prototypes will
help you identify some errors, only real-world use will help identify others.
It will often take several iterations to arrive at an optimal solution.
Take, for example, the RUMINT system I’ve used several times in this
book. The VCR-like interface may look like an obvious choice, but it wasn’t
so obvious at the time. I first included basic VCR playback buttons on the
visualization windows themselves, but I eventually realized that using the
familiar VCR interface as the primary metaphor allowed users to rapidly
understand and intuitively use the program, because I didn’t have to teach
them how to use a VCR. Rather than include playback controls on the windows
themselves, I integrated them into a single VCR interface. It was my
repeated trial and error with the early interfaces, combined with user feedback,
that eventually led to this simple, clean solution.
While RUMINT combines the VCR interface with a number of
playback visualization windows, it still needs enhanced window management;
RUMINT users spend too much time moving and resizing the
windows. I’m seeking a more efficient approach to solve this problem.
Likewise, the user configures many aspects of the tool to his or her own needs;
I would like to support this customization by allowing the user to save (and
later load or share) specifications for window locations, sizes, color settings,
and default filters. As I look to the future, I would like to extend the VCR
metaphor to allow direct capture from network sensors by adding recording
functionality. I expect to eventually arrive at similar types of solutions based
on trial and error and user feedback.
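Saving and sharing window locations, sizes, color settings, and default filters could be done with a plain JSON file. This is a minimal sketch of that idea, not a description of how RUMINT stores its settings. The function names and the layout keys used in the usage note are hypothetical.

```python
import json

def save_layout(path, layout):
    """Write window positions, colors, and default filters to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(layout, handle, indent=2, sort_keys=True)

def load_layout(path):
    """Read a layout previously written by save_layout."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A layout might look like `{"windows": [{"name": "overview", "x": 0, "y": 0, "width": 640, "height": 480}], "colors": {"tcp": "#00ff00"}, "default_filters": ["port != 22"]}`. A text format such as JSON is easy for users to inspect and share, and it avoids inventing a custom binary format.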
elegantly addresses the filtering problem by letting its users filter via a powerful
command-line language, which is able to filter based on any field or set
of fields from the packet data.
As you attempt to incorporate filtering into your own systems, consider
that each dataset has a number of records, individual entries made up of data
fields such as alert levels and IP addresses. You could use this information
to precisely identify specific elements within any dataset. You need not rely
on just these existing fields, however—you could also incorporate additional
semantic information derived from the dataset or from an external source.
While there is no canonical list, the list of ideas I presented in “Review All
Datastreams” on page 185 should get you started. Finally, after you’ve identified
how you will allow your users to precisely identify given objects in the
dataset, you can then allow users to filter these elements or highlight their
presence using color or some other enhancement.
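The approach in this paragraph (match on existing fields, add derived semantic information, then hide or highlight) can be sketched as follows. This is an illustration only: the record fields, the `style` function, and the `SUSPECT_NETWORKS` list are hypothetical, and the single network in that list is a documentation-only address range chosen as a placeholder.

```python
import ipaddress

# Placeholder semantic information: source networks that should never appear.
SUSPECT_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]

def annotate(record):
    """Add a derived field based on information outside the record itself."""
    address = ipaddress.ip_address(record["src"])
    suspect = any(address in network for network in SUSPECT_NETWORKS)
    return dict(record, suspect_src=suspect)

def matches(record, criteria):
    """True when every field in criteria has the given value in the record."""
    return all(record.get(field) == value for field, value in criteria.items())

def style(records, hide=None, highlight=None):
    """Drop records matching hide; color records matching highlight red."""
    styled = []
    for record in map(annotate, records):
        if hide and matches(record, hide):
            continue
        emphasized = bool(highlight) and matches(record, highlight)
        styled.append(dict(record, color="red" if emphasized else "gray"))
    return styled
```

Because the criteria are plain dictionaries, they could also be saved to a file and shared between analysts.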
I believe the future lies not only with the ability to create very precise
filters and encoding schemes, but also in allowing users to share their results
with others and to export filtering rules to automated machine processors.
I’ll cover this subject in greater detail in Chapter 11.
NOTE Do not forget to take into account ease of installation. I suspect you have encountered
software that installed easily and ran flawlessly; however, I’m sure you’ve struggled
with the opposite, as well. If your system is difficult to install, that will dramatically
reduce the adoption of your work. Even if your users are experts, your goal should be to
make installation as easy as clicking and running.
Algorithm selection and secure coding practices are also essential considerations.
Your algorithms will make a huge difference in the ultimate
performance of the system. Response time and graphic frame rate are
intimately tied to these decisions. Employing techniques such as RAM buffers
and background processing and plotting, as well as other best practices
from the gaming and graphics communities, will help ensure the best result.
Likewise, secure coding is essential. As a security tool, your system must be
resistant to attack. Following secure coding best practices will help you avoid
common pitfalls like buffer overflows, format string attacks, and code injection.
Also in this vein, recall the benefits of using a well-vetted file format,
rather than creating your own.
Test and Evaluate the System
Users will surprise you. Each person has a different set of experiences and
perceptual capabilities and a unique way of thinking, so each user will
inevitably find surprising uses for your tool. At the same time, users may
be confused by what you thought to be the simplest aspects of your system,
perhaps even making you question your initial assumptions. Getting user
feedback early and often will help you avoid many of these issues. You may
need to go back and update previous stages of your work to resolve a problem.
Some issues may be good ideas that you can defer to a later version.
Going from a prototype that only you can use, to a system that a few hundred
experts can use, to something widely deployable, is easily two orders
of magnitude more complex.
How do you evaluate success? The final litmus test is how well the system
helps the user perform the tasks you outlined in “Key Questions for User
and Task Analysis” on page 185.
• How many bug reports have you received? What type are they?
• Do any visualizations inadvertently lead the user to incorrect conclusions?
• How well does the system scale with increased data elements and increased
data collection rate?
• How many data elements can it accommodate before the display is
occluded? In other words, what is the tipping point when it becomes
unusable?
• Do any users consider the system to be something that just generates pretty
pictures? Why do they think so?
NOTE As you attempt to find and remove performance bottlenecks, try to determine whether
the system is CPU, memory, network, data, or human bound.
Testing and evaluating your system should include more than just
human-centric analysis; it should include analysis of traditional computing
platform metrics such as CPU and memory utilization, as well as graphical
frame rate. By combining these results with your human-centric review, you
can find performance bottlenecks. You will probably find there is some limiting
factor in system performance that you can work to eliminate.
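A first step toward locating a bottleneck is to time each stage of the pipeline and track frame rate. This is a minimal sketch, assuming each stage can be called as a function. The `StageTimer` class and `frames_per_second` helper are hypothetical, and they measure wall-clock time only, so memory, network, and human limits need separate measurement.

```python
import time
from collections import defaultdict

class StageTimer:
    """Accumulate wall-clock time spent in each named pipeline stage."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.calls = defaultdict(int)

    def timed(self, name, func, *args, **kwargs):
        """Call func, charge the elapsed time to name, and return its result."""
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            self.totals[name] += time.perf_counter() - start
            self.calls[name] += 1

    def slowest(self):
        """Return the stage with the largest total time, or None if empty."""
        if not self.totals:
            return None
        return max(self.totals, key=self.totals.get)

def frames_per_second(frame_seconds):
    """Average frame rate given the duration of each rendered frame."""
    total = sum(frame_seconds)
    return len(frame_seconds) / total if total > 0 else 0.0
```

Wrapping each stage call in `timer.timed("parse", ...)`, `timer.timed("render", ...)`, and so on shows where most of the time goes before you start optimizing.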
NOTE Try attacking your system with malicious data. How difficult is it to make the visualization
melt down or otherwise make the system unusable or unreliable?
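One inexpensive way to start is to feed random bytes to your parser and confirm that it only ever fails in the way you expect. This is a minimal sketch of that idea: `parse_record` is a hypothetical parser standing in for your own, and random input is only the crudest form of hostile data, so passing this check does not show that a system is robust.

```python
import random

def parse_record(raw):
    """Hypothetical parser under test: ASCII 'src,dst,port' bytes.

    Rejects bad input by raising ValueError and nothing else.
    """
    text = raw.decode("ascii")  # UnicodeDecodeError is a ValueError
    src, dst, port = text.strip().split(",")
    port_number = int(port)
    if not 0 <= port_number <= 65535:
        raise ValueError("port out of range")
    return {"src": src, "dst": dst, "port": port_number}

def fuzz(parser, trials=1000, max_len=64, seed=0):
    """Feed random byte strings to parser and collect unexpected exceptions."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        length = rng.randrange(max_len)
        raw = bytes(rng.randrange(256) for _ in range(length))
        try:
            parser(raw)
        except ValueError:
            pass  # a clean rejection is the expected outcome
        except Exception as error:
            failures.append((raw, error))
    return failures
```

An empty list from `fuzz(parse_record)` means every random input was either parsed or cleanly rejected. Any entry in the list is an input worth turning into a regression test.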
Generate Documentation
Documentation is an often-overlooked aspect of a complete system. Soft-
ware engineering best practices state that documentation generation should
take place throughout the process. I agree, but I recommend that you pay
particular attention to how you describe the operation of the visualizations
themselves. If you only use text, it is difficult to explain how a visualization
works and how data is mapped to the display. I advise you to make ample
use of annotated graphical examples, as I have in this book. Beyond straightforward
educational examples, consider creating a smart book (described in
Chapter 2) of known types of legitimate and suspicious activity. Perhaps you
could facilitate this by making the creation of smart book pages part of your
system’s report generation functionality.
Conclusions
I’ve covered a lot in this chapter, but I believe its insights and sample
questions will help you avoid mistakes, employ best practices, and make
development easier. Try to use your users’ tasks as guideposts throughout
each of the steps. By doing so, you avoid simply tossing your tool “over the
wall” to your users, and you will likely be rewarded with a successful result.