Drawing Graphs With Graphviz: Emden R. Gansner February 1, 2011
Graphviz Drawing Library Manual, February 1, 2011 2
Contents
1 Introduction
  1.1 String-based layouts
    1.1.1 dot
    1.1.2 xdot
    1.1.3 plain
    1.1.4 plain-ext
    1.1.5 GXL
  1.2 Graphviz as a library
5 Graphics renderers
  5.1 The GVJ_t data structure
  5.2 Inside the obj_state_t data structure
  5.3 Color information
6 Adding Plug-ins
  6.1 Writing a renderer plug-in
  6.2 Writing a device plug-in
  6.3 Writing an image loading plug-in
7 Unconnected graphs
1 Introduction
The Graphviz package consists of a variety of software for drawing attributed graphs. It implements a
handful of common graph layout algorithms. These are:
dot A Sugiyama-style hierarchical layout[STT81, GKNV93].
neato An implementation of the Kamada-Kawai algorithm[KK89] for “symmetric” layouts. This is a
variation of multidimensional scaling[KS80, Coh87].
fdp An implementation of the Fruchterman-Reingold algorithm[FR91] for “symmetric” layouts. This
layout is similar to neato, but there are performance and feature differences.
circo A circular layout combining aspects of the work of Six and Tollis[ST99, ST00] and Kaufmann and
Wiese[KW].
In addition, Graphviz provides an assortment of more general-purpose graph algorithms, such as transitive
reduction, which have proven useful in the context of graph drawing.
The package was designed[GN00] to rely on the “program-as-filter” model of software, in which
distinct graph operations or transformations are embodied as programs. Graph drawing and manipulation are
achieved by using the output of one filter as the input of another, with each filter recognizing a common,
text-based graph format. One thus has an algebra of graphs, using a scripting language to provide the base
language with variables and function application and composition.
Despite the simplicity and utility of this approach, some applications need or desire to use the software
as a library with bindings in a non-scripting language, rather than as primitives composed using a scripting
language. The Graphviz software provides a variety of ways to achieve this, running a spectrum from very
simple but somewhat inflexible to fairly complex but offering a good deal of application control.
1.1.1 dot
This format relies on the DOT language to describe the graphs, with attributes attached as name-value pairs.
The graph library provides a parser for graphs represented in DOT. Using this, it is easy to read the
graphs and query the desired attributes using agget or agxget. For more information on these functions,
see Section 2.1.1. The string representations of the various types referred to are described in Appendix E.
On output, the graph will have a bb attribute of type rectangle, specifying the bounding box of the
drawing. If the graph has a label, its position is specified by the lp attribute of type point.
Each node gets pos, width and height attributes. The first has type point, and indicates the center
of the node in points. The width and height attributes are floating point numbers giving the width and
height, in inches, of the node’s bounding box. If the node has a record shape, the record rectangles are given
in the rects attribute. This has the format of a space-separated list of rectangles. If the node is a polygon
(including ellipses) and the vertices attribute is defined for nodes, this attribute will contain the vertices
of the node, in inches, as a space-separated list of pointf values. For ellipses, the curve is sampled, the
number of points used being controlled by the samplepoints attribute. The points are given relative
to the center of the node. Note also that the points only give the node’s basic shape; they do not reflect
any internal structure. If the node has peripheries greater than one, or a shape like "Msquare", the
vertices attribute does not represent the extra curves or lines.
Every edge is assigned a pos attribute having splineType type. If the edge has a label, the label
position is given in the lp attribute, of type point.
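Because pos is given in points while width and height are given in inches, an application reading these attributes has to convert units before combining them. The following is a minimal sketch of that conversion, assuming the attribute strings have already been fetched (for example with agget) and that one inch is 72 points. The names box_pts and node_box_in_points are hypothetical and are not part of the Graphviz API.

```c
#include <assert.h>
#include <stdio.h>

#define POINTS_PER_INCH 72.0

/* Axis-aligned box in points: lower-left (llx, lly), upper-right (urx, ury). */
typedef struct {
    double llx, lly, urx, ury;
} box_pts;

/*
 * Build a node's bounding box, in points, from the string attributes
 * written by the dot format:
 *   pos    - "x,y", the node center in points
 *   width  - width in inches
 *   height - height in inches
 * Returns 0 on success, -1 if a string cannot be parsed.
 * (Hypothetical helper, not a Graphviz function.)
 */
int node_box_in_points(const char *pos, const char *width,
                       const char *height, box_pts *out)
{
    double x, y, w, h;

    assert(pos && width && height && out);
    if (sscanf(pos, "%lf,%lf", &x, &y) != 2) return -1;
    if (sscanf(width, "%lf", &w) != 1) return -1;
    if (sscanf(height, "%lf", &h) != 1) return -1;

    w *= POINTS_PER_INCH;   /* inches -> points */
    h *= POINTS_PER_INCH;
    out->llx = x - w / 2.0;
    out->lly = y - h / 2.0;
    out->urx = x + w / 2.0;
    out->ury = y + h / 2.0;
    return 0;
}
```

For a node with pos "27,18", width "0.75" and height "0.5", the resulting box runs from (0,0) to (54,36) in points.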
1.1.2 xdot
The xdot format is a strict extension of the dot format, in that it provides the same attributes as dot as
well as additional drawing attributes. These additional attributes specify how to draw each component of the
graph using primitive graphics operations. This can be particularly helpful in dealing with node shapes and
edge arrowheads. Unlike the information provided by the vertices attribute described above, the extra
attributes in xdot provide all geometric drawing information, including the various types of arrowheads
and multiline labels with variations in alignment. In addition, all the parameters use the same units.
There are six new attributes, listed in Table 1. These drawing attributes are only attached to nodes and
edges. Clearly, the last four attributes are only attached to edges.
The values of these attributes are strings consisting of the concatenation of some (multi-)set of the 7
drawing operations listed in Table 2. The color, font name, and style values supplied in the C, c, F, and S
operations have the same format and interpretation as the color, fontname, and style attributes in the
source graph.
In handling alignment, the application may want to recompute the string width using its own font
drawing primitives.
The text operation is only used in the label attributes. Normally, the non-text graphics operations are
only used in the non-label attributes. If, however, a node has shape="record" or an HTML-like label
is involved, a label attribute may also contain various graphics operations. In addition, if the decorate
attribute is set on an edge, its label attribute will also contain a polyline operation.
All coordinates and sizes are in points. If an edge or node is invisible, no drawing operations are attached
to it.
1.1.3 plain
The plain format is line-based and very simple to parse. This works well for applications which need or
wish to avoid using the graph library. The price for this simplicity is that the format encodes very little
detailed layout information beyond basic position information. If an application needs more than what is
supplied in the format, it should use the dot or xdot format.
There are four types of lines: graph, node, edge and stop. The output consists of a single graph
line; a sequence of node lines, one for each node; a sequence of edge lines, one for each edge; and a
single terminating stop line. All units are in inches, represented by a floating point number.
As noted, the statements have very simple formats.
graph scale width height
node name x y width height label style shape color fillcolor
edge tail head n x1 y1 ... xn yn [label xl yl] style color
stop
We now describe the statements in more detail.
graph The width and height values give the width and height of the drawing. The lower left corner of
the drawing is at the origin. The scale value indicates how the drawing should be scaled if a size
attribute was given and the drawing needs to be scaled to conform to that size. If no scaling is
necessary, it will be set to 1.0. Note that all graph, node and edge coordinates and lengths are given
unscaled.
node The name value is the name of the node, and x and y give the node’s position. The width and height
are the width and height of the node. The label, style, shape, color and fillcolor values give the node’s
label, style, shape, color and fillcolor, respectively, using default attribute values where necessary. If
the node does not have a style attribute, "solid" is used.
edge The tail and head values give the names of the tail and head nodes. n is the number of control
points defining the B-spline forming the edge. This is followed by 2 ∗ n numbers giving the x and
y coordinates of the control points in order from tail to head. If the edge has a label attribute,
this comes next, followed by the x and y coordinates of the label’s position. The edge description is
completed by the edge’s style and color. As with nodes, if a style is not defined, "solid" is used.
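Since the statements are whitespace-separated fields, they can be read with little more than sscanf. The sketch below parses a graph line, a node line, and the control points of an edge line. It assumes that no field contains embedded whitespace (quoted names or labels with spaces would need extra handling), and the type and function names are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

typedef struct {
    double scale, width, height;
} plain_graph;

typedef struct {
    char name[64];
    double x, y, width, height;
    char label[64], style[32], shape[32], color[32], fillcolor[32];
} plain_node;

/* Parse "graph scale width height". Returns 1 on success, 0 otherwise. */
int parse_plain_graph(const char *line, plain_graph *g)
{
    assert(line && g);
    return sscanf(line, "graph %lf %lf %lf",
                  &g->scale, &g->width, &g->height) == 3;
}

/* Parse "node name x y width height label style shape color fillcolor". */
int parse_plain_node(const char *line, plain_node *n)
{
    assert(line && n);
    return sscanf(line, "node %63s %lf %lf %lf %lf %63s %31s %31s %31s %31s",
                  n->name, &n->x, &n->y, &n->width, &n->height,
                  n->label, n->style, n->shape, n->color, n->fillcolor) == 10;
}

/*
 * Parse "edge tail head n x1 y1 ... xn yn" and store the control points
 * in xy[0..2n-1]. The optional label and the style and color fields that
 * follow are left unparsed. Returns n, or -1 on error.
 */
int parse_plain_edge_points(const char *line, char tail[64], char head[64],
                            double *xy, int max_points)
{
    int n, used, i;

    assert(line && tail && head && xy);
    if (sscanf(line, "edge %63s %63s %d%n", tail, head, &n, &used) != 3)
        return -1;
    if (n < 0 || n > max_points)
        return -1;
    line += used;
    for (i = 0; i < n; i++) {
        if (sscanf(line, "%lf %lf%n", &xy[2 * i], &xy[2 * i + 1], &used) != 2)
            return -1;
        line += used;
    }
    return n;
}
```

All numbers read this way are in inches and unscaled, so an application that honors the scale value would multiply the coordinates and sizes by it before drawing.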
1.1.4 plain-ext
The plain-ext format is identical with the plain format, except that port names are attached to the
node names in an edge, when applicable. It uses the usual DOT representation, where port p of node n is
given as n:p.
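An application reading plain-ext output therefore has to separate the port from the node name in each edge endpoint. A minimal sketch follows, assuming the node name itself contains no colon; split_port is a hypothetical helper, not a Graphviz function. Anything after the first colon is returned unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split an endpoint of the form "n:p" in place. On return the buffer
 * holds just the node name; the result points at the port name, or is
 * NULL when no port is attached.
 */
char *split_port(char *endpoint)
{
    char *colon;

    assert(endpoint != NULL);
    colon = strchr(endpoint, ':');
    if (colon == NULL)
        return NULL;        /* plain node name, no port */
    *colon = '\0';          /* terminate the node name  */
    return colon + 1;       /* port starts after ':'    */
}
```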
1.1.5 GXL
The GXL [Win02] dialect of XML is a widely accepted standard for representing attributed graphs as text,
especially in the graph drawing and software engineering communities. As an XML dialect, there are
many tools available for parsing and analyzing graphs represented in this format. Other graph drawing
and manipulation packages either use GXL as their main graph language, or provide a translator. In this,
Graphviz is no different. We supply the programs dot2gxl and gxl2dot for converting between the
DOT and GXL formats. Thus, if an application is XML-based, to use the Graphviz tools, it needs to insert
these filters as appropriate between its I/O and the Graphviz layout programs.
Here, we just note the gvc parameter. This is a handle to a Graphviz context, which contains drawing
and rendering information independent of the properties pertaining to a particular graph. For the present,
we view this as an abstract parameter required for various Graphviz functions. We will discuss it further in
Section 4.
2.1.1 Attributes
In addition to the abstract graph structure provided by nodes, edges and subgraphs, the Graphviz libraries
also support graph attributes. These are simply string-valued name/value pairs. Attributes are used to specify
any additional information which cannot be encoded in the abstract graph. In particular, the attributes are
heavily used by the drawing software to tailor the various geometric and visual aspects of the drawing.
Reading attributes is easily done. The function agget takes a pointer to a graph component (node,
edge or graph) and an attribute name, and returns the value of the attribute for the given component. Note
that the function may return either NULL or a pointer to the empty string. The first value indicates that
the given attribute has not been defined for any component in the graph of the given kind. Thus, if abc
is a pointer to a node and agget(abc,"color") returns NULL, then no node in the root graph has a
color attribute. If the function returns the empty string, this usually indicates that the attribute has been
defined but the attribute value associated with the specified object is the default for the application. So, if
agget(abc,"color") now returns "", the node is taken to have the default color. In practical terms,
these two cases are very similar. Using our example, whether the attribute value is NULL or "", the drawing
code will still need to pick a color for drawing and will probably use the default in both cases.
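Since both results usually lead to the same action, an application may find it convenient to fold the two cases together. The sketch below shows one way to do so. The helper attr_or_default is hypothetical and simply takes whatever string the lookup returned, so it does not depend on the library.

```c
#include <assert.h>
#include <stddef.h>

/*
 * value is whatever a lookup such as agget returned: NULL when the
 * attribute was never declared, "" when it is declared but unset for
 * this object. Both cases fall back to the application's default.
 */
const char *attr_or_default(const char *value, const char *fallback)
{
    assert(fallback != NULL);
    if (value == NULL || value[0] == '\0')
        return fallback;
    return value;
}
```

With this in place, the color example above might read attr_or_default(agget(abc, "color"), "black"), where "black" stands for whatever default the application chooses.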
Setting attributes is a bit more complex. Before attaching an attribute to a graph component, the code
must first set up the default case. This is accomplished by a call to agraphattr, agnodeattr or
agedgeattr for graph, node or edge attributes, respectively. The types of the 3 functions are identical.
They all take a graph and two strings as arguments, and return a representation of the attribute. The first
string gives the name of the attribute; the second supplies the default value, which must not be NULL. The
graph must be the root graph.
Once the attribute has been initialized, the attribute can be set for a specific component by calling
agset with a pointer to the component, the name of the attribute and the value to which it should be set.
For simplicity, the library also provides the function agsafeset, which takes a fourth argument,
the first three arguments being the same as those of agset. This function first checks that the named
attribute has been declared for the given graph component. If it has not, it declares the attribute, using its
last argument as the required default value. It then sets the attribute value for the specific component.
When an attribute is assigned a value, the graph library replicates the string. This means the application
can use a temporary string as the argument; it does not have to keep the string throughout the application.
Each node, edge, and graph maintains its own attribute values. Obviously, many of these are the same
strings, so to save memory, the graph library uses a reference counting mechanism to share strings. An
application can employ this mechanism by using the agstrdup() function. If it does, it must also use the
agstrfree() function if it wishes to release the string. Graphviz supports HTML-like tables as labels.
To allow these to be handled transparently, the library uses a special version of reference counted strings.
To create one of these, one uses agstrdup_html() rather than agstrdup(). The agstrfree() function is
still used to release the string.
Note that some attributes are replicated in the graph, appearing once as the usual string-valued attribute,
and also in an internal machine format such as an int, double or some more structured type. An application
should only set attributes using strings and agset. The implementation of the layout algorithm may change
the machine-level representation or may change when it does the conversion from a string value. Hence,
the low-level interface cannot be relied on by the application. Also note that there is not a one-to-one
correspondence between string-valued attributes and internal attributes. A given string attribute might be
encoded as part of some data structure, might be represented via multiple fields, or may have no internal
representation at all.
In order to expedite the reading and writing of attributes for large graphs, Graphviz provides a
lower-level mechanism for manipulating attributes which can avoid hashing a string. Attributes have a
representation of type Agsym_t. This is basically the value returned by the initialization functions agraphattr,
etc. It can also be obtained by a call to agfindattr, which takes a graph component and an attribute
name. If the attribute has been defined, the function returns a pointer to the corresponding Agsym_t value.
This can be used to directly access the corresponding attribute value, using the functions agxget and
agxset. These are identical to agget and agset, respectively, except that instead of taking the attribute
name as the second argument, they use the index field of the Agsym_t value to extract the attribute value
from an array.
Due to the nature of the implementation of attributes in Graphviz, an application should, if possible,
attempt to define and initialize all attributes before creating nodes and edges.
The drawing algorithms in Graphviz use a large collection of attributes, giving the application a great
deal of control over the appearance of the drawing. For more detailed information on what the attributes
mean, the reader should consult the manual Drawing graphs with dot.
We can divide the attributes into those that affect the placement of nodes, edges and clusters in the
layout and those, such as color, which do not. Table 4 gives the node attributes which have the potential to
change the layout. This is followed by Tables 5, 6 and 7, which do the same for edges, graphs, and clusters.
Note that in some cases, the effect is indirect. An example of this is the nslimit attribute, which
potentially reduces the effort spent on network simplex algorithms to position nodes, thereby changing
the layout. Some of these attributes affect the initial layout of the graph in universal coordinates. Others
only play a role if the application uses the Graphviz renderers (cf. Section 2.3), which map the drawing
into device-specific coordinates related to a concrete output format. For example, Graphviz only uses the
center attribute, which specifies that the graph drawing should be centered within its page, when the
library generates a concrete representation. The tables distinguish these device-specific attributes by a †
symbol at the start of the Use column.
Tables 8, 9, 10 and 11 list the node, edge, graph and cluster attributes, respectively, that do not affect
the placement of components. Obviously, the values of these attributes are not reflected in the position
information of the graph after layout. If the application handles the actual drawing of the graph, it must
decide if it wishes to use these attributes or not.
Among these attributes, some are used more frequently than others. A graph drawing typically needs to
encode various application-dependent properties in the representations of the nodes. This can be done with
text, using the label, fontname and fontsize attributes; with color, using the color, fontcolor,
fillcolor and bgcolor attributes; or with shapes, the most common attributes being shape, height,
width, style, fixedsize, peripheries and regular.
Edges often display additional semantic information with the color and style attributes. If the edge
is directed, the arrowhead, arrowsize, arrowtail and dir attributes can play a role. Using splines
rather than line segments for edges, as determined by the splines attribute, is done for aesthetics or clarity
rather than to convey more information.
There are also a number of frequently used attributes which affect the layout geometry of the nodes
and edges. These include compound, len, lhead, ltail, minlen, nodesep, pin, pos, rank,
rankdir, ranksep and weight. Within this category, we should also mention the pack and overlap
attributes, though they have a somewhat different flavor.
The attributes described thus far are used as input to the layout algorithms. There is a collection of
attributes, displayed in Table 12, which, by convention, Graphviz uses to specify the geometry of a layout.
After an application has used Graphviz to determine position information, if it wants to write out the graph
in DOT with this information, it should use the same attributes.
In addition to the attributes described above which have visual effect, there is a collection of attributes
used to supply identification information or web actions. Table 13 lists these.
Table 12: Attributes specifying the geometry of a layout
Name        Use
bb          bounding box of drawing or cluster
lp          position of graph, cluster or edge label
pos         position of node or edge control points
rects       rectangles used in records
vertices    points defining node’s boundary, if requested
Table 13: Attributes supplying identification information or web actions
Name          Use
URL           hyperlink associated with node, edge, graph or cluster
comment       comments inserted into output
headURL       URL attached to head label
headhref      synonym for headURL
headtarget    browser window associated with headURL
headtooltip   tooltip associated with headURL
href          synonym for URL
tailURL       URL attached to tail label
tailhref      synonym for tailURL
tailtarget    browser window associated with tailURL
tailtooltip   tooltip associated with tailURL
target        browser window associated with URL
tooltip       tooltip associated with URL
as those of the layout programs listed in Section 1. Thus, "dot" is used to invoke dot, etc.
The layout algorithm will do everything that the corresponding program would do, given the graph and
its attributes. This includes assigning node positions, representing edges as splines, handling the special
case of an unconnected graph, plus dealing with various technical features such as preventing node overlaps.
There are two special layout engines available in the library: "nop" and "nop2". These correspond to
running the neato command with the flags -n and -n2, respectively. That is, they assume the input graph
already has position information stored for nodes and, in the latter case, some edges. They can be used to
route edges in the graph or perform other adjustments. Note that they expect the position information to be
stored as pos attributes in the nodes and edges. The application can do this itself, or use the dot renderer.
For example, if one wants to position the nodes of a graph using a dot layout, but wants edges drawn as
line segments, one could use the following code. The first call to gvLayout lays out the graph using dot;
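Agraph_t* G;
GVC_t* gvc;
/*
 * Create gvc and graph
 */
gvLayout (gvc, G, "dot");         /* position nodes using dot */
gvRender (gvc, G, "dot", NULL);   /* attach pos attributes */
gvFreeLayout (gvc, G);            /* release the dot layout */
gvLayout (gvc, G, "nop");         /* add straight-line edges */
gvRender (gvc, G, "png", stdout); /* write PNG on stdout */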
the first call to gvRender attaches the computed position information to the nodes and edges. The second
call to gvLayout adds straight-line edges to the already positioned nodes; the second call to gvRender
outputs the graph in PNG format on stdout.
The allowed strings are the same ones used with the -T flag when the layout program is invoked from a
command shell.
After a graph has been laid out using gvLayout, an application can perform multiple calls to the
rendering functions. A typical instance is one
in which the graph is laid out using the dot algorithm, followed by PNG bitmap output and a corresponding
map file which can be used in a web browser.
Sometimes, an application will decide to do its own rendering. An application-supplied drawing routine,
such as drawGraph in Figure 1 can then read this information, map it to display coordinates, and call
routines to render the drawing.
One simple way to do this is to use the position and drawing information as supplied by the dot or
xdot format (see Sections 1.1.1 and 1.1.2). To get this, the application can call the appropriate renderer,
passing a NULL stream pointer to gvRender (see footnote 5), as in Figure 2. This will attach the information as string
attributes. The application can then use agget to read the attributes.
On the other hand, an application may desire to read the primitive data structures used by the algorithms
to record the layout information. In the remainder of this section, we describe these data structures in
reasonable detail. An application can use these values directly to guide its drawing. In some cases, for example,
with arrowheads attached to bezier values or HTML-like labels, it would be onerous for an application to
fully interpret the data. For this reason, if an application wishes to provide all of the graphics features while
avoiding the low-level details of the data structures, we suggest either using the xdot approach, described
above, or supplying its own renderer plug-in as described in Section 5.
In general, the graph library allows an application to define specific data fields which are compiled
into the node, edge and graph structures. These have the names
• Agnodeinfo_t
• Agedgeinfo_t
• Agraphinfo_t
respectively. The Graphviz layout algorithms rely on a specific set of fields to record position and drawing
information, and therefore provide definitions for the three fields. Thus, the definitions of the information
fields are fixed by the layout library and cannot be altered by the application (see footnote 6).
These information structures occur as the field named u in the node, edge and graph structure. The
definition of the information structures as defined by the layout code is given in types.h, along with various
auxiliary types such as point or bezier. This file also provides macro expressions for accessing these
fields. Thus, if np is a node pointer, the width field can be read using np->u.width or ND_width(np).
Edge and graph attributes follow the same convention, with prefixes ED_ and GD_, respectively. We strongly
deprecate the former access method, for the usual reason of good programming style. By using the macros,
source code will not be affected by any changes to how the value is provided.
5
This convention only works, and only makes sense, with the dot and xdot renderers. For other renderers, a NULL stream will
cause output to be written on stdout.
6
This is a limitation of the graph library. We plan to remove this restriction by moving to a mechanism which allows arbitrary
dynamic extensions to the node, edge and graph structures. Meanwhile, if the application requires the addition of extra fields, it
can define its own structures, which should be extensions of the components of the information types, with the additional fields
attached at the end. Then, instead of calling aginit(), it can use the more general aginitlib(), and supply the sizes of its
nodes, edges and graphs. This will ensure that these components will have the correct sizes and alignments. The application can
then cast the generic graph types to the types it defined, and access the additional fields.
Each node has ND_coord_i, ND_width and ND_height attributes. The value of ND_coord_i gives
the position of the center of the node, in points. The ND_width and ND_height attributes specify the
size of the bounding box of the node, in inches.
Edges, even if a line segment, are represented as B-splines or piecewise Bezier curves. The spl attribute
of the edge stores this spline information. It has a pointer to an array of 1 or more bezier structures. Each
of these describes a single piecewise Bezier curve as well as associated arrowhead information. Normally,
a single bezier structure is sufficient to represent an edge. In some cases, however, the edge may need
multiple bezier parts, as when the concentrate attribute is set, whereby mostly parallel edges are
represented by a shared spline. Of course, the application always has the possibility of drawing a line
segment connecting the centers of the edge’s nodes.
If a subgraph is specified as a cluster, the nodes of the cluster will be drawn together and the entire
subgraph is contained within a rectangle containing no other nodes. The rectangle is specified by the bb
attribute of the subgraph, with its coordinates given in points in the global coordinate system.
faithful in the rendering, it may be preferable to use the xdot information or to supply its own renderer
plugin.
For edges, each bezier structure has, in addition to its list of control points, fields for specifying
arrowheads. If bp points to a bezier structure and the bp->sflag field is true, there should be an
arrowhead attached to the beginning of the bezier. The field bp->sp gives the point where the nominal tip
of the arrowhead would touch the tail node. (If there is no arrowhead, bp->list[0] will touch the node.)
Thus, the length and direction of the arrowhead are determined by the vector going from bp->list[0] to
bp->sp. The actual shape and width of the arrowhead are determined by the arrowtail and arrowsize
attributes. Analogously, an arrowhead at the head node is specified by bp->eflag and the vector from
bp->list[bp->size-1] to bp->ep.
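To illustrate how the arrowhead vector can be used, the following sketch builds a plain triangular arrowhead from the end of the spline and the nominal tip. It is only an illustration under stated assumptions: the pt type is a stand-in for the library's point type, arrow_triangle is a hypothetical helper, and the fixed width-to-length ratio replaces the real shapes selected by the arrowhead, arrowtail and arrowsize attributes.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the library's point type; only x and y are assumed. */
typedef struct { double x, y; } pt;

/*
 * Fill tri[0..2] with a plain triangular arrowhead. tri[0] is the tip;
 * tri[1] and tri[2] are the base corners on either side of the spline
 * end. spline_end plays the role of bp->list[0] (or the last control
 * point), and tip the role of bp->sp (or bp->ep). half_width_ratio is
 * half the base width divided by the arrowhead length.
 */
void arrow_triangle(pt spline_end, pt tip, double half_width_ratio, pt tri[3])
{
    double vx = tip.x - spline_end.x;    /* arrowhead vector: gives both */
    double vy = tip.y - spline_end.y;    /* length and direction         */
    double ux = -vy * half_width_ratio;  /* perpendicular to the vector, */
    double uy =  vx * half_width_ratio;  /* scaled to the half width     */

    assert(tri != NULL);
    tri[0] = tip;
    tri[1].x = spline_end.x + ux;
    tri[1].y = spline_end.y + uy;
    tri[2].x = spline_end.x - ux;
    tri[2].y = spline_end.y - uy;
}
```

For a spline ending at (0,0) with the tip at (10,0) and a ratio of 0.25, the base corners come out at (0,2.5) and (0,-2.5).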
The label field (ND_label(n), ED_label(e), GD_label(g)) encodes any text label associated
with a graph object. Edges, graphs and clusters will occasionally have labels; nodes almost always have a
label, since the default label is the node’s name. The basic label string is stored in the text field, while the
fontname, fontcolor and fontsize fields describe the basic font characteristics. In many cases, the
basic label string is further parsed, either into multiple, justified text lines, or as a nested box structure for
HTML-like labels or nodes of record shape. This information is available in other fields.
In all of the algorithms, the first step is to call a layout-specific initialization function. These
functions initialize the graph for the particular algorithm. This will first call common routines to set up basic
data structures, especially those related to the final layout results and code generation. In particular, the
size and shape of nodes will have been analyzed and set at this point, which the application can access
via the ND_width, ND_height, ND_ht, ND_lw, ND_rw, ND_shape, ND_shape_info and ND_label
attributes. Initialization will then establish the data structures specific to the given algorithm. Both the
generic and specific layout resources are released when the corresponding cleanup function is called in
gvFreeLayout (cf. Section 2.4).
By default, the layout algorithms position the edges as well as the nodes of the graph. As this may be
expensive to compute and irrelevant to an application, an application may decide to avoid this. This can be
achieved by setting the graph’s splines attribute to the empty string "".
The algorithms all end with a postprocessing step. The role of this is to do some final tinkering with
the layout, still in layout coordinates. Specifically, the function rotates the layout for dot (if rankdir is
set), attaches the root graph’s label, if any, and normalizes the drawing so that the lower left corner of its
bounding box is at the origin.
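The normalization part of this step amounts to a translation. The sketch below shows the idea on a bare array of points, assuming a stand-in pt type and a hypothetical function name. The library's own bounding box also accounts for node sizes, labels and splines, which this illustration ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the library's point type; only x and y are assumed. */
typedef struct { double x, y; } pt;

/*
 * Translate pts[0..n-1] so that the lower-left corner of their
 * bounding box lands on the origin. Returns the upper-right corner
 * of the translated bounding box.
 */
pt normalize_to_origin(pt *pts, size_t n)
{
    double minx, miny, maxx, maxy;
    pt ur;
    size_t i;

    assert(pts != NULL && n > 0);
    minx = maxx = pts[0].x;
    miny = maxy = pts[0].y;
    for (i = 1; i < n; i++) {           /* find the bounding box */
        if (pts[i].x < minx) minx = pts[i].x;
        if (pts[i].x > maxx) maxx = pts[i].x;
        if (pts[i].y < miny) miny = pts[i].y;
        if (pts[i].y > maxy) maxy = pts[i].y;
    }
    for (i = 0; i < n; i++) {           /* shift it to the origin */
        pts[i].x -= minx;
        pts[i].y -= miny;
    }
    ur.x = maxx - minx;
    ur.y = maxy - miny;
    return ur;
}
```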
Except for dot, the algorithms also provide a node’s position, in inches, in the array given by ND_pos.
3.1 dot
The dot algorithm produces a ranked layout of a graph respecting edge directions if possible. It is
particularly appropriate for displaying hierarchies or directed acyclic graphs. The basic layout scheme is attributed
to Sugiyama et al.[STT81]. The specific algorithm used by dot follows the steps described by Gansner et
al.[GKNV93].
The steps in the dot layout are:
initialize
rank
mincross
position
sameports
splines
compoundEdges
After initialization, the algorithm assigns each node to a discrete rank (rank) using an integer program
to minimize the sum of the (discrete) edge lengths. The next step (mincross) rearranges nodes within
ranks to reduce edge crossings. This is followed by the assignment (position) of actual coordinates to
the nodes, using another integer program to compact the graph and straighten edges. At this point, all nodes
will have a position set in the coord attribute. In addition, the bounding box bb attribute of each cluster is
set.
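To make the ranking objective concrete, the sketch below assigns ranks with a simple longest-path pass and reports the sum of the resulting edge lengths. This is not the integer program that dot solves: longest-path ranking yields a feasible ranking, with every edge spanning at least one rank, but not necessarily one of minimum total edge length. The edge type and function name are hypothetical, and a minimum edge length of 1 is assumed throughout.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int tail, head; } edge;

/*
 * Longest-path ranking for a DAG with n nodes and m edges.
 * On return rank[v] is the smallest rank satisfying
 * rank[head] >= rank[tail] + 1 for every edge, with sources at rank 0.
 * Returns the sum of edge lengths rank[head] - rank[tail], or -1 if
 * the edges contain a cycle.
 */
int longest_path_rank(int n, const edge *edges, int m, int *rank)
{
    int pass, i, changed = 1, total = 0;

    assert(n > 0 && m >= 0 && rank != NULL);
    for (i = 0; i < n; i++)
        rank[i] = 0;

    /* Relax edges until nothing moves; a DAG settles within n passes. */
    for (pass = 0; pass < n && changed; pass++) {
        changed = 0;
        for (i = 0; i < m; i++) {
            if (rank[edges[i].head] < rank[edges[i].tail] + 1) {
                rank[edges[i].head] = rank[edges[i].tail] + 1;
                changed = 1;
            }
        }
    }
    if (changed)
        return -1;          /* still moving after n passes: cycle */

    for (i = 0; i < m; i++)
        total += rank[edges[i].head] - rank[edges[i].tail];
    return total;
}
```

The returned total is the quantity the rank step seeks to minimize, so it can serve as a baseline when comparing rankings.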
The sameports step is an addition to the basic layout. It implements the feature, based on the edge
attributes "samehead" and "sametail", by which certain edges sharing a node all connect to the node
at the same point.
Edge representations are generated in the splines step. At present, dot draws all edges as B-splines,
though some edges will actually be the degenerate case of a line segment.
Although dot supports the notion of cluster subgraphs, its model does not correspond to general
compound graphs. In particular, a graph cannot have edges connecting two clusters, or a cluster and a node. The
layout can emulate this feature. Basically, if the head and tail nodes of an edge lie in different, non-nested
clusters, the edge can specify these clusters as a logical head or logical tail using the lhead or ltail
attribute. The spline generated in splines for the edge can then be clipped to the bounding box of the
specified clusters. This is accomplished in the compoundEdges step.
3.2 neato
The layout computed by neato is specified by a virtual physical model, i.e., one in which nodes are treated
as physical objects influenced by forces, some of which arise from the edges in the graph. The layout is
then derived by finding positions of the nodes which minimize the forces or total energy within the system.
The forces need not correspond to true physical forces, and typically the solution represents some local
minimum. Such layouts are sometimes referred to as symmetric, as the principal aesthetics of such layouts
tend to be the visualization of geometric symmetries within the graph. To further enhance the display of
symmetries, such drawings tend to use line segments for edges.
The model used by neato comes from Kamada and Kawai[KK89], though it was first introduced by
Kruskal and Seely[KS80] in a different format. The model assumes there is a spring between every pair of
vertices, each with an ideal length. The ideal lengths are a function of the graph edges. The layout attempts
to minimize the energy in this system.
Layout steps, in order: initialize, position, adjust, splines.
As usual, the layout starts with an initialization step. The actual layout is parameterized by the mode
and model attributes. The mode attribute determines how the optimization problem is solved: either by the
default stress majorization[GKN04] mode (mode="major"), or by the gradient descent technique proposed
by Kamada and Kawai[KK89] (mode="KK"). The latter mode is typically slower than the former, and
introduces the possibility of cycling. It is maintained solely for backward compatibility.
The model indicates how the ideal distances are computed between all pairs of nodes. By default, neato
uses a shortest path model (model="shortpath"), so that the length of the spring between nodes p and
q is the length of the shortest path between them in the graph. Note that the shortest path calculation takes
into account the lengths of edges as specified by the "len" attribute, with one inch being the default.
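As an illustration of the shortest path model, the following sketch derives ideal spring lengths from edge lengths. It uses the Floyd-Warshall algorithm as a compact stand-in and is not neato's own code; MAXN, ideal_lengths and the negative "no edge" marker are hypothetical conventions chosen for the example.

```c
#include <assert.h>

#define MAXN 16            /* illustrative bound, not a Graphviz limit */
#define NO_EDGE (-1.0)     /* marks "no edge" in the input matrix */

/* len[i][j] holds the length of edge i--j (1.0 when the "len" attribute
 * is unspecified) or NO_EDGE. On return, dist[i][j] is the ideal spring
 * length; pairs with no connecting path keep the value unreachable. */
void ideal_lengths(int n, double len[MAXN][MAXN],
                   double dist[MAXN][MAXN], double unreachable)
{
    assert(n > 0 && n <= MAXN);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            if (i == j)
                dist[i][j] = 0.0;
            else if (len[i][j] >= 0.0)
                dist[i][j] = len[i][j];
            else
                dist[i][j] = unreachable;
        }
    /* Floyd-Warshall: allow each node k in turn as an intermediate stop. */
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
}
```

For a path a--b--c with lengths 1.0 and 2.5, the spring between a and c gets length 3.5.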
If mode="KK" and the graph attribute pack is false, neato sets the distance between nodes in separate
connected components to $1.0 + L_{avg} \cdot \sqrt{|V|}$, where $L_{avg}$ is the average edge length and $|V|$ is the number
of nodes in the graph. This supplies sufficient separation between components so that they do not overlap.
Typically, the larger components will be centrally located, while smaller components will form a ring around
the outside.
In some cases, an application may decide to use the circuit model (model="circuit"), a model
based on electrical circuits as first proposed by Cohen[Coh87]. In this model, the spring length is derived
from resistances using Kirchhoff’s law. This means that the more paths between p and q in the graph, the
smaller the spring length. This has the effect of pulling clusters closer together. We note that this approach
only works if the graph is connected. If the graph is not connected, the layout automatically reverts to the
shortest path model.
The third model is the subset model (model="subset"). This sets the length of each edge to be the
number of nodes that are neighbors of exactly one of the end points, and then calculates remaining distances
using shortest paths. This helps to separate nodes with high degree.
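The edge length rule of the subset model can be sketched as a count over an adjacency matrix. This is only a reading of the sentence above, not the library's implementation: it assumes that the two end points themselves are not counted, and MAXN and subset_edge_length are hypothetical names.

```c
#include <assert.h>

#define MAXN 16   /* illustrative bound, not a Graphviz limit */

/* Length of edge u--v under the subset model: the number of nodes that
 * are adjacent to exactly one of u and v.
 * Assumption: u and v themselves are excluded from the count. */
int subset_edge_length(int n, int adj[MAXN][MAXN], int u, int v)
{
    int count = 0;
    assert(n <= MAXN && u != v && adj[u][v]);
    for (int w = 0; w < n; w++) {
        if (w == u || w == v)
            continue;
        if ((adj[u][w] != 0) != (adj[v][w] != 0))
            count++;
    }
    return count;
}
```

Under this reading, an edge leaving a node of high degree gets a large length, which is what pushes such nodes away from their neighbors.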
The basic algorithm used by neato performs the layout assuming point nodes. Since in many cases, the
final drawing uses text labels and various node shapes, the drawing ends up with many nodes overlapping
each other. For certain uses, the effect is desirable. If not, the application can use the adjust step to
reposition the nodes to eliminate overlaps. This is controlled by the graph attribute "overlap".
With nodes positioned, the algorithm proceeds to draw the edges using its splines function. By
default, edges are drawn as line segments. If, however, the "splines" graph attribute is set to true, the
edges will be constructed as splines[DGKN97], routing them around the nodes. Topologically, the spline
follows the shortest path between two nodes while avoiding all others. Clearly, for this to work, there can be
no node overlaps. If overlaps exist, edge creation reverts to line segments. When this function returns,
the positions of the nodes will be recorded in their coord attribute, in points.
The programmer should be aware of certain limitations and problems with the neato algorithm. First,
as noted above, if mode="KK", it is possible for the minimization technique used by neato to cycle, never
finishing. At present, there is no way for the library to detect this, though once identified, it can easily be
fixed by simply picking another initial position. Second, although multiedges affect the layout, the spline
router does not yet handle them. Thus, two edges between the same nodes will receive the same spline.
Finally, neato provides no mechanism for drawing clusters. If clusters are required, one should use the fdp
algorithm, which belongs to the same family as neato and is described next.
3.3 fdp
The fdp layout is similar in appearance to neato and also relies on a virtual physical model, this time
proposed by Fruchterman and Reingold[FR91]. This model uses springs only between nodes connected
with an edge, and an electrical repulsive force between all pairs of nodes. Also, it achieves a layout by
minimizing the forces rather than energy of the system.
Unlike neato, fdp supports cluster subgraphs. In addition, it allows edges between clusters and nodes,
and between clusters. At present, an edge from a cluster cannot connect to a node or cluster within
the cluster.
Layout steps, in order: initialize, position, splines.
The layout scheme is fairly simple: initialization; layout; and a call to route the edges. In fdp, because
it is necessary to keep clusters separate, the removal of overlaps is (usually) obligatory.
3.4 twopi
The radial layout algorithm represented by twopi is conceptually the simplest in Graphviz. Following an
algorithm described by Wills[Wil97], it takes a node specified as the center of the layout and the root of the
generated spanning tree. The remaining nodes are placed on a series of concentric circles about the center,
the circle used corresponding to the graph-theoretic distance from the node to the center. Thus, for example,
all of the neighbors of the center node are placed on the first circle around the center. The algorithm allocates
angular slices to each branch of the induced spanning tree to guarantee enough space for the tree on each
ring. At present, the algorithm does not attempt to visualize clusters.
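The ring assignment described above amounts to a breadth-first search from the center. The sketch below shows only that step, with hypothetical names (MAXN, assign_rings) and an adjacency matrix as input; it does not allocate angular slices and is not taken from the twopi sources.

```c
#include <assert.h>

#define MAXN 16   /* illustrative bound, not a Graphviz limit */

/* ring[v] becomes the number of edges on a shortest path from center
 * to v, i.e. the index of the circle on which v is placed, or -1 if v
 * cannot be reached from center. */
void assign_rings(int n, int adj[MAXN][MAXN], int center, int ring[MAXN])
{
    int queue[MAXN], head = 0, tail = 0;
    assert(n > 0 && n <= MAXN && center >= 0 && center < n);
    for (int v = 0; v < n; v++)
        ring[v] = -1;
    ring[center] = 0;
    queue[tail++] = center;
    while (head < tail) {
        int u = queue[head++];
        for (int v = 0; v < n; v++) {
            if (adj[u][v] && ring[v] < 0) {
                ring[v] = ring[u] + 1;   /* one circle further out */
                queue[tail++] = v;
            }
        }
    }
}
```

The edges through which nodes are first reached form the spanning tree rooted at the center.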
Layout steps, in order: initialize, position, adjust, splines.
As usual, the layout commences by initializing the graph. This is followed by the position step,
which is parameterized by the central node, specified by the graph’s root attribute. If unspecified, the
algorithm will select some “most central” node, i.e., one whose minimum distance from a leaf node is
maximal.
As with neato, the layout allows an adjust step to eliminate node-node overlaps. Again as with neato,
the call to splines computes drawing information for edges. See Section 3.2 for more details.
3.5 circo
The circo algorithm is based on the work of Six and Tollis[ST99, ST00], as modified by Kaufmann and
Wiese[KW]. The nodes in each biconnected component are placed on a circle, with some attempt to minimize
edge crossings. Then, by considering each component as a single node, the derived tree is laid out in
a similar fashion to twopi, with some component considered as the root node.
Layout steps, in order: initialize, position, splines.
As with fdp, the scheme is very simple. By construction, the circo layout avoids node overlaps, so no
adjust step is necessary.
The first argument is an array of three character pointers providing version information; see Section 4.1
below for a description of this data. The second argument is a string giving a name for the user. If desired,
the application can call the library function gvUsername() to obtain this value. These strings are stored
in the GVC_t and used in various messages and comments.
For convenience, the Graphviz library provides a simple way to create a context:
extern GVC_t *gvContext(void);
which is what we have used in the examples shown here. This uses version information created when
Graphviz was built, plus the value returned by gvUsername().
One can initialize a GVC_t to record a list of graphs, layout algorithms and renderers. To do this, the
application should call the function gvParseArgs:
extern int gvParseArgs(GVC_t *gvc, int argc, char **argv);
This function takes the context value, plus an array of strings using the same conventions as the parameters
to the main function in a C program. In particular, argc should be the number of values in argv. If argv[0]
is the name of one of the layout algorithms, this will be bound to the GVC_t value and used at layout time.
The remaining argv values, if any, are interpreted exactly like the allowed command line flags for any
Graphviz program. Thus, "-T" can be used to set the output type, and "-o" can be used to specify the
output files.
For example, the application can use a synthetic argument list
GVC_t* gvc = gvContext();
char* args[] = {
    "dot",
    "-Tgif",      /* gif output */
    "-oabc.gif"   /* output to file abc.gif */
};
gvParseArgs (gvc, sizeof(args)/sizeof(char*), args);
to specify a dot layout in GIF output written to the file abc.gif. Another approach is to use a program’s
actual argument list, after removing flags not handled by Graphviz.
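One way to remove flags not handled by Graphviz is to copy the argument vector while skipping the application's own options. The sketch below assumes, purely for illustration, that such options share a common prefix such as "--app-"; the function name strip_app_flags and that prefix convention are hypothetical and not part of Graphviz.

```c
#include <assert.h>
#include <string.h>

/* Copy argv into out, dropping every argument after argv[0] that starts
 * with the application-specific prefix. Returns the new argument count.
 * out must have room for argc entries. */
int strip_app_flags(int argc, char **argv, const char *prefix, char **out)
{
    size_t plen;
    int n = 0;
    assert(argc >= 0 && prefix != NULL);
    plen = strlen(prefix);
    for (int i = 0; i < argc; i++) {
        if (i > 0 && plen > 0 && strncmp(argv[i], prefix, plen) == 0)
            continue;            /* not meant for Graphviz: skip it */
        out[n++] = argv[i];
    }
    return n;
}
```

The returned count and the out array can then be handed to gvParseArgs in place of argc and argv.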
Most of the information is stored in a GVC_t value for use during rendering. However, if the argv
array contains non-flag arguments, i.e., strings after the first not beginning with "-", these are taken
to be input files defining a stream of graphs to be drawn. These graphs can be accessed by calls to
gvNextInputGraph.
Once the GVC_t has been initialized this way, the application can call gvNextInputGraph to get
each input graph in sequence, and then invoke gvLayoutJobs and gvRenderJobs to do the specified
layouts and renderings. See Appendix C for a typical example of this approach.
We note that gvLayout basically attaches the graph and layout algorithm to the GVC_t, as would be
done by gvParseArgs, and then invokes gvLayoutJobs. A similar remark holds for gvRender and
gvRenderJobs.
5 Graphics renderers
All graph output done in Graphviz goes through a renderer with the type gvrender_engine_t, used in
the call to gvRender. In addition to the renderers which are part of the library, an application can provide
its own, allowing it to specialize or control the output as necessary. See Section 6.1 for further details.
As in the layout phase invoked by gvLayout, all control over aspects of rendering is handled via
graph attributes. For example, the attribute outputorder determines whether all edges are drawn before
any nodes, or all nodes are drawn before any edges.
Before describing the renderer functions in detail, it may be helpful to give an overview of how output
is done. Output can be viewed as a hierarchy of document components. At the highest level is the job,
representing an output format and target. Bound to a job might be multiple graphs, each embedded in some
universal space. Each graph may be partitioned into multiple layers as determined by a graph’s layers
attribute, if any. Each layer may be divided into a 2-dimensional array of pages. A page will then contain
nodes, edges, and clusters. Each of these may contain an HTML anchor. During rendering, each component
is reflected in paired calls to its corresponding begin_... and end_... functions. The layer and
anchor components are omitted if there is only a single layer or the enclosing component has no browser
information.
Figure 3 lists the names and type signatures of the fields of gvrender_engine_t, which are used to
emit the components described above (see footnote 9). All of the functions take a GVJ_t* value, which contains various
information about the current rendering, such as the output stream, if any, or the device size and resolution.
Section 5.1 describes this data structure.
Most of the functions handle the nested graph structure. All graphics output is handled by the textpara,
ellipse, polygon, beziercurve, and polyline functions. The relevant drawing information such
as color and pen style is available through the obj field of the GVJ_t* parameter. This is described in
Section 5.2. Font information is passed with the text.
We note that, in Graphviz, each node, edge or cluster in a graph has a unique id field, which can be
used as a key for storing and accessing the object.
In the following, we describe the functions in more detail, though many are self-explanatory. All positions
and sizes are in points.
begin_job(job) Called at the beginning of all graphics output for a graph, which may entail drawing
multiple layers and multiple pages.
end_job(job) Called at the end of all graphics output for a graph. The output stream is still open, so the
renderer can append any final information to the output.
begin_graph(job) Called at the beginning of drawing a graph. The actual graph is available as
job->obj->u.g.
begin_layer(job,layerName,n,nLayers) Called at the beginning of each layer, only if nLayers >
0. The layerName parameter is the logical layer name given in the layers attribute. The layer
has index n out of nLayers, starting from 0.
begin_page(job) Called at the beginning of a new output page. A page will contain a rectangular
portion of the drawing of the graph. The value job->pageOffset gives the lower left corner of
the rectangle in layout coordinates. The point job->pagesArrayElem is the index of the page in
the array of pages, with the page in the lower left corner indexed by (0,0). The value job->zoom
provides a scale factor by which the drawing should be scaled. The value job->rotation, if
non-zero, indicates that the output should be rotated by 90° counterclockwise.
Footnote 9: Any types mentioned in this section are either described in this section or in Appendix E.
begin_cluster(job) Called at the beginning of drawing a cluster subgraph. The actual cluster is
available as job->obj->u.sg.
end_cluster(job) Called at the end of drawing the current cluster subgraph.
begin_nodes(job) Called at the beginning of drawing the nodes on the current page. Only called if
the graph attribute outputorder was set to a non-default value.
end_nodes(job) Called when all nodes on a page have been drawn. Only called if the graph attribute
outputorder was set to a non-default value.
begin_edges(job) Called at the beginning of drawing the edges on the current page. Only called if
the graph attribute outputorder was set to a non-default value.
end_edges(job) Called when all edges on the current page are drawn. Only called if the graph attribute
outputorder was set to a non-default value.
begin_node(job) Called at the start of drawing a node. The actual node is available as job->obj->u.n.
begin_edge(job) Called at the start of drawing an edge. The actual edge is available as job->obj->u.e.
textpara(job, p, txt) Draw text at point p using the specified font, font size and color. The
txt argument provides the text string txt->str, stored in UTF-8, a calculated width of the string
txt->width and the horizontal alignment txt->just of the string in relation to p. The values
txt->fontname and txt->fontsize give the desired font name and font size, the latter in
points.
The base line of the text is given by p.y. The interpretation of p.x depends upon the value of
txt->just. Basically, p.x provides the anchor point for the alignment.
txt->just   p.x
'n'         Center of text
'l'         Left edge of text
'r'         Right edge of text
The leftmost x coordinate of the text, the parameter most graphics systems use for text placement, is
given by p.x + j * txt->width, where j is 0.0, -0.5 or -1.0 if txt->just is 'l', 'n' or 'r',
respectively. This representation allows the renderer to accurately compute the point for text placement
that is appropriate for its format, as well as use its own mechanism for computing the width of
the string.
resolve_color(job, color) Resolve a color. The color parameter points to a color representation
of some particular type. The renderer can use this information to resolve the color to a representation
appropriate for it. See Section 5.3 for more details.
ellipse(job, ps, filled) Draw an ellipse with center at ps[0], with horizontal and vertical
half-axes ps[1].x - ps[0].x and ps[1].y - ps[0].y using the current pen color and line
style. If filled is non-zero, the ellipse should be filled with the current fill color.
polygon(job, A, n, filled) Draw a polygon with the n vertices given in the array A, using the
current pen color and line style. If filled is non-zero, the polygon should be filled with the current
fill color.
polyline(job,A,n) Draw a polyline with the n vertices given in the array A, using the current pen
color and line style.
comment(job, text) Emit text comments related to a graph object. For nodes, calls will pass the
node’s name and any comment attribute attached to the node. For edges, calls will pass a string
description of the edge and any comment attribute attached to the edge. For graphs and clusters, a
call will pass any comment attribute attached to the object.
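Two of the geometric conventions in the function descriptions above, the text anchor for textpara and the two-point form used by ellipse, can be written out as small helpers. This is a sketch with hypothetical names (pt_t, text_left_x, ellipse_half_axes); pt_t merely stands in for the library's point type.

```c
#include <assert.h>

typedef struct { double x, y; } pt_t;   /* stand-in for the point type */

/* Leftmost x coordinate of a text span of the given width anchored at
 * px, for an alignment of 'l' (left), 'n' (centered) or 'r' (right). */
double text_left_x(double px, double width, char just)
{
    double j;
    assert(just == 'l' || just == 'n' || just == 'r');
    j = (just == 'l') ? 0.0 : (just == 'n') ? -0.5 : -1.0;
    return px + j * width;
}

/* Horizontal and vertical half-axes of an ellipse passed as two points:
 * ps[0] is the center, ps[1] the corner used to derive the axes. */
pt_t ellipse_half_axes(const pt_t ps[2])
{
    pt_t r;
    r.x = ps[1].x - ps[0].x;
    r.y = ps[1].y - ps[0].y;
    return r;
}
```

For a centered string of width 40 anchored at x = 100, the text starts at x = 80.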
Although access to the graph object being drawn is available through the GVJ_t value, a renderer can
often perform its role by just implementing the basic graphics operations. It need have no information about
graphs or the related Graphviz data structures. Indeed, a particular renderer need not define any particular
rendering function, since a given entry point will only be called if non-NULL.
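The rule that an entry point is only called if non-NULL can be pictured with a reduced table of function pointers. The type mini_engine_t and the functions below are hypothetical and far smaller than the real renderer structure; they only illustrate how a renderer may leave most entries unset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much reduced stand-in for a renderer's entry points. */
typedef struct {
    void (*begin_node)(void *job);
    void (*end_node)(void *job);
    void (*comment)(void *job, const char *text);
} mini_engine_t;

/* Driver side: call an entry point only when the renderer supplied it. */
void emit_node(const mini_engine_t *eng, void *job, const char *note)
{
    assert(eng != NULL);
    if (eng->begin_node)
        eng->begin_node(job);
    if (eng->comment && note)
        eng->comment(job, note);
    if (eng->end_node)
        eng->end_node(job);
}

/* Renderer side: this renderer only counts comments; job is assumed to
 * point at an int counter in this sketch. */
void count_comment(void *job, const char *text)
{
    (void)text;
    ++*(int *)job;
}
```

A table initialized as { NULL, NULL, count_comment } is a complete, if minimal, renderer in this model.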
common This points to various information valid throughout the duration of the application using Graphviz.
In particular, common->user gives the user name associated with the related GVC_t value (see
Section 4), and common->info contains Graphviz version information, as described in Section 4.1.
output_file The FILE* value for an open stream on which the output should be written, if relevant.
pagesArraySize The size of the array of pages in which the graph will be output, given as a point.
If pagesArraySize.x or pagesArraySize.y is greater than one, this indicates that a page
size was set and the graph drawing is too large to be printed on a single page. Page (0,0) is the page
containing the bottom, lefthand corner of the graph drawing; page (1,0) will contain that part of the
graph drawing to the right of page (0,0); etc.
bb The bounding box of the layout in the universal space in points. It has type boxf.
boundingBox The bounding box of the layout in the device space in device coordinates. It has type box.
pageOffset The origin of the current page in the universal space in points.
obj Information related to the current object being rendered. This is a pointer to a value of type obj_state_t.
See Section 5.2 for more details.
type and u The type field indicates what kind of graph object is currently being rendered. The possible
values are ROOTGRAPH_OBJTYPE, CLUSTER_OBJTYPE, NODE_OBJTYPE and EDGE_OBJTYPE,
indicating the root graph, a cluster subgraph, a node and an edge, respectively. A pointer to the actual
object is available via the subfields u.g, u.sg, u.n and u.e, respectively, of the union u.
pencolor The gvcolor_t value indicating the color used to draw lines, curves and text.
pen The style of pen to be used. The possible values are PEN_NONE, PEN_DOTTED, PEN_DASHED and
PEN_SOLID.
penwidth The size of the pen, in points. Note that, by convention, a value of 0 indicates using the smallest
width supported by the output format.
fillcolor The gvcolor_t value indicating the color used to fill closed regions.
Note that font information is delivered as part of the textpara_t value passed to the textpara function.
As for the url, tooltip and target fields, these will point to the associated attribute value of the current
graph object, assuming it is defined and that the renderer supports maps, tooltips, and targets, respectively (cf.
Section 6.1).
Before a color is used in rendering, Graphviz will process a color description provided by the input
graph into a form desired by the renderer. This is a three-step procedure. First, Graphviz will see if the
color matches the renderer’s known colors, if any. If so, the color representation is COLOR_STRING.
Otherwise, the library will convert the input color description into the renderer’s preferred format. Finally,
if the renderer also provides a resolve_color function, Graphviz will then call that function, passing a
pointer to the current color value. The renderer then has the opportunity to adjust the value, or convert it into
another format. In a typical case, if a renderer uses a color map, it may request RGB values as input, and
then store an associated color map index using the COLOR_INDEX format. If the renderer does a conversion
to another color type, it must reset the type field to indicate this. It is this last representation which will
be passed to the renderer’s drawing routines. The renderer’s known colors and preferred color format are
described in Section 6.1 below.
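The color map case mentioned above can be sketched as follows. The types my_color_t and my_color_type are reduced, hypothetical stand-ins for the library's color structure, and the fixed-size map with its fallback to entry 0 is an assumption made for the example. The point being illustrated is the last step: the function stores an index and resets the type field.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the library's color types (illustrative names). */
typedef enum { MY_RGB_BYTE, MY_COLOR_INDEX } my_color_type;

typedef struct {
    union {
        unsigned char rgb[3];
        int index;
    } u;
    my_color_type type;
} my_color_t;

#define MAP_SIZE 256
static unsigned char color_map[MAP_SIZE][3];
static int map_used = 0;

/* Replace an RGB value by its color map index, adding the color to the
 * map if necessary, and reset the type field to record the conversion. */
void my_resolve_color(my_color_t *c)
{
    unsigned char r, g, b;
    int i;
    assert(c != NULL);
    if (c->type != MY_RGB_BYTE)
        return;                 /* already in the renderer's format */
    /* Copy the components first: index shares storage with rgb. */
    r = c->u.rgb[0]; g = c->u.rgb[1]; b = c->u.rgb[2];
    for (i = 0; i < map_used; i++)
        if (color_map[i][0] == r && color_map[i][1] == g && color_map[i][2] == b)
            break;
    if (i == map_used) {
        if (map_used == MAP_SIZE) {
            i = 0;              /* map full: fall back to entry 0 */
        } else {
            color_map[i][0] = r; color_map[i][1] = g; color_map[i][2] = b;
            map_used++;
        }
    }
    c->u.index = i;
    c->type = MY_COLOR_INDEX;
}
```

Resolving the same RGB triple twice yields the same index, so drawing routines can compare colors by index alone.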
6 Adding Plug-ins
The Graphviz framework allows the programmer to use plug-ins to extend the system in several ways.
For example, the programmer can add new graph layout engines along with new renderers and their
related functions. Table 15 describes the plug-in APIs supported by Graphviz. Each plug-in is defined
by an engine structure containing its function entry points, and a features structure specifying features
supported by the plug-in. Thus, a renderer is defined by values of type gvrender_engine_t and
gvrender_features_t.
Once all of the plug-ins of a given kind are defined, they should be gathered into a 0-terminated array
of element type gvplugin_installed_t, whose fields are shown in Figure 5.
int id;
char *type;
int quality;
void *engine;
void *features;
The fields have the following meanings.
id Identifier for a given plug-in within a given package and with a given API kind. Note that the id need
only be unique within its plug-in package, as these packages are assumed to be independent.
quality An arbitrary integer used for ordering plug-ins with the same type. Plug-ins with larger values
will be chosen before plug-ins with smaller values.
As an example, suppose we wish to add various renderers for bitmap output. A collection of these might
be combined as follows.
gvplugin_installed_t render_bitmap_types[] = {
{0, "jpg", 1, &jpg_engine, &jpg_features},
{0, "jpeg", 1, &jpg_engine, &jpg_features},
{1, "png", 1, &png_engine, &png_features},
{2, "gif", 1, &gif_engine, &gif_features},
{0, NULL, 0, NULL, NULL}
};
Note that this allows "jpg" and "jpeg" to refer to the same renderers. For the plug-in kinds without a
features structure, the feature pointer in its gvplugin_installed_t should be NULL.
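The role of the quality field can be illustrated by a lookup over such a terminated array. The sketch below uses a hypothetical plugin_entry_t that mirrors the fields listed above and a hypothetical best_plugin function; it is not the library's own selection code, and the tie-breaking rule (first entry wins) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the fields listed for gvplugin_installed_t (illustrative). */
typedef struct {
    int id;
    const char *type;
    int quality;
    void *engine;
    void *features;
} plugin_entry_t;

/* Return the entry of the requested type with the largest quality, or
 * NULL if none matches. The array ends with an entry whose type is NULL.
 * Assumption: on equal quality, the earlier entry is kept. */
const plugin_entry_t *best_plugin(const plugin_entry_t *p, const char *type)
{
    const plugin_entry_t *best = NULL;
    assert(p != NULL && type != NULL);
    for (; p->type != NULL; p++) {
        if (strcmp(p->type, type) != 0)
            continue;
        if (best == NULL || p->quality > best->quality)
            best = p;
    }
    return best;
}
```

With two "png" entries of quality 1 and 5, a request for "png" returns the entry with quality 5.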
All of the plug-ins of all API kinds should then be gathered into a 0-terminated array of element type
gvplugin_api_t. For each element, the first field indicates the kind of API, and the second points to the
array of plug-ins described above (gvplugin_installed_t).
Continuing our example, if we have supplied, in addition to the bitmap rendering plug-ins, plug-ins to
render VRML, and plug-ins to load images, we would define
gvplugin_api_t apis[] = {
    {API_render, render_bitmap_types},      /* array name decays to a pointer */
    {API_render, render_vrml_types},
    {API_loadimage, loadimage_bitmap_types},
    {0, 0},                                 /* terminator */
};
Here render_vrml_types and loadimage_bitmap_types are also 0-terminated arrays of element type
gvplugin_installed_t. Note that there can be multiple items of the same API kind.
A final definition is used to attach a name to the package of all the plug-ins. This is done using a
gvplugin_library_t structure. Its first field is a char* giving the name of the package. The second
field is a gvplugin_api_t* pointing to the array described above. The structure itself must be named
gvplugin_name_LTX_library, where name is the name of the package as defined in the first field.
For example, if we have decided to call our package "bitmap", we could use the following definition:
gvplugin_library_t gvplugin_bitmap_LTX_library = { "bitmap", apis };
To finish the installation of the package, it is necessary to create a dynamic library containing the
gvplugin_library_t value and all of the functions and data referred to by it, either directly or
indirectly. The library must be named gvplugin_name, where again name is the name of the package.
The actual filename of the library will be system-dependent. For example, on Linux systems, our library
gvplugin_bitmap would have filename libgvplugin_bitmap.so.3.
In most cases, Graphviz is built with a plug-in version number. This number must be included in
the library’s filename, following any system-dependent conventions. The number is given as the value of
plugins in the file libgvc.pc, which can be found in the directory lib/pkgconfig where Graphviz
was installed. In our example, the “3” in the library’s filename gives the version number.
Finally, the library must be installed in the Graphviz library directory, and dot -c must be run to
add the package to the Graphviz configuration. Note that both of these steps typically assume that one has
installer privileges.10
In the remainder of this section, we shall look at the first three types of plug-in APIs in more detail.
int flags;
double default_margin;
double default_pad;
pointf default_pagesize;
pointf default_dpi;
char **knowncolors;
int sz_knowncolors;
color_type_t color_type;
char *device;
char *loadimage_target;
default_margin Default margin size in points. This is the amount of space left around the drawing.
default_pad Default pad size in points. This is the amount by which the graph is inset within the
drawing region. Note that the drawing region may be filled with a background color.
default_pagesize Default page size in points. For example, an 8.5 by 11 inch letter-sized page
would have a default_pagesize of 612 by 792.
default_dpi Default resolution, in pixels per inch. Note that the x and y values may be different to
support non-square pixels.
knowncolors An array of character pointers giving a lexicographically ordered list (see footnote 11) of the color names
supported by the renderer.
color_type The preferred representation for colors. See Section 5.3.
device The name of a device, if any, associated with the renderer. For example, a renderer using GTK
for output might specify "gtk" as its device. If a name is given, the library will look for a plug-in
of type API_device with that name, and use the associated functions to initialize and terminate the
device. See Section 6.2.
loadimage_target The name of the preferred type of image format for the renderer. When a user-
supplied image is given, the library will attempt to find a function that will convert the image from
its original format to the renderer’s preferred one. A user-defined renderer may need to provide, as
additional plug-ins, its own functions for handling the conversion.
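Keeping knowncolors in lexicographic order makes a binary search possible. The sketch below shows such a lookup with the standard bsearch function; it assumes the names are ordered as strcmp orders them, and is_known_color and cmp_name are hypothetical helpers, not library functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* bsearch comparator: key is a string, elem points at an array slot. */
static int cmp_name(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Returns 1 if name occurs in colors[0..n-1], which must be sorted in
 * strcmp order, and 0 otherwise. */
int is_known_color(const char *name, const char *const *colors, size_t n)
{
    assert(name != NULL);
    if (n == 0)
        return 0;
    return bsearch(name, colors, n, sizeof colors[0], cmp_name) != NULL;
}
```

Here n plays the role of the sz_knowncolors field; an unsorted array would make the lookup miss names that are present.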
which are called at the beginning and end of rendering each job. The initialize routine might open a canvas
on a window system, or set up a new page for printing; the finalize routine might go into an event loop, after
which it could close the output device.
Flag                                    Description
GVRENDER_DOES_ARROWS                    Built-in arrowheads on splines
GVRENDER_DOES_LAYERS                    Supports graph layers
GVRENDER_DOES_MULTIGRAPH_OUTPUT_FILES   If true, the renderer’s output can contain multiple renderings
GVRENDER_DOES_TRUECOLOR                 Supports a truecolor color model
GVRENDER_Y_GOES_DOWN                    Output coordinate system has the origin in the upper left corner
GVRENDER_X11_EVENTS                     For GUI plug-ins, defers actual rendering until the GUI event loop
                                        invokes job->callbacks->refresh()
GVRENDER_DOES_TRANSFORM                 Can handle transformation (scaling, translation, rotation) from
                                        universal to device coordinates. If false, the library will do the
                                        transformation before passing any coordinates to the renderer
GVRENDER_DOES_LABELS                    Wants an object’s label, if any, provided as text during rendering
GVRENDER_DOES_MAPS                      Supports regions to which URLs can be attached. If true, URLs are
                                        provided to the renderer, either as part of the job->obj or via the
                                        renderer’s begin_anchor function
GVRENDER_DOES_MAP_RECTANGLE             Rectangular regions can be mapped
GVRENDER_DOES_MAP_CIRCLE                Circular regions can be mapped
GVRENDER_DOES_MAP_POLYGON               Polygons can be mapped
GVRENDER_DOES_MAP_ELLIPSE               Ellipses can be mapped
GVRENDER_DOES_MAP_BSPLINE               B-splines can be mapped
GVRENDER_DOES_TOOLTIPS                  If true, tooltips are provided to the renderer, either as part of the
                                        job->obj or via the renderer’s begin_anchor function
GVRENDER_DOES_TARGETS                   If true, targets are provided to the renderer, either as part of the
                                        job->obj or via the renderer’s begin_anchor function
GVRENDER_DOES_Z                         Uses a 3D output model
When called, loadimage is given the current job, a pointer to the input image us, and the bounding box
b in device coordinates where the image should be written. The boolean filled value indicates whether
the bounding box should first be filled.
The type value for an image loading plug-in’s gvplugin_installed_t entry should specify the
input and output formats it handles. Thus, a plug-in converting JPEG to GIF would be called "jpeg2gif".
Since an image loader may well want to read in an image in some format, and then render the image using
the same format, it is quite reasonable for the input and output formats to be identical, e.g. "gif2gif".
Concerning the type usershape_t, its most important fields are shown in Figure 7.
char *name;
FILE *f;
imagetype_t type;
unsigned int x, y;
unsigned int w, h;
unsigned int dpi;
void *data;
size_t datasize;
void (*datafree)(usershape_t *us);
These fields have the following meanings.
f An open input stream to the image’s data. Since the image might be processed multiple times, the
application should use a function such as fseek to make sure the file pointer points to the beginning
of the file.
type The format of the image. The formats supported in Graphviz are FT_BMP, FT_GIF, FT_PNG,
FT_JPEG, FT_PDF, FT_PS and FT_EPS. The value FT_NULL indicates an unknown image type.
x and y The coordinates of the lower-left corner of the image in image units. This is usually the origin, but
some images, such as those in PostScript format, may be translated away from the origin.
data, datasize, datafree These fields can be used to cache the converted image data so that the file
I/O and conversion need only be done once. The data can be stored via data, with datasize
giving the number of bytes used. In this case, the image loading code should store a clean-up handler
in datafree, which can be called to release any memory allocated.
If loadimage does caching, it can check if us->data is NULL. If so, it can read and cache the
image. If not, it should check that the us->datafree value points to its own datafree routine.
If not, then some other image loader has cached data there. The loadimage function must then
call the current us->datafree function before caching its own version of the image.
The code template in Figure 8 indicates how caching should be handled.
if (us->data) {
    if (us->datafree != my_datafree) {
        us->datafree(us); /* free incompatible cache data */
        us->data = NULL;
        us->datafree = NULL;
        us->datasize = 0;
    }
}
if (!us->data) {
    /* read image data from us->f and convert it;
     * store the image data into memory pointed to by us->data;
     * set us->datasize and us->datafree to the appropriate values.
     */
}
if (us->data) {
    /* emit the image data in us->data */
}
7 Unconnected graphs
All of the basic layouts provided by Graphviz are based on a connected graph. Each is then extended to
handle the not uncommon case of having multiple components. Most of the time, the obvious approach is
used: draw each component separately and then assemble the drawings into a single layout. The only place
this is not done is in neato when the mode is "KK" and pack="false" (cf. Section 3.2).
For the dot algorithm, its layered drawings make the merging simple: the nodes on the highest rank
of each component are all put on the same rank. For the other layouts, it is not obvious how to put the
components together.
The Graphviz software provides the library pack to assist with unconnected graphs, especially by
supplying a technique for packing arbitrary graph drawings together quickly, aesthetically and with efficient
use of space. The following code indicates how the library can be integrated with the basic layout algorithms
given an input graph g and a GVC_t value gvc.
    graph_t *sg;
    FILE *fp;
    graph_t **cc;
    int i, ncc;

    cc = ccomps(g, &ncc, (char *)0);
The call to ccomps splits the graph g into its connected components. ncc is set to the number of
components. The components are represented by subgraphs of the input graph, and are stored in the returned
array. The function gives names to the components in a way that should not conflict with previously existing
subgraphs. If desired, the third argument to ccomps can be used to designate what the subgraphs should
be called. Also, for flexibility, the subgraph components do not contain the associated edges.
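To make concrete what "splitting into connected components" means, here is a minimal, self-contained sketch that labels the nodes of a plain edge list with component indices using union-find. It does not use the Graphviz data structures and is not how ccomps is implemented; the names uf_find, components and MAX_NODES are hypothetical.

```c
#include <assert.h>

#define MAX_NODES 64

/* Union-find parent table: parent[v] == v marks a set representative. */
static int parent[MAX_NODES];

/* Return the representative of the set containing v. */
int uf_find(int v)
{
    while (parent[v] != v) {
        parent[v] = parent[parent[v]];  /* path halving */
        v = parent[v];
    }
    return v;
}

/* Label each of n nodes with a component index in comp[] (0 .. ncc-1),
 * given m undirected edges (edges[i][0], edges[i][1]).
 * Returns ncc, the number of connected components. */
int components(int n, int m, int edges[][2], int comp[])
{
    int i, ncc = 0;
    int label[MAX_NODES];

    assert(n >= 0 && n <= MAX_NODES);
    for (i = 0; i < n; i++) {
        parent[i] = i;
        label[i] = -1;
    }
    for (i = 0; i < m; i++) {
        int a = uf_find(edges[i][0]);
        int b = uf_find(edges[i][1]);
        if (a != b)
            parent[a] = b;              /* merge the two components */
    }
    for (i = 0; i < n; i++) {
        int r = uf_find(i);
        if (label[r] < 0)
            label[r] = ncc++;           /* first node seen in this component */
        comp[i] = label[r];
    }
    return ncc;
}
```

For a graph with nodes 0 to 5 and edges 0-1, 1-2 and 3-4, this yields three components: {0, 1, 2}, {3, 4} and the isolated node 5.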
Certain layout algorithms, such as neato, allow the input graph to fix the position of certain nodes,
indicated by ND_pinned(n) being non-zero. In this case, all nodes with a fixed position need to be laid
out together, so they should all occur in the same “connected” component. The pack library provides
pccomps, an analogue to ccomps for this situation. It has almost the same interface as ccomps, but
takes an additional boolean* parameter. The function sets the pointed-to boolean to true if the graph has nodes
with fixed positions. In this case, the component containing these nodes is the first one in the returned array.
Continuing with the example, we take one component at a time, use nodeInduce to create the
corresponding node-induced subgraph, and then lay out the component with gvLayout. Here, we use neato for
each layout, but it is possible to use a different layout for each component (see footnote 12).
Next, we use the pack function pack_graph to reassemble the graph into a single drawing. To
position the components, pack uses the polyomino-based approach described by Freivalds et al. [FDK02].
The first three arguments to the function are the number of components, the array of component subgraphs,
and the root graph. The fourth argument indicates whether or not there are fixed components.
The pack_graph function uses the graph’s packmode attribute to determine how the packing should
be done. At present, packing uses the single algorithm mentioned above, but allows three levels of
granularity, represented by the values "node", "clust" and "graph". In the first case, packing is done at the
node and edge level. This provides the tightest packing, using the least area, but also allows a node of one
component to lie between two nodes of another component. The second value, "clust", requires that the
packing treat each top-level cluster whose bounding box (GD_bb) is set as a single large node. Nodes and edges
not entirely contained within a cluster are handled as in the previous case. This prevents any components
which do not belong to the cluster from intruding within the cluster’s bounding box. The last case does the
packing at the graph granularity. Each component is treated as one large node, whose size is determined by
its bounding box.
Note that the library automatically computes the bounding box of each of the components. Also, as
a side-effect, pack_graph finishes by recomputing and setting the bounding box attribute GD_bb of the
graph.
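The "graph" granularity and the final bounding-box update can be pictured with a small sketch. The code below is an assumption-laden illustration, not the pack library: it replaces the polyomino placement with a naive left-to-right row, and pt_t, bbox_t, bbox_union and pack_row are hypothetical names standing in for the library's own types and functions.

```c
#include <assert.h>

typedef struct { double x, y; } pt_t;    /* stand-in for pointf */
typedef struct { pt_t LL, UR; } bbox_t;  /* stand-in for boxf */

/* Smallest box containing both a and b. */
bbox_t bbox_union(bbox_t a, bbox_t b)
{
    bbox_t u = a;
    if (b.LL.x < u.LL.x) u.LL.x = b.LL.x;
    if (b.LL.y < u.LL.y) u.LL.y = b.LL.y;
    if (b.UR.x > u.UR.x) u.UR.x = b.UR.x;
    if (b.UR.y > u.UR.y) u.UR.y = b.UR.y;
    return u;
}

/* Place n >= 1 component boxes side by side, margin apart, with their
 * lower-left corners on the x axis. Each box moves as a unit, as one
 * large node would. On return, boxes[] holds the translated boxes and
 * the result is the bounding box of the whole arrangement. */
bbox_t pack_row(bbox_t *boxes, int n, double margin)
{
    bbox_t all = {{0.0, 0.0}, {0.0, 0.0}};
    double x = 0.0;
    int i;

    assert(boxes != 0 && n >= 1);
    for (i = 0; i < n; i++) {
        double w = boxes[i].UR.x - boxes[i].LL.x;
        double h = boxes[i].UR.y - boxes[i].LL.y;
        boxes[i].LL.x = x;
        boxes[i].LL.y = 0.0;
        boxes[i].UR.x = x + w;
        boxes[i].UR.y = h;
        all = (i == 0) ? boxes[i] : bbox_union(all, boxes[i]);
        x += w + margin;
    }
    return all;
}
```

A row placement wastes space when component heights differ widely, which is one reason a more careful packing method is worthwhile.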
The final step is to free the component subgraphs.
Although dot and neato have their specialized approaches to unconnected graphs, it should be noted that
these are not without their deficiencies. The approach used by dot, aligning the drawings of all components
along the top, works well until the number of components grows large. When this happens, the aspect ratio
of the final drawing can become very bad. neato’s handling of an unconnected graph can have two
drawbacks. First, there can be a great deal of wasted space. The value chosen to separate components is a simple
function of the number of nodes. With a certain edge structure, component drawings may use much less
area. This can produce a drawing similar to a classic atom: a large nucleus surrounded by a ring of electrons
with a great deal of empty space between them. Second, the neato model is essentially quadratic. If the
components are drawn separately, one can see a dramatic decrease in layout time, sometimes several orders
of magnitude. For these reasons, it sometimes makes sense to apply the twopi approach for unconnected
graphs to the dot and neato layouts. In fact, as we’ve noted, neato layout typically uses the pack
library by default.
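A rough calculation shows where the speed-up comes from. Assume, purely for illustration, that laying out n nodes costs about c n^2 for some constant c, and that the graph splits into k components of equal size. Neither assumption comes from measurements of neato.

```latex
\[
  T_{\mathrm{whole}} \approx c\,n^{2},
  \qquad
  T_{\mathrm{split}} \approx k \cdot c\left(\frac{n}{k}\right)^{2} = \frac{c\,n^{2}}{k},
  \qquad
  \frac{T_{\mathrm{whole}}}{T_{\mathrm{split}}} \approx k .
\]
```

Under these assumptions the gain grows linearly with the number of components. Components of unequal size give a smaller gain, since the largest component dominates the cost.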
12 At present, the dot layout has a limitation that it only works on a root graph. Thus, to use dot for a component, one needs to
create a new copy of the subgraph, apply dot and then copy the position attributes back to the component.
References
[Coh87] J. Cohen. Drawing graphs to convey proximity: an incremental arrangement method. ACM
Transactions on Computer-Human Interaction, 4(11):197–229, 1987.
[FDK02] K. Freivalds, U. Dogrusoz, and P. Kikusts. Disconnected graph layout and the polyomino
packing approach. In P. Mutzel et al., editor, Proc. Symp. Graph Drawing GD’01, volume
2265 of Lecture Notes in Computer Science, pages 378–391, 2002.
[FR91] Thomas M. J. Fruchterman and Edward M. Reingold. Graph Drawing by Force-directed
Placement. Software – Practice and Experience, 21(11):1129–1164, November 1991.
[GKN04] E. Gansner, Y. Koren, and S. North. Graph drawing by stress majorization. In Proc. Symp.
Graph Drawing GD’04, September 2004.
[GKNV93] Emden R. Gansner, Eleftherios Koutsofios, Stephen C. North, and Kiem-Phong Vo. A
Technique for Drawing Directed Graphs. IEEE Trans. Software Engineering, 19(3):214–230, May
1993.
[GN00] E.R. Gansner and S.C. North. An open graph visualization system and its applications to
software engineering. Software – Practice and Experience, 30:1203–1233, 2000.
[KK89] T. Kamada and S. Kawai. An algorithm for drawing general undirected graphs. Information
Processing Letters, 31(1):7–15, April 1989.
[KN94] Eleftherios Koutsofios and Steve North. Applications of Graph Visualization. In Proceedings
of Graphics Interface, pages 235–245, May 1994.
[KS80] J. Kruskal and J. Seery. Designing network diagrams. In Proc. First General Conf. on Social
Graphics, pages 22–50, 1980.
[KW] M. Kaufmann and R. Wiese. Maintaining the mental map for circular drawings. In
M. Goodrich, editor, Proc. Symp. Graph Drawing GD’02, volume 2528 of Lecture Notes in
Computer Science, pages 12–22.
[LBM97] W. Lee, N. Barghouti, and J. Mocenigo. Grappa: A graph package in Java. In G. DiBattista,
editor, Proc. Symp. Graph Drawing GD’97, volume 1353 of Lecture Notes in Computer Science,
1997.
[ST99] Janet Six and Ioannis Tollis. Circular drawings of biconnected graphs. In Proc. ALENEX 99,
pages 57–73, 1999.
[ST00] Janet Six and Ioannis Tollis. A framework for circular drawings of networks. In Proc. Symp.
Graph Drawing GD’99, volume 1731 of Lecture Notes in Computer Science, pages 107–116.
Springer-Verlag, 2000.
[STT81] K. Sugiyama, S. Tagawa, and M. Toda. Methods for Visual Understanding of Hierarchical
System Structures. IEEE Trans. Systems, Man and Cybernetics, SMC-11(2):109–125, February
1981.
[Wil97] G. Wills. Nicheworks - interactive visualization of very large graphs. In G. DiBattista, editor,
Symposium on Graph Drawing GD’97, volume 1353 of Lecture Notes in Computer Science,
pages 403–414, 1997.
[Win02] A. Winter. Gxl - overview and current status. In Procs. International Workshop on Graph-
Based Tools (GraBaTs), October 2002.
simple: simple.o
dot: dot.o
demo: demo.o
clean:
	rm -rf simple dot demo *.o
13 For completeness, we note that it may be necessary to explicitly link in the following additional libraries, depending on the
options set when Graphviz was built: expat, fontconfig, freetype2, pangocairo, cairo, pango, gd, jpeg, png, z,
ltdl, and other libraries required by Cairo and Pango. Typically, though, most builds handle these implicitly.
14 They can also be found, along with the Makefile, in the dot.demo directory of the Graphviz source.
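#include <gvc.h>

int main(int argc, char **argv)
{
    GVC_t *gvc;
    graph_t *g;
    FILE *fp;

    gvc = gvContext();                  /* set up a Graphviz context */
    if (argc > 1)
        fp = fopen(argv[1], "r");
    else
        fp = stdin;
    g = agread(fp);                     /* read the input graph */
    gvLayout(gvc, g, "dot");            /* lay it out with dot */
    gvRender(gvc, g, "plain", stdout);  /* write the drawing to stdout */
    gvFreeLayout(gvc, g);               /* release layout data */
    agclose(g);
    return (gvFreeContext(gvc));
}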
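#include <gvc.h>
int main(int argc, char **argv)
{
    graph_t *g, *prev = NULL;
    GVC_t *gvc;
    gvc = gvContext();
    gvParseArgs(gvc, argc, argv);            /* command-line options set the jobs */
    while ((g = gvNextInputGraph(gvc))) {
        if (prev) {                          /* clean up the previous graph */
            gvFreeLayout(gvc, prev);
            agclose(prev);
        }
        gvLayoutJobs(gvc, g);
        gvRenderJobs(gvc, g);
        prev = g;
    }
    return (gvFreeContext(gvc));
}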
/* Set an attribute - in this case one that affects the visible rendering */
agsafeset(n, "color", "red", "");
    struct {
        int x, y;
    }
The fields can either give an absolute position or represent a vector displacement. A pointf type is the
same, with int replaced with double. A box type is the structure
    struct {
        point LL, UR;
    }
representing a rectangle. The LL gives the coordinates of the lower-left corner, while the UR is the
upper-right corner. A boxf type is the same, with point replaced with pointf.
The following gives the accepted string representations corresponding to values of the given types.
Whitespace is ignored when converting these values from strings to their internal representations.
point "x,y" where (x,y) are the integer coordinates of a position in points (72 points = 1 inch).
pointf "x,y" where (x,y) are the floating-point coordinates of a position in inches.
rectangle "llx,lly,urx,ury" where (llx,lly) is the lower left corner of the rectangle and
(urx,ury) is the upper right corner, all in integer points.
spline This type has an optional end point, an optional start point, and a space-separated list of N =
3n + 1 points for some positive integer n. An end point consists of a point preceded by "e,"; a
start point consists of a point preceded by "s,". The optional components are separated by spaces.
The terminating list of points p1, p2, ..., pN gives the control points of a B-spline. If a start point
is given, this indicates the presence of an arrowhead. The start point touches one node of the
corresponding edge and the direction of the arrowhead is given by the vector from p1 to the start point. If
the start point is absent, the point p1 will touch the node. The analogous interpretation holds for an
end point and pN.
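The string forms above are simple enough to read with sscanf. The following is a minimal sketch, not code from the Graphviz sources: parse_point, parse_rect and spline_npoints are hypothetical helpers, the point and box types are redeclared locally to match the definitions given earlier, and spline coordinates are read as doubles on the assumption that they may be fractional.

```c
#include <assert.h>
#include <stdio.h>

typedef struct { int x, y; } point;
typedef struct { point LL, UR; } box;

/* Parse "x,y" into *p, ignoring whitespace. Returns 1 on success. */
int parse_point(const char *s, point *p)
{
    assert(s != NULL && p != NULL);
    return sscanf(s, "%d ,%d", &p->x, &p->y) == 2;
}

/* Parse "llx,lly,urx,ury" into *b, ignoring whitespace. Returns 1 on success. */
int parse_rect(const char *s, box *b)
{
    assert(s != NULL && b != NULL);
    return sscanf(s, "%d ,%d ,%d ,%d",
                  &b->LL.x, &b->LL.y, &b->UR.x, &b->UR.y) == 4;
}

/* Count the control points of a spline string, skipping any "e,x,y" or
 * "s,x,y" items. Returns N if the string is well formed and N = 3n + 1
 * for some n >= 1, and -1 otherwise. */
int spline_npoints(const char *s)
{
    int n = 0, used;
    double x, y;

    assert(s != NULL);
    for (;;) {
        while (*s == ' ')
            s++;
        if ((*s == 'e' || *s == 's') && s[1] == ',') {
            /* optional end or start point: not a control point */
            if (sscanf(s + 2, "%lf,%lf%n", &x, &y, &used) != 2)
                return -1;
            s += 2 + used;
        } else if (sscanf(s, "%lf,%lf%n", &x, &y, &used) == 2) {
            s += used;
            n++;
        } else {
            break;
        }
    }
    return (*s == '\0' && n >= 4 && n % 3 == 1) ? n : -1;
}
```

For example, the string "e,27,36 27,72 27,64 27,55 27,46" has an end point followed by four control points, so spline_npoints reports 4, while a list of only three points is rejected.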