
Social Structure of Facebook Networks
Amanda L. Traud (1,2), Peter J. Mucha (1,3), and Mason A. Porter (4,5)
(1) Carolina Center for Interdisciplinary Applied Mathematics, Department of Mathematics,
University of North Carolina, Chapel Hill, NC 27599-3250, USA
(2) Carolina Population Center, University of North Carolina, Chapel Hill, NC 27516-2524, USA
(3) Institute for Advanced Materials, Nanoscience & Technology,
University of North Carolina, Chapel Hill, NC 27599-3216, USA
(4) Oxford Centre for Industrial and Applied Mathematics, Mathematical Institute,
University of Oxford, OX1 3LB, UK
(5) CABDyN Complexity Centre, University of Oxford, OX1 1HB, UK
Abstract
We study the social structure of Facebook friendship networks at one hundred
American colleges and universities at a single point in time, and we examine the
roles of user attributes (gender, class year, major, high school, and residence) at
these institutions. We investigate the influence of common attributes at the dyad
level in terms of assortativity coefficients and regression models. We then examine larger-scale groupings by detecting communities algorithmically and comparing
them to network partitions based on the user characteristics. We thereby compare
the relative importance of different characteristics at different institutions, finding,
for example, that common high school is more important to the social organization
of large institutions and that the importance of common major varies significantly
between institutions. Our calculations illustrate how microscopic and macroscopic
perspectives give complementary insights on the social organization at universities
and suggest future studies to investigate such phenomena further.
Preprint submitted to Social Networks
February 10, 2011
1. Introduction
Since their introduction, social networking sites (SNSs) such as Friendster, MySpace, Facebook, Orkut, LinkedIn, and myriad others have attracted hundreds of
millions of users, many of whom have integrated SNSs into their daily lives to communicate with friends, send e-mails, solicit opinions or votes, organize events, spread
ideas, find jobs, and more [Boyd and Ellison, 2007]. Facebook, an SNS launched
in February 2004, now overwhelms numerous aspects of everyday life, and it has
become an immensely popular societal obsession [Boyd, 2007b, Boyd and Ellison,
2007, Lewis et al., 2008b, Mayer and Puller, 2008]. Facebook members can create
self-descriptive profiles that include links to the profiles of their friends, who may
or may not be offline friends. Facebook requires that anyone whom a user wants to
add as a friend confirm the relationship, so Facebook friendships define a network
(graph) of reciprocated ties (undirected edges) that connect individual users.
The emergence of SNSs such as Facebook and MySpace has revolutionized the
availability of social and demographic data, which has in turn had a significant impact
on the study of social networks [Boyd and Ellison, 2007, Krebs, 2008, Lievrouw
and Livingstone, 2005]. It is possible to acquire very large data sets from SNSs,
though of course the population online and actively using SNSs is a biased sample
of the broader population. Services like Facebook also contain large quantities of
demographic data, as many users now voluntarily reveal voluminous amounts of
detailed personal information. An especially exciting aspect of studying SNSs is
that they provide an opportunity to examine social organization at unprecedented
levels of size and detail, and they also provide new venues to test sampling effects
[Kurant et al., 2011]. One can investigate the structure of an SNS like Facebook to
examine it as a network in its own right, and ideally one can also try to take one step
further and infer interesting insights regarding the offline social networks that an SNS
imperfectly parallels. Most people tend to draw their Facebook friends from their
real-life social networks [Boyd and Ellison, 2007], so it is not entirely unreasonable to
use Facebook networks as a proxy for an offline social network. (Of course, as noted
by Hogan [2009], one does need to be aware of significant limitations when taking
such a leap of faith.)
Social scientists, information scientists, and physical scientists have all jumped
on the SNS data bandwagon [Rosenbloom, 2007]. It would be impossible to exhaustively cite all of the research in this area, so we only highlight a few results; additional
references can be found in the review by Boyd and Ellison [2007]. Boyd [2007a] also
wrote a popular essay about her empirical study of Facebook and MySpace, concluding that Facebook tends to appeal to a more elite and educated cross section
than MySpace. The company RapLeaf [Sodera, 2008] has compiled global demographics on the age and gender usage of numerous SNSs. Other recent studies have
investigated the manifestation on SNSs of race and ethnicity [Gajjala, 2007], religion
[Nyland and Near, 2007], gender [Geidner et al., 2007, Hjorth and Kim, 2005], and
national identity [Fragoso, 2006]. Preliminary research has also suggested that online
friendship networks can be exploited to improve shopper recommendation systems
on websites such as Amazon [Zheng et al., 2007].
Several papers have attempted to increase understanding of how SNS friendships
form. For example, Kumar et al. [2006] examined preferential attachment models
of SNS growth, concluding that it is important to consider different classes of users.

Lampe et al. [2007] explored the relationship between profile elements and number
of Facebook friends, and other scholars have examined the importance of geography
[Liben-Nowell et al., 2005] and online message activity [Golder et al., 2007] to online
friendship formation. Other papers have established strong correlations between network participation and website activity, including the motivation of people to join
particular groups [Backstrom et al., 2006], the recommendations of online groups
[Spertus et al., 2005], online messages and friendship formation [Golder et al., 2007],
interaction activity versus sense of belonging [Chin and Chignell, 2007], and the role
of explicit ideological relationship designations in affecting voting behavior [Brzozowski et al., 2008, Hogg et al., 2008]. Lewis et al. [2008b] used Facebook data for
an entire class of freshmen at an unnamed, private American university to conduct
a quantitative study of social networks and cultural preferences. The same data set
was also used to examine user privacy settings on Facebook [Lewis et al., 2008a].
In the present paper, we study the complete Facebook networks of 100 American
colleges and universities from a single-day snapshot in September 2005. This paper
is a sequel to our previous research on 5 of these institutions [Traud et al., 2010],
in which we developed some of the methodology that we employ here. In September 2005, one needed a .edu e-mail address to become a member of Facebook, and
the majority of friendship ties were within the same institution. We thus ignore
links between nodes at different institutions and study the Facebook networks of
the 100 institutions as 100 separate networks. For each network, we have categorical data encompassing the gender, major, class year, high school, and residence
(e.g., dormitory, House, or fraternity) of the users. We examine homophily and
community structure (network partitions that are obtained algorithmically) for each
of the networks and compare the community structure to partitions based on the
given categorical data. We thereby compare and contrast the organizations of the
100 different Facebook networks, which arguably allows us to compare and contrast
the organizations of the underlying university social networks that they imperfectly
represent. In addition to the inherent interest of these Facebook networks, our investigation is important for subsequent use of these networks (which were formed
via ostensibly the same generative mechanism online) as benchmark examples for
numerous types of computations, such as new community detection methods.
The remainder of this paper is organized as follows. We first discuss the Facebook data and present the methods that we used for testing homophily at the dyad
level and demographic prevalences at the community level. We then present and
discuss results on the largest connected components of the networks, student-only
subnetworks, and single-gender subnetworks. Finally, we summarize and discuss our
findings.
2. Data
The data that we use was sent directly to us in anonymized form by Adam
D'Angelo of Facebook. It consists of the complete set of users (nodes) from the
Facebook networks at each of 100 American institutions (which we enumerate in
Table A.1) and all of the friendship links between those users' pages as they existed
in September 2005. The data clearly identifies most institutions, although there are a
small number of disambiguation problems. For instance, 4 different UC institutions
plus Cal are in the data, and there are 2 Texas listings. Each institution in the
data includes a number appearing as part of its name that appears to correspond to
the order in which each institution joined Facebook. The data can be downloaded
at http://people.maths.ox.ac.uk/porterm/data/facebook100.zip.
Similar snapshots of Facebook data from 10 Texas institutions were analyzed
recently by Mayer and Puller [2008], and a snapshot from a diverse private college in
the Northeast U.S. was studied by Lewis et al. [2008b]. Other studies of Facebook
have typically obtained data either through surveys [Boyd and Ellison, 2007] or
through various forms of automated sampling [Gjoka et al., 2010], thereby missing
nodes and links that can impact the resulting graph structures and analyses. We
consider only ties between people at the same institution, yielding 100 separate
realizations of university social networks and allowing us to compare the structures
at different institutions.
We consider four networks for each of the 100 Facebook data sets: the largest
connected component of the full network (which we hereafter identify as Full), the
largest connected component of the student-only network (Student), the largest
connected component of the female-only network (Female), and the largest connected component of the male-only network (Male). The Male and Female networks are each subsets of the Full network rather than the Student network. Each
network has a single type of unweighted, undirected connection between nodes and
can thus be represented as an adjacency matrix A with elements Aij = Aji indicating
the presence (Aij = 1) or absence (Aij = 0) of a tie between nodes i and j. The
resulting tangle of nodes and links, which we illustrate for the Reed College student
Facebook network in Figure 1, can obfuscate any organizational structure that might
be present.
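The adjacency-matrix representation and the restriction to a largest connected component can be illustrated with a minimal Python sketch. The function name `largest_connected_component` and the toy graph are hypothetical illustrations, not part of the data or of the authors' code.

```python
from collections import deque

import numpy as np


def largest_connected_component(A):
    """Return the sorted node indices of the largest connected component.

    A is a symmetric 0/1 adjacency matrix with A[i, j] == A[j, i].
    """
    A = np.asarray(A)
    n = A.shape[0]
    seen = np.zeros(n, dtype=bool)
    best = []
    for start in range(n):
        if seen[start]:
            continue
        # Breadth-first search from a node that has not been visited yet.
        component = [start]
        seen[start] = True
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in np.flatnonzero(A[u]):
                if not seen[v]:
                    seen[v] = True
                    component.append(int(v))
                    queue.append(int(v))
        if len(component) > len(best):
            best = component
    return sorted(best)


# Toy example (hypothetical): a triangle {0, 1, 2} and a separate edge {3, 4}.
A = np.zeros((5, 5), dtype=int)
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4)]:
    A[i, j] = A[j, i] = 1

lcc = largest_connected_component(A)
A_lcc = A[np.ix_(lcc, lcc)]  # adjacency matrix restricted to that component
```

A Student, Female, or Male network could be obtained in the same spirit by first selecting the relevant rows and columns of A and then extracting the largest component of the result.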
The data also includes limited demographic (categorical) information that is volunteered by users on their individual pages: gender, class year, and (using anonymous
numerical identifiers) high school, major, and residence. We use a Missing label
for situations in which individuals did not volunteer a particular characteristic. The
different characteristics allow us to make comparisons between institutions, under
the assumption (see the discussion by Boyd and Ellison [2007]) that the communities and other elements of structural organization in Facebook networks reflect (even
if imperfectly) the social communities and organization of the offline networks on
which they are based. It is an important research issue to determine just how imperfect this might be [Hogan, 2009], but this is far beyond the scope of the present paper
(though we hope that others will take on this particular challenge). The conclusions
that we draw in this paper apply directly to the university Facebook networks from
September 2005, and we expect that they can provide insight about the real-world
social networks at the institutions as well.
3. Methods
We study each network at both the dyad level and the community level. We first
consider homophily [McPherson et al., 2001, Newman, 2010, Wasserman and Faust,
1994], quantified by assortativity coefficients using the available categorical data.
For some of the smaller networks, we additionally perform independent logistic regression on node pairs to obtain the log odds contributions to edge presence between
two nodes that have the same categorical-data value. We similarly fit exponential
random graph models (ERGMs) [Frank and Strauss, 1986, Handcock et al., 2008,
Lubbers and Snijders, 2007, Robins et al., 2007, Wasserman and Pattison, 1996]
with triangle terms to these smaller networks. Finally, we partition the networks by
algorithmically detecting communities [Fortunato, 2010, Porter et al., 2009], which
we compare to the given categorical data using the technique in this paper's prequel [Traud et al., 2010]. Calculating assortativity values and log odds contributions
allows us to examine microscopic features of the networks, while comparing algorithmic partitions of the networks to the categorical data allows us to examine their
macroscopic features. As we illustrate below, both perspectives are important
because they provide complementary insights.
3.1. Assortativity
A general measure of scalar assortativity r relative to a categorical variable is
given by Newman [2003, 2010]:
\[
r = \frac{\mathrm{tr}(e) - \|e^2\|}{1 - \|e^2\|} \in [-1, 1], \tag{1}
\]
where $e = E/\|E\|$ is the normalized mixing matrix, the elements $E_{ij}$ indicate the
number of edges in the network that connect a node of type $i$ (e.g., a person with
a given major) to a node of type $j$, and the entry-wise matrix 1-norm $\|E\|$ is equal
to the sum of all entries of $E$. By construction, this formula yields $r = 0$ when the
amount of assortative mixing is the same as that expected independently at random
(i.e., $e_{ij}$ is simply the product of the fraction of nodes of type $i$ and the fraction of
nodes of type $j$), and it yields $r = 1$ when the mixing is perfectly assortative.
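As an illustration of this definition, the following Python sketch computes r from an adjacency matrix and a vector of categorical labels. The function name `assortativity` and the two toy graphs are hypothetical, and the sketch makes no special provision for missing labels (it treats every label value as an ordinary category).

```python
import numpy as np


def assortativity(A, labels):
    """Assortativity r = (tr(e) - ||e^2||) / (1 - ||e^2||) for a categorical label."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    types = np.unique(labels)
    # Indicator matrix: H[node, t] = 1 if the node has type t.
    H = (labels[:, None] == types[None, :]).astype(float)
    # E[s, t] counts edge ends that join type s to type t; each undirected
    # edge contributes once to E[s, t] and once to E[t, s].
    E = H.T @ A @ H
    e = E / E.sum()
    e2_norm = (e @ e).sum()  # entry-wise 1-norm of e^2 (all entries are nonnegative)
    return (np.trace(e) - e2_norm) / (1.0 - e2_norm)


# Toy example 1 (hypothetical): two disjoint edges, each joining same-type nodes.
A1 = np.zeros((4, 4))
for i, j in [(0, 1), (2, 3)]:
    A1[i, j] = A1[j, i] = 1
r_perfect = assortativity(A1, [0, 0, 1, 1])

# Toy example 2 (hypothetical): a 4-cycle in which every edge joins different types.
A2 = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    A2[i, j] = A2[j, i] = 1
r_disassortative = assortativity(A2, [0, 1, 0, 1])
```

The first toy graph is perfectly assortative and the second is perfectly disassortative, so the two values sit at the ends of the interval [-1, 1].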
3.2. Logistic Regression and Exponential Random Graphs

We further measure the influence of the available user characteristics on the likelihood of a friendship tie via a fit by logistic regression (under an assumption of
independent dyads) and by an ERGM specification that includes triangle terms. Our
focus is on trying to calculate the propensity for two nodes with the same categorical value to form a tie. We consider each of the four categorical variables (major,
residence, year, and high school) and use the ERGM package in R [Handcock et al.,
2008] for both models (treating each network as undirected). We used R 2.11.1
and the statnet package version 2.1-1, and we note that different versions of R
and statnet caused different degrees of convergence with the structural elements
in the model. We obtained results for the 16 smallest institutions. (We did these
calculations on a 32-bit operating system, which restricts the network sizes that
can be processed.) Both models that we consider are based on a standard ERGM
parametrization $P_\theta\{Y = A\} = \exp\{\theta \cdot g(A)\}/\kappa(\theta)$ describing the distribution of
graphs with model coefficients $\theta$ corresponding to statistics $g$ calculated from the adjacency matrix $A$ (with a normalizing factor $\kappa$ to ensure that the formula yields a
probability distribution) [Frank and Strauss, 1986, Handcock et al., 2008, Lubbers
and Snijders, 2007, Robins et al., 2007, Wasserman and Pattison, 1996].
In the first model (logistic regression), we include five statistics (with five corresponding coefficients): the total density of ties (edges) and the common classifications (nodematch) from each of four node/user characteristics: residence, class
year, major, and high school. For example, the highschool contribution describes the
additional log-odds predisposition for a friendship tie when two users are from
the same high school. In all cases, we ignore possible contributions from missing
characteristic data: two nodes with the same missing data field are not treated as
having the same value for the characteristic. Rather than include gender explicitly
in the model, we instead additionally fit the model to the single-gender subnetworks
in order to be consistent with the treatment of gender in the community-level comparisons below. In the second model (an ERGM), we add a triangle statistic to
account for the observed amount of transitivity in the network data. This gives a
total of six coefficients: edges, common residence, common class year, common
major, common high school, and the triangle coefficient.
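To make the meaning of a nodematch coefficient concrete, here is a minimal Python sketch for a simplified case: one attribute at a time rather than the four used jointly in the models above. With an edges term and a single binary nodematch covariate, the logistic-regression coefficient equals the log odds ratio of the corresponding 2x2 table of dyads, which is what the sketch computes. It is not the statnet fit that we used; the function name `nodematch_log_odds`, the missing-value code 0, and the toy data are assumptions made for illustration.

```python
import numpy as np


def nodematch_log_odds(A, values, missing=0):
    """Log odds ratio of a tie for dyads that match on `values` versus dyads that do not.

    Assumes that all four cells of the 2x2 dyad table are nonempty.
    """
    A = np.asarray(A)
    values = np.asarray(values)
    i, j = np.triu_indices(A.shape[0], k=1)  # each unordered dyad exactly once
    tie = A[i, j] == 1
    # Two nodes that both carry the missing code are NOT treated as a match.
    match = (values[i] == values[j]) & (values[i] != missing)
    n11 = np.sum(tie & match)    # matched dyads with a tie
    n10 = np.sum(~tie & match)   # matched dyads without a tie
    n01 = np.sum(tie & ~match)   # unmatched dyads with a tie
    n00 = np.sum(~tie & ~match)  # unmatched dyads without a tie
    return np.log(n11 / n10) - np.log(n01 / n00)


# Toy example (hypothetical): six users, two high schools coded 1 and 2.
A = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (1, 2), (3, 4), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1

coef_complete = nodematch_log_odds(A, np.array([1, 1, 1, 2, 2, 2]))
# Same graph, but the last three users did not report a high school (code 0).
coef_missing = nodematch_log_odds(A, np.array([1, 1, 1, 0, 0, 0]))
```

In the first call, 4 of the 6 matched dyads and 1 of the 9 unmatched dyads are ties, so the log odds contribution is log(16). In the second call, the three users with missing data no longer count as matching each other, and the value drops to log(6).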
3.3. Community Detection
The global organization of social networks often includes coexisting modular (horizontal) and hierarchical (vertical) organizational structures, and myriad papers have
attempted to interpret such organization through the computational identification
of community structure. Communities are defined in terms of cohesive groups of
nodes with more internal connections (between nodes in the same group) than external connections (between nodes in the group and nodes in other groups). As
discussed at length in two recent review articles [Fortunato, 2010, Porter et al., 2009]
and in references therein, the ensemble of techniques available to detect communities
is both numerous and diverse. Existing techniques include hierarchical clustering
methods such as single linkage clustering, centrality-based methods, local methods,
optimization of quality functions such as modularity and similar quantities, spectral partitioning, likelihood-based methods, and more. Communities are considered
to be not merely structural modules but are also expected to have functional
importance because of the large number of common ties among nodes in a community.
For example, communities in social networks might correspond to circles of friends or
business associates and communities in the World Wide Web might encompass pages
on closely-related topics. In addition to remarkable successes on benchmark problems, investigations of community structure have observed correspondence between
communities and "ground truth" groups in diverse application areas, including the
reconstruction of college football conferences [Girvan and Newman, 2002] and the
investigation of such structures in algorithmic rankings [Callaghan et al., 2007]; the
investigation of committee assignments [Porter et al., 2005], legislation cosponsorship
[Zhang et al., 2008], and voting blocs [Mucha et al., 2010, Waugh et al., 2009] in the
United States Congress; the examination of functional groups in metabolic networks
[Guimerà and Amaral, 2005]; the study of ethnic preferences in school friendship
networks [Gonzalez et al., 2007]; and the study of social structures in mobile-phone
conversation networks [Onnela et al., 2007].
In the present paper, we investigate the community structures of the Facebook
networks from each of the 100 colleges and universities. (See the visualization of the
community structure for Reed College in Figure 2.) For each institution, we consider the Full, Student, Female, and Male networks. We seek to determine how well
the demographic labels included in the data correspond to algorithmically computed
communities. Assortativity provides a local measure of homophily, but that does
not provide sufficient information to draw conclusions about the global organization
of the Facebook networks. For example, two students who attended the same high
school are typically more likely to be friends with each other than are two students
who attended different high schools, but this will not necessarily have a meaningful
community-level effect unless enough of the students went to common high schools.
As we will see below, high school tends to be a much more dominant organizing
characteristic of the social structure at the large institutions than at small institutions, presumably because of a significant frequency of common high school pairs at
the large institutions.
We identify communities by optimizing the modularity quality function $Q = \sum_i (e_{ii} - b_i^2)$, where $e_{ij}$ denotes the fraction of ends of edges in group $i$ for which the
other end of the edge lies in group $j$ and $b_i = \sum_j e_{ij}$ is the fraction of all ends of
edges that lie in group $i$. High values of modularity correspond to community assignments with greater numbers of intra-community links than expected at random (with
respect to a particular null model [Fortunato, 2010, Newman, 2006a, Porter et al.,
2009]). Although numerous other community detection methods are also available,
modularity optimization is perhaps the most popular way to detect communities and
it has been successfully applied to many applications [Fortunato, 2010, Porter et al.,
2009]. One might also consider using a method that includes a resolution parameter
[Reichardt and Bornholdt, 2006] to avoid issues with resolution limits [Fortunato
and Barthelemy, 2007]. However, our primary focus is on global organization of the
networks, so we limit our attention to the default resolution of modularity. This
focus arguably biases our study of communities to the largest structures, such as
those influenced by common class year, making the observed correlations with other
demographic characteristics even more striking.
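The modularity of a given hard partition (at the default resolution) can be evaluated directly from this definition. The Python sketch below is a minimal illustration; the function name `modularity` and the toy graph are hypothetical, and the sketch only scores a partition rather than optimizing over partitions.

```python
import numpy as np


def modularity(A, communities):
    """Modularity Q = sum_i (e_ii - b_i^2) of a hard partition."""
    A = np.asarray(A, dtype=float)
    communities = np.asarray(communities)
    groups = np.unique(communities)
    H = (communities[:, None] == groups[None, :]).astype(float)
    # e[i, j]: fraction of all edge ends that lie in group i and whose
    # other end lies in group j.
    e = (H.T @ A @ H) / A.sum()
    b = e.sum(axis=1)  # fraction of all edge ends that lie in each group
    return np.trace(e) - np.sum(b ** 2)


# Toy example (hypothetical): two triangles joined by the single edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1

Q_triangles = modularity(A, [0, 0, 0, 1, 1, 1])  # one community per triangle
Q_single = modularity(A, [0, 0, 0, 0, 0, 0])     # every node in one community
```

For the toy graph, splitting along the bridging edge gives Q = 12/14 - 1/2 = 5/14, whereas placing all nodes in a single community gives Q = 0.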
To try to ensure that the communities we detect are properties of the data rather
than of the algorithms that we used, we optimize modularity (with default resolution) using 6 different combinations of spectral optimization, greedy optimization,
and Kernighan and Lin [1970] (KL) node-swapping steps (in the manner discussed
by Newman [2006b]). Specifically, we use (1) recursive partitioning by the leading
eigenvector of a modularity matrix [Newman, 2006a], (2) recursive partitioning by the
leading pair of eigenvectors (including the Richardson et al. [2009] extension of the
method in Newman [2006a]), (3) the Louvain greedy method [Blondel et al., 2008],
and each of these three supplemented with small increases in the quality Q that can
be obtained using KL node swaps. Each of these 6 methods yields a community
partition, and we obtain our comparisons (described in Section 3.4) by considering
each of these 6 partitions.
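As a rough illustration of method (1), the following Python sketch performs a single bisection step using the leading eigenvector of the modularity matrix B = A - k k^T/(2m), where k is the vector of node degrees and m is the number of edges. It is a simplified sketch under stated assumptions, not the implementation that we used: it omits the recursion into subgroups, the two-eigenvector variant, the Louvain method, and the KL node swaps. The function name `leading_eigenvector_split` and the toy graph are hypothetical.

```python
import numpy as np


def leading_eigenvector_split(A):
    """One bisection step: split nodes by the sign of the leading eigenvector
    of the modularity matrix B = A - k k^T / (2m)."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)   # node degrees
    two_m = k.sum()     # twice the number of edges
    B = A - np.outer(k, k) / two_m
    eigenvalues, eigenvectors = np.linalg.eigh(B)  # B is symmetric
    if eigenvalues[-1] <= 1e-12:
        # No positive eigenvalue: this step finds no beneficial split.
        return np.zeros(A.shape[0], dtype=int)
    leading = eigenvectors[:, -1]  # eigh returns eigenvalues in ascending order
    return (leading > 0).astype(int)


# Toy example (hypothetical): two triangles joined by the single edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1

labels = leading_eigenvector_split(A)
```

The overall sign of an eigenvector is arbitrary, so only the grouping is meaningful, not which group receives the label 0 or 1. On the toy graph, the split separates the two triangles.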
Modularity optimization is NP-hard [Brandes et al., 2008], so one must be cautious about the large number of degenerate partitions in the modularity landscape
[Good et al., 2010]. However, by detecting coarse observables (in particular, the
global organization of a Facebook network based on the given categorical data) and
considering results that are averaged over multiple optimization methods, one can
obtain interesting insights. The specific best partition will vary from one method
to another, but some of the predicted coarse organizational structure of the networks
(see below) is robust to the choice of community detection algorithm.
3.4. Comparing Communities to Node Data
Once we have detected communities for each institution, we will compare the
algorithmically-obtained community structure to the available categorical data for
the nodes. We recently developed a methodology to accomplish this goal in Traud
et al. [2010] (where we considered only 5 institutions among the 100 in order to
illustrate the techniques). This method of comparison can be applied to the output
of any hard partitioning algorithm in which each node is assigned to precisely one
community (cf. soft partitioning methods, in which communities can overlap). We
briefly review that methodology here.
To compare a network partition to the categorical demographic data, we standardize (using a z-score) the Rand coefficient of the communities in that partition
compared to partitioning based purely on each of the four categorical variables (one
at a time). For each comparison, we calculate the Rand z-score $z$ in terms of the
total number of pairs of nodes in the network $M$, the number of pairs that are in the
same community $M_1$, the number of pairs that have the same categorical value $M_2$,
and the number of pairs of nodes that are both in the same community and have
the same categorical value $w$ [Traud et al., 2010]. The Rand coefficient is given in
terms of these quantities by $S = [w + (M - M_1 - M_2 + w)]/M$ [Rand, 1971]. We then
calculate the z-score for the Rand coefficient as [Hubert, 1977, Traud et al., 2010]
\[
z = \frac{1}{\sigma_w}\left(w - \frac{M_1 M_2}{M}\right), \tag{2}
\]
where
\[
\sigma_w^2 = \frac{M}{16} - \frac{(4M_1 - 2M)^2 (4M_2 - 2M)^2}{256 M^2} + \frac{C_1 C_2}{16 n(n-1)(n-2)} + \frac{[(4M_1 - 2M)^2 - 4C_1 - 4M][(4M_2 - 2M)^2 - 4C_2 - 4M]}{64 n(n-1)(n-2)(n-3)}, \tag{3}
\]
$n$ is the number of nodes in the network, the coefficients $C_1$ and $C_2$ are given by
\[
C_1 = n(n^2 - 3n - 2) - 8(n+1)M_1 + 4\sum_i n_{i\cdot}^3, \qquad
C_2 = n(n^2 - 3n - 2) - 8(n+1)M_2 + 4\sum_j n_{\cdot j}^3, \tag{4}
\]
$n_{ij}$ denotes an element of a contingency table and indicates the number of nodes that
are classified into the $i$th group of the first partition and the $j$th group of the second
partition, $n_{i\cdot} = \sum_j n_{ij}$ is a row sum, and $n_{\cdot j} = \sum_i n_{ij}$ is a column sum. Each z-score
indicates the deviation from randomness in comparing the community structure with
the partitioning based purely on that single demographic characteristic. One needs
to be cautious when interpreting such deviations from randomness as a strength
of correlation. In particular, given the dependence on system size inherent in this
measure, one should not overinterpret the relative values of z-scores from different
institutions. Nevertheless, the z-scores provide a reasonable proxy quantity both for
the statistical significance of correlation and for the relative strength of correlation
in a specified network.
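The computation of the Rand coefficient and its z-score from two hard partitions can be sketched in Python as follows. This is a minimal illustration that assumes a network with at least 4 nodes and partitions for which the variance is positive; the function name `rand_zscore` and the toy partitions are hypothetical.

```python
import numpy as np


def rand_zscore(part1, part2):
    """Return (S, z): the Rand coefficient and its z-score for two hard partitions."""
    part1 = np.asarray(part1)
    part2 = np.asarray(part2)
    n = part1.size
    # Contingency table n_ij: number of nodes in group i of part1 and group j of part2.
    _, inv1 = np.unique(part1, return_inverse=True)
    _, inv2 = np.unique(part2, return_inverse=True)
    table = np.zeros((inv1.max() + 1, inv2.max() + 1))
    np.add.at(table, (inv1, inv2), 1)
    rows = table.sum(axis=1)  # row sums
    cols = table.sum(axis=0)  # column sums

    def pairs(x):
        # Total number of unordered pairs within groups of the given sizes.
        return np.sum(x * (x - 1) / 2.0)

    M = n * (n - 1) / 2.0   # all pairs of nodes
    M1 = pairs(rows)        # pairs in the same group of part1
    M2 = pairs(cols)        # pairs in the same group of part2
    w = pairs(table)        # pairs in the same group of both partitions
    S = (w + (M - M1 - M2 + w)) / M

    C1 = n * (n**2 - 3 * n - 2) - 8 * (n + 1) * M1 + 4 * np.sum(rows**3)
    C2 = n * (n**2 - 3 * n - 2) - 8 * (n + 1) * M2 + 4 * np.sum(cols**3)
    a1 = (4 * M1 - 2 * M) ** 2
    a2 = (4 * M2 - 2 * M) ** 2
    var_w = (
        M / 16.0
        - a1 * a2 / (256.0 * M**2)
        + C1 * C2 / (16.0 * n * (n - 1) * (n - 2))
        + (a1 - 4 * C1 - 4 * M) * (a2 - 4 * C2 - 4 * M)
        / (64.0 * n * (n - 1) * (n - 2) * (n - 3))
    )
    z = (w - M1 * M2 / M) / np.sqrt(var_w)
    return S, z


# Toy example (hypothetical): six nodes, two groups of three, compared with itself.
partition = [0, 0, 0, 1, 1, 1]
S_same, z_same = rand_zscore(partition, partition)
```

For this toy case, M = 15, M_1 = M_2 = w = 6, and C_1 = C_2 = -24, so the variance is 0.9375 - 0.0225 + 0.3 + 0.225 = 1.44. This agrees with a direct enumeration over the 20 ways to relabel one of the partitions, and it gives z = (6 - 2.4)/1.2 = 3.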
4. Results
We now use the methods outlined in the previous section to study the Facebook
networks. We first follow the order of presentation above and then make some
observations in combinations. Complete results are available in the tables in the
appendix.
4.1. Assortativity
We tabulate the assortativities based on gender, major, residence, class year, and
high school for all networks (and subsets thereof) in Table A.2.
For almost all of the institutions and each of the 4 network subsets, the class year
attribute produces higher assortativity values than the other available demographic
characteristics. However, Rice University (31), California Institute of Technology
(36), University of Georgia (50), University of Michigan (67), Auburn University
(71), and University of Oklahoma (97) are each examples in which residence provides the highest assortativity values (again, for each of the 4 network subsets). We
discussed Caltech as a focal example in Traud et al. [2010], in which we introduced
the community comparison methods that we employ below.
Other institutions have varying orderings of class year and residence assortativity
among the 4 network subsets. At MIT (8), USF (51), Notre Dame (57), University of
Maine (59), UC (61), UC (64), and MU (78), residence gives the highest assortativity
in the Male networks. The UCF (52) Female network has its highest assortativity
with residence. Both the Full network and the Male network for University of California at Santa Cruz (68) have their highest assortativity values with residence. Both
the Male and Female networks at University of Illinois at Urbana-Champaign (20),
Tulane (29), UC (33), Florida State University (53), Cal (65), University of Mississippi (66), University of Indiana (69), Texas (80), Texas (84), University of Wisconsin
(87), Baylor (93), University of Pennsylvania (94), and University of Tennessee (95)
have their highest assortativity values with residence; all other networks from these
institutions have their highest assortativity with class year.
Some outlying observations can be tied directly to small samples. For example,
Simmons (81) is a female-only college. It has only four males in the Full network;
none of the males had any connections with another male, so the gender assortativity
values for both the Full and Student components are very close to 0. Similar gender
numbers are also present in the data from Wellesley (22) and Smith (60).
4.2. Dyad-Level Regression and Exponential Random Graphs
We use the two statistical models described in Section 3.2 to study the 16 smallest
institutions. The (dyad-independent) logistic regression model includes contributions
from edges (network density) and matched user (node) characteristics for each of
four demographic variables. We present the results for this model in Table A.3. The
second model that we consider is an ERGM, which supplements the first model with
a structural triangle contribution. We present the results for this model in Table
A.4. These calculations give views of the networks at the microscopic (dyad-level)
scale that supplement the results that we obtained using the assortativity statistics.
We consider the results from the 16 smallest institutions by fitting the models to
each of their Full, Student, Female, and Male networks. Because all of the resulting model coefficients appear to be statistically significant at a p-value of less than
$10^{-4}$, we interpret the importance of node matching on the different demographic
characteristics directly from the magnitude of the corresponding model coefficients.
We summarize the results for these 16 institutions using the box plots in Figures
3 and 4. The box plots identify the outliers by institution number: Caltech (36),
Oberlin (44), Smith (60), Simmons (81), Vassar (85), and Reed (98). (As we have
only performed this regression analysis for the 16 smallest institutions in the data,
one should not jump to conclusions from this list of outliers.) For all institutions
and all four types of networks for each institution, the highest coefficient in the employed ERGM model (with triangle terms) is given for matching the High School
category, and the value of this coefficient is significantly higher than those for the
other node-matching coefficients. Only the Caltech (36) Female network has ERGM
coefficients for Year, Residence, and High School that are very close to each other.
4.3. Comparison of Communities
We now discuss community-level results for each network using z-scores of the
Rand coefficient to compare partitions obtained via algorithmic community detection to partitions based on each characteristic. That is, each community-detection
result identifies a group assignment for each node, thereby producing a partition
(called a hard partition) in which each node is assigned to exactly one community. One can also obtain a hard partition for each network by selecting a single characteristic and grouping nodes according to that characteristic. Every network that we study (including the subnetworks) has at least one z-score in the set
$\{z_{\mathrm{Major}}, z_{\mathrm{Year}}, z_{\mathrm{HS}}, z_{\mathrm{Residence}}\}$ with a value greater than 5. Although the distribution of Rand coefficients is decidedly not Gaussian, particularly in the tails of the
distributions [Brook and Stirling, 1984, Kulinskaya, 1994, Traud et al., 2010], this
z = 5 threshold indicates that at least one characteristic in each network exhibits
strong statistical significance. Moreover, we will see that the vast majority of our
comparisons below exceed the z = 2 threshold. (That is, they essentially lie outside
95% confidence intervals.)
To visualize and compare the varied strengths of organization according to the
different demographic characteristics, we represent the four z-scores obtained for

each network (Full, Student, Female, and Male) of an institution using 3-dimensional
barycentric (tetrahedral) coordinates [Franklin, 2002, Weisstein, 2011]. We start by
setting all negative z-scores to 0, as all observed negative z-score values are small
enough to be statistically insignificant. We then normalize by the sum of the z-scores
to obtain
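\[
z_1 = \frac{z_{\mathrm{Major}}}{z_{\mathrm{Major}} + z_{\mathrm{Year}} + z_{\mathrm{HS}} + z_{\mathrm{Residence}}}, \qquad
z_2 = \frac{z_{\mathrm{Residence}}}{z_{\mathrm{Major}} + z_{\mathrm{Year}} + z_{\mathrm{HS}} + z_{\mathrm{Residence}}},
\]
\[
z_3 = \frac{z_{\mathrm{Year}}}{z_{\mathrm{Major}} + z_{\mathrm{Year}} + z_{\mathrm{HS}} + z_{\mathrm{Residence}}}, \qquad
z_4 = \frac{z_{\mathrm{HS}}}{z_{\mathrm{Major}} + z_{\mathrm{Year}} + z_{\mathrm{HS}} + z_{\mathrm{Residence}}}. \tag{5}
\]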
From these 4 z-score values, we calculate coordinates $X = (x_1, x_2, x_3)$ located inside
a tetrahedron. For example, one can obtain a tetrahedron whose vertices are
$p_1 = (1, 0, 0)$, $p_2 = (\cos(2\pi/3), \sin(2\pi/3), 0)$, $p_3 = (\cos(4\pi/3), \sin(4\pi/3), 0)$, and
$p_4 = (0, 0, \sqrt{2})$ with the transformation
\[
X = (T Z) + p_4, \qquad
T = \begin{pmatrix} p_1 - p_4 & p_2 - p_4 & p_3 - p_4 \end{pmatrix}, \qquad
Z = \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}. \tag{6}
\]
The information from $z_4 = 1 - (z_1 + z_2 + z_3)$ is implicitly included in (6) because of
the normalization. Each of the 4 vertices of the tetrahedron corresponds to a limit
in which the corresponding z-score completely dominates the other three z-scores.
That is, at a vertex, the entire z-score sum arises from the corresponding component.
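The mapping from four z-scores to a point in the tetrahedron can be sketched as follows. This is a minimal illustration of Eqs. (5) and (6), assuming that the vertex angles are 2π/3 and 4π/3 and that the apex is p4 = (0, 0, √2), which makes the tetrahedron regular. The function name `tetrahedral_point` is hypothetical.

```python
from math import cos, sin, pi, sqrt

# Vertices of the tetrahedron. The apex height sqrt(2) is an assumption that
# makes all edges equal in length (a regular tetrahedron).
P1 = (1.0, 0.0, 0.0)                              # Major
P2 = (cos(2 * pi / 3), sin(2 * pi / 3), 0.0)      # Residence
P3 = (cos(4 * pi / 3), sin(4 * pi / 3), 0.0)      # Year
P4 = (0.0, 0.0, sqrt(2.0))                        # High School


def tetrahedral_point(z_major, z_year, z_hs, z_residence):
    """Map four z-scores to X = T Z + p4, following Eqs. (5) and (6)."""
    # Order the scores as z1..z4 of Eq. (5) and set negative values to 0.
    clipped = [max(z, 0.0) for z in (z_major, z_residence, z_year, z_hs)]
    total = sum(clipped)
    if total == 0:
        raise ValueError("no positive z-score to normalize by")
    z1, z2, z3, _z4 = (z / total for z in clipped)
    # The columns of T are p1 - p4, p2 - p4, and p3 - p4.
    return tuple(
        z1 * (P1[k] - P4[k]) + z2 * (P2[k] - P4[k]) + z3 * (P3[k] - P4[k]) + P4[k]
        for k in range(3)
    )


# Hypothetical example: only class year has a positive z-score.
x_year_only = tetrahedral_point(z_major=-1.0, z_year=10.0, z_hs=0.0, z_residence=0.0)
```

Because the normalized scores sum to 1, the result is the barycentric combination z1 p1 + z2 p2 + z3 p3 + z4 p4, so a network dominated by a single characteristic lands on the corresponding vertex.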
Because of the strong role of class year, we visualize the tetrahedra from a perspective located above the vertex corresponding to class year and project the result
into the opposing face of the tetrahedron. We calculate the point X for each of the
6 algorithmic partitions of each network (i.e., using the aforementioned 6 different
community detection methods). For each institution, we plot a disk whose center
lies at the midpoint of these 6 X coordinates. The width of each disk is proportional
to the maximum observed difference between these 6 sets of coordinates (with these
distances separated into bins of width .1, as indicated in the legends of Figures 5-8).
For example, in Figure 5, the Pepperdine (86) results have a maximum distance of
.0141 between partitions, so Pepperdine (86) is represented by one of the smallest
disks. Harvard (1) has a maximum distance of .1581 between partitions; this lies
in [.1, .2), so Harvard (1) is represented by one of the disks of second smallest size.
We emphasize that the computed differences are much larger than the span of the
depicted disks, whose sizes allow one to discern the results from different institutions.
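The disk placement and sizing described above can be sketched as follows. This is an illustration under stated assumptions: the "midpoint" of the 6 points is taken to be their coordinate-wise mean, and the function name `disk_summary` is hypothetical.

```python
from itertools import combinations
from math import dist, floor


def disk_summary(points, bin_width=0.1):
    """Return (center, max pairwise distance, size-bin index) for a set of points."""
    n = len(points)
    # Assumption: the disk center is the coordinate-wise mean of the points.
    center = tuple(sum(p[k] for p in points) / n for k in range(3))
    # Largest Euclidean distance between any two of the points.
    max_distance = max((dist(p, q) for p, q in combinations(points, 2)), default=0.0)
    # Bin 0 covers [0, .1), bin 1 covers [.1, .2), and so on. The small offset
    # guards against floating-point round-off at bin boundaries.
    bin_index = floor(max_distance / bin_width + 1e-9)
    return center, max_distance, bin_index


# Hypothetical points whose largest separation matches the Harvard (1) value.
example_points = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (0.1581, 0.0, 0.0)]
center, max_distance, bin_index = disk_summary(example_points)
```

With a maximum distance of .1581, the bin index is 1, which corresponds to the interval [.1, .2) and hence to a disk of the second smallest size.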
In Figures 5-8, we show each of the 100 institutions, identified by number (see
Table A.1), using a disk that we have color-coded according to the Cartesian distance
of its center from the Year vertex. Class year is the predominant organizing category
among the ones present in the data, so most of the institutions are located very close
to the Year vertex. We zoom in on the Year vertex for each figure in order to better
discern the relative importance of class year at the institutions. Importantly, the
social organization of a few institutions differs considerably from that of the majority.
Each of these institutions lies close to the Residence vertex, so their community
structures are organized predominantly according to dormitory residence. Foremost
among these institutions are Rice (31) and California Institute of Technology (36).
As we discussed in Traud et al. [2010], California Institute of Technology (Caltech)
is well-known to be organized almost exclusively according to its undergraduate
House system [Looijen and Porter, 2007].
In repeatedly observing a strong correlation of class year with community structure, it is relevant to recall that the community detection method that we have
employed optimizes modularity at the default resolution. Because of the resolution
limit of modularity [Fortunato and Barthelemy, 2007], it might be interesting to explore individual networks at different scales using resolution parameters [Fortunato,
2010, Porter et al., 2009, Reichardt and Bornholdt, 2006]. We reiterate, however,
that our focus in the present paper is on large-scale features rather than precise node
membership of network partitions.
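To make the role of a resolution parameter concrete, the following sketch evaluates modularity for a given partition using the Reichardt-Bornholdt form, in which a parameter (called `gamma` here, a name chosen for this example) scales the null-model term and `gamma = 1` recovers the default resolution. It assumes an undirected, unweighted network given as an edge list and is not tied to any particular community-detection code.

```python
def modularity(edges, community, gamma=1.0):
    """Q = sum over communities c of [L_c / m - gamma * (d_c / (2 m))**2].

    L_c is the number of edges inside community c, d_c is the total degree of
    its nodes, and m is the number of edges in the network.
    """
    m = len(edges)
    internal_edges = {}
    degree_sum = {}
    for u, v in edges:
        # Each edge adds one to the degree of both endpoints.
        for node in (u, v):
            c = community[node]
            degree_sum[c] = degree_sum.get(c, 0) + 1
        if community[u] == community[v]:
            c = community[u]
            internal_edges[c] = internal_edges.get(c, 0) + 1
    return sum(
        internal_edges.get(c, 0) / m - gamma * (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Hypothetical toy network: two triangles joined by a single edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_default = modularity(toy_edges, two_groups)             # gamma = 1
q_fine = modularity(toy_edges, two_groups, gamma=2.0)     # penalizes large groups more
```

Larger values of `gamma` penalize large communities more heavily, so optimizing at such values tends to favor smaller communities, and smaller values favor larger ones.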
In Figure 5, we show the social organization tetrahedron for the Full networks
(i.e., for the largest connected components of the complete networks) for each
institution. Although the community structures of nearly all of the Full networks are
organized overwhelmingly by class year, a few of them are also heavily influenced by
dormitory residence. (We already mentioned above that Rice (31) and Caltech (36)
are organized predominantly by Residence.) For example, dormitory residence also
dominates the community structure at UC Santa Cruz [UCSC] (68), though to a
lesser extent than at Rice and Caltech. We also observe relatively high Residence z-scores at Smith (60), Auburn (71), and University of Oklahoma (97). Major seems to
be most important relative to the other available characteristics at Oberlin (44) and
Maine (59), though in both cases its relative correlation pales in comparison to that of
class year. High School seems to be most important at USF (51) and Tennessee (95),
though class year is again more important. Most of the institutions are clustered
tightly near the Year vertex, but Residence can often be rather important (and
sometimes even the most important category, as we have seen in three cases).
In Figure 6, we show the social organization tetrahedron for the Student networks
(i.e., for the largest connected component of the student-only subnetworks) for each
institution. As we saw with the Full networks, most of the institutions have community structures that are organized overwhelmingly according to class year. Rice,
Caltech, Smith, UCSC, Auburn, and Oklahoma are again exceptions, as dormitory
residence also exerts considerable (or even primary) influence at these institutions.
Additionally, considering the Student network reduces the relative dominance of the
Year vertex, although it clearly still dominates the social organization. This feature
is illustrated by institutions such as UC (64), UF (21), and Rutgers (89).
In Figure 7, we show the social organization tetrahedron for the Female networks
(i.e., for the largest connected component of the female-only subnetworks) for each
institution. Class year is once again the overwhelmingly dominant organizing characteristic, and dormitory residence is again important at institutions such as Rice,
Caltech, Smith, UCSC, Auburn, and Oklahoma. However, we now observe an increased importance of the High School vertex. USF (51), Tennessee (95), UF (21),
FSU (53), and GWU (54) all lie closer to the High School vertex than was the case
in the Full and Student networks.

In Figure 8, we show the social organization tetrahedron for the Male networks
(i.e., for the largest connected component of the male-only subnetworks) for each
institution. Class year is once again the overwhelmingly dominant organizing characteristic, and dormitory residence is again the most important category at institutions such as Rice, Caltech, and UCSC. Interestingly, considering the Male network
suggests that Residence is the most important factor for the social organization for
the males at Notre Dame (57). Residence also exerts an important influence on the
males at Michigan (67). This is starkly different from what we observed for these
institutions in the Full, Student, and Female networks (and would seem to be something interesting to investigate more thoroughly in the future using other data and
methods). The Male UCF (52), MSU (24), USF (51), Auburn (71), and Maine (59)
networks are strongly influenced by High School. The Male networks at Texas (80),
Rutgers (89), and University of Illinois at Urbana-Champaign (20) stand out from
other universities because of their proximity to the Major vertex.
4.4. Discussion
As described above, we see using the z-scores of the Rand coefficients for demographic characteristics versus algorithmic community assignments that Year is the
strongest organizing factor at most institutions but that Residence is much more
important for the community organization at some institutions than at others. The
correlation with Residence is especially prominent at Rice (31) and Caltech (36).
We also observe that the Male networks tend to be more scattered around Year, as
some institutions exhibit a stronger correlation with Major, whereas others have a
stronger correlation with high school. This suggests that there are potential differences in the gender patterns of friendships, which would be interesting to investigate
in future studies with new data. We do not explore this general issue further and
instead attempt to identify interesting comparisons with the results that we obtained
above. Although it is of course impossible to be exhaustive in our observations, we
present all of our assortativity values, regression model coefficients, and community-comparing z-scores in the tables in Appendix A. We also highlight some interesting
facets of our results.
Of particular interest is the comparison of results from the dyad-level regression
models to those from community-level correlations. We note in particular that, in the
logistic regression and exponential random graph models that we employed for the
smallest 16 institutions, the largest model coefficient for almost all institutions and all of their subnetworks is the one for a link between nodes
from a common High School. However, as we have seen (and as is particularly
evident in the tetrahedral visualizations), most institutions are organized by class year at the community level and have a relatively small correlation with
high school.
Even in the rare cases in which the rank ordering of the four correlations (with
Year, Residence, Major, and High School) at the community level matches that obtained via dyad-level model coefficients, such as with the logistic regression model for
the Full and Female networks from Caltech (36), the relative sizes of the contributions at the dyad level are completely different from those observed at the community
level. Caltech supplies an illustrative example of the different insights obtained from
community-detection versus logistic regression and exponential random graph models
both because of its small size and because of its outlying correlation with dormitory
residence at the community level. A simple interpretation of the apparent dichotomy
between the dyad-level model coefficients and the correlations at the community scale
is that the presence of two students from the same high school at a small institution
like Caltech yields a significant increase in the likelihood of a tie between those students. Even though the corresponding model coefficient is smaller than that at any
other of the 16 smallest institutions, it is comparable to that for common residence
(called Houses at Caltech). Nevertheless, the very small number of node pairs at
Caltech that have the same high school relative to the total number of node pairs has
a very small effect at the community level, as the algorithmically obtained communities are correlated overwhelmingly with House affiliation. The ERGM result with
triangle contributions makes this distinction even more striking, as the common high
school coefficient is actually larger than the coefficient from common House.
We also observe other features that might be worthy of future investigation using
other data sets and methodologies. We report the results of our calculations in depth
in Tables A.1-A.5. Here we highlight only a few potentially interesting examples in
which different methods or different subnetworks yield apparently different qualitative conclusions. For example, we found that Major is the second most important
factor for the organization of the communities in all of the Oberlin (44) networks,
but only for the Full and Male networks does the logistic regression give the second
highest coefficient for Major. We also observed that the relative ordering of Major at
the same institution is sometimes gender-dependent. For example, Major gives the
second largest z-score in the Female and Male networks of Stanford (3), but it gives
the fourth largest z-score in Stanford's Full network. Even more interesting, Major
gives the second largest z-score for the Female network at UVA (16), the third largest
z-score for UVA's Male network, and the fourth largest z-score for its Full network.
The communities in the Auburn (71) Female network are dominated by Residence,
but those in the other Auburn networks are not. Similarly, the communities in the
MIT (8) Male network are dominated by Residence, but those in the other MIT (8)
networks are not. Another interesting disparity based on gender occurs in the communities in the Tennessee (95) Full and Student networks, which have their second
largest contributions from High School, whereas those in the other two Tennessee
networks have their second largest contributions from Residence.
5. Conclusions
We have studied the social structure of Facebook friendship networks at one
hundred American institutions at a single point in time (using data from September
2005). To compare the organizations of the 100 institutions using categorical data,
we considered both microscopic and macroscopic perspectives. In particular, calculating assortativity coefficients and regression model coefficients based on observed
ties allows one to examine homophily at the local level, and algorithmic community
detection allows a complementary macroscopic picture. These approaches complement each other, providing different perspectives on investigations of these Facebook
networks. Such complementary calculations are particularly valuable when the microscopic and macroscopic perspectives identify different dominant contributions.
For example, in the Caltech networks, the assumed ground truth of the importance
of the House system is captured better by computing community structure.

This real-world ensemble of 100 networks formed by ostensibly similar mechanisms has the potential to provide a testing ground for various models of network
formation. Because of the useful comparisons such an ensemble of data can facilitate, this data will similarly be useful for studies of dynamic processes on networks,
algorithmic community detection, and so on. Because of the different rates of initial
Facebook adoption at different institutions, the single point in time represented by
the data might usefully describe different stages in the formation of an online social
network. In order to pursue such ideas further, one needs to start by studying the
networks for their own sake and comparing their structures. This was the goal of the
present paper. In particular, we have identified some of the key differences across
these 100 realizations of online social networks.
Some of our observations confirm conventional wisdom or are intuitively clear,
providing soft verification of our investigation via expected results. For example,
we found that class year is often important, Houses are important at Caltech, and
high school plays a greater role in the social organization of large universities than
it does at smaller institutions (where there are typically fewer pairs of people from
the same high school). Other results are quite fascinating and merit further investigation. In particular, the differences in the community structures of the female-only
and male-only networks would be interesting to investigate in both offline and online
settings. The Facebook data suggests that women are typically more likely to have
friends within their common residence (among the demographic data to which we
have access) but that the characteristics in the communities in the male-only networks exhibit a wider variation. Investigating this thoroughly would require different
data sets and methodologies, especially if one wishes to discern the causes of such
friendships from observed correlations.
The Facebook networks that we study offer imperfect representations of corresponding real-life social networks, which have different properties from online social
networks. It is thus crucial that our results are complemented by studies of the
corresponding real networks in order to quantify the extent of such differences.
Acknowledgements
We thank Adam D'Angelo and Facebook for providing the data used in this
study. We also acknowledge Sandra Gonzalez-Bailon and Erik Kelsic for useful discussions. We thank Christina Frost for developing some of the graph visualization
code that we used (available at http://netwiki.amath.unc.edu/VisComms). ALT
was funded by the NSF through the UNC AGEP (NSF HRD-0450099) and by the
UNC ECHO program. PJM was funded by the NSF (DMS-0645369) and the UNC
ECHO program. MAP acknowledges a research award (#220020177) from the James
S. McDonnell Foundation.
Figure 1: Largest connected component of the student-only subset of the Reed College Facebook
network. (We used a Fruchterman and Reingold [1991] visualization.) Different node shapes and
gray scale indicate different class years (gray circles denote users who did not identify an affiliation), and the edges are randomly shaded for easy viewing. Clusters of nodes with the same
grayscale/shape suggest that common class year has an important effect on the aggregate Facebook
structure.
Figure 2: [Color] (Left) Visualization of community structure of the Reed College Student Facebook
network shown in Figure 1. Node shapes and colors indicate class year (gray dots denote users who
did not identify an affiliation), and the edges are randomly shaded for easy viewing. We place
the communities using a Fruchterman and Reingold [1991] layout and use a Kamada and Kawai
[1989] layout to position the nodes within communities [Traud et al., 2009]. (Right) The same
network layout but with each community depicted as a pie. Larger pies represent communities
with larger numbers of nodes. Darker edges indicate the presence of more connections between the
corresponding communities.
[Figure 3 panels: Full Networks, Student Networks, Female Networks, Male Networks. Vertical axis: Model Coefficients. Horizontal axis: (-)Edges, Year, Residence, High School, Major.]
Figure 3: Box plots (indicating median, quartiles, extent, and outliers of the distribution) of the
logistic regression nodematch coefficients for the 16 smallest institutions in the data for the model
described in the main text. We plot the edges values to present results with greater resolution.
We separately present our results for the Full, Student, Female, and Male networks.
[Figure 4 panels: Full Networks, Student Networks, Female Networks, Male Networks. Vertical axis: Model Coefficients. Horizontal axis: (-)Edges, Triangles, Year, Residence, High School, Major.]
Figure 4: Box plots (indicating median, quartiles, extent, and outliers of the distribution) of the
exponential random graph model coefficients described in the main text for the 16 smallest institutions in the data. We plot the edges values to present results with greater resolution. We
separately present our results for the Full, Student, Female, and Male networks.
[Figure 5 legend (disk size by maximum distance d between partitions): d in [0, 0.1): 63 cases; d in [.1, .2): 25 cases; d in [.2, .3): 2 cases; d in [.3, .4): 3 cases; FSU (53): .4459; d in [.5, .6): 3 cases; Texas (84): .6250; Auburn (71): .7971; Texas (80): .8283. Color bar: Difference From Year, with ticks from .1 to 1.1. Vertex labels: Year, Major, Residence, High School. Labels in the magnified panel: High School (0.18), Major (0.19), Residence (0.36).]
Figure 5: [Color online] (Upper Left) Social organization tetrahedron for the community structures
of the Full component (largest connected component) of the networks for each of the 100 institutions.
Lighter disks indicate an organization that is based more predominantly on class year. See the main
text for a description of this figure. (Lower Right) Magnification near the Year vertex. The legend
illustrates the disk size as a function of the maximum distance d between the 6 different partitions
of the network. Most cases (88 out of 100 institutions) have d < .2.
[Figure 6 legend (disk size by maximum distance d between partitions): d in [0, 0.1): 72 cases; d in [.1, .2): 16 cases; d in [.2, .3): 5 cases; Maine (59): .3126; USF (51): .4084; d in [.6, .7): 4 cases; Texas (80): .7244. Vertex labels: Year, Major, Residence, High School. Labels in the magnified panel: High School (0.19), Major (0.19), Residence (0.58).]
Figure 6: [Color online] (Upper Left) Social organization tetrahedron for the community structures
of the Student component of the networks for each of the 100 institutions. Lighter disks indicate an
organization that is based more predominantly on class year. See the main text for a description
of this figure. (Lower Right) Magnification near the Year vertex. As in Figure 5, the disk sizes
correspond to the maximum distances between partitions.
[Figure 7 legend (disk size by maximum distance d between partitions): d in [0, 0.1): 50 cases; d in [.1, .2): 30 cases; d in [.2, .3): 7 cases; d in [.3, .4): 4 cases; UCF (52): .4418; Oklahoma (97): .5772; d in [.6, .7): 3 cases; d in [.7, .8): 2 cases; UF (21): .8666; Texas (84): .9314. Vertex labels: Year, Major, Residence, High School. Labels in the magnified panel: High School (0.41), Major (0.38), Residence (0.36).]
Figure 7: [Color online] (Upper Left) Social organization tetrahedron for the community structures
of the Female component of the networks for each of the 100 institutions. Lighter disks indicate an
organization that is based more predominantly on class year. See the main text for a description
of this figure. (Lower Right) Magnification near the Year vertex. As in the two previous figures,
the disk sizes indicate the maximum distances between partitions.
[Figure 8 legend (disk size by maximum distance d between partitions): d in [0, 0.1): 31 cases; d in [.1, .2): 35 cases; d in [.2, .3): 14 cases; d in [.3, .4): 9 cases; d in [.4, .5): 4 cases; d in [.5, .6): 3 cases; UIllinois (20): .6542. Vertex labels: Year, Major, Residence, High School. Labels in the magnified panel: High School (0.47), Major (0.34), Residence (0.37).]
Figure 8: [Color online] (Upper Left) Social organization tetrahedron for the community structures
of the Male component of the networks for each of the 100 institutions. Lighter disks indicate an
organization that is based more predominantly on class year. See the main text for a description of
this figure. (Lower Right) Magnification near the Year vertex. As in the three previous figures, disk
size indicates the maximum distance between partitions. We note that there are more d > .2 cases
here than in the previous figures, which illustrates the greater variability in the relative positions
of the z-scores in the different Male networks than was the case for the Full, Student, and Female
networks.
Appendix A. Tables
In Table A.1, we give for each of the 100 institutions the numbers of nodes
and edges for each of the Facebook networks (and subsets thereof) that we have
investigated. In Table A.2, we give the assortativity values for each of the networks.
For each institution, we calculate assortativity values for Gender only for the Full
and Student network subsets. We calculate Major, Residence, Year, and High School
assortativity values for each of the four network subsets (Full, Student, Female, and
Male).
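For readers who want to reproduce an assortativity value for a categorical attribute, the following sketch uses Newman's assortativity coefficient for discrete characteristics, r = (Σ_i e_ii − Σ_i a_i²) / (1 − Σ_i a_i²), on an undirected edge list. It assumes that every node has a label. The treatment of missing attribute values is not restated in this appendix and is not modeled here. The function name is hypothetical.

```python
from collections import defaultdict


def categorical_assortativity(edges, label):
    """Newman's assortativity coefficient for a categorical node attribute."""
    edge_ends = 2 * len(edges)
    same_label_fraction = 0.0          # sum_i e_ii
    end_fraction = defaultdict(float)  # a_i: fraction of edge ends with label i
    for u, v in edges:
        lu, lv = label[u], label[v]
        # Each undirected edge contributes two edge ends.
        end_fraction[lu] += 1 / edge_ends
        end_fraction[lv] += 1 / edge_ends
        if lu == lv:
            same_label_fraction += 2 / edge_ends
    expected = sum(a * a for a in end_fraction.values())
    if expected == 1:
        raise ValueError("all edge ends share one label; coefficient undefined")
    return (same_label_fraction - expected) / (1 - expected)


# Hypothetical toy network: two triangles joined by one edge, labeled by triangle.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_year = {0: 2008, 1: 2008, 2: 2008, 3: 2009, 4: 2009, 5: 2009}
r_year = categorical_assortativity(toy_edges, toy_year)
```

A value near 1 indicates that ties occur mostly between nodes with the same label, a value near 0 indicates mixing consistent with chance, and negative values indicate a preference for ties across labels.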
Recall that we studied regression models for the 16 institutions with the smallest
Facebook networks. In Table A.3, we report the results of a logistic regression model
with edge and nodematch terms. (All coefficients differ from zero with p-values
less than 1 × 10^{-4}.) In Table A.4, we similarly report the results of an ERGM that
supplements the logistic regression model with triangle terms. (Again, all resulting
model coefficients differ from zero with a p-value of less than 1 × 10^{-4}.)
In Table A.5, we report the maximum z-score for each demographic category that
we obtained from the 6 different community detection partitions (described in the
text) of each Facebook network (and their subsets) compared to categorical partitions
based on each of Major, Residence, Year, and High School. We divide the networks
in this table into five sections: (1) networks for which the High School category gives
the highest z-score; (2) networks for which the Residence category gives the highest
z-score; (3) networks for which Year gives the highest z-score and High School gives
the second highest; (4) networks for which Year gives the highest z-score and Major
gives the second highest; and (5) networks for which Year gives the highest z-score
and Residence gives the second highest.
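The sectioning rule described above can be sketched as follows. The z-score values in the example are hypothetical, as are the function names. The sketch takes the maximum z-score over the partitions for each category and then assigns one of the five sections from the two highest-ranked categories.

```python
CATEGORIES = ("Major", "Residence", "Year", "High School")


def max_z_scores(per_partition):
    """Maximum z-score for each category over the community-detection partitions."""
    return {c: max(z[c] for z in per_partition) for c in CATEGORIES}


def table_section(z_scores):
    """Return the section number (1-5) described in the text, or None."""
    ranked = sorted(CATEGORIES, key=lambda c: z_scores[c], reverse=True)
    top, second = ranked[0], ranked[1]
    if top == "High School":
        return 1
    if top == "Residence":
        return 2
    if top == "Year":
        return {"High School": 3, "Major": 4, "Residence": 5}[second]
    return None  # Major highest: not one of the five sections listed


# Hypothetical z-scores from two partitions of one network.
partitions = [
    {"Major": 3.0, "Residence": 35.0, "Year": 12.0, "High School": 1.0},
    {"Major": 2.5, "Residence": 40.0, "Year": 11.0, "High School": 1.5},
]
section = table_section(max_z_scores(partitions))
```

In this example Residence has the largest maximum z-score, so the network falls in section (2).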
Table A.1: Characteristics for each of the networks and subnetworks: institution name, the identifying number given by Facebook, number of nodes in each network and subnetwork, and the
number of edges in each network and subnetwork.
Institution    Number  Nodes (Full, Student, Female, Male)  Edges (Full, Student, Female, Male)
Harvard        1       (15086, 7425, 5865, 6850)            (824595, 404415, 173639, 187742)
Columbia       2       (11706, 8057, 5864, 4209)            (444295, 296971, 135234, 76037)
Stanford       3       (11586, 7183, 4562, 5501)            (568309, 345561, 132904, 135932)
Yale           4       (8561, 5405, 3572, 3891)             (405440, 258886, 85133, 95992)
Cornell        5       (18621, 12843, 8028, 8538)           (790753, 511386, 203303, 171118)
Dartmouth      6       (7677, 4705, 3052, 3417)             (304065, 176665, 68675, 70858)
UPenn          7       (14888, 10106, 6405, 6625)           (686485, 446037, 172277, 150449)
MIT            8       (6402, 4283, 2298, 3359)             (251230, 158838, 58906, 70094)
NYU            9       (21623, 17039, 11723, 7822)          (715673, 542431, 211226, 118898)
BU             10      (19666, 15391, 10914, 7124)          (637509, 486545, 207332, 96593)
Brown          11      (8586, 6038, 3914, 3657)             (384519, 245521, 92083, 74005)
Princeton      12      (6575, 4496, 2701, 3095)             (293307, 190257, 69195, 64679)
Berkeley       13      (22900, 18376, 10848, 9694)          (852419, 630929, 234714, 161454)
Duke           14      (9885, 6681, 4280, 4577)             (506437, 343382, 134610, 114931)
Georgetown     15      (9388, 6365, 4379, 3937)             (425619, 272625, 102398, 82406)
UVA            16      (17178, 12453, 8327, 7182)           (789308, 536625, 243621, 148532)
BC             17      (11498, 8684, 5565, 4999)            (486961, 345943, 126788, 95907)
Tufts          18      (6672, 4892, 3197, 2818)             (249722, 168309, 70154, 47561)
Northeastern   19      (13868, 12133, 6667, 6050)           (381920, 323478, 102143, 71331)
U Illinois     20      (30795, 25385, 13899, 14663)         (1264421, 1000965, 375286, 276147)
UF             21      (35111, 27343, 17945, 14777)         (1465654, 1075152, 483889, 265983)
Wellesley      22      (2970, 2207, 2653, 22)               (94899, 63727, 78002, 120)
Michigan       23      (30106, 23164, 13846, 13473)         (1176489, 848003, 328382, 246890)
MSU            24      (32361, 26786, 16635, 13193)         (1118767, 898385, 328898, 192714)
Northwestern   25      (10537, 7730, 4948, 4591)            (488318, 349543, 145552, 96843)
UCLA           26      (20453, 16571, 10446, 8029)          (747604, 577811, 228164, 128975)
Emory          27      (7449, 5781, 3851, 2926)             (330008, 244456, 111924, 55536)
UNC            28      (18158, 14217, 9616, 6996)           (766796, 570192, 240130, 131304)
Tulane         29      (7740, 5901, 3741, 3337)             (283912, 204485, 92290, 51763)
UChicago       30      (6561, 4414, 2791, 2955)             (208088, 132259, 48371, 46236)
Rice           31      (4083, 2895, 1800, 1973)             (184826, 121648, 43119, 45274)
WashU          32      (7730, 5737, 3658, 3441)             (367526, 262403, 106564, 76825)
UC             33      (16800, 14702, 8533, 6853)           (522141, 431035, 154626, 92905)
UCSD           34      (14936, 13015, 7430, 6187)           (443215, 368225, 129064, 83237)
USC            35      (17440, 13514, 7962, 7858)           (801851, 585374, 232975, 163575)
Caltech        36      (762, 543, 217, 459)                 (16651, 11508, 2349, 6266)
UCSB           37      (14917, 12658, 7851, 5850)           (482215, 389090, 154411, 74414)
Rochester      38      (4561, 3674, 2040, 2190)             (161403, 120921, 42081, 37381)
Bucknell       39      (3824, 3082, 1929, 1632)             (158863, 121538, 53049, 28053)
Williams       40      (2788, 2029, 1315, 1204)             (112985, 76797, 27967, 24866)
Amherst        41      (2235, 1643, 1009, 1012)             (90954, 62252, 22374, 19398)
Swarthmore     42      (1657, 1257, 766, 744)               (61049, 41869, 14968, 13689)
Wesleyan       43      (3591, 2736, 1671, 1487)             (138034, 98758, 35448, 24262)
Oberlin        44      (2920, 2364, 1471, 1139)             (89912, 64203, 24174, 15464)
Middlebury     45      (3069, 2363, 1477, 1293)             (124607, 85848, 32059, 24577)
Hamilton       46      (2312, 1831, 1128, 989)              (96393, 70744, 27068, 19901)
Bowdoin        47      (2250, 1734, 1043, 993)              (84386, 61309, 20931, 17437)
Vanderbilt     48      (8063, 5849, 3798, 3530)             (427829, 304350, 136857, 81976)
Carnegie       49      (6621, 4973, 2399, 3594)             (249959, 172299, 56588, 67771)
UGA            50      (24380, 19381, 13350, 9234)          (1174051, 893735, 436380, 177771)
USF            51      (13367, 12285, 7229, 5062)           (321209, 284813, 93302, 49271)
UCF            52      (14936, 13735, 7796, 6404)           (428987, 373759, 137897, 77479)
FSU            53      (27731, 22949, 15031, 10885)         (1034799, 799849, 347239, 167004)
GWU            54      (12164, 9261, 6235, 4807)            (469511, 347323, 131028, 88642)
Johns Hopkins  55      (5157, 3930, 2099, 2546)             (186572, 136555, 48265, 44544)
Syracuse       56      (13640, 10756, 7043, 5489)           (543975, 403646, 181071, 84908)
Notre Dame     57      (12149, 9035, 6018, 5145)            (541336, 386160, 158766, 118013)
Maryland       58      (20829, 17651, 9541, 9611)           (744832, 595877, 204673, 156394)
Maine          59      (9065, 8031, 4583, 3714)             (243245, 196814, 64780, 45544)
Smith          60      (2970, 2322, 2596, 18)               (97133, 64949, 75830, 24)
UC             61      (13736, 11904, 6394, 5919)           (442169, 350186, 112232, 87103)
Villanova      62      (7755, 6022, 3680, 3260)             (314980, 248763, 100132, 54946)
Virginia       63      (21319, 17509, 8584, 11053)          (698175, 541632, 174033, 162409)
UC             64      (6810, 6253, 3210, 2918)             (155320, 137662, 38981, 31333)
Cal            65      (11243, 10093, 4903, 5581)           (351356, 300118, 88615, 78266)
Mississippi    66      (10519, 8698, 5193, 4535)            (610910, 478908, 204081, 107035)
Michigan       67      (3745, 3241, 921, 2578)              (81901, 63490, 11813, 31325)
UCSC           68      (8979, 8022, 4653, 3586)             (224578, 194833, 66048, 36442)
Indiana        69      (29732, 24401, 14768, 12547)         (1305757, 1029487, 380700, 229919)
Vermont        70      (7322, 6397, 3942, 2675)             (191220, 159707, 63460, 25651)
Auburn         71      (18448, 15699, 9034, 8227)           (973918, 774952, 349929, 154251)
USFCA          72      (2672, 2410, 1681, 763)              (65244, 57006, 26725, 7213)
Wake           73      (5366, 4060, 2525, 2422)             (279186, 207772, 87714, 54047)
Santa          74      (3578, 3011, 1902, 1471)             (151747, 123252, 48015, 25950)
American       75      (6370, 5142, 3641, 2219)             (217654, 168330, 74317, 33897)
Haverford      76      (1446, 1125, 727, 616)               (59589, 46373, 17287, 11671)
Williams       77      (6472, 5068, 3284, 2621)             (266378, 195605, 82338, 50639)
MU             78      (15425, 13377, 8016, 6341)           (649441, 532098, 227362, 114203)
JMU            79      (14070, 12160, 8427, 4762)           (485564, 400307, 182959, 57190)
Texas          80      (31538, 25867, 15571, 13541)         (1219639, 952918, 398776, 219953)
Simmons        81      (1510, 1302, 1399, 0)                (32984, 27885, 30177, 0)
Bingham        82      (10001, 8222, 4590, 4614)            (362892, 270202, 89912, 75552)
Temple         83      (13653, 12404, 7112, 5262)           (360774, 316028, 99928, 55747)
Texas          84      (36364, 30182, 17556, 0)             (1590651, 1209367, 459165, 0)
Vassar         85      (3068, 2353, 1688, 1084)             (119161, 86464, 36200, 17250)
Pepperdine     86      (3440, 2663, 1858, 1345)             (152003, 113352, 52811, 25144)
Wisconsin      87      (23831, 19598, 12059, 9840)          (835946, 649051, 243289, 157022)
Colgate        88      (3482, 2702, 1691, 1471)             (155043, 110916, 45592, 27734)
Rutgers        89      (24568, 20636, 11803, 10662)         (784596, 613950, 209893, 160699)
Howard         90      (4047, 3478, 2531, 1302)             (204850, 172360, 63446, 28308)
UConn          91      (17206, 14746, 8443, 7430)           (604867, 477272, 164460, 114877)
UMass          92      (16502, 14183, 8040, 7148)           (519376, 415863, 138884, 86903)
Baylor         93      (12799, 10287, 7025, 4929)           (679815, 514816, 241420, 109488)
Penn           94      (41536, 35753, 18179, 20013)         (1362220, 1080608, 330980, 306922)
Tennessee      95      (16977, 14303, 8342, 7408)           (770658, 611236, 242648, 138326)
Lehigh         96      (5073, 4144, 2060, 2645)             (198346, 153623, 52837, 43734)
Oklahoma       97      (17420, 14586, 8164, 7870)           (892524, 709698, 284279, 170890)
Reed           98      (962, 803, 496, 348)                 (18812, 14133, 5334, 2984)
Brandeis       99      (3887, 3003, 1981, 1511)             (137561, 98346, 38842, 23790)
Trinity        100     (2613, 2065, 1222, 1139)             (111996, 80946, 29608, 21042)
Table A.2: Assortativity values for each category for each of the
4 networks (Full, Student, Female, and Male) for each of the 100
institutions. We only calculate assortativity by Gender for the Full
and Student networks. (We leave blank spots in the corresponding
table entries for the Male and Female networks.)
Institution No.   Category      Full        Student     Female      Male
Harvard 1         Gender        0.058144    0.049178
                  Major         0.056293    0.046659    0.051852    0.064064
                  Residence     0.14679     0.11951     0.13803     0.15431
                  Year          0.47981     0.60723     0.4871      0.44035
                  High School   0.023132    0.02419     0.024247    0.026473
Columbia 2        Gender        0.087283    0.085847
                  Major         0.045257    0.036112    0.043728    0.06024
                  Residence     0.13271     0.13551     0.1625      0.14249
                  Year          0.51348     0.6002      0.55303     0.47743
                  High School   0.029259    0.030061    0.03254     0.028501
Stanford 3        Gender        0.056583    0.049545
                  Major         0.048574    0.033901    0.042221    0.058083
                  Residence     0.12067     0.10887     0.1499      0.16531
                  Year          0.44456     0.54456     0.43978     0.40632
                  High School   0.021472    0.023851    0.022906    0.022649
Yale 4            Gender        0.036704    0.031144
                  Major         0.041703    0.046659    0.041228    0.044829
                  Residence     0.26727     0.11951     0.27204     0.26567
                  Year          0.48308     0.60723     0.52242     0.43417
                  High School   0.018269    0.02419     0.019705    0.020295
Cornell 5         Gender        0.090725    0.0879
                  Major         0.10367     0.095703    0.10503     0.10218
                  Residence     0.25426     0.23819     0.35124     0.34471
                  Year          0.47504     0.56588     0.47434     0.42828
                  High School   0.033164    0.03543     0.029579    0.037021
Dartmouth 6       Gender        0.10284     0.062793
                  Major         0.03729     0.029281    0.037923    0.039882
                  Residence     0.17773     0.12551     0.24733     0.28336
                  Year          0.49014     0.61052     0.53787     0.41358
                  High School   0.014366    0.015213    0.015285    0.014707
(Continued on next page)

Institution
Full
Student
Female
UPenn 7
Gender
0.090547
0.082236
Major
0.057783
0.052869
0.059828
Residence
0.26299
0.23519
0.34473
Year
0.49567
0.58593
0.51831
High School
0.031771
0.034454
0.032007
MIT 8
Gender
0.12547
0.12123
Major
0.064428
0.050336
0.055101
Residence
0.22879
0.21894
0.2289
Year
0.36162
0.44011
0.38954
High School
0.01376
0.01468
0.010054
NYU 9
Gender
0.0031371
0.0075726
Major
0.1268
0.12332
0.13444
Residence
0.18013
0.18231
0.20598
Year
0.55774
0.63339
0.6041
High School
0.040848
0.042842
0.047228
BU 10
Gender
0.020528
0.0085149
Major
0.075268
0.067428
0.088444
Residence
0.16527
0.16702
0.18178
Year
0.53835
0.60101
0.55882
High School
0.033229
0.035644
0.035456
Brown 11
Gender
0.028344
0.022871
Major
0.037606
0.031711
0.036197
Residence
0.14005
0.11967
0.15728
Year
0.49714
0.5805
0.53351
High School
0.024364
0.026467
0.026364
Princeton 12
Gender
0.065004
0.056889
Major
0.047399
0.041528
0.047898
Residence
0.087218
0.08736
0.094754
Year
0.49472
0.58005
0.5055
High School
0.019708
0.021743
0.018108
Berkeley 13
Gender
0.05132
0.049543
Major
0.067516
0.060843
0.062582
Residence
0.22188
0.22188
0.28492
Year
0.38881
0.44605
0.41394
Male
0.062728
0.34866
0.41714
0.034844
0.067387
0.34262
0.28538
0.016056
0.14657
0.19867
0.50102
0.043782
0.073732
0.16631
0.49861
0.037717
0.049159
0.15337
0.44248
0.025049
0.051068
0.096722
0.47155
0.024282
0.081518
0.24591
0.35734

Institution
Full
Student
Female
High School
0.093854
0.10511
0.09404
Duke 14
Gender
0.10142
0.09467
Major
0.044488
0.038852
0.04269
Residence
0.15759
0.14444
0.20203
Year
0.50438
0.59617
0.51913
High School
0.017841
0.018765
0.0173
Georgetown 15
Gender
0.0145
0.0144
Major
0.043888
0.039255
0.051745
Residence
0.17252
0.16217
0.18518
Year
0.55753
0.63132
0.61052
High School
0.023726
0.025272
0.027982
UVA 16
Gender
0.09671
0.092075
Major
0.052832
0.044482
0.05371
Residence
0.24736
0.22987
0.39102
Year
0.45702
0.54055
0.4704
High School
0.080388
0.090883
0.07725
BC 17
Gender
0.014488
0.018241
Major
0.042043
0.039347
0.056971
Residence
0.13948
0.13585
0.18835
Year
0.65284
0.7109
0.69439
High School
0.03104
0.033057
0.034607
Tufts 18
Gender
0.062789
0.058108
Major
0.041948
0.036881
0.042922
Residence
0.12883
0.1301
0.14631
Year
0.49957
0.5624
0.52421
High School
0.018698
0.019595
0.017244
Northeastern 19
Gender
0.0060778
0.0090892
Major
0.11008
0.11148
0.15408
Residence
0.19165
0.18973
0.24407
Year
0.45301
0.49285
0.45307
High School
0.04064
0.04198
0.039224
UIllinois 20
Gender
0.11274
0.1107
Major
0.056579
0.049491
0.056117
Residence
0.30805
0.29699
0.46106
Male
0.10399
0.047729
0.22614
0.45159
0.01879
0.049058
0.18187
0.50492
0.033592
0.054598
0.2914
0.41752
0.085465
0.038293
0.16084
0.61562
0.049044
0.040925
0.14288
0.46043
0.022329
0.11922
0.20364
0.46986
0.051935
0.063856
0.38529

Institution
Full
Student
Female
Year
0.40105
0.44748
0.40391
High School
0.17099
0.18571
0.15955
UF 21
Gender
0.080715
0.086848
Major
0.048888
0.033266
0.051199
Residence
0.1722
0.16487
0.26309
Year
0.33037
0.3805
0.33527
High School
0.19396
0.21222
0.18286
Wellesley 22
Gender
0.24612
0.34984
Major
0.036528
0.030181
0.036367
Residence
0.12412
0.12657
0.12957
Year
0.42758
0.50529
0.43504
High School
0.011628
0.011878
0.01156
Michigan 23
Gender
0.075279
0.074023
Major
0.066496
0.058583
0.066627
Residence
0.24729
0.23608
0.34287
Year
0.4287
0.4834
0.4614
High School
0.1341
0.14867
0.13738
MSU 24
Gender
0.0062134
0.0026391
Major
0.044483
0.035909
0.051764
Residence
0.20243
0.19487
0.26554
Year
0.36438
0.39615
0.40368
High School
0.21165
0.22566
0.22716
Northwestern 25
Gender
0.10459
0.090528
Major
0.096123
0.092399
0.090561
Residence
0.25352
0.23384
0.34364
Year
0.4148
0.48
0.42714
High School
0.02089
0.022152
0.019596
UCLA 26
Gender
0.030467
0.023894
Major
0.050995
0.046488
0.0519
Residence
0.23154
0.20795
0.31686
Year
0.39128
0.44527
0.41189
High School
0.084865
0.091624
0.088625
Emory 27
Gender
0.092473
0.077871
Major
0.030405
0.026256
0.031341
Male
0.36043
0.19005
0.051894
0.24929
0.31326
0.19965
0
0
0
0
0.074332
0.28886
0.3765
0.14946
0.048454
0.28035
0.32903
0.2291
0.096049
0.32476
0.33697
0.021729
0.056997
0.32916
0.33708
0.089885
0.028936

Institution
Full
Student
Female
Residence
0.22074
0.2051
0.28108
Year
0.4804
0.54765
0.48275
High School
0.021119
0.022094
0.020682
UNC 28
Gender
0.059837
0.054977
Major
0.051147
0.03949
0.055363
Residence
0.20244
0.18547
0.29838
Year
0.39641
0.44001
0.43188
High School
0.13418
0.14774
0.14124
Tulane 29
Gender
0.10083
0.089719
Major
0.052579
0.046683
0.042796
Residence
0.35296
0.31314
0.52709
Year
0.43938
0.49969
0.44421
High School
0.020694
0.022112
0.017922
UChicago 30
Gender
0.045819
0.02327
Major
0.053921
0.042612
0.048741
Residence
0.2979
0.29065
0.34267
Year
0.36493
0.44342
0.38316
High School
0.016078
0.017018
0.016629
Rice 31
Gender
0.030086
0.037858
Major
0.055225
0.053592
0.057052
Residence
0.48463
0.50373
0.48341
Year
0.31044
0.36622
0.34153
High School
0.01626
0.017492
0.01625
WashU 32
Gender
0.093908
0.078041
Major
0.040688
0.036203
0.042983
Residence
0.16649
0.16153
0.16449
Year
0.51858
0.60038
0.49102
High School
0.018106
0.019508
0.01696
UC 33
Gender
0.020505
0.017157
Major
0.039329
0.036344
0.041681
Residence
0.38242
0.36102
0.56732
Year
0.45403
0.50143
0.46414
High School
0.10514
0.11384
0.10954
UCSD 34
Gender
0.030454
0.023472
Male
0.31422
0.42816
0.020963
0.052124
0.2459
0.32994
0.12872
0.059948
0.45224
0.36371
0.029477
0.063178
0.32858
0.32378
0.016953
0.061407
0.50887
0.28657
0.016986
0.038292
0.20308
0.46872
0.019846
0.044466
0.48007
0.42992
0.12326

Institution
Full
Student
Female
Major
0.035369
0.031125
0.036381
Residence
0.34879
0.35474
0.39866
Year
0.46907
0.52443
0.48974
High School
0.093135
0.09945
0.095086
USC 35
Gender
0.086128
0.082815
Major
0.089529
0.085723
0.085458
Residence
0.23664
0.22404
0.36691
Year
0.38035
0.4421
0.37387
High School
0.047729
0.051794
0.043397
Caltech 36
Gender
0.053988
0.063652
Major
0.038181
0.032153
0.037191
Residence
0.44862
0.4261
0.39713
Year
0.26941
0.32452
0.27821
High School
0.0021083
0.0013258
0.0045746
UCSB 37
Gender
0.0032421
0.0082636
Major
0.043069
0.037063
0.042058
Residence
0.28977
0.27745
0.35618
Year
0.45738
0.50761
0.45584
High School
0.062972
0.065716
0.066477
Rochester 38
Gender
0.075802
0.062384
Major
0.073274
0.075311
0.062719
Residence
0.26009
0.25573
0.29881
Year
0.43413
0.50658
0.43889
High School
0.01863
0.020022
0.019074
Bucknell 39
Gender
0.11681
0.089238
Major
0.049732
0.045376
0.042046
Residence
0.19656
0.19363
0.20697
Year
0.52877
0.59216
0.52878
High School
0.011668
0.012164
0.0096712
Williams 40
Gender
0.070636
0.061434
Major
0.034038
0.031456
0.033924
Residence
0.12502
0.1327
0.13728
Year
0.50961
0.59507
0.53198
High School
0.011862
0.012397
0.012915
Amherst 41
Male
0.040088
0.40003
0.43005
0.10091
0.096026
0.31081
0.3378
0.055851
0.03799
0.48219
0.26326
0.0022829
0.054137
0.39801
0.44318
0.070505
0.087977
0.298
0.38851
0.017675
0.0633
0.24857
0.46584
0.011993
0.038403
0.12653
0.45584
0.011736

Institution
Full
Student
Female
Gender
0.059762
0.064803
Major
0.032494
0.027742
0.024605
Residence
0.07939
0.081067
0.093603
Year
0.46484
0.5633
0.5028
High School
0.0096515
0.010387
0.0081754
Swarthmore 42
Gender
0.066274
0.057145
Major
0.042928
0.035928
0.040775
Residence
0.1125
0.10938
0.12065
Year
0.371
0.44168
0.41634
High School
0.0032133
0.0033519
0.0026259
Wesleyan 43
Gender
0.035248
0.029464
Major
0.052135
0.045478
0.046273
Residence
0.12099
0.12786
0.13057
Year
0.46709
0.53116
0.49504
High School
0.01814
0.018384
0.017886
Oberlin 44
Gender
0.020251
0.019512
Major
0.1092
0.10563
0.12493
Residence
0.14628
0.15053
0.17002
Year
0.33547
0.38621
0.36911
High School
0.011915
0.012102
0.012714
Middlebury 45
Gender
0.039529
0.042807
Major
0.038122
0.031508
0.04139
Residence
0.1809
0.18998
0.1993
Year
0.51295
0.58057
0.54598
High School
0.015759
0.016164
0.01597
Hamilton 46
Gender
0.091762
0.08225
Major
0.0328
0.030457
0.03215
Residence
0.11161
0.11338
0.1298
Year
0.45166
0.54088
0.49397
High School
0.010438
0.010566
0.0087614
Bowdoin 47
Gender
0.042728
0.032009
Major
0.031993
0.028842
0.028737
Residence
0.11211
0.11812
0.15247
Year
0.51385
0.58252
0.55795
High School
0.013362
0.01407
0.013032
Male
0.03902
0.077428
0.40988
0.013311
0.054311
0.11301
0.32337
0.001478
0.067817
0.13583
0.42467
0.020264
0.1097
0.13695
0.29632
0.010669
0.03541
0.18188
0.47478
0.018134
0.030141
0.12034
0.37736
0.01371
0.038996
0.10406
0.44429
0.015678

Institution
Full
Student
Female
Vanderbilt 48
Gender
0.15914
0.15295
Major
0.057808
0.050729
0.048375
Residence
0.22425
0.20099
0.3496
Year
0.48666
0.56071
0.48428
High School
0.019962
0.020536
0.015998
Carnegie 49
Gender
0.098085
0.092743
Major
0.15093
0.1519
0.15089
Residence
0.18273
0.17075
0.21612
Year
0.39268
0.4658
0.38505
High School
0.016876
0.018754
0.011115
UGA 50
Gender
0.10448
0.10731
Major
0.034962
0.02641
0.035481
Residence
0.35648
0.33619
0.46957
Year
0.36497
0.40394
0.39145
High School
0.18348
0.19198
0.16532
USF 51
Gender
0.075474
0.078515
Major
0.032853
0.030439
0.042756
Residence
0.19188
0.18663
0.29587
Year
0.27191
0.28617
0.30904
High School
0.14244
0.14936
0.16314
UCF 52
Gender
0.028764
0.021757
Major
0.034455
0.031307
0.035413
Residence
0.19772
0.18732
0.31224
Year
0.31247
0.33465
0.30132
High School
0.14418
0.15322
0.13304
FSU 53
Gender
0.039309
0.03696
Major
0.047367
0.0384
0.051585
Residence
0.25252
0.23802
0.36324
Year
0.31389
0.35125
0.32971
High School
0.14471
0.16133
0.14871
GWU 54
Gender
0.028096
0.011395
Major
0.047802
0.041989
0.050596
Residence
0.16993
0.17085
0.2082
Year
0.51408
0.58898
0.53626
Male
0.069603
0.27768
0.43518
0.026794
0.14232
0.28937
0.38927
0.023089
0.041416
0.45489
0.31287
0.19317
0.037232
0.27454
0.2527
0.16121
0.041218
0.25123
0.30039
0.17058
0.054106
0.41138
0.26617
0.13913
0.048212
0.16795
0.42325

Institution
Full
Student
Female
High School
0.02117
0.022365
0.024065
Johns Hopkins 55
Gender
0.097163
0.083004
Major
0.072487
0.06961
0.061883
Residence
0.11328
0.10975
0.12769
Year
0.43519
0.50643
0.39671
High School
0.013418
0.013674
0.0096795
Syracuse 56
Gender
0.062272
0.049058
Major
0.08486
0.08303
0.08314
Residence
0.29631
0.26958
0.42636
Year
0.46546
0.52061
0.46121
High School
0.027175
0.029138
0.025941
Notre Dame 57
Gender
0.13322
0.13636
Major
0.052909
0.046047
0.061221
Residence
0.25385
0.25489
0.39752
Year
0.54048
0.59216
0.5906
High School
0.029735
0.031858
0.031986
Maryland 58
Gender
0.055805
0.050381
Major
0.059522
0.054895
0.059657
Residence
0.21008
0.19921
0.29692
Year
0.43647
0.47709
0.45585
High School
0.14769
0.15776
0.13898
Maine 59
Gender
0.0048684
0.0044251
Major
0.070152
0.067908
0.086998
Residence
0.19745
0.18997
0.23231
Year
0.26187
0.28512
0.30453
High School
0.20099
0.21493
0.22691
Smith 60
Gender
0.025215
0.017828
Major
0.053371
0.048253
0.054598
Residence
0.30562
0.29189
0.3179
Year
0.32133
0.39366
0.34014
High School
0.0093412
0.0098405
0.0097773
UC 61
Gender
0.0026005
0.0014892
Major
0.066643
0.062017
0.083427
Residence
0.27335
0.26547
0.34182
Male
0.020061
0.078166
0.13263
0.42893
0.01772
0.10169
0.35714
0.40872
0.030692
0.04508
0.50916
0.42931
0.034602
0.069243
0.24207
0.40721
0.1872
0.088474
0.28775
0.2253
0.20329
0
0
0
0
0.059913
0.4018

Institution
Full
Student
Female
Year
0.4115
0.46746
0.43012
High School
0.08413
0.093561
0.090804
Villanova 62
Gender
0.10071
0.096156
Major
0.060202
0.055361
0.071107
Residence
0.16962
0.15806
0.22334
Year
0.61654
0.66335
0.59316
High School
0.02329
0.024489
0.02276
Virginia 63
Gender
0.067095
0.057635
Major
0.060287
0.054583
0.068616
Residence
0.15205
0.14909
0.22211
Year
0.36899
0.41636
0.38663
High School
0.12282
0.13498
0.11567
UC 64
Gender
0.028302
0.037206
Major
0.045181
0.041865
0.048236
Residence
0.26799
0.25085
0.38261
Year
0.37168
0.39993
0.40031
High School
0.072122
0.077137
0.08064
Cal 65
Gender
0.022119
0.016641
Major
0.11423
0.10827
0.12993
Residence
0.29555
0.27884
0.45107
Year
0.37541
0.39621
0.41918
High School
0.070578
0.072898
0.072193
Mississippi 66
Gender
0.11372
0.11882
Major
0.046073
0.036491
0.043854
Residence
0.31288
0.29658
0.50398
Year
0.31098
0.34297
0.35285
High School
0.10962
0.11228
0.08534
Mich 67
Gender
0.057543
0.047547
Major
0.081618
0.07707
0.086468
Residence
0.3366
0.32164
0.45865
Year
0.24825
0.27529
0.2006
High School
0.047638
0.052466
0.040519
UCSC 68
Gender
0.027405
0.031676
Major
0.053702
0.0468
0.05926
Male
0.37918
0.096607
0.06106
0.19118
0.58783
0.038226
0.060859
0.19839
0.34308
0.13209
0.054324
0.42637
0.34304
0.089578
0.12329
0.3968
0.31782
0.080967
0.049216
0.48978
0.2597
0.12691
0.093527
0.47499
0.26151
0.056137
0.059811

Institution
Full
Student
Female
Residence
0.46643
0.47961
0.47039
Year
0.45865
0.49464
0.49587
High School
0.067136
0.070717
0.074411
Indiana 69
Gender
0.015087
0.0044208
Major
0.047884
0.038628
0.056678
Residence
0.37129
0.35624
0.53021
Year
0.41219
0.45152
0.45347
High School
0.16625
0.17704
0.16154
Vermont 70
Gender
0.0036621
0.014154
Major
0.055376
0.050502
0.068597
Residence
0.21007
0.19916
0.26119
Year
0.5063
0.54177
0.5371
High School
0.065318
0.06906
0.066523
Auburn 71
Gender
0.094621
0.096364
Major
0.04538
0.036049
0.041933
Residence
0.38947
0.37038
0.49466
Year
0.27767
0.30329
0.29163
High School
0.15038
0.15753
0.12497
USFCA 72
Gender
0.024033
0.028703
Major
0.081763
0.076698
0.1049
Residence
0.26237
0.26274
0.31567
Year
0.47505
0.51439
0.50801
High School
0.025866
0.027336
0.032416
Wake 73
Gender
0.13086
0.11785
Major
0.03659
0.031566
0.02823
Residence
0.22678
0.21241
0.31356
Year
0.41429
0.47786
0.39824
High School
0.015307
0.016306
0.012378
Santa 74
Gender
0.035313
0.025709
Major
0.048632
0.046148
0.047093
Residence
0.18136
0.18756
0.19668
Year
0.45487
0.50032
0.45808
High School
0.03527
0.036976
0.03723
American 75
Gender
0.027141
0.0094
Male
0.48138
0.4235
0.07741
0.051613
0.55282
0.33748
0.17705
0.051976
0.27715
0.43136
0.075569
0.0671
0.67726
0.24262
0.17876
0.060111
0.27651
0.50134
0.027827
0.039131
0.30473
0.38007
0.017433
0.059648
0.19262
0.42026
0.058071

Institution
Full
Student
Female
Major
0.051212
0.045396
0.050926
Residence
0.27291
0.25268
0.34858
Year
0.41408
0.45927
0.45175
High School
0.010271
0.010732
0.011032
Haverford 76
Gender
0.064272
0.054273
Major
0.032048
0.023701
0.028859
Residence
0.12563
0.12757
0.13299
Year
0.39636
0.43004
0.42873
High School
0.005221
0.0052433
0.0041392
William 77
Gender
0.11732
0.12261
Major
0.043556
0.037253
0.037788
Residence
0.20145
0.20292
0.24859
Year
0.43441
0.49976
0.42536
High School
0.034523
0.038074
0.033857
MU 78
Gender
0.12117
0.10461
Major
0.050475
0.046852
0.055255
Residence
0.32594
0.30644
0.42817
Year
0.50289
0.54881
0.50485
High School
0.085662
0.091223
0.075373
JMU 79
Gender
0.0091065
0.02467
Major
0.059693
0.053025
0.067999
Residence
0.18614
0.18697
0.23382
Year
0.51017
0.55723
0.52289
High School
0.095835
0.10166
0.10032
Texas 80
Gender
0.094481
0.09529
Major
0.066552
0.060959
0.059698
Residence
0.277
0.26055
0.38651
Year
0.33772
0.37278
0.34662
High School
0.16581
0.17687
0.14959
Simmons 81
Gender
0.0079753
0.0016002
Major
0.069744
0.067302
0.070109
Residence
0.18681
0.18622
0.18725
Year
0.53133
0.58624
0.53767
High School
0.014088
0.014332
0.01412
Bingham 82
Male
0.052386
0.32847
0.31716
0.0082203
0.031393
0.12643
0.32713
0.0043886
0.045295
0.29039
0.42774
0.033152
0.046951
0.45396
0.42438
0.10153
0.069393
0.22017
0.43857
0.10848
0.085572
0.37658
0.31836
0.1849
0
0
0
0

Institution
Full
Student
Female
Gender
0.014996
0.012791
Major
0.051405
0.045179
0.065719
Residence
0.17423
0.17577
0.19625
Year
0.35108
0.39424
0.3803
High School
0.06676
0.073947
0.062064
Temple 83
Gender
0.066799
0.074255
Major
0.064454
0.059374
0.076531
Residence
0.22858
0.22793
0.27919
Year
0.45579
0.48935
0.52795
High School
0.084141
0.088428
0.093098
Texas 84
Gender
0.063071
0.057343
Major
0.059176
0.054567
0.06316
Residence
0.3122
0.29062
0.48019
Year
0.30725
0.33335
0.31826
High School
0.14923
0.15895
0.14844
Vassar 85
Gender
0.0020152
0.010138
Major
0.049476
0.039809
0.052645
Residence
0.24338
0.25645
0.25538
Year
0.4668
0.52476
0.5198
High School
0.010575
0.011074
0.011257
Pepperdine 86
Gender
0.059314
0.044794
Major
0.037597
0.027735
0.035587
Residence
0.22932
0.19797
0.35892
Year
0.42753
0.49054
0.43535
High School
0.0082151
0.0083703
0.0081095
Wisconsin 87
Gender
0.046707
0.042587
Major
0.039519
0.033034
0.048021
Residence
0.34247
0.33784
0.45756
Year
0.4046
0.45413
0.46552
High School
0.14583
0.1551
0.14719
Colgate 88
Gender
0.089986
0.059097
Major
0.036249
0.032867
0.032131
Residence
0.17303
0.16399
0.19561
Year
0.54994
0.63084
0.56655
High School
0.012534
0.012983
0.011424
Male
0.053214
0.19757
0.32871
0.084548
0.07327
0.2333
0.40727
0.10703
0.067313
0.38312
0.30186
0.15246
0.058073
0.23329
0.39599
0.011445
0.041034
0.27511
0.37374
0.0073864
0.043372
0.39075
0.31841
0.15286
0.039073
0.25962
0.46105
0.014768

Institution
Full
Student
Female
Rutgers 89
Gender
0.030869
0.026827
Major
0.066469
0.059502
0.076172
Residence
0.23624
0.23484
0.28458
Year
0.39203
0.43844
0.4293
High School
0.1539
0.16603
0.15901
Howard 90
Gender
0.092243
0.095614
Major
0.049663
0.043986
0.063512
Residence
0.1699
0.15873
0.24497
Year
0.42913
0.48277
0.52221
High School
0.016297
0.016431
0.022396
UConn 91
Gender
0.011767
0.01262
Major
0.052949
0.050796
0.076642
Residence
0.12621
0.1287
0.17427
Year
0.40678
0.44814
0.46042
High School
0.14734
0.15911
0.14731
UMass 92
Gender
0.046136
0.05332
Major
0.078534
0.072308
0.10077
Residence
0.22818
0.22156
0.27402
Year
0.4384
0.47642
0.48831
High School
0.11549
0.12382
0.11756
Baylor 93
Gender
0.095714
0.085888
Major
0.050155
0.043635
0.056381
Residence
0.33442
0.29666
0.54796
Year
0.39637
0.44627
0.41824
High School
0.056062
0.0578
0.050649
Penn 94
Gender
0.020922
0.020392
Major
0.054699
0.049229
0.066691
Residence
0.24383
0.23052
0.41069
Year
0.39899
0.43205
0.44012
High School
0.14658
0.15873
0.13416
Tennessee 95
Gender
0.054272
0.048663
Major
0.042589
0.03426
0.043655
Residence
0.22654
0.20872
0.34945
Year
0.29128
0.3139
0.30665
Male
0.069841
0.28142
0.37527
0.17869
0.048478
0.20127
0.39081
0.013863
0.049408
0.14729
0.36441
0.17911
0.095808
0.27678
0.38243
0.14951
0.051998
0.50984
0.35905
0.058399
0.059916
0.35064
0.37095
0.18528
0.05075
0.33083
0.26175
Table A.2, continued from previous page
Institution            Category      Full        Student     Female      Male
Tennessee 95 (cont.)   High School   0.17172     0.18116     0.15056     0.20465
Lehigh 96              Gender        0.06954     0.059833
                       Major         0.049472    0.045209    0.040137    0.056438
                       Residence     0.28169     0.25827     0.43546     0.39254
                       Year          0.49992     0.55849     0.49009     0.44806
                       High School   0.018758    0.019471    0.013934    0.024868
Oklahoma 97            Gender        0.11176     0.1172
                       Major         0.04115     0.032522    0.039645    0.04512
                       Residence     0.40326     0.39682     0.58012     0.5948
                       Year          0.29235     0.31461     0.29748     0.2493
                       High School   0.1583      0.16712     0.12993     0.17418
Reed 98                Gender        0.021903    0.012225
                       Major         0.047233    0.037594    0.058292    0.052558
                       Residence     0.13295     0.13377     0.14915     0.090487
                       Year          0.34748     0.39118     0.42112     0.27715
                       High School   0.0032333   0.0028504   0.0020284   0.0016893
Brandeis 99            Gender        0.019401    0.022782
                       Major         0.041497    0.035748    0.04476     0.043044
                       Residence     0.19401     0.19338     0.22725     0.18293
                       Year          0.52964     0.61682     0.58517     0.47524
                       High School   0.014241    0.014872    0.014966    0.014663
Trinity 100            Gender        0.052012    0.041459
                       Major         0.050441    0.043578    0.045839    0.065184
                       Residence     0.10577     0.10634     0.12206     0.10248
                       Year          0.5079      0.58971     0.55402     0.43875
                       High School   0.01613     0.016656    0.014522    0.021751
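
As an illustration of how an entry of Table A.2 could be computed, we give a minimal Python sketch of an assortativity coefficient for one categorical attribute. It assumes the standard mixing-matrix definition r = (sum_i e_ii - sum_i a_i^2) / (1 - sum_i a_i^2), where e_ij is the fraction of edge ends joining category i to category j and a_i = sum_j e_ij. The function name, the input format, and the choice to skip edges with a missing label are assumptions made for this sketch; it is not the code used to produce the table.

```python
from collections import defaultdict


def categorical_assortativity(edges, labels):
    """Assortativity coefficient r for one categorical node attribute.

    edges  : iterable of (u, v) pairs of an undirected network
    labels : dict mapping node -> category (None marks a missing value)
    """
    # e[(a, b)] counts edge ends that join category a to category b.
    e = defaultdict(float)
    total = 0
    for u, v in edges:
        a, b = labels.get(u), labels.get(v)
        if a is None or b is None:
            continue  # assumption of this sketch: ignore edges with a missing label
        # Count every undirected edge in both directions so that e is symmetric.
        e[(a, b)] += 1
        e[(b, a)] += 1
        total += 2
    if total == 0:
        raise ValueError("no edge has a known label at both ends")

    # Fraction of edge ends that join two nodes of the same category.
    trace = sum(w for (a, b), w in e.items() if a == b) / total

    # a_i: fraction of edge ends attached to category i.
    ends = defaultdict(float)
    for (a, _), w in e.items():
        ends[a] += w / total
    sum_sq = sum(x * x for x in ends.values())
    if sum_sq == 1.0:
        raise ValueError("only one category is present; r is undefined")
    return (trace - sum_sq) / (1.0 - sum_sq)


if __name__ == "__main__":
    # Two triangles with different labels, joined by a single edge.
    toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    toy_labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
    print(categorical_assortativity(toy_edges, toy_labels))
```

Values near 1 indicate that edges mostly join nodes sharing the attribute value, and values near 0 indicate mixing close to what the category sizes alone would give.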
Table A.3: Logistic regression coefficients for a model combining a
density (edge) term and nodematch contributions for the increased
propensity of two nodes with the same categorical value to have
an edge between them. The nodematch contribution is calculated
individually for Year, Residence, High School, and Major. We give the
standard error for each coefficient in parentheses. All coefficients are
statistically significantly different from zero with p-value less than
$1 \times 10^{-4}$. Wellesley (22), Smith (60), and Simmons (81) are
female-only institutions, so we list the values for their Male networks
as NA.
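
Each entry of Table A.3 has the form coefficient(standard error). The following Python sketch shows one way such an entry could be read and compared with the stated significance threshold. It assumes a two-sided Wald test with a normal reference distribution, which is an assumption of this sketch because the caption does not name the test; the helper names `parse_entry` and `wald_test` are hypothetical.

```python
import math
import re

# Matches strings such as "4.4291(0.0047086)": a coefficient followed by
# its standard error in parentheses.
_NUMBER = r"\d*\.?\d+(?:[eE][-+]?\d+)?"
_ENTRY = re.compile(r"^\s*([-+]?" + _NUMBER + r")\s*\(\s*(" + _NUMBER + r")\s*\)\s*$")


def parse_entry(text):
    """Return (coefficient, standard_error), or None for an 'NA' entry."""
    if text.strip() == "NA":
        return None
    match = _ENTRY.match(text)
    if match is None:
        raise ValueError("unrecognized table entry: %r" % text)
    return float(match.group(1)), float(match.group(2))


def wald_test(coefficient, standard_error):
    """Return (z, p) for the null hypothesis that the coefficient is zero."""
    z = coefficient / standard_error
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


if __name__ == "__main__":
    # Major coefficient for the Caltech (36) Female network, as printed in the table.
    z, p = wald_test(*parse_entry("0.44336(0.072743)"))
    print(z, p, p < 1e-4)
```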
Wellesley 22
Full
Student
Female
Male
Caltech 36
Full
Student
Female
Male
Williams 40
Full
Student
Female
Male
Amherst 41
Full
Student
Female
Male
Swarthmore 42
Full
Student
Female
Edge
4.4291(0.0047086)
4.3656(0.0063232)
4.4673(0.0054048)
NA
Edge
3.6903(0.012891)
3.4932(0.017086)
3.045(0.035464)
3.7902(0.022582)
Edge
4.221(0.0045298)
4.1503(0.0062345)
4.2218(0.0097145)
4.0071(0.0095503)
Edge
3.9164(0.0049089)
3.8449(0.0066932)
3.8278(0.010538)
3.8611(0.010649)
Edge
3.635(0.0058633)
3.5712(0.0077451)
3.5944(0.012607)
Year
1.8249(0.0067097)
1.7437(0.0082165)
1.8587(0.0073823)
NA
Year
1.5382(0.018233)
1.4006(0.021534)
1.4288(0.049983)
1.5104(0.029781)
Year
2.1133(0.0063052)
2.0076(0.007883)
2.1577(0.012801)
1.9273(0.013487)
Year
2.0068(0.0069995)
1.9466(0.0086346)
2.1198(0.014293)
1.8312(0.015146)
Year
1.7006(0.0085934)
1.6388(0.010329)
1.7912(0.017337)
Residence
1.2546(0.012002)
1.2512(0.013129)
1.2749(0.012577)
NA
Residence
2.4151(0.018644)
2.3896(0.022905)
2.1684(0.053205)
2.4803(0.029657)
Residence
0.93506(0.011943)
0.95814(0.012448)
1.0063(0.023198)
0.88885(0.024598)
Residence
1.1385(0.017204)
1.0997(0.018196)
1.1944(0.034174)
1.2709(0.033384)
Residence
0.70677(0.014092)
0.70249(0.015382)
0.70752(0.028369)
High School
3.1738(0.041398)
3.2966(0.05217)
3.19(0.044896)
NA
High School
2.3789(0.14869)
2.5169(0.1944)
1.3514(0.43722)
2.887(0.23382)
High School
3.1413(0.036901)
3.3846(0.047399)
3.1839(0.06889)
3.0015(0.07507)
High School
2.7878(0.043122)
3.0146(0.053588)
2.9552(0.091756)
2.6513(0.076298)
High School
2.8177(0.087157)
3.108(0.11187)
3.1246(0.17728)
Major
0.70232(0.013501)
0.62071(0.01745)
0.66471(0.014677)
NA
Major
0.53388(0.02881)
0.47013(0.035936)
0.44336(0.072743)
0.51028(0.044684)
Major
0.63891(0.01226)
0.59197(0.015798)
0.62403(0.02406)
0.63484(0.023232)
Major
0.56196(0.014974)
0.45937(0.019933)
0.44155(0.03109)
0.57283(0.028969)
Major
0.71062(0.015732)
0.62213(0.020307)
0.71791(0.03107)
Male
Oberlin 44
Full
Student
Female
Male
Middlebury 45
Full
Student
Female
Male
Hamilton 46
Full
Student
Female
Male
Bowdoin 47
Full
Student
Female
Male
Smith 60
Full
Student
Female
Male
USFCA 72
Full
Student
Female
Male
Haverford 76
Full
Student
Female
3.4944(0.012307)
Edge
4.3357(0.0045547)
4.3477(0.0057572)
4.382(0.0092964)
4.2048(0.011081)
Edge
4.4107(0.0045357)
4.4519(0.0061279)
4.4496(0.0096593)
4.2906(0.01027)
Edge
3.9231(0.0047892)
3.9278(0.0062417)
3.8481(0.0095168)
3.7214(0.010321)
Edge
4.0994(0.0053132)
4.0369(0.0068883)
4.0971(0.011542)
4.0007(0.011566)
Edge
4.5226(0.0048951)
4.5565(0.0064751)
4.6143(0.0058739)
NA
Edge
4.6268(0.0058034)
4.6201(0.0064663)
4.7401(0.0097375)
4.3531(0.018115)
Edge
3.4051(0.0060883)
3.2009(0.0074664)
3.3442(0.011877)
1.553(0.018316)
Year
1.4322(0.0071089)
1.4406(0.0081899)
1.512(0.013473)
1.3069(0.017141)
Year
2.0753(0.0059187)
2.0652(0.0073273)
2.1589(0.01186)
1.9748(0.013337)
Year
1.8442(0.0067955)
1.8496(0.0080481)
1.9502(0.012943)
1.5511(0.015012)
Year
2.0771(0.0073015)
1.9903(0.0087614)
2.1747(0.014967)
1.8768(0.016042)
Year
1.44(0.0070185)
1.4702(0.008444)
1.5123(0.0079156)
NA
Year
1.6115(0.0083192)
1.6342(0.0088723)
1.6713(0.013044)
1.6391(0.025116)
Year
1.7879(0.0088662)
1.6081(0.0099901)
1.9069(0.016546)
0.73946(0.027981)
Residence
1.0716(0.013797)
1.1044(0.014159)
1.1808(0.024895)
1.0426(0.031315)
Residence
0.76052(0.0074835)
0.82491(0.0082675)
0.82401(0.014034)
0.72369(0.016737)
Residence
0.84034(0.011975)
0.83128(0.012498)
0.9582(0.021613)
0.90341(0.025581)
Residence
0.9616(0.012875)
0.96466(0.013573)
1.1435(0.023846)
1.0069(0.027116)
Residence
3.0814(0.0086746)
3.065(0.010562)
3.1297(0.009554)
NA
Residence
0.90162(0.011441)
0.8675(0.01168)
0.99928(0.016643)
0.96613(0.033156)
Residence
0.45404(0.011702)
0.39078(0.012184)
0.42992(0.022171)
2.4786(0.18762)
High School
3.2257(0.042543)
3.3936(0.050744)
3.3713(0.077285)
3.051(0.10063)
High School
3.3979(0.031385)
3.6831(0.04)
3.6264(0.063049)
3.3119(0.064183)
High School
3.026(0.042724)
3.2264(0.052715)
3.0543(0.085707)
3.1322(0.081785)
High School
3.1465(0.041196)
3.3839(0.050362)
3.1707(0.083632)
3.2853(0.080941)
High School
3.8(0.049519)
4.0877(0.062345)
3.9079(0.054194)
NA
High School
3.1032(0.031585)
3.2174(0.033997)
3.3412(0.044919)
2.7253(0.088089)
High School
2.9137(0.07691)
3.0223(0.092203)
2.9156(0.14531)
0.73991(0.030376)
Major
1.4604(0.010714)
1.3832(0.01303)
1.5071(0.019651)
1.3883(0.025176)
Major
0.79632(0.012067)
0.71206(0.015883)
0.77215(0.023298)
0.76615(0.024516)
Major
0.66129(0.014902)
0.59501(0.018189)
0.65026(0.02958)
0.57013(0.027734)
Major
0.63376(0.015324)
0.58314(0.018703)
0.58128(0.033008)
0.68168(0.028132)
Major
0.93814(0.013074)
0.86763(0.017044)
0.94156(0.01446)
NA
Major
0.66308(0.011574)
0.629(0.012479)
0.83048(0.017243)
0.41582(0.032116)
Major
0.64285(0.019116)
0.51009(0.02355)
0.59125(0.034678)
Male
Simmons 81
Full
Student
Female
Male
Vassar 85
Full
Student
Female
Male
Reed 98
Full
Student
Female
Male
Trinity 100
Full
Student
Female
Male
3.2342(0.013433)
Edge
4.2939(0.0087542)
4.2823(0.010262)
4.266(0.0093971)
NA
Edge
4.4257(0.0045202)
4.3601(0.0058449)
4.582(0.0088969)
4.195(0.011564)
Edge
3.6205(0.0099372)
3.6229(0.012141)
3.6937(0.020247)
3.3999(0.025163)
Edge
4.1159(0.0046382)
4.1318(0.0060873)
4.0764(0.0098017)
4.0567(0.010444)
1.5054(0.020176)
Year
1.9127(0.011746)
1.941(0.013004)
1.8995(0.01233)
NA
Year
1.813(0.0060722)
1.7041(0.0072602)
1.9572(0.011179)
1.5908(0.015975)
Year
1.5(0.015705)
1.4782(0.017725)
1.614(0.028894)
1.3096(0.039777)
Year
2.0271(0.0063319)
2.0143(0.0076607)
2.1975(0.012669)
1.7776(0.014516)
0.44079(0.02505)
Residence
0.71252(0.017391)
0.67853(0.017657)
0.69221(0.017762)
NA
Residence
1.3142(0.007704)
1.4151(0.0083399)
1.3373(0.013584)
1.2077(0.020043)
Residence
1.4399(0.033769)
1.4925(0.034523)
1.5385(0.060679)
1.2103(0.086745)
Residence
0.77702(0.012227)
0.7988(0.01275)
0.79113(0.022218)
0.76933(0.027546)
2.9901(0.16665)
High School
3.1819(0.061849)
3.2452(0.06925)
3.16(0.063873)
NA
High School
3.4271(0.039439)
3.7486(0.049088)
3.7342(0.0691)
3.1518(0.093015)
High School
2.9666(0.14784)
3.0584(0.17396)
2.8801(0.24827)
3.2633(0.36283)
High School
3.1233(0.032458)
3.4011(0.040157)
3.2724(0.067273)
3.0224(0.060545)
0.62004(0.040993)
Major
0.95847(0.019342)
0.93096(0.021004)
0.93484(0.019949)
NA
Major
0.92801(0.012093)
0.79613(0.015441)
0.8989(0.021542)
1.0176(0.028251)
Major
0.78979(0.029502)
0.6773(0.035648)
0.86436(0.049343)
1.0037(0.067066)
Major
0.80619(0.012694)
0.71446(0.016092)
0.89966(0.025649)
0.73818(0.024533)
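
To make the structure of the model behind Table A.3 concrete, we sketch a dyad-level logistic regression with an edge (intercept) term and one nodematch indicator per attribute. This is an illustrative reimplementation under stated assumptions, not the software used for the table: every unordered node pair is treated as an independent observation, attribute values are assumed to be present for all nodes, the fit uses plain Newton iterations, and all function names are hypothetical. Standard errors are not computed.

```python
import math
from collections import defaultdict
from itertools import combinations


def dyad_patterns(nodes, edges, attributes):
    """Group node pairs by nodematch pattern.

    attributes : list of dicts (node -> value), one dict per attribute.
    Returns {pattern: [number_of_dyads, number_of_edges]}, where pattern is a
    tuple of 0/1 flags, one flag per attribute.
    """
    edge_set = {frozenset(e) for e in edges}
    counts = defaultdict(lambda: [0, 0])
    for u, v in combinations(nodes, 2):
        pattern = tuple(int(a[u] == a[v]) for a in attributes)
        counts[pattern][0] += 1
        counts[pattern][1] += frozenset((u, v)) in edge_set
    return counts


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_nodematch_logit(counts, iterations=50):
    """Return [edge, nodematch_1, ..., nodematch_k] maximum-likelihood estimates."""
    k = len(next(iter(counts)))
    beta = [0.0] * (k + 1)
    for _ in range(iterations):
        grad = [0.0] * (k + 1)
        info = [[0.0] * (k + 1) for _ in range(k + 1)]
        for pattern, (n, m) in counts.items():
            x = (1,) + pattern  # leading 1 is the edge (density) term
            eta = sum(b * xi for b, xi in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k + 1):
                grad[i] += (m - n * p) * x[i]
                for j in range(k + 1):
                    info[i][j] += n * p * (1.0 - p) * x[i] * x[j]
        step = solve(info, grad)  # Newton step
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    return beta
```

With a single attribute the fit has a closed form that is useful as a check: the edge coefficient is the log-odds of an edge among non-matching pairs, and the nodematch coefficient is the difference between the log-odds among matching pairs and that value.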
Table A.4: ERGM coefficients for the model (described in the text) that
combines density (edge) and triangle terms with nodematch contributions
representing the increased propensity for two nodes with the same
categorical value to have an edge between them. (The nodematch
contribution is calculated individually for Year, Residence, High School,
and Major.) We give the standard error for each coefficient in
parentheses. All coefficients are statistically significantly different
from zero with p-value less than $1 \times 10^{-4}$. Wellesley (22),
Smith (60), and Simmons (81) are female-only institutions, so we list the
values for their Male networks as NA.
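
The terms of the model in Table A.4 can be illustrated through their change statistics, that is, the amount by which each network statistic increases when a given pair of nodes is connected. The Python sketch below builds these vectors for the edge, triangle, and nodematch terms. Regressing the observed edge indicator on these vectors gives a maximum pseudo-likelihood approximation, which is an assumption of this sketch and may differ from the estimation procedure used for the table; the function name and input format are hypothetical.

```python
from itertools import combinations


def change_statistics(nodes, edges, attributes):
    """Return a list of (y, delta) pairs, one per unordered node pair.

    y     : 1 if the pair is connected in the observed network, else 0
    delta : [edge, triangle, nodematch_1, ..., nodematch_k] change statistics
    attributes : list of dicts (node -> value), one dict per attribute
    """
    neighbors = {u: set() for u in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    rows = []
    for u, v in combinations(nodes, 2):
        y = int(v in neighbors[u])
        # Adding the edge (u, v) closes one triangle per shared neighbor.
        shared = len(neighbors[u] & neighbors[v])
        match = [int(a[u] == a[v]) for a in attributes]
        rows.append((y, [1, shared] + match))
    return rows


if __name__ == "__main__":
    # A triangle 0-1-2 with a pendant node 3 attached to node 2.
    toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
    toy_year = {0: 1, 1: 1, 2: 2, 3: 2}
    for row in change_statistics(range(4), toy_edges, [toy_year]):
        print(row)
```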
Wellesley 22
Full
Student
Female
Male
Caltech 36
Full
Student
Female
Male
Williams 40
Full
Student
Female
Male
Amherst 41
Full
Student
Female
Male
Swarthmore 42
Full
Student
Female
Male
Oberlin 44
Full
Student
Female
Male
Middlebury 45
Full
Student
Female
Edges
5.5166(0.29946)
5.395(0.40299)
5.5395(0.47528)
NA
Edges
4.9776(0.0013776)
4.8284(0.001786)
4.5427(0.058123)
4.9734(0.033352)
Edges
5.3284(0.19432)
5.1347(0.24863)
5.3368(0.013971)
5.2602(0.014726)
Edges
5.0914(0.097866)
4.9092(0.071772)
5.0074(0.01569)
5.106(0.016455)
Edges
4.8312(0.17358)
4.698(0.011101)
4.7717(0.018696)
4.8247(0.019575)
Edges
5.3989(0.088183)
5.3757(0.67096)
5.4066(0.013259)
5.2834(0.016255)
Edges
5.5042(0.67137)
5.3837(0.77243)
5.5484(0.0067934)
Triangles
0.18714(0.00040795)
0.18873(0.00054665)
0.20854(0.00050963)
NA
Triangles
0.17766(1.64e-005)
0.1836(1.89e-005)
0.34325(0.0067684)
0.28127(0.0030727)
Triangles
0.14604(0.00031271)
0.16169(0.00043304)
0.28741(0.0012684)
0.26068(0.001206)
Triangles
0.12103(0.00030109)
0.12695(0.011275)
0.21904(0.0011268)
0.24842(0.0013)
Triangles
0.12423(0.016066)
0.12352(8.09e-005)
0.21474(0.0013858)
0.24087(0.001571)
Triangles
0.19739(0.015958)
0.21399(0.00056897)
0.38758(0.0016842)
0.39322(0.0021443)
Triangles
0.14939(0.07867)
0.15998(0.00039806)
0.29748(0.0011819)
Year
1.0432(1.1574)
0.89815(0.93197)
1.0713(0.70673)
NA
Year
0.99434(0.0014976)
0.89239(0.0017737)
1.0623(0.016542)
1.0405(0.036294)
Year
0.85073(0.0080651)
0.60271(0.47267)
0.89619(0.01105)
1.0773(0.0035795)
Year
0.88125(0.30594)
0.73901(0.18123)
0.98463(0.018087)
0.99941(0.019207)
Year
0.96422(0.26284)
0.85491(0.018375)
1.0746(0.00017492)
0.98215(0.022589)
Year
0.7668(0.42501)
0.68832(1.9864)
0.79641(0.097431)
0.8105(0.021123)
Year
0.98073(0.9038)
0.79027(1.1441)
0.92112(0.026494)
Residence
1.2079(0.014731)
1.262(0.83757)
1.2339(0.53312)
NA
Residence
1.1638(0.0010284)
1.2991(0.00098228)
1.3504(0.035112)
1.1781(0.039183)
Residence
1.1718(0.82606)
1.1717(0.94455)
1.3187(0.0024651)
1.1681(0.0051222)
Residence
0.88007(0.60964)
0.94128(0.9257)
0.98007(1.1477)
0.86681(0.006822)
Residence
0.79737(0.34465)
0.85656(0.034037)
0.93786(0.0024781)
0.6948(0.00082592)
Residence
1.1172(0.9797)
1.1047(2.2993)
1.2488(3.0612)
1.0634(0.027757)
Residence
0.61487(1.2033)
0.64183(0.9573)
0.56416(0.0037157)
High School
3.612(8.5012)
3.7139(8.9764)
3.6698(6.0111)
NA
High School
2.8536(0.087757)
3.0022(0.12434)
1.6776(0.099503)
3.3862(0.25271)
High School
3.5184(4.5405)
3.7627(15.1999)
3.6318(0.13512)
3.4581(0.044289)
High School
3.1539(2.531)
3.3902(3.3786)
3.3534(12.2931)
3.0624(0.12347)
High School
3.2278(11.7489)
3.5384(0.14695)
3.6738(0.38146)
2.8505(0.14226)
High School
3.5716(69.9269)
3.7576(66.9912)
3.7024(30.0234)
3.3725(1.3266)
High School
3.7714(8.0401)
4.0159(11.4279)
4.0895(0.086636)
Major
0.58573(0.01648)
0.44432(1.2761)
0.61932(1.858)
NA
Major
0.64673(0.0021013)
0.59894(0.001494)
0.64556(0.011212)
0.61173(0.17517)
Major
0.39443(0.42159)
0.38077(1.3158)
0.45421(0.0031961)
0.31836(0.0021898)
Major
0.5757(0.89419)
0.53866(0.93067)
0.5091(0.47295)
0.5913(0.0030018)
Major
0.63143(0.75281)
0.52548(0.014076)
0.49991(0.0025816)
0.61786(0.003426)
Major
0.4834(0.79777)
0.66727(3.0137)
0.37583(0.90965)
0.83047(0.018054)
Major
0.51794(5.4033)
0.47678(2.1974)
0.61937(0.020853)
Male
Hamilton 46
Full
Student
Female
Male
Bowdoin 47
Full
Student
Female
Male
Smith 60
Full
Student
Female
Male
USFCA 72
Full
Student
Female
Male
Haverford 76
Full
Student
Female
Male
Simmons 81
Full
Student
Female
Male
Vassar 85
Full
Student
Female
Male
Reed 98
Full
Student
Female
Male
Trinity 100
Full
Student
5.4012(0.014882)
Edges
5.1526(0.15758)
5.0475(0.20987)
5.1103(0.00025686)
5.2164(0.017572)
Edges
5.1231(0.49764)
4.9871(0.17599)
5.1156(0.00099468)
5.2312(0.00035892)
Edges
5.7499(0.46896)
5.6751(0.35105)
5.7559(0.14784)
NA
Edges
5.5339(0.08133)
5.4978(0.4816)
5.6942(0.013524)
5.2138(0.024502)
Edges
4.5864(0.17922)
4.5248(0.011488)
4.5842(0.018029)
4.5974(0.021694)
Edges
5.1447(0.011497)
5.0396(0.012814)
5.0919(0.012148)
NA
Edges
5.4365(0.042913)
5.3447(0.66641)
5.4876(0.01176)
5.2473(0.016541)
Edges
4.7342(0.014847)
4.6732(0.017455)
4.7287(0.028907)
4.4269(0.036284)
Edges
5.2594(0.50302)
5.144(0.82673)
0.27717(3.31e-006)
Triangles
0.13229(0.010524)
0.13542(0.00038928)
0.22978(0.0010554)
0.25379(0.0012891)
Triangles
0.12537(0.0003663)
0.13258(0.0004469)
0.2751(0.00149)
0.28383(0.0015608)
Triangles
0.23032(0.040735)
0.25538(0.00069)
0.28145(0.0054268)
NA
Triangles
0.21369(0.019896)
0.218(0.00060193)
0.31715(0.0013151)
0.4218(0.0034552)
Triangles
0.099312(0.00033645)
0.097998(0.00038937)
0.18037(0.0011641)
0.20668(0.0015226)
Triangles
0.2364(0.00096724)
0.23882(0.0001411)
0.23884(0.00075486)
NA
Triangles
0.16286(0.00033258)
0.1653(0.0004181)
0.31638(9.93e-005)
0.31715(0.0017542)
Triangles
0.19271(0.0010667)
0.20779(0.0013555)
0.34763(0.0037349)
0.38754(0.0057624)
Triangles
0.13124(0.067744)
0.13839(0.030441)
1.165(0.01032)
Year
0.89247(0.38942)
0.78719(0.39866)
0.91713(0.1191)
1.1662(0.019208)
Year
0.84602(0.44413)
0.73847(0.010887)
0.89048(0.0045874)
1.1377(0.00064391)
Year
1.0244(0.71782)
0.87631(0.61986)
1.0443(0.22894)
NA
Year
0.81903(0.31274)
0.77135(0.20243)
1.0311(0.016106)
0.71035(0.032585)
Year
0.88251(0.36604)
0.8307(0.012088)
0.83102(0.020477)
1.0335(0.024614)
Year
0.62361(0.015007)
0.52922(0.01664)
0.61268(0.0044493)
NA
Year
0.89224(0.36023)
0.78236(1.1368)
0.85905(0.013732)
1.0972(0.019096)
Year
0.89641(0.019018)
0.81768(0.021382)
0.97335(0.034508)
0.81151(0.047624)
Year
0.88149(1.0778)
0.66726(0.81806)
0.70103(0.023808)
Residence
0.60097(0.9291)
0.61503(0.85391)
0.6133(0.99652)
0.81602(0.0043879)
Residence
0.86002(1.7887)
0.9108(0.85058)
0.96132(0.015526)
0.81674(0.0047172)
Residence
1.318(1.7496)
1.4951(1.5255)
1.2561(0.53313)
NA
Residence
0.75232(0.48257)
0.75418(0.5606)
0.85946(0.031407)
0.74138(0.0012662)
Residence
0.4303(0.49797)
0.50822(0.027194)
0.45689(0.026914)
0.47377(0.00061681)
Residence
0.04641(0.022947)
0.017321(0.093667)
0.0096188(0.026243)
NA
Residence
1.0009(1.5575)
1.0869(0.73536)
1.0423(0.0013954)
1.0899(0.024127)
Residence
1.5839(0.41431)
1.586(0.040154)
1.7788(0.0037734)
1.2315(0.005725)
Residence
0.81391(1.9123)
0.86642(1.6612)
3.7024(0.051105)
High School
3.4533(2.764)
3.639(5.3808)
3.5653(6.2228)
3.6524(0.13302)
High School
3.5147(14.3023)
3.7404(2.129)
3.6624(0.068102)
3.7484(0.056753)
High School
4.3908(27.5879)
4.6639(26.1729)
4.466(6.4417)
NA
High School
3.3646(3.7542)
3.4592(2.2951)
3.5978(0.072184)
3.0314(0.020408)
High School
3.4762(5.3548)
3.6771(0.19927)
3.5413(0.15545)
3.5159(0.26688)
High School
3.6491(0.066845)
3.6926(0.16704)
3.6168(0.099064)
NA
High School
3.8325(13.4455)
4.1626(2.7852)
4.0763(0.073194)
3.5254(0.05049)
High School
3.4991(10.8338)
3.5945(0.18597)
3.3521(0.12521)
3.9308(0.15859)
High School
3.5938(3.3991)
3.7899(23.8254)
0.24567(0.0032737)
Major
0.57065(0.66635)
0.52289(1.3097)
0.71784(0.30967)
0.26404(0.0025362)
Major
0.53053(1.4671)
0.48614(0.94847)
0.62781(0.0045592)
0.405(0.0021074)
Major
0.95945(1.1995)
0.96959(1.0772)
0.96695(0.7678)
NA
Major
0.65908(0.40194)
0.61892(0.66365)
0.78521(0.0056399)
0.41939(0.00063143)
Major
0.68087(0.86129)
0.54286(0.016676)
0.60269(0.04109)
0.66119(0.0035633)
Major
0.95822(0.006254)
0.88137(0.012457)
0.94789(0.0039044)
NA
Major
0.76399(0.83096)
0.66064(1.1939)
0.77276(0.033551)
0.71203(0.0019611)
Major
0.94969(0.24672)
0.80753(0.04143)
0.8996(0.0035483)
0.96303(0.0026607)
Major
0.62169(1.5141)
0.57219(1.6865)
Female
Male
5.2108(0.014321)
5.3106(4.73e-006)
0.21239(0.00096177)
0.27532(0.0013436)
1.1086(0.016232)
1.1575(0.26773)
0.78131(0.0033131)
0.92286(7.0628)
3.8014(0.056335)
3.5167(22.874)
0.75326(0.0233)
0.53424(0.6823)
Table A.5: Maximum z-scores of the Rand coefficient obtained for
the 6 employed community detection algorithms (see the discussion
in the text) for each categorical variable in every network (Full,
Student, Female, and Male) for each of the 100 institutions. We
italicize z-scores that are less than 2. We divide the table into
five parts: (1) networks in which High School yields the highest
z-score, (2) networks in which Residence yields the highest z-score,
(3) networks in which Year yields the highest z-score and High
School yields the second highest z-score, (4) networks in which
Year yields the highest z-score and Major yields the second highest
z-score, and (5) networks in which Year yields the highest z-score
and Residence yields the second highest z-score.
Institution and Network    Major      Residence   Graduation Year   High School
High School:
Auburn 71 Male             37.6893    14.8497     42.7784           70.6776
Tennessee 95 Male          15.6741    20.4034     42.8019           58.5508
Residence:
Rice 31 Full               15.3123    1404.4502   196.14            4.3858
Caltech 36 Full            4.0649     222.9566    8.5967            4.9078
UCSC 68 Full               36.7889    945.8502    481.3222          10.3043
Rice 31 Student            19.2677    1523.4423   137.8101          1.6253
Caltech 36 Student         3.0762     202.1448    13.7929           6.303
UCSC 68 Student            28.8843    1240.1219   584.0597          5.7107
Rice 31 Female             4.912      882.3474    45.6332           2.9773
Caltech 36 Female          1.4788     74.1988     7.6852            1.2637
UCSC 68 Female             24.6517    558.2736    315.4706          6.338
Auburn 71 Female           11.011     62.316      14.4551           33.9673
Rice 31 Male               13.9046    703.6264    30.5491           2.3049
Caltech 36 Male            3.3216     168.4986    7.7892            1.0672
Notre Dame 57 Male         15.0014    881.5186    301.8338          8.4277
UCSC 68 Male               23.497     421.0489    185.544           6.0851
Year then High School:
Harvard 1 Full             32.9283    46.2515     707.9697          47.4424
USF 51 Full                13.2168    17.0962     178.794           19.1333
Tennessee 95 Student       23.9067    78.8738     486.6029          86.6653
USF 51 Female              6.484      10.1971     105.5474          24.1933
UCF 52 Female              13.0291    11.0127     349.4409          24.3501
MSU 24 Male                15.5217    11.4586     105.1908          31.4381
USF 51 Male                9.9473     17.6237     133.8587          29.9162
UCF 52 Male                6.0026     22.8679     135.8974          30.2374
Maine 59 Male              14.6714    13.517      31.8319           24.5193
Smith 60 Male              14.6714    13.517      31.8319           24.5193
Year then Major:
Northwestern 25 Full       63.7255    61.9673     952.1696          17.7493
Oberlin 44 Full            98.1388    66.8654     453.1168          5.7939
Carnegie 49 Full           51.3599    25.8138     731.3975          7.8962
Johns Hopkins 55 Full      47.1995    42.8154     691.5817          4.8342
Maine 59 Full              19.7247    19.4293     294.1129          17.8845
MU 78 Full                 109.3402   83.3228     2156.4469         11.864
Texas 84 Full              75.9868    66.8923     942.1053          18.3328
Pepperdine 86 Full         19.7587    16.1209     514.1583          2.6847
Rutgers 89 Full            65.5981    58.1302     1006.3321         15.0646
Yale 4 Student             43.7072    42.9174     1749.1995         12.443
Wellesley 22 Student       32.9359    18.2914     604.0402          11.3959
Northwestern 25 Student    56.7216    43.7364     761.809           11.8733
Oberlin 44 Student         121.7061   97.7768     422.8126          5.4097
Middlebury 45 Student      35.7311    25.1887     1021.1259         10.1694
Carnegie 49 Student        58.7709    28.1146     678.5146          4.2095
Johns Hopkins 55 Student   51.59      46.8618     977.8851          1.0234
Maine 59 Student           18.1278    6.8457      198.4138          12.0078
Texas 84 Student           59.7362    40.1543     627.1121          11.3447
Rutgers 89 Student         54.9329    46.3295     854.1174          6.1141
Harvard 1 Female           49.824     46.035      594.5535          36.2596
Stanford 3 Female          49.2033    29.9579     402.8892          14.1069
Yale 4 Female              44.0142    27.4764     919.2215          6.8457
Berkeley 13 Female         46.1685    30.443      886.1622          5.0467
Duke 14 Female             54.4993    54.2103     817.5276          8.8687
UVA 16 Female              66.7572    52.4103     657.0101          9.9162
Northwestern 25 Female     32.1659    29.69       434.8702          2.8402
UChicago 30 Female         33.2089    23.6909     438.7235          4.7979
Amherst 41 Female          27.3288    21.743      294.6258          3.507
Oberlin 44 Female          100.2134   48.2684     260.1137          3.455
Carnegie 49 Female         47.4929    40.3651     407.0773          4.2153
Johns Hopkins 55 Female    28.47      20.2697     333.9934          3.4239
Maryland 58 Female         44.1338    41.5469     822.4117          17.8818
Maine 59 Female            38.3416    20.8041     200.7833          31.3413
UC 61 Female               28.5215    14.385      622.9345          5.8206
UC 64 Female               14.4139    11.9664     149.0986          10.9256
JMU 79 Female              36.1024    36          796.4756          4.201
Bingham 82 Female          22.3648    21.321      284.0935          15.0616
Temple 83 Female           46.1286    31.3653     757.509           9.9322
Rutgers 89 Female          53.8238    16.1881     488.0149          14.8522
UConn 91 Female            34.9672    25.4951     723.1034          10.5517
Penn 94 Female             39.9864    26.9059     936.2613          27.0165
Stanford 3 Male            54.1737    27.4921     371.1569          16.5656
Yale 4 Male                24.8967    23.3821     370.9076          8.67
NYU 9 Male                 94.4143    54.1045     1021.7862         6.5701
UIllinios 20 Male          46.3144    31.5598     336.3754          28.4218
UF 21 Male                 23.1165    22.9452     507.6766          20.272
Wellesley 22 Male          23.1165    22.9452     507.6766          20.272
Northwestern 25 Male       37.7307    31.9062     345.6295          6.6782
UC 33 Male                 22.1665    20.0385     567.4739          7.6147
Oberlin 44 Male            72.4224    27.9138     101.0428          3.2915
Carnegie 49 Male           36.6684    25.7608     387.3158          5.4715
FSU 53 Male                29.3943    10.5416     287.6117          12.0415
Johns Hopkins 55 Male      58.2022    34.3057     407.8593          3.0992
Syracuse 56 Male           25.7511    11.4028     336.2923          3.6838
Virginia 63 Male           26.3338    6.2308      407.6451          10.1691
MU 78 Male                 29.748     19.2228     399.0186          7.6544
JMU 79 Male                28.1338    18.0068     384.1443          2.6807
Texas 80 Male              43.5806    34.4119     304.2728          22.4089
Simmons 81 Male            43.5806    34.4119     304.2728          22.4089
Bingham 82 Male            15.1071    13.4701     256.2084          9.2008
Temple 83 Male             28.3467    19.8583     384.461           5.1958
Texas 84 Male              66.4811    18.9098     411.8199          11.5949
Pepperdine 86 Male         16.5056    14.9359     252.9983          0.16511
Rutgers 89 Male            48.8296    15.5107     380.7912          12.6565
UMass 92 Male              33.2549    30.8727     355.3603          3.0583
Penn 94 Male               38.4917    5.8735      658.7507          26.3601
Year then Residence:
Columbia 2 Full            36.5153    54.924      1374.3083         7.2293
Stanford 3 Full            14.7863    41.9154     664.1124          29.9686
Yale 4 Full                20.5117    25.1205     1099.4233         11.8998
Cornell 5 Full             25.2506    89.5878     1593.1804         21.2781
Dartmouth 6 Full           11.1027    50.1377     1020.2318         28.9704
UPenn 7 Full               20.3689    95.7981     1923.527          44.2896
MIT 8 Full                 21.171     50.5652     729.7555          17.3298
NYU 9 Full                 56.5096    174.4169    2330.6687         26.0983
BU 10 Full                 38.1522    158.5765    2002.8006         33.7839
Brown 11 Full              43.7958    93.3695     1528.6473         20.0139
Princeton 12 Full          46.2822    87.7417     1378.4171         25.3283
Berkeley 13 Full           36.1926    69.1185     1363.2005         17.9455
Duke 14 Full               11.5831    57.9147     976.1039          17.3751
Georgetown 15 Full         22.4966    167.8567    2653.5486         24.5735
UVA 16 Full                12.2439    62.6574     819.8208          30.7617
BC 17 Full                 38.8203    122.7586    2681.1323         26.6722
Tufts 18 Full              42.0213    145.7541    1358.3353         12.8595
Northeastern 19 Full       13.8347    88.16       1681.8672         5.3753
UIllinios 20 Full          40.0749    86.7386     1199.8824         28.2528
UF 21 Full                 26.8015    64.8004     724.6443          22.0401
Wellesley 22 Full          29.1131    54.6635     742.4539          6.2652
Michigan 23 Full           43.3415    82.7687     1649.3178         16.4774
MSU 24 Full                26.7874    87.0085     1009.6651         13.2794
UCLA 26 Full               40.0622    74.6241     1468.1327         10.9341
Emory 27 Full              19.8861    70.0149     900.4854          14.1023
UNC 28 Full                23.1838    121.3854    776.3694          17.0108
Tulane 29 Full             16.1778    56.0665     671.0963          14.6558
UChicago 30 Full           24.7146    24.9071     662.2972          8.6329
WashU 32 Full              47.243     136.8021    1623.2865         15.1781
UC 33 Full                 29.034     67.3425     1357.1099         15.5454
UCSD 34 Full               97.574     152.9926    2473.4545         24.8996
USC 35 Full                29.9297    78.0274     453.1745          28.5275
UCSB 37 Full               22.0941    85.3381     1198.933          16.7272
Rochester 38 Full          75.4887    108.3232    552.4707          5.2523
Bucknell 39 Full           43.169     157.6246    1028.8064         6.7421
Williams 40 Full           32.068     60.4559     812.548           7.8417
Amherst 41 Full            10.2116    23.6193     463.9533          4.9872
Swarthmore 42 Full         22.6533    56.4236     409.7389          15.926
Wesleyan 43 Full           29.8018    66.9798     675.9864          3.319
Middlebury 45 Full         42.3292    113.8323    1101.4348         12.7389
Hamilton 46 Full           20.829     74.4081     560.5977          4.3781
Bowdoin 47 Full            24.8872    45.9771     561.9283          6.5484
Vanderbilt 48 Full         24.2425    37.5841     794.3818          1.9616
UGA 50 Full                12.0682    110.7201    632.719           18.017
UCF 52 Full                11.5819    31.3943     561.5652          19.6318
FSU 53 Full                58.5762    67.7109     1076.3823         16.1353
GWU 54 Full                19.4831    137.0233    1452.65           15.8778
Syracuse 56 Full           21.2014    79.1388     994.8975          4.9392
Notre Dame 57 Full         35.8248    88.0761     1881.5372         10.6129
Maryland 58 Full           44.0964    75.7046     1602.6115         24.4243
Smith 60 Full              16.4571    95.3916     153.5555          6.2027
UC 61 Full                 18.8061    25.2746     1013.0898         21.766
Villanova 62 Full          33.6844    170.8182    1887.6701         14.0708
Virginia 63 Full           17.2208    35.5328     1071.0521         14.2972
UC 64 Full                 18.8043    31.4198     354.0712          5.8402
Cal 65 Full                22.3719    85.1005     373.0507          7.2362
Mississippi 66 Full        21.4284    108.929     558.1005          31.6267
Mich 67 Full               9.8872     46.015      187.978           2.542
Indiana 69 Full            39.078     114.936     1044.2262         26.1569
Vermont 70 Full            20.8336    99.1303     1558.6193         3.754
Auburn 71 Full             10.5381    59.1251     420.8563          21.1789
USFCA 72 Full              6.336      62.5181     570.4495          1.6058
Wake 73 Full               26.1448    56.1613     694.32            3.2152
Santa 74 Full              33.3483    60.5256     718.0548          11.5192
American 75 Full           33.0985    67.6809     883.8311          2.9285
Haverford 76 Full          24.1638    106.6988    504.0081          5.6861
William 77 Full            14.5855    44.6274     566.6482          10.0082
JMU 79 Full                32.9706    164.9227    2124.0334         10.8734
Texas 80 Full              43.334     91.3065     1167.3767         33.7045
Simmons 81 Full            6.6006     97.1966     562.7712          1.0627
Bingham 82 Full            13.6329    41.6889     455.3084          6.2484
Temple 83 Full             27.821     56.434      824.6862          2.0241
Vassar 85 Full             25.4652    112.3232    632.4143          8.9735
Wisconsin 87 Full          10.7722    105.468     805.9753          14.576
Colgate 88 Full            51.7552    151.8996    974.1691          8.087
Howard 90 Full             7.9386     80.5889     658.1969          0.90495
UConn 91 Full              14.4766    53.9008     1578.398          13.8896
UMass 92 Full              20.5369    102.4828    1214.4527         9.5124
Baylor 93 Full             30.583     91.7255     1033.2767         9.884
Penn 94 Full               20.3234    125.4115    999.8411          17.3355
Tennessee 95 Full          7.4046     64.8443     322.5114          23.1512
Lehigh 96 Full             34.0617    90.1525     917.7177          13.0267
Oklahoma 97 Full           9.3109     73.593      230.908           21.6188
Reed 98 Full               7.6974     43.7343     228.6649          1.8708
Brandeis 99 Full           38.2251    125.1044    868.2479          3.4923
Trinity 100 Full           37.721     79.8919     685.0894          9.2816
Harvard 1 Student          99.3188    213.099     3154.0767         40.0407
Columbia 2 Student         32.8855    91.0248     1320.438          17.5707
Stanford 3 Student         26.4357    28.338      1181.4398         18.8581
Cornell 5 Student          37.3974    85.9041     1152.2628         20.4126
Dartmouth 6 Student        9.4328     44.9326     1342.6896         25.7134
UPenn 7 Student            31.488     91.4797     2173.4481         33.1935
MIT 8 Student              17.3812    91.6139     692.0884          2.8024
NYU 9 Student              44.8664    120.8477    2673.6414         14.1117
BU 10 Student              33.4053    258.2824    2387.7149         33.2104
Brown 11 Student           44.2575    133.344     1631.0673         17.9475
Princeton 12 Student       84.0965    158.3249    2312.1908         57.5358
Berkeley 13 Student        39.664     59.8217     1773.2792         16.8765
Duke 14 Student            42.1562    109.9259    1625.8706         18.918
Georgetown 15 Student      58.095     713.9046    3190.4117         87.9151
UVA 16 Student             33.6856    54.3253     1303.4887         24.9138
BC 17 Student              37.7245    137.9774    2075.5561         11.8712
Tufts 18 Student           44.0604    232.5307    1403.1154         21.5303
Northeastern 19 Student    16.3267    78.4349     1471.5152         4.8722
UIllinios 20 Student       29.5538    80.4961     985.0426          20.804
UF 21 Student              30.1609    33.8981     787.2977          27.9011
Michigan 23 Student        24.607     79.2378     1156.9499         10.7809
MSU 24 Student             22.1766    99.8345     1173.9383         15.1437
UCLA 26 Student            42.8876    97.7212     1466.9496         14.6514
Emory 27 Student           29.1968    95.3337     814.384           14.2458
UNC 28 Student             41.4134    117.5846    1091.3867         14.5701
Tulane 29 Student          43.9375    105.2506    998.7201          11.7835
UChicago 30 Student        23.6328    29.7628     636.2057          7.7714
WashU 32 Student           32.9746    150.03      1274.5297         2.2012
UC 33 Student              24.5734    89.2671     1178.5197         20.1586
UCSD 34 Student            53.5176    109.0151    1624.1133         19.8743
USC 35 Student             20.8078    80.71       721.8665          24.1509
UCSB 37 Student            42.9674    60.5252     1297.0472         11.7636
Rochester 38 Student       86.4081    196.7464    834.5066          2.9423
Bucknell 39 Student        40.6673    135.4627    1047.8529         0.46698
Williams 40 Student        51.8306    148.8178    1132.3255         10.0502
Amherst 41 Student         32.3323    41.0426     685.2003          2.5291
Swarthmore 42 Student      17.0493    53.0758     493.4119          32.2286
Wesleyan 43 Student        19.0223    52.8176     452.412           3.6865
Hamilton 46 Student        51.5924    118.0699    748.7682          4.2397
Bowdoin 47 Student         72.3407    92.3423     981.5874          6.3982
Vanderbilt 48 Student      45.9202    159.4014    1359.1648         9.1869
UGA 50 Student             18.1844    99.7849     883.623           18.5072
USF 51 Student             13.6059    21.033      186.1287          19.2879
UCF 52 Student             11.4236    32.0796     497.4315          20.1894
FSU 53 Student             48.4726    78.8696     1223.7047         16.989
GWU 54 Student             21.6065    201.6003    1669.3019         10.3499
Syracuse 56 Student        17.5188    78.1943     786.3957          1.956
Notre Dame 57 Student      52.6066    125.0482    2181.1603         0.40435
Maryland 58 Student        44.0943    46.5097     1222.2689         16.3225
Smith 60 Student           18.3255    129.2886    310.6547          5.8634
UC 61 Student              22.5229    30.6567     647.5664          16.7394
Villanova 62 Student       36.2301    171.1948    1588.9672         8.0883
Virginia 63 Student        21.2617    36.2811     920.744           4.1314
UC 64 Student              14.0342    21.1245     257.877           5.1582
Cal 65 Student             17.0294    89.6235     390.6672          7.768
Mississippi 66 Student     21.1674    110.3752    568.1393          25.5351
Mich 67 Student            14.0186    55.0223     209.6002          8.9041
Indiana 69 Student         28.6287    92.3373     987.3405          21.7507
Vermont 70 Student         28.7064    150.2745    1657.4994         4.3133
Auburn 71 Student          2.6014     54.069      312.2621          30.7547
USFCA 72 Student           6.8763     81.5858     469.0973          2.6253
Wake 73 Student            32.3749    97.8744     728.8018          3.8915
Santa 74 Student           33.9205    106.301     841.1835          8.4787
American 75 Student        28.9713    141.3917    980.732           2.4023
Haverford 76 Student       25.6894    39.6899     450.7222          4.6039
William 77 Student         41.0331    68.8142     673.3207          7.29
MU 78 Student              95.4327    115.7901    1800.4576         20.8494
JMU 79 Student             36.3872    77.4386     1655.2735         1.3604
Texas 80 Student           36.2796    89.1015     853.338           23.3638
Simmons 81 Student         0.70079    42.5431     297.8748          2.2797
Bingham 82 Student         15.4974    33.7727     439.2416          16.5861
Temple 83 Student          23.7069    55.8141     831.2747          2.4749
Vassar 85 Student          57.8415    81.6655     829.158           0.48056
Pepperdine 86 Student      17.662     37.6874     803.1151          0.25577
Wisconsin 87 Student       39.5571    94.3383     1139.9521         16.7557
Colgate 88 Student         49.3259    121.6428    1033.6546         5.9009
Howard 90 Student          5.8215     97.8822     840.9143          2.9387
UConn 91 Student           17.0707    24.9723     1095.6517         10.4414
UMass 92 Student           10.0075    79.5962     701.4362          7.3308
Baylor 93 Student          29.866     82.9446     993.8766          8.0472
Penn 94 Student            19.6921    28.3107     975.0955          12.6765
Lehigh 96 Student          27.4748    78.102      652.2015          5.6521
Oklahoma 97 Student        7.1162     78.6706     315.7085          23.247
Reed 98 Student            6.9233     32.6469     223.167           5.16
Brandeis 99 Student        38.2334    298.1184    1487.4693         3.0373
Trinity 100 Student        88.3279    140.3587    847.1625          10.661
Columbia 2 Female          59.911     69.1459     1362.6955         7.1149
Cornell 5 Female           22.1182    72.6081     429.2737          11.7625
Dartmouth 6 Female         34.1195    35.1162     681.7989          9.5636
UPenn 7 Female             31.5802    44.3256     606.0889          14.928
MIT 8 Female               23.5711    50.6999     419.3002          2.8138
NYU 9 Female               45.9961    120.1278    1598.5805         8.6466
BU 10 Female               53.9423    105.1393    1140.3863         9.9511
Brown 11 Female            58.7272    92.6819     973.3376          15.2166
Princeton 12 Female        52.1966    67.0946     734.7138          13.3183
Georgetown 15 Female       31.9053    155.4224    1567.5843         25.4317
BC 17 Female               49.5108    102.7311    1754.3548         17.3038
Tufts 18 Female            58.8867    101.6042    981.3506          10.7485
Northeastern 19 Female     36.7836    60.9091     857.2153          7.029
UIllinios 20 Female        29.9189    44.1633     492.9557          27.0522
UF 21 Female               15.57      59.8834     539.1188          21.7408
Wellesley 22 Female        24.824     43.6688     481.3682          4.7456
Michigan 23 Female         39.4274    77.0744     808.026           23.0555
MSU 24 Female              31.7871    67.8264     735.6949          37.9331
UCLA 26 Female             41.9293    46.6748     849.1839          13.9386
Emory 27 Female            35.5263    62.3702     577.1687          9.2169
UNC 28 Female              27.9996    69.4798     581.1929          11.9534
Tulane 29 Female           19.6643    63.9106     376.8734          5.8434
WashU 32 Female            31.7261    82.981      667.6735          9.4384
UC 33 Female               41.052     62.2458     818.7266          20.077
UCSD 34 Female             52.0376    120.4275    1105.0261         8.6492
USC 35 Female              8.8438     56.4958     319.0773          12.0622
UCSB 37 Female             21.7453    32.5224     638.1004          6.7378
Rochester 38 Female        46.5744    70.0176     287.6943          5.2591
Bucknell 39 Female         60.1675    108.3969    665.8829          1.4486
Williams 40 Female         41.9029    71.3542     509.9976          2.4484
Swarthmore 42 Female       21.833     40.1472     266.4082          20.5374
Wesleyan 43 Female         44.6585    63.2915     508.9586          2.3624
Middlebury 45 Female       55.6877    68.0488     665.9784          11.2319
Hamilton 46 Female         26.8269    51.7173     339.0963          3.7296
Bowdoin 47 Female          42.3633    50.0206     443.6413          5.1717
Vanderbilt 48 Female       27.462     56.7277     295.3028          3.5987
UGA 50 Female              18.8815    91.4275     544.2733          24.5577
FSU 53 Female              21.5892    47.6875     549.1743          29.1432
GWU 54 Female              19.9327    76.0316     761.2666          6.434
Syracuse 56 Female         19.2093    56.8257     412.6795          6.608
Notre Dame 57 Female       67.0792    232.5318    1501.7183         8.8552
Smith 60 Female            26.4079    170.5962    188.8864          6.4544
Villanova 62 Female        29.0151    86.401      624.5293          2.8517
Virginia 63 Female         22.1619    23.3153     531.7238          5.0608
Cal 65 Female              5.7165     53.9128     346.3433          8.2776
Mississippi 66 Female      18.4433    83.3607     333.3368          17.7199
Mich 67 Female             9.7333     21.4232     29.1654           11.6094
Indiana 69 Female          38.4795    77.679      741.5341          17.6171
Vermont 70 Female          17.9002    27.6421     498.663           2.5953
USFCA 72 Female            13.9446    69.3035     403.2942          4.9854
Wake 73 Female             17.1753    50.5103     243.4489          5.0599
Santa 74 Female            19.8578    38.9959     360.4748          12.1288
American 75 Female         10.5018    27.6507     440.7661          4.5333
Haverford 76 Female        13.0506    58.8908     257.4976          2.3453
William 77 Female          40.4325    54.4722     496.434           4.5646
MU 78 Female               32.8577    44.8537     749.5846          9.6638
Texas 80 Female            13.6345    46.9406     193.0758          16.4519
Simmons 81 Female          12.0512    114.095     581.5996          0.81406
Texas 84 Female            25.7781    54.8669     366.4442          15.1859
Vassar 85 Female           59.7099    99.5634     722.4313          3.3778
Pepperdine 86 Female       19.049     25.8813     364.2763          5.1597
Wisconsin 87 Female        45.2369    79.1916     747.3274          17.0405
Colgate 88 Female          56.7463    76.3272     525.7816          3.1809
Howard 90 Female           4.4286     84.2908     477.9668          2.2627
UMass 92 Female            26.1798    73.2526     677.894           8.2668
Baylor 93 Female           34.1381    83.2792     585.4352          7.5727
Tennessee 95 Female        4.9822     44.0004     224.1669          33.3907
Lehigh 96 Female           21.45      65.4041     270.3559          4.5936
Oklahoma 97 Female         12.6438    60.9166     64.3757           19.4639
Reed 98 Female             6.0268     36.8867     179.4781          7.6875
Brandeis 99 Female         47.3222    203.5125    936.7937          2.2307
Trinity 100 Female         78.5774    101.3843    513.9692          7.1634
Harvard 1 Male             29.9891    61.2086     945.123           30.5129
Columbia 2 Male            22.5228    50.3682     595.2026          5.8084
Cornell 5 Male             37.3604    82.2361     657.6319          20.4949
Dartmouth 6 Male           10.4391    37.5486     325.2692          12.6391
UPenn 7 Male               13.0499    45.7206     390.0242          16.7294
MIT 8 Male                 11.1715    55.5604     134.1896          2.275
BU 10 Male                 36.675     79.3923     818.0288          9.7889
Brown 11 Male              36.9388    46.2645     637.6901          11.7407
Princeton 12 Male          22.1418    38.5684     591.3097          8.5448
Berkeley 13 Male           48.5051    49.1365     827.3955          12.8703
Duke 14 Male               20.8442    44.6352     493.8388          6.8439
Georgetown 15 Male         13.8031    102.8551    834.0682          14.6271
UVA 16 Male                21.7722    33.2059     565.5523          18.2025
BC 17 Male                 27.0521    55.7863     1299.8541         6.0791
Tufts 18 Male              25.8441    63.2841     442.9834          3.6724
Northeastern 19 Male       23.647     37.3742     645.8802          2.8086
Michigan 23 Male           26.2099    34.0209     457.296           23.536
UCLA 26 Male               23.6027    36.9133     470.0726          5.2943
Emory 27 Male              19.7766    44.5154     406.9838          4.7514
UNC 28 Male                18.3456    20.2713     359.8668          5.1274
Tulane 29 Male             12.3714    24.9828     194.3111          5.1027
UChicago 30 Male           14.2932    15.7707     302.7644          6.3597
WashU 32 Male              20.5816    68.9716     555.3128          4.9773
UCSD 34 Male               28.0347    49.934      553.0263          7.3025
USC 35 Male                16.3558    40.3269     298.4942          19.8062
UCSB 37 Male               15.6701    26.3186     498.8095          7.3265
Rochester 38 Male          41.7858    59.7379     200.7157          3.909
Bucknell 39 Male           20.8154    40.5595     317.9352          1.4625
Williams 40 Male           21.8309    77.2712     453.5945          5.5404
Amherst 41 Male            15.8332    20.5199     262.1057          3.4246
Swarthmore 42 Male         13.9237    32.22       170.3011          13.7607
Wesleyan 43 Male           33.9264    42.3695     281.1386          9.7353
Middlebury 45 Male         24.4431    37.6956     416.4853          6.4722
Hamilton 46 Male           12.511     26.9825     191.0375          2.8667
Bowdoin 47 Male            21.7334    32.5075     240.1141          3.9085
Vanderbilt 48 Male         21.2927    37.7789     358.7814          3.8178
UGA 50 Male                28.1885    37.8464     402.9449          27.9265
GWU 54 Male                15.1894    52.1764     417.4244          3.969
Maryland 58 Male           23.3088    40.4079     467.3145          18.4576
UC 61 Male                 9.4534     16.8844     289.5348          6.9456
Villanova 62 Male          18.5382    71.4608     870.5405          2.3759
UC 64 Male                 9.762      11.0808     104.4673          6.4155
Cal 65 Male                21.4686    44.5414     262.3191          9.787
Mississippi 66 Male        0.68732    33.5095     146.2177          16.8436
Mich 67 Male               5.8373     33.0694     103.8467          7.2399
Indiana 69 Male            28.3009    42.4138     300.8445          24.6824
Vermont 70 Male            9.4226     27.6424     226.9582          2.576
USFCA 72 Male              1.6826     32.2394     147.5292          5.028
Wake 73 Male               9.5267     25.8423     152.0677          1.6298
Santa 74 Male              15.3374    27.1709     184.3393          7.945
American 75 Male           4.7386     13.578      156.7257          3.5432
Haverford 76 Male          12.6174    30.7299     156.1152          2.9856
William 77 Male            11.7023    35.6013     205.7983          6.8069
Vassar 85 Male             47.3923    48.796      255.5571          2.741
Wisconsin 87 Male          29.5032    35.2799     355.807           13.2475
Colgate 88 Male            30.0573    82.1001     379.9489          3.171
Howard 90 Male             11.523     29.0063     193.9819          3.8611
UConn 91 Male              11.792     14.3681     441.0755          9.7034
Baylor 93 Male             27.9866    50.0392     523.6701          7.8315
Lehigh 96 Male             26.4709    50.6683     333.6791          3.7436
Oklahoma 97 Male           28.6091    40.5003     119.488           28.4517
Reed 98 Male               5.1599     12.0418     60.2894           1.4911
Brandeis 99 Male           18.6288    56.9973     376.3272          1.3155
Trinity 100 Male           25.3231    38.5451     279.1799          3.6043
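The five-part grouping described in the caption of Table A.5 depends only on which categorical variable has the highest z-score and, when that variable is Year, which has the second highest. The following Python sketch illustrates that rule and the "less than 2" threshold used for italicized entries. It is not the authors' code: the function names `classify_network` and `below_threshold` and the dictionary layout are hypothetical, and ties between z-scores are not handled. The sample input uses the values listed for Northwestern 25 Full.

```python
from typing import Dict, List


def classify_network(z: Dict[str, float]) -> str:
    """Name the part of the table a network falls into, given its z-scores.

    Assumed keys: "Major", "Residence", "Year", "High School".
    """
    # Rank the variables from highest to lowest z-score.
    ranked = sorted(z, key=lambda name: z[name], reverse=True)
    top, second = ranked[0], ranked[1]
    if top in ("High School", "Residence"):
        return top
    if top == "Year":
        return f"Year then {second}"
    # The caption lists no part in which Major ranks first.
    return f"{top} (not one of the five listed parts)"


def below_threshold(z: Dict[str, float], threshold: float = 2.0) -> List[str]:
    """List the variables whose z-score is below the threshold (2 in the caption)."""
    return [name for name, score in z.items() if score < threshold]


# z-scores for Northwestern 25 Full, as listed in the table.
northwestern_full = {
    "Major": 63.7255,
    "Residence": 61.9673,
    "Year": 952.1696,
    "High School": 17.7493,
}
print(classify_network(northwestern_full))
print(below_threshold(northwestern_full))
```

Applied to each row in turn, a rule of this kind should reproduce the section headings of the table, since those headings are defined by the same ranking.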
References
Backstrom, L., Huttenlocher, D., Kleinberg, J., Lan, X., 2006. Group formation
in large social networks: Membership, growth, and evolution, in: Proceedings
of 12th International Conference on Knowledge Discovery in Data Mining. ACM
Press, New York, NY, pp. 44-54.
Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E., 2008. Fast unfolding
of communities in large networks. Journal of Statistical Mechanics: Theory and
Experiment 2008, P10008.
Boyd, D.M., 2007a. Viewing American class divisions through Facebook and MySpace.
Apophenia blog essay. June 24. http://www.danah.org/papers/essays/ClassDivisions.html.
Boyd, D.M., 2007b. Why youth (heart) social network sites: The role of networked
publics in teenage social life, in: Buckingham, D. (Ed.), MacArthur Foundation
Series on Digital Learning - Youth, Identity, and Digital Media Volume. MIT Press,
Cambridge, MA, pp. 119-142.
Boyd, D.M., Ellison, N.B., 2007. Social network sites: Definition, history, and scholarship. Journal of Computer-Mediated Communication 13, 11.
Brandes, U., Delling, D., Gaertler, M., Goerke, R., Hoefer, M., Nikoloski, Z., Wagner,
D., 2008. On modularity clustering. IEEE Transactions on Knowledge and Data
Engineering 20, 172-188.
Brook, R.J., Stirling, W.D., 1984. Agreement between observers when the categories
are not specified in advance. British Journal of Mathematical and Statistical
Psychology 37, 271-282.
Brzozowski, M., Hogg, T., Szabo, G., 2008. Friends and foes: Ideological social networking, in: Proceedings of the SIGCHI Conference on Human Factors in Computing. ACM Press, New York, NY.
Callaghan, T., Mucha, P.J., Porter, M.A., 2007. Random walker ranking for NCAA
division I-A football. American Mathematical Monthly 114, 761-777.
Chin, A., Chignell, M., 2007. Identifying active subgroups within online communities, in: Proceedings of the Centre for Advanced Studies (CASCON) Conference,
Toronto, Canada.
Fortunato, S., 2010. Community detection in graphs. Physics Reports 486, 75-174.
Fortunato, S., Barthelemy, M., 2007. Resolution limit in community detection. Proceedings of the National Academy of Sciences 104, 36-41.
Fragoso, S., 2006. WTF a crazy Brazilian invasion, in: Sudweeks, F., Hrachovec, H.
(Eds.), Proceedings of CATaC 2006. Murdoch University, Murdoch, Australia.
Frank, O., Strauss, D., 1986. Markov graphs. Journal of the American Statistical
Association 81, 832-842.
Franklin, J.N., 2002. Methods of Mathematical Economics: Linear and Nonlinear
Programming, Fixed-Point Theorems. SIAM, Philadelphia, PA.
Fruchterman, T.M.J., Reingold, E.M., 1991. Graph drawing by force-directed placement. Software: Practice and Experience 21, 1129-1164.
Gajjala, R., 2007. Shifting frames: Race, ethnicity, and intercultural communication
in online social networking and virtual work, in: Hinner, M.B. (Ed.), The Role
of Communication in Business Transactions and Relationships. Peter Lang, New
York, NY, pp. 257-276.
Geidner, N.W., Fook, C.A., Bell, M.W., 2007. Masculinity and online social networks: Male self-identification on Facebook.com, in: Paper presented at Eastern
Communication Association 98th Annual Meeting. Providence, RI.
Girvan, M., Newman, M.E.J., 2002. Community structure in social and biological
networks. Proceedings of the National Academy of Sciences 99, 7821-7826.
Gjoka, M., Kurant, M., Butts, C.T., Markopoulou, A., 2010. Walking in Facebook:
A Case Study of Unbiased Sampling of OSNs, in: Proceedings of IEEE INFOCOM
'10, San Diego, CA.
Golder, S.A., Wilkinson, D., Huberman, B.A., 2007. Rhythms of social interaction:
Messaging within a massive online network, in: Steinfield, C., Pentland, B., Ackerman,
M., Contractor, N. (Eds.), Proceedings of Third International Conference
on Communities and Technologies. Springer, London, U.K., pp. 41-66.
Gonzalez, M.C., Herrmann, H.J., Kertesz, J., Vicsek, T., 2007. Community structure
and ethnic preferences in school friendship networks. Physica A 379, 307-316.
Good, B.H., de Montjoye, Y.A., Clauset, A., 2010. Performance of modularity maximization in practical contexts. Physical Review E 81, 046106.
Guimerà, R., Amaral, L.A.N., 2005. Functional cartography of complex metabolic
networks. Nature 433, 895-900.
Handcock, M.S., Hunter, D.R., Butts, C.T., Goodreau, S.M., Morris, M., 2008. ERGM:
A package to fit, simulate and diagnose exponential-family models for networks.
Journal of Statistical Software 24, 1-29.
Hjorth, L., Kim, H., 2005. Being there and being here: Gendered customising of
mobile 3G practices through a case study in Seoul. Convergence 11, 49-55.
Hogan, B., 2009. A comparison of on and offline networks through the Facebook API.
Working paper, available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1331029.
Hogg, T., Wilkinson, D., Szabo, G., Brzozowski, M., 2008. Multiple relationship
types in online communities and social networks, in: Proceedings of the AAAI
Spring Symposium on Social Information Processing. AAAI Press.
Hubert, L., 1977. Nominal scale response agreement as a generalized correlation.
British Journal of Mathematical and Statistical Psychology 30, 98-103.
Kamada, T., Kawai, S., 1989. An algorithm for drawing general undirected graphs.
Information Processing Letters 31, 7-15.
Kernighan, B.W., Lin, S., 1970. An efficient heuristic procedure for partitioning
graphs. The Bell System Technical Journal 49, 291-307.
Krebs, V., 2008. Orgnet.com: Social network analysis software & services for organizations, communities, and their consultants. http://www.orgnet.com.
Kulinskaya, E., 1994. Large sample results for permutation tests of association.
Communications in Statistics - Theory and Methods 23, 2939-2963.
Kumar, R., Novak, J., Tomkins, A., 2006. Structure and evolution of online social
networks, in: KDD '06: Proceedings of the 12th ACM SIGKDD international
conference on Knowledge discovery and data mining, ACM, New York, NY, USA,
pp. 611-617.
Kurant, M., Gjoka, M., Butts, C.T., Markopoulou, A., 2011. Walking on a graph
with a magnifying glass. arXiv:1101.5463.
Lampe, C., Ellison, N.B., Steinfield, C., 2007. A familiar Face(book): Profile elements
as signals in an online social network, in: Proceedings of Conference on Human
Factors in Computing Systems. ACM Press, New York, NY, pp. 435-444.
Lewis, K., Kaufman, J., Christakis, N.A., 2008a. The taste for privacy: An analysis of
college student privacy settings in an online social network. Journal of Computer-Mediated Communication 14, 79-100.
Lewis, K., Kaufman, J., Gonzalez, M., Wimmer, M., Christakis, N.A., 2008b. Tastes,
ties, and time: A new (cultural, multiplex, and longitudinal) social network dataset
using Facebook.com. Social Networks 30, 330-342.
Liben-Nowell, D., Novak, J., Kumar, R., Raghavan, P., Tomkins, A., 2005. Geographic routing in social networks. Proceedings of the National Academy of
Sciences 102, 11623-11628.
Lievrouw, L.A., Livingstone, S. (Eds.), 2005. The Handbook of New Media. Sage
Publications Ltd., London, UK. updated student edition.
Looijen, A.H., Porter, M.A., 2007. Legends of Caltech III: Techer in the Dark.
Caltech Alumni Association, Pasadena, CA.
Lubbers, M.J., Snijders, T.A.B., 2007. A comparison of various approaches to the
exponential random graph model: A reanalysis of 102 student networks in school
classes. Social Networks 29, 489-507.
Mayer, A., Puller, S.L., 2008. The old boy (and girl) network: Social network
formation on university campuses. Journal of Public Economics 92, 329-347.
McPherson, M., Smith-Lovin, L., Cook, J.M., 2001. Birds of a feather: Homophily
in social networks. Annual Review of Sociology 27, 415444.
Mucha, P.J., Richardson, T., Macon, K., Porter, M.A., Onnela, J.P., 2010. Community structure in time-dependent, multiscale, and multiplex networks. Science
328, 876-878.
Newman, M.E.J., 2003. Mixing patterns in networks. Physical Review E 67, 026126.
Newman, M.E.J., 2006a. Finding community structure in networks using the eigenvectors of matrices. Physical Review E 74, 036104.
Newman, M.E.J., 2006b. Modularity and community structure in networks. Proceedings of the National Academy of Sciences 103, 8577-8582.
Newman, M.E.J., 2010. Networks: An Introduction. Oxford University Press, Oxford, UK.
Nyland, R., Near, C., 2007. Jesus is my friend: Religiosity as a mediating factor in Internet social networking use, in: Paper presented at AEJMC Midwinter
Conference. Reno, NV.
Onnela, J., Saramäki, J., Hyvönen, J., Szabó, G., Lazer, D., Kaski, K., Kertész,
J., Barabási, A.L., 2007. Structure and tie strengths in mobile communication
networks. Proceedings of the National Academy of Sciences 104, 7332-7336.
Porter, M.A., Mucha, P.J., Newman, M.E.J., Warmbrand, C.M., 2005. A network analysis of committees in the United States House of Representatives. Proceedings of
the National Academy of Sciences 102, 7057-7062.
Porter, M.A., Onnela, J., Mucha, P.J., 2009. Communities in networks. Notices of
the American Mathematical Society 56, 1082-1097, 1164-1166.
Rand, W.M., 1971. Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association 66, 846-850.
Reichardt, J., Bornholdt, S., 2006. Statistical mechanics of community detection.
Physical Review E 74.
Richardson, T., Mucha, P.J., Porter, M.A., 2009. Spectral tripartitioning of networks.
Physical Review E 80, 036111.
Robins, G., Pattison, P., Kalish, Y., Lusher, D., 2007. An Introduction to Exponential Random Graph (p*) Models for Social Networks. Social Networks 29, 173-191.
Rosenbloom, S., 2007. On Facebook, scholars link up with data. New York Times
(17 December).
Sodera, V., 2008. Rapleaf study reveals gender and age data of social network
users. Press release, available at
http://business.rapleaf.com/company_press_2008_07_29.html.
Spertus, E., Sahami, M., Büyükkökten, O., 2005. Evaluating similarity measures: A
large-scale study in the orkut social network, in: Proceedings of 11th International
Conference on Knowledge Discovery in Data Mining. ACM Press, New York, NY,
pp. 678-684.
Traud, A.L., Frost, C., Mucha, P.J., Porter, M.A., 2009. Visualization of communities
in networks. Chaos 19, 041104.
Traud, A.L., Mucha, P.J., Porter, M.A., Kelsic, E.D., 2010. Comparing community
structure to characteristics in online collegiate social networks. arXiv:0809.0690
(to appear in SIAM Review).
Wasserman, S., Faust, K., 1994. Social Network Analysis: Methods and Applications. Structural Analysis in the Social Sciences, Cambridge University Press,
Cambridge, UK.
Wasserman, S., Pattison, P., 1996. Logit models and logistic regressions for social
networks: I. An introduction to Markov graphs and p*. Psychometrika 61, 401-425.
Waugh, A.S., Pei, L., Fowler, J.H., Mucha, P.J., Porter, M.A., 2009. Party polarization in Congress: A network science approach. arXiv:0907.3509.
Weisstein, E.W., 2011. Barycentric coordinates, in Wolfram MathWorld. Available
at http://mathworld.wolfram.com/BarycentricCoordinates.html.
Zhang, Y., Friend, A.J., Traud, A.L., Porter, M.A., Fowler, J.H., Mucha, P.J.,
2008. Community structure in Congressional cosponsorship networks. Physica A
387, 1705-1712.
Zheng, R., Provost, F., Ghose, A., 2007. Social network collaborative filtering.
Preprint (CeDER working paper).