




























































































Material Type: Exam; Class: Fiat Lux Freshman Seminars; Subject: Statistics; University: University of California - Los Angeles; Term: Spring 2009;
Typology: Exams





























































































Version 2.0-
Date 2009-06-
Title Tools for Social Network Analysis
Author Carter T. Butts
Maintainer Carter T. Butts
Depends R (>= 2.0.0), utils
Suggests network, rgl, numDeriv, SparseM, statnet
Description A range of tools for social network analysis, including node and graph-level indices, structural distance and covariance methods, structural equivalence detection, p* modeling, random graph generation, and 2D/3D network visualization.
License GPL (>= 2)
URL http://erzuli.ss.uci.edu/R.stuff
Repository CRAN
Date/Publication 2009-06-08 07:08:
add.isolates
bbnam
bbnam.bf
betweenness
bicomponent.dist
blockmodel
blockmodel.expand
bn
bonpow
brokerage
centralgraph
add.isolates Add Isolates to a Graph
Description
Adds n isolates to the graph (or graphs) in dat.
Usage
add.isolates(dat, n, return.as.edgelist = FALSE)
Arguments
dat    one or more input graphs.
n    the number of isolates to add.
return.as.edgelist    logical; should the input graph be returned as an edgelist (rather than an adjacency matrix)?
Details
If dat contains more than one graph, the n isolates are added to each member of dat.
Value
The updated graph(s).
Note
Isolate addition is particularly useful when computing structural distances between graphs of different orders; see the reference below for details.
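As a minimal base R sketch of the idea, assuming a single graph stored as a square adjacency matrix: adding isolates amounts to padding the matrix with all-zero rows and columns. The helper name pad_isolates is hypothetical and is not part of the package; add.isolates itself also accepts multiple graphs and can return an edgelist.

```r
# Hypothetical helper (not part of sna): pad an adjacency matrix with n isolates
pad_isolates <- function(adj, n) {
  k <- nrow(adj)
  out <- matrix(0, nrow = k + n, ncol = k + n)
  out[seq_len(k), seq_len(k)] <- adj   # original ties are kept unchanged
  out                                  # the n new vertices have no ties
}

# Bring a 3-vertex graph up to the order of a 5-vertex graph
small <- matrix(c(0, 1, 0,
                  1, 0, 1,
                  0, 1, 0), nrow = 3, byrow = TRUE)
padded <- pad_isolates(small, 2)
dim(padded)
```

Once both graphs have the same order, they can be compared cell by cell, which is the situation the note above has in mind.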
Author(s)
Carter T. Butts
References
Butts, C.T., and Carley, K.M. (2001). “Multivariate Methods for Inter-Structural Analysis.” CASOS Working Paper, Carnegie Mellon University.
See Also
isolates
bbnam
nprior    Network prior matrix. This must be a matrix of dimension n x n, containing the arc/edge priors for the criterion network. (E.g., nprior[i,j] gives the prior probability of i sending the relation to j in the criterion graph.) Non-matrix values will be coerced/expanded to matrix form as appropriate. If no network prior is provided, an uninformative prior on the space of networks will be assumed (i.e., Pr(i → j) = 0.5). Missing values are not allowed.
em    Probability of a false negative; this may be in the form of a single number, one number per observation slice, one number per (directed) dyad, or one number per dyadic observation (fixed model only).
ep    Probability of a false positive; this may be in the form of a single number, one number per observation slice, one number per (directed) dyad, or one number per dyadic observation (fixed model only).
emprior    Parameters for the (Beta) false negative prior; these should be in the form of an (α, β) pair for the pooled model, and of an n × 2 matrix of (α, β) pairs for the actor model (or something which can be coerced to this form). If no emprior is given, a weakly informative prior (1,11) will be assumed; note that this may be inappropriate, as described below. Missing values are not allowed.
epprior    Parameters for the (Beta) false positive prior; these should be in the form of an (α, β) pair for the pooled model, and of an n × 2 matrix of (α, β) pairs for the actor model (or something which can be coerced to this form). If no epprior is given, a weakly informative prior (1,11) will be assumed; note that this may be inappropriate, as described below. Missing values are not allowed.
diag    Boolean indicating whether loops (matrix diagonals) should be counted as data.
mode    A string indicating whether the data in question forms a "graph" or a "digraph".
reps    Number of replicate chains for the Gibbs sampler (pooled and actor models only).
draws    Integer indicating the total number of draws to take from the posterior distribution. Draws are taken evenly from each replication (thus, the number of draws from a given chain is draws/reps).
burntime    Integer indicating the burn-in time for the Markov Chain. Each replication is iterated burntime times before taking draws (with these initial iterations being discarded); hence, one should realize that each increment to burntime increases execution time by a quantity proportional to reps (pooled and actor models only).
quiet    Boolean indicating whether MCMC diagnostics should be displayed (pooled and actor models only).
outmode    posterior indicates that the exact posterior probability matrix for the criterion graph should be returned; otherwise draws from the joint posterior are returned instead (fixed model only).
anames    A vector of names for the actors (vertices) in the graph.
onames    A vector of names for the observers (possibly the actors themselves) whose reports are contained in the input data.
compute.sqrtrhat    A boolean indicating whether or not Gelman et al.’s potential scale reduction measure (an MCMC convergence diagnostic) should be computed (pooled and actor models only).
Details
The bbnam models a set of network data as reflecting a series of (noisy) observations by a set of participants/observers regarding an uncertain criterion structure. Each observer is assumed to send false positives (i.e., reporting a tie when none exists in the criterion structure) with probability e+, and false negatives (i.e., reporting that no tie exists when one does in fact exist in the criterion structure) with probability e−. The criterion network itself is taken to be a Bernoulli (di)graph. Note that the present model includes three variants, which differ in how the error probabilities are treated: the fixed probability model, the pooled model, and the per-observer (actor) model.
By default, the bbnam routine returns (approximately) independent draws from the joint posterior distribution, each draw yielding one realization of the criterion network and one collection of accuracy parameters (i.e., probabilities of false positives/negatives). This is accomplished via a Gibbs sampler in the case of the pooled/actor model, and by direct sampling for the fixed probability model. In the special case of the fixed probability model, it is also possible to obtain directly the posterior for the criterion graph (expressed as a matrix of Bernoulli parameters); this can be controlled by the outmode parameter.

As noted, the taking of posterior draws in the nontrivial case is accomplished via a Markov Chain Monte Carlo method, in particular the Gibbs sampler; the high dimensionality of the problem (O(n^2 + 2n)) tends to preclude more direct approaches. At present, chain burn-in is determined ex ante on a more or less arbitrary basis by specification of the burntime parameter. Eventually, a more systematic approach will be utilized. Note that insufficient burn-in will result in inaccurate posterior sampling, so it’s not wise to skimp on burn time where otherwise possible. Similarly, it is wise to employ more than one Markov Chain (set by reps), since it is possible for trajectories to become “trapped” in metastable regions of the state space. Number of draws per chain being equal, more replications are usually better than few; consult Gelman et al. for details.

A useful measure of chain convergence, Gelman and Rubin’s potential scale reduction (R̂), can be computed using the compute.sqrtrhat parameter. The potential scale reduction measure is an ANOVA-like comparison of within-chain versus between-chain variance; it approaches 1 (from above) as the chain converges, and longer burn-in times are strongly recommended for chains with scale reductions in excess of 1.2 or thereabouts.
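To make the within-chain versus between-chain comparison concrete, here is a minimal base R sketch of the textbook Gelman and Rubin statistic for a single scalar quantity. The function name sqrt_rhat and the simulated chains are illustrative assumptions; the computation that bbnam performs internally when compute.sqrtrhat is set may differ in detail.

```r
# Hypothetical helper (not part of sna): potential scale reduction for one
# scalar quantity, given a matrix with one column per chain
sqrt_rhat <- function(chains) {
  n <- nrow(chains)                     # draws per chain
  B <- n * var(colMeans(chains))        # between-chain variance
  W <- mean(apply(chains, 2, var))      # mean within-chain variance
  var_plus <- (n - 1) / n * W + B / n   # pooled estimate of the target variance
  sqrt(var_plus / W)
}

set.seed(1)
mixed <- matrix(rnorm(4000), ncol = 4)  # four chains sampling the same target
stuck <- mixed
stuck[, 1] <- stuck[, 1] + 5            # one chain trapped in a different region
r_mixed <- sqrt_rhat(mixed)
r_stuck <- sqrt_rhat(stuck)
```

In this sketch the well-mixed chains give a value close to 1, while the set containing a displaced chain falls well above the 1.2 guideline, which is the situation in which a longer burn-in is advised.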
References
Butts, C. T. (2003). “Network Inference, Error, and Informant (In)Accuracy: A Bayesian Approach.” Social Networks, 25(2), 103-140.
Gelman, A.; Carlin, J.B.; Stern, H.S.; and Rubin, D.B. (1995). Bayesian Data Analysis. London: Chapman and Hall.
Gelman, A., and Rubin, D.B. (1992). “Inference from Iterative Simulation Using Multiple Sequences.” Statistical Science, 7, 457-511.
Krackhardt, D. (1987). “Cognitive Social Structures.” Social Networks, 9, 109-134.
See Also
npostpred, event2dichot, bbnam.bf
Examples
#Create some random data
g<-rgraph(5)
g.p<-0.8*g+0.2*(1-g)
dat<-rgraph(5,5,tprob=g.p)

#Define a network prior
pnet<-matrix(ncol=5,nrow=5)
pnet[,]<-0.
#Define em and ep priors
pem<-matrix(nrow=5,ncol=2)
pem[,1]<-
pem[,2]<-
pep<-matrix(nrow=5,ncol=2)
pep[,1]<-
pep[,2]<-

#Draw from the posterior
b<-bbnam(dat,model="actor",nprior=pnet,emprior=pem,epprior=pep,
  burntime=100,draws=100)
#Print a summary of the posterior draws
summary(b)
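The observation model described in Details can also be simulated without the package. The base R sketch below draws reports from m observers about a known criterion graph, with a common false negative probability em and false positive probability ep; the helper simulate_reports is hypothetical, and the observer-by-sender-by-receiver array layout is assumed to follow the convention documented for dat. Loops are not treated specially here.

```r
# Hypothetical helper (not part of sna): simulate m observers' reports
simulate_reports <- function(crit, m, em, ep) {
  n <- nrow(crit)
  # Probability that an observer reports a tie on each dyad:
  # true ties are reported unless missed (em), absent ties only by error (ep)
  p_report <- crit * (1 - em) + (1 - crit) * ep
  # The first array dimension indexes the observer, so each dyad's
  # probability is repeated m times in column-major order
  draws <- rbinom(m * n * n, size = 1,
                  prob = rep(as.vector(p_report), each = m))
  array(draws, dim = c(m, n, n))
}

set.seed(1)
crit <- matrix(rbinom(25, 1, 0.5), 5, 5)   # criterion digraph
diag(crit) <- 0
reports <- simulate_reports(crit, m = 5, em = 0.2, ep = 0.2)
exact <- simulate_reports(crit, m = 2, em = 0, ep = 0)  # error-free observers
```

With both error probabilities set to zero, every observer slice reproduces the criterion graph exactly, which is a quick check that the indexing is right.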
bbnam.bf Estimate Bayes Factors for the bbnam
Description
This function uses Monte Carlo integration to estimate Bayes factors (BFs), and tests the fixed probability, pooled, and pooled by actor models. (See bbnam for details.)
Usage
bbnam.bf(dat, nprior=0.5, em.fp=0.5, ep.fp=0.5, emprior.pooled=c(1, 11), epprior.pooled=c(1, 11), emprior.actor=c(1, 11), epprior.actor=c(1, 11), diag=FALSE, mode="digraph", reps=1000)
Arguments
dat    Input networks to be analyzed. This may be supplied in any reasonable form, but must be reducible to an array of dimension m×n×n, where n is |V(G)|, the first dimension indexes the observer (or information source), the second indexes the sender of the relation, and the third dimension indexes the recipient of the relation. (E.g., dat[i,j,k]==1 implies that i observed j sending the relation in question to k.) Note that only dichotomous data is supported at present, and missing values are permitted; the data collection pattern, however, is assumed to be ignorable, and hence the posterior draws are implicitly conditional on the observation pattern.
nprior    Network prior matrix. This must be a matrix of dimension n x n, containing the arc/edge priors for the criterion network. (E.g., nprior[i,j] gives the prior probability of i sending the relation to j in the criterion graph.) Non-matrix values will be coerced/expanded to matrix form as appropriate. If no network prior is provided, an uninformative prior on the space of networks will be assumed (i.e., Pr(i → j) = 0.5). Missing values are not allowed.
em.fp    Probability of false negatives for the fixed probability model.
ep.fp    Probability of false positives for the fixed probability model.
emprior.pooled    (α, β) pairs for the (beta) false negative prior under the pooled model.
epprior.pooled    (α, β) pairs for the (beta) false positive prior under the pooled model.
emprior.actor    Matrix of per observer (α, β) pairs for the (beta) false negative prior under the per observer/actor model, or something that can be coerced to this form.
epprior.actor    Matrix of per observer (α, β) pairs for the (beta) false positive prior under the per observer/actor model, or something that can be coerced to this form.
diag    Boolean indicating whether or not the diagonal should be treated as valid data. Set this true if and only if the criterion graph can contain loops. diag is FALSE by default.
mode    String indicating the type of graph being evaluated. "digraph" indicates that edges should be interpreted as directed; "graph" indicates that edges are undirected. mode is set to "digraph" by default.
reps    Number of Monte Carlo draws to take.
Details
The bbnam model (detailed in the bbnam function help) is a fairly simple model for integrating informant reports regarding social network data. bbnam.bf computes log Bayes Factors (integrated
betweenness Compute the Betweenness Centrality Scores of Network Positions
Description
betweenness takes one or more graphs (dat) and returns the betweenness centralities of positions (selected by nodes) within the graphs indicated by g. Depending on the specified mode, betweenness on directed or undirected geodesics will be returned; this function is compatible with centralization, and will return the theoretical maximum absolute deviation (from maximum) conditional on size (which is used by centralization to normalize the observed centralization score).
Usage
betweenness(dat, g=1, nodes=NULL, gmode="digraph", diag=FALSE, tmaxdev=FALSE, cmode="directed", geodist.precomp=NULL, rescale=FALSE, ignore.eval=TRUE)
Arguments
dat    one or more input graphs.
g    integer indicating the index of the graph for which centralities are to be calculated (or a vector thereof). By default, g=1.
nodes    vector indicating which nodes are to be included in the calculation. By default, all nodes are included.
gmode    string indicating the type of graph being evaluated. "digraph" indicates that edges should be interpreted as directed; "graph" indicates that edges are undirected. gmode is set to "digraph" by default.
diag    boolean indicating whether or not the diagonal should be treated as valid data. Set this true if and only if the data can contain loops. diag is FALSE by default.
tmaxdev    boolean indicating whether or not the theoretical maximum absolute deviation from the maximum nodal centrality should be returned. By default, tmaxdev==FALSE.
cmode    string indicating the type of betweenness centrality being computed (directed or undirected geodesics, or a variant form; see below).
geodist.precomp    A geodist object precomputed for the graph to be analyzed (optional).
rescale    if true, centrality scores are rescaled such that they sum to 1.
ignore.eval    logical; ignore edge values when computing shortest paths?
Details
The shortest-path betweenness of a vertex, v, is given by

\[ C_B(v) = \sum_{i,j:\, i \neq j,\, i \neq v,\, j \neq v} \frac{g_{ivj}}{g_{ij}} \]

where $g_{ijk}$ is the number of geodesics from i to k through j. Conceptually, high-betweenness vertices lie on a large number of non-redundant shortest paths between other vertices; they can thus be thought of as “bridges” or “boundary spanners.” Several variant forms of shortest-path betweenness exist, and can be selected using the cmode argument. Supported options are as follows:
directed    Standard betweenness (see above), calculated on directed pairs. (This is the default option.)

undirected    Standard betweenness (as above), calculated on undirected pairs (undirected graphs only).

endpoints    Standard betweenness, with direct connections counted towards ego’s score. This expresses the intuition that individuals’ control over their own direct contacts should be considered in their total score (e.g., when betweenness is interpreted as a measure of information control).

proximalsrc    Borgatti’s proximal source betweenness, given by

\[ C_B(v) = \sum_{i,j:\, i \neq v,\, i \neq j,\, j \to v} \frac{g_{ivj}}{g_{ij}} \]

This variant allows betweenness to accumulate only for the last intermediating vertex in each incoming geodesic; this expresses the notion that, by serving as the “proximal source” for the target, this particular intermediary will in some settings have greater influence or control than other intervening parties.

proximaltar    Borgatti’s proximal target betweenness, given by

\[ C_B(v) = \sum_{i,j:\, i \neq v,\, i \to v,\, i \neq j} \frac{g_{ivj}}{g_{ij}} \]

This counterpart to proximal source betweenness (above) allows betweenness to accumulate only for the first intermediating vertex in each outgoing geodesic; this expresses the notion that, by serving as the “proximal target” for the source, this particular intermediary will in some settings have greater influence or control than other intervening parties.

proximalsum    The sum of Borgatti’s proximal source and proximal target betweenness scores (above); this may be used when either role is regarded as relevant to the betweenness calculation.

lengthscaled    Borgatti and Everett’s length-scaled betweenness, given by

\[ C_B(v) = \sum_{i,j:\, i \neq j,\, i \neq v,\, j \neq v} \frac{1}{d_{ij}} \frac{g_{ivj}}{g_{ij}} \]

where $d_{ij}$ is the geodesic distance from i to j. This measure adjusts the standard betweenness score by downweighting long paths (e.g., as appropriate in circumstances for which such paths are less-often used).

linearscaled    Geisberger et al.’s linearly-scaled betweenness:
\[ C_B(v) = \sum_{i,j:\, i \neq j,\, i \neq v,\, j \neq v} \frac{1}{d_{ij}} \frac{g_{ivj}}{g_{ij}} \]
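As a rough illustration of the standard shortest-path betweenness defined at the start of this section (the "directed" case), the following base R sketch counts geodesics by breadth-first search and then sums the ratios directly. It is a brute-force version for small, unvalued graphs; the function names are hypothetical, and the package's own betweenness routine should be preferred in practice.

```r
# Hypothetical helper: geodesic distances and geodesic counts between all
# ordered pairs, by breadth-first search on a 0/1 adjacency matrix
geodesic_counts <- function(adj) {
  n <- nrow(adj)
  dist <- matrix(Inf, n, n)
  sigma <- matrix(0, n, n)
  for (s in seq_len(n)) {
    dist[s, s] <- 0
    sigma[s, s] <- 1
    frontier <- s
    while (length(frontier) > 0) {
      nxt <- integer(0)
      for (u in frontier) {
        for (w in which(adj[u, ] != 0)) {
          if (is.infinite(dist[s, w])) {        # first time w is reached
            dist[s, w] <- dist[s, u] + 1
            nxt <- c(nxt, w)
          }
          if (dist[s, w] == dist[s, u] + 1) {   # u precedes w on a geodesic
            sigma[s, w] <- sigma[s, w] + sigma[s, u]
          }
        }
      }
      frontier <- nxt
    }
  }
  list(dist = dist, sigma = sigma)
}

# Hypothetical helper: sum g_ivj / g_ij over ordered pairs i != j, i != v, j != v
betweenness_sketch <- function(adj) {
  gc <- geodesic_counts(adj)
  d <- gc$dist
  s <- gc$sigma
  n <- nrow(adj)
  cb <- numeric(n)
  for (v in seq_len(n)) for (i in seq_len(n)) for (j in seq_len(n)) {
    if (i == j || i == v || j == v) next
    # v lies on an i-j geodesic exactly when the distances add up
    if (is.finite(d[i, j]) && d[i, v] + d[v, j] == d[i, j]) {
      cb[v] <- cb[v] + s[i, v] * s[v, j] / s[i, j]
    }
  }
  cb
}

# Directed path 1 -> 2 -> 3 -> 4
path <- matrix(0, 4, 4)
path[cbind(1:3, 2:4)] <- 1
cb <- betweenness_sketch(path)
```

On the directed path, only the two interior vertices sit between other pairs, so they are the only ones with nonzero scores.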
bicomponent.dist Calculate the Bicomponents of a Graph
Description
bicomponent.dist returns the bicomponents of an input graph, along with size distribution and membership information.
Usage
bicomponent.dist(dat, symmetrize = c("strong", "weak"))
Arguments
dat    a graph or graph stack.
symmetrize    symmetrization rule to apply when pre-processing the input (see symmetrize).
Details
The bicomponents of undirected graph G are its maximal 2-connected vertex sets. bicomponent.dist calculates the bicomponents of G, after first coercing to undirected form using the symmetrization rule in symmetrize. In addition to bicomponent memberships, various summary statistics regarding the bicomponent distribution are returned; see below.
Value
A list containing
members    A list, with one entry per bicomponent, containing component members.
memberships    A vector of component memberships, by vertex. (Note: memberships may not be unique.) Vertices not belonging to any bicomponent have membership values of NA.
csize    A vector of component sizes, by bicomponent.
cdist    A vector of length |V(G)| with the (unnormalized) empirical distribution function of bicomponent sizes.
Note
Remember that bicomponents can intersect; when this occurs, the relevant vertices’ entries in the membership vector are assigned to one of the overlapping bicomponents on an arbitrary basis. The members element of the return list is the safe way to recover membership information.
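To illustrate why the members element is the safer source, the base R sketch below works on a hand-built list shaped like the return value described above (it does not call the package). The object bc and the vertex count are made-up illustrative values.

```r
# Hypothetical return value shaped like the list described above:
# two bicomponents that intersect at vertex 3, in a graph with 6 vertices
bc <- list(members = list(c(1, 2, 3), c(3, 4, 5)))
n_vertices <- 6

# Count how many bicomponents each vertex appears in
times_listed <- tabulate(unlist(bc$members), nbins = n_vertices)
shared   <- which(times_listed > 1)    # vertices in overlapping bicomponents
unplaced <- which(times_listed == 0)   # vertices in no bicomponent
```

A single membership vector can hold only one label for vertex 3, whereas the members list records both memberships.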
Author(s)
Carter T. Butts
References
Brandes, U. and Erlebach, T. (2005). Network Analysis: Methodological Foundations. Berlin: Springer.
See Also
component.dist, cutpoints
Examples
#Draw a moderately sparse graph
g<-rgraph(25,tp=2/24,mode="graph")

#Compute the bicomponents
bicomponent.dist(g)
blockmodel Generate Blockmodels Based on Partitions of Network Positions
Description
Given a set of equivalence classes (in the form of an equiv.clust object, hclust object, or membership vector) and one or more graphs, blockmodel will form a blockmodel of the input graph(s) based on the classes in question, using the specified block content type.
Usage
blockmodel(dat, ec, k=NULL, h=NULL, block.content="density", plabels=NULL, glabels=NULL, rlabels=NULL, mode="digraph", diag=FALSE)
Arguments
dat    one or more input graphs.
ec    equivalence classes, in the form of an object of class equiv.clust or hclust, or a membership vector.
k    the number of classes to form (using cutree).
h    the height at which to split classes (using cutree).
block.content    string indicating block content type (see below).
plabels    a vector of labels to be applied to the individual nodes.
glabels    a vector of labels to be applied to the graphs being modeled.
rlabels    a vector of labels to be applied to the (reduced) roles.
mode    a string indicating whether we are dealing with graphs or digraphs.
diag    a boolean indicating whether loops are permitted.
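For the default block.content="density", a block's content is usually understood as the mean tie value among the cells linking one class to another. The base R sketch below computes such a density matrix from a membership vector; block_density is a hypothetical helper, and its handling of loops and of single-member classes is an assumption rather than a description of what blockmodel does.

```r
# Hypothetical helper (not part of sna): block densities from a membership vector
block_density <- function(adj, membership, loops = FALSE) {
  classes <- sort(unique(membership))
  k <- length(classes)
  dens <- matrix(NA_real_, k, k)
  for (a in seq_len(k)) {
    for (b in seq_len(k)) {
      rows <- which(membership == classes[a])
      cols <- which(membership == classes[b])
      block <- adj[rows, cols, drop = FALSE]
      if (a == b && !loops) {
        # Within-class block: leave the diagonal (loops) out of the average;
        # a single-member class then has no cells and yields NaN
        off <- row(block) != col(block)
        dens[a, b] <- mean(block[off])
      } else {
        dens[a, b] <- mean(block)
      }
    }
  }
  dens
}

# Two classes of two vertices; only the first class is internally connected
adj <- matrix(0, 4, 4)
adj[1, 2] <- adj[2, 1] <- 1
dens <- block_density(adj, membership = c(1, 1, 2, 2))
```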
Examples
#Create a random graph with some edge structure
g.p<-sapply(runif(20,0,1),rep,20)  #Create a matrix of edge probabilities
g<-rgraph(20,tprob=g.p)            #Draw from a Bernoulli graph distribution

#Cluster based on structural equivalence
eq<-equiv.clust(g)

#Form a blockmodel with distance relaxation of 10
b<-blockmodel(g,eq,h=10)
plot(b)                            #Plot it
blockmodel.expand Generate a Graph (or Stack) from a Given Blockmodel Using Particular Expansion Rules
Description
blockmodel.expand takes a blockmodel and an expansion vector, and expands the former by making copies of the vertices.
Usage
blockmodel.expand(b, ev, mode="digraph", diag=FALSE)
Arguments
b    blockmodel object.
ev    a vector indicating the number of copies to make of each class (respectively).
mode    a string indicating whether the result should be a “graph” or “digraph”.
diag    a boolean indicating whether or not loops should be permitted.
Details
The primary use of blockmodel expansion is in generating test data from a blockmodeling hypothesis. Expansion is performed depending on the content type of the blockmodel; at present, only density is supported. For the density content type, expansion is performed by interpreting the interclass density as an edge probability, and by drawing random graphs from the Bernoulli parameter matrix formed by expanding the density model. Thus, repeated calls to blockmodel.expand can be used to generate a sample for Monte Carlo null hypothesis tests under a Bernoulli graph model.
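The density expansion just described can be sketched in base R as follows, assuming the blockmodel is reduced to a plain matrix of interclass densities and the result is a directed graph. The helper expand_density and its loops argument are hypothetical names; this is not the package's implementation.

```r
# Hypothetical helper (not part of sna): expand a density matrix into a
# random digraph, making ev[c] copies of class c
expand_density <- function(dens, ev, loops = FALSE) {
  cls <- rep(seq_along(ev), times = ev)   # class of each expanded vertex
  p <- dens[cls, cls, drop = FALSE]       # Bernoulli parameter matrix
  n <- length(cls)
  # One Bernoulli draw per cell; p is read in the same column-major order
  # that matrix() uses to fill the result
  g <- matrix(rbinom(n * n, size = 1, prob = as.vector(p)), n, n)
  if (!loops) diag(g) <- 0
  g
}

# Two classes: ties always occur within a class and never between classes
dens <- matrix(c(1, 0,
                 0, 1), nrow = 2, byrow = TRUE)
g.e <- expand_density(dens, ev = c(2, 3))
```

Because the densities here are 0 or 1, the draw is deterministic; with intermediate densities, repeated calls give the independent samples needed for a Monte Carlo test.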
Value
An adjacency matrix, or stack thereof.
Note
Eventually, other content types will be supported.
Author(s)
Carter T. Butts
References
Doreian, P.; Batagelj, V.; and Ferligoj, A. (2005). Generalized Blockmodeling. Cambridge: Cambridge University Press.
White, H.C.; Boorman, S.A.; and Breiger, R.L. (1976). “Social Structure from Multiple Networks I: Blockmodels of Roles and Positions.” American Journal of Sociology, 81, 730-779.
See Also
blockmodel
Examples
#Create a random graph with some edge structure
g.p<-sapply(runif(20,0,1),rep,20)  #Create a matrix of edge probabilities
g<-rgraph(20,tprob=g.p)            #Draw from a Bernoulli graph distribution

#Cluster based on structural equivalence
eq<-equiv.clust(g)

#Form a blockmodel with distance relaxation of 15
b<-blockmodel(g,eq,h=15)

#Draw from an expanded density blockmodel
g.e<-blockmodel.expand(b,rep(2,length(b$rlabels)))  #Two of each class
g.e
bn Fit a Biased Net Model
Description
Fits a biased net model to an input graph, using moment-based or maximum pseudolikelihood techniques.
Usage
bn(dat, method = c("mple.triad", "mple.dyad", "mple.edge", "mtle"), param.seed = NULL, param.fixed = NULL, optim.method = "BFGS", optim.control = list(), epsilon = 1e-05)