IDENTIFICATION, ESTIMATION, AND SENSITIVITY ANALYSIS OF CONTAGION EFFECTS USING LONGITUDINAL SOCIAL NETWORK DATA By Ran Xu A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Measurement and Quantitative Methods Doctor of Philosophy 2016 ABSTRACT IDENTIFICATION, ESTIMATION, AND SENSITIVITY ANALYSIS OF CONTAGION EFFECTS USING LONGITUDINAL SOCIAL NETWORK DATA By Ran Xu Contagion effects, also known as peer effects or social influence process, refer to the phenomenon whereby people tend to assimilate the behavior of those with whom they have interaction in a social network. With the availability of longitudinal social network data, studies of contagion effects have become more and more central to social science, with many applications in the field of education, such as the diffusion of innovation, change of health behaviors, academic outcomes among adolescents, and the implementation of practices among teachers (Valente, 1995, 1996; Christakis et al., 2007, 2008; Sacerdote, 2000; Frank et al, 2004). However, contagion effects are usually difficult to identify as they are often entangled with other factors such as homophily in the selection process, an individual™s preference for the same social settings, etc. Methods currently available either do not solve these problems or require strong assumptions. Furthermore, there is still a significant degree of misconception about why identifying contagion effects is a problem, and when these methods should be applied. For this dissertation, in the first chapter I will clarify why and when we will encounter problems identifying contagion effects. Specifically I will frame this in terms of an omitted variable bias problem; and then I will explore the magnitude of bias in the estimation of contagion effects in various situations, and possible remedies under an OLS framework. In the second chapter I will propose some alternative estimation methods that have the potential to correctly identify contagion effects under weaker assumptions when there are unobserved variables present. In the third chapter I will propose a set of simulation-based sensitivity analysis methods that can test the robustness of inferences made in social network analysis, especially inferences about contagion effects. Copyright by RAN XU 2016 iv TABLE OF CONTENTS LIST OF TABLES-------------------------------------------------------------------------- vi LIST OF FIGURES------------------------------------------------------------------------ vii CHAPTER 1: Identification--------------------------------------------------------------- 1 I. Introduction and Literature Review------------------------------------------------ 1 II. Theoretical Framework------------------------------------------------------------- 3 i. Identifying contagion effects: where does bias come from?----------------- 3 ii. Situations where the contagion effect is identifiable using OLS----------- 8 III. Monte-Carlo Simulation------------------------------------------------------------ 10 i. Simulation example--------------------------------------------------------------- 10 ii. Magnitude of bias in different situations-------------------------------------- 12 IV. Possible Solutions: Multidimensional Priors ------------------------------------ 22 V. Discussion and Conclusion--------------------------------------------------------- 25 CHAPTER 2: Estimation Methods------------------------------------------------------- 27 I. 
Introduction---------------------------------------------------------------------------- 27 II. Theoretical Framework-------------------------------------------------------------- 27 i. Random vs Fixed effects---------------------------------------------------------- 27 ii. Instrumental variable methods--------------------------------------------------- 29 iii. Latent variable approach-------------------------------------------------------- 32 III. Monte-Carlo Simulation------------------------------------------------------------ 37 IV. Robustness Test----------------------------------------------------------------------- 44 i. Results with covariates------------------------------------------------------------ 44 ii. Results with cluster membership----------------------------------------------- 47 iii. Fixed intercept-------------------------------------------------------------------- 50 V. Discussion and Conclusion----------------------------------------------------------- 52 CHAPTER 3: Sensitivity Analysis--------------------------------------------------------- 55 I. Introduction and Literature Review-------------------------------------------------- 55 II. Theoretical Framework--------------------------------------------------------------- 59 i. Sensitivity analysis through rewiring networks-------------------------------- 59 ii. Mechanisms for rewiring--------------------------------------------------------- 59 III. Applying Sensitivity Analysis: Examples on Contagion Effects--------------- 65 IV. Analytic Solutions-------------------------------------------------------------------- 71 i. First set of analytical solutions--------------------------------------------------- 73 ii. Second set of analytical solutions----------------------------------------------- 78 iii. Validation-------------------------------------------------------------------------- 84 V. 
Discussion and Conclusion--------------------------------------------------------- 94 APPENDIX----------------------------------------------------------------------------------- 99 v REFERENCES------------------------------------------------------------------------------ 109 vi LIST OF TABLES Table 1: Experimental condition ---------------------------------------------------------- 18 Table 2: Experimental condition 2--------------------------------------------------------- 24 Table 3: Influence model example--------------------------------------------------------- 68 vii LIST OF FIGURES Figure 1: Omitted variable bias------------------------------------------------------------- 6 Figure 2: Cases when contagion effects are identified---------------------------------- 10 Figure 3: Simulation example where contagion effects are identified---------------- 11 Figure 4: Magnitude of bias for prior ----------------------------------------------------- 14 Figure 5: Magnitude of bias for exposure ------------------------------------------------ 16 Figure 6: Results with multi-dimensional prior------------------------------------------ 23 Figure 7: Dynamic model with unobserved term---------------------------------------- 33 Figure 8: Influence model in structural equation model ------------------------------- 34 Figure 9: Simulation results for prior------------------------------------------------------ 39 Figure 10: Simulation results for exposure----------------------------------------------- 41 Figure 11: Results with covariates -------------------------------------------------------- 45 Figure 12: Results with cluster membership--------------------------------------------- 48 Figure 13: Results with fixed intercept --------------------------------------------------- 50 Figure 14: Structural rewiring example -------------------------------------------------- 64 Figure 15: Impact of rewiring on the estimate------------------------------------------- 69 Figure 16: Analytical solution framework ----------------------------------------------- 72 Figure 17: Distribution of percent of the ties rewired----------------------------------- 79 Figure 18: Strong influence - random rewiring example------------------------------- 85 Figure 19: Strong influence - homophily rewiring example--------------------------- 86 Figure 20: Strong influence Œ anti-homophily rewiring example--------------------- 87 Figure 21: Moderate influence Œ random rewiring example --------------------------- 88 Figure 22: Moderate influence Œ homophily rewiring example ----------------------- 89 Figure 23: Moderate influence Œ anti-homophily rewiring example ----------------- 90 viii Figure 24: Weak influence Œ random rewiring example ------------------------------- 91 Figure 25: Weak influence Œ homophily rewiring example --------------------------- 92 Figure 26: Weak influence Œ anti-homophily rewiring example ---------------------- 93 Figure S1: Cases when true prior is 0 (Big N vs Big T)------------------------------- 101 Figure S2: Anti-homophily rewiring example (better fit)----------------------------- 106 Figure S3: homophily rewiring example when network is smaller------------------ 107 Figure S4: homophily rewiring example when network is denser------------------- 108 1 CHAPTER 1: Identification I. 
Introduction and Literature Review Endogenous social effects, which have long been central to the field of social science (Asch, 1952; Merton, 1957; Erbring and Young, 1979; Bandura, 1986), are defined as the propensity for the behavior of an individual to vary along with the prevalence of that behavior in some reference group containing the individual (Manski, 1993). Within the framework of social network analysis, the endogenous social effects are also known as ficontagionfl or fisocial influencefl, and the reference group can be one™s network neighborhood. Contagion effects have also received much attention and have been widely studied (Kandel, 1978; Marsden and Friedkin, 1993; Doreian, 2001; An, 2011) as they have various implications for issues such as health behavior (e.g. obesity and smoking), information diffusion, or change in teacher practices, among others (Christakis et al, 2007, 2008; Valente, 1995, 1996; Frank et al, 2004). However, these types of contagion effects are usually difficult to identify, as it is difficult to separate such influences from other processes when there is network autocorrelation in the data, i.e. when we observe that people who are closely related to each other tend to be similar in some salient individual behavior and attitude dimensions, it is difficult to tell which is the underlying mechanism that generates these patterns. It could be influence and contagion (Friedkin, 1999, 2001; Oetting and Donnermeyer, 1998) whereby actors assimilate the behavior of their network members; or selection mechanisms, more specifically homophily (Lazarsfeld and Merton, 1954; Byrne, 1971; McPherson and Smith-Lovin, 1987; McPherson et al, 2001), where actors seek to interact with similar others; or it could be due to different social contexts where people with prior similarities can select themselves into the same social setting, and actual friendship formation just reflects the opportunities of meeting in this social setting (Feld, 1981, 1982; Kalmijn & Flap, 2 2001).1 Several notable attempts that try to identify contagion effects include modeling the co-evolution of selection and influence (Snijder et al., 2007; Steglich et al., 2010), using indirect ties from third parties as instrumental variables (Bramoullé et al., 2009; An, 2011), or Propensity Score Matching (Aral et al., 2009). But there is still considerable misconception about when it is problematic to identify contagion effects, and why these methods would need to be applied. Furthermore, all the methods mentioned above require some form of strong assumptions such as the exponential-family parametric assumption, the standard IV assumption, the assumption that all of the dependence is captured by observable covariates, and so on, each of which imposes substantial limits on the forms of data where these methods can actually be applied. The difficulty of identification caused by entanglement between contagion effects and other confounding variables (environmental factors, or the attributes of egos and alters, for example) can be easily framed as an omitted variable bias problem. What is less obvious is that the dilemma caused by co-evolution of the influence and selection processes can essentially be framed as an omitted variable bias problem as well. As pointed out by Steglich (2010), one of the important concerns is the fipossibility that there may be non-observed variables co-determining the probabilities of change in network and/or behaviorfl. 
Shalizi and Thomas (2011) have shown that when there is a latent trait that co-determines both influence and selection in network data, contagion effects are generally unidentifiable, mainly because contagion and homophily (selection) are generically confounded through this latent trait.

1 There are also structural constraints such as transitivity, preferential attachment, etc., which could cause people to become friends. However, these mechanisms in themselves do not entangle with influence (e.g. one befriends another who has high popularity but different behavior). Another mechanism must be present to induce similarity between these friends (e.g. selection of common friends based on similarity in attributes). In these cases consideration goes back to the original three mechanisms, namely influence, selection based on homophily, and environmental factors.

In this chapter, first I will clarify why contagion effects are difficult to identify; specifically, I will frame identification as an omitted variable bias problem. Then I will give examples where contagion effects can be identified using conventional approaches (e.g. OLS), and explore the magnitude of bias occurring when estimating contagion effects under various scenarios. Finally, I will propose a possible solution under the OLS framework that has the potential to correctly identify contagion effects, and then carry out simulation studies to examine the performance of this solution.

II. Theoretical Framework

i. Identifying contagion effects: where does bias come from?

To understand where the bias comes from when identifying contagion effects, we first need to specify our "causal" models in terms of influence and selection. After specifying the models we then show how the estimation of contagion effects can suffer from bias. A network behavioral (influence) model can be represented as

Y_{it} = f(Z, Y_j, X_i, c_i),   (1)

where the behavior of node i at time t is a function of the behavior of network members Y_j, other variables X specific to node i, network relations Z, and unobserved effects c_i. For example, adolescents' alcohol use (Y_{it}) can be a function of their previous alcohol use (Y_{it-1}), their close friends' alcohol use (Y_{jt-1}), their own cigarette use (X_{it-1}), and some latent disposition for substance abuse (c_i). Specifically, we choose a dynamic linear form (Friedkin, 1990):

Y_{it} = \beta_0 + \beta_1 Y_{it-1} + \beta_2 \frac{\sum_j Z_{ijt-1} Y_{jt-1}}{\sum_j Z_{ijt-1}} + \beta_3 X_{it-1} + c_i + e_{it},   (2)

where Y_{it-1} is the prior behavior of i; Z_{ijt-1} is a dummy variable indicating whether there is a link from i to j at time t-1, i.e. 1 if yes and 0 otherwise; \sum_j Z_{ijt-1} Y_{jt-1} / \sum_j Z_{ijt-1} represents the weighted average behavior among the network neighbors of i, which is the exposure term (contagion) of interest2; and X_{it-1} represents other variables that might affect the behavioral outcome Y. We choose this form of behavioral model for several reasons: (1) we choose linear models as they have greater flexibility when compared with models like SIENA (Steglich et al., 2010), and because of the availability of methods that can, under suitable assumptions, account for unobserved time-constant actor differences (Steglich et al., 2010; Mouw, 2006); (2) we only use lagged endogenous variables (if X is exogenous, i.e. if X is not "caused" by Y, then X can be contemporaneous), which to us reflects the more realistic assumption that there is some lag in the transmission of social effects. In addition, such formulations require less strict conditions for identification of social effects (Manski, 1993).
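To make the dynamic linear form in equation (2) concrete, here is a minimal sketch (Python with NumPy; the network size and all parameter values are hypothetical, not taken from the dissertation) that computes the network exposure term from an adjacency matrix and advances the behavior vector by one time step.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40                                              # number of nodes (hypothetical)
beta0, beta1, beta2, beta3 = 0.0, 0.6, 0.3, 0.1     # illustrative coefficients, not the dissertation's

Z = (rng.random((n, n)) < 0.2).astype(float)        # Z[i, j] = 1 if i -> j at time t-1
np.fill_diagonal(Z, 0.0)
Y_prev = rng.normal(size=n)                         # behavior at t-1
X_prev = rng.normal(size=n)                         # exogenous covariate at t-1
c = rng.normal(size=n)                              # unobserved time-invariant trait c_i

# Network exposure: weighted average of network neighbors' prior behavior.
out_deg = Z.sum(axis=1)
exposure = np.divide(Z @ Y_prev, out_deg, out=np.zeros(n), where=out_deg > 0)

# One step of the influence model in equation (2).
e = rng.normal(scale=np.sqrt(0.2), size=n)
Y_now = beta0 + beta1 * Y_prev + beta2 * exposure + beta3 * X_prev + c + e
```

The two lines that build `exposure` are also all that is needed to construct the exposure regressor from observed longitudinal network data.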
One might argue that there are also contemporaneous social effects that should be included, which constitute the true "structural model" (Sims, 1980; Bramoullé et al., 2009). Even if this is true, however, the identification of contemporaneous effects often requires strong structural restrictions or valid instrumental variables (Sims, 1980; Manski, 1993; Wooldridge, 2010), and including both contemporaneous and lagged effects when identifying contagion effects can cause problems both in estimation and interpretation (Lyons, 2011; VanderWeele and An, 2011; VanderWeele et al., 2012). As Sims (1980) has argued for vector-autoregressive models, the type of "reduced form" models in equation 2 do not require "too many incredible restrictions" for identification, and are still very useful in forecasting and analysis.

For the selection process, let Z_{ijt} = 1 if there is a connection from node i to node j at time t, and let Z*_{ijt} be a latent variable defined as

Z*_{ijt} = \gamma_0 + \gamma_1 Z_{ijt-1} + \gamma_2 |c_i - c_j| + \gamma_3 |Y_{it-1} - Y_{jt-1}| + \gamma_4 |X_{it-1} - X_{jt-1}| + \varepsilon_{ijt},   (3)

where c represents a time-invariant unobserved trait for i and j, Y represents the behavior of interest, X represents the exogenous variables, and \varepsilon_{ijt} ~ N(0,1). By defining Z_{ijt} as

Z_{ijt} = 1 if Z*_{ijt} > 0, and Z_{ijt} = 0 otherwise,   (4)

we know that Z_{ijt} follows a standard probit model (Wooldridge, 2010), where

P(Z_{ijt} = 1) = \Phi(\gamma_0 + \gamma_1 Z_{ijt-1} + \gamma_2 |c_i - c_j| + \gamma_3 |Y_{it-1} - Y_{jt-1}| + \gamma_4 |X_{it-1} - X_{jt-1}|).   (5)

2 From now on I will use the term "contagion effects" to represent \beta_2, and "network exposure term" to represent \sum_j Z_{ijt-1} Y_{jt-1} / \sum_j Z_{ijt-1}.

The models described in equations (2) and (5) are now called simply models 2 and 5. Model 5 represents the selection model. Through models 2 and 5 we have now described the co-evolution of the influence and selection processes, which operate through the same sets of observed and unobserved variables. The magnitude of contagion effects is represented by \beta_2 in model 2.

To understand where any bias comes from, recall that in order to get consistent estimates in model 2 using OLS, one key assumption is that the unobserved errors have to be uncorrelated with the observed variables. In this case, if either the idiosyncratic error e_{it} or the latent trait c_i is correlated with observed variables, we will have biased estimates. For now we focus only on the latent trait c_i and assume that e_{it} does not correlate with observed variables in model 2. (Different exogeneity assumptions must hold for different estimation methods; for more details see Wooldridge (2010).) We already know that c_i correlates with Y_{it-1} in model 2 by design. Furthermore, if \gamma_2 < 0 in model 5, such that there is homophily-based selection operating through the latent trait, then (i) person i will select person j with a similar latent trait; (ii) person j's behavior is a function of person j's latent trait c_j, which is similar to c_i through selection; and (iii) together, c_i will be correlated with person j's behavior, which is analogous to the exposure term in model 2. As c is unobserved, this violates the key assumption of OLS, so that estimates in model 2 will be inconsistent, and the contagion (exposure) effect is unidentifiable.
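The selection side can be sketched the same way. The snippet below draws ties from the probit model in equation (5); the gamma values are again illustrative (anything not stated in the dissertation is an assumption here), with negative coefficients on the distance terms producing homophily.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 40
g0, g1, g2, g3, g4 = -1.0, 1.0, -0.5, -0.3, -0.3    # illustrative gammas; negative distance terms = homophily

Z_prev = (rng.random((n, n)) < 0.2).astype(float)   # ties at time t-1
np.fill_diagonal(Z_prev, 0.0)
c = rng.normal(size=n)                              # latent trait
Y_prev = rng.normal(size=n)                         # behavior at t-1
X_prev = rng.normal(size=n)                         # covariate at t-1

def absdiff(v):
    """Matrix of pairwise absolute differences |v_i - v_j|."""
    return np.abs(v[:, None] - v[None, :])

# Linear index of the latent tie propensity Z*_ijt in equation (3).
lin = (g0 + g1 * Z_prev + g2 * absdiff(c)
       + g3 * absdiff(Y_prev) + g4 * absdiff(X_prev))

# Equation (5): P(Z_ijt = 1) = Phi(linear index); draw the new adjacency matrix.
Z_now = (rng.random((n, n)) < norm.cdf(lin)).astype(float)
np.fill_diagonal(Z_now, 0.0)
```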
6 To give an example, assuming that delinquency is a function of an unobserved risk-taking tendency (arrow Bi in the figure below), and when there is homophily based selection which operates through this unobserved variable, (i) person i will select person j who is similar on the unobserved risk-taking tendency (arrow A in the figure) ; (ii) person j™s delinquency behavior is a function of person j™s risk-taking tendency (arrow Bj), which is similar to person i™s risk-taking tendency through selection; and (iii) because of (i) and (ii) the risk-taking tendency for person i will be correlated with person j™s delinquency behavior (arrow C in the figure). As the risk-taking tendency is unobserved, this violates the key assumption of OLS, so that estimates may be inconsistent, and the contagion (exposure) effect is unidentifiable. For an analogous algebraic argument see appendix 3. Figure 1: Omitted variable bias Hence through a regression framework we have explained what it really means by stating that selection (homophily) is confounded with influence, and as we can see, this can directly translate into an fiomitted variable biasfl problem, under which there are some omitted variables that we do not control for, but which affect both selection 7 and behavioral outcomes3. So instead of stating that contagion effects are unidentifiable because selection operates at the same time, a more meaningful question to ask might be fiwhat factors in the selection process might also affect behavioral outcomes?fl And as long as we have that variable controlled in model 2, we should not be worried about selection (homophily) being confounded with influence any more. Note that the intention of this framework is to shift the discussion of contagion effects towards being more theory-based instead of methods-based, and by grounding the discussion in theory this framework allows us to devise clear and testable alternative hypotheses, which is the key to making strong inference in any field of science (Platt, 1964). One might attempt to use models that can model influence and selection at the same time (SIENA for example) to separate influence from selection, however as Steglich (2010) pointed out, such models still will not work when finon-observed variables co-determine the probabilities of change in network and/or behaviorfl. One might borrow from the causal inference literature (Rosenbaum and Rubin, 1983) and use methods such as propensity matching (Aral et al., 2009), but that still does not deal with unobserved variable problems as strong ignorability assumes observed variables carry all the dependency between outcomes and treatment assignments (contagion or network exposure). Till now, unobserved variables seem to create problems which are impossible to overcome, which will cause bias in the estimation of contagion effects in most cases. However, the magnitude of bias in various situations has not been fully explored. So one of our research aims is to ascertain the severity of bias in the estimation of contagion effects in various situations. Furthermore, there are still situations where contagion effects can be identified using simple OLS. Next, we will illustrate situations where contagion effects can be identified, and give some specific examples. 3 Note that unobserved variables, that only affect behavioral outcomes but not selection, may cause estimation problems as well, but that is not the focus here. 8 ii. 
Situations where the contagion effect is identifiable using OLS Our previous illustration seems to suggest that the un-identifiability of contagion effects due to the presence of an unobserved latent trait is a problem which is impossible to overcome, however there are still some situations, described as follows, where we can get consistent (or adequate) estimates of contagion effects just using OLS. (i) A latent trait only exists in the selection process, not in the influence process. That means that variables affecting the behavioral outcome of interest are all observed. For example if all factors (family structure, parental control, skills deficit etc.) that affect an adolescent™s delinquency behavior are observed, and the unobserved risk-taking tendency only affects how the adolescent chooses friends but does not directly affect behavioral outcomes, then in this case the unobserved factors are no longer correlated with observed variables in the influence model, so that a contagion effect can be identified. Note that in this case there still could be strong homophily in the selection process, and homophily can still depend on the unobserved risk-taking tendency; but in this case selection is no longer confounded with influence, since unobserved factors that affect selection do not affect influence. (ii) There is still a common trait that codetermines selection and influence, but the common trait is observed. For example, if we have a psychologically sound measure for the risk-taking tendency that affects both delinquency behavior as well as the adolescent™s choice of friends, then there are not any unobserved factors that are correlated with observed variables, so that the contagion effect is identified. In this case, there still could be strong homophily in the selection process, and homophily depends on the same trait (risk-taking tendency) that appears in the influence model. But that will not affect our estimation as we have controlled these dependencies by 9 controlling for what is common in selection and influence. (iii) The latent trait is still unobserved in the behavioral model, but we can find good proxies for the latent trait. For example if we find behavioral problems at youth is a good proxy for a risk-taking tendency, in this case we can replace risk-taking tendency by behavioral problems at youth, and we can still get good estimates of the contagion effect, regardless of what types of homophily exist in the selection process. (iv) In one very special case, the latent risk-taking tendency is still unobserved in a behavior model, but networks do not endure, for example adolescents constantly rewire networks and randomly choose new friends. In this case, it is possible that the exposure term in the influence model 2 is not correlated with the prior term or latent trait, and can be consistently estimated, even though estimates for the prior term may be inconsistent, as they have to correlate with the unobserved risk-taking tendency by design. But this case is barely interesting or realistic (perhaps possible in an experimental setting or in a scenario where adolescents meet each other for the first time and constantly change interaction partners during the first couple weeks) so it will not be discussed further here. 
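Cases (i)-(iii) share one logic: once whatever co-determines selection and influence is measured (or proxied), ordinary least squares can recover the contagion effect. The sketch below, which anticipates the simulation example in the next section, co-evolves behavior and ties with homophilous selection on an observed trait X and then fits the influence model by pooled least squares with X in the design matrix. All coefficient values are illustrative (the coefficient on X is fixed at 1 here); dropping the X column from `design` reintroduces the omitted variable bias discussed above.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n, T = 40, 4
b1, b2 = 0.45, 0.45              # true prior and contagion effects (illustrative)
X = rng.normal(size=n)           # observed time-invariant trait driving homophily
Y = rng.normal(size=n)           # initial behavior
Z = (rng.random((n, n)) < 0.2).astype(float)
np.fill_diagonal(Z, 0.0)

rows = []                        # columns: Y_it, Y_it-1, exposure_it-1, X_i
for _ in range(T):
    deg = Z.sum(axis=1)
    expo = np.divide(Z @ Y, deg, out=np.zeros(n), where=deg > 0)
    Y_new = b1 * Y + b2 * expo + X + rng.normal(scale=np.sqrt(0.2), size=n)
    rows.append(np.column_stack([Y_new, Y, expo, X]))
    # Homophilous probit selection on the *observed* trait X.
    p = norm.cdf(0.6 * Z - 0.5 * np.abs(X[:, None] - X[None, :]))
    Z = (rng.random((n, n)) < p).astype(float)
    np.fill_diagonal(Z, 0.0)
    Y = Y_new

data = np.vstack(rows)
design = np.column_stack([np.ones(len(data)), data[:, 1:]])   # intercept, prior, exposure, X
coef, *_ = np.linalg.lstsq(design, data[:, 0], rcond=None)
print("prior, exposure, X:", coef[1:])   # with X controlled, estimates should land roughly near the true values
```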
Note that even if there is no homophilous selection in networks, the exposure term in influence model 2 will still correlate with the prior term through the influence process, as long as the network is stable (even in the weakest sense), and this will cause an identification problem. This is different from Shalizi and Thomas (2011), who state that contagion effects are identifiable when any latent trait "that influences the social tie formation is kept from being latent".

Figure 2: Cases when contagion effects are identified

III. Monte-Carlo Simulation

i. Simulation example

Next we give simulated examples illustrating situations where contagion effects are identified when we observe every variable affecting the behavioral outcome of interest, even with strong homophily in the selection process. First we simulate a data set where there is a common trait codetermining selection and influence, but the common trait is observed. In this case there is strong homophily in network selection, but all variables affecting behavioral outcomes are observed. Specifically, let the influence and selection models be:

Y_{it} = \beta_1 Y_{it-1} + \beta_2 \frac{\sum_j Z_{ijt-1} Y_{jt-1}}{\sum_j Z_{ijt-1}} + X_i + e_{it},   (6)

P(Z_{ijt} = 1) = \Phi(0.6 Z_{ijt-1} - 0.5 |X_i - X_j|).   (7)

All variables are defined as in models 2 and 5, except that X here is an observed time-invariant trait for nodes, and follows a N(0,1) distribution. We start from a random network and at each time point we let nodes update their behavior and networks according to models 6 and 7. Then we take data from 4 time points and estimate model 6 using OLS. Other configurations include: node size = 40, density = 0 = -1 = 2 = -0.5 (from this configuration we will have stable networks with strong homophily effects based on X), e_{it} ~ N(0, 0.2), and we vary \beta_1 and \beta_2 subject to \beta_1 + \beta_2 = 0.9. By setting up the simulation this way we only change the dynamics of the relationship, and at the same time do not affect the equilibrium of the system (Kiviet, 1995).

Figure 3: Simulation example where contagion effects are identified

In our simulated example, homophily dominates and the correlations between the prior, the exposure term, and X (the time-invariant trait) are as follows: Corr(prior, exposure) = 0.81, Corr(prior, X) = 0.97, Corr(exposure, X) = 0.8. Figure 3 above shows the mean bias for estimating the prior term and the exposure term in model 6. As we can see, there is practically no bias in estimation using OLS after we control for the time-invariant trait, despite the fact that there is strong homophily in our data and the correlation between prior and exposure is as high as 0.8. Furthermore, note that OLS recovered the true parameters even though the selection process was accounted for but not directly modeled (as it would be in, e.g., SIENA). This is because bias is induced by unobserved confounding variables, not by information that is accounted for in the model (such as that captured in X and Y_{it-1}).

ii. Magnitude of bias in different situations

In this section we investigate the bias of OLS estimates of model 2 when we ignore the latent trait. Specifically, we are interested in the magnitude of bias in different situations. For simplicity we do not include other observed variables X, and we let the "true" influence model be:

Y_{it} = \beta_1 Y_{it-1} + \beta_2 \frac{\sum_j Z_{ijt-1} Y_{jt-1}}{\sum_j Z_{ijt-1}} + \beta_3 c_i + e_{it}.   (8)

Correspondingly, the "true" selection model can be represented as

P(Z_{ijt} = 1) = \Phi(\gamma_0 + \gamma_1 |c_i - c_j|).   (9)

Simulation configuration. While there are many factors that could affect the magnitude of the bias, such as density, the magnitude of \beta_3, the variance of the idiosyncratic error, etc., we focus on the following: (1) sample size.
As N and T are important in panel data, which usually have large N and small T, we will focus on the number of nodes and the number of time points. 13 (2) level of homophily. As homophily operates through the latent trait in our simulated data, higher homophily means higher correlation between network exposure and the latent trait (also higher correlation between network exposure and prior). We are interested in how homophily affects the magnitude of bias. (3) magnitude of true coefficients for prior and network exposure, as different levels of influence might affect the magnitude of bias in estimation. Specifically, let a simulation configuration be as follows: (1) we vary the number of time points to be 2 or 5; (2) we vary the numbers of nodes to be 40 or 80; (3) we vary the homophily level to be (i) no homophily, where the random network that does not change over time, (ii) low homophily, where correlation between Prior and Exposure is 0 = - 1 = - 0.15 for N = 0 = - 1 = - 0.1 for N = 80. Note that 1 to control the 0 to control the overall density); (iii) high homophily: correlation between Prior and Exposure to be around 0.4, like some examples we found in empirical data (Penual et al., 2012; Venkatesh et al., 2000) 0 = - 1 = - 0.45 for N = 0 = -1 = -0.3 for N = 80); and (iv) we 1 2 1 2 =0.9. This is the same as before, only changing the dynamics of the relationship, not at the same time the equilibrium of the system (Kiviet, 1995). In each configuration we start from a random network and simulate based on models in (8) and (9), and estimate the model 111121ijtjtitititijtZYYYeZ , (10) finding 1 2. Other model configurations 3=0.1 (to keep a consensus within the intial range for ci (Friedkin, 1999)), and 2~(0,0.2)iteN. In each network we kept the average out-degree for nodes to be the same (~8).4 4 We kept nodes™ average out-degree to be the same instead of the density of the network, since bias for contagion effects will be larger for networks having more nodes, given that the density is the same. One possible reason is that the exposure term will have smaller variance when the network has 14 Figure 4: Magnitude of bias for prior more nodes. Keeping the density the same when network size is large means the average degree is actually much higher, resulting in smaller variance in exposure. That will cause more bias in estimation, given that all other correlations are the same. 15 Figure 4 (cont™d) 16 Figure 5: Magnitude of bias for exposure 17 Figure 5 (cont™d) 18 Corprior,unobs Corprior,Expo CorExpo,unobs High homophily(N=40,T =5) .71 .38 .53 High homophily(N=80,T =5) .71 .38 .53 High homophily(N=40,T =2) .71 .37 .54 High homophily(N=80,T =2) .71 .38 .54 Low homophily(N=40,T =5) .65 .1 .17 Low homophily(N=80,T =5) .65 .1 .15 Low homophily(N=40,T =2) .65 .08 .17 Low homophily(N=80,T =2) .65 .09 .15 Table 1: Experimental condition 19 Table 1 (cont™d) No homophily (N=40,T=5) .61 .06 -.04 No homophily (N=80,T=5) .61 .1 -.02 No homophily (N=40,T=2) .62 .05 -.04 No homophily (N=80,T=2) .62 .09 -.02 The Mean Biases for the prior term and the network exposure term under various conditions are shown in Figures 4 and 5 respectively. Mean correlations between the prior, unobserved trait and network exposure are shown in Table 1. 
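The bias figures reported here come from replicated simulations; a stripped-down version of that machinery is sketched below, with hypothetical coefficient values in the spirit of models 8-10. The latent trait c drives both selection and behavior but is omitted from the estimated regression, and the mean deviation of the estimates from the true values is recorded across replications.

```python
import numpy as np
from scipy.stats import norm

def replicate(seed, n=40, T=5, b1=0.45, b2=0.45):
    rng = np.random.default_rng(seed)
    c = rng.normal(size=n)                    # unobserved trait in models 8 and 9
    Y = rng.normal(size=n)
    Z = (rng.random((n, n)) < 0.2).astype(float)
    np.fill_diagonal(Z, 0.0)
    rows = []
    for _ in range(T):
        # Homophilous selection on the *unobserved* trait (model 9), illustrative gammas.
        p = norm.cdf(-0.45 - 0.45 * np.abs(c[:, None] - c[None, :]))
        Z = (rng.random((n, n)) < p).astype(float)
        np.fill_diagonal(Z, 0.0)
        deg = Z.sum(axis=1)
        expo = np.divide(Z @ Y, deg, out=np.zeros(n), where=deg > 0)
        Y_new = b1 * Y + b2 * expo + 0.1 * c + rng.normal(scale=np.sqrt(0.2), size=n)
        rows.append(np.column_stack([Y_new, Y, expo]))
        Y = Y_new
    d = np.vstack(rows)
    # Estimated model omits c (as in model 10).
    design = np.column_stack([np.ones(len(d)), d[:, 1], d[:, 2]])
    est, *_ = np.linalg.lstsq(design, d[:, 0], rcond=None)
    return est[1] - b1, est[2] - b2

bias = np.mean([replicate(s) for s in range(100)], axis=0)
print("mean bias (prior, exposure):", bias)
```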
There are several things to note concerning the prior term: (1) the bias for prior is generally smaller when the true coefficient is larger; and (2) the OLS estimates for the prior term are consistently upwardly biased, and the magnitude of bias does not change much with time, the number of nodes and the various selection processes, although bias seems to be a little smaller for high homophily selection processes. The bias for network exposure presents a more interesting pattern: (1) contrary to estimates of the prior term, the bias for network exposure is generally smaller when the true coefficient is smaller; (2) when homophily is present in the selection process, the network exposure is upwardly biased, and the magnitude of bias is much smaller for lower levels of homophily; (3) when the network is static, estimates for 20 network exposure are downwardly biased; and (4) the magnitude of bias is smaller when T is larger, but does not vary greatly with different N. From the results above we can see that an uncontrolled latent trait that codetermines selection and influence will indeed create biased estimates for both prior and contagion effects, possibly leading to invalid inference. However, in the low homophily case where network exposure is moderately correlated with the latent trait and the prior term, the magnitude of the bias is relatively small, especially compared with bias in estimates of the prior. It is when there is high homophily through a latent trait that the bias in estimates of contagion effects is large. One possible reason for the prior term to be consistently upwardly biased is that it has a consistently high correlation with the unobserved trait (> 0.6). There is a larger variation in the magnitude of bias for the exposure term, possibly due to its smaller variance and the larger standard error of the regression coefficient. The direction of bias shows an interesting pattern: it is upward when there is latent homophily in the selection process, but downward when the network is static. We now provide some intuition for why this might be the case. Write the fitruefl influence model in matrix form as 112YYY , (11) where Y is the behavioral outcome, Y-1 represents the prior, Yrepresents network the latent trait c and the idiosyncratic error e. Then, the 2 can be represented as (see appendix A for a derivation) '1''1'22111‹‹()()()YYYYYYY . (12) As 1‹ is upwardly biased and there is a positive correlation between prior and exposure, the first part of equation (12) is always negative. Thus a) when there is homophilous selection based on a latent trait, the exposure term will be positively 21 correlated with the latent trait and '1'()YYYwill be positive, and as a result 2‹will be upwardly biased; and b) when the network is static, since there is a zero or a slightly negative correlation between exposure and the latent trait, the second term '1'()YYYis almost 0 and negligible. Together, the right-hand side of equation (12) will be negative and thus 2‹will be downwardly biased. For more details see the technical Appendix 1. Overall, our results have several implications: (1) the magnitude of bias for network exposure is smaller when the true coefficient is smaller, but the magnitude of bias increases as the true coefficient of network exposure increases. One possible reason is that when contagion dominates the system, nodes become homogenous and there is smaller variance in the nodes™ outcomes and hence more instability. 
(2) If there are unobserved variables codetermining selection and influence, we will observe a unneglectable bias in estimating network exposure when CorExpo,unobs is bigger than 0.2. But note that bias is only due to the unobserved variables, not the observed ones, as shown in the example above (Figure 1) where we still have unbiased estimates even when there is strong homophily present, i.e. high correlation between prior and exposure. Assuming there is a strong level of homophily in our data (Corprior,Expo = 0.3~0.4 in Penual et al., 2012; Venkatesh et al., 2000; Nash et al., 2005), the strongest correlation between exposure and covariate variables can vary from .2 to .4 in some empirical examples we have seen, suggesting that we have to control at least several of the most significant predictors in order to be somewhat confident that it is less likely there are still some variables we have not controlled for that have a correlation larger than .2 with exposure term. (3) As contagion effects are upwardly biased when there is homophilous selection we have not controlled for, we should be alert for contagion effects even if our 22 model has found significant results, since we are more prone to type I error; however if there is no network selection and the network is static, contagion effects are more likely to be downwardly biased, and we can be more confident of our inference if model estimates for contagion effects are positively significant. IV. Possible Solutions: Multidimensional Priors Previous sections show that estimates of contagion effects will be biased as long as there is a latent trait codetermining influence and selection. In this section, we propose a theory-based solution that can be easily implemented under an OLS framework, which is to include multidimensional pre-tests. It is known that in quasi-experiments, multiple pre-tests should be included (Shadish et al., 2008; Steiner et al., 2010; Concato et al., 2000) to control for prior differences that can affect treatment assignment and potential outcomes. For example, in order to recover the treatment effect in a non-randomized experiment aimed at improving mathematics performance, both pretests for mathematics and vocabulary need to be controlled (Shadish et al., 2008). If we think of network exposure as one form of treatment, and since networks are usually not randomly assigned, we will face similar situations as the ones we face in quasi-experiments. And including multiple pre-tests has the potential to control for prior differences and thus reduce bias in the estimates. To illustrate more we construct our fitruefl influence model as 11011231ijtjtititiitijtZYYYceZ (13) But here we also have a variable X that correlates with ci , that is, there are some variables that do not directly affect the outcome of interest but can affect the behavioral outcome and network selection through ci (e.g. Y is smoking, X can be alcohol use for the same person, and ci can represent the inclination for substance abuse for that person, so that X is correlated with c ). Given a homophilous selection 23 process based on a latent trait, we are particularly interested in whether including X can improve our estimation of the real network exposure effect, and how the correlation between X and ci can affect our estimation. 
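One standard way to generate such an X with a chosen correlation r to the latent trait is sketched below; this is only an illustration of the construction, not necessarily the exact form used in equation (14) of the experiment specified next.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 40, 0.6                     # r = target correlation between X and the latent trait
c = rng.normal(size=n)             # latent trait c_i
X = r * c + np.sqrt(1 - r**2) * rng.normal(size=n)   # standardized construction: Corr(X, c) ~ r
print(np.corrcoef(X, c)[0, 1])
```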
Specifically, we take a high homophily example (N = 40, T = 2, density = 0.2), and let the fitruefl influence and selection model be 11211210.1*,~(0,1),~(0,0.2)ijtjtititiitiitijtZYYYcecNeNZ, and 1(1)(0.80.3||)ijtijtijPZZcc, and let X be 2*,~(0,1)itiititXrcNr (14) where r represents the correlation between X and the latent trait ci.. Then we vary correlation as 0.2,0.4,0.6,0.8, in each configuration, and 1 2 from 1 2 = 0.9; and we estimate the influence model as 11112311ijtjtititititijtZYYYXeZ . (15) Figure 6: Results with multi-dimensional prior 24 Figure 6 (cont™d) Corprior,X CorExpo,X Corunobs,X r=0.2 .13 .09 .19 r=0.4 .28 .19 .39 r=0.6 .41 .28 .59 r=0.8 .55 .38 .79 Table 2: Experimental condition 2 As the results show in Figure 6, with an increase in correlation between observed X and latent trait ci , including the prior of X as a control significantly reduces the bias, both in the actual prior term and in the network exposure term. So we conclude that 25 it is important to include multiple pre-tests to adjust for prior or unobserved differences that can affect both behavioral outcomes and network selection. V. Discussion and Conclusion In this chapter we have dealt with the identification problem of contagion effects. Specifically, we frame the difficulty of identifying contagion effects (e.g. influence confounded with selection, social context etc.) as an omitted variable bias problem. And we show that in general cases when there is a latent trait that co-determines influence and selection, methods such as SIENA and propensity score matching cannot identify contagion effects either. After that, we give situations and examples where contagion effects can be identified using traditional methods (e.g. OLS), as well as describing the magnitude of bias in different situations. And finally, we have proposed some possible remedies under an OLS framework. While omitted variable bias has been widely studied in many theoretical and empirical studies, it has not been related to entanglement between different processes in social networks until very recently (Shalizi & Thomas, 2011). And although entanglement between influence and social context can be easily framed as omitted variable bias, the fact that entanglement between influence and selection can also be framed as omitted variable bias is less obvious. And the analysis presented in this chapter is different from the general omitted variable bias problem in other fields, given that the selection process is known to be a main alternative mechanism in social network analysis. The present chapter contributes to this stream of literature by further clarifying the problem, as well as exploring the consequences of omitted variable bias in different situations. There are also many limitations for the material in this chapter. The derivation and simulation here are not intended to describe the magnitude of bias across all different 26 scenarios, but merely to show the existence of the bias, and the magnitude of the bias in the most basic setups. In future work we should also consider other alternative models or alternative forms of influence. For example, we can consider unobserved variables across multiple levels, which are commonly seen in empirical studies, and which can create estimation challenges as well. Possible remedies might include using dummies representing different settings or controlling group level means, but we leave these possibilities to future work. 
We can also consider how different forms of models will produce different results. For example, the network exposure term here represents the norm of network neighbors, but there could be other forms of influence such as imitation and learning. And we can explore how to represent processes such as preferential attachment and transitivity in the influence model, and how estimates in these models will be affected by the omitted variable problem. In an ideal world we would correctly measure all variables that can affect influence, selection, and social context, so that there will be no remaining omitted variable problem, and OLS can be applied (although a reflection problem still exists). But given the limited richness of the data, an omitted variable problem is almost inevitable in any empirical study, in which we know OLS estimates for contagion effects will be biased. So in the next chapter, we turn to some alternative estimation methods that have the potential to correctly identify contagion effects, given that there are omitted variables which either only affect influence or co-determine influence and selection. 27 CHAPTER 2: Estimation Methods I. Introduction In the previous chapter, we framed the problem of estimating contagion effects as an omitted variable bias problem. So in this chapter, we propose several methods which are inherently designed to deal with the problems of unobserved variables, and we will perform simulation experiments to investigate how well these estimators perform. II. Theoretical Framework Historically there are many approaches that deal with omitted/unobserved variable bias. Here we follow three well-known schools of thoughts: random vs fixed effects, the instrumental variables approach, and latent variable methods. We will introduce each, and explain how they can be applied in a social network context to identify contagion effects where a latent trait co-determines influence and selection. Then we will propose alternative estimation methods from each school of thought and discuss their advantages and disadvantages. i. Random vs Fixed effects The random-effects model, also known as the multilevel model or HLM, is widely used in the field of social science, especially education (Bryk & Raudenbush, 2002). The model assumes there is an unobserved constant for each unit in which observations are nested (students nested in classrooms for example). And if the network data is panel data with time nested within individuals, a random-effects model seems to be a reasonable way to estimate the influence model 2, as it deals with unobserved effects (Schonfeld & Rindskopf, 2007). But one of the key assumptions for a random-effects model is that unobserved effects are uncorrelated with observed variables (Wooldridge, 2010), which is clearly violated here, as ci 28 correlates with the prior term in model 2 by design, and ci correlates with the exposure term when there is homophily in the selection process. As a result, random-effects estimates will be inconsistent here. Fixed-effects models are also one of the commonly used methods in the field of social science such as in economics. The starting point is the same assumption that there is unobserved between-unit heterogeneity (ci in this case), while there is no assumption that ci and observed variables are uncorrelated. Then the unobserved ci are removed from the model, and the transformed model re-estimated to get fifixed-effectsfl estimates. 
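As a concrete illustration of approach (ii) described next (the within transformation), the sketch below demeans a small synthetic panel within units and compares the fixed-effects estimate with pooled OLS; all numbers are hypothetical. This static example is the benign case: as explained below, once a lagged dependent variable is included the transformation itself induces bias (Nickell, 1981).

```python
import numpy as np

rng = np.random.default_rng(4)
n, T = 40, 5
c = rng.normal(size=n)                       # unobserved unit effects
x = rng.normal(size=(n, T)) + c[:, None]     # predictor correlated with c
y = 0.5 * x + c[:, None] + rng.normal(scale=0.3, size=(n, T))

# Within transformation: subtract each unit's time average, which removes c_i.
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)

beta_fe, *_ = np.linalg.lstsq(x_w.reshape(-1, 1), y_w.reshape(-1), rcond=None)
beta_ols, *_ = np.linalg.lstsq(np.column_stack([np.ones(n * T), x.reshape(-1)]),
                               y.reshape(-1), rcond=None)
print("fixed effects:", beta_fe[0], "pooled OLS:", beta_ols[1])   # FE ~ 0.5; pooled OLS biased upward
```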
Two common approaches to removing the unobserved heterogeneity are (i) using N-1 dummy variables to represent N units, or (ii) removing the within-unit mean from each variable. For simplicity, we exclude the variable X and represent model 2 as

Y_{it} = \beta_0 + \beta_1 Y_{it-1} + \beta_2 \bar{Y}_{it-1} + c_i + e_{it},   (16)

where \bar{Y}_{it-1} = \sum_j Z_{ijt-1} Y_{jt-1} / \sum_j Z_{ijt-1} represents the network exposure term. Let

\bar{Y}_{i\cdot} = \frac{1}{T}\sum_t Y_{it}, \quad \bar{Y}_{i\cdot,-1} = \frac{1}{T}\sum_t Y_{it-1}, \quad \bar{\bar{Y}}_{i\cdot,-1} = \frac{1}{T}\sum_t \bar{Y}_{it-1}, \quad \bar{e}_{i\cdot} = \frac{1}{T}\sum_t e_{it}.

Then, subtracting these within-unit time averages from the original model in (16), we have

\ddot{Y}_{it} = \beta_1 \ddot{Y}_{it-1} + \beta_2 \ddot{\bar{Y}}_{it-1} + \ddot{e}_{it},   (17)

where \ddot{Y}_{it} = Y_{it} - \bar{Y}_{i\cdot}, \ddot{Y}_{it-1} = Y_{it-1} - \bar{Y}_{i\cdot,-1}, etc. By using this transformation we remove the latent trait c_i, and pooled OLS estimates of the model in (17) are the "fixed-effects" estimates, which will be consistent if the regular OLS assumptions are met, such as the unobserved errors being uncorrelated with the observed variables. However, as the model in (17) is a dynamic model, which includes a lagged dependent variable as a predictor, the transformed error term will be correlated with the transformed prior term by design. Nickell (1981) shows that fixed effects estimates of the prior will be downwardly biased, and that the magnitude of the bias will be proportional to 1/T. Furthermore, the transformed network exposure term will also be estimated inconsistently, and the magnitude of the bias depends on the relationship between the transformed prior term and the transformed network exposure term. As long as network exposure is correlated with the prior term (either through influence or through homophilous selection), fixed effects estimates of contagion effects will be inconsistent.

ii. Instrumental variable methods

Instrumental variable (IV) methods are often used in situations where explanatory variables are correlated with the error terms, which can be caused by simultaneity, omitted variables, measurement error, etc. These types of methods work by identifying a set of new variables that correlate only with the endogenous explanatory variables, but not with the unobserved error terms, and thus achieve consistent estimation by "blocking out" the correlation between the endogenous variable and the unobserved errors (An, 2011; Wooldridge, 2010). There have been a few studies that used IV methods to identify contagion effects. For example, Duncan et al. (1968) used a friend's intelligence as an IV for the friend's occupational and educational aspirations. Angrist and Lang (2004) used the predicted number of transferred-in disadvantaged students to study their effects on the academic performance of students in the receiving schools. O'Malley et al. (2014) used genetic alleles as IVs to estimate peer effects on weight status. An (2015) used friends' family smoking status to estimate peer effects on smoking. However, all these IV methods require a strong theoretical argument for the validity of the instrumental variables, which is essentially untestable. We will also encounter inconsistency problems and large standard errors of the estimates when we have weak instruments or data with small sample sizes (Bound et al., 1995; Wooldridge, 2010). There are also studies exploiting structural properties of networks to identify instrumental variables. For example, Bramoullé et al. (2009) argued that if there are intransitive triads, for example i->j->k where i and k are not connected, then i's outcome can be used as an instrument for j's outcome when estimating contagion effects on k's outcome, since k is not directly influenced by i.
However, the identification of the model would require the validity of the instrument, such that i does not influence k through any alternative path, and simulations by Bramoullé et al. (2009) have shown that the quality of IV estimates also depends on specific network structural properties (precision decreases with denser networks and more complex functions of intransitivity). Given the strong assumptions required by the various instrumental variables above, we propose instead to exploit the dynamic nature of our data and our model. Specifically, as pointed out by Anderson & Hsiao (1982), under specific assumptions, past values of one's own outcomes can be used as instruments for endogenous variables in a dynamic model. To see this, we first-difference our influence model 2 to remove the unobserved effects c_i (we exclude X from the model for simplicity):

\Delta Y_{it} = \beta_1 \Delta Y_{it-1} + \beta_2 \Delta \bar{Y}_{it-1} + \Delta e_{it},   (18)

where \Delta Y_{it} = Y_{it} - Y_{it-1}, \Delta Y_{it-1} = Y_{it-1} - Y_{it-2}, \Delta \bar{Y}_{it-1} = \bar{Y}_{it-1} - \bar{Y}_{it-2}, and \Delta e_{it} = e_{it} - e_{it-1}. As in the fixed effects approach, this transformation will induce a correlation between \Delta Y_{it-1} and \Delta e_{it}, thus biasing the estimates. However, under a sequential exogeneity assumption, which states that errors (shocks) in the future are independent of past values of Y (a very reasonable assumption if the errors do not contain omitted variables and are structural/idiosyncratic), plus an assumption that the errors are serially independent, a natural instrument is the past value of Y for each time period, which will correlate with \Delta Y_{it-1} but not with \Delta e_{it}, and hence satisfies the IV assumptions. For example, in panel data with 3 time points, the instrumental variable for Y_{i2} - Y_{i1} can be Y_{i1}, as it does not correlate with e_{i3} - e_{i2}, and this will generate consistent estimates of \beta_1. To be more concrete, using the delinquency example, to model the change score of delinquency from time 2 to time 3, we can use adolescents' delinquency score at time 1 as an instrumental variable for the change score of delinquency from time 1 to time 2. Note that in our setup the transformed network exposure (contagion) term does not correlate with the transformed error term, as the effect of network exposure on outcomes in model 2 is not simultaneous, but lagged. To see this more clearly, write \Delta \bar{Y}_{it-1} = \bar{Y}_{it-1} - \bar{Y}_{it-2} and \Delta e_{it} = e_{it} - e_{it-1}; these two terms are independent, since any change in e_{it-1} will be reflected in \bar{Y}_{it} but not in \bar{Y}_{it-1} or \bar{Y}_{it-2}. In this sense, the exposure term is "exogenous", so that it can be identified without extra instrumental variables.5 However, since all past values of Y can potentially be instruments, Arellano and Bond (1991) proposed using the entire set of instruments in a generalized method of moments (GMM) procedure to improve efficiency. Specifically, let the matrix of instrumental variables for individual i be Z_i, as follows:

Z_i = \begin{pmatrix} Y_{i1} & 0 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & Y_{i1} & Y_{i2} & \cdots & 0 & \cdots & 0 \\ \vdots & & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & Y_{i1} & \cdots & Y_{i,T-2} \end{pmatrix}.   (19)6

Each column in Z represents an instrumental variable z. As we can see, each variable z is uncorrelated with the error term in the model in (18), such that E[z'e] = 0. Writing the model in (18) as Y = XB + E, we can use two-stage least squares (2SLS) estimation to achieve consistent estimates: in the first stage regress X on Z; in the second stage regress Y on the predicted value of X from the first-stage regression. Together 2SLS can be written as

5 Note that if influence is contemporaneous instead of lagged, it is also possible to use past values of the exposure terms as instrumental variables and thus achieve identification.
6 The instrumental variable matrix was constructed this way to create as many moment condition as possible, zeroes were added instead of missing values, and thus still keeping orthogonality. 32 1112‹(()()())(()()())NNNNNNSLSiiiiiiiiiiiiiiiiiiBXZZZZXXZZZZY Alternatively, we can use a GMM-IV estimator, which can be represented as 1‹‹‹(()())(()())NNNNGMMiiiiiiiiiiiiBXZWZXXZWZY, where W is a weighting matrix which is the inverse of the variance-covariance matrix of iiZE. And as we can see, the only difference between the 2SLS and GMM-IV estimators is that they use a different weighting matrix. This GMM IV approach is shown to be generally consistent and efficient as , but in empirical work the optimal number of moment conditions that should be used for estimation is not that clear (Judson and Owen, 1999; Kiviet, 1995; and Wansbeek and Bekker, 1996). And simulation by Ziliak (1997) also has shown that there could be a downward bias in GMM estimates as the number of moment conditions expands. Furthermore, it is shown that this method will also suffer from the weak-1 approaches 1 (Wooldridge 2010)7. As none of these similar methods have been applied to social network data, the performance of such estimators remains largely unknown. In this chapter, we incorporate this estimator of contagion effects and examine how well it performs in the context of social network panel data. iii. Latent variable approach Structural equation modeling (SEM) is also known as an alternative approach to deal with latent variables (Kaplan, 2007; Kline, 2011). SEM is widely used in the social sciences mainly due to its ability to isolate observational error from measurement using latent constructs (Hancock, 2003). But it can also be used to model unobserved variables in the estimation procedure. For example Barnes et al. (2000) use latent growth modeling to study the alcohol use of adolescents, with latent variables representing adolescents™ initial drinking behavior and rates of increase in alcohol 7 Note empirically that an important diagnostic test uses auto-correlation of the error terms in model in (18). By construction, errors should exhibit AR(1) behavior but not AR(2) 33 use. In a social network context, as described above, if we treat an unobserved trait that codetermines influence and selection as a latent variable, borrowing from the SEM framework we can estimate a latent variable in model 2, and then hopefully correctly identify the contagion effects. Figure 7: Dynamic model with unobserved term In a paper by Bollen & Brand (2010), a structural equation modeling based approach is discussed to estimate parameters in dynamic models with unobserved heterogeneity. Figure 7 provides a graphic depiction of their model, where Y represents the outcome of interest and X is the contemporaneous exogenous variable (subscripts indicate different time points). As can be seen in Figure 7, error variances and the coefficients for the time-varying variables across different time points are set to be equal. The latent time-i, representing unobserved heterogeneity, are allowed to correlate with both the exogenous variables and the lagged values of the outcome variable, Y. In principle, the Bollen and Brand (2010) model should provide accurate estimates of the ARDL (auto-regressive distributed lag) model with unobserved heterogeneity, since it models the unobserved heterogeneity without running into any incidental parameter problem (Lancaster, 2000). 
It allows correlations between unobserved effects and exogenous variables and a lagged dependent variable. Further, it models a dependent variable conditioned 34 on an initial observation, y1, thereby avoiding the initial condition problem (Anderson & Hsiao, 1981; Wooldridge, 2005). Unfortunately, simulation studies have not been performed to evaluate the performance of this SEM approach. Therefore, we will incorporate this method with our influence model, and use simulation to examine the performance of this method. Specifically, we represent the model in (16) as in Fig 8. Figure 8: Influence model in structural equation model Here, Yrepresents a behavioral outcome,Yrepresents network exposure, and c represents the latent trait that codetermines influence and selection. For example, Ycan represent the delinquency behavior of a focal adolescent, and Ycan represent the delinquency behavior of his/her friends, while c represents the unobserved risk-taking tendency. By setting up as in Fig 8, we follow Bollen & Brand™s (2010) framework and allow c to be estimated as a latent variable, which at the same time correlates with a lagged dependent variable as well as the network exposure term. We will obtain model estimates by maximum likelihood estimation. In principle, all previously proposed methods that have been well established in other fields are supposed to deal with various forms of unobserved heterogeneity 35 when estimating influence models, such as unobserved vertex attributes, unobserved social environmental factors, latent homophily etc. As a special case, if the unobserved variable in the influence process also co-determines the selection process (homophily based on a latent trait), then any information about this latent trait, based on the selection process, can be borrowed and used in the estimation of the influence model, and in theory this will reduce the bias in estimating contagion effects. So next we will propose an estimation procedure that borrows information from the selection process in order to estimate contagion effects in the influence model. Our approach builds on the theoretical logic of latent space models as applied to social-network data (Hoff et al., 2002). Latent space models assume that each individual has a filatent positionfl that lies in an unobserved n-dimensional social space, and the probability of interaction between any two actors depends on the latent positions of these two actors. Specifically, they take a logistic form and specify the selection model as 'log(1|,,,,)||ijijijijijoddsZccxxcc (20) Here, ijZ indicates whether there is an interaction from i to j, ijxis a vector of observed covariates (at dyadic level or node level), c indicates the latent position of i and j, and ||ijcc represents the Euclidean distance between i and j™s latent position. The parameters Maximum-Likelihood Estimation (MLE) or Markov Chain Monte Carlo (MCMC) methods, and the latent position c is estimated by Minimum Kullback-Leibler (MKL) estimates (Shortreed & Handcock, 2004). Note that if there are no covariates in model in (20), this model is similar in principle to multidimensional scaling (Kruskal, 1964), which put nodes at positions in n-dimensional space based on their network relations. As described in Hoff et al. (2002), crude estimates of individual positions from multidimensional scaling are actually used as starting values in their estimation procedure. 
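In the same spirit as those starting values, a crude set of latent positions can be obtained by classical multidimensional scaling of geodesic (shortest-path) distances. The sketch below does this with NumPy and SciPy for a random undirected graph; it assumes the graph is connected and is only meant to illustrate the idea, not to reproduce the MKL position estimates of Hoff et al. (2002).

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

rng = np.random.default_rng(5)
n = 30
A = (rng.random((n, n)) < 0.15).astype(float)
A = np.maximum(A, A.T)                        # symmetrize: treat the network as undirected
np.fill_diagonal(A, 0.0)

D = shortest_path(A, unweighted=True)         # geodesic (shortest-path) distances
assert np.isfinite(D).all(), "this sketch assumes a connected graph"

# Classical MDS: double-center the squared distances and keep the top eigenvectors.
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
vals, vecs = np.linalg.eigh(B)
order = np.argsort(vals)[::-1]
dims = 2
positions = vecs[:, order[:dims]] * np.sqrt(np.maximum(vals[order[:dims]], 0.0))
```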
As shown in Chapter 1, if there is a latent trait c co-determining the selection and influence processes, then by accounting for this latent trait as a covariate in the influence model we can achieve consistent estimation of contagion effects. Although the latent trait c is generally unobserved, from the latent space model we can produce estimated latent social positions, which operate in the selection process in the same way as the actual latent trait. These estimates of the latent positions can therefore be used as proxies for the latent trait c_i and included when estimating an influence model such as model 2, and this will in principle reduce the bias in the estimation of contagion effects that is due to the omitted variable problem (see the example in Chapter 1; Wooldridge, 2010).8 For example, to model adolescents' delinquency behavior, we can first use a latent space model to model the friendship network of adolescents and acquire an estimated "latent social position" for each individual, and then use these estimates as proxies for the unobserved risk-taking tendency in the influence model. Note that this method works because it accounts, in the influence model, for the unobserved trait that drives homophilous selection. The scale of the estimated latent social positions might be very different from that of the actual latent trait, but as long as the latent social positions are highly correlated with the actual latent trait (actors who are close to each other in latent social position are also close to each other on the latent trait), contagion effects can still be estimated consistently.

However, if the social network data are longitudinal, the latent space model described in (20) cannot produce consistent latent position estimates across different time points, as it is static in nature. Extensions of latent space models for dynamic social network data have been proposed (Sarkar & Moore, 2005), but they do not assume constant latent positions across time, which violates one of our key assumptions, namely that individuals possess time-invariant latent traits. In addition, they are difficult to implement in software. So instead we propose a two-step estimator for contagion effects (sketched below): 1. we estimate a latent space model and acquire latent position estimates for each time point; and 2. we include the estimated latent positions for all available time points as proxies for the latent trait c, and estimate our usual influence model using OLS. Specifically, letting q_t be the latent position estimate for time t, we assume that

$$c_i = \theta_0 + \sum_{t=1}^{T} \theta_t q_{it} + r_i. \qquad (21)$$

Under the assumptions that E(y | x, c, q) = E(y | x, c) and Cov(x, r) = 0 (where X represents all independent variables in the influence model), the q's can serve as valid proxies for the latent trait c and thus the contagion effect is identified. Essentially we assume that each q is an imperfect measure of c; by including all of the q's we obtain a better approximation to c, and thus better estimation of contagion effects.9

8 Note that under the same principle we would benefit from estimating influence and selection at the same time, using models such as SIENA (Snijders et al., 2010), but SIENA does not deal with the omitted variable issue.

9 As all q's are probably highly correlated, this may create a multicollinearity issue when estimating the influence model. But this will not affect the estimation of the prior and contagion effects, as all of these q's explain the unique variance represented by c under the assumption E(y | x, c, q) = E(y | x, c).
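The following is a minimal sketch of the second step of this two-step procedure, assuming that latent position estimates for each wave have already been obtained (for example, from a latent space fit at each time point) and stored as arrays. The function name, the dictionary of returned coefficients, and the input layout are illustrative assumptions of this sketch, not a specification of the actual estimation code used in the simulations.

```python
import numpy as np

def latent_space_adjusted_ols(y, y_prior, exposure, q_list):
    """Step 2 of the two-step estimator: regress the outcome on the prior,
    the network exposure, and the estimated latent positions q_t from every
    available wave (proxies for the unobserved trait c), via OLS."""
    X = np.column_stack([np.ones_like(y), y_prior, exposure] + list(q_list))
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return {"intercept": coefs[0], "prior": coefs[1], "exposure": coefs[2]}

# Illustrative usage (all inputs hypothetical):
# q_list = [q_t1, q_t2, ...]   # (N,) or (N, d) latent position estimates per wave
# estimates = latent_space_adjusted_ols(y_t2, y_t1, exposure_t1, q_list)
```

Under assumption (21), the q columns absorb the variance attributable to c, so the coefficients on the prior and the exposure are the quantities of interest; any multicollinearity among the q's affects only their own coefficients, as noted in footnote 9.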
After describing all of the proposed methods, our main research question is whether the estimation methods proposed above can correctly identify contagion effects when there are unobserved variables. Specifically, we form several hypotheses based on the literature reviewed above: (1) the performance of the estimators will be affected by the number of nodes (N) and the number of time points (T) available in the data; (2) random and fixed effects estimates of contagion effects will be more biased than the other estimates; (3) GMM-IV estimates will perform well when we have more time points and the true coefficient of the prior term is small; and (4) SEM and latent space estimates will be more robust and generally produce less bias.

III. Monte-Carlo Simulation

In this section, we use Monte-Carlo simulation to examine the performance of each estimator for contagion effects proposed above: a fixed effects estimator, a random effects estimator, a GMM-IV estimator, an SEM estimator, and a latent-space adjusted estimator. For simplicity, we do not include exogenous observed variables X, and we let the "true" influence model be

$$Y_{it} = \beta_1 Y_{i,t-1} + \beta_2 \frac{\sum_j Z_{ij,t-1} Y_{j,t-1}}{\sum_j Z_{ij,t-1}} + \beta_3 c_i + e_{it}. \qquad (22)$$

Correspondingly, let the "true" selection model be represented as

$$P(Z_{ijt} = 1) = \mathrm{logit}^{-1}\left(\gamma_0 + \gamma_1 |c_i - c_j|\right). \qquad (23)$$

Simulation configuration. While there are many factors that could affect the performance of the estimators, such as density, the magnitude of the latent-trait coefficient (beta_3), the variance of the idiosyncratic error, etc., we focus on the performance of the estimators in the following cases: (1) highly homophilous selection determined by the latent trait; (2) sample size: as N and T are important in panel data, which usually have large N and small T, we focus on the number of nodes and the number of time points; and (3) the magnitude of the true coefficients for the prior and the network exposure, as different levels of influence might affect the performance of the estimators. Specifically, a simulation configuration is as follows: (i) in each simulation we fix each agent's latent trait to be a constant drawn from a normal distribution N(0,1); (ii) we vary the number of time points to be 3 or 6; (iii) we vary the number of nodes to be 40 or 80; (iv) we keep the homophily level high, with the correlation between the prior and the exposure around 0.4 (Penuel et al., 2012; Venkatesh et al., 2000) (e.g., gamma_0 = -1, gamma_1 = -0.3 for N = 80); and (v) we vary the true coefficients beta_1 and beta_2 for the prior and the network exposure terms; as before, this only changes the dynamics of the relationship and does not affect the equilibrium of the system (Kiviet, 1995). In each configuration, we start from a random network and simulate data based on equations (22) and (23).10 Other model configurations: beta_3 = 0.1 (to keep consensus within the initial range of c_i (Friedkin, 1999)), e_it ~ N(0, 0.2^2), and density = 0.2.

10 In T = 2 cases, for IV estimation we use 2SLS with the behavioral outcomes from the first time point as instruments, which is shown to be the same as GMM-IV estimation when the model is just-identified (Wooldridge, 2002; 2010).

A minimal sketch of this data-generating process is given below.
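This sketch simulates the influence model (22) and selection model (23) together: at each wave the network is drawn from the logistic homophily model, and next-wave behavior is the prior plus the mean exposure plus the latent trait plus noise. The specific coefficient values, the initial behavior distribution, and the exact timing convention (the wave-t network drives wave-t+1 behavior) are illustrative assumptions of this sketch rather than the dissertation's exact simulation code.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(N=40, T=6, b1=0.3, b2=0.3, b3=0.1, g0=-1.0, g1=-0.3, sd_e=0.2):
    """Simulate behavior Y and networks Z under (22)-(23); values illustrative."""
    c = rng.normal(size=N)                        # time-invariant latent trait
    Y = np.empty((T, N))
    Z = np.empty((T, N, N), dtype=int)
    Y[0] = rng.normal(size=N)                     # initial behavior (assumed)
    for t in range(T):
        d = np.abs(c[:, None] - c[None, :])       # |c_i - c_j|
        P = 1.0 / (1.0 + np.exp(-(g0 + g1 * d)))  # selection probabilities (23)
        Z[t] = rng.binomial(1, P)
        np.fill_diagonal(Z[t], 0)                 # no self-ties
        if t < T - 1:
            deg = Z[t].sum(axis=1)
            expo = np.where(deg > 0, Z[t] @ Y[t] / np.maximum(deg, 1), 0.0)  # mean alters' behavior
            Y[t + 1] = b1 * Y[t] + b2 * expo + b3 * c + rng.normal(0, sd_e, N)
    return Y, Z, c
```

Each estimator from the previous section would then be applied to many replications of (Y, Z), and the average estimates of the prior and exposure coefficients compared against the true values, which is how the bias curves summarized in Figures 9 and 10 are produced.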
Figure 9: Simulation results for prior

Figure 10: Simulation results for exposure

The biases of the mean estimates for the prior and the exposure are shown in Figures 9 and 10, respectively (each point is the result of 500 simulations; for latent space modeling each point is the result of 100 simulations, due to its longer running time). For the estimates of the prior: (1) the magnitude of bias is smaller for all estimation methods when we include more time points (larger T), while it is not affected much by increasing the number of nodes (larger N); (2) random effects estimates are always positively biased and fixed effects estimates are always negatively biased, which is as expected and consistent with Nickell (1981); (3) GMM-IV estimates exhibit small bias when the true coefficient is small, but as the true coefficient of the prior increases we are more likely to encounter weak instrument problems, and the magnitude of bias thus increases (Wooldridge, 2010; Arellano and Bond, 1991); in addition, when T = 2, GMM-IV estimates are more unstable, with larger variance (we exclude several outliers in this case); (4) SEM estimates are among the least biased when T is large, but their bias is larger when T is small; and (5) latent space estimates are the least biased in all cases, out-performing the other estimates; they are stable and exhibit small bias even when T is small.

For the estimates of the network exposure term: (1) as above, the magnitude of bias is smaller for all estimation methods when we include more time points (larger T), while the bias does not decrease as we increase the number of nodes (larger N); (2) random effects estimates are always positively biased and fixed effects estimates are always negatively biased, although the fixed effects estimates for the exposure term have much smaller bias than the fixed effects estimates for the prior term, especially when T is large; (3) GMM-IV estimates overall have small bias, though they are more unstable when T is small (bias and variance are larger, and we exclude several outliers when T = 2); and (4) SEM and latent space estimates out-perform the other estimates in producing the smallest bias across all cases.

Overall, random effects and fixed effects estimates produce the largest bias among all the estimators; GMM-IV estimates sometimes have small bias but are unstable and heavily biased when T is small and the true coefficient of the prior is large; SEM and latent space estimators outperform the others in producing the smallest bias across most cases, especially for identifying contagion effects. Note, however, that SEM estimates of the prior can be severely biased when T is small, and that the latent space method is very time consuming, as it uses a simulation-based estimation method (MCMC) and requires a long burn-in (Hoff et al., 2002).

IV. Robustness Test

i. Results with covariates

We also test the robustness of the results when we include covariates in both the influence and selection models. Specifically, in the data-generating process, let the influence process be

$$Y_{it} = \beta_1 Y_{i,t-1} + \beta_2 \frac{\sum_j Z_{ij,t-1} Y_{j,t-1}}{\sum_j Z_{ij,t-1}} + \beta_3 c_i + \beta_4 X_i + e_{it}. \qquad (24)$$

Correspondingly, let the "true" selection model be represented as

$$P(Z_{ijt} = 1) = \mathrm{logit}^{-1}\left(\gamma_0 + \gamma_1 |c_i - c_j| + \gamma_2 |X_i - X_j|\right). \qquad (25)$$

All notation is the same as previously used, and X is an observed time-invariant attribute that follows a N(0,1) distribution. All simulation setups are as before, except that we include X when estimating each proposed model.

Figure 11: Results with covariates

The results above are generally consistent with our main results; that is, (1) random and fixed effects produce the largest bias, (2) the latent space adjusted approach produces the least bias across all cases, and (3) SEM and GMM-IV perform well with more time points, but perform poorly (especially in estimating the prior term) when we have a short time frame.
One thing to note is that the bias is generally smaller for each estimation method in the presence of covariates. A possible reason is that, by including an observed covariate X, the correlations between the unobserved trait and the observed variables (the outcome, the exposure term, etc.) become smaller, so the impact of the omitted variable becomes less important.

ii. Results with cluster membership

Although the latent space adjusted approach performs best across all cases, it is computationally intensive and usually requires considerable computer memory and time. Moreover, the estimated social positions are somewhat arbitrary and can easily vary from simulation to simulation, which lacks sociological meaning. We therefore try an alternative approach: using cluster membership as a proxy for the latent trait. The rationale is that, like the latent social position, actors' cluster membership can also account for the selection process, in that actors embedded in similar networks belong to the same cluster. This measure is coarser than the latent social position, but it is much more computationally efficient. The approach works as follows: 1. we use a community detection algorithm (e.g., Kliqfinder, Girvan-Newman, etc.) to find the cluster membership of each actor; and 2. we create dummy variables representing each cluster and include these dummies when estimating the influence model. We use the previous simulation setup as in our main results and test how this new estimator performs.

Figure 12: Results with cluster membership

The results show that, as with the latent space adjusted approach, the method that accounts for cluster membership is robust to different T and N. However, this method cannot eliminate much bias, and its performance is only better than that of the random effects and fixed effects estimators in most cases. A possible reason is that dummies representing an actor's cluster membership are a relatively poor proxy for the latent trait that also drives the actor's network selection.

iii. Fixed intercept

To deal with the scaling issue in the latent space adjusted approach, we test the robustness of our model by starting the simulation with a different prior distribution for the intercept in the latent space model. Specifically, in the latent space model we have

$$\log \mathrm{odds}(Z_{ijt} = 1) = \alpha_0 - |c_i - c_j|, \qquad (26)$$

where we use MCMC estimation and the prior distribution of alpha_0 follows N(0,1).

Figure 13: Results with fixed intercept

The results show that, with different intercepts, the estimates from the new latent space adjusted approach are almost identical to those from the original latent space adjusted approach. We therefore conclude that, although shifting the intercept may change the scale of the latent social positions, it does not change the correlation between the estimated latent social positions and the actual latent trait, so the new latent social positions are still valid proxies for the latent trait and will eliminate much of the bias in estimating contagion effects.

V. Discussion and Conclusion

While contagion effects have important implications for both theoretical and empirical studies, they are generally difficult to identify, as influence processes are often entangled with other processes such as selection and environmental factors. Here we show that this entanglement/difficulty can essentially be framed as an omitted variable bias problem, and that the methods currently used (e.g.
SIENA, propensity score etc) either do not deal with this problem or require strong assumptions. In this chapter, we propose several alternative estimation methods that have the potential to identify contagion effects when there are omitted variables present, and we use Monte Carlo simulation to test the performance of these estimators. Although we choose a specific form of influence model (dynamic model with mean influence), our methods have the potential to be adapted to other forms of influence models. A possible extension of the proposed latent space adjusted approach is to apply it to multilevel data. For example, if we have students nested within classrooms, most of the networks we observe will be within classrooms and there will be few ties between classrooms. And to identify contagion/peer effects across classrooms, we need to adjust our approach to reflect the network structure in the latent space. If we estimate the latent social position using all networks as one global network, and estimate the influence model using all available data, the estimated latent social position will not reflect the difference in actors™ latent trait, because people holding similar latent traits might not know each other, due to the structural constraint. For example, if node A from organization 1 hold a similar latent trait to node B from organization 2, if we estimate networks from organizations 1 and 2 as a whole, the latent space estimates of persons A and B will deviate greatly from each other as they do not have a tie, even though they are similar in terms of the latent trait. And this will give biased results in the subsequent estimation of influence. Note that their lack of relationship is due to structural constraint rather than dissimilarity in the latent trait, and by not accounting for the structural constraints in the latent space model the estimated social position will be a poor proxy for the actual latent trait. For the multilevel network, I propose to estimate the contagion effect in each network using latent space adjustment. Then we will use a meta-analysis method to 54 estimate overall contagion/peer effects. Note that although none of these methods can 100% eliminate the bias in the estimation of contagion effects, our simulation results do suggest that these methods can still significantly reduce the bias under plausible assumptions, especially the latent space adjusted approach. We have no intention in stating that we have found the cure for identifying contagion effects, since there is no universal cure, and an estimation method is only part of the solution. Furthermore, the choice of the appropriate estimation methods almost always depends on the empirical situation. Nonetheless, we believe that with plausible alternative explanations that come from good theory, carefully measured covariates from longitudinal data, and a set of appropriate estimation models, we can effectively inform the debate about the contagion effects, and move forward scientifically. 55 CHAPTER 3: Sensitivity Analysis I. Introduction and Literature Review Though much progress has been made in modeling social network data, the validity of the network observations is still relatively a blind spot in available methods for social network analysis (Steglich et al., 2010; Moffitt, 2001). 
Most or all currently available statistical analysis methods assume that network observations are perfectly accurate and fully representative, while we know that social network data are sometimes unreliable and prone to error, especially network relations (Marsden, 1990; 2005). And this lack of validity in network observations is not just a result of simple random measurement errors, but often due to systematic bias that can lead to the misinterpretation of actors™ preferences for network selections, which have a substantial impact on issues related to causal inferences. As a consequence, these misinterpretations that are manifest in observed networks could directly decrease the validity of the study and limit the inference we can draw from the data, such as those pertaining to inferences of contagion effects. In this chapter we explore a sensitivity analysis framework (Rosenbaum & Rubin, 1983; Frank et al., 2013; VanderWeele, 2011) for making inferences under the concerns of lack of validity in social network relations. First we will discuss the misinterpretation of actors™ preferences that are manifest in observed network relations, and introduce the idea of simulation-based sensitivity analysis through the rewiring of observed network relations; and then we will talk about different mechanisms for rewiring; after that we will apply our proposed methods to test the robustness of inferences for contagion effects and give specific examples; finally we will derive two sets of analytical solutions for random, homophily and anti-homophily based sensitivity analysis methods. The validity of network observations is often of concern in empirical studies, since observed networks are prone to error and may not represent the population of interest. This lack of validity is not just a result of random measurement errors, but often due to systematic bias that can lead to the misinterpretation of actors™ preferences of 56 network selections, which that are manifest in observed networks. And this misinterpretation can occur due to various reasons: (1) Observation errors. While measurement errors exist in all sorts of data, the accuracy of observations in network relations is especially of concern (Marsden, 1990). The most common self-reported measures of network relations are known to suffer from cognitive bias (Freeman et al, 1987; Feld and Carter, 2002). For example, Freeman & Romney (1987) show that peoples™ perception of social ties will be biased toward the routine, typical structure. And other studies have shown that self-reported measures of network relations often are biased towards self, group structure, balance, routine interaction etc. (Marsden, 2005). Studies by Bernard, Killworth, and Sailer (Killworth & Bernard, 1976, Bernard & Killworth, 1977, Bernard et al., 1981; 1982) showed that there are discernable differences between social ties data obtained via questionnaires, and behavioral records obtained via various methods including diaries, monitoring of radio communication, observers, or electronic monitoring. Later studies have found a higher (80%) agreement between network questions in surveys and interviews (Pitts and Spillane, 2009). However, in general, observation errors in network relations are often a mixture of both random measurement errors and systematic bias that is driven by many known or unknown mechanisms in actors™ preferences for network selection. (2) Mismatch between the frequency of interaction and functions of the network. 
Even when network observations are 100% accurate, the validity of the observations still depends on the functions of the network. For example in an information flow network, the frequency of interaction does not necessarily represent how much or how valuable certain information/resources are that flow through this tie, and weak ties are known to be more useful in terms of delivering novel information than strong ties (Granovetter, 1973). Furthermore, the observed frequency/importance of interaction is often not the same as that which actors actually perceive (Casciaro, 1998). For example, in the context of influence, actors may perceive more influence from those with whom they shared more similar interests but have less frequent interactions, 57 compared with those with whom they shared less similar interests but talk more often. This is because it is more likely that these actors share the same identity and find it easier to talk to homophilous others, and in contrast they find it more difficult and thus have to spend a longer time communicating with dissimilar others while actually conveying less information (Byrne, 1971; Mark, 1998; Carley, 1991). As a result, the frequency of interaction may not be the best representation of an actor™s perception of the importance of their alters in terms of influence. (3) Sampling bias. The observed networks may not represent the population of interest. And this can occur both on the network level and actors™ level. On a network level, the observed network can be seen as one realization from a set of possible networks that are generated by the same underlying stochastic process (Robins et al., 2007). It is possible that the one realization we observe does not represent the actual underlying preferences of actors in network selection. At the actors™ level, if the actors in the study sample have preferences for network selection which are different to the population of interest, the observed network will be biased and cannot represent the population network of interest as well. Due to these various reasons, by using the observed networks in the analysis, actors™ actual preferences for interactions are often misinterpreted. And this misinterpretation will have a direct impact on issues related to causal inference, such as internal validity Œ whether the observed relationship is confounded by the unobserved mechanisms that drive the network selection, or external validity Œ whether the observed network best represents the population of interest. And with all these issues, analyses based on observed social network data are subject to unobserved bias and we should be cautious when drawing inferences from such data. The purpose of this chapter is not to propose methods that can reduce the bias or the errors in network observations, rather, we follow a sensitivity analysis framework and investigate the extent to which the validity of network relations can essentially affect our results or invalidate our inference. Following Frank et al. (2013), we propose that instead of stating that inference drawn from the study is invalid because of 58 unobserved errors/bias in network relations, one should really ask how much bias (and what kind) must have been in network relations to invalidate the inference. 
We focus on sensitivity analyses of network relations for several reasons: (1) While a lot of work has been done on sensitivity analysis dealing with unobserved variables (Rosenbaum & Rubin, 1982; Frank, 2000; Pan & Frank, 2004; VanderWeele, 2011), few have focused on social network data, especially the misinterpretation of selection mechanisms that are manifest in observed networks. (2) Sensitivity analysis of the errors/bias in network relations helps to frame external validity issues in social network studies. As network data usually contains the whole population of interest, external validity issues are rarely of concern. But as the observed network can also be treated as one realization from a set of possible networks that are generated by the same underlying stochastic process (Robins et al., 2007), a natural question to ask is to what extent the observed network can represent the underlying stochastic process. So sensitivity analysis helps to frame the validity of network observations into an external validity issue. While the sampling bias of observed networks can also be translated into a sample replacement problem (Frank et al., 2013), social network data usually has unique characteristics like non-random sampling and non-independent observations, also known as network auto-correlation (Manski, 1993; Doreian, 1989); and this poses additional challenges for sensitivity analysis in social network data and calls for alternative methods that account for unique features of network relations. (3) While some forms of observation errors/bias in network relations (missing data for example) and their impact on network outcomes have been studied (Robins et al., 2004; Kossinets, 2006), the impact of many other forms of errors/bias in network relations are rarely considered and largely unknown. Thus we contribute to the literature by exploring various mechanisms that can generate observation errors in network relations, and their impact on outcomes in a sensitivity analysis framework. 59 II. Theoretical Framework i. Sensitivity analysis through rewiring networks The sensitivity analysis we propose is a simulation-based approach that operates through the rewiring of a currently observed network. Our basic model assumes that in an observed network, actors control their out-degrees and can rewire their ties based on various mechanisms given the current network, and each actor preserves his/her number of out-degrees as constant. In this way, we assume errors/bias in network relations only reside in with whom actors interact, not how many people they interact with. Assuming there is errors/bias in network relations, we can explore the magnitude of errors/bias through rewiring, and assess the extent to which errors/bias in network structure can bias our estimates and ultimately alter our inference. And more importantly, this method allows us to ground analysis in theory and to test specific forms of error/bias existing in network relations as represented in actors™ preferences of network selection. This is different from previous research in which errors are assumed to be random, or ties are rewired at random (e.g. the QAP test (Krackhardt, 1987)). 
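To make the basic rewiring operation concrete, the following is a minimal sketch of a single degree-preserving random rewiring step: a share of observed ties is selected, and each is moved to a new receiver while its sender (and hence each actor's out-degree) is kept fixed, after which the model of interest would be re-estimated. The function name, the adjacency-matrix representation, and the candidate-selection rule are illustrative assumptions of this sketch, not the dissertation's actual simulation code.

```python
import numpy as np

rng = np.random.default_rng(7)

def rewire_random(Z, share):
    """Randomly rewire `share` of the ties in directed binary adjacency matrix Z,
    preserving each actor's out-degree: a rewired tie keeps its sender i but is
    moved to a receiver j that i is not currently tied to."""
    Z = Z.copy()
    senders, receivers = np.nonzero(Z)
    n_rewire = int(round(share * len(senders)))
    picks = rng.choice(len(senders), size=n_rewire, replace=False)
    for k in picks:
        i, j_old = senders[k], receivers[k]
        candidates = np.where((Z[i] == 0) & (np.arange(Z.shape[0]) != i))[0]
        if len(candidates) == 0:
            continue                      # no free receiver left for this sender
        j_new = rng.choice(candidates)
        Z[i, j_old] = 0
        Z[i, j_new] = 1
    return Z

# Usage: rewire, recompute the exposure term, re-estimate the influence model,
# repeat many times, and compare the average estimate with the inference threshold.
# Z_rewired = rewire_random(Z_obs, share=0.30)
```

The mechanism-specific versions discussed next differ only in how the new receiver j_new is chosen (most similar attribute, most dissimilar attribute, most shared friends, reciprocation, or highest in-degree) rather than at random.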
Through this framework we want to contribute to the discussion of validity in studies of social network analysis and shift the attention to be more magnitude-based and theory-based, and this framework enables us to devise clear and testable alternative hypotheses when making such inferences, which is the key to making strong inference in any field of science (Platt 1964). Essentially we are asking (1) fiwhat percentage of network relations have to be rewired to invalidate current inferencefl and (2) fiwhat forms of errors/bias must exist in network relations to invalidate current inferencefl. 11 ii. Mechanisms for rewiring While there are many potential mechanisms that drive interaction between actors 11 In principle our proposed methods are analogous to a community detection algorithm (Girvan & Newman 2002) that identifies edges with high betweeness (Freeman 1977) that need to be removed to create separable components in the graph. 60 (thus also errors in observed networks), we follow a long tradition of fistructure versus agencyfl (Emirbayer and Goodwin, 1994), or as Mayhew (1980) called fiindividualisticfl and fistructuralistfl views of the world. Structure, representing the social-organizational structure in which actors are embedded, limits the choices and opportunities available. Agency represents the capacity of actors to act freely, based on their own preferences and intentions. Studies have found that both can play an important part in shaping humans™ interactions, sometimes reinforcing each other (Kossinets & Watts, 2009). In our proposed methods we include some widely studied factors from both views of the world. However, for the purpose of sensitivity analysis we also want to separate each mechanism and explore how our inferences are sensitive to the specific form of errors/bias resulting from each mechanism. Thus in our sensitivity analysis we only rewire an observed network based on one mechanism at a time. Next, we will introduce six mechanisms that can possibly bias our network observations, and we categorize them into either agency or structure. Specifically for Agency we have: random, homophily, anti-homophily; for Structure we have: reciprocity, transitivity, and preferential attachment. For Agency rewiring, first we have random rewiring. In this case, we assume there are random measurement errors in our observed network. As is similar to the QAP test (Krackhardt, 1987) we rewire network ties randomly among nodes, but we preserve nodes™ out-degree, and the purpose is not to simulate the distribution of estimates in a random network, but to assess the extent to which our inference is robust to random errors in networks. Each time we rewire a certain percentage of observed ties randomly, and re-estimate our model of interest. We repeat this many times to get an average estimate, and compare with a pre-set threshold for inference. For example, if we assume 30% of our network observations are due to random error, we would randomly rewire 30% of observed ties and compare our average estimates with a pre-set threshold to decide if our inference is altered. Next we have homophily rewiring. Homophily, or fibirds of a feather flock togetherfl, refers to a pervasive phenomenon that people tend to seek similar others for 61 interaction (McPherson et al., 2001). It is an important network-generating mechanism that sometimes produces clustered networks or segregation (Schelling, 1971). 
Here particularly we focus on the agency of actors and refer to this type of behavior as a result of fichoice homophilyfl as noted in Kossinets & Watts (2009), which attribute the choice of similar other as results of individual, psychological preferences. This is different from fiinduced homophilyfl where the choice of similar others is a consequence of the homogeneity of structural opportunities for interaction, as in neighborhoods, schools, workplaces and friendship circles (Feld, 1981). Thus in our model, as actors rewire their ties, they tend to choose other actors who are most similar to themselves without being subject to structural constraints. And here, homophily can be broadly defined to be based on various attributes available in the observed data. For example, in a study of contagion effects examining best friends™ smoking behavior on actors™ smoking behavior, if we suspect that networks of interest are more homophilous based on smoking behavior than observed, we can rewire a certain percentage of observed ties based on homophily. That is, we rewire a certain percentage of observed ties to connect actors with those of most similar smoking behavior who are not previously connected, re-estimate our model and compare average estimates with a pre-set threshold to decide if our inference is altered. For agency we also have anti-homophily rewiring. Given the importance of homophily, it would make sense to consider the opposite of homophily for both practical and theoretical reasons. Practically, given the predominant evidence that homophily exists in networks, a natural question to ask is fiwhat if the observed network is too homophilous?fl, or how to account for errors/bias that occurs at the opposite direction to homophily. Theoretically, anti-homophily, or fiheterophilyfl in a broader sense, reflects the tendency of people seeking to interact with dissimilar or diverse others. There is agency in heterophily as heterophilous ties are mostly formed voluntarily (Rivera et al., 2010), and they are found to be more and more common over time in situations such as team building and scientific collaboration (Moody, 2004; Page, 2007). Thus it would make sense to include anti-homophily as an 62 alternative network-generating mechanism and possible source of bias. In our model, if we need to assess how our inference is robust to fianti-homophilyfl, we would rewire a certain percentage of observed ties to connect actors with the most dissimilar attributes of concern, then re-estimate our model and compare average estimates with a pre-set threshold to decide if our inference is altered. Next we turn to a set of structural mechanisms that account for errors/bias in networks. The first is transitivity, or fitriadic closurefl, which refers to the phenomenon that people tend to become friends with the friends of their friends (Rivera et al., 2010). This is found to be true across various social settings such as corporate board members (Davis et al., 2003), Hollywood movie actors (Watts, 1999), Broadway musical artists (Uzzi & Spiro, 2005), inventors (Fleming et al., 2007), scientists (Newman, 2001b) etc. While there are various motivations for transitivity (increased encounter opportunities (Granovetter, 1973), decreased risk and uncertainty (Burt & Knez, 1995), it has important implications for network structure. 
For example, Jin, Girvan & Newman (2001) found that with higher probability to meet if a pair has more mutual friends, the resulting network exhibits high levels of clustering and strong community structure. Thus we include transitivity as an important source of bias, and rewire networks based on shared numbers of friends, and we update the graph sequentially. For example, if we were to rewire certain of observed ties based on transitivity, we will create an order list for the ties to be rewired, then we rewire the first tie to the alter node who is not connected to the ego in the current graph but shares most common friends12 with ego. Then we update the graph and recalculate the network measures (the number of common friends shared by each pair in the updated graph), and do the same thing for the second tie to be rewired, so on and so forth. In this way, the order of movement 12 Our network is directed, but we define common friends as in a undirected graph to capture various definitions of transitivity or triads. For details see Davis and Leinhardt (1972) on triad censuses, or Wasserman and Faust (1994, p 243) , Robins et al (2007) 63 matters since actors are more Markovian driven, and at the same time it explores the whole space in terms of simulation results. As an illustration, consider Fig 14a as our original network, which is a random network with N = 50 and density = 0.1. Fig 14b is one example of the resulting network if we rewire 100% of the ties based on transitivity, which contains many more triads and exhibits a community structure. Reciprocity represents bi-directional connections (if i selects j, j will also select i) in a directed network. It has been found in friendship networks among students in various grade levels (Runger & Wasserman, 1980; Mollica et al., 2003). Possible reasons for the occurrence of reciprocity include people tending to like others who like them (Newcomb, 1956; Backman & Secord, 1959; Sprecher, 1998; Montoya & Insko, 2008), and reciprocation relative to a first advance of friendship decreases the chance of being rebuffed (Goffman, 1963). Reciprocity also has important implications for network structure such as stabilizing networks over time (Rivera et al., 2010). In our model, we include reciprocity as an alternative mechanism for rewiring, following similar sequential steps as in the transitivity case, except that actors will rewire their out-going ties to create more reciprocated ties. Fig 14c is one example of the network if we rewire 100% of the ties of the network in Fig 14a based on reciprocity, thus creating many more reciprocated relations. Finally, we consider preferential attachment. Preferential attachment states that Social connections tend to accrue to those who already have them, also known as firich get richerfl or the fiMatthew Effectfl (Merton, 1968), and the main reason driving this mechanism could be that people use others™ degree as a proxy for their own fitness, status, power etc. Empirical and simulation results suggest that preferential attachment can generate a core-periphery structure or power-law degree distribution in networks (Barabási & Albert, 1999; Newman, 2001), which is found in many settings such as online friendship networks, scientific collaborations, sexual contact networks, etc. (Golder et al., 2006; Moody, 2004; Newman, 2001; Liljeros et al., 2001). For the reasons above we include preferential attachment as a possible mechanism for rewiring, that accounts for errors/bias in networks. 
The steps we use to 64 rewire are the same as for previous mechanisms, except that actors will now rewire to others who possess a higher in-degree for the current graph. Fig 14d is one example of the network if we rewire 100% of the ties of the network in Fig 14a based on preferential attachment, thus exhibiting a clear core-periphery structure (Borgatti & Everett, 2000).13 A B Figure 14: Structural rewiring example 13 Note that for random and homophily based rewiring, the maximum percentage of network relations that can be rewired is constrained by the density of the network, since observed ties must be rewired to different pairs. However, for structure-based rewiring there are no such constraints, and observed ties do not necessarily have to be rewired to different pairs, as actors can choose whichever pairs maximize their utility. 65 Figure 14 (cont™d) C D III. Applying Sensitivity Analysis: Examples on Contagion Effects After establishing our sensitivity analysis method and various rewiring mechanisms, in the next section we give some specific examples of how it can be applied to empirical data. Note that though the sensitivity analysis method we propose could potentially be applied to many different inferences using social network data, we are particularly interested in making inference on contagion effects for several reasons: (1) Contagion effects, which are defined as the propensity of an individual to behave in some way varying with the prevalence of that behavior in the network neighbors of the individual (Manski, 1993), have received lots of attention and have been widely studied (Kandel, 1978; Marsden and Friedkin, 1993; Doreian, 2001; An, 2011) as they have potential implications on health behavior (e.g. obesity and smoking), information diffusion, and teacher practice changes, among others (Christakis et al., 2007; 2008; Valente 1995, 1996; KA Frank et al., 2004). (2) There are many difficulties in identifying contagion effects, as they are often confounded with other 66 unobserved variables (individual attributes, social-environmental factors etc), especially homophily in the selection process (Aral, 2009; Shalizi, 2011). Though many sophisticated statistical models have been developed to identify contagion effects (Christakis et al., 2007; Steglich, 2010; An, 2011), there is still much debate about the validity of these methods (Vanderweele et al., 2013; Lyon, 2013; Frank & Xu, 2016). Sensitivity analysis has been proposed as an alternative to deal with the impact of unobserved variables (Rosenbaum & Rubin, 1982; Frank, 2000, 2004; VanderWeele, 2011). However, the validity of network relations has largely been neglected from the inference. In this context our proposed methods can contribute to questions such as how inference about contagion effects are robust to errors/bias in networks generated by various possible mechanisms (homophily in the selection process for example). (3) Since outcomes and identification of contagion effects are critically contingent upon the network structure or to whom individuals are exposed (Friedkin, 1999), it is vital to investigate how inference of contagion effects are robust to alternatives or possible errors in network structure.14 Next, we use a simulated dataset to illustrate how to apply our proposed methods to empirical data to test the robustness of inference about contagion effects to various errors/bias in networks. First, we estimate a social influence model as usual, and acquire model estimates. 
Second, we calculate the threshold needed to alter our inference for each parameter, using the sensitivity analysis method in Frank et al. (2013). Third, we assume there are errors in the observed networks; we then repeatedly rewire the observed networks based on the various mechanisms (random, homophily, anti-homophily, transitivity, reciprocity, and preferential attachment, respectively) and re-estimate the influence model to get new estimates of each parameter, which are compared with the threshold set in the second step to decide what percentage of network relations (and under what mechanisms) would need to be rewired to invalidate our inference.

14 This is in principle very similar to the case in Frank (2000), where he discussed the attenuation bias due to measurement errors in confounding variables. Here we discuss how measurement errors in networks (and, as a result, in exposure terms) can attenuate our inference.

Here we provide an example. We construct a simulated network dataset where N = 50, T = 2, and density = 0.2. The influence process follows

$$Y_{it} = \beta_0 + \beta_1 Y_{i,t-1} + \beta_2 \frac{\sum_j Z_{ij,t-1} Y_{j,t-1}}{\sum_j Z_{ij,t-1}} + e_{it}, \qquad (27)$$

where Y represents the behavioral outcome of interest, Z is a binary variable representing a network relationship, and e_it is an error term following N(0, 0.2^2). To identify contagion effects we estimate an influence model as in (1); the estimated parameters are given in Table 3 (Cor(Prior, Exposure) = 0.07 in this example). Furthermore, we calculate the thresholds needed to alter the inference for each parameter, following Frank et al. (2013). To explore how robust our estimates are to various errors in the observed network, we then rewire different percentages of the existing ties (varying from 10 to 90 percent) based on (1) random selection; (2) homophily, that is, actors rewire to unestablished ties with the smallest value of |Y_i,t-1 - Y_j,t-1|;15 (3) anti-homophily, that is, actors rewire to unestablished ties with the largest value of |Y_i,t-1 - Y_j,t-1|; (4) transitivity, wherein actors rewire to others to whom they are not previously connected but with whom they share the most common friends; (5) reciprocity, where actors rewire to others with whom the connection becomes mutual; and (6) preferential attachment, where actors rewire to others with the highest in-degree. Note that in each case the existing ties to be rewired are selected randomly and actors preserve their out-degree. For each configuration we simulate 500 times, re-estimate model (27) to obtain new estimates of beta_1 and beta_2 and the correlation between the prior term and the network exposure term, and compare the mean estimates with the pre-set threshold to determine whether our inference should be invalidated.

15 Note that in empirical data we can incorporate variables other than Y for rewiring based on homophily.

Table 3: Influence model example

                 Estimate   Standard Error   t-value   Pr(>|t|)
beta_1 (prior)    0.41834      0.07563        5.532    1.37e-06 ***
beta_2 (exposure) 0.83638      0.28675        2.917    0.00541  **

Results are shown in Figure 15. Figures 15A and 15B show the average estimates of the prior and the network exposure, respectively, versus the percentage of ties rewired. The black line in each graph represents the threshold needed to alter the inference, calculated as in Frank et al. (2013). The graphs show that estimates of the prior are generally not influenced by the various rewiring mechanisms, except when we rewire by homophily or anti-homophily. Nevertheless, the estimates of the prior all remain significant, and the inference is robust to all rewiring mechanisms.

Figure 15: Impact of rewiring on the estimates

Estimates of the network exposure effects exhibit more interesting patterns, as expected.
Figure 15B shows that the inference about contagion effects is least robust to anti-homophily rewiring: the inference is invalidated even if only 10% of ties are rewired. For the other types of rewiring, the inference about contagion effects is generally more robust, requiring from 20% of ties rewired (random rewiring) to 30% (rewiring based on homophily) to invalidate the inference, with the effects of the structural types of rewiring (transitivity, reciprocity, preferential attachment) in between. Note that a calculation from Frank et al. (2013) indicates that 31% of the estimate of the contagion effect would have to be due to bias in order to invalidate the inference. Thus, in this example, contagion effects are less robust to errors/bias in networks than to replacing cases with null effects. Figure 15C shows the mean correlation between the prior and the network exposure versus the percentage of ties rewired. The correlation is greatly affected by homophily-based rewiring (positively for homophily and negatively for anti-homophily), but not much by random or structural rewiring.

IV. Analytic Solutions

While these simulation-based sensitivity analysis methods are intuitive and easy to implement, they have several limitations: 1. the simulation-based methods are sometimes not time efficient, and it can take a long time to run the full set of simulations, especially when the network is large or when there are many actor characteristics/variables of interest; and 2. to fully understand the behavior/performance of these sensitivity analysis methods under various conditions, it would be helpful to derive closed-form/analytic solutions, for example solutions that can be expressed as functions of the observed network and the correlations between variables. Thus, in this section we develop analytical solutions for our sensitivity analysis methods under three rewiring mechanisms, namely random rewiring, homophily rewiring, and anti-homophily rewiring.

For simplicity we assume there are only three variables in our influence model: the dependent variable Y, a prior term Z, and a network exposure term X (although it is possible to extend this analysis to models with more covariates). The key relationship of interest is that between the network exposure X and the dependent variable Y. To determine the impact of the different rewiring mechanisms on inferences about network influence, we follow a partial correlation framework as in Frank (2000). The robustness of the inference is essentially determined by the partial correlation between the dependent variable Y and the network exposure term X conditional on the other covariates, r_xy|z, as shown in Figure 16 below. By rewiring the network relations we are only creating a new exposure term X, without changing the correlation r_yz between the dependent variable Y and the prior term Z. As a result, we only need to consider how rewiring changes r_xy, the correlation between the network exposure and the dependent variable, and r_xz, the correlation between the network exposure and the prior term (or other covariates), and how these new correlations generate the new partial correlation r*_xy|z. In this section we derive two sets of analytical solutions. In the first set, for (1-p)*100% of ties rewired, we assume that (1-p)*100% of the nodes rewire all of their ties. In the second set, for (1-p)*100% of ties rewired, we assume that the rewired ties are distributed evenly across all nodes.
Note that the first set is more intuitive and the second set is more technically challenging, but the operation of our simulations is in principle more similar to the second set of analytical solutions. As we show later, the second set of analytical solutions fits the raw correlations r_xy and r_xz after rewiring better, but in terms of goodness of fit to the partial correlation between the network exposure and the dependent variable, r_xy|z (which is the sufficient statistic determining the robustness of the inference), both sets of analytical solutions perform equally well. Finally, note that in this chapter we have not derived analytical solutions for structural rewiring such as transitivity, preferential attachment, or reciprocity rewiring. This is because for these mechanisms we need to know the full network and the exact distributions of the variables of interest (rather than just their correlations) to calculate the new partial correlations. This topic we leave for future work.

Figure 16: Analytical solution framework

The rest of the section is organized as follows: first we derive one set of analytical solutions in which we assume that a certain percentage of actors rewire all of their ties while the others rewire none; next we derive another set of analytical solutions in which we assume that the rewired ties are, on average, evenly distributed across all actors; finally, we give some simulated examples to examine how well the two derived analytical solutions fit the actual simulation results.

i. First set of analytical solutions

In the first set of analytical solutions we assume that only certain actors in the network have errors in their network relations, while the other actors' networks are perfectly measured. For example, in a network with evenly distributed degrees, if (1-p)*100% of the observed ties need to be rewired due to errors/bias, we assume there is zero accuracy in the observed networks for (1-p)*100% of the actors, all of whom rewire all of their ties; for the remaining p*100% of actors, their networks are perfectly measured and 100% accurate. In empirical cases this is less likely to happen, unless some actors identify the wrong primary social group, misread the network question, or deliberately sabotage the study by reporting all of their network relations wrongly. However, for the derivation of an analytical solution, this is more intuitive and easier to work with. Next we derive analytical solutions for random, homophily, and anti-homophily rewiring under this assumption.

For random rewiring, assuming that we randomly rewire (1-p)*100% of the ties in the observed network, the variance of the network exposure will not change, and the new raw correlation between the network exposure X and the dependent variable Y after rewiring, r*_xy, becomes (assuming all variables are grand-mean centered)

$$r^*_{xy} = p\, r_{xy} + (1-p) H_0 = p\, r_{xy} + (1-p)\cdot 0 = p\, r_{xy}. \qquad (28)$$

Here, H_0 is the hypothesized correlation for the rewired ties, which for random rewiring is the new correlation of zero. For a similar reason, the new raw correlation between the network exposure X and the prior term Z after rewiring, r*_xz, becomes

$$r^*_{xz} = p\, r_{xz} + (1-p)\cdot 0 = p\, r_{xz}. \qquad (29)$$

Having derived the two new raw correlations after rewiring, we now derive the key partial correlation r*_xy|z after rewiring (Z represents the prior here, but it could also be other covariates), which has the form

$$r^*_{xy|z} = \frac{r^*_{xy} - r^*_{xz} r_{yz}}{\sqrt{1-(r^*_{xz})^2}\sqrt{1-r_{yz}^2}} = \frac{p\, r_{xy} - p\, r_{xz} r_{yz}}{\sqrt{1-p^2 r_{xz}^2}\sqrt{1-r_{yz}^2}}. \qquad (30)$$

As this shows, we can now express the new partial correlation in terms of p (the percentage of ties retained) and the original correlations in the observed data. To understand the robustness of the inference we need one more quantity, the threshold of the inference. In this case the threshold for the partial correlation r_xy|z, denoted r^#, can be calculated as

$$r^{\#} = \frac{t^{\#}}{\sqrt{(t^{\#})^2 + \mathrm{res.df}}},$$

where t^# is the critical value of t needed to invalidate the inference and res.df represents the residual degrees of freedom. Thus, to invalidate the observed inference we need to randomly rewire (1-p)*100% of the ties such that

$$\frac{p\, r_{xy} - p\, r_{xz} r_{yz}}{\sqrt{1-p^2 r_{xz}^2}\sqrt{1-r_{yz}^2}} = r^{\#}. \qquad (31)$$

Writing p as a function of the other quantities, we have

$$p = \frac{r^{\#}\sqrt{1-r_{yz}^2}}{\sqrt{(r_{xy} - r_{xz} r_{yz})^2 + (r^{\#})^2 (1-r_{yz}^2)\, r_{xz}^2}}. \qquad (32)$$

A small numeric sketch of this calculation is given below.
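The following sketch evaluates equations (30) and (32) for the random-rewiring case: given observed correlations and a threshold r^#, it returns the share of ties that must be retained (and hence rewired) for the partial correlation to fall exactly at the threshold. The specific correlation values used here are illustrative and are not taken from the dissertation's data.

```python
import numpy as np

def partial_after_random_rewiring(p, r_xy, r_xz, r_yz):
    """Partial correlation r*_{xy|z} after randomly rewiring (1-p)*100% of ties,
    first-set solution, equation (30)."""
    return (p * r_xy - p * r_xz * r_yz) / (
        np.sqrt(1 - (p * r_xz) ** 2) * np.sqrt(1 - r_yz ** 2))

def p_to_invalidate_random(r_hash, r_xy, r_xz, r_yz):
    """Fraction of ties retained, p, at which r*_{xy|z} equals the threshold r^#,
    equation (32)."""
    a = r_xy - r_xz * r_yz
    return r_hash * np.sqrt(1 - r_yz ** 2) / np.sqrt(
        a ** 2 + r_hash ** 2 * (1 - r_yz ** 2) * r_xz ** 2)

# Illustrative values only:
r_xy, r_xz, r_yz, r_hash = 0.40, 0.15, 0.60, 0.20
p_star = p_to_invalidate_random(r_hash, r_xy, r_xz, r_yz)
print(f"retain p = {p_star:.2f}, i.e. rewire about {100 * (1 - p_star):.0f}% of ties")
print(partial_after_random_rewiring(p_star, r_xy, r_xz, r_yz))  # equals r_hash by construction
```

Plugging the returned p back into (30) reproduces the threshold, which is a quick internal check that the closed-form inversion in (32) is consistent with the forward formula.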
For homophily rewiring, we next derive an analytical solution. Assume that homophily is based on the prior (this can also be extended to homophily based on the prior and other covariates) and that the rewired actors rewire to others who hold exactly the same behavior (the underlying assumption being that the network is large and diverse enough that everyone can find others with exactly the same behavior). Then, for a large network where everyone can find perfectly homophilous others based on the prior, to rewire (1-p)*100% of the ties we assume that (1-p)*100% of the nodes rewire all of their ties (so that on average (1-p)*100% of the total ties are rewired), and the new raw correlation between the network exposure X and the prior term Z after rewiring, r*_xz, becomes

$$r^*_{xz} = p\, r_{xz} + (1-p) H_0 = p\, r_{xz} + (1-p)\cdot 1 = p\, r_{xz} + (1-p). \qquad (33)$$

Here, x' represents the new network exposure after rewiring, and for the (1-p)*100% of nodes who rewire to perfectly homophilous others based on the prior, the new correlation between their network exposure and their prior term, as stated in H_0, is 1. For a similar reason, the new raw correlation between the network exposure X and the dependent variable Y after rewiring, r*_xy, becomes

$$r^*_{xy} = p\, r_{xy} + (1-p)\, r_{yz}. \qquad (34)$$

Note that for the (1-p)*100% of nodes who rewire all of their ties to perfectly homophilous others, the network exposure becomes exactly the same as the prior term; as a result, for these nodes the new correlation between the network exposure X and the dependent variable Y after rewiring becomes r_yz, the correlation between the prior term Z and the dependent variable Y. Having derived the two new raw correlations after rewiring, we now derive the key partial correlation r*_xy|z after rewiring, which has the form

$$r^*_{xy|z} = \frac{\left(p\, r_{xy} + (1-p) r_{yz}\right) - \left(p\, r_{xz} + (1-p)\right) r_{yz}}{\sqrt{1-\left(p\, r_{xz} + (1-p)\right)^2}\sqrt{1-r_{yz}^2}}. \qquad (35)$$

As this shows, we can now express the new partial correlation in terms of p (the percentage of ties retained) and the original correlations in the observed data. Finally, as before, the threshold for the partial correlation r_xy|z, denoted r^#, can be calculated as r^# = t^# / sqrt((t^#)^2 + res.df), where t^# is the critical value of t needed to invalidate the inference and res.df represents the residual degrees of freedom.
Thus, to invalidate the observed inference we need to rewire (1-p)*100% of the ties by homophily such that

$$\frac{\left(p\, r_{xy} + (1-p) r_{yz}\right) - \left(p\, r_{xz} + (1-p)\right) r_{yz}}{\sqrt{1-\left(p\, r_{xz} + (1-p)\right)^2}\sqrt{1-r_{yz}^2}} = r^{\#}. \qquad (36)$$

Writing p as a function of the other quantities (the numerator in (36) simplifies to p(r_xy - r_xz r_yz)), we have

$$p = \frac{2 (r^{\#})^2 (1-r_{yz}^2)(1-r_{xz})}{(r_{xy} - r_{xz} r_{yz})^2 + (r^{\#})^2 (1-r_{yz}^2)(1-r_{xz})^2}. \qquad (37)$$

For anti-homophily rewiring, we next derive an analytical solution. Assume that anti-homophily is based on the prior (this can also be extended to other covariates) and that the rewired actors rewire to others who hold the most dissimilar behavior/belief. As before, to rewire (1-p)*100% of the ties we assume that (1-p)*100% of the nodes rewire all of their ties (so that on average (1-p)*100% of the total ties are rewired), and the new raw correlation between the network exposure X and the prior term Z after rewiring, r*_xz, becomes

$$r^*_{xz} = p\, r_{xz} + (1-p)\cdot(-1) = p\, r_{xz} - (1-p). \qquad (38)$$

Here, x' represents the new network exposure after rewiring, and for the (1-p)*100% of nodes who rewire to the most dissimilar others based on the prior, the new correlation between their network exposure and their prior term, as stated in H_0, is approximately -1. Note that for different distributions of the prior term the correlation r*_xz after rewiring will differ, so we pick the most intuitive case here. For example, if the prior is a binary variable representing whether actors smoke or not, then when all actors connect to others whose behavior differs from their own, the correlation between the network exposure and the prior is -1. For a similar reason, the new raw correlation between the network exposure X and the dependent variable Y after rewiring, r*_xy, becomes

$$r^*_{xy} = p\, r_{xy} - (1-p)\, r_{yz}. \qquad (39)$$

Note that for the (1-p)*100% of nodes who rewire all of their ties to the most dissimilar others, the new network exposure term and the prior term have a perfect negative correlation of -1; as a result, for these nodes the new correlation between the network exposure X and the dependent variable Y after rewiring becomes -r_yz, the correlation between the prior term Z and the dependent variable Y with a negative sign. Having derived the two new raw correlations after rewiring, we now derive the key partial correlation r*_xy|z after rewiring, which has the form

$$r^*_{xy|z} = \frac{\left(p\, r_{xy} - (1-p) r_{yz}\right) - \left(p\, r_{xz} - (1-p)\right) r_{yz}}{\sqrt{1-\left(p\, r_{xz} - (1-p)\right)^2}\sqrt{1-r_{yz}^2}}. \qquad (40)$$

As this shows, we can now express the new partial correlation in terms of p (the percentage of ties retained) and the original correlations in the observed data. Finally, as before, the threshold for the partial correlation r_xy|z, denoted r^#, can be calculated as r^# = t^# / sqrt((t^#)^2 + res.df), where t^# is the critical value of t needed to invalidate the inference and res.df represents the residual degrees of freedom. Thus, to invalidate the observed inference we need to rewire (1-p)*100% of the ties by anti-homophily such that

$$\frac{\left(p\, r_{xy} - (1-p) r_{yz}\right) - \left(p\, r_{xz} - (1-p)\right) r_{yz}}{\sqrt{1-\left(p\, r_{xz} - (1-p)\right)^2}\sqrt{1-r_{yz}^2}} = r^{\#}. \qquad (41)$$

Writing p as a function of the other quantities (the numerator in (41) again simplifies to p(r_xy - r_xz r_yz)), we have

$$p = \frac{2 (r^{\#})^2 (1-r_{yz}^2)(1+r_{xz})}{(r_{xy} - r_{xz} r_{yz})^2 + (r^{\#})^2 (1-r_{yz}^2)(1+r_{xz})^2}. \qquad (42)$$

ii. Second set of analytical solutions

In the previous derivation we reasoned as if we were rewiring all of the ties of (1-p)*100% of the nodes. However, in the actual simulation we randomly select (1-p)*100% of all observed ties and rewire them, so on average we are essentially rewiring (1-p)*100% of the ties evenly across all nodes. Here we give a simulation example in which we randomly rewire 30% of the total ties of a random network.
ii. Second set of analytical solutions

In the previous derivation we reasoned as if we were rewiring all ties for (1-p)*100% of the nodes. However, in the actual simulation we randomly selected (1-p)*100% of all observed ties and rewired them, so on average we are essentially rewiring (1-p)*100% of the ties evenly across all nodes. Here we give a simulation example in which we randomly rewire 30% of the total ties from a random network. Figure 17 below shows the distribution of the percentage of ties rewired at the individual level. As this shows, the percentage of ties rewired for each individual varies from 10% to 80%, but on average each node has rewired 30% of its original ties. Compared with the assumption in the first set of analytical solutions, this assumption is more reasonable, as network measurements for each actor can be imperfect for various reasons, such as bias toward self, group structure, balance, routine interaction, etc. (Marsden, 2005). So in this section we derive a new set of analytical solutions by assuming that rewiring occurs evenly for each node. This approach is closer to the empirical situation and the actual simulation, but it is more difficult to derive, so we provide some intuition for this scenario. Note that we only derive analytical solutions for homophily and anti-homophily rewiring, since for random rewiring the result will not be affected much by how we randomly rewire the ties, as long as we simulate sufficiently often.

Figure 17: Distribution of the percentage of ties rewired

For homophily rewiring, as before let the network exposure term be X, the prior term be Z, the outcome be Y, and the average out-degree of each node be n. If we rewire (1-p)*100% of the ties to perfectly homophilous others, then the new exposure term for actor i becomes

$$ x'_i = \frac{\sum_{j=1}^{pn} Z_j + (1-p)\,n\,Z_i}{n}, $$

where Z_j represents the behavior of actor i's original network neighbors and Z_i represents the behavior of actor i's new network neighbors (who hold exactly the same behavior as actor i). Since Var(X) = Var((1/n)Σ_{j=1}^{n} Z_j) = Var(Z)/n, we obtain Var((1/n)Σ_{j=1}^{pn} Z_j) = p Var(Z)/n. Also,

$$ \mathrm{Cov}(Z, x') = (1-p)\,\mathrm{Cov}(Z,Z) + p\,\mathrm{Cov}(Z,X), \qquad \mathrm{Cov}(Z,X) = sd(Z)\,sd(X)\,r_{xz} = \frac{\mathrm{Var}(Z)}{\sqrt{n}}\,r_{xz} \quad (43) $$

As a result, substituting (43) and simplifying, the new correlation between network exposure and the prior after rewiring becomes

$$ r^{*}_{xz} = \frac{\mathrm{Cov}(Z, x')}{sd(Z)\,sd(x')} = \frac{(1-p) + \dfrac{p\,r_{xz}}{\sqrt{n}}}{\sqrt{(1-p)^{2} + \dfrac{p}{n} + \dfrac{2p(1-p)r_{xz}}{\sqrt{n}}}} \quad (44) $$

Similarly, the new correlation between network exposure and the outcome after rewiring becomes

$$ r^{*}_{xy} = \frac{\mathrm{Cov}(Y, x')}{sd(Y)\,sd(x')} = \frac{(1-p)r_{yz} + \dfrac{p\,r_{xy}}{\sqrt{n}}}{\sqrt{(1-p)^{2} + \dfrac{p}{n} + \dfrac{2p(1-p)r_{xz}}{\sqrt{n}}}} \quad (45) $$

Finally, as before, the threshold for the partial correlation r_xy|z, r#, can be calculated as r# = t#/sqrt((t#)^2 + res.df), where t# is the critical value of t needed to invalidate the inference and res.df is the residual degrees of freedom. Thus, to invalidate the observed inference we need to rewire 100(1-p)% of ties in order to get

$$ r^{*}_{xy|z} = \frac{r^{*}_{xy} - r^{*}_{xz}\,r_{yz}}{\sqrt{(1 - r^{*2}_{xz})(1 - r_{yz}^{2})}} = r^{\#} \quad (46) $$

As we can see, we can now represent the new partial correlation in terms of p (the proportion of ties retained), the original correlations in the observed data, and the average out-degree n. We can also write p as a function of the other quantities, but the formula is too complicated, so we do not provide it here.

For anti-homophily rewiring we follow the same setup: let the network exposure be X, the prior term be Z, the outcome be Y, and the average out-degree of each node be n. If we rewire (1-p)*100% of the ties to the most dissimilar others, then, assuming Z is centered at 0 (which only affects the value of the new network exposure term, not the correlations we are interested in), the new exposure term for actor i becomes

$$ x'_i = \frac{\sum_{j=1}^{pn} Z_j - (1-p)\,n\,Z_i}{n}. $$

Since Var(X) = Var((1/n)Σ_{j=1}^{n} Z_j) = Var(Z)/n, we have Var((1/n)Σ_{j=1}^{pn} Z_j) = p Var(Z)/n. Also,

$$ \mathrm{Cov}(Z, x') = -(1-p)\,\mathrm{Cov}(Z,Z) + p\,\mathrm{Cov}(Z,X), \qquad \mathrm{Cov}(Z,X) = sd(Z)\,sd(X)\,r_{xz} = \frac{\mathrm{Var}(Z)}{\sqrt{n}}\,r_{xz} \quad (47) $$

As a result, after rewiring the new correlation between network exposure and the prior becomes

$$ r^{*}_{xz} = \frac{-(1-p) + \dfrac{p\,r_{xz}}{\sqrt{n}}}{\sqrt{(1-p)^{2} + \dfrac{p}{n} - \dfrac{2p(1-p)r_{xz}}{\sqrt{n}}}} \quad (48) $$

Similarly, after rewiring the new correlation between network exposure and the outcome becomes

$$ r^{*}_{xy} = \frac{-(1-p)r_{yz} + \dfrac{p\,r_{xy}}{\sqrt{n}}}{\sqrt{(1-p)^{2} + \dfrac{p}{n} - \dfrac{2p(1-p)r_{xz}}{\sqrt{n}}}} \quad (49) $$

Finally, as before, the threshold for the partial correlation r_xy|z, r#, can be calculated as r# = t#/sqrt((t#)^2 + res.df), where t# is the critical value of t needed to invalidate the inference and res.df is the residual degrees of freedom. Thus, to invalidate the observed inference we need to rewire 100(1-p)% of ties in order to get

$$ r^{*}_{xy|z} = \frac{r^{*}_{xy} - r^{*}_{xz}\,r_{yz}}{\sqrt{(1 - r^{*2}_{xz})(1 - r_{yz}^{2})}} = r^{\#} \quad (50) $$

As we can see, we can now represent the new partial correlation in terms of p (the proportion of ties retained), the original correlations in the observed data, and the average out-degree n. We can also write p as a function of the other quantities, but the formula is too complicated, so we do not provide it here.
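Under the stated assumptions, the second set of solutions can be sketched as follows (Python; the example out-degree and correlations are hypothetical, and the sign argument is shorthand introduced here for switching between the homophily and anti-homophily cases).

```python
import numpy as np

def second_set_correlations(p, n, r_xy, r_xz, r_yz, sign):
    """Rewired raw correlations under the 'even rewiring' assumption
    (equations 44-45 and 48-49); sign = +1 for homophily, -1 for anti-homophily;
    n is the average out-degree."""
    sd_factor = np.sqrt((1 - p)**2 + p / n + sign * 2 * p * (1 - p) * r_xz / np.sqrt(n))
    r_xz_star = (sign * (1 - p) + p * r_xz / np.sqrt(n)) / sd_factor
    r_xy_star = (sign * (1 - p) * r_yz + p * r_xy / np.sqrt(n)) / sd_factor
    return r_xz_star, r_xy_star

def partial_corr(r_xy_star, r_xz_star, r_yz):
    """Partial correlation r*_{xy|z} from the rewired raw correlations (equations 46 and 50)."""
    return (r_xy_star - r_xz_star * r_yz) / np.sqrt((1 - r_xz_star**2) * (1 - r_yz**2))

# Hypothetical values for illustration: average out-degree 10, 40% of ties rewired
r_xy, r_xz, r_yz, n = 0.40, 0.15, 0.60, 10
for label, sign in [("homophily", 1.0), ("anti-homophily", -1.0)]:
    r_xz_s, r_xy_s = second_set_correlations(p=0.6, n=n, r_xy=r_xy, r_xz=r_xz, r_yz=r_yz, sign=sign)
    print(label, round(partial_corr(r_xy_s, r_xz_s, r_yz), 3))
```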
iii. Validation

In this section we provide three simulated examples to examine how well our analytical solutions fit the actual simulation results. In each example we construct a simulated network dataset with N = 100, T = 2, and density = 0.1. The influence process follows

$$ Y_{it} = \beta_0 + \beta_1 \frac{\sum_j Z_{ijt-1} Y_{jt-1}}{\sum_j Z_{ijt-1}} + \beta_2 Y_{it-1} + e_{it}, \quad (51) $$

where Y represents the behavioral outcome of interest, Z is a binary variable representing a network relationship, and e_it is an error term following N(0, 0.2^2). In the three simulated examples we fix the correlation r_yz between the prior and the dependent variable at approximately 0.6 and the correlation r_xz between the prior and the network exposure at 0.1 to 0.2, but vary the correlation r_xy between network exposure and the dependent variable to be greater than 0.4 (strong influence/inference), approximately 0.3 (moderate influence/inference), and less than 0.2 (weak influence/inference). To explore how robust our estimates are to various errors in the observed network, we then rewire different percentages of the existing ties (varying from 10 to 90 percent) based on (1) random selection; (2) homophily, that is, actors rewire to unestablished ties with the smallest value of |Y_it-1 - Y_jt-1|; or (3) anti-homophily, that is, actors rewire to unestablished ties with the largest value of |Y_it-1 - Y_jt-1|. (A minimal sketch of this rewiring procedure appears below.)
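The following Python sketch illustrates one way such rewiring could be implemented on a binary adjacency matrix. It is a simplified stand-in for the simulation described above; the function rewire_ties and all parameter values are hypothetical, not the dissertation's actual code.

```python
import numpy as np

def rewire_ties(W, y_prev, frac, mechanism, rng):
    """Rewire a fraction of the observed ties in the binary adjacency matrix W.
    'random' picks new targets at random; 'homophily' picks the unestablished
    target with the smallest |y_i - y_j|; 'anti-homophily' the largest."""
    W = W.copy()
    n_nodes = W.shape[0]
    senders, receivers = np.nonzero(W)
    n_rewire = int(round(frac * len(senders)))
    for k in rng.choice(len(senders), size=n_rewire, replace=False):
        i, j = senders[k], receivers[k]
        W[i, j] = 0                                   # drop the original tie
        candidates = np.where((W[i] == 0) & (np.arange(n_nodes) != i))[0]
        if candidates.size == 0:                      # no free target; keep the tie
            W[i, j] = 1
            continue
        if mechanism == "random":
            new_j = rng.choice(candidates)
        else:
            gaps = np.abs(y_prev[candidates] - y_prev[i])
            new_j = candidates[np.argmin(gaps) if mechanism == "homophily" else np.argmax(gaps)]
        W[i, new_j] = 1                               # form the rewired tie
    return W

# Illustrative use on a random binary network (all values hypothetical)
rng = np.random.default_rng(0)
N = 100
W = (rng.random((N, N)) < 0.1).astype(int)
np.fill_diagonal(W, 0)
y_prev = rng.normal(size=N)
W_homophily = rewire_ties(W, y_prev, frac=0.3, mechanism="homophily", rng=rng)
exposure = W_homophily @ y_prev / np.maximum(W_homophily.sum(axis=1), 1)
```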
In each configuration we simulate 1000 times and calculate the mean of (1) the new raw correlation r_xz between the prior and the network exposure after rewiring; (2) the new raw correlation r_xy between network exposure and the dependent variable after rewiring; and (3) the partial correlation r_xy|z between network exposure and the dependent variable after rewiring. We then compare the simulation results with the two sets of analytical solutions below.

Example 1 is a case with strong influence/inference, r_xy > 0.4. Random rewiring results are shown in Figure 18.

Figure 18: Strong influence - random rewiring example

The graphs from left to right respectively show (1) raw correlations r_xy between network exposure and the dependent variable after rewiring; (2) raw correlations r_xz between the prior and network exposure after rewiring; and (3) partial correlations r_xy|z between network exposure and the dependent variable after rewiring. The x-axis shows the percentage of ties rewired; the y-axis shows the values of the correlations. Blue dots are the results from the actual simulations (each point is the result of 1000 simulations), the green line is the result from the first set of analytical solutions, and the red line is the result from the second set of analytical solutions. The black line represents the threshold to change the inference. The same description applies to all of the graphs below.

Homophily rewiring results are shown in Figure 19, and anti-homophily results are shown in Figure 20.

Figure 19: Strong influence - homophily rewiring example

Figure 20: Strong influence - anti-homophily rewiring example

Example 2 is a case with moderate influence/inference, r_xy of approximately 0.3. As before, random rewiring results are shown in Figure 21, homophily rewiring results in Figure 22, and anti-homophily rewiring results in Figure 23.

Figure 21: Moderate influence - random rewiring example

Figure 22: Moderate influence - homophily rewiring example

Figure 23: Moderate influence - anti-homophily rewiring example

Example 3 is a case with weak influence/inference, r_xy < 0.2. As before, random rewiring results are shown in Figure 24, homophily rewiring results in Figure 25, and anti-homophily rewiring results in Figure 26.

Figure 24: Weak influence - random rewiring example

Figure 25: Weak influence - homophily rewiring example

Figure 26: Weak influence - anti-homophily rewiring example

As the three examples show, the first set of analytical solutions (assuming that (1-p)*100% of actors rewire all their ties) does not fit the actual raw correlations r_xy and r_xz after rewiring well, but the second set of analytical solutions (assuming that (1-p)*100% of ties are rewired evenly across all nodes) fits the raw correlations r_xy and r_xz from the simulations quite well. However, in terms of determining the robustness of inference, both sets of analytical solutions are equally useful and fit the partial correlation r_xy|z calculated from the simulation very well, except for anti-homophily rewiring, where the analytical solution overestimates the partial correlation to some extent. One possible reason is that for anti-homophily rewiring we assume the new correlation r_xz between network exposure and the prior after rewiring is -1, whereas the magnitude of the actual correlation is usually smaller than 1, depending on the distribution of the prior term.
As a result, our analytical solutions for anti-homophily rewiring are likely to provide an upper bound for the actual partial correlation of interest. In Figure S2 in the Appendix we give another simulation example in which the prior term is a binary variable, so that the new correlation between network exposure and the prior after rewiring should be -1. As Figure S2 shows, in this case most of the bias is eliminated, and the analytical solutions fit the simulation results very well. Finally, note that for homophily rewiring we assume that actors can rewire to others who hold exactly the same behavior. This generally requires a large network in which everyone can find perfectly homophilous others. It should not be surprising that the fit between the analytical solution and the actual simulation is worse for smaller networks or higher tie densities, where actors are forced to interact with others who are different from themselves. Indeed, the examples in the Appendix (Figures S3 and S4) show a worse fit between the analytical solution and homophily rewiring when we reduce the network size or increase the density of the network.

V. Discussion and Conclusion

Concerns about the validity of network observations are common, yet understudied, in social network analysis. The lack of validity in network observations is not just a result of simple random measurement error; it is often due to systematic bias that can lead to misinterpretation of actors' preferences in network selection, and it can affect how we draw inferences from our empirical studies. This chapter applies a set of simulation-based sensitivity analysis methods that can test the robustness of inferences made in social network analysis, covering six forms of selection mechanisms that can cause errors/bias in networks: random, homophily, anti-homophily, transitivity, reciprocity, and preferential attachment. Specifically, we show how these approaches are useful in testing the robustness of inferences about contagion effects. In addition, we have derived two sets of analytical solutions for the sensitivity analysis methods that account for selection mechanisms based on random, homophily, and anti-homophily rewiring. Examples show that the analytical solutions generally fit the simulation results well under reasonable assumptions.

The simulation-based sensitivity analysis methods developed in this chapter can easily be adapted and applied to many other forms of network analysis, such as one-mode selection models (e.g., p2, ERGM), bipartite graph analysis, and models that deal with the co-evolution of behavior and networks (SIENA). Nevertheless, our focus in this chapter is on the robustness of inference in influence models. Our sensitivity analysis methods essentially re-construct the network exposure term by rewiring the observed interaction matrix W, and different rewiring mechanisms have distinct implications for the network structure and the distribution of the network exposure term, as follows.

(1) Different forms of agency rewiring (random, homophily, and anti-homophily) can create distinct distributions of network exposure.
For example, random rewiring can create a network exposure that is close to the overall mean of the actors' prior belief/behavior distribution; homophily rewiring (in the extreme) can create a network exposure that is exactly the same as the actor's own prior belief/behavior; and anti-homophily rewiring often creates a network exposure with a polarized, bimodal distribution, since actors seek the most dissimilar others and their network exposure is therefore clustered around those with the most distinct behaviors/beliefs.

(2) Structural rewiring (transitivity, reciprocity, and preferential attachment) can create distinct network structures. For example, if observed ties are completely rewired based on transitivity, we would see a network with strong community structure and local clustering; if observed ties are completely rewired based on reciprocity, we would see a network with fewer paths but many more bi-directional interactions; finally, if observed ties are completely rewired based on preferential attachment, we would see a network with a core-periphery structure or a power-law degree distribution.

Besides the simulation-based sensitivity analysis methods, in this chapter we have also developed analytical solutions for random, homophily, and anti-homophily rewiring. It should be noted, however, that the motivation behind these analytical solutions is not to replace the simulation-based rewiring, but to better understand how the simulation-based rewiring methods work and how they affect the inference of contagion effects in various scenarios. In general, the goodness of fit between the analytical solutions and the simulation-based rewiring depends on the observed network structure and the distribution of the covariates of interest (e.g., the prior, or individual characteristics such as gender and age), some of which have direct sociological implications, as follows.

(1) The analytical solution for homophily rewiring requires that actors rewire to perfectly homophilous others, and it fits the simulation results better when the network is larger. This is plausible, since actors are more likely to find similar others in a large social structure with a lot of heterogeneity (Blau, 1977). In contrast, their choices will be more confined if the network is smaller and thus more homogeneous, and it will be even more difficult if actors are seeking similar others on multiple dimensions (age, race, etc.). Furthermore, the analytical solution also fits the simulation results better when the network is sparser, as it is easier for each individual to find and maintain a small homogeneous social group than a large one.

(2) The analytical solution for anti-homophily rewiring assumes that the new correlation between network exposure and the prior term after rewiring is approximately -1. This assumption is more likely to hold when the distribution of actors' prior beliefs/behaviors is binary or bimodal. For example, if actors seek dissimilar others based on whether they smoke or not (1 = yes, 0 = no), the resulting correlation between network exposure and the prior term after rewiring will be exactly -1. Or, if the variable of interest is a continuous measure of political ideas/beliefs and actors' prior orientations are clustered around two polarizing positions, the resulting correlation between network exposure and the prior term after rewiring will be approximately -1 as well.
As a result, the analytical solution for anti-homophily rewiring will fit the simulation results better in cases where there are two ideologically polarized groups.

Finally, note that there are several limitations to this study.

1. We represent network relations using binary variables: either there is a relation or there is not. However, there are many other representations of network relations, such as ranks and weights, and the sensitivity analyses developed in this study do not apply to those cases.

2. We have only applied our sensitivity analyses to inference about contagion effects. Although our simulation-based sensitivity analyses can potentially be applied to any analysis using network data, in this study we only give examples for the inference of contagion effects, and our analytical solutions apply to only one type of contagion effect.

3. Our sensitivity analysis methods only deal with observed variables in networks, not unobserved variables. As a result, for homophily and anti-homophily rewiring, our sensitivity analysis methods only apply to homophily/anti-homophily based on observed variables. That is, our methods do not deal with issues pertaining to unobserved/confounding variables, as many other sensitivity analysis methods do (Frank, 2000; VanderWeele, 2011). Potentially, this limits how useful our methods are for testing the internal validity of inferences about contagion effects, since latent homophily and shared environmental factors are usually the biggest concerns. Nevertheless, we consider the sensitivity analyses developed in this chapter to be important steps toward understanding how misinterpretation of actors' preferences manifest in observed networks can affect the robustness of inferences. They are also useful as empirical tools that allow us to test the robustness of our inferences by devising clear and testable alternative hypotheses, which is the key to making strong inferences in any field of science (Platt, 1964).

APPENDIX

a. Negativity Bias

Why contagion effects are negatively biased when there is an unobserved variable in the influence model and the network is static.

(1) Write the influence model in matrix form as Y = Y_1 β_1 + Y_2 β_2 + ε, where the unobserved variable is absorbed into the error ε. The OLS estimates β̂_1 and β̂_2 satisfy the normal equations

$$ Y_1'(Y - Y_1\hat{\beta}_1 - Y_2\hat{\beta}_2) = 0, \qquad Y_2'(Y - Y_1\hat{\beta}_1 - Y_2\hat{\beta}_2) = 0 $$

(the residual is orthogonal to each regressor by construction), so that

$$ \hat{\beta}_1 = (Y_1'Y_1)^{-1}Y_1'(Y - Y_2\hat{\beta}_2), \qquad \hat{\beta}_2 = (Y_2'Y_2)^{-1}Y_2'(Y - Y_1\hat{\beta}_1). $$

Another way to see why β̂_1 is positively biased but β̂_2 is negatively biased is through the Frisch-Waugh-Lovell (FWL) theorem. Let M_2 = I - Y_2(Y_2'Y_2)^{-1}Y_2' and M_1 = I - Y_1(Y_1'Y_1)^{-1}Y_1' be the residual-maker matrices. Then

$$ \hat{\beta}_1 = (Y_1'M_2Y_1)^{-1}Y_1'M_2Y = \beta_1 + (Y_1'M_2Y_1)^{-1}Y_1'M_2\varepsilon, $$

and because the partialled term Y_1'M_2 ε is positive in this setting, β̂_1 is positively biased. Using the same derivation,

$$ \hat{\beta}_2 = (Y_2'M_1Y_2)^{-1}Y_2'M_1Y = \beta_2 + (Y_2'M_1Y_2)^{-1}Y_2'M_1\varepsilon, $$

and because the corresponding term is negative, β̂_2 is negatively biased.
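As a numerical check of the FWL identity invoked above, the following Python sketch (simulated data with arbitrary coefficients) verifies that the coefficient from the joint regression equals the coefficient obtained after partialling out the other regressor. It demonstrates only the algebraic identity, not the direction of the biases discussed here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
Y2 = rng.normal(size=n)                       # one regressor (e.g., the prior)
Y1 = 0.6 * Y2 + rng.normal(size=n)            # second regressor, correlated with Y2
y = 1.0 * Y1 + 1.0 * Y2 + rng.normal(size=n)  # outcome with arbitrary true coefficients

X = np.column_stack([np.ones(n), Y1, Y2])
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]        # joint OLS fit

# FWL: residualize on [1, Y2], then regress residualized y on residualized Y1
X2 = np.column_stack([np.ones(n), Y2])
M2 = np.eye(n) - X2 @ np.linalg.solve(X2.T @ X2, X2.T)  # residual-maker for Y2
beta_fwl = (Y1 @ M2 @ y) / (Y1 @ M2 @ Y1)

print(beta_full[1], beta_fwl)   # the two estimates of the Y1 coefficient agree
```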
(2) Figure S1: Cases when the true prior is 0 (big N vs. big T)

Figure S1 shows the estimates of the exposure effect when the true prior effect is 0. If the prior term is excluded from the estimation, the bias in the exposure estimate is smaller. The remaining bias is possibly due to a combination of Hurwicz bias (the correlation between exposure and c decreases with lower average node degree or larger network size, and the correlation between exposure and the error decreases with larger T) and the correlation between the prior and c. However, if the true prior effect is not 0, excluding the prior will positively bias the exposure estimate.

To illustrate the negative correlation between Y_it-1 and c_i (the Hurwicz bias), let

$$ Y_{it} = \beta_2 Y_{it-1} + c_i + e_{it}, $$

assume that c_i has mean 0, and assume that we have only one observation for each node, so that the covariance between Y_it-1 and c_i acquires a term of order 1/n; note also that Y_it-1 appears in some other actors' exposure terms. Together, COV(Y_it-1, c_i) < 0, and β_2 will be negatively biased.

b. Reflection problem

Consider a two-person system in which both persons simultaneously influence each other:

$$ y_{1t} = \beta_1 y_{2t} + u_{1t}, \qquad y_{2t} = \beta_2 y_{1t} + u_{2t}. $$

Solving these two equations, we have

$$ y_{1t} = \frac{u_{1t} + \beta_1 u_{2t}}{1 - \beta_1\beta_2}, \qquad y_{2t} = \frac{u_{2t} + \beta_2 u_{1t}}{1 - \beta_1\beta_2}. $$

As we can see, u_1t and y_2t are correlated (y_2t is a function of u_1t), and u_2t and y_1t are correlated (y_1t is a function of u_2t), so the system is not identified using OLS.

i. By imposing a structural constraint we can achieve identification. Assume person 2 influences person 1, but not vice versa, so that y_1t = β_1 y_2t + u_1t and y_2t = β_0 + u_2t. In this case y_1t = β_1β_0 + β_1 u_2t + u_1t, and because the observed right-hand-side variable is no longer correlated with the error in its own equation, the system is identified using OLS.

ii. By having extra exogenous variables in each equation we can achieve identification using the instrumental variable (IV) method. To see this, let the system be

$$ y_{1t} = \beta_1 y_{2t} + \gamma_1 z_{1t} + u_{1t}, \qquad y_{2t} = \beta_2 y_{1t} + \gamma_2 z_{2t} + u_{2t}. $$

Assume z_1t and z_2t are exogenous variables that are not correlated with u_1t or u_2t. For example, z_1t could be attributes of person 1's friends who do not know person 2, and similarly z_2t could be attributes of person 2's friends who do not know person 1. Then z_1t can serve as an instrumental variable for y_1t in the second equation, z_2t can serve as an instrumental variable for y_2t in the first equation, and the system is identified using IV estimation methods such as 2SLS.

iii. Influence is not simultaneous but lagged:

$$ y_{1t} = \beta_1 y_{2t-1} + u_{1t}, \qquad y_{2t} = \beta_2 y_{1t-1} + u_{2t}. $$

Then we have

$$ y_{1t} = \beta_1\beta_2 y_{1t-2} + \beta_1 u_{2t-1} + u_{1t}, \qquad y_{2t} = \beta_1\beta_2 y_{2t-2} + \beta_2 u_{1t-1} + u_{2t}. $$

As we can see, y_1t-1 and y_2t-1 only correlate with past values of u, not with the contemporaneous error terms u_1t and u_2t, so the system is identified using OLS.

c. Algebraic derivation

Here we show an algebraic derivation of the confounding between influence and selection through a latent trait. To see why the network exposure term is correlated with the latent trait, let the influence model be

$$ Y_{it} = \beta_1 \frac{\sum_j Z_{ijt-1} Y_{jt-1}}{\sum_j Z_{ijt-1}} + \beta_2 Y_{it-1} + c_i + e_{it}, $$

and correspondingly let the "true" selection model be represented as

$$ Z_{ijt-1} = 1[\gamma_0 + \gamma_1 |c_i - c_j| + u_{ijt-1} > 0], $$

where 1[.] is an indicator function. If γ_1 < 0 (a homophily effect), then when c_i ≥ c_j we have Z_ijt-1 = 1[γ_0 + γ_1(c_i - c_j) + u_ijt-1 > 0], and otherwise Z_ijt-1 = 1[γ_0 + γ_1(c_j - c_i) + u_ijt-1 > 0]. For Y_jt-1, repeated substitution gives

$$ Y_{jt-1} = \beta_2^{\,t-1} Y_{j0} + \sum_{x=0}^{t-2} \beta_2^{\,x}\big(\beta_1 \tilde{Y}_{j,t-2-x} + c_j + e_{j,t-1-x}\big), $$

where Ỹ_jx represents the network exposure term for person j at time x. For convenience, assume Y_j0, Ỹ_jx, and e_jx are not correlated with c_i. Substituting Z_ijt-1 and Y_jt-1 into the network exposure term shows that, even if we assume u_ijt-1, Y_j0, Ỹ_jx, and e_jx are not correlated with c_i, the c_j of the actors selected into i's network are still a complex function of c_i, so the network exposure term is correlated with the unobserved trait.
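The point of this derivation can also be illustrated with a small simulation (Python; all parameter values are arbitrary and for illustration only): when ties form homophilously on a latent trait and there is no true influence, the network exposure term is nevertheless correlated with the trait.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, beta_prior = 200, 4, 0.5               # no true contagion effect in this simulation
c = rng.normal(size=N)                        # latent trait

# Homophilous selection on the latent trait: a tie is more likely when |c_i - c_j| is small
gap = np.abs(c[:, None] - c[None, :])
W = (rng.random((N, N)) < 0.3 * np.exp(-2 * gap)).astype(float)
np.fill_diagonal(W, 0)
row_sums = np.maximum(W.sum(axis=1, keepdims=True), 1)
W_norm = W / row_sums                         # row-normalized static network

Y = rng.normal(size=N)
for t in range(T):                            # outcome driven only by the prior and the trait
    Y = beta_prior * Y + c + rng.normal(scale=0.2, size=N)
exposure = W_norm @ Y                         # network exposure to others' outcomes

# Exposure is correlated with the latent trait even though there is no influence
print(np.corrcoef(exposure, c)[0, 1])
```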
d. Upper bound

Here we establish an upper bound on contagion effects using the estimated homophily effect. Consider the estimated selection model

$$ Z_{ijt} = \gamma_0 + \gamma_1 \ln(\mathrm{indegree}_{jt}) + \gamma_2\,\mathrm{Similarity}_{ij} + \ldots, $$

where Z represents a network relationship. The term Similarity_ij can be a composite measure over multiple attributes, such as the cosine similarity

$$ \cos(x_i, x_j) = \frac{\sum_k x_{ik} x_{jk}}{\lVert x_i \rVert\, \lVert x_j \rVert}, $$

where x_i is the vector of attributes for person i. Then the relative magnitude of the standardized coefficient γ_2 represents the magnitude of relational balance, which is γ_2/(γ_1 + γ_2 + ...).

Let the influence model be

$$ Y_{it} = \beta_0 + \beta_1 \frac{\sum_j Z_{ijt-1} Y_{jt-1}}{\sum_j Z_{ijt-1}} + \beta_2 Y_{it-1} + \beta_3 X_{it} + e_{it}. $$

Here X is a set of control variables, and the relative magnitude of the standardized coefficient β_1 represents the magnitude of the contagion effect in the influence model, which is β_1/(β_1 + β_2 + ...). Then, assuming that influence operates no faster than selection (which can be tested using empirical data), the upper bound of β_1/(β_1 + β_2 + ...) should be γ_2/(γ_1 + γ_2 + ...). If β_1 is over-estimated due to omitted variable bias, this upper bound can be useful for gauging the magnitude of the bias. It would be interesting to use multiple empirical data sets to test two things: (1) whether the homophily effect is an upper bound for the contagion effect, and (2) whether homophily effects and contagion effects are indeed correlated.

e. Anti-homophily rewiring example

Here we give an anti-homophily rewiring example with N = 100, density = 0.1, and r_xy > 0.2, where the prior term is a binary variable. Each point is the result of 1000 simulations. The analytical solutions in this case fit the simulation results much better than in the cases where the prior term is normally distributed.

Figure S2: Anti-homophily rewiring example (better fit)

f. Homophily rewiring example

Here we provide some extra simulation examples for homophily rewiring, which show that the fit between the analytical solutions and the simulation results is worse when the network is smaller or denser. Example 1 in Figure S3 shows results for homophily rewiring when the network is smaller (N = 50 instead of 100, density = 0.1). Example 2 in Figure S4 shows results for homophily rewiring when the network is denser (density = 0.2 instead of 0.1, N = 100).

Figure S3: Homophily rewiring example when the network is smaller

Figure S4: Homophily rewiring example when the network is denser

REFERENCES

An, W. (2011). Models and methods to identify peer effects. In The Sage Handbook of Social Network Analysis (pp. 515-532). London: Sage.
An, W. (2015). Instrumental variables estimates of peer effects in social networks. Social Science Research, 50, 382-394.
Anderson, T. W., & Hsiao, C. (1981). Estimation of dynamic models with error components. Journal of the American Statistical Association, 76(375), 598-606.
Anderson, T. W., & Hsiao, C. (1982). Formulation and estimation of dynamic models using panel data. Journal of Econometrics, 18(1), 47-82.
Angrist, J. D., & Lang, K. (2004). Does school integration generate peer effects? Evidence from Boston's Metco Program. American Economic Review, 1613-1634.
Aral, S., Muchnik, L., & Sundararajan, A. (2009). Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proceedings of the National Academy of Sciences, 106(51), 21544-21549.
Arellano, M., & Bond, S. (1991). Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. The Review of Economic Studies, 58(2), 277-297.
Asch, S. E. (1952). Group forces in the modification and distortion of judgments.
Bandura, A. (1986). Social foundations of thought and action: A social cognitive theory. Prentice-Hall, Inc.
Barabási, A. L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509-512.
Barnes, G. M., Reifman, A. S., Farrell, M. P., & Dintcheff, B. A. (2000). The effects of parenting on the development of adolescent alcohol misuse: A six-wave latent growth model. Journal of Marriage and Family, 62(1), 175-186.
Bernard, H. R., & Killworth, P. D. (1977). Informant accuracy in social network data II.
Human Communication Research, 4(1), 3-18.
Bernard, H. R., Killworth, P. D., & Sailer, L. (1982). Informant accuracy in social-network data V. An experimental attempt to predict actual communication from recall data. Social Science Research, 11(1), 30-66.
Bernard, H. R., Killworth, P., & Sailer, L. (1981). Summary of research on informant accuracy in network data and the reverse small world problem. Connections, 4(2), 11-25.
Blau, P. M. (1977). Inequality and heterogeneity.
Bollen, K. A., & Brand, J. E. (2010). A general panel model with random and fixed effects: A structural equations approach. Social Forces, 89(1), 1-34.
Borgatti, S. P., & Everett, M. G. (2000). Models of core/periphery structures. Social Networks, 21(4), 375-395.
Bound, J., Jaeger, D. A., & Baker, R. M. (1995). Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. Journal of the American Statistical Association, 90(430), 443-450.
Bramoullé, Y., Djebbari, H., & Fortin, B. (2009). Identification of peer effects through social networks. Journal of Econometrics, 150(1), 41-55.
Burt, R. S., & Knez, M. (1995). Kinds of third-party effects on trust. Rationality and Society, 7(3), 255-292.
Byrne, D. E. (1971). The attraction paradigm (Vol. 11). Academic Press.
Carley, K. M. (1991). Designing organizational structures to cope with communication breakdowns: A simulation model. Organization & Environment, 5(1), 19-57.
Casciaro, T. (1998). Seeing things clearly: Social structure, personality, and accuracy in social network perception. Social Networks, 20(4), 331-351.
Christakis, N. A., & Fowler, J. H. (2007). The spread of obesity in a large social network over 32 years. New England Journal of Medicine, 357(4), 370-379.
Christakis, N. A., & Fowler, J. H. (2008). The collective dynamics of smoking in a large social network. New England Journal of Medicine, 358(21), 2249-2258.
Concato, J., Shah, N., & Horwitz, R. I. (2000). Randomized, controlled trials, observational studies, and the hierarchy of research designs. New England Journal of Medicine, 342(25), 1887-1892.
Davis, G. F., Yoo, M., & Baker, W. E. (2003). The small world of the American corporate elite, 1982-2001. Strategic Organization, 1(3), 301-326.
Davis, J. A., & Leinhardt, S. (1972). The structure of positive interpersonal relations in small groups. In J. Berger, M. Zelditch Jr., & B. Anderson (Eds.), Sociological Theories in Progress, 2, 218-251. Boston: Houghton Mifflin.
Doreian, P. (1989). Models of network effects on social actors. Research Methods in Social Network Analysis, 295-317.
Doreian, P. (2001). Causality in social network analysis. Sociological Methods & Research, 30(1), 81-114.
Dornbusch, S. M. (1989). The sociology of adolescence. Annual Review of Sociology, 233-259.
Duncan, O. D., Haller, A. O., & Portes, A. (1968). Peer influences on aspirations: A reinterpretation. American Journal of Sociology, 119-137.
Emirbayer, M., & Goodwin, J. (1994). Network analysis, culture, and the problem of agency. American Journal of Sociology, 1411-1454.
Erbring, L., & Young, A. A. (1979). Individuals and social structure: Contextual effects as endogenous feedback. Sociological Methods & Research, 7(4), 396-430.
Feld, S. L. (1981). The focused organization of social ties. American Journal of Sociology, 1015-1035.
Feld, S. L. (1982). Social structural determinants of similarity among associates. American Sociological Review, 797-801.
Feld, S. L., & Carter, W. C. (2002).
Detecting measurement bias in respondent reports of personal networks. Social Networks, 24(4), 365-383.
Fleming, L., King III, C., & Juda, A. I. (2007). Small worlds and regional innovation. Organization Science, 18(6), 938-954.
Frank, K. A. (2000). Impact of a confounding variable on a regression coefficient. Sociological Methods & Research, 29(2), 147-194.
Frank, K. A., Zhao, Y., & Borman, K. (2004). Social capital and the diffusion of innovations within organizations: The case of computer technology in schools. Sociology of Education, 77(2), 148-171.
Frank, K. A., Muller, C., Schiller, K. S., Riegle-Crumb, C., Mueller, A. S., Crosnoe, R., & Pearson, J. (2008). The social dynamics of mathematics coursetaking in high school. American Journal of Sociology, 113(6), 1645.
Frank, K. A., Maroulis, S. J., Duong, M. Q., & Kelcey, B. M. (2013). What would it take to change an inference? Using Rubin's causal model to interpret the robustness of causal inferences. Educational Evaluation and Policy Analysis, 0162373713493129.
Frank, K. A., & Xu, R. (Forthcoming). Causal inference in network analysis: Navigating dependencies and alternative explanations to create good science. Oxford Handbook of Social Network Analysis.
Freeman, L. C. (1977). A set of measures of centrality based on betweenness. Sociometry, 35-41.
Freeman, L. C., Romney, A. K., & Freeman, S. C. (1987). Cognitive structure and informant accuracy. American Anthropologist, 89(2), 310-325.
Friedkin, N. E. (2001). Norm formation in social influence networks. Social Networks, 23(3), 167-189.
Friedkin, N. E., & Johnsen, E. C. (1990). Social influence and opinions. Journal of Mathematical Sociology, 15(3-4), 193-206.
Friedkin, N. E., & Johnsen, E. C. (1999). Social influence networks and opinion change. Advances in Group Processes, 16(1), 1-29.
Girvan, M., & Newman, M. E. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821-7826.
Goffman, E. (1963). Behavior in public places: Notes on the social organization of gatherings. New York: Free Press Glencoe.
Golder, S. A., Wilkinson, D. M., & Huberman, B. A. (2007). Rhythms of social interaction: Messaging within a massive online network. In Communities and Technologies 2007 (pp. 41-66). Springer London.
Granovetter, M. S. (1973). The strength of weak ties. American Journal of Sociology, 1360-1380.
Hancock, G. R. (2003). Fortune cookies, measurement error, and experimental design. Journal of Modern Applied Statistical Methods, 2(2), 3.
Hargens, L. L. (2000). Using the literature: Reference networks, reference contexts, and the social structure of scholarship. American Sociological Review, 846-865.
Harris, K. M., & National Longitudinal Study of Adolescent Health. (2009). Waves I & II, 1994-1996; Wave III, 2001-2002; Wave IV, 2007-2009 [machine-readable data file and documentation]. Chapel Hill, NC: Carolina Population Center, University of North Carolina at Chapel Hill, 10.
Hoff, P. D., Raftery, A. E., & Handcock, M. S. (2002). Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460), 1090-1098.
Jin, E. M., Girvan, M., & Newman, M. E. (2001). Structure of growing social networks. Physical Review E, 64(4), 046132.
Judson, R. A., & Owen, A. L. (1999). Estimating dynamic panel data models: A guide for macroeconomists. Economics Letters, 65(1), 9-15.
Kalmijn, M., & Flap, H. (2001).
Assortative meeting and mating: Unintended consequences of organized settings for partner choices. Social Forces, 79(4), 1289-1312.
Kandel, D. B. (1978). Homophily, selection, and socialization in adolescent friendships. American Journal of Sociology, 427-436.
Kaplan, D. (2007). Structural Equation Modeling. Sage, pp. 1089-1093.
Killworth, P., & Bernard, H. (1976). Informant accuracy in social network data. Human Organization, 35(3), 269-286.
Kiviet, J. F. (1995). On bias, inconsistency, and efficiency of various estimators in dynamic panel data models. Journal of Econometrics, 68(1), 53-78.
Kline, R. (2011). Principles and Practice of Structural Equation Modeling (Third ed.). Guilford.
Kossinets, G. (2006). Effects of missing data in social networks. Social Networks, 28(3), 247-268.
Kossinets, G., & Watts, D. J. (2009). Origins of homophily in an evolving social network. American Journal of Sociology, 115(2), 405-450.
Krackhardt, D. (1987). QAP partialling as a test of spuriousness. Social Networks, 9(2), 171-186.
Kruskal, J. B. (1964). Nonmetric multidimensional scaling: A numerical method. Psychometrika, 29(2), 115-129.
Lancaster, T. (2000). The incidental parameter problem since 1948. Journal of Econometrics, 95(2), 391-413.
Lazarsfeld, P. F., & Merton, R. K. (1954). Friendship as a social process: A substantive and methodological analysis. Freedom and Control in Modern Society, 18(1), 18-66.
Liljeros, F., Edling, C. R., Amaral, L. A. N., Stanley, H. E., & Åberg, Y. (2001). The web of human sexual contacts. Nature, 411(6840), 907-908.
Lyons, R. (2011). The spread of evidence-poor medicine via flawed social-network analysis. Statistics, Politics, and Policy, 2(1).
Manski, C. F. (1993). Identification of endogenous social effects: The reflection problem. The Review of Economic Studies, 60(3), 531-542.
Mark, N. (1998). Beyond individual differences: Social differentiation from first principles. American Sociological Review, 309-330.
Marsden, P. V. (1990). Network data and measurement. Annual Review of Sociology, 435-463.
Marsden, P. V., & Friedkin, N. E. (1993). Network studies of social influence. Sociological Methods & Research, 22(1), 127-151.
Marsden, P. V. (2005). Recent developments in network measurement. Models and Methods in Social Network Analysis, 8, 30.
Mayhew, B. H. (1980). Structuralism versus individualism: Part 1, shadowboxing in the dark. Social Forces, 59(2), 335-375.
McPherson, J. M., & Smith-Lovin, L. (1987). Homophily in voluntary organizations: Status distance and the composition of face-to-face groups. American Sociological Review, 370-379.
McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 415-444.
Merton, R. K. (1957). Social theory and social structure.
Merton, R. K. (1968). The Matthew effect in science. Science, 159(3810), 56-63.
Moffitt, R. A. (2001). Policy interventions, low-level equilibria, and social interactions. Social Dynamics, 4(45-82), 6-17.
Mollica, K. A., Gray, B., & Trevino, L. K. (2003). Racial homophily and its persistence in newcomers' social networks. Organization Science, 14(2), 123-136.
Montoya, R. M., & Insko, C. A. (2008). Toward a more complete understanding of the reciprocity of liking effect. European Journal of Social Psychology, 38(3), 477-498.
Moody, J. (2004).
The structure of a social science collaboration network: Disciplinary cohesion from 1963 to 1999. American Sociological Review, 69(2), 213-238.
Mouw, T. (2006). Estimating the causal effect of social capital: A review of recent research. Annual Review of Sociology, 79-102.
Nash, S. G., McQueen, A., & Bray, J. H. (2005). Pathways to adolescent alcohol use: Family environment, peer influence, and parental expectations. Journal of Adolescent Health, 37(1), 19-28.
Newcomb, T. M. (1956). The prediction of interpersonal attraction. American Psychologist, 11(11), 575.
Backman, C. W., & Secord, P. F. (1959). The effect of perceived liking on interpersonal attraction. Human Relations, 12, 379-384.
Newman, M. E. (2001). The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences, 98(2), 404-409.
Nickell, S. (1981). Biases in dynamic models with fixed effects. Econometrica: Journal of the Econometric Society, 1417-1426.
Oetting, E. R., & Donnermeyer, J. F. (1998). Primary socialization theory: The etiology of drug use and deviance. I. Substance Use & Misuse, 33(4), 995-1026.
O'Malley, A. J., Elwert, F., Rosenquist, J. N., Zaslavsky, A. M., & Christakis, N. A. (2014). Estimating peer effects in longitudinal dyadic data using instrumental variables. Biometrics, 70(3), 506-515.
Page, S. E. (2007). The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies. Princeton, NJ: Princeton University Press.
Pan, W., & Frank, K. A. (2004). An approximation to the distribution of the product of two dependent correlation coefficients. Journal of Statistical Computation and Simulation, 74(6), 419-443.
Penuel, W. R., Sun, M., Frank, K. A., & Gallagher, H. A. (2012). Using social network analysis to study how collegial interactions can augment teacher learning from external professional development. American Journal of Education, 119(1), 103-136.
Pitts, V. M., & Spillane, J. P. (2009). Using social network methods to study school leadership. International Journal of Research & Method in Education, 32(2), 185-207.
Platt, J. R. (1964). Strong inference. Science, 146(3642), 347-353.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods (Vol. 1). Sage.
Rice, R. E., & Richards, W. D. (1985). An overview of network analysis methods and programs. Progress in Communication Sciences, 6, 105-165.
Rice, R., Borgman, C., Bednarski, D., & Hart, P. (1989). Journal-to-journal citation data: Issues of validity and reliability. Scientometrics, 15(3-4), 257-282.
Rivera, M. T., Soderstrom, S. B., & Uzzi, B. (2010). Dynamics of dyads in social networks: Assortative, relational, and proximity mechanisms. Annual Review of Sociology, 36, 91-115.
Robins, G., Pattison, P., & Woolcock, J. (2004). Missing data in networks: Exponential random graph (p*) models for networks with non-respondents. Social Networks, 26(3), 257-283.
Robins, G., Pattison, P., Kalish, Y., & Lusher, D. (2007). An introduction to exponential random graph (p*) models for social networks. Social Networks, 29(2), 173-191.
Rosenbaum, P. R., & Rubin, D. B. (1983). Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. Journal of the Royal Statistical Society, Series B (Methodological), 212-218.
Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41-55.
Runger, G., & Wasserman, S. (1980).
Longitudinal analysis of friendship networks. Social Networks, 2(2), 143-154.
Sacerdote, B. (2000). Peer effects with random assignment: Results for Dartmouth roommates (No. w7469). National Bureau of Economic Research.
Sarkar, P., & Moore, A. W. (2005). Dynamic social network analysis using latent space models. ACM SIGKDD Explorations Newsletter, 7(2), 31-40.
Schelling, T. C. (1971). Dynamic models of segregation. Journal of Mathematical Sociology, 1(2), 143-186.
Schonfeld, I. S., & Rindskopf, D. (2007). Hierarchical linear modeling in organizational research: Longitudinal data outside the context of growth modeling. Organizational Research Methods, 10(3), 417-429.
Shadish, W. R., Clark, M. H., & Steiner, P. M. (2008). Can nonrandomized experiments yield accurate answers? A randomized experiment comparing random and nonrandom assignments. Journal of the American Statistical Association, 103(484), 1334-1344.
Shalizi, C. R., & Thomas, A. C. (2011). Homophily and contagion are generically confounded in observational social network studies. Sociological Methods & Research, 40(2), 211-239.
Shortreed, S., Handcock, M. S., & Hoff, P. (2006). Positional estimation within a latent space model for networks. Methodology, 2(1), 24-33.
Sims, C. A. (1980). Macroeconomics and reality. Econometrica: Journal of the Econometric Society, 1-48.
Snijders, T., Steglich, C., & Schweinberger, M. (2007). Modeling the coevolution of networks and behavior (pp. 41-71).
Sprecher, S. (1998). Insiders' perspectives on reasons for attraction to a close other. Social Psychology Quarterly, 287-300.
Steglich, C., Snijders, T. A., & Pearson, M. (2010). Dynamic networks and behavior: Separating selection from influence. Sociological Methodology, 40(1), 329-393.
Steiner, P. M., Cook, T. D., Shadish, W. R., & Clark, M. H. (2010). The importance of covariate selection in controlling for selection bias in observational studies. Psychological Methods, 15(3), 250.
Uzzi, B., & Spiro, J. (2005). Collaboration and creativity: The small world problem. American Journal of Sociology, 111(2), 447-504.
Valente, T. W. (1995). Network models of the diffusion of innovations (Vol. 2, No. 2). Cresskill, NJ: Hampton Press.