TO FIGHT OR NOT TO FIGHT: DOES CONSPECIFIC STRENGTH INFLUENCE DEFENSIVE SIGNALING?

By David J. Johnson

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Psychology – Master of Arts

2014

ABSTRACT

Humans and social animals show similar responses to defensive threats such as the presence of predators or rival conspecifics (Blanchard et al., 2001). The current work tested two extensions of this research: first, whether humans show assessment processes similar to those of nonhuman animals, including dynamically updating their assessments based on new information, and second, whether humans send different signals (i.e., willingness to escalate or submission) based on differences in physical formidability and whether those signals have behavioral consequences. Using an experimental procedure in which randomly paired, same-sex, naïve participants competed against one another in a physical task, the current experiment revealed evidence consistent with assessment: participants became more accurate in their judgments of strength after gaining information from a physical contest. In contrast, participants did not send different signals based on differences in formidability, insofar as those signals were broadcast by changes in strength. Implications of using animal models to predict human defensive behaviors are discussed, as well as relevant connections to game theory.

TABLE OF CONTENTS

LIST OF TABLES
INTRODUCTION
    Animal Assessment and Defensive Behaviors
        Theoretical Accounts
        Experimental Evidence
    Human Assessment and Defensive Behaviors
    Pilot Study
    The Present Research
METHOD
    Participants
    Measurements
    Procedure
RESULTS
    Planned Comparisons
        Signaled Strength
        Diagnostic Analyses
        Competition Outcomes
    Exploratory Analyses
        Updating Strength Assessments
        Accuracy of Updated Assessments
DISCUSSION
    Assessment Accuracy
    Defensive Signaling
    Implications for Game Theory
    Similarities to Animal Assessment
    Conclusion
APPENDIX
REFERENCES

LIST OF TABLES

Table 1: Descriptive Statistics for Baseline Strength Measures
Table 2: Intraclass Correlations Between Strength Measures
Table 3: Multiple Regression Results Predicting Success in Arm Wrestling Competition From Strength Measures
Table 4: Multilevel Multiple Regression Results Predicting Updated Strength Assessments From Past Strength Assessments and Competition Outcomes
Table 5: Over-time APIM Model Predicting Relative Strength Assessment From Strength Measurements Before and After Competition
Table 6: Over-time APIM Model Predicting Fight Outcomes From Strength Measurements Before and After Competition

INTRODUCTION

How do individuals signal in interactions with physically formidable others? Research on non-human animals characterizes defensive behavior as driven by an interaction between ability and context given the presence of a threat (D. C. Blanchard, 1997). In this view, defensive behaviors are the product of an evolved computational process that takes into account contextual variables in order to prepare an animal to act in ways that will afford successful defense from a threat (D. C. Blanchard, Griebel, Pobbe, & R. J. Blanchard, 2011; Gawronski & Cesario, 2013). In a competitive context, certain actions will be more or less appropriate given the unique discrepancy in ability between individuals. Game theory (Maynard Smith, 1974; Maynard Smith & Parker, 1976; Maynard Smith & Price, 1973; Parker, 1974) provides a useful model for predicting what behaviors will occur in such situations. According to this framework, animals should be sensitive to differences in physical formidability, or resource holding potential (RHP; Parker, 1974). When discrepancies in RHP are large, more formidable opponents should signal willingness to escalate conflict and weaker opponents should signal submission or withdraw from conflict.

As a part of a growing body of work demonstrating commonalities between human and animal defensive behaviors (e.g., D. C. Blanchard, Hynd, Minke, Minemoto, & R. J.
Blanchard, 2001; Sell et al., 2009), I propose that the influence a person has on another’s defensive signaling is partially dependent on the discrepancy in formidability between the two. In competitive contexts, stronger individuals will signal readiness to escalate conflict while weaker individuals will signal submission. Critically, these signals will be indexed through changes in strength relative to strength measured when the individual is alone. Weaker individuals will show submission by demonstrating less strength, while stronger individuals will display willingness to engage in conflict by demonstrating greater strength.

Animal Assessment and Defensive Behaviors

Theoretical accounts. Animals in social species such as humans have historically faced recurrent threats from both predators and rival conspecifics (same-species organisms). Although there are important differences between threats from predators and conspecifics (e.g., predator-prey relationships are typically characterized by large asymmetries in formidability), one commonality across these domains is that aggression should only occur in limited circumstances because of the high costs attached to it, namely risk of injury or death (Parker, 1974). These costs are elevated when an individual is faced with a more formidable conspecific (or predator). An ability to accurately gauge individual differences in formidability would provide an advantage in determining whether to escalate or withdraw from conflicts (Gawronski & Cesario, 2013; Griskevicius et al., 2009; Sell et al., 2009). In favor of this hypothesis, the existence of reliable cues to physical formidability, including (but not limited to) size, weight, and weaponry, is well documented amongst non-human animals (cf. Arnott & Elwood, 2010).
From a game theoretical perspective (Maynard Smith, 1974; Maynard Smith & Parker, 1976; Maynard Smith & Price, 1973) every organism has a specific level of fighting ability, or resource holding potential (RHP; Parker, 1974). RHP is influenced by several factors including size, strength, weaponry, group size, and experience. Each factor serves as a partial cue to the individual’s absolute fighting ability. All other things equal, an organism with greater RHP than his or her opponent has a higher probability of winning a physical fight. According to Parker (1974), when discrepancies in RHP are large, more formidable opponents should signal to escalate and weaker opponents to withdraw. Fights should only occur where both opponents escalate; this is more likely to occur as differences in RHP approach zero.

It is important to note that all that is meant by assessment is that an organism reacts differentially based on the magnitude of a given threat (Parker, 1974). Conscious intent is not required; rather, actions prepared by the organism are those favored by selection. Historically, organisms that successfully defended themselves from interpersonal threats had better reproductive success. These behavioral tendencies were inherited by their offspring, over time developing into neural mechanisms that process information about the magnitude of threats. These mechanisms activate a set of behaviors appropriate to that unique interaction (Maynard Smith & Parker, 1976; Parker, 1974), which result in differential responding partially based on relative threat level (i.e., differences in RHP).

How long should an organism assess a threat before preparing an appropriate behavioral response? In betta fish, lateral displays of body size are given for up to several minutes until one opponent withdraws (Simpson, 1968).
Similarly, red deer typically engage in roaring contests (where vocalizations accurately index fighting ability) for several minutes before decisions to engage or withdraw are made (Clutton-Brock & Albon, 1979). Although game theory predicts that extended bouts of assessment give increasingly accurate estimations of RHP, they are also more costly (Dawkins & Guilford, 1991; Maynard Smith & Price, 1973). Animals that signal for long periods of time or engage in prolonged assessment of others’ signals are at higher risk of incurring injury from opponents or predators drawn to the displays. In addition, animals also lose valuable time to pursue other desirable resources or mates. Consider the case of a bird that sings to deter others from invading his or her territory. The song not only puts the signaler at risk of predation but also increases risk for those close enough to hear the warning song.

Often there is a trade-off between assessment length and accuracy. When the costs of losing a conflict are high and the costs of assessment are low, prolonged assessment may be beneficial (Dawkins & Guilford, 1991; Maynard Smith, 1982). When observing a rival from a distance, the only costs of assessment would likely be the time and energy lost in pursuit of other activities. These low costs might encourage protracted assessment. In contrast, approaching a rival might increase assessment accuracy beyond what could be observed from a distance, but at larger risk of potential injury, especially the longer the animal remains close to the threat. However, these situations are likely to vary widely across species, and are dependent on assessment accuracy. In some cases (e.g., close proximity, high cost situations) even if extensive assessments are more accurate, it may be more cost effective to make quicker, but less accurate decisions about the formidability of conspecifics. The value of such assessments would increase as their accuracy approaches that of prolonged assessments.
Similarly, the degree to which assessment is accurate is a key determinant of what actions are selected in response to threats. As assessment accuracy decreases, fighting is more likely to occur between conspecifics (Maynard Smith & Parker, 1976; Parker, 1974). Recall that when there is a discrepancy in RHP the more formidable opponent should signal to escalate and the weaker one to withdraw. Only when the difference is small (or nonexistent) is it likely that both opponents will determine that they are more formidable, resulting in mutual escalation and physical conflict. However, when assessment is inaccurate, it is harder to determine disparities in RHP. This increases the margin of error for determining when escalation is appropriate, resulting in increased conflict. In contrast, when assessment is accurate, conflicts occur less often, as it is easier to gauge the probability of winning a bout (Maynard Smith & Parker, 1976; Parker, 1974).

Experimental evidence. The purpose of assessment is to determine what kind of action will most likely lead to a successful defense from a threat (D. C. Blanchard et al., 2011). Assessment is only useful if the actions prepared in response to the threat benefit the assessor. Therefore, one should expect different responses to threats in non-human animals based on the magnitude of the threat and the context in which the threat is encountered. Indeed, hyenas and other social species respond differently to threats based on their magnitude (Benson-Amram, Heinen, Dryer, & Holekamp, 2011). When exposed to calls from unfamiliar conspecifics, hyena behavior varied flexibly based on the ratio of intruders to allies. They demonstrated more vigilance to larger groups of outsiders, and approach behavior was contingent upon having a numerical advantage. Hyenas demonstrate a sophisticated system of threat assessment that takes into account the formidability of both allies and intruders and chooses appropriate actions accordingly.
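The theoretical claim that less accurate assessment produces more fighting (Maynard Smith & Parker, 1976; Parker, 1974) can be illustrated with a toy simulation. The sketch below is not a model taken from those sources; it assumes a deliberately simple decision rule in which each opponent escalates whenever its own noisy estimate says it holds the RHP advantage, and a fight occurs only when both escalate. The function name `fight_rate` and the parameters `rhp_gap` and `noise_sd` are hypothetical names introduced for illustration.

```python
import random


def fight_rate(rhp_gap, noise_sd, n_trials=20000, seed=0):
    """Estimate how often both opponents escalate in a toy contest model.

    rhp_gap  : true RHP of opponent A minus true RHP of opponent B (assumed scale)
    noise_sd : standard deviation of each opponent's assessment error (assumed)
    """
    rng = random.Random(seed)
    fights = 0
    for _ in range(n_trials):
        # Each opponent perceives the true gap plus independent Gaussian error.
        a_sees_advantage = rhp_gap + rng.gauss(0.0, noise_sd) > 0
        b_sees_advantage = -rhp_gap + rng.gauss(0.0, noise_sd) > 0
        # A physical fight occurs only when both opponents escalate.
        if a_sees_advantage and b_sees_advantage:
            fights += 1
    return fights / n_trials


if __name__ == "__main__":
    for noise_sd in (0.5, 1.0, 2.0):
        for rhp_gap in (0.0, 1.0, 3.0):
            rate = fight_rate(rhp_gap, noise_sd)
            print(f"noise_sd={noise_sd:.1f} rhp_gap={rhp_gap:.1f} fight_rate={rate:.3f}")
```

Under these assumptions, the estimated fight rate should rise as `noise_sd` grows for a fixed nonzero gap and fall as `rhp_gap` grows for a fixed noise level, which mirrors the two qualitative predictions in the text. Real assessment rules are likely more complex than a single threshold.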
Similarly, research on rodents also provides support for differential responding based on individual ability and context. The presence of an ambiguous threat (a threat of unclear RHP) encourages orientation towards, and investigation of, the threat. In contrast, the presence of a clear threat (e.g., a predator with much higher RHP) elicits different responses based on the physical context. At a large distance threats elicit flight if escape is possible and freezing if it is not. At closer distances rodents give defensive “threats” (e.g., displays of weaponry), and at even closer distances, they engage in defensive attack (D. C. Blanchard et al., 1997; D. C. Blanchard, 2011). When at a clear disadvantage, rodents seek to withdraw from conflict, and only resort to escalation if all other options have been exhausted.

In addition to the defensive behaviors highlighted by D. C. and R. J. Blanchard (D. C. Blanchard, 1997; D. C. Blanchard & R. J. Blanchard, 2003), submissive behaviors are also common in social animals such as primates (Bernstein & Gordon, 1980; D. C. Blanchard et al., 2011). These signals are most likely to occur when the value gained from continuing a contest is lower than the potential costs of continuing the fight (Matsumura & Hayden, 2006). Typically, this occurs when the discrepancy in RHP is slight: accepting submission is advantageous because opponents are likely to retaliate if attacks are continued. For this reason, submissive signals are only evolutionarily stable when assessment is accurate enough to distinguish subtle differences in formidability. In contrast, submissive signals are likely to be ignored when the difference between opponents is large, because the benefits of continued attack (e.g., asserting dominance) outweigh the risk of injury (Matsumura & Hayden, 2006).

Submission is directly related to dominance and the formation of hierarchies in animals.
Many animals form dominance relationships, including insects, fish, birds and mammals (cf. Chase & Seitz, 2011). Although a review of dominance hierarchies is beyond the scope of this proposal, certain aspects are relevant to understanding submissive behaviors. In particular, primates have a complex system of hierarchy largely based on a rank order of submission; the most formidable primate submits to no one and the weakest to everyone. Rhesus monkeys simultaneously introduced to each other will compete for dominance but typically settle into a stable hierarchy within an hour (Bernstein & Gordon, 1980; Bernstein, Gordon, & Rose, 1974), suggesting that assessment is a relatively quick process. Interestingly, lone male rhesus monkeys introduced to an established group immediately give submissive signals and assume the lowest position in a hierarchy (Bernstein & Gordon, 1980). This may seem contradictory if the intruder is more formidable than some members of the group. However, chimpanzees are known to solicit and receive help from conspecifics during conflicts (de Waal & Hoekstra, 1980). Therefore, in any given group, the combined RHP of all or multiple members will almost always exceed the intruder’s RHP several times over, making submission a reasonable behavior (Bernstein & Gordon, 1980; Matsumura & Hayden, 2006).

In sum, many non-human animals show an ability to accurately assess formidability in both predators and conspecifics. Longer assessments may be more accurate, but have higher costs. Selection should favor decision rules that flexibly execute accurate or quick assessments depending on contextual contingencies, such as the magnitude of the threat. All other things equal, accurate assessment limits physical conflict to opponents with similar RHP. When discrepancies in RHP are large, weaker animals should signal withdrawal by fleeing or submitting, whereas stronger animals should signal willingness to fight.
Critically, the neural mechanisms that orchestrate assessment should respond to contextual differences, including the formidability of opponents, the presence of allies, and the physical situation where the encounter occurs.

Human Assessment and Defensive Behaviors

Until recently, defensive behaviors in animals have typically been characterized by psychologists as inflexible and innate (Gawronski & Cesario, 2013), whereas human aggression was thought of as a learned behavior unique to the species (D. C. Blanchard et al., 2001). This distinction is partially due to the constrained meaning of aggression typically applied to humans versus the more general definition used in the biological sciences. In humans, aggression is typically thought of as a learned hostile behavior with the intent to inflict pain. However, a more general definition, applicable across species, regards aggression as behavior associated with physical attack or escalation towards attack, without regard to intent (Sell, Hone, & Pound, 2012; van Staaden, Searcy, & Hanlon, 2011). According to the former definition, aggression is a uniquely human phenomenon requiring conscious intent and hostile motives. According to the latter, aggression is just one of many defensive behaviors employed across species. Although not ignoring the likely possibility that certain aspects of aggression are unique to humans, this definition embraces the possibility that similar selection pressures shaped the defensive behaviors of humans and non-human animals. From this perspective, the same computational processes of assessment that occur in non-human animals provide valuable hypotheses about the effect of context on human defensive behavior.

For the computational models of threat assessment to apply to human defensive behavior, humans must first demonstrate an ability to accurately gauge their own formidability as well as that of others.
As with non-human animals, both the accuracy and speed of assessment are critical for understanding what actions humans will signal in conflict. Paralleling findings in the animal literature, humans are able to both quickly and accurately assess physical formidability. In a series of experiments, Sell and colleagues (2009, 2010) demonstrated that humans were able to assess formidability (as defined by upper body strength) from mere facial photos or voice recordings. Raters were more accurate at determining male strength than female strength, which is at least partially attributable to lower variability in female strength resulting from sexual dimorphism (Sell et al., 2012). These results have also been replicated by other researchers (Archer & Thanzami, 2009). Sell and colleagues’ results suggest that humans can quickly and accurately assess formidability, even from relatively impoverished information like pictures or voice recordings.

Although it may seem obvious that humans prepare different actions based on the magnitude of a threat and the context in which the threat occurs, it is important to question if the behaviors exhibited by non-human animals under threat match human behaviors under similar conditions. In an experimental study, D. C. Blanchard and colleagues (2001) asked individuals to report how they would respond in several threatening situations, which systematically varied the level of threat and ability to escape. Human threat responses closely paralleled rodent threat responses. When threats were ambiguous (e.g., hearing a suspicious noise) the most reported behavioral choice was to investigate the threat. In contrast, when threats were clear (e.g., being attacked) defensive responses were most common. These defensive responses did vary somewhat across the sexes, such that males were more prone to report defensive attack in the context of a clear threat, while females were more likely to report defensive threat (i.e., screaming).
These differences could reflect a higher percentage of males assessing that they were more formidable than the threat and thus that a retaliatory attack would be most advantageous. Interestingly, D. C. Blanchard and colleagues (2001) noted that one limitation to this research was not including a response option for submissive or pleading behaviors, noting that while these responses are unlikely in rodents,¹ they would be within the range of possibility for animals such as primates or humans.

¹ The definition of submission used in this article excludes behaviors such as “playing dead” which are assumed to be qualitatively different than submission, as they may be used by animals to evade predation but also to lure opponents closer.

Due to the historical focus of studying aggression as a learned phenomenon, little research has investigated the role of RHP disparities on defensive signaling. Moreover, the studies that have done so have typically relied upon hypothetical conflict situations (Archer & Benson, 2008; D. C. Blanchard et al., 2001). Although studies examining responses to hypothetical threats are advantageous in that they can manipulate threat in ways that would not be ethical to examine otherwise, they are limited by relying on prospective self-report, which may not accurately reflect actual behaviors chosen in a conflict situation. For example, males may report being less likely to scream than females when threatened due to self-presentation concerns, rather than actual differences in behavior.

A stronger test of whether RHP influences defensive signaling would be to actually set up dyadic interactions between individuals of varying RHP and observe how differences in formidability influence these signals. Changes in upper body strength in response to a partner could serve as an index of defensive signaling. Although relatively stable, upper body strength can be reliably increased or decreased by a variety of factors, including motivation and threat (Ikai & Steinhaus, 1961).
Insofar as submission is indexed by showing weakness (i.e., less strength) and escalation is broadcast by exaggerating those cues (i.e., more strength), individuals may change their strength dependent on the unique relationship between partners, specifically the degree to which their RHP differs.

Pilot Study

The aforementioned hypotheses were pilot tested in an unpublished study conducted by Cesario and Johnson (2013). Specifically, male undergraduates were randomly paired with one of five male confederates of varying physical formidability. Confederates were blind to study hypotheses. Participants were told that they would compete against each other at a later point in time. They were instructed to stand while strength was measured, with the idea being that this might encourage natural assessment of their partners’ formidability. The experimenter measured participants’ upper body strength using an inverted hand dynamometer. Measurements were taken with the participant’s partner turned around, so participants could not observe their partner’s measured strength. However, participants could see their own measured strength.

To initially examine men’s defensive signaling as a function of confederate strength, a single factor ANOVA was first conducted with confederate as predictor of participant upper body strength. This yielded a significant effect for confederate, F(4, 116) = 2.74, p = .032, η² = .086. Participant upper body strength was lower in the presence of a stronger confederate, and higher in the presence of a weaker confederate. Although these results are promising, they ignore the dyadic nature of the data. Because the hypotheses concerned whether one’s own ability and the confederate’s ability would influence strength, the data were analyzed with a one-with-many model (OWM; Marcus, Kashy, & Baldwin, 2009).
The OWM design is a variance decomposition model that can be used with hierarchically structured data in which multiple individuals (the many; participants in this case) interact with or are tied to the same partner (the one; confederates in this case). The model separates confederate variance from participant variance. In a reciprocal OWM design both the confederates and the participants have an outcome score for each dyadic combination (i.e., strength), and the percentage of variance at the confederate and participant levels can be estimated for both. Confederate-level variance for the measure of the confederate’s strength estimates the extent to which a confederate is consistently strong (or weak) across participants, and confederate-level variance for the measure of the participants’ strength estimates the extent to which all participants paired with the same confederate are consistently strong or weak. The correlation between these two confederate-level effects measures generalized reciprocity, or the extent to which participants paired with stronger confederates consistently show stronger (or weaker) behavior than do participants who are tied to weaker confederates.

Multilevel modeling with restricted maximum likelihood was used to estimate an OWM model assessing the reciprocal effects of confederate grip strength on participant grip strength and vice versa. The key finding was that the generalized reciprocity correlation was significant, substantial, and negative (r = -.83, p = .008). The presence of a strong confederate tended to attenuate participants’ strength, such that the mean strength for participants paired with that confederate was lower than the mean strength across all confederates. Conversely, the presence of a weak confederate tended to enhance participants’ strength, such that the mean participant strength for that confederate was higher than the mean strength across all confederates.
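The idea behind generalized reciprocity can be made concrete with a rough descriptive analogue. The sketch below is not the restricted maximum likelihood multilevel OWM model used in the pilot study; it simply averages strength within each confederate and correlates the confederate means with the means of their paired participants. The data are invented for illustration, and the function names (`pearson_r`, `confederate_means`, `generalized_reciprocity`) are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def confederate_means(dyads):
    """Average confederate and participant strength within each confederate.

    dyads: iterable of (confederate_id, confederate_strength, participant_strength)
    """
    by_confederate = defaultdict(list)
    for conf_id, conf_strength, part_strength in dyads:
        by_confederate[conf_id].append((conf_strength, part_strength))
    means = {}
    for conf_id, rows in by_confederate.items():
        conf_mean = sum(row[0] for row in rows) / len(rows)
        part_mean = sum(row[1] for row in rows) / len(rows)
        means[conf_id] = (conf_mean, part_mean)
    return means


def generalized_reciprocity(dyads):
    """Correlate confederate-level means of the two strength measures."""
    means = confederate_means(dyads)
    conf_means = [pair[0] for pair in means.values()]
    part_means = [pair[1] for pair in means.values()]
    return pearson_r(conf_means, part_means)


# Invented example data (arbitrary strength units), three dyads per confederate.
example_dyads = [
    ("A", 60, 40), ("A", 62, 42), ("A", 61, 41),  # strong confederate
    ("B", 50, 45), ("B", 51, 44), ("B", 49, 46),  # intermediate confederate
    ("C", 40, 50), ("C", 41, 49), ("C", 39, 51),  # weak confederate
]

if __name__ == "__main__":
    print(round(generalized_reciprocity(example_dyads), 3))
```

With the invented data, participants paired with stronger confederates have lower mean strength, so the correlation is strongly negative, the same direction as the reported r = -.83. A correlation of raw means ignores sampling error at the dyad level, which is one reason a multilevel model is preferable for actual inference.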
The model also revealed that a considerable amount of the variance in confederate strength was at the participant or dyad level, which accounted for 38% (p < .001) of the variance in confederates’ strength. This indicates that confederates’ strength changed depending on the participant they interacted with. Combined with the significant generalized reciprocity correlation, these results replicate the findings from the single factor ANOVA while demonstrating that participants also influence confederates’ strength. These findings are consistent with the hypothesis that humans assess the formidability of conspecifics and automatically signal actions that will lead to successful defense (e.g., escalation, submission).

The Present Research

Drawing on research from both human and non-human animals, as well as the pilot study described above, the current research tested whether physical formidability influences defensive signaling in humans. Consistent with prior research (Sell et al., 2009, 2010), I predicted that (H1) humans would be able to accurately gauge the strength of people they interact with. However, because there is less variation in upper body strength between females (Sell et al., 2012), the relationship between formidability and defensive signaling would be stronger for males than for females.

Second, assessment will alert individuals to the degree of RHP discrepancy between them and the other competitor. Interacting with a competitor who is much stronger (weaker) would elicit signals of submission (willingness to escalate conflict; H2). By signaling I refer to changes in upper body strength from baseline, when the other competitor is not present. As evidenced in the pilot study, signaling should not require explicit comparisons of formidability (e.g., displaying scores on upper body strength measures), although such a manipulation might certainly enhance the effect.
Finally, changes in signaled strength should predict success in a physical competitive task over and above any effects due to baseline differences in strength (H3). Individuals who signal submission should lose more than individuals who signal willingness to escalate conflict.

METHOD

Same-sex pairs of participants reported separately to the lab and had baseline measurements of strength taken by a same-sex confederate. Participants were then brought together and had their strength measured again. Finally, they competed against each other in a contest requiring upper body strength. Throughout the experiment participants answered a variety of questions related to perceptions of self and partner strength.

Participants

Participants were 398 Michigan State undergraduate students (196 women; 24.7% non-White) who participated together in dyads (98 female dyads, 101 male dyads). The experiment was advertised as two separate experiments to avoid participants signing up with friends and to ensure random sampling.

Measurements

Participants’ left hand, right hand, and upper body strength were measured via handgrip dynamometer. Each strength measurement was taken three times, and the average of these scores was used in analyses. Height and arm length were measured in centimeters. Because variation in bicep circumference was smaller than variation in the former two, it was measured in millimeters.

Procedure

The first participant arrived for an experimental session fifteen minutes prior to the second participant. Participants were told that the current experiment was interested in how physical strength related to personality. The experimenter then took baseline measurements of upper body strength and other body measurements. Measurements were not shared with participants. The participant was then escorted to an individual suite and given instructions to complete measures related to anger and formidability (Sell et al., 2009) unrelated to the current experiment.
After leaving the first participant, the experimenter escorted the second participant into a different suite, where he or she repeated the same procedures. When both participants finished the measures, the experimenter brought them together into a large room in the lab and repeated the experimental cover story, elaborating that the two participants would later compete against each other in a physical task.

Before the competition, the experimenter took the strength measurements again, this time with the other participant in the room. Measurements were taken individually with the hand dynamometer while their competitor faced away. Experimenters measured participants in the order that they arrived. After measuring the strength of both participants, participants were escorted to two side-by-side computers. They then completed several measures concerned with relative judgments of strength (e.g., how strong are you relative to the other participant?) and predicted fight outcomes (e.g., how likely do you think you would be to win a physical fight against your partner?).

After both participants completed these measures they competed against each other in an arm wrestling contest (156 dyads) or a mercy contest (43 dyads).² Each participant was given three tickets into a raffle for a $50 gift card. For each round they won, participants were given one of their competitor’s tickets. In the arm wrestling task, participants used their right hand to arm wrestle against their opponent. Participants won matches if they managed to push their competitor’s arm down against the table. In the mercy contest, participants interlocked the fingers on both their hands with their competitor’s fingers. They then squeezed as hard as they could in order to try to get the other participant to give in. In both contests, if a winner was not determined in less than 30 seconds, the match ended in a draw.
Each dyad competed in up to three rounds, unless one participant forfeited. If a participant forfeited at any time, all their tickets were given to their competitor. Participants then again completed measures asking about their strength relative to their partner and predicted fight outcomes. If participants consented, pictures of each were taken in order to facilitate rated judgments of strength and attractiveness. Finally, participants were fully debriefed and dismissed.

² The competitive task was changed from a mercy task to an arm wrestling task because the former did not effectively differentiate between strong and weak participants; most matches (84.2%) ended in draws.

RESULTS

Planned Comparisons

Signaled strength. To examine the effect competitors have on signaled strength, I first created composite variables of strength based on the average of the upper body, left hand, and right hand strength measures at each time point separately (αs = .924 and .925, respectively). The composite variable was strongly correlated between time points (r = .972, p < .001), as was the upper body strength measure (r = .920, p < .001), the left hand measure (r = .954, p < .001) and the right hand measure (r = .954, p < .001). This substantial collinearity presents problems with data analysis and will be addressed further in the discussion.

As expected and shown in Table 1, men were substantially stronger than women on all strength measures. In particular, the mean score for men on the standardized composite variable (0.70) was almost one standard deviation higher than the mean score for women (-0.76). When examining all the strength variables, in over 99% of cases, the mean score for men exceeded the maximum score for women. Variability in strength also differed by gender, with men having almost twice as much variability compared to women on all strength measures.
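The composite described above (the average of the three standardized strength measures, per the note to Table 1) can be illustrated with a minimal sketch. The function names, the sample scores, and the use of the sample standard deviation are assumptions made for illustration; the thesis does not report which software or standardization routine was used.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize a list of scores to mean 0 and (sample) SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite_strength(upper_body, right_hand, left_hand):
    """Average the three standardized measures for each participant."""
    standardized = [zscores(upper_body), zscores(right_hand), zscores(left_hand)]
    return [mean(person) for person in zip(*standardized)]


# Hypothetical baseline scores in kilograms for four participants.
upper_body = [19.0, 22.5, 38.0, 45.5]
right_hand = [27.0, 25.5, 41.0, 47.0]
left_hand = [25.0, 26.0, 40.5, 44.0]

composite = composite_strength(upper_body, right_hand, left_hand)
```

Because each measure is standardized before averaging, the composite is centered on zero, which is why the group means in Table 1 take opposite signs for women and men.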
These composite strength variables were entered into a multilevel model using restricted maximum likelihood in order to estimate an indistinguishable actor-partner interdependence model (APIM; Kenny, Kashy, & Cook, 2006) assessing the effects of baseline strength on strength when a competitor was present (hereafter referred to as signaled strength). If participants adaptively signal strength depending on competitor strength, being paired with a stronger individual should result in less signaled strength (a negative partner effect). However, this effect should be moderated by the individual’s own level of strength, such that decreases in strength signaling should be greater when participants are weak (a negative actor-partner interaction). Finally, baseline strength should strongly predict signaled strength (a positive actor effect), such that strength should largely be stable across time points.

The APIM conducted on the composite strength measure revealed no evidence that signaled strength was influenced by a competitor, β = .021, t(260) = 1.30, p = .196. However, the actor-partner strength interaction, β = -.032, t(185) = -2.11, p = .036, was significant and negative, indicating that weaker participants paired with stronger partners tended to signal less strength. Thus, as predicted, participants’ signaled strength changed based on who they were competing against. Finally, a large and positive actor effect was found, β = .945, t(259) = 59.25, p < .001, indicating high stability between strength measurements. Participant sex did not influence signaled strength, nor did it interact with the strength variables (ts < 2, ps > .200).

The above analysis was also conducted separately on each of the three strength measures: upper body strength, left arm strength, and right arm strength. While in all cases there was stability between baseline strength measurements (all ts > 30, ps < .001), no actor-partner strength interactions reached significance (all ts < 2, ps > .080).
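To make the actor, partner, and interaction terms of the APIM concrete, the following is a minimal sketch of how dyad records might be arranged into the pairwise (one row per person) layout that such a model takes as input. The function name `to_pairwise`, the tuple layout, the sample numbers, and the grand-mean centering of the predictors are illustrative assumptions; the multilevel estimation itself is not shown.

```python
from statistics import mean


def to_pairwise(dyads):
    """Turn dyad records into one row per person.

    dyads: list of ((baseline_1, signaled_1), (baseline_2, signaled_2)).
    Each person's own baseline is the actor predictor and the other
    member's baseline is the partner predictor.
    """
    grand_mean = mean(base for dyad in dyads for base, _ in dyad)
    rows = []
    for dyad_id, (p1, p2) in enumerate(dyads):
        for (own_base, own_signal), (other_base, _) in ((p1, p2), (p2, p1)):
            actor = own_base - grand_mean
            partner = other_base - grand_mean
            rows.append({
                "dyad": dyad_id,
                "signaled": own_signal,
                "actor": actor,
                "partner": partner,
                "actor_x_partner": actor * partner,
            })
    return rows


# Hypothetical (baseline, signaled) strength scores for two dyads.
dyads = [((30.0, 31.0), (40.0, 39.0)), ((20.0, 20.5), (22.0, 22.0))]
rows = to_pairwise(dyads)
```

In this layout the two members of a dyad share the same interaction term, and one person's actor score is the other person's partner score, which is what makes the dyad members indistinguishable in the model.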
Additionally, in the analysis of the right arm strength measure the presence of the confederate actually led to greater strength (b = .038, t(281) = 2.11, p = .036), opposite of what was predicted.

Examination of the intraclass correlations for each of the strength measures revealed that the data were nonindependent, intraclass r = .55 to .65, ps < .001. That is, scores within dyads were more similar than scores between dyads. Although this ostensibly suggests that participants’ strength was influenced by whom they competed against, the null results obtained for the partner effects are inconsistent with this interpretation. Furthermore, rerunning the analyses with only baseline strength as a predictor revealed that it was sufficient to explain the nonindependence in the data; all intraclass correlations were reduced to nonsignificance. Thus, it is likely that the similarity in the data is artifactual. Recall that participants were paired with same-sex partners. Because men’s scores were more similar to other men’s than to women’s, and women’s scores were more similar to other women’s than to men’s (see Table 1), the nonindependence might simply be due to experimental design. Indeed, as Table 2 shows, when examining the similarity between competitors separately by sex, the nonindependence was reduced to nonsignificance in all but one case. Thus, the intraclass correlation is inflated by experimental design, giving little evidence to suggest that participants’ strength was influenced by their opponent.

Diagnostic analyses. One reason the current study may have found no evidence for changes in strength signaling is that most dyads were evenly matched in terms of strength, in contrast to the pilot study, where strength differences were manipulated via confederates. When examining the upper body strength measure standardized across sex, 152 dyads (81.2%) had individuals within one standard deviation of each other.
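The count of evenly matched dyads described above can be sketched as follows. This is a minimal illustration with made-up scores, assuming that "within one standard deviation" means the absolute difference between dyad members divided by the relevant standard deviation is less than one; the function names and data are hypothetical.

```python
from statistics import stdev


def pooled_sd(dyads):
    """Sample SD of all individual scores in a set of dyads."""
    return stdev([score for dyad in dyads for score in dyad])


def count_matched(dyads, sd, threshold=1.0):
    """Count dyads whose members differ by less than `threshold` SDs."""
    return sum(1 for a, b in dyads if abs(a - b) / sd < threshold)


# Hypothetical upper body strength scores (kg) for same-sex dyads.
women = [(18, 24), (15, 27), (20, 21)]
men = [(35, 50), (40, 44), (30, 58)]

# Standardized across sex: one SD computed from everyone.
across_sex = count_matched(women + men, pooled_sd(women + men))

# Standardized within sex: a separate SD for each sex.
within_sex = count_matched(women, pooled_sd(women)) + count_matched(men, pooled_sd(men))
```

With these made-up numbers, four of the six dyads count as matched when the standard deviation is computed across sex, but only two when it is computed within sex, because the pooled standard deviation also absorbs the mean difference between men and women.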
Even when standardized within sex, 117 dyads (62.5%) still had individuals within one standard deviation of each other. For comparison, in the pilot study of 121 men, paired with five confederates, only 45 dyads (37.2%) had individuals within one standard deviation of each other. Recall that in the pilot study, post-hoc tests revealed that decreases in upper body strength were most pronounced for participants paired with the two strongest confederates. Given the smaller discrepancies in strength in the current study, the finding that competitors do not influence personal strength is ambiguous.

To reiterate, I hypothesized that individuals are sensitive to discrepancies in RHP, which lead to different signals (submission or willingness to escalate) indexed by strength changes. The larger the discrepancy, the more pronounced the signal. This issue is complicated by the fact that assessments are not perfect; while they are strongly correlated with measurements of physical strength (r = .66 for men and r = .51 for women; Sell et al., 2009), there is still a substantial degree of noise. Because judgments are not perfect, when RHP between competitors is roughly equal, they may both assess themselves as stronger (Parker, 1974). In this case, both individuals should signal willingness to fight. Given that a large percentage of participants were of roughly equal RHP, this may have obscured the relationship between differences in formidability and greater signaling of submission. One way to test this hypothesis directly is to split the dyads into two groups, one where the discrepancy in strength between competitors is large, and another where the difference between competitors is small.
This was accomplished by using the composite measure of strength (standardized within sex)³ to split the 199 dyads into two groups, one group where the difference in strength between competitors was less than one standard deviation (the similar strength subset, n = 112) and the other where the difference was more than one standard deviation (the different strength subset, n = 77). If similarity in strength was obscuring the relationship between formidability and (submissive) signaling, examining the different strength group should reveal the expected pattern. The similar strength group should signal increased strength as the difference between competitor formidability decreases, above and beyond any effects due to personal strength and competitor strength.

³ As there was more variability in strength in the male dyads, measurements were standardized within sex to prevent imbalance in the number of male and female dyads selected into the similar and dissimilar groups. Standardizing across sex did not change the pattern of results.

However, running the APIM with the different strength dyads did not change the pattern of results. Upper body strength as measured with a competitor present was still strongly influenced by baseline strength (β = .952, t(127) = 54.58, p < .001), but the effect of competitor strength did not reach significance (β = .014, t(127) = -0.76, p = .428), nor did their interaction (β = -.023, t(74) = -0.95, p = .348). Similarly, running the APIM with the similar strength dyads also did not change the pattern of results. Upper body strength as measured with a competitor present was still strongly influenced by baseline strength (β = .893, t(120) = 20.78, p < .001), but the effect of competitor strength did not reach significance (β = .066, t(120) = 1.53, p = .128), nor did the absolute magnitude of their difference (β = -.022, t(74) = -0.95, p = .348).
The coefficient represents the degree to which greater (or lesser) discrepancy in baseline strength is related to greater strength in the presence of the competitor. In sum, I did not find evidence for strength signaling, even when splitting the dyads into two groups, one where the discrepancy in strength was large, and the other where the discrepancy was small. Implications of this are further addressed in the discussion.

Competition outcomes. To examine how baseline and signaled strength influence competitive outcomes, arm wrestling outcomes were regressed on strength measures at the dyad level. If signaled strength influences competition outcomes above and beyond baseline differences in strength, its inclusion should improve the predictive validity of the model when controlling for any differences due to baseline strength. That is, does strength signaling influence the likelihood that an individual will win a physical contest such as arm wrestling?

Because competition outcomes between competitors are completely dependent (i.e., winning a match necessarily means that the other competitor loses), analyses were conducted at the dyad level. As assignment to Person 1 and Person 2 is arbitrary, in all analyses intercepts were suppressed. Predictors were determined by taking the difference of the two competitors’ scores (e.g., X1 – X2). The difference in outcomes (range: -3 to 3) was then regressed on the difference between predictors. Thus, positive coefficients indicate that greater asymmetry in the predictor (e.g., upper body strength) predicts competitive success.

To control for individual differences not (directly) related to upper body strength, hand dominance and arm length were first entered into the model. The overall model was significant, R² = .059, F(2, 140) = 4.36, p = .015. Handedness did not predict success in the arm wrestling competition, b = .090, t(140) = .540, p = .590, and was removed from the model.
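The logic of the dyad-level difference-score regression with a suppressed intercept can be shown with a minimal sketch. It uses a single predictor and made-up data, so it is a simplification of the multi-predictor models reported here; the function names and the record layout are hypothetical. The point it illustrates is that, without an intercept, the slope does not depend on which member of a dyad is labeled Person 1.

```python
def difference_scores(dyads):
    """dyads: list of ((strength_1, wins_1), (strength_2, wins_2))."""
    x = [p1[0] - p2[0] for p1, p2 in dyads]  # predictor difference, X1 - X2
    y = [p1[1] - p2[1] for p1, p2 in dyads]  # outcome difference
    return x, y


def slope_through_origin(x, y):
    """Least-squares slope for y = b * x with the intercept fixed at zero."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical (baseline strength, rounds won) for three dyads.
dyads = [((40.0, 3), (30.0, 0)), ((25.0, 1), (28.0, 2)), ((35.0, 2), (33.0, 1))]

# Relabel Person 1 and Person 2 in the second dyad only.
relabeled = [dyads[0], (dyads[1][1], dyads[1][0]), dyads[2]]

b_original = slope_through_origin(*difference_scores(dyads))
b_relabeled = slope_through_origin(*difference_scores(relabeled))
```

Relabeling a dyad flips the sign of both its predictor difference and its outcome difference, so their product and the squared predictor difference are unchanged and the two slopes agree. A freely estimated intercept would not have this property, since it would depend on the arbitrary labeling.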
Greater arm length significantly predicted success, b = .081, t(140) = 2.933, p = .004, and was retained. In the next step, baseline right arm strength (all matches were right handed) and upper body strength were added as predictors. As can be seen in Table 3, the overall model was again significant, ΔR² = .203, F(2, 139) = 19.05, p < .001. Greater upper body strength (b = .045, t(139) = 3.00, p = .003) and right arm strength (b = .060, t(139) = 3.17, p = .002) both significantly predicted success. In the last step of the model, signaled upper body and right arm strength were added as predictors. However, adding in these variables did not significantly increase the predictive power of the model, ΔR² = .008, F(2, 137) = 0.78, p = .460. Thus, Hypothesis 3 was not supported, in that signaled strength did not influence the likelihood of winning a physical contest.

Exploratory Analyses

Updating strength assessments. To further examine whether participants were accurately assessing formidability of conspecifics (Hypothesis 1), I investigated participants’ relative strength assessments. Recall that participants were asked to report their strength relative to their partner on a 7-point scale (1 = I am much weaker than the other participant, 7 = I am much stronger than the other participant) at two time points, first just after meeting the other participant and second after the competition. I predicted participants would use their own strength as well as their competitor’s strength when judging who was stronger. However, participants would gain additional information about strength from the competition that would influence their relative strength assessments such that participants who won would reassess themselves as stronger, while participants who lost would reassess themselves as weaker. To test this hypothesis, I examined whether or not participants updated their relative strength assessments based on information gained from the arm wrestling competition.
To control for nonindependence in the data, I ran a multilevel regression model where the second relative strength assessment was regressed onto the initial relative strength assessment, as well as the outcome of the competition, participant sex, and the sex by competitive outcome product term. A positive coefficient for the competition outcome indicates that, controlling for the initial relative strength assessment, participants who won (lost) the competition were more likely to rate themselves as stronger (weaker) than their competitor. Table 4 lists the model coefficients.

As predicted, the coefficient for the competitive outcome was significant and positive, b = .547, t(166) = 19.38, p < .001, indicating that, controlling for their initial strength assessment, participants adjusted their strength ratings based on the competitive task. This effect was moderated by a significant sex by competitive outcome interaction, b = -.079, t(140) = -3.01, p < .01, indicating that men updated their strength ratings less than women. The overall model accounted for the vast majority of the variance in relative strength assessments, pseudo-R² = .734.

Accuracy of updated assessments. Although updating strength assessments after a physical competition is consistent with an adaptive reappraisal of strength differences, it is unclear whether changes in assessment better reflect reality. Consider the case where two individuals meet and initially consider themselves of equal strength. With some difficulty, one participant edges out the other in the competition and updates his or her relative strength assessment to suggest that he or she is stronger. The issue is whether this updated assessment more accurately reflects the difference in strength between the two individuals. It is possible that participants might overcorrect, resulting in assessments that less accurately track strength differences than those made before the competition.
However, if participants are accurately using the information from the competition to inform their strength assessments, these posterior assessments should more accurately reflect strength differences.

A preliminary test revealed that, as expected, initial assessments of strength were nonindependent, intraclass r = -.268, p < .001, indicating that the more (less) individuals rated themselves as stronger relative to their competitor, the less (more) their partner rated themselves as stronger relative to their competitor. More relevant, strength assessments were more correlated after the arm wrestling competition, intraclass r = -.783, Z = 5.66, p < .001. This increase in nonindependence is expected because as both participants’ accuracy increases, so should the correlation between their assessments.

Because relative assessments of strength are nonindependent, they were entered into a multilevel model using restricted maximum likelihood in order to estimate an indistinguishable over-time APIM assessing the effects of baseline strength (composite measure) on relative strength assessments. I predicted participants’ relative strength assessments would be strongly influenced by their own strength (a positive actor effect) as well as the strength of their partner (a negative partner effect) at both time points, but particularly after the arm wrestling competition. That is, not only would participants reassess strength judgments after the physical competition, but these judgments would be more accurate (i.e., better predicted by baseline strength).

As predicted and displayed in Table 5, the over-time APIM revealed that relative strength assessments were positively related to greater personal baseline strength, b = .971, t(211) = 8.33, p < .001, and negatively related to the competitor’s baseline strength, b = -.661, t(208) = -5.71, p < .001. Thus, participants’ strength assessment was influenced by both their own strength and their competitor’s strength.
More importantly, the predicted actor strength by time (b = .226, t(169) = 3.56, p < .001) and partner strength by time (b = -.190, t(169) = -3.00, p = .003) interactions were significant, indicating participants’ assessments after the competition better tracked baseline strength differences. That is, participants’ relative strength assessments became more accurate after the competition. There was also a main effect of participant sex, such that men were more likely to rate their competitor as less strong than women, b = -.254, t(144) = -2.87, p = .005. However, sex did not interact with any other variable (ts < 2, ps > .10). In sum, 32.3% of the variance in participants’ strength assessments was explained by differences in physical strength.

A similar analysis was conducted on judgments of predicted fight outcomes. As with relative strength assessments, participants were asked to report on a 7-point scale how likely they would be to win a physical fight with their partner as well as what the outcome of the fight would be, where higher numbers represent better predicted success. These judgments were also reported before and after the arm wrestling competition (αs = .930 and .959, respectively). Paralleling assessments of physical strength, the over-time APIM revealed that predicted fight outcomes were positively related to greater personal baseline strength, b = .797, t(237) = 6.14, p < .001, and negatively related to the competitor’s baseline strength, b = -.465, t(234) = -3.61, p < .001 (see Table 6). Thus, participants’ predicted fight outcomes were influenced by both their own strength and their competitor’s strength. More importantly, the predicted actor strength by time (b = .125, t(184) = 3.22, p = .002) and partner strength by time (b = -.100, t(184) = -2.57, p = .011) interactions were significant, indicating participants’ judgments after the competition better tracked baseline strength differences.
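One way to read the strength by time interactions reported above is to unpack them into simple slopes at each time point. The sketch below does this arithmetic using the reported coefficients. It relies on the effects coding of time given in the note to Table 5 (-1 = initial assessment, 1 = reassessment) and assumes, without confirmation from the text, that the predicted fight outcome model used the same coding; the function name is hypothetical.

```python
def slope_at(main_effect, interaction, time_code):
    """Simple slope of strength on the judgment at a given effects-coded time."""
    return main_effect + interaction * time_code


BEFORE, AFTER = -1, 1  # effects coding of time, per the note to Table 5

# Relative strength assessments (coefficients as reported for Table 5).
strength_actor = [round(slope_at(0.971, 0.226, t), 3) for t in (BEFORE, AFTER)]
strength_partner = [round(slope_at(-0.661, -0.190, t), 3) for t in (BEFORE, AFTER)]

# Predicted fight outcomes (same coding assumed).
fight_actor = [round(slope_at(0.797, 0.125, t), 3) for t in (BEFORE, AFTER)]
fight_partner = [round(slope_at(-0.465, -0.100, t), 3) for t in (BEFORE, AFTER)]
```

Under these assumptions the actor slope for relative strength rises from .745 before the competition to 1.197 after it, and the partner slope moves from -.471 to -.851; for predicted fight outcomes the corresponding values are .672 to .922 and -.365 to -.565. In each case both slopes grow in magnitude after the competition, which is the sense in which judgments better tracked baseline strength.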
That is, participants’ predicted fight outcomes more accurately tracked differences in physical strength after the competition. Unlike the relative strength measure, there was no main effect of participant sex, b = -.187, t(145) = -1.62, p = .102, nor did it interact with any other variable (ts < 2, ps > .05). In sum, 15.3% of the variance in participants’ predicted fight outcomes was explained by differences in physical strength.

DISCUSSION

This research emphasizes that humans can accurately assess differences in physical formidability and update these assessments based on information gained during an interaction. Individuals took into account both their own strength and the strength of a competitor when making initial assessments about who was stronger as well as who would win in a physical fight. In addition, participants updated their assessments after competing against the competitor in a physical task. Like non-human animals, humans appear to have psychological mechanisms that accurately track cues to physical formidability, and they dynamically update these cues based on relevant information.

Assessment Accuracy

Individuals’ relative assessment of strength accurately tracked their own physical formidability as well as that of the other competitor. Although their initial estimate tracked strength differences, it was improved by competing against the other individual in a contest determined by physical strength. Improvement was obtained regardless of whether dyads were male or female, indicating that both men and women gained information from the competition.

Improvement was not due to mere exposure to the other participant. Recall that a portion of the dyads (38 out of 199) engaged in a different competitive task (i.e., a mercy task) before making their second assessment of strength.
This task was not very sensitive to differences in physical formidability; 84.2% of all matches ended in draws compared to only 32.9% in the arm wrestling contest, χ²(1) = 32.18, p < .001. In those dyads, accuracy in strength assessments actually became worse after the competition (time by actor/partner effects were in the opposite direction). Similarly, participants’ predicted fight outcomes did not more accurately track differences in physical formidability after the competition (time by actor/partner effects were nonsignificant). Thus, completing a task that is a poor indicator of physical formidability, or unrelated to physical formidability at all, is unlikely to improve assessment accuracy.

Defensive Signaling

Based on the results of a pilot study, I predicted that individuals would modify their strength based on the strength of their competitor. Specifically, when an individual was weaker than their competitor they would signal submission, resulting in decreased strength relative to baseline. In contrast, when an individual was stronger than their competitor they would signal willingness to escalate conflict, resulting in increased strength relative to baseline. While the data did not support this hypothesis, this null result may be due to experimental design limitations.

The issue is that the current study did not manipulate strength via pairing participants with confederates chosen based on physical formidability. Instead, all individuals were naïve participants whose strength varied based on individual differences. As discussed in the results section, this led to considerably fewer dyads where there was a large discrepancy in strength. In line with prior theorizing (e.g., Parker, 1974), because strength assessment is not perfect, when individuals are similar in strength, they may both signal willingness to escalate conflict. This could have created noise in the data that obscured the relevant differences in signaling.
However, dividing dyads into two groups where the difference in strength was either similar or dissimilar (i.e., less or greater than one standard deviation, respectively) did not change the pattern of results. Personal strength was not influenced by the presence of a competitor. If the prediction is correct, this null finding may simply be due to low power. However, if signaling changes do occur when individuals interact with competitors, the data suggest the size of the effect is small. To ascertain if the effect is reliable as well as to narrow down the confidence interval of the effect, it will be necessary to conduct additional research that manipulates the discrepancy in strength between individuals. A highly powered test of this hypothesis is necessary to replicate the original effect and help rule out the possibility of a false positive. It may also be the case that the effect driving strength changes in the pilot study was confounded with strength amongst the confederates selected. To rule out these possibilities, more research with additional confederates is needed.

Because there were no differences in signaled strength, it is unsurprising that I did not find support for the hypothesis that defensive signaling influenced competitive outcomes. Given well-known problems with including highly correlated variables as predictors in a regression model (Cronbach, 1987), it is not unusual that including the second upper body strength (r = .920) and right arm strength (r = .954) measures would not significantly improve model fit. Because this hypothesis is contingent upon finding evidence for strength differences based on the presence of a competitor, it cannot directly be evaluated with the current data.
Implications For Game Theory

The finding that humans can accurately assess the formidability of conspecifics and update those assessments based on information gained during contests supports predictions derived from game theoretical principles. As Parker and colleagues (Maynard Smith & Parker, 1976; Parker & Rubenstein, 1981) predicted, “it is likely that good information, particularly about RHP, can be acquired only during a contest itself” (Parker & Rubenstein, 1981, p. 288). That is, individuals initially might not know (perfectly) the asymmetry in RHP between them and their competitor, but gain such information through physical competition. Insofar as this process is available to conscious introspection, it should be reflected through changes in assessments of relative strength and fighting ability, as it was in the current experiment.

Typically, reassessment of relative strength is determined not by conscious changes in relative strength, but is inferred from the decision to withdraw from a contest. However, low rates of forfeiting in the current study (4.0%) prevented conclusive analyses of this. This is likely due to the structure of the competitive tasks used: participants were told that forfeiting at any point would result in a loss of all their raffle entries. This design encouraged participants to keep competing, even if they were likely to lose, because the opportunity cost of not competing was always higher than the cost of competing. When combined with minimal risk of physical injury (i.e., low damage costs), low rates of forfeiture are consistent with game theoretical predictions. This interpretation is bolstered by evidence that participants did update their relative strength assessments based on the match, even if they did not often forfeit.

This logic suggests that changing the nature of the contest would change the duration of time before the weaker individual withdraws.
Instead of a winner-take-all style competition, if participants are given the option to save their resources in addition to risking them for potential reward, the decision to withdraw should be contingent upon how great the potential losses are for a given round. All else equal, participants for whom potential losses are greater should be quicker to withdraw after discovering that their competitor has higher RHP.

This non-fixed framework would also allow inferences as to how the duration of a contest influences assessment accuracy. The current design does not allow for a test of whether contest duration influences accuracy because the vast majority of individuals engaged in three rounds of the physical competition. However, it seems likely that additional bouts would have an asymptotic relationship with RHP assessment; given little information about a competitor (i.e., few rounds of competition), an additional round should help accuracy, whereas after many rounds an additional round is unlikely to improve accuracy. Future research should test this prediction in order to provide additional support for the predictive validity of game theoretical predictions to human conflict.

Similarities to Animal Assessment

These results share several similarities with studies on competition between humans as well as non-human animals. First, these results add to the growing literature demonstrating that humans have psychological mechanisms that accurately determine the strength of conspecifics (e.g., Archer & Thanzami, 2009; Sell et al., 2009). Additionally, they extend this literature by demonstrating that assessments of predicted fight outcomes take into account both personal strength as well as competitor strength. While there is considerable debate about how to best model RHP assessment in non-human animals (for a review, see Taylor & Elwood, 2003), these results provide strong evidence that humans engage in mutual assessment.
Though predicted by Arnott and Elwood (2010), no other study has directly tested how strength differences between competitors predict expected conflict outcomes.

More generally, these results parallel findings in the non-human animal literature demonstrating that assessment improves accuracy of RHP. For example, during the mating season, red deer stags attempt to guard fertile hinds from other stags (Clutton-Brock & Albon, 1979). A small proportion of approaches between stags actually result in fights; typically when discrepancy in fighting ability is visible, the stag with lower RHP will withdraw. Only when RHP is roughly equal do stags engage in roaring contests, where vocalizations accurately reflect fighting ability. The majority of fights are also preceded by parallel walks, where stags display their full body size by walking alongside each other. In both cases displays provide additional information that influences the decision to engage or withdraw. Similarly, in the current study, individuals have initial impressions of the strength of their competitor. The accuracy of these impressions is improved based on several rounds of a competition that reflects fighting ability. In many cases where discrepancies in RHP are not clear, competition provides additional evidence as to the magnitude of those discrepancies.

Conclusion

I found evidence that humans can accurately assess differences in physical formidability and that these assessments are dynamically updated based on information gained from competitive bouts. These processes show remarkable similarities to assessment in non-human animals and suggest that animal models are useful for predicting human behaviors in competitive situations. Similarly, these results are largely consistent with predictions derived from a game theoretical framework.
While the prediction that individuals would broadcast different signals depending on the relative difference in strength between them and their competitor was not confirmed, this null effect may be due to the small differences in strength between individuals within dyads. Similarly, whether signaling submission or willingness to escalate influences conflict outcomes cannot be determined without first successfully manipulating differential signaling. More work is needed to replicate the original effect and help rule out the possibility of a false positive.

APPENDIX

Table 1
Descriptive Statistics for Baseline Strength Measures

                                       Women                                            Men
Strength Measure   Upper Body   Right Arm   Left Arm   Composite   Upper Body   Right Arm   Left Arm   Composite
Minimum                  1.67       14.33      12.33       -1.67         4.67       20.67      14.33       -1.03
Maximum                 36.00       44.33      47.33        0.39        69.33       71.33      64.67        2.60
M                       19.20       27.11      25.68       -0.76        39.84       42.76      42.76        0.70
SD                       6.10        5.15       5.14        0.39        12.33        9.13       8.98        0.73

Note. Women N = 188, Men N = 197. Upper body, right arm, and left arm strength are all measured in kilograms. Composite is the average of the three standardized strength measures.

Table 2
Intraclass Correlations Between Strength Measures

Strength Measure    Upper Body   Left Arm   Right Arm   Composite
All Ps (N = 374)    .592***      .597***    .546***     .647***
Women (N = 184)     .088         .137       .081        .097
Men (N = 190)       .204*        .069       .006        .098

Note. P = participant. Higher correlations indicate greater nonindependence, i.e., that strength measurements within dyads are more similar than strength measurements between dyads.
*p < .05, ***p < .001.

Table 3
Multiple Regression Results Predicting Success in Arm Wrestling Competition From Strength Measures

                              Model 1             Model 2             Model 3
Strength Measure              b         β         b         β         b         β
Arm length                    .080**    .238      .043†     .128      .047†     .139
Upper Body Strength (T1)      --        --        .045**    .254      .032*     .182
Right Arm Strength (T1)       --        --        .060**    .280      .088      .408
Upper Body Strength (T2)      --        --        --        --        .017      .109
Right Arm Strength (T2)       --        --        --        --        -.036     -.166
R²                            .057                .260                .268
F                             8.47**              16.25***            10.32
ΔR²                           --                  .203                .008
ΔF                            --                  19.05               0.78

Note. T1 = baseline strength measurement. T2 = strength measurement with competitor present. Model 1 df = 141, Model 2 df = 139, Model 3 df = 137. All variables were grand-mean centered prior to analysis. Regression was conducted at the dyad level, so the intercept was suppressed.
†p < .10, *p < .05, **p < .01, ***p < .001.

Table 4
Multilevel Multiple Regression Results Predicting Updated Strength Assessments From Past Strength Assessments and Competition Outcomes

                             b         β         t           df
Intercept                    3.085     --        --          --
Strength Judgment (T1)       .251      .215      6.42***     263
Competition Outcome          .547      .748      19.38***    166
Sex                          .048      .032      1.24        138
Sex*Competition Outcome      -.079     -.107     -3.01**     140

Note. T1 = relative strength judgment prior to competition. All continuous variables were grand-mean centered prior to analysis; sex was effects coded (-1 = women, 1 = men).
**p < .01, ***p < .001.

Table 5
Over-time APIM Model Predicting Relative Strength Assessment From Strength Measurements Before and After Competition

                           b         β         t           df
Intercept                  3.877     --        --          --
Actor Strength             .971      .660      8.33***     211
Partner Strength           -.661     -.449     -5.71***    208
Sex                        -.254     -.181     -2.87**     144
Time                       .131      .093      4.28***     143
Actor Strength*Time        .226      .153      3.56***     169
Partner Strength*Time      -.190     -.129     -3.00**     169

Note. Strength refers to the composite strength measure taken at baseline. All continuous variables were grand-mean centered prior to analysis; sex was effects coded (-1 = women, 1 = men) and time was effects coded (-1 = initial assessment, 1 = reassessment after competition).
**p < .01, ***p < .001.

Table 6
Over-time APIM Model Predicting Fight Outcomes From Strength Measurements Before and After Competition

                           b         β         t           df
Intercept                  4.079     --        --          --
Actor Strength             .797      .557      6.14***     237
Partner Strength           -.465     -.325     -3.61***    234
Sex                        -.187     -.137     -1.62       145
Time                       .082      .059      3.47**      144
Actor Strength*Time        .125      .088      3.21**      184
Partner Strength*Time      -.100     -.070     -2.57**     184

Note. Strength refers to the composite strength measure taken at baseline. All continuous variables were grand-mean centered prior to analysis; sex was effects coded (-1 = women, 1 = men) and time was effects coded (-1 = initial judgment, 1 = judgment after competition).
**p < .01, ***p < .001.

REFERENCES

Archer, J., & Benson, D. (2008). Physical aggression as a function of perceived fighting ability and provocation: An experimental investigation. Aggressive Behavior, 34(1), 9-24. doi: 10.1002/ab.20179

Archer, J., & Thanzami, V. (2009). The relation between mate value, entitlement, physical aggression, size and strength among a sample of young Indian men. Evolution and Human Behavior, 30(5), 315-321. doi: 10.1016/j.evolhumbehav.2009.03.003

Arnott, G., & Elwood, R. W. (2010). Signal residuals and hermit crab displays: flaunt it if you have it! Animal Behaviour, 79(1), 137-143. doi: 10.1016/j.anbehav.2009.10.011

Benson-Amram, S., Heinen, V. K., Dryer, S. L., & Holekamp, K. E. (2011). Numerical assessment and individual call discrimination by wild spotted hyaenas, Crocuta crocuta. Animal Behaviour, 82(4), 743-752. doi: 10.1016/j.anbehav.2011.07.004

Bernstein, I. S., & Gordon, T. P. (1980). The social component of dominance relationships in rhesus monkeys (Macaca mulatta). Animal Behaviour, 28(4), 1033-1039. doi: 10.1016/S0003-3472(80)80092-3

Bernstein, I. S., Gordon, T. P., & Rose, R. M. (1974). Factors influencing the expression of aggression during introductions to rhesus monkey groups. Primate Aggression, Territoriality and Xenophobia, 211-240.

Blanchard, D. C. (1997).
Stimulus, environmental, and pharmacological control of defensive behaviors. In M. E. Bouton & M. S. Fanselow (Eds.), Learning, motivation, and cognition: The functional behaviorism of Robert C. Bolles (pp. 283-303). Washington, DC, US: American Psychological Association.

Blanchard, D. C., & Blanchard, R. J. (2003). What can animal aggression research tell us about human aggression? Hormones and Behavior, 44(3), 171-177. doi: 10.1016/S0018-506X(03)00133-8

Blanchard, D. C., Griebel, G., Pobbe, R., & Blanchard, R. J. (2011). Risk assessment as an evolved threat detection and analysis process. Neuroscience & Biobehavioral Reviews, 35(4), 991-998. doi: 10.1016/j.neubiorev.2010.10.016

Blanchard, D. C., Hynd, A. L., Minke, K. A., Minemoto, T., & Blanchard, R. J. (2001). Human defensive behaviors to threat scenarios show parallels to fear- and anxiety-related defense patterns of non-human mammals. Neuroscience & Biobehavioral Reviews, 25(7–8), 761-770. doi: 10.1016/S0149-7634(01)00056-2

Cesario, J., & Johnson, D. J. (2013). Partner formidability influences personal strength. Manuscript in preparation.

Chase, I. D., & Seitz, K. (2011). Self-Structuring Properties of Dominance Hierarchies: A New Perspective. In D. L. B. Robert Huber & B. Patricia (Eds.), Advances in Genetics (Vol. 75, pp. 51-81): Academic.

Clutton-Brock, T. H., & Albon, S. D. (1979). The roaring of red deer and the evolution of honest advertisement. Behaviour, 145-170.

Cronbach, L. J. (1987). Statistical tests for moderator variables: Flaws in analyses recently proposed. Psychological Bulletin, 102(3), 414-417.

Dawkins, M. S., & Guilford, T. (1991). The corruption of honest signalling. Animal Behaviour, 41(5), 865-873. doi: 10.1016/S0003-3472(05)80353-7

de Waal, F. B. M., & Hoekstra, J. A. (1980). Contexts and predictability of aggression in chimpanzees. Animal Behaviour, 28(3), 929-937. doi: 10.1016/S0003-3472(80)80155-2

Gawronski, B., & Cesario, J. (2013).
Of mice and men: What animal research can tell us about context effects on automatic responses in humans. Personality and Social Psychology Review, 17(2), 187-215. doi: 10.1177/1088868313480096

Griskevicius, V., Tybur, J. M., Gangestad, S. W., Perea, E. F., Shapiro, J. R., & Kenrick, D. T. (2009). Aggress to impress: Hostility as an evolved context-dependent strategy. Journal of Personality and Social Psychology, 96(5), 980-994. doi: 10.1037/a0013907

Ikai, M., & Steinhaus, A. H. (1961). Some factors modifying the expression of human strength. Journal of Applied Physiology, 16(1), 157-163.

Kenny, D. A., Kashy, D., & Cook, W. L. (2006). Dyadic data analysis. New York, NY: Guilford.

Marcus, D. K., Kashy, D. A., & Baldwin, S. A. (2009). Studying psychotherapy using the one-with-many design: The therapeutic alliance as an exemplar. Journal of Counseling Psychology, 56(4), 537.

Matsumura, S., & Hayden, T. J. (2006). When should signals of submission be given?–A game theory model. Journal of Theoretical Biology, 240(3), 425-433. doi: 10.1016/j.jtbi.2005.10.002

Maynard Smith, J. (1974). The theory of games and the evolution of animal conflicts. Journal of Theoretical Biology, 47(1), 209-221.

Maynard Smith, J. (1982). Do animals convey information about their intentions? Journal of Theoretical Biology, 97(1), 1-5. doi: 10.1016/0022-5193(82)90271-5

Maynard Smith, J., & Parker, G. A. (1976). The logic of asymmetric contests. Animal Behaviour, 24(1), 159-175. doi: 10.1016/S0003-3472(76)80110-8

Maynard Smith, J., & Price, G. R. (1973). The Logic of Animal Conflict. Nature, 246(5427), 15-18. doi: 10.1038/246015a0

Parker, G. A. (1974). Assessment strategy and the evolution of fighting behaviour. Journal of Theoretical Biology, 47(1), 223-243. doi: 10.1016/0022-5193(74)90111-8

Parker, G. A., & Rubenstein, D. I. (1981). Role assessment, reserve strategy, and acquisition of information in asymmetric animal conflicts. Animal Behaviour, 29(1), 221-240.
doi: 10.1016/S0003-3472(81)80170-4

Sell, A., Bryant, G. A., Cosmides, L., Tooby, J., Sznycer, D., von Rueden, C., et al. (2010). Adaptations in humans for assessing physical strength from the voice. Proceedings of the Royal Society B: Biological Sciences, 277(1699), 3509-3518. doi: 10.1098/rspb.2010.0769

Sell, A., Cosmides, L., Tooby, J., Sznycer, D., von Rueden, C., & Gurven, M. (2009). Human adaptations for the visual assessment of strength and fighting ability from the body and face. Proceedings of the Royal Society B: Biological Sciences, 276(1656), 575-584. doi: 10.1098/rspb.2008.1177

Sell, A., Hone, L. S., & Pound, N. (2012). The importance of physical strength to human males. Human Nature: An Interdisciplinary Biosocial Perspective, 23(1), 30-44. doi: 10.1007/s12110-012-9131-2

Simpson, M. J. A. (1968). The display of the Siamese fighting fish, Betta splendens. Animal Behaviour Monographs, 1(1).

Taylor, P. W., & Elwood, R. W. (2003). The mismeasure of animal contests. Animal Behaviour, 65(6), 1195-1202. doi: 10.1006/anbe.2003.2169

van Staaden, M. J., Searcy, W. A., & Hanlon, R. T. (2011). Signaling aggression. In D. L. B. Robert Huber & B. Patricia (Eds.), Advances in Genetics (Vol. 75, pp. 23-49). New York, NY: Academic.