







































A study conducted by Langenhoff, Wiegmann, Halpern, Tenenbaum, and Gerstenberg on how people hold others responsible for their actions. The researchers tested a computational framework for responsibility judgments, which predicts that dispositional inferences and causal attributions combine additively to yield responsibility judgments. The study involved participants making responsibility judgments in various situations that manipulated action expectations and the causal structure in graded ways.








































University of California, Berkeley
University of Göttingen
Cornell University
Massachusetts Institute of Technology
Stanford University
Abstract
How do people hold others responsible for their actions? In this paper, we test and extend a computational framework originally introduced by Gerstenberg et al. (2018) that assigns responsibility as a function of two factors: a dispositional inference that captures what we learn about a person’s character from their action, and the causal role that the person’s action played in bringing about the outcome. This framework has been shown to accurately capture how people assign responsibility to decision-makers in achievement contexts. Here, we focus on a more complex group setting in which political committee members vote on whether or not a policy should be passed. This setting allowed us to manipulate both dispositional inferences and causal attributions in graded ways, as well as directly test the model’s key components by asking participants to judge how surprising and how important a committee member’s vote was. Participants’ answers to these questions in Experiment 1 accurately predicted the responsibility judgments of another group of participants in Experiment 2. In Experiment 3, we show that the model also predicts moral responsibility judgments and that, in the moral domain, dispositional inferences affect responsibility judgments more strongly than causal attributions.
Keywords: responsibility; causality; counterfactuals; pivotality; normality; voting; expectations.
*Corresponding author: Tobias Gerstenberg ([email protected]), 302 Jordan Hall, Stanford, CA,
Introduction
Shortly before the 2016 presidential election, Christopher Suprun, a Texas state elector for the Republican party, signaled that he would refuse to vote for Donald Trump, even if Trump won the popular vote in his state. Trump did indeed win the popular vote in Texas and on election day, as announced, Suprun voted for a different candidate. His decision caused turmoil among Republicans. Both Suprun’s party colleagues and the voters vociferously proclaimed their anger in newspapers, blogs, and social networks. Despite Suprun’s attempt, Trump won the electoral vote – and thus, the presidential election.
Imagine that Hillary Clinton had become the next president of the United States. Certainly, Suprun’s party colleagues would have held him responsible for contributing to Clinton’s victory and Trump’s loss in that case. But to what extent? Intuitively, Suprun would have been blamed more than a Democratic state elector who also voted against Trump. And suppose that Clinton’s victory margin was only a couple of votes, as some projections had suggested before the election. Presumably Republicans would have blamed Suprun even more in this scenario.
Judgments of responsibility are ubiquitous in our everyday lives. When something goes wrong – for example, when our favored candidate loses an election – we want to know who is to blame. How exactly people assign responsibility has intrigued researchers in psychology (Alicke, 2000; Hilton, McClure, & Slugoski, 2005; Lagnado & Harvey, 2008; Shaver, 1985), philosophy (Hart & Honoré, 1959/1985) and the legal sciences (Moore, 2009) for decades. In this paper, we further develop and test a computational framework for responsibility judgments that was originally introduced by Gerstenberg et al. (2018). The framework predicts that responsibility judgments are influenced by two key processes.
Inspired by a rich literature in attribution theory (Ajzen, 1971; Fishbein & Ajzen, 1973; Heider, 1946; Weiner & Kukla, 1970), the first process is a dispositional inference that captures what we learn about a person’s character from observing their action. The idea is that, in a given situation, we form an expectation about how another person will act, based on our knowledge about that person and the situation. The more the person’s actual behavior deviates from our expectation, the more likely we are to infer that the person’s action must have been determined by an unobserved aspect of her disposition. This dispositional inference, in turn, translates into a responsibility judgment: We hold another person responsible to the extent that we see her action as determined by her own dispositions, goals, or desires, rather than by determinants that are out of her control (e.g. Alicke, 2000; Uttich & Lombrozo, 2010). In Suprun’s case, his party affiliation and the outcome of the popular vote both spoke in favor of him voting for Trump. Given the gap between their expectations about how he would vote and Suprun’s actual vote, the framework predicts that Republicans would assign a high level of blame to him for contributing to Trump’s (hypothetical) loss. Critically, the framework predicts that Republicans would blame Suprun more than, for example, a Democratic state elector who also voted against Trump, but for whom voting for a candidate other than Trump was less surprising. The second process is a causal attribution that determines what role the person’s action played in bringing about the outcome. The framework predicts that a person is held more responsible for an outcome the closer their action was to having made a difference
Policy information
Number: #109383
Supported by: The Democratic party
Votes in favor of policy required: 5

Votes
Name        Party affiliation    Voted “yes”
Allie       Democrat             –
Bridget     Democrat             ✓
Christie    Democrat             ✓
Dalia       Republican           –
Emma        Republican           ✓

Outcome: The policy was not passed. 3 out of 5 committee members voted in favor of the policy and 5 votes were required for the policy to pass.
Figure 1. Exemplary voting scenario.
Who is responsible for the workers who died during the factory fire? We propose that in situations like these, when more than a victory in a soccer game or the growth of a flower is at stake, the weights between dispositional inferences and causal attributions shift. Specifically, we predict that in the moral domain, inferences about a person’s character become more important than causal attributions, reflecting the fundamental human motivation to determine the moral character of others (Uhlmann, Pizarro, & Diermeier, 2015). In this paper, we report the results of three experiments, each of which tackles one of these open questions. Experiment 1 provided a direct test of how participants make dispositional inferences (by asking about how surprising a particular action was in a given situation) and causal attributions (by asking about how important an action was for the outcome). Experiment 2 asked participants to make responsibility judgments in a large variety of situations that manipulated action expectations and the causal structure in graded ways. Finally, Experiment 3 applied the framework to the moral domain by manipulating the moral valence of the outcome, as well as what question participants were asked to evaluate. Before describing each experiment in more detail, we provide an overview of our experimental paradigm, and explain the computational framework. Subsequently, we report our experimental results and relate them to the predictions of our computational framework. We conclude by discussing some remaining challenges.
Overview of the experimental paradigm
In our experiments, we presented participants with scenarios in which different political committees voted on whether or not a policy should be passed. For each scenario, participants saw how many votes in favor were required for the policy to pass, how each of the committee members voted, and what the outcome of the vote was. In Experiments 1 and 2, participants also saw each committee member’s party affiliation and which party supported the policy: the Republican or the Democratic party. Figure 1 shows a voting scenario similar to the ones used in Experiments 1 and 2. Policy #109383 was up for vote. There were five people on the committee: Allie, Bridget, Christie, Dalia and Emma. The policy was supported by the Democratic party. Five votes in favor of the policy were required in order for the policy to be passed. As it turned out, the Democrats Bridget and Christie, as well as the Republican Emma voted in favor of the policy, whereas Allie and Dalia voted against it. The policy was not passed since only three committee members voted in favor but all five votes were required for the policy to pass. Experiment 3 did not include information about party affiliation. Instead, we told participants about the content of the policy that was up for vote. One group of participants made their judgments in a context where the content and the consequences of the policy were “morally neutral” (changing documents into a certain font) while another group made their judgments in a context where the content and the consequences of the vote were “morally negative” (introducing corporal punishment in schools). We expected that a committee member’s party affiliation in Experiments 1 and 2, and voting for the “morally appropriate” outcome in the morally negative context condition in Experiment 3 (i.e., voting against corporal punishment in schools) would affect participants’ expectations about how a committee member would vote. By varying how the committee members actually voted, we manipulated the extent to which their votes were surprising. We predicted that the more surprising a vote was, the more likely our participants would be to infer that the vote must have been determined by the committee member’s unique character or disposition (rather than by other factors such as allegiance to the party, or the overall quality of a particular policy). In all three experiments, the committee members’ causal contribution to the outcome was manipulated by varying the patterns of votes and the threshold of votes required for the policy to pass. We predicted that a vote would be seen as more important the closer it was to having made a difference to the outcome and the fewer causes had contributed to the outcome. We expand on these predictions below.
Model
We now discuss in more detail how we concretely implemented the two components of the computational framework – dispositional inferences and causal attributions – for the experiments reported here. For each component, we first briefly discuss the broader theoretical background, and then the specific model implementation.
Dispositional inferences
Background. How do we explain other people’s behavior? Early attribution theorists suggested Bayesian inference as a normative framework for studying this question (Ajzen, 1971; Ajzen & Fishbein, 1975; Fischhoff & Beyth-Marom, 1983; Fishbein & Ajzen, 1973; Morris & Larrick, 1995; Trope, 1974; Trope & Burnstein, 1975). Within the Bayesian framework, behavioral attributions arise from a comparison between different hypotheses as explanations for a given action. Hypotheses are favored that have high prior probability and that explain the observed behavior well. Generally, it has been shown that we consider both internal factors (such as a person’s abilities, dispositions, goals, beliefs or desires) and external factors (such as the situation the person was in) as possible behavioral explanations. However, research in attribution theory has also revealed that when we try to make sense of others’ behavior (as compared to our own behavior), we tend to emphasize dispositional or character-based explanations, and neglect the influence of situational and environmental factors (Jones & Harris, 1967;
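$$\text{Party}_{\text{same}} \sim \text{beta}(a, b), \qquad \text{Party}_{\text{other}} \sim \text{beta}(b, a)$$
$$\text{Policy} \sim \text{beta}\left(\frac{a + b}{2}, \frac{a + b}{2}\right)$$
$$\text{Vote}_{\text{same}} = \frac{\text{Party}_{\text{same}} + \text{Policy}}{2}, \qquad \text{Vote}_{\text{other}} = \frac{\text{Party}_{\text{other}} + \text{Policy}}{2}$$
(a) Mathematical form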
[Density plots omitted. Panels: P(Party_other) ∼ beta(5, 10); P(Policy) ∼ beta(7.5, 7.5); P(Party_same) ∼ beta(10, 5); P(Vote_other), labeled P(Vote = yes | Party = other, Policy); P(Vote_same), labeled P(Vote = yes | Party = same, Policy). Horizontal axis: x from 0 to 1; vertical axis: density from 0 to 4.]
(b) Graphical representation
Figure 2. Generative voting model. When deciding how to vote, a voter takes into account his party affiliation ( same or other ), and the quality of the policy, weighing each factor equally. We fit a and b to the data with the constraint that a > b , reflecting the assumption that committee members affiliated with the party that supports the policy are a priori more likely to vote in favor of the policy than committee members from the other party. The diagram shows the shape of the prior distributions for a = 10, and b = 5. See Figure A1 for a sensitivity analysis of the parameter space, and Table C1 for detailed model predictions. Note : ∼ indicates “distributed as”.
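The prior part of the generative model in Figure 2 can be sketched by forward sampling. The snippet below is only an illustration in Python with NumPy, not the authors' R/greta implementation; the function name `prior_vote_means`, the sample size, and the seed are assumptions made for this example, and a = 10, b = 5 are the illustrative values from the figure caption rather than fitted values.

```python
import numpy as np


def prior_vote_means(a=10.0, b=5.0, n_samples=100_000, seed=0):
    """Sample the priors of Figure 2 and return the mean prior probability
    of a "yes" vote for same-party and other-party committee members.

    Hypothetical helper for illustration; a and b follow the caption's
    example values (a > b).
    """
    rng = np.random.default_rng(seed)
    party_same = rng.beta(a, b, n_samples)    # Party_same ~ beta(a, b)
    party_other = rng.beta(b, a, n_samples)   # Party_other ~ beta(b, a)
    policy = rng.beta((a + b) / 2, (a + b) / 2, n_samples)  # unbiased prior
    vote_same = (party_same + policy) / 2     # equal weighting of both factors
    vote_other = (party_other + policy) / 2
    return float(vote_same.mean()), float(vote_other.mean())


if __name__ == "__main__":
    same, other = prior_vote_means()
    print(f"prior P(yes | same party)  ~ {same:.3f}")
    print(f"prior P(yes | other party) ~ {other:.3f}")
```

Because the mean of beta(a, b) is a / (a + b) and the policy prior has mean 0.5, the sampled means should lie close to 7/12 and 5/12 for a = 10 and b = 5, which matches the constraint that same-party members are a priori more likely to vote in favor.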
Specifically, we assume the generative voting model illustrated in Figure 2. Committee members who are affiliated with the party that is stated to support the policy start with a prior belief that their party supports the policy (Partysame), whereas committee members from the other party believe that their party doesn’t support the policy (Partyother). A committee member votes by equally taking into account their belief about their party’s support as well as their belief about the quality of the policy (Policy). We assume beliefs about the quality of a policy are initially unbiased – that is, policies are just as likely to be good or bad. We model these prior beliefs using beta distributions which have support between 0 and 1. We then assume that a committee member makes their choice about how to vote (Votesame or Voteother) by equally weighting their belief about the party’s support as well as the quality of the policy. Our model performs Bayesian inference by conditioning on the observed evidence (the votes) to go from prior distributions over the party and policy factors to posterior distributions over these factors (see Equation 1).
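$$p(\text{Party}_{\text{same}}, \text{Party}_{\text{other}}, \text{Policy} \mid \vec{\text{Votes}}) \propto p(\vec{\text{Votes}} \mid \text{Party}_{\text{same}}, \text{Party}_{\text{other}}, \text{Policy}) \cdot p(\text{Party}_{\text{same}}) \cdot p(\text{Party}_{\text{other}}) \cdot p(\text{Policy}) \tag{1}$$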
We assume that the vector of votes ($\vec{\text{Votes}}$) is generated from a binomial distribution with the probability of each vote determined by party membership and policy as shown in Figure 2. Based on the posteriors over Partysame, Partyother and Policy, the model then forms an expectation about how the committee member of interest will vote.
For an example, consider committee member Allie in the voting scenario shown in Figure 1. Allie is a Democrat, and thus affiliated with the party that supports the policy. Accordingly, she is a priori more likely to vote for rather than against the policy. The model then updates this prior distribution based on how the other committee members voted. The two other Democrats, Bridget and Christie, voted for the policy, and one out of the two Republicans voted for the policy. Based on this evidence, the model now believes Allie is even more likely than before to vote in favor of the policy. We define the extent to which a committee member’s vote is surprising as the difference between the actual vote (coding a vote against the policy as 0 and a vote for the policy as 1) and the expected vote (where we use the mean of the posterior over the committee member’s vote (Votesame or Voteother depending on the committee member’s party) as our measure of expectation; see Figure 2). Given that an observer would have expected Allie to vote for the policy, her actual vote against the policy is surprising. The inference that Allie’s vote must have been affected by her individual preference (since it’s not well-explained by how the others voted) is then predicted to lead to an increased judgment of responsibility. We implemented the dispositional inference model in R (R Core Team, 2019) using the greta package (Golding, 2018). We modelled the prior distributions over Party and Policy as beta distributions, and the likelihood function for the pattern of votes as a binomial distribution, as shown in Figure 2. greta uses Markov-chain Monte Carlo (MCMC) inference to approximate the posterior distribution. The code implementing this model is available on the project’s GitHub repository: https://github.com/cicl-stanford/voting_responsibility
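To make the Allie example concrete, the following sketch approximates the posterior on a grid instead of with MCMC. It is a simplified Python stand-in, not the authors' greta model: the function names, the grid resolution, and the choice to condition only on the other four votes are assumptions for illustration, and a = 10, b = 5 are the illustrative values from Figure 2.

```python
import numpy as np


def beta_weights(x, a, b):
    """Unnormalized beta(a, b) density on the grid x, scaled to sum to 1."""
    w = x ** (a - 1) * (1 - x) ** (b - 1)
    return w / w.sum()


def expected_vote(focus_party, other_votes, a=10.0, b=5.0, n_grid=60):
    """Posterior mean probability that the focus member votes "yes".

    other_votes: list of (party, vote) pairs for the remaining members,
    with party in {"same", "other"} and vote coded 1 (yes) or 0 (no).
    Grid approximation; an assumption swapped in for the paper's MCMC.
    """
    x = (np.arange(n_grid) + 0.5) / n_grid           # bin midpoints in (0, 1)
    ps, po, pol = np.meshgrid(x, x, x, indexing="ij")
    prior = (beta_weights(x, a, b)[:, None, None]     # Party_same
             * beta_weights(x, b, a)[None, :, None]   # Party_other
             * beta_weights(x, (a + b) / 2, (a + b) / 2)[None, None, :])
    p_yes = {"same": (ps + pol) / 2, "other": (po + pol) / 2}
    likelihood = np.ones_like(prior)
    for party, vote in other_votes:
        p = p_yes[party]
        likelihood *= (p if vote == 1 else 1 - p)     # Bernoulli per vote
    posterior = prior * likelihood
    posterior /= posterior.sum()
    return float((posterior * p_yes[focus_party]).sum())


def surprise(actual_vote, expected):
    """Absolute difference between the actual vote (0 or 1) and the expectation."""
    return abs(actual_vote - expected)


if __name__ == "__main__":
    # Figure 1: Bridget and Christie (same party) voted yes,
    # Dalia (other party) voted no, Emma (other party) voted yes.
    others = [("same", 1), ("same", 1), ("other", 0), ("other", 1)]
    prior_mean = expected_vote("same", [])
    post_mean = expected_vote("same", others)
    print(f"prior expectation:     {prior_mean:.3f}")
    print(f"posterior expectation: {post_mean:.3f}")
    print(f"surprise of Allie's 'no' vote: {surprise(0, post_mean):.3f}")
```

Under these assumptions the posterior expectation for Allie exceeds her prior expectation, so her vote against the policy becomes more surprising once the other votes are taken into account, in line with the qualitative description above.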
Causal attributions
Background. We now turn to the second key process in the computational framework: A causal attribution of the person’s role in bringing about the outcome. One way of capturing whether a person’s action is causally connected to an outcome is to run a counterfactual simulation and ask whether the outcome would have been different without the person’s action (Lewis, 1973). This test of causation works well in situations that involve a single agent: Is Martin responsible for the bottle being smashed? Yes, because had Martin not dropped the bottle, the bottle would not have smashed. However, this simple counterfactual analysis does not suffice in general as a model of responsibility. Consider again the voting scenario in Figure 1. All five of the committee members had to vote in favor for the policy to pass but only three committee members did, so the policy did not pass. Intuitively, Allie and Dalia are each somewhat responsible for the policy not passing, because both of them voted against the policy. However, their individual actions did not make a difference to the outcome because the outcome was causally overdetermined: Even if Allie had voted in favor of the policy, it would still have failed since five votes in total were required for the policy to pass. Halpern and Pearl (2005) introduce a solution to this problem. They define a person’s action as a cause of an outcome if the outcome would have depended on the action under certain contingencies. Their definition identifies Allie and Dalia as causes even in this case of overdetermination. Building on Halpern and Pearl’s (2005) definition of causality, Chockler and Halpern (2004) developed a model of responsibility that makes graded predictions. The degree of responsibility that an action bears for an outcome is defined in terms of the minimal
there are that have contributed to an outcome, the less important each individual cause is perceived to be. In our voting setting, this means that the more committee members voted in line with the outcome of the vote, the less important each vote is perceived to be. Overall, we predict that both pivotality and the number of causes affect participants’ causal attributions. We assume that both factors affect causal attributions in an additive way (with number of causes being a negative predictor).
$$\text{Causal attribution} = \beta_1 \cdot \text{Pivotality} + \beta_2 \cdot \text{Number of causes}, \tag{3}$$
whereby β1 and β2 determine how much emphasis is put on pivotality and the number of causes when making causal attributions.
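Equation 3 can be illustrated with a small Python sketch. Several pieces here are assumptions rather than details confirmed by the text above: pivotality is computed in the Chockler and Halpern (2004) form 1 / (N + 1), where N is the minimal number of other votes that would have to change for the focus vote to become pivotal; the number of causes is counted as the number of votes in line with the outcome; and the weights `beta1` and `beta2` are placeholder values, not fitted parameters.

```python
def pivotality(votes, threshold, focus):
    """Pivotality 1 / (N + 1) of the focus member's vote.

    votes: list of 0/1 votes (1 = in favor); threshold: votes needed to pass;
    focus: index of a member whose vote is in line with the outcome.
    N is the minimal number of other votes that must change before flipping
    the focus vote would change the outcome (assumed definition).
    """
    yes = sum(votes)
    passed = yes >= threshold
    if bool(votes[focus]) != passed:
        raise ValueError("the focus vote must be in line with the outcome")
    n_changes = yes - threshold if passed else threshold - 1 - yes
    return 1.0 / (n_changes + 1)


def number_of_causes(votes, threshold):
    """Number of votes in line with the outcome (assumed operationalization)."""
    passed = sum(votes) >= threshold
    return sum(1 for v in votes if bool(v) == passed)


def causal_attribution(votes, threshold, focus, beta1=1.0, beta2=-0.1):
    """Equation 3 with placeholder weights; beta2 < 0 makes additional
    causes reduce the attributed importance."""
    return (beta1 * pivotality(votes, threshold, focus)
            + beta2 * number_of_causes(votes, threshold))


if __name__ == "__main__":
    # Figure 1: Allie, Bridget, Christie, Dalia, Emma; five votes required.
    figure1 = [0, 1, 1, 0, 1]
    print(pivotality(figure1, threshold=5, focus=0))   # Allie
    print(number_of_causes(figure1, threshold=5))
    print(causal_attribution(figure1, threshold=5, focus=0))
```

In the Figure 1 scenario, one other vote (Dalia's) would have to change before Allie's vote could decide the outcome, so N = 1 under this definition, and two members voted in line with the outcome.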
Bringing it together: The computational model
We predict that judgments of responsibility are sensitive to what the observer learned about the person from their action (‘dispositional inference’), and how important the person’s action was perceived to be for the outcome (‘causal attribution’). For simplicity, we assume that both factors of the model combine additively to affect judgments of responsibility.
$$\text{Responsibility} = \alpha \cdot \text{Dispositional inference} + \beta \cdot \text{Causal attribution} \tag{4}$$
In the remainder of this paper, we report the results of three empirical studies, designed to answer three outstanding questions: Can the model components be directly assessed by asking people to judge dispositional inferences and causal contributions directly (Experiment 1)? Does the model capture how people assign responsibility in group contexts with more complex causal settings (Experiment 2)? And finally, can the model be applied to the moral domain, where judgments of responsibility play a central role (Experiment 3)?
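The additive combination in Equation 4 can be sketched as a two-predictor linear model. The example below is an assumption-laden illustration in Python: it estimates α and β by ordinary least squares without an intercept, which is a simplification and not the Bayesian mixed effects approach used in the paper, and the input numbers are synthetic toy values generated from known weights, not data from the experiments.

```python
import numpy as np


def responsibility(dispositional, causal, alpha, beta):
    """Equation 4: additive combination of the two model components."""
    return alpha * np.asarray(dispositional, float) + beta * np.asarray(causal, float)


def fit_weights(dispositional, causal, judgments):
    """Least-squares estimate of (alpha, beta); a simplified stand-in
    for the model fitting described in the paper."""
    X = np.column_stack([dispositional, causal])
    coef, *_ = np.linalg.lstsq(X, np.asarray(judgments, float), rcond=None)
    return float(coef[0]), float(coef[1])


if __name__ == "__main__":
    # Synthetic toy inputs (not experimental data).
    surprise = [0.1, 0.8, 0.4, 0.6]
    importance = [1.0, 0.5, 0.33, 0.25]
    toy_judgments = responsibility(surprise, importance, alpha=30.0, beta=50.0)
    print(fit_weights(surprise, importance, toy_judgments))
```

Because the toy judgments are generated without noise, the fit recovers the weights used to create them; with real ratings, the relative size of the two estimates would indicate how strongly each component contributes.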
Experiment 1: Testing dispositional inferences and causal attributions directly
In Experiment 1, participants’ task was to judge to what extent the vote of a politician who voted in a committee on whether a new policy should be passed was 1) surprising and
Methods
Participants. 40 participants ( Mage = 35, SDage = 11, 10 female) were recruited via Amazon Mechanical Turk. Participation was restricted to workers based in the US with a prior approval rate greater than 95% (see Mason & Suri, 2012, for details about how Amazon Mechanical Turk works).
Design. Experiment 1 included 27 voting scenarios. Each scenario featured a different political committee comprised of five members.^2 Between scenarios, we manipulated how each committee member voted, how many votes in favor of the policy were required for the policy to pass (1–5), the outcome of the vote (passed / did not pass), which political party supported the policy (Democrats / Republicans) and the party affiliation of each committee member. Figure 1 shows one scenario. For each scenario, we assessed importance and surprise judgments for one out of the five committee members. We selected 27 scenarios that elicit a range of predictions from our surprise and importance model.^3
Procedure
The experiment was programmed in Qualtrics. After receiving instructions, participants answered a set of comprehension check questions. Participants were redirected to the beginning of the survey in case they didn’t correctly answer all of the comprehension check questions. Participants were then presented with the 27 voting scenarios in randomized order. For each scenario, participants were asked to judge the extent to which they considered one of the committee members’ votes 1) important and 2) surprising. For example, when the committee member John had been described as having voted in favor of the policy and the policy passed, participants were asked: 1) “How important was John’s vote for the policy passing?” and 2) “How surprising was John’s vote?”. Participants responded using continuous sliders whose endpoints were labeled with “not important at all” (0) and “very important” (100), as well as “not surprising at all” (0) and “very surprising” (100). On average, it took participants 13.67 minutes (SD = 9.29) to complete the experiment.^4
Results
We first describe participants’ judgments for a selection of cases in detail, and then report their overall judgments. Detailed analysis of a selection of scenarios. Figure 3 shows the results of four of the voting scenarios. The figure shows participants’ mean judgments together with the predictions of the surprise and importance model described above. In all four scenarios, the policy was passed because the number of votes in favor met or exceeded the threshold (T). We take a look at surprise judgments first. In all four scenarios, the committee member for whom ratings were assessed (the “focus person”, indicated by the arrow in Figure 3) voted in favor of the policy. In Scenarios 1 and 2, the focus person was affiliated with the party that supported the policy, whereas in Scenarios 3 and 4, the focus person was affiliated with the other party. We see that in general, participants were more surprised when a person voted in favor of a policy despite being from the opposite party. However, surprise judgments were not solely determined by whether a person’s vote was consistent with their party affiliation. Participants were more surprised about the
(^2) Note that unlike the example in Figure 1, we used only male first names for the politicians within our actual experiments, in order to eliminate possible gender effects.
(^3) See Table B1 in the Appendix for a full list of the scenarios.
(^4) All materials including data, experiments, and analysis scripts are available here: https://github.com/cicl-stanford/voting_responsibility
[Scatter plots omitted. Both axes run from 0 to 100; four highlighted data points are labeled 1–4.]
(a) Surprise judgments
(b) Importance judgments
Figure 4. Experiment 1: Surprise and importance judgments. Data points show mean judgments. The colored data points correspond to the four scenarios shown in Figure 3. The gray ribbon shows the 95% highest-density interval (HDI) for the model fit. The error bars indicate bootstrapped 95% confidence intervals. Note: r = Pearson moment correlation, RMSE = root mean squared error.
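The two fit statistics named in the Figure 4 note can be computed as follows. This is a generic Python sketch of the standard definitions; the function names are chosen for this example, and the numbers in the usage section are arbitrary toy values rather than experimental data.

```python
import numpy as np


def pearson_r(predicted, observed):
    """Pearson correlation between model predictions and mean judgments."""
    predicted = np.asarray(predicted, float)
    observed = np.asarray(observed, float)
    return float(np.corrcoef(predicted, observed)[0, 1])


def rmse(predicted, observed):
    """Root mean squared error, in the units of the rating scale."""
    predicted = np.asarray(predicted, float)
    observed = np.asarray(observed, float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))


if __name__ == "__main__":
    # Arbitrary toy values on a 0-100 scale (not data from the study).
    toy_predictions = [10.0, 40.0, 55.0, 90.0]
    toy_judgments = [15.0, 35.0, 60.0, 85.0]
    print(pearson_r(toy_predictions, toy_judgments))
    print(rmse(toy_predictions, toy_judgments))
```

The correlation captures whether the model orders the scenarios correctly, while the RMSE additionally penalizes systematic offsets on the 0 to 100 scale, which is why both are reported together.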
pivotal if the second committee member had also voted against the policy. As predicted, participants judged the focus person’s vote as more important the closer it was to being pivotal for the outcome. Participants’ importance judgments are greater in Scenarios 1 and 4 than in Scenario 3 (20.05 [15.02, 25.15]), and greater in Scenario 3 than in Scenario 2 (31.79 [26.76, 36.83]).
However, if pivotality was the only factor that influenced people’s judgments of importance, then varying the threshold while keeping pivotality fixed should not make a difference. That is, we should expect no difference in importance judgments between Scenarios 1 and 4 since in both scenarios, the focus person’s vote was pivotal for the outcome. However, participants considered the person’s vote more important in Scenario 4 than in Scenario 1 (21.96 [16.89, 27.93]). This shows that participants’ importance judgments are not solely determined by how close a person’s vote was to having been pivotal for the outcome, but that it also matters how many causes contributed to the outcome. In Scenario 4, there was only a single cause for the policy passing – the focus member’s vote. In contrast, in Scenario 1, there were five causes for the policy passing – all of the committee members’ votes were required. A vote is seen as more important when it is the only cause versus just one of several causes. Our model of causal attribution which considers both pivotality and number of causes adequately captures the pattern of importance judgments.
Overall results and model comparison. Figure 4 shows scatter plots of the model’s predictions and participants’ mean surprise and importance ratings for all 27 scenarios. We fitted the model to individual participants’ responses by specifying a Bayesian linear mixed effects model with random intercepts and slopes for each predictor.
Our dispositional inference model captures participants’ average surprise judgments very well with r = .98 and RMSE = 4.59 (Figure 4a). A model that considers only whether the committee member voted in line with his party affiliation also correlates well with participants’ judgments (r = .95 and RMSE = 7.66). We compared the models using approximate leave-one-out cross-validation as model selection criterion (PSIS-LOO; cf. Vehtari, Gelman, & Gabry, 2017). According to this criterion, the Bayesian surprise model performs better than the model that considers only party affiliation (difference in expected log predictive density (elpd) = 38.4, with a standard error of 16.1).^6 Figure 4b shows that the causal attribution model accounts well for participants’ mean importance judgments with r = .92 and RMSE = 7.95. Remember that our causal attribution model considers both the extent to which a person’s action was pivotal for the outcome, as well as the number of causes that contributed to the outcome. This model compares favorably with lesioned models that only consider a subset of the predictors, such as just pivotality (r = .88 and RMSE = 10.06; elpd = 37.4, standard error = 8.8) or just the number of causes that contributed to the outcome (r = .54 and RMSE = 17.54; elpd = 233.3, standard error = 23.7).
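The model comparisons above can be checked against the rule of thumb given in footnote 6, namely that a model is considered superior when the elpd difference exceeds twice its standard error. The Python sketch below applies that rule to the three reported comparisons; the function name and the dictionary labels are chosen for this illustration.

```python
def clearly_better(elpd_diff, se_diff, factor=2.0):
    """Rule of thumb from footnote 6: the difference in expected log
    predictive density should exceed `factor` times its standard error."""
    return abs(elpd_diff) > factor * se_diff


if __name__ == "__main__":
    # (elpd difference, standard error) as reported in the text.
    comparisons = {
        "surprise model vs. party-affiliation only": (38.4, 16.1),
        "full causal model vs. pivotality only": (37.4, 8.8),
        "full causal model vs. number of causes only": (233.3, 23.7),
    }
    for label, (diff, se) in comparisons.items():
        print(f"{label}: {diff} > 2 x {se} -> {clearly_better(diff, se)}")
```

All three reported differences exceed twice their standard errors, although the margin is narrowest for the comparison between the surprise model and the party-affiliation model.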
Discussion
In this experiment, we presented participants with a number of different voting scenarios that manipulated how many votes were required for a particular policy to pass, the political affiliation of the committee members, how each committee member voted, and whether the policy passed (see Figure 1). The results show that the extent to which participants found a committee member’s vote to be surprising and important for the outcome was systematically affected by this information. To explain participants’ surprise judgments, we developed a dispositional inference model that forms an expectation about how a committee member would vote based on the committee members’ party affiliations as well as how they voted. This model captures participants’ surprise judgments well, and better than a model that only considers a committee member’s party affiliation. Participants’ judgments about how important a committee member’s vote was for the outcome are well-explained by our causal attribution model. This model considers both how close a person’s vote was to being pivotal for the outcome, as well as how many other committee members voted alike. A vote is seen as more important the closer it was to being pivotal (i.e., when the outcome of the overall vote would have been different had the committee member voted differently) and the fewer causes contributed to the outcome.
Experiment 2: Responsibility judgments in voting scenarios
In Experiment 1, we experimentally manipulated the extent to which a vote was surprising and its importance for the outcome, and assessed how these manipulations affected participants’ dispositional inferences and their causal attributions. Since the computational framework predicts that dispositional and causal inferences combine additively to yield responsibility judgments, the extent to which committee members in our voting scenarios are considered responsible for the outcome of the vote should be influenced by the same
(^6) As a rule of thumb, a model is considered superior when the difference in expected log predictive density is greater than twice the standard error of that difference (for details, see Vehtari et al., 2017).
or not passing. Participants made their judgments on sliding scales ranging from “not at all responsible” (0) to “very much responsible” (100). Participants assigned responsibility to committee members whose vote was in line with the outcome. Depending on the scenario, participants were either asked to make one or two judgments. When all committee members whose vote was in line with the outcome shared the same party affiliation, participants made only one judgment. When two of the committee members whose vote was in line with the outcome came from different political parties, then participants were asked to judge the responsibility for one of the Democrats and one of the Republicans. Out of the set of 170 scenarios, there were 90 scenarios in which participants were asked to make a single judgment, and 80 scenarios in which they made responsibility judgments for two committee members. Thus, we have a total of 250 data points. In our example scenario depicted in Figure 1, two voters voted in line with the outcome of the vote (policy not passed): Allie and Dalia. In this case, since Allie and Dalia came from different political parties, we assessed responsibility judgments for both of them. Thus, in this scenario, participants made two ratings; one for Allie (Democrat) and one for Dalia (Republican). On average, it took participants 6.61 minutes (SD = 7.03) to complete the experiment.
Results
We first discuss a selection of cases individually before examining the data on a higher level of aggregation to see whether, and to what extent, participants’ responsibility judgments were influenced by dispositional inferences and causal attributions.
Detailed analysis of a selection of cases. Figure 5 shows participants’ judgments for 24 of the 170 scenarios. These 24 scenarios are the ones that were also used in Experiment 1. The figure shows participants’ mean responsibility judgments in addition to the mean surprise and importance judgments from Experiment 1, as well as the predictions of a model that uses participants’ surprise and importance judgments from Experiment 1 to predict participants’ responsibility judgments in the current experiment.
For example, in the first scenario, the threshold for the policy passing was one (T = 1), and all the committee members were from the party other than the one that supported the policy (O). The policy passed because one of the committee members voted in favor of the policy. We see that in this case, participants in Experiment 1 considered the committee member’s action very surprising, and also judged that the vote was very important. Here, in Experiment 2, participants judged the responsibility of the committee member to be very high, and the model correctly predicts a high responsibility judgment in this case.
In Scenario 24, the threshold was 5, but all committee members voted against the policy. Two members were affiliated with the party that supported the policy, and three were affiliated with the other party. Participants in Experiment 1 found it somewhat surprising that the focus person didn’t vote for the policy even though he was from the party that supported the policy. Note, however, that they found this less surprising than what the focus person did in Scenario 1 (who also voted against the party affiliation).
In Scenario 1, all other committee members voted against the policy, and the focus member was the only one voting in favor. However, in Scenario 24, all of the committee members voted against the policy, thus making the action of the focus member less surprising.
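The contrast between Scenarios 1 and 24 can also be expressed as the number of other votes that would have to change before the focus member's vote decides the outcome. The following is a minimal sketch, not the authors' implementation. It assumes a five-member committee in both scenarios, a policy that passes when the yes votes reach the threshold, and a graded pivotality measure of the form 1 / (N + 1), where N is the number of required changes. The function names are hypothetical.

```python
from itertools import combinations


def passes(votes, threshold):
    """A policy passes when the number of yes votes reaches the threshold."""
    return sum(votes) >= threshold


def changes_until_pivotal(votes, focus, threshold):
    """Smallest number of other votes to flip so that the focus vote decides.

    The flipped votes must leave the actual outcome unchanged. Returns None
    if no such set of changes exists.
    """
    outcome = passes(votes, threshold)
    others = [i for i in range(len(votes)) if i != focus]
    for n_changes in range(len(others) + 1):
        for subset in combinations(others, n_changes):
            changed = list(votes)
            for i in subset:
                changed[i] = not changed[i]
            if passes(changed, threshold) != outcome:
                continue  # the changes alone must not alter the outcome
            changed[focus] = not changed[focus]
            if passes(changed, threshold) != outcome:
                return n_changes  # flipping the focus vote now flips the outcome
    return None


def pivotality(votes, focus, threshold):
    """Assumed graded measure: 1 / (N + 1), N = number of changes needed."""
    n_changes = changes_until_pivotal(votes, focus, threshold)
    return 0.0 if n_changes is None else 1.0 / (n_changes + 1)


# Scenario 1 (five members assumed): threshold 1, only the focus member votes yes.
scenario_1 = [True, False, False, False, False]
# Scenario 24: threshold 5, all five members vote no.
scenario_24 = [False, False, False, False, False]

print(changes_until_pivotal(scenario_1, 0, 1), pivotality(scenario_1, 0, 1))
print(changes_until_pivotal(scenario_24, 0, 5), pivotality(scenario_24, 0, 5))
```

Under these assumptions the focus vote in Scenario 1 is already pivotal, whereas in Scenario 24 all four other votes would have to change first.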
[Figure 5: judgments on a scale from 0 to 100 for each of the 24 scenarios. Series: surprise, importance, model prediction, responsibility. Legend: T = threshold, S = same party, O = other party, ⇨ = focus, ✓ = yes, ✗ = no.]
Figure 5. Mean responsibility judgments (black dots) together with the mean surprise (red dots) and importance (blue dots) judgments based on Experiment 1, as well as the model prediction (gray dots) that combines surprise and importance judgments. We numbered the cases here in decreasing order of participants’ mean responsibility judgments. Note: The error bars indicate bootstrapped 95% confidence intervals.
Participants in Experiment 1 judged that the focus person’s action was not particularly important in Scenario 24. His vote is far from being pivotal (all of the other four votes would have needed to change), and it’s only one among five causes of the outcome. Participants in Experiment 2 judged that the focus person in Scenario 24 was not very
[Figure 6: two scatter plots, panels (a) and (b); both axes range from 20 to 100.]
(a) Model predictions for a selection of cases based on participants’ surprise and importance judgments in Experiment 1.
(b) Model predictions for the full set of cases based on considering surprise, pivotality, and the number of causes as predictors.
Figure 6. Experiment 2: Scatter plot between model predictions (x-axis) and mean responsibility judgments (y-axis). The gray ribbon indicates the 95% HDI for the regression line. The error bars indicate bootstrapped 95% confidence intervals. (Note: r = Pearson moment correlation, RMSE = root mean squared error.)
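The two fit statistics named in the Figure 6 note can be computed as in the sketch below. The numbers are hypothetical placeholders chosen for illustration, not data from the experiment, and the function names are assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, observed):
    """Root mean squared error between predictions and observations."""
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values on the 0-100 response scale (illustration only).
model_predictions = [30.0, 45.0, 60.0, 75.0, 90.0]
mean_judgments = [35.0, 40.0, 65.0, 70.0, 95.0]

print(pearson_r(model_predictions, mean_judgments))
print(rmse(model_predictions, mean_judgments))
```

Both statistics are needed because r is insensitive to a constant offset or rescaling of the predictions, while RMSE is expressed in the units of the response scale.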
(elpd) = 89.4, with a standard error of 16.1). A model that, in addition to the predictors discussed here, also considers whether the outcome was positive or negative (i.e., whether the policy passed), does an even better job at predicting participants’ responsibility judgments with r = .81 and RMSE = 7.60 (difference in elpd = 70.7, with a standard error of 14.6, compared to the model without the outcome predictor). Participants assigned more responsibility when the outcome was positive (i.e., when a committee member voted in favor of a policy) than when the outcome was negative (and a committee member voted against a policy).
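The rule of thumb from footnote 6 (a difference in elpd larger than twice its standard error) can be applied to the two differences reported above. This is a small sketch with a hypothetical helper name. The models involved in the first comparison are not named in the surviving text, so it is labeled generically here.

```python
def clearly_better(elpd_difference, standard_error):
    """Rule of thumb: the elpd difference exceeds twice its standard error."""
    return elpd_difference > 2 * standard_error


# (difference in elpd, standard error of the difference), as reported in the text.
comparisons = {
    "first reported comparison": (89.4, 16.1),
    "with vs. without outcome predictor": (70.7, 14.6),
}

for label, (difference, se) in comparisons.items():
    ratio = difference / se
    print(f"{label}: difference / SE = {ratio:.2f}, "
          f"passes rule of thumb: {clearly_better(difference, se)}")
```

Both reported differences are more than four standard errors wide, so they pass the rule of thumb comfortably.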
Table 1
Estimates of the mean, standard error, and 95% HDIs of the different predictors in the Bayesian mixed effects model. Note: n_causes = number of causes.

responsibility ∼ 1 + surprise + pivotality + n_causes + (1 + surprise + pivotality + n_causes | participant)

term        estimate  std.error  lower 95% HDI  upper 95% HDI
intercept   59.94     3.25       54.70          65.
surprise    21.68     4.57       14.17          29.
pivotality  13.52     1.82       10.47          16.
n_causes    -5.72     0.50       -6.55          -4.
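To make the additive structure of the model concrete, the sketch below combines the fixed-effect estimates from Table 1 into a single predicted judgment. It is illustrative only: it ignores the participant-level random effects, and the scaling of the predictors is an assumption, since this section does not state how surprise, pivotality, and n_causes were coded. The function name and the input values are hypothetical.

```python
# Posterior mean estimates of the fixed effects, copied from Table 1.
FIXED_EFFECTS = {
    "intercept": 59.94,
    "surprise": 21.68,
    "pivotality": 13.52,
    "n_causes": -5.72,
}


def predict_responsibility(surprise, pivotality, n_causes):
    """Additive combination of the predictors (fixed effects only)."""
    return (
        FIXED_EFFECTS["intercept"]
        + FIXED_EFFECTS["surprise"] * surprise
        + FIXED_EFFECTS["pivotality"] * pivotality
        + FIXED_EFFECTS["n_causes"] * n_causes
    )


# Hypothetical inputs; the predictor coding is assumed, not taken from the paper.
print(predict_responsibility(surprise=0.5, pivotality=1.0, n_causes=1))
print(predict_responsibility(surprise=0.5, pivotality=1.0, n_causes=2))
```

Because the terms are additive, each additional cause lowers the predicted judgment by the same amount regardless of the values of surprise and pivotality.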
Discussion
In this experiment, we asked participants to make responsibility judgments about individual committee members for a large set of voting scenarios. Our computational framework captured participants’ judgments well. While previous work showed that the model accounts well for responsibility judgments about individual decision-makers in achievement contexts (Gerstenberg et al., 2018), the results of this experiment show that the model also does a good job of accounting for responsibility judgments about individuals in group settings. Furthermore, while the responsibility judgments obtained in previous work were consistent with the key processes that the model postulates (dispositional inference and causal attribution), the results of Experiments 1 and 2 together provide a much stronger test of this proposal. Participants’ surprise and importance judgments in Experiment 1 predict the responsibility judgments of different participants in Experiment 2.
The results further showed that while both components of the model are important, participants’ responsibility judgments were most strongly influenced by the causal attribution aspect of our framework, which expresses how important a person’s action was for bringing about the outcome. However, as we discussed earlier, the extent to which dispositional inferences play a role in responsibility judgments might differ between domains. In Experiment 3, we test the idea that in the moral domain, the most relevant component may shift from causal attributions to dispositional inferences.
In addition to the factors that our model considers, we also found that participants’ responsibility judgments were affected by whether the outcome was positive (i.e., the policy was passed) or negative (i.e., the policy was not passed). This effect was not predicted by our model, but could in principle be accommodated by it. Right now, our model assumes that committee members are just as likely to vote for or against a policy.
It is possible, however, that people consider it generally more likely that committee members will vote against a policy rather than in its favor; that is, that the prior distribution over the policy is skewed toward voting “no”. It is well known that people often prefer to “do nothing” when they have the choice between acting and not acting (see, e.g., Ritov & Baron, 1992); it seems plausible that they expect others to behave in the same way. In our voting setting, this means that participants might have assumed that committee members vote “yes” only if they really agree with the policy, while voting “no” is compatible both with being against the policy and with having no strong opinion.
Experiment 3: Responsibility judgments in moral contexts
Gerstenberg et al. (2018) previously tested the computational framework in achievement contexts, where the outcome critically depended on an individual’s skill. Achievement contexts naturally elicit judgments of responsibility, as one can witness in any sports bar. However, judgments of responsibility are also particularly relevant in the moral domain. The computational framework is not restricted to specific contexts or a specific domain. When people assign responsibility to an individual for an outcome in the moral domain, the framework predicts that their judgments should be affected by dispositional inferences on the one hand, and causal attributions on the other hand, just like in achievement contexts. However, research in moral psychology has shown that when people make moral judgments, they often assign more weight to those features of a behavior that seem most in-