Lies, Dupes, and Shit Tests

Leaders sometimes tell outrageous lies. Maybe they are channeled through “fake news” sites. Of course fake news has been around forever — e.g., we’ve long spoken of “government mouthpieces” and propaganda. How could leaders get away with such lies?

A great read on this is a working paper by Andrew Little: .pdf. Here is a summary: .html. Andrew’s theory supposes that some segment of the population consists of dupes, people gullible enough to fall for the lies. Of course, if most people are dupes, there is not much of a story to tell (which is not to say that such a story would be inaccurate). But the world also contains sophisticates who can see through the lies. The dupes could nonetheless induce sophisticates to express belief in the lie (even if privately the sophisticates think differently). The reason is that the sophisticates share a sense of the need for consensus. So they will go along, not because they believe or even like the lie, but because they would simply prefer to be part of the consensus.

A complementary way to think about this does not require there to be any dupes at all. I call this the “shit test” explanation. Suppose everyone is a sophisticate. But for at least some of these sophisticates, it is important to be in good standing with the leader. Then the leader can say something outrageous, and this serves as a shit test: the leader can use it to check people’s commitment to the leader. If you are truly devoted, you will “swallow the shit.” The more outrageous the lie, the better it works as a shit test (although maybe the leader cannot push it too far). The way people respond therefore conveys information about whether they are with the leader or not. The leader’s allies might also beat other people over the head with the shit: are you with us or against us? You would expect separation in people’s expressed support for the lie on the basis of their degree of attachment to the leader or the degree to which they feel compelled to go along with the leader’s allies. An implication of the shit test theory is that a mirror phenomenon is also possible: when the leader speaks truths, those who are alienated from the leader may have reason to deny those truths.
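
To make the separation logic concrete, here is a minimal sketch. It assumes (my simplification, not anything from Little's paper) that each person has a single "attachment" number, that endorsing a lie carries a cost that rises with how outrageous it is, and that a person endorses whenever attachment is at least as large as that cost. The names and numbers are made up for illustration.

```python
def who_endorses(attachments, cost_of_endorsing):
    """Return the set of people who publicly endorse the lie.

    attachments: dict mapping a (hypothetical) person to how much they
        value good standing with the leader.
    cost_of_endorsing: how costly it is to endorse the lie; assumed to
        rise with how outrageous the lie is.
    """
    return {
        person
        for person, attachment in attachments.items()
        if attachment >= cost_of_endorsing
    }


# Illustrative values only.
attachments = {"loyalist": 0.9, "fair-weather ally": 0.4, "opponent": 0.0}

mild_lie = who_endorses(attachments, cost_of_endorsing=0.1)
outrageous_lie = who_endorses(attachments, cost_of_endorsing=0.6)
too_far = who_endorses(attachments, cost_of_endorsing=1.5)
```

In this toy setup a mild lie does not distinguish the loyalist from the fair-weather ally, a more outrageous lie does, and a lie pushed too far gets endorsed by no one, so it reveals nothing.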


Election questions

  1. R votes were about what they were in the past. What we really need to know is whether these are almost entirely people who have chosen R in the past. If yes, then the big question for Rs is “why would they accept him?” If it’s lots of new R votes, compensating for lots of Rs who didn’t vote for R again, then the question is “why these new Rs for this election?”

  2. D votes are down relative to the past. What we really need to know is whether this is primarily because lots who voted D in the past either stayed home (whether by choice or because of suppression) or voted third party. If yes, then the big question is “why didn’t they vote for D again?”

  3. If neither of the above accounts for what happened, then the implication is that a non-negligible share of people who had voted D in the past were actually comparing what R and D had to offer, and at least in a select set of districts in a select set of states, chose R. Then the big question is “why would they switch?”

My current belief, based on results (vote shares, vote share swings, and vote totals) and my own understanding about voters, is that 3 is unimportant, asking about the relative appeal of the R vs D candidates is irrelevant, and better understanding would come from examining why Rs do what they do as Rs and Ds as Ds. But my belief about this would change if I saw individual-level survey data or voter file data suggesting that 3 is in fact important.
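
If one did have individual-level data, the three questions above reduce to reading off a past-vote by current-vote table. Here is a rough sketch of that bookkeeping. The record format (pairs of past and current choice, with labels "R", "D", "other", and "none") and the toy records are my own assumptions for illustration, not real data.

```python
from collections import Counter


def summarize_flows(records):
    """Summarize vote flows from (past_choice, current_choice) pairs.

    Choices are assumed to be one of "R", "D", "other", or "none".
    """
    flows = Counter(records)
    r_now = sum(n for (_, now), n in flows.items() if now == "R")
    d_past = sum(n for (past, _), n in flows.items() if past == "D")
    if r_now == 0 or d_past == 0:
        raise ValueError("need at least one current R voter and one past D voter")
    return {
        # Question 1: are current R votes mostly from past R voters?
        "share_of_R_votes_from_past_R": flows[("R", "R")] / r_now,
        # Question 2: did past D voters stay home or go third party?
        "share_of_past_D_to_none_or_other": (
            flows[("D", "none")] + flows[("D", "other")]
        ) / d_past,
        # Question 3: did past D voters switch to R?
        "share_of_past_D_switching_to_R": flows[("D", "R")] / d_past,
    }


# Made-up records, purely to show the mechanics.
toy_records = (
    [("R", "R")] * 8
    + [("none", "R")] * 1
    + [("D", "R")] * 1
    + [("D", "D")] * 6
    + [("D", "none")] * 2
    + [("D", "other")] * 1
)
summary = summarize_flows(toy_records)
```

A high first share together with a high second share would point to questions 1 and 2; a sizable third share is what would change my mind about question 3.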


Notes on trust

Let’s think about trust in the context of the trust game. In the trust game, the first mover has the option to engage in a transaction with a trustee. Specifically, the first mover has the option to transfer resources to the trustee in hopes that the trustee will enhance the value of these resources and then share the surplus back with the first mover. Trust, then, is the first mover’s willingness to engage in the transaction with the trustee. We can use this conceptualization in thinking about trust generally.
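
For readers who have not seen the game, here is a minimal sketch of the payoffs. The parameter names, the multiplier of 3, and the endowment of 10 are illustrative assumptions on my part; the description above only says that the trustee may enhance the value of what is sent and share some of it back.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustGameOutcome:
    first_mover_payoff: float
    trustee_payoff: float


def play_trust_game(endowment, sent, multiplier, share_returned):
    """Compute payoffs for one round of a simple trust game.

    endowment: what the first mover starts with.
    sent: the amount the first mover transfers (0 means no trust).
    multiplier: how much the transfer grows in the trustee's hands.
    share_returned: fraction of the grown amount the trustee sends back.
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sent must be between 0 and the endowment")
    if not 0 <= share_returned <= 1:
        raise ValueError("share_returned must be between 0 and 1")
    pot = sent * multiplier
    returned = pot * share_returned
    return TrustGameOutcome(
        first_mover_payoff=endowment - sent + returned,
        trustee_payoff=pot - returned,
    )


# Full trust met with an even split, versus full trust that is betrayed.
even_split = play_trust_game(endowment=10, sent=10, multiplier=3, share_returned=0.5)
betrayed = play_trust_game(endowment=10, sent=10, multiplier=3, share_returned=0.0)
```

In this sketch the first mover comes out ahead of simply keeping the endowment only when the multiplier times the share returned exceeds one, which is why the decision to send anything reflects a belief about what the trustee will do.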

We can imagine two different sources of trust:

  1. Trust predicated on the first mover’s beliefs about the trustee’s intentions or motivations—that is, trust based on beliefs about the trustee’s intrinsic motivation to avoid doing harm to the first mover.

  2. Trust predicated on the first mover’s beliefs about whether the trustee is constrained by extrinsic circumstances that affect its ability to hurt the first mover.

The behavioral implications of the two types of trust are the same insofar as each yields the same prediction about whether the first mover would engage in the transaction with the trustee. Moreover measures of “generalized trust” do not distinguish between these two per se (though in principle you could look to see whether such measures correlate more strongly with things that affect intrinsic motivations versus extrinsic circumstances).
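
A stylized sketch of why the two sources are behaviorally indistinguishable: both feed into the same belief about whether the trustee will return resources, and the first mover acts on that belief alone. The decision rules and numbers below are my own simplifying assumptions, not a standard model.

```python
def trustee_returns(intrinsically_motivated, gain_from_keeping, expected_sanction):
    """A trustee returns resources if they want to (intrinsic source) or if
    keeping everything would not pay once sanctions are counted (extrinsic
    source)."""
    return intrinsically_motivated or expected_sanction >= gain_from_keeping


def first_mover_transfers(prob_trustee_returns, sent, amount_returned):
    """The first mover transfers when the expected amount coming back exceeds
    what is sent. Only the probability matters, not where it comes from."""
    return prob_trustee_returns * amount_returned > sent


# Source 1: the first mover believes most trustees are intrinsically motivated.
trust_from_intrinsic = first_mover_transfers(
    prob_trustee_returns=0.8, sent=10, amount_returned=15
)

# Source 2: the trustee is purely self-interested but faces a sanction larger
# than the gain from keeping everything, so the first mover is sure of a return.
constrained = trustee_returns(
    intrinsically_motivated=False, gain_from_keeping=15, expected_sanction=20
)
trust_from_extrinsic = first_mover_transfers(
    prob_trustee_returns=1.0 if constrained else 0.0, sent=10, amount_returned=15
)
```

Both cases end with a transfer, so observing the transfer alone cannot tell us which source of trust is at work.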

Where the difference matters is in thinking about how levels of trust might change and why levels of trust vary.

In terms of measurement, “lab in the field” methods are typically motivated in terms of isolating the first, intrinsic, source of trust. The argument is that “in the lab,” and under conditions of anonymity, there is no scope for punishing the trustee. That being the case, such lab-in-the-field methods are not always what we want. That is, maybe sometimes we want to measure change in terms of the second, extrinsic, basis for trust. Now, we could modify the lab measure such that behavior is not anonymous. That would allow us to get at some of the extrinsic bases, although not necessarily all. My hunch is that most people would think that giving in the non-anonymous setup would tend to be quite high (I am sure people have examined this, but I don’t have the references at my fingertips). That being the case, it follows that most people must believe that such extrinsic bases for trust are of first-order importance.


Monitoring versus Feedback

Think for a moment about service providers and beneficiaries. The issue is to motivate service providers to do a good job in providing services for beneficiaries. I am thinking about this in the context of development research, and so the focus is often on public service providers. But I think the concerns here could apply to private (whether non-profit or for-profit) actors as well.

Would-be beneficiaries are sometimes called upon to rate the quality of services provided. At least in development research, beneficiary ratings are often interpreted as “monitoring” in the service of holding service providers accountable. Examples of this include Olken’s study on community monitoring of infrastructure spending, scorecard programs, and other “social accountability” arrangements. (Of course there are also examples that are closer to home, like student evaluations for professors.)

When ratings are used for “monitoring,” they are tied to threats that are meant to keep service providers honest. Maybe the threat is for the ratings to be passed on to higher authorities who have some kind of sanctioning power. Maybe the threat is just some kind of more diffuse social sanction.

But I want to propose that there is another way to view beneficiary ratings: as feedback rather than monitoring. To see what I mean, step outside the realm of development and think instead of things like Amazon seller ratings and Yelp reviews. In these cases, the reviews are not tied to any real sanctioning. Rather, the feedback serves different purposes.

First, it may help the service providers to know what they are doing well and what they are doing poorly. This information can in itself help to improve service delivery.

Second, the ratings can function as a tool that service providers use to win new clients. (E.g., restaurants may like Yelp reviews because when they get good reviews, they have a tool for winning the trust of new patrons.) Of course, the importance of this function will depend on the extent to which a service provider benefits from winning the confidence of new people. Not all services would fall into this category, but many may. (Indeed these thoughts came about during a discussion of strategies for extending the reach of basic health services via community health workers, where it was important to win the trust of new potential clients.) An institution that (i) ensures that good deeds are recognized and (ii) through that recognition allows ratings to be used to gain new clients would induce higher quality service provision as well.

This “monitoring” versus “feedback” distinction can have higher order “selection” effects too. You could imagine that the introduction of a punitive monitoring approach may disincline some people from taking up jobs as service providers. By contrast the feedback approach may provide assurance and induce some to take up such jobs. The point is that the manner in which the ratings system is presented and used may affect the types of people who become service providers. (This is a pretty basic adverse selection argument.)


Pre-analysis plans

Good in theory, but problems in implementation currently make them less useful than they should be. Note that the point of a plan is to show voluntary commitment to transparency as a way to distinguish oneself as credible—cf. separating equilibria. Features of plans that improve this “separating” function are preferred by those who want credible science.

I am going to focus on two issues: challenges to checking fidelity and lack of public vetting.

First, it is too cumbersome at the moment to check papers for fidelity to the plan. This is partly because there are often so many hypothesis tests proposed. It is also because plans are formatted poorly, such that we cannot quickly take in what is being proposed. And finally it is because results are presented separately from what is specified in the plan. There are some exceptions to this dim assessment: e.g., Beath et al. did a stellar job in the final report of their NSP study, but even in that case, the sheer volume of tests was quite dizzying. The same goes for Casey et al. in their GoBifo study. In both cases, though, it would have been nice to have formatting that permitted fidelity-checking in the main texts of the published papers.
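
As a rough illustration of what easy fidelity-checking could look like: if plans and papers both labeled each test with a shared identifier, the check would amount to a set comparison. The identifiers and function below are hypothetical; I am not aware of any registry that enforces such a format.

```python
def fidelity_report(registered_tests, reported_tests):
    """Compare test identifiers in a plan against those reported in a paper."""
    registered = set(registered_tests)
    reported = set(reported_tests)
    return {
        "reported_as_planned": sorted(registered & reported),
        "registered_but_not_reported": sorted(registered - reported),
        "reported_but_not_registered": sorted(reported - registered),
    }


# Hypothetical identifiers for illustration.
report = fidelity_report(
    registered_tests=["H1", "H2", "H3"],
    reported_tests=["H1", "H3", "H4"],
)
```

A reader could then see at a glance which registered tests went missing and which reported tests were added after the fact. Checking that a reported test was actually run as specified would still take human reading.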

Second is the lack of public vetting of plans. The standard now is to publicly register. But what is being registered? Mostly specifications and tests that the author thinks are persuasive. But the point isn’t for authors to signal back to themselves. The point is to signal out to the academic community. This function would be enhanced if the academic community weighed in on the plan before it was finalized. The Comparative Political Studies pilot of results-free review was an awesome move toward improvement on this front. So are registered reports (à la the journal Cortex). Let’s do more of this.

Improving on both of these fronts implies higher costs to plans, but of course that is quite the point (cf. separating equilibria).

(I will also continue here to push for plans to be just as much about specifying how to interpret results as about how to generate them. That is, plans are more meaningful when they show a theoretical model and they map the statistical estimates back to parameters in the model. But these are separate issues.)