In fact, the research has a serious flaw, at least as applied to things like field goal shooting in basketball. This is the assumption that "chance" can be represented by a model in which the probability of success in each trial is independent and identically distributed. For example, if a player is a 50% shooter, he has a probability of .5 of making every shot.

The probability of success is clearly not identically distributed: there are easy shots and hard shots. The paper that started the "hot hand" research (Gilovich, Vallone, and Tversky, "The Hot Hand in Basketball," Cognitive Science 1985) acknowledged this point but said that the model of an identical distribution was "indistinguishable" from a more realistic model in which "each player has an ensemble of shots that vary in difficulty . . . and each shot is randomly selected from that ensemble."

*But in fact, random variation in the difficulty of shots will tend to hide any evidence of a "hot hand." For example, one way of trying to detect the hot hand is to look at the occurrence of streaks: are they more common than would be predicted by the model of an identical distribution? Suppose that someone takes two shots and has a .5 chance of making each one. If the chance of making the second is independent of success on the first, there is a .25 chance of two misses and a .25 chance of two hits. Now suppose instead that someone takes two shots, an easy one (.9 probability of success) and a difficult one (.1 probability). Then he has a .09 chance of making both and a .09 chance of missing both. That is, the chance of a "streak" is .5 if the shots don't vary in difficulty and .18 if they do, even though the average chance of success is the same in both cases.

Random variation in the difficulty of shots will also tend to drive any correlation between success in successive shots toward zero. I did a simulation in which a player had three states: "hot" (field goal percentage of about 75%), "normal" (50%), and "cold" (about 25%). There is a 90% chance that a player will be in the same state as he was on the last shot; otherwise, he randomly shifts to a new state with probabilities of 80% normal, 10% hot, and 10% cold. This degree of variation seems large enough to make a practical difference, but it turns out to be hard to detect if you allow for random variation in shot difficulty.

I assumed that shot difficulty followed a normal distribution with mean 0 and standard deviation of 1, and that the probability of making a shot was exp(x+s)/(1+exp(x+s)), where x is the difficulty variable and s is -1, 0, or 1 depending on whether a player is cold, normal, or hot. Then the correlation between success and success in the previous shot is only about .02; the correlation between success and the number of successes in the three previous shots is a little less than .04. It takes a sample of about 3,000 shots to have a 50% chance of getting a statistically significant association between success and the number of successes in the last three (and around 10,000 to have a 50% chance of a statistically significant association between success in two successive shots).
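For anyone who wants to experiment with this kind of setup, here is a rough Python sketch of a simulation along the lines described above. It is not the code I actually ran. The function names are made up for illustration, the player is assumed to start in the normal state, and a "shift" is assumed to draw the new state from the 80/10/10 mix regardless of the current state (so it can land back where it started). Different readings of those details will move the numbers somewhat.

```python
import math
import random

# Shift s added to x in each state (the -1, 0, 1 values from the text).
SHIFT = {"cold": -1.0, "normal": 0.0, "hot": 1.0}
STATES = ["normal", "hot", "cold"]
NEW_STATE_WEIGHTS = [0.8, 0.1, 0.1]  # assumed to apply whatever the current state is
STAY_PROB = 0.9


def simulate_shots(n_shots, seed=0):
    """Return a list of 0/1 outcomes for one simulated player."""
    rng = random.Random(seed)
    state = "normal"  # assumption: every player starts out normal
    outcomes = []
    for _ in range(n_shots):
        # With probability 0.1, redraw the state from the 80/10/10 mix.
        if rng.random() >= STAY_PROB:
            state = rng.choices(STATES, weights=NEW_STATE_WEIGHTS)[0]
        x = rng.gauss(0.0, 1.0)  # shot difficulty variable, N(0, 1)
        z = x + SHIFT[state]
        p = 1.0 / (1.0 + math.exp(-z))  # same as exp(z) / (1 + exp(z))
        outcomes.append(1 if rng.random() < p else 0)
    return outcomes


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / math.sqrt(vx * vy)


def lag_correlations(outcomes):
    """Correlation of each shot with (a) the previous shot and
    (b) the number of makes in the three previous shots."""
    current = outcomes[3:]
    prev1 = outcomes[2:-1]
    prev3 = [outcomes[i - 1] + outcomes[i - 2] + outcomes[i - 3]
             for i in range(3, len(outcomes))]
    return pearson(current, prev1), pearson(current, prev3)


if __name__ == "__main__":
    shots = simulate_shots(200_000, seed=1)
    r1, r3 = lag_correlations(shots)
    print(f"lag-1 correlation: {r1:.3f}")
    print(f"correlation with makes in last three: {r3:.3f}")
```

With a long run, both correlations should come out as small positive numbers, a few hundredths at most, which is the point: even a fairly large swing between hot and cold leaves only a faint trace in shot-to-shot data.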

Of course, I have no particular reason to think that the difficulty of successive shots is normally distributed with a standard deviation of 1, but that's the general problem: results are sensitive to the assumptions we make about an unobserved variable. And this is assuming that the difficulty of successive shots is independent. Suppose there's a slight negative correlation between them (maybe a player who just missed a shot gets more selective). That would make it even harder to detect a hot hand.

The research questioning hot hands made a valid point: that people have a tendency to overinterpret data and think there's a reason for differences that are really just due to chance. But I think the conviction that hot (and cold) hands simply don't exist reflects social scientists' love for reaching counter-intuitive conclusions rather than a justified inference from the data.

*[Note of May 31] On further review, I realize that this paragraph is wrong: the probability of a streak of given length is not affected by random variation in the probability of success. So the strategy of comparing the length of actual streaks to the expected length under the assumption of independent and identically distributed events is a valid one. I stand by my general point about needing a lot of data to identify "hot hands."
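The difference between the two cases can be checked with exact arithmetic. The sketch below is only an illustration, and the function names are mine. It compares a fixed sequence (one easy shot, then one hard shot) with the case where each shot's difficulty is drawn independently at random, here assuming easy and hard are equally likely.

```python
from fractions import Fraction
from itertools import product

EASY = Fraction(9, 10)
HARD = Fraction(1, 10)


def streak_prob_fixed(p1, p2):
    """Chance of two hits or two misses when the two shots have
    known success probabilities p1 and p2."""
    return p1 * p2 + (1 - p1) * (1 - p2)


def streak_prob_random(levels):
    """Chance of two hits or two misses when each shot's success
    probability is drawn independently and uniformly from `levels`."""
    weight = Fraction(1, len(levels))
    total = Fraction(0)
    for p1, p2 in product(levels, repeat=2):
        total += weight * weight * streak_prob_fixed(p1, p2)
    return total


if __name__ == "__main__":
    print(streak_prob_fixed(Fraction(1, 2), Fraction(1, 2)))  # identical shots
    print(streak_prob_fixed(EASY, HARD))                      # fixed easy, then hard
    print(streak_prob_random([EASY, HARD]))                   # difficulty drawn at random
```

The fixed easy-then-hard sequence gives 9/50 (.18), as in the original paragraph, but when difficulty is drawn at random for each shot the result is 1/2, the same as for two identical 50% shots. The .18 figure comes from fixing the order of difficulties, not from random variation in difficulty.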

The stats of the Monte Carlo Martingales are available for analysis.
