Parts 4 and 5 were better. In part 5 he actually alludes to the biggest problem with complex polls that attempt to isolate ideological trends: it’s simply impossible to ask enough people enough unbiased questions to get at the big picture with any reasonable degree of precision.

Here’s where he might have offered his readers a little more information about statistical methods. The mathematical tools pollsters use to gauge the reliability of their estimates were originally devised in the context of physical experimentation. Those tools rely on a simple model. We have a set of measurements — all of the same phenomenon — that are subject to random error. How can we estimate the size of the inherent errors from these data, and how close is the average of all our measurements to the “true” result?
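The measurement model just described is easy to sketch in code. A minimal illustration (the quantity, noise level, and variable names are invented for the example): repeated measurements of one phenomenon with random error, where the sample standard deviation estimates the size of the inherent errors and the standard error of the mean estimates how close the average is to the “true” result.

```python
import math
import random

rng = random.Random(0)

# Pretend the "true" value of the phenomenon is known, and take 100
# measurements, each corrupted by independent Gaussian noise.
true_value = 9.81
readings = [true_value + rng.gauss(0, 0.05) for _ in range(100)]

mean = sum(readings) / len(readings)

# Sample standard deviation: an estimate of the size of the inherent errors.
variance = sum((x - mean) ** 2 for x in readings) / (len(readings) - 1)
std_dev = math.sqrt(variance)

# Standard error of the mean: how close the average is likely to be
# to the true value. It shrinks as 1/sqrt(n), so more measurements
# pin down the average more tightly.
std_err = std_dev / math.sqrt(len(readings))
```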

The underlying mathematical assumption on which classical statistical methods rely is that the “errors” are independently and identically distributed. This is an adequate model for political polls that ask a simple yes/no question (Will you vote for candidate A, or B?). People may change their minds between the time the poll is taken and election day, but in the absence of some major event that changes everybody’s perceptions, the “A” voter is just about as likely to change his mind as the “B” voter is. So simple “A” or “B” polls are usually quite accurate.
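The i.i.d. model for a simple A-or-B poll can be checked with a quick simulation. This is a sketch with made-up numbers (a true “A” share of 52% and 1,000 respondents are my assumptions, and the function name is mine): each respondent is an independent draw, and across many repeated polls the sample share stays within a few points of the truth, which is why simple two-way polls are usually quite accurate.

```python
import random

def poll_error_95th(true_share, n, trials=2000, seed=1):
    """Run many simulated yes/no polls of n i.i.d. respondents and
    return the 95th percentile of |sample share - true share|."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        votes_for_a = sum(rng.random() < true_share for _ in range(n))
        errors.append(abs(votes_for_a / n - true_share))
    errors.sort()
    return errors[int(0.95 * trials)]

# With 1,000 respondents, 95% of simulated polls land within
# roughly three points of the true share.
spread = poll_error_95th(true_share=0.52, n=1000)
```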

The simple yes/no model can be jazzed up a little bit to cover more complicated questions (Do you favor or oppose, or are you neutral?). But the underlying assumption of classical statistical methods — that people are random members of a population that is “independently and identically distributed” — is simply not accurate when the whole complex of political and economic ideas comes into play. People can take independent actions, but our ideas, and our values, and our choices, are heavily influenced by the views of our neighbors. And we’re certainly not identical — not in terms of our wants and needs and abilities, at least.

Anyway, I like Mr. Knee’s general inclination toward fiscal conservatism, and I hope he uses his skills to focus public attention on the kind of questions politicians really ought to ask: what can I do to make everybody more successful in their pursuit of happiness?

—

Back in college, I managed a B in Intro Psych Statistics without attending a single lecture, but — as David can attest — these days, too much math makes my brain ache. So I’ll defer to you guys on the technical arguments. It’s Cal Tech vs. Yale, and way out of my league! 🙂

—

1. The margin of error CAN be affected by population size in certain cases. This almost never happens in political polling, but I said “primarily” because it is possible.

2. Margin of error technically describes the bell curve rather than being the bell curve, but I did not want to confuse general audiences with extra layers of complexity about distributions and confidence intervals; I wanted to bring home the point that possible outcomes are not distributed evenly. I still explain the definition of margin of error exactly as you described. Still, I might go back and clarify that sentence, so thanks for the catch.

—

“The margin of error is a bell curve representing possible outcomes …”

This statement is simply false. There is a theoretical “bell curve” (Gaussian) that closely approximates how the results of repeated random samples are distributed around the true value for the population taken as a whole. The commonly cited “margin of error” is a number (not a curve!) that represents how widely similar samples are likely to vary from the given sample, with 95% probability. In other words, if the poll were conducted many times with different random samples of the same size, 95% of those polls would be expected to find the same result as the original poll, within a precision given by the “margin of error”. Or, to put it another way, there is only a 5% chance that the “true” result for the entire population differs from the sample measurement by more than the “margin of error”.
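That definition translates directly into the textbook formula for the margin of error of a sample proportion. A minimal sketch using the standard normal approximation (the function name and defaults are mine; z = 1.96 corresponds to 95% confidence, and p = 0.5 gives the worst case pollsters usually quote):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the 95% confidence interval for a sample
    proportion p measured from n respondents (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# A 1,000-person poll has a margin of error of about +/-3.1 points;
# a 500-person sub-sample, about +/-4.4 points.
moe_full = margin_of_error(1000)
moe_half = margin_of_error(500)
```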

“The stated margin of error for a poll is the margin of error for the entire poll, and is primarily driven by sample size.”

In fact, the stated “margin of error” is driven “entirely” by the size of the sample, not “primarily”. OK, that’s a nit, and Mr. Knee does make some valid observations about sub-populations within a poll. But he fails to give a simple formula for the “margin of error” which is very useful for careful thinkers: the margin of error is, roughly, the square root of the number of people who answered a yes/no question, expressed as a percentage of that number. For instance, if the poll consults 1,000 people, the margin of error is roughly 3.2% (because the square root of 1,000 is 32, very nearly, and 32 is 3.2% of 1,000). If the same poll consults 500 Democrats and 500 Republicans, the “margin of error” for the two parties taken as sub-populations is roughly 4.4% (again, because the square root of 500 is 22, more or less, and 22 is 4.4% of 500).
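The square-root shortcut can be checked against the exact worst-case formula. A small sketch (function names are mine): the square root of n, taken as a fraction of n, is algebraically just 1/sqrt(n), and it tracks the standard 95% margin of error closely.

```python
import math

def rule_of_thumb(n):
    """sqrt(n) as a fraction of n -- algebraically equal to 1/sqrt(n)."""
    return math.sqrt(n) / n

def exact_moe(n, p=0.5, z=1.96):
    """Worst-case 95% margin of error from the normal approximation."""
    return z * math.sqrt(p * (1 - p) / n)

# n = 1,000: shortcut ~3.16%, exact ~3.10%.
# n =   500: shortcut ~4.47%, exact ~4.38%.
for n in (1000, 500):
    print(n, round(rule_of_thumb(n), 4), round(exact_moe(n), 4))
```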

—

Maybe I’m just too picky about math. I’m still disappointed by Part 2.
