Monday, May 11, 2009

Public opinion - good vs bad polls

Opinion polling, lauded by pundits and politicians alike (especially when they are leading in a given survey), is nonetheless an inexact science. Some polls are certainly better than others, though, in terms of accuracy and thus predictive value.

The key to telling a good poll from a "not so good" one is to read the small print explaining the methodology at the bottom of the results. The most important things to consider are the sample size and who (in terms of respondents) is actually surveyed.

First, any sample needs to be weighted to ensure that it is demographically representative of the overall population. If it ain't representative, it's no good - sort of like polling Cambridge or Amherst on the presidential race and concluding that we are indeed a two-party country: Democrats and Greens! (The same holds true for polling in places like the Texas panhandle, rural Pennsylvania, and Lynnfield MA.) All demographic groups need an equal chance to be selected; otherwise the results will be biased.
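
To make the weighting idea concrete, here is a minimal sketch in Python of the basic approach (often called post-stratification). The age groups and shares below are made-up numbers for illustration, not from any real survey:

    # Weight each respondent so the sample's group shares match the population's.
    # All shares below are invented for illustration.
    population_share = {"18-29": 0.21, "30-49": 0.34, "50-64": 0.26, "65+": 0.19}
    sample_share     = {"18-29": 0.10, "30-49": 0.30, "50-64": 0.32, "65+": 0.28}

    # A respondent's weight is their group's population share divided by
    # that group's share of the raw sample.
    weights = {g: population_share[g] / sample_share[g] for g in population_share}

    for group, w in weights.items():
        print(f"{group}: weight {w:.2f}")  # under-sampled groups get weights > 1

In this made-up sample, each under-sampled 18-29 respondent ends up counting about twice (weight 2.10), while each over-sampled 65+ respondent counts about two-thirds (weight 0.68).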

For any nationwide poll, the sample size should be at least 1,100 respondents. With anything much smaller, the margin of error (the statistical wiggle room that measures how far the sample's responses might stray from what the response would be if all the millions of potential voters were polled) grows to the point where the poll's value is lessened. A sample of 1,100 respondents works out to about a 3% margin of error, which is close enough to have decent predictive value.

Larger samples do cost more money, but yield much more value. Personally, my eyes really open when I see sample sizes of 2,000 or greater (which gets the margin of error down to about 2%).
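
For the curious, the margins of error quoted above come from a standard textbook formula, assuming a simple random sample with the worst-case 50/50 split (real polls adjust for weighting, which pushes the error up a bit):

    import math

    def margin_of_error(n, p=0.5, z=1.96):
        # Half-width of a 95% confidence interval for a proportion;
        # p = 0.5 is the worst case and the convention pollsters report.
        return z * math.sqrt(p * (1 - p) / n)

    for n in (800, 1100, 2000):
        print(f"n = {n}: {margin_of_error(n):.1%}")
    # n = 800:  3.5%
    # n = 1100: 3.0%
    # n = 2000: 2.2%

Notice that nearly doubling the sample from 1,100 to 2,000 only shaves the error from about 3% to about 2% - the error shrinks with the square root of the sample size, so each extra point of precision costs progressively more interviews.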

A quick example of how the margin of error (we'll use 3%) is applied, based on the following results:

Moe: poll result is 45% (range w/ error factored in is 42% to 48%)
Larry: poll result is 40% (range is 37% to 43%)
Curly: poll result is 15% (range is 12% to 18%)

The range comes from adding and subtracting the error (3% in this case) from the actual poll result. Statistically, there is a 95% chance that the actual election result, if we polled everyone, would fall within those percentage ranges.

Bonus question - who is ahead in this example? If you answered Moe, better luck next time. Statistically speaking, Moe and Larry are tied as Moe's low end of 42% overlaps Larry's upper end of 43%. Curly is still bonked in the head as usual... Yes, the Prof. is a huge Stooges fan and I use this example in class.
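
For those who want to check the Stooges' math, here is the same arithmetic in a few lines of Python, including the overlap test that makes Moe and Larry a statistical tie:

    margin = 0.03
    results = {"Moe": 0.45, "Larry": 0.40, "Curly": 0.15}

    # Each range is the poll result plus and minus the margin of error.
    ranges = {name: (p - margin, p + margin) for name, p in results.items()}
    for name, (lo, hi) in ranges.items():
        print(f"{name}: {lo:.0%} to {hi:.0%}")

    # Two candidates are statistically tied if their ranges overlap.
    tied = ranges["Moe"][0] <= ranges["Larry"][1]
    print("Moe and Larry tied:", tied)  # True, since 42% <= 43%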

The most important factor, however, is WHO exactly is being polled. Respondents usually fall into one of the following four categories.

Adults - this is the simplest poll: as long as the respondent is an adult aged 18 or over, they qualify as a respondent. The bad news is that it is the least accurate way to poll. Remember that about 40% of American adults did not vote in the 2008 elections (a high-turnout election), and that number is higher for state and local elections. Beware of polls of just adults, as a large proportion of the respondents will end up not voting and therefore really don't matter unless and until they become voters.

Some media outfits publish (cheaply run) polls that have nationwide samples of 800 or 900 adults as respondents and tout these surveys as breaking news (usually in papers distributed nationally with a large blue logo, lots of color photos, and a great weather page on the back of the first section). This is a nice way to fill column inches, but it is not terribly meaningful in terms of predictability.

Registered Voters - ahh, getting better! The respondent is asked if they are a registered voter, which tells the pollster whether they are on the voter rolls (we will assume people are not lying to the pollster, although this can and does happen). This screens out the roughly 25% of adults who are not registered to vote. Better than polling just adults, but not as good as screening for...

Likely Voters - this is the best method, since potential respondents are asked follow-up questions about how likely they are to trudge to the polls and vote on election day. If an individual indicates they are registered to vote but has not voted since 1968, they would probably not be considered a likely voter. If pollsters get someone like me, who counts down the days to the next election (and will trudge through four feet of snow to participate on election day), it is a pretty good bet that person will show up and cast a vote. Identifying a likely voter can be hit or miss, but for pre-election polls these surveys are the most meaningful and therefore deserve the most attention.
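
Pollsters' actual likely-voter screens vary and are often proprietary, but the flavor of the idea can be sketched with a hypothetical scoring function like this one (the questions and the cutoff are my own invention for illustration):

    # A hypothetical likely-voter screen - the questions and the
    # score cutoff are invented for illustration only.
    def is_likely_voter(r):
        # One point per signal of voting intent; cutoff is arbitrary here.
        score = sum([
            bool(r.get("registered")),
            bool(r.get("voted_last_election")),
            r.get("interest") == "high",
            bool(r.get("knows_polling_place")),
        ])
        return score >= 3

    lapsed  = {"registered": True, "voted_last_election": False,
               "interest": "low", "knows_polling_place": False}
    diehard = {"registered": True, "voted_last_election": True,
               "interest": "high", "knows_polling_place": True}

    print(is_likely_voter(lapsed))   # False - on the rolls, unlikely to show
    print(is_likely_voter(diehard))  # True - counts down the days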

Exit Polls - the ultimate poll! This does not ask just likely voters - it asks actual voters as they are leaving a polling place on election day. They can't change their minds the way a voter can in a pre-election survey, and assuming they are honest about how they voted, this is the best measure of how a race will turn out. On election night, races are often called by the media based on exit polls, even before many actual results have trickled in. In 2006, Governor Patrick was declared the winner at about 8:02 PM based on his 20-point lead in the exit polling.

Remember - no poll is perfect! There are many examples of pollsters blowing it and being dead wrong. The national exit polls were off in 2000 and 2004. In fact, the 2004 exit polls overstated Kerry's vote to the point where he appeared competitive in several states (South Carolina and Virginia come to mind) that he ended up losing by 10 points. The polls were pretty close in 2008, as better sampling techniques were implemented.

One more thing - the timing of a survey. In my previous posting regarding the Boston mayoral race, the poll was run six months prior to the election, when most people (Lovoi excepted, of course!) are not paying close attention. As the big day draws near, polls become more predictive.

Now that we know all about polls and how to read them... What? You want me to go over this again?? Ok, polling is...
