WHY BOTHER WITH DATA WEIGHTING?

Why does EUROSTUDENT put so much emphasis on weighting the national survey data? The US presidential election just provided an answer.

Martin Unger, IHS-Vienna, EUROSTUDENT Consortium

The University of Southern California is polling the US presidential election on a daily basis using a panel of voters (the results are published in the Los Angeles Times). Unlike most (all?) other polls, this one has shown Trump in the lead in recent weeks. Why? Largely because one 19-year-old black man who said he would vote for Trump received an enormously high weight: 30 or even 300 times as high as that of voters from groups that were overrepresented in the sample. When this respondent did not participate in the latest round of the USC poll, Clinton took over the lead in its predictions. It is absolutely worth reading the full story in the New York Times to understand how this single respondent came to receive such a high weight (the poll uses a far more complex weighting scheme than we usually do in our student surveys).

Weighting survey data is important to correct a potential design bias (e.g. when using quotas), but it is even more important to correct a return bias, which is very likely to occur in any self-reporting survey. EUROSTUDENT regards the age of the students as critical for weighting the raw data (in most countries at least). The type of higher education institution (university versus non-university), and possibly also the institution itself and/or the field of study, may be added. And of course the sex of the respondents plays a role. These variables are chosen based on past experience, which shows that students with different social and economic backgrounds choose different studies at different institutions. Males and females differ in this regard, too. Younger students live in very different circumstances than their older colleagues, even if the age difference is only a few years.
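To make the idea concrete, here is a minimal sketch of simple cell weighting, assuming the population counts for each cell are known from official statistics. The function name, the cells (age group by sex) and all numbers are hypothetical and chosen only for illustration; a real national scheme will typically use more variables.

```python
from collections import Counter

def cell_weights(sample_cells, population_counts):
    """Return one weight per cell: population share divided by sample share."""
    sample_counts = Counter(sample_cells)
    n_sample = len(sample_cells)
    n_population = sum(population_counts.values())
    weights = {}
    for cell, n_in_sample in sample_counts.items():
        population_share = population_counts[cell] / n_population
        sample_share = n_in_sample / n_sample
        weights[cell] = population_share / sample_share
    return weights

# Hypothetical population counts per cell (age group, sex)
population = {
    ("<25", "f"): 400, ("<25", "m"): 400,
    ("25+", "f"): 100, ("25+", "m"): 100,
}

# Hypothetical sample of 100 respondents: young women are overrepresented,
# young men are underrepresented
sample = (
    [("<25", "f")] * 50 + [("<25", "m")] * 20
    + [("25+", "f")] * 20 + [("25+", "m")] * 10
)

weights = cell_weights(sample, population)
for cell, weight in sorted(weights.items()):
    print(cell, round(weight, 2))
```

In this made-up sample, overrepresented groups receive weights below 1 and underrepresented groups receive weights above 1, so that the weighted sample matches the population distribution across the cells.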

However, each cell of the nested table used for weighting should contain at least 30 respondents, and we encourage national teams either to trim (i.e. cut) weights above 5 or to aggregate cells to obtain larger groups. And, of course, no single respondent should receive a weight of 30 or, even worse, a weight 300 times as high as that of other respondents. The EUROSTUDENT data team takes great care to find an accurate weighting scheme for each country, offering individual support if needed.
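The two safeguards mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names, cell labels, counts and weights are hypothetical, and the sketch simply caps weights at 5 without rescaling the remaining weights afterwards, which a real scheme may also want to do.

```python
def small_cells(cell_counts, min_size=30):
    """List the cells with fewer respondents than min_size (candidates for aggregation)."""
    return sorted(cell for cell, n in cell_counts.items() if n < min_size)

def trim_weights(weights, cap=5.0):
    """Cut every weight above the cap down to the cap."""
    return [min(w, cap) for w in weights]

# Hypothetical respondent counts per weighting cell
counts = {
    "university, <25": 120,
    "university, 25+": 45,
    "non-university, <25": 60,
    "non-university, 25+": 12,
}
too_small = small_cells(counts)
print(too_small)

# Hypothetical raw weights, including one extreme value
raw = [0.8, 1.2, 4.9, 7.5, 30.0]
trimmed = trim_weights(raw)
print(trimmed)
```

Here the cell with only 12 respondents would be flagged for merging with a neighbouring cell, and the two weights above 5 are cut back to 5, so that no single respondent can dominate the results.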

More information about The USC Dornsife / LA Times Presidential Election “Daybreak” Poll can be found here.

More methodological information on how the main polls in the US (and, in principle, anywhere in the world) are conducted can be found in another article by the NYT: “We Gave Four Good Pollsters the Same Raw Data. They Had Four Different Results.”
