Can Online Surveys Represent Purchasing by Offline Consumers?
An Online vs. Paper-and-Pencil Comparison
By Mark Kinnucan, Ph.D.,
Director of Research Sciences,
The NPD Group, Inc.
When NPD moved its tracking studies online in 2001, many individuals, including clients and NPD staff, had questions. Could an online panel adequately account for purchases made by those who do not use the Internet and who, consequently, would never be part of the pool of potential respondents for an online survey? Although NPD’s online panel has now been in operation for five years, these questions are still raised from time to time. Now, as this article demonstrates, we have solid evidence that an online panel can indeed represent all consumers.
Questions about the possible bias of conducting tracking studies online have been raised most forcefully with respect to Wal-Mart shopping. Both in terms of income and in terms of consumers’ rural vs. urban locale, a higher level of shopping at Wal-Mart and a lower level of Internet use would seem to go hand in hand. Does this mean NPD’s online trackers under-represent purchases at Wal-Mart? In NPD’s Research Sciences group, we would argue “No,” because NPD’s survey responses are balanced to the entire U.S. population on several demographic characteristics, including both income and residential location (rural vs. urban).
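To make the idea of balancing concrete, here is a minimal sketch of one simple form of it, post-stratification on a single characteristic. It is an illustration only: the article does not describe NPD's actual weighting procedure, and the function names, the respondents, and the population shares below are all hypothetical.

```python
from collections import Counter

def poststratification_weights(respondent_cells, population_shares):
    """Return one weight per respondent so that the weighted share of each
    demographic cell equals its (assumed) share of the population."""
    counts = Counter(respondent_cells)
    n = len(respondent_cells)
    # weight = population share of the cell / sample share of the cell
    return [population_shares[cell] / (counts[cell] / n) for cell in respondent_cells]

def weighted_rate(flags, weights):
    """Weighted proportion of respondents whose flag is True."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)

# Hypothetical sample: rural respondents are under-represented (1 of 4).
sample = [("rural", True), ("urban", False), ("urban", True), ("urban", False)]
# Hypothetical population shares, invented for this example.
population_shares = {"rural": 0.5, "urban": 0.5}

cells = [cell for cell, _ in sample]
shops = [flag for _, flag in sample]

weights = poststratification_weights(cells, population_shares)
unweighted = sum(shops) / len(shops)
weighted = weighted_rate(shops, weights)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this made-up sample the single rural respondent is weighted up and the three urban respondents are weighted down, so the weighted rate reflects the assumed population mix rather than the sample mix.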
Still, it is possible to question whether our demographic rebalancing adequately captures all the differences between those who are online and those who are not. Perhaps there are residual differences, even after re-weighting the data, that cause purchases among those who do not use the Internet -- especially their Wal-Mart purchases -- to be under-represented. The only way to find out for sure would be to conduct a study that directly compares purchase reporting by those who are online and those who are not.
That is exactly what we did in 2005, in our Shopper Segmentation Study. This study also attempted to answer the further question, “Well, if there are residual differences between online and offline individuals that go beyond demographics, how might we characterize those differences?” We wanted to see if there were attitudinal differences between online and offline individuals that could be identified and then used to further rebalance our online data.
Study Design
The Shopper Segmentation Study, conducted in March 2005, fielded the same survey in parallel in two modes: online and paper-and-pencil. The questionnaire content was identical in both modes, as was the time period during which the study was in the field. The only differences between the two data collection efforts were the mode itself and the sample to which the study was fielded. The survey asked about frequency of shopping at different kinds of retailers, items bought at Wal-Mart, and Internet use and online shopping. It also included an attitudinal battery. Sample sizes were 3,275 for the online version and 3,953 for the paper-and-pencil version.
Three-quarters of the consumers who completed paper-and-pencil surveys indicated they were Internet users, while the other quarter said they never use the Internet. Comparing the responses of these two subgroups of the “paper-and-pencil” consumer group holds the key to answering the study's main question.
Results of the Study
We focused on the frequency of purchasing apparel at Wal-Mart. First, we compared the Internet-user subgroup of the paper-and-pencil segment to the online data collection group. Both groups were weighted to the entire U.S. population.
The results can be seen in Chart 1. Those surveyed by mail were slightly more likely to shop for apparel at Wal-Mart, but the difference was not substantial. This told us that the form of the questionnaire itself has minimal impact on reported levels of shopping at Wal-Mart.

Next, we directed our attention to the paper-and-pencil group. We added back in those who said they did not use the Internet, and reweighted the entire group to the U.S. population. Then we asked, “Do those without Internet access exhibit a higher propensity to shop for apparel at Wal-Mart than those who have Internet access?” The answer, shown in Chart 2, is yes.

No one is likely to be surprised that those without Internet access shop more at Wal-Mart than those with Internet access. To begin to get an idea of the role demographic variables play in this correlation, we charted both Internet access and Wal-Mart shopping for several demographic variables. Charts 3, 4, and 5 show the results for income, size of metropolitan area (including rural), and age. (Recall that these charts show just the paper-and-pencil group, split between those with and without Internet access.)



Clearly, household income does play a major role in the association between Internet use and likelihood of shopping at Wal-Mart. Likewise, those living in rural areas are both more likely to shop at Wal-Mart and less likely to use the Internet than those who live in major metropolitan areas. Interestingly, however, while seniors are much less likely to use the Internet than younger consumers, Wal-Mart shopping is fairly constant across the age groups.
We are now in a position to answer the central question of the study: do these and other demographic variables explain all of the difference in Wal-Mart shopping between those with and without Internet access? In other words, can our online survey adequately represent the purchases of those lacking Internet access? Chart 6 provides the answer. For this chart, we compared two estimates of Wal-Mart shopping drawn from the paper-and-pencil group: one from the Internet-access subgroup alone, reweighted by itself to represent the entire U.S. population, and one from both subgroups weighted together as a single group.

And the answer is, yes, it can. The reweighted Internet access subgroup matches the properly blended mix of those with and without access exactly for the frequency question and the underwear and socks question, and shows only a slight difference on the “other clothing” question. There are slight differences between an online sample and an offline sample, but none is large enough to cause material differences in the final market estimates provided to clients. In other words, it is possible to use an online sample, properly weighted to the entire population, to represent all U.S. consumers.
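The logic of the Chart 6 comparison can be sketched in a few lines. This is a hedged illustration, not the study's procedure or data: the respondents, the income cells, and the population shares are invented, and the example balances on a single characteristic where the study balanced on several. The made-up numbers are deliberately chosen to leave a gap, so that the residual is visible; the study itself found the two estimates to be nearly identical.

```python
from collections import defaultdict

def stratified_rate(respondents, population_shares):
    """Population-weighted 'yes' rate from (cell, answered_yes) pairs.
    Assumes every cell in population_shares has at least one respondent."""
    yes = defaultdict(int)
    total = defaultdict(int)
    for cell, answered_yes in respondents:
        total[cell] += 1
        yes[cell] += int(answered_yes)
    return sum(share * yes[cell] / total[cell]
               for cell, share in population_shares.items())

# Hypothetical paper-and-pencil respondents:
# (income cell, has Internet access, shops for apparel at the retailer)
paper_group = [
    ("low", True, True), ("low", True, False),
    ("low", False, True), ("low", False, True),
    ("high", True, False), ("high", True, False),
    ("high", True, True), ("high", False, False),
]
# Hypothetical population shares, invented for this example.
population_shares = {"low": 0.5, "high": 0.5}

# Estimate 1: the Internet-access subgroup alone, reweighted to the population.
internet_only = stratified_rate(
    [(cell, shops) for cell, online, shops in paper_group if online],
    population_shares)

# Estimate 2: both subgroups weighted together as a single group.
everyone = stratified_rate(
    [(cell, shops) for cell, _, shops in paper_group],
    population_shares)

# Whatever gap remains is the part that demographic weighting did not explain.
residual = everyone - internet_only
print(f"internet only: {internet_only:.3f}, everyone: {everyone:.3f}, "
      f"residual: {residual:.3f}")
```

A residual near zero would correspond to the result reported in Chart 6: the reweighted Internet-access subgroup stands in for the full population.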