Alert! Magazine February 2013 - Smarter Data: Challenges and Opportunities
Newsweek recently published its last print issue. When you think of the marketing research industry today, think of the legacy magazine and newspaper industry in 1999.
As the Internet surged in influence, regular print readers began to question the 1980s’ titans of a multi-billion-dollar print industry in which Newsweek was the belle of the ball. Very quickly, a gobsmack hit news editors everywhere in the world. Esteemed reporters, columnists and editors lost their jobs and influence; bureau chiefs were axed; revenues sank fast. Only the few dailies that embraced the Web and the army of bloggers (the Huffington Post) or that offered niche content (the Wall Street Journal) have flourished.
The death of the newspaper eerily foreshadows the fundamental changes now bearing down on the legacy Marketing Research Industry (MRI). Customers will only pay for information that serves up real value.
Today we see a crisis of confidence within the MRI. Serious questions have been raised about the quality of standard operating practices used by some of the leaders in global data capture, with a particularly acute concern about the reliability of online panels. This is of note as the industry moves increasingly online and mobile, both for cost efficiency and to ensure researchers intercept people where they are most active.
Does this represent the doom and gloom that hit the print newspaper industry? Or are we faced with a golden opportunity to re-establish the MRI, or, more importantly, marketing research professionals, as critical and strategic players in the new world of mass data generation, interpretation and application?
Over the past 14 years, I have been leading organizations that have focused on developing innovative consumer research methodologies and technologies. This has included conducting some of the earliest online marketing research studies (back in 1999, essentially the stone age of online data collection), as well as building some of the first proprietary online consumer communities. Over this period I have been witness to the rapid deterioration in the quality of marketing research data being captured.
The RIWI Corporation, of which I am President, recently published a paper in conjunction with CLSA Asia Pacific Markets called “Smarter Data: Eliciting Insights from the Cloud”, which addresses in detail some of the urgent and critical issues facing the MRI and data capture methodologies. Some key findings are excerpted and expanded upon in this article.
The RIWI Corporation, which stands for “Real-Time Interactive World Wide Intelligence”, has developed a patented technology that gathers highly randomized data using a unique respondent intercept methodology and geo-targeted three to seven question nano-surveys. This enables clients to derive real-time insights into the thoughts, wants and purchasing intent of the everyday consumer in every city around the world with Internet access. People who do not usually participate in online panels answer these surveys; respondents are not incented.
Regular random people stumble upon the nano-surveys as they navigate the Web (no pop-ups are used), and the methodology was first applied for government-commissioned academic research in pandemic surveillance and drug safety surveillance. The approach, detailed in the “Smarter Data” report noted above, has proven especially effective in capturing data in a privacy-compliant manner in emerging markets and hard-to-penetrate regions, and, to date, has captured over 10 million responses from consumers in every country in the world for clients such as the World Bank, The International Association of Prosecutors, KPMG, and major global corporations.
Too Much Data, Too Few Insights
With the proliferation of online commerce and social media, the volume and variety of data available for analysis have surged, offering potential breakthroughs from the breadth of sources. In practice, there are significant problems – more data does not always mean better data. The analysis output is only as good as the underlying data set, and for marketing researchers, low-quality information results in “garbage in, garbage out.” Systematic bias also creeps into the analysis: tempted by so-called “Big Data”, pricey, quant-sophisticated consultants can hoodwink naïve business managers into thinking that minor correlations inside a haystack of correlations are of value – even when these sham correlations are built on a stack of “likes” and cookie-based gobbledygook about people’s online surfing habits.
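The “haystack of correlations” hazard is easy to demonstrate. The simulation below is hypothetical and uses only invented random numbers: given enough unrelated variables, some pair will correlate strongly by pure chance, which is exactly the kind of “insight” a credulous manager can be sold.

```python
import random

random.seed(42)

def corr(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# 200 completely unrelated "metrics" (e.g. daily "likes" on unrelated
# pages), 20 observations each -- pure noise by construction.
metrics = [[random.gauss(0, 1) for _ in range(20)] for _ in range(200)]

# Scan all ~20,000 pairs for the strongest correlation.
best = max(
    abs(corr(metrics[i], metrics[j]))
    for i in range(200) for j in range(i + 1, 200)
)
print(f"strongest pairwise |correlation| among pure noise: {best:.2f}")
```

With a couple of hundred noise series, the strongest pairwise correlation routinely lands well above 0.5 despite there being no relationship at all to find.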
Decreasing Respondent Participation Rates
With increased availability of ultra-broadband and fast network speeds, people have easier access to online resources and spend more and more of their time online. This, in conjunction with increased mobile phone use and declining landline use, has translated into a steady decline in contact, cooperation and response rates for phone-based surveys; in the USA, response rates have fallen steadily since 1997. This is in part due to the fact that, as of the end of 2Q12, smartphone use (generally inaccessible to standard-length surveys) had surpassed the one billion mark, specifically 1.038 billion. (Gartner predicts 1.2 billion smartphones and tablets will be purchased in 2013.) In 2012, only 9 percent of US households that had been contacted for a survey had ever responded to a phone poll, with “responded” meaning simply answering the phone and beginning the interview process; it did not mean that the respondent actually completed the survey. Meanwhile, some publicly reported industry response rates for Internet intercept surveys are less than 1 percent.
Commercial panels are based on repeat, incented, primed respondents, and the challenges of securing an even modestly representative group of the Internet population have long plagued the online panel community. “River sampling” is one technique – surprisingly, now in common usage, perhaps because of scant media attention to its extreme limitations – and it has proven ineffective. River sampling is already proving to be less of a river and more of a recycling system. Essentially, there are survey takers and non-survey takers; and while any effort to increase efficiency and reduce waste in our lives deserves encouragement, respondent recycling should not be one of them.
False Promise of Data Mining From Social Media and the Open-Source Web
As suggested earlier, some analysts have chosen to incorporate data harvesting from social-media sites, e.g. micro-blogs such as Twitter or social networks such as Facebook. The business case for Big Data is as follows: significant amounts of “social chatter” online have provided vast opportunities for firms to gain insight into people’s minds. Hence, by clever use of information extraction methods applied to Big Data, organizations can deliver answers to essential management questions such as, “Where is the market heading for product x?” The problem is that, by definition, these processes require extensive data management, the use of automated solutions, deep-dive analysis and external consulting from a new profession of “data scientists”. More importantly, social chatter contains excessive data noise that makes it ineffective in capturing true user sentiment.
With such tools, the consumer sentiment expressed is not representative of the Internet population; it is not randomized. It would be like evaluating text on Twitter for the words “housing” and “bubble” and then trying to divine from this word combination – using some type of linguistic algorithm that ensured the adjacency of these two words – whether a rising incidence of Twitter users feel that housing prices are a “bubble” ready to pop. Social media sentiment is not randomized and thus limits the widespread applicability of sentiment analysis.
The Long Goodbye
Long surveys are essentially a form of respondent abuse, except in certain circumstances where the respondent has a vested interest in the data generated beyond simple compensation, e.g. employee satisfaction surveys. By this time next year, it will be very difficult to find a credible marketing researcher who will be able to defend the 5+ minute survey, never mind the industry standard 20+ minute survey, as an ideal, effective and credible general population data collection mechanism.
Good, Fast, Cheap: Pick 2
Clients are demanding, and getting, data that are faster, cheaper and bigger, but not necessarily better, and this trend will continue to accelerate in an industry-wide race to the bottom. The “industry” itself will morph; companies that collect and sell data to other companies for insights will sound more like IBM than Gallup.
Except where high-value groups are being engaged (doctors, for example), the cost of capturing general-market, non-representative data (“noise”) from social media or panels will effectively fall to zero within 12 to 18 months. Anything less than instant, live data will be considered too slow to provide competitive advantage.
All this being said, the value of highly specialized, predictive, real-time market data will continue to increase as marketers, governments, hedge fund managers, global risk assessment companies and others seek to respond to, and anticipate, rapidly changing market opportunities and threats in real time.
The good news for the MRI is that investment in data capture will grow substantially over the next few years, and applied market intelligence has never been more critical to competitive advantage than it is today. Following the unbridled exuberance of the 2012 Big Data all-you-can-eat buffet, we are already seeing signs of collective data indigestion that should define 2013 as the year of “Smarter Data” or something similar.
MRI professionals are uniquely positioned, with many of the necessary skill sets, to bring the thinking and analysis needed in the new world of mass data generation, interpretation and application. Provided we can embrace the world of rapidly changing live data and wean ourselves, as well as our clients, away from long surveys, panel-based respondent pools, and low-quality “good enough” data, we can avoid the fate of becoming commoditized data collection managers and once again provide the deep analytical and insight generation skills the industry was founded upon.