The Internet has,
in a few short years, redefined our ability to reach diverse segments of
people where they live and work, whether in batches or in real time. Thanks
to consumer-level proliferation of broadband connectivity, Webcams, smartphones
and the like, we can conduct complex interviews with dispersed and even rare
samples of respondents using audio and visual communications of considerable
fidelity. There is no question that this is a great step forward for our
discipline.
But, in the midst of this rush to capitalize on the
efficiencies of digital research, I want to cast some words of serious caution.
My intent is not to denigrate the Internet as a research tool but to remind the
reader that it does not erase the need to gather face-to-face data in pursuit
of understanding human beings. My thesis is that our work isn’t fully done
until we sit across the table from those we wish to understand – physically in
their presence as we engage in discourse about their needs and interests.
Not the first time
This is not the
first time, incidentally, that our industry has encountered such issues. In the
first half of the 20th century, opinion polling (the forerunner of marketing
research) was conducted largely by canvassing sampled neighborhoods on foot,
with rigid rules about which households you should stop at and with whom you
were to speak when you got there. But the efficiencies of mail and telephone
surveys were too seductive for the industry to continue relying solely on a face-to-face
approach.
Yet it was
clear almost from the outset that mail and phone studies each had their own set of
limitations and biases. Mail surveys allowed you to provide visual stimuli but
you had little control over who answered, when and with what sorts of
preparation. Additionally, it was all but impossible to stop people from
backtracking or otherwise distorting the order in which they answered
questions. Phone surveys solved some of these problems but had their own issues
to contend with – no visuals, for example, unless they were distributed
beforehand. But more troublesome was the temporal imperative to answer the
questions in relatively short order, whether or not you understood the question or had
the wherewithal to answer. Each approach had advantages but left something out
in the process.
An extraordinary tool
In comparison, the
Internet is an extraordinary tool. It can be used in so many different ways –
from analyzing content that flows on its own (e.g., blogs, reviews, social
media) to various synchronous and asynchronous querying techniques from chat
boards to online focus groups, which simulate face-to-face encounters with
considerable precision. And with today’s smart mobile devices, respondents can
take the interview with them – into their homes where they’re comfortable or to
the store where they can describe what goes through their heads as they weigh
their options.
However, as with all of the new techniques before it,
the Internet still leaves an important body of information on the table, even as our
technological skills bring us closer to the experience of face-to-face
communications. My sense is that Marshall McLuhan’s phrase, “The medium is the
message,” is as relevant to Internet-based research as it was to the advent of
TV when McLuhan’s Understanding Media: The Extensions of Man was
published in 1964. Indeed, I am absolutely sure it is.
A social animal
Man is a social animal. Parts of our brains have
evolved over many millennia to attend to communications that take place on
levels other than verbal. There are thousands of scientific articles attesting
to the fact that a lot of what we “say” to each other is transmitted via all of
our senses in ways so nuanced as to defy verbal recognition.
Moms and infants communicate with each other long
before language forms for the little one. Adults recognize others’ dispositions
and moods instantly without knowing exactly how or why. Experts in nonverbal
communication tell us how to recognize when people are lying, just nervous or,
perhaps, romantically inclined. In Blink, Malcolm Gladwell describes
people’s ability to make good decisions instantly even when high-level
cognitions tell them to do otherwise – decisions made on the basis of nonverbal
communication.
The effects of
nonverbal cognition are an important part of social and personal understanding.
Today’s brain scientists tell us that we echo the appropriate emotions of
others whom we are watching. We experience genuine fear and excitement and
sadness and anxiety as we observe others in situations manifesting those
emotions. So it happens that group behavior ebbs and flows with a rhythm that
is interconnected among group members in ways that cannot be explained as an
aggregate of individuals’ isolated thoughts and feelings. Say what you will
about groupthink, the truth is that we think and behave as groups in real life.
We are an inherently social species.
Taken in context
When you have a
group discussion in a face-to-face environment, interactions take place on an
entirely different level than they do in an Internet-based focus group.
Interactions in digital groups take place on the basis of literal
interpretations of what is being said.
Interactions in a face-to-face group are
based on literal communications taken in the context of a continuous flow and
interpretation of descriptive metadata (having to do with how the information
was delivered) – information our brains have come to understand over the
past million or so years. Such modifications often yield very different sets of
messages.
I’m not trying to say that moderators, even the best
of them, are extraordinary in their ability to read all the subtle cues that
emanate from face-to-face encounters. What I’m saying is that all of us are
hardwired to do this. A good moderator is perhaps better tuned in to such vibes
than most. He or she may not be a studied expert in turning covert
communications into overt messages but has learned how to use that underlying
current of information to arrive at a deeper sense of what someone is really
trying to say.
A good moderator
thinks carefully about the literal meaning of what someone is saying, modifies
that literal meaning in the light of nonverbal cues and then – this is the key
– asks the respondent to clarify the extent to which the moderator’s
interpretation of what was meant fits or doesn’t fit the respondent’s intended
message. This is an iterative process and dramatically more effective in person
than online.
Furthermore, and
just as importantly, this same process is going on with everyone who is party
to the conversation. Because we are human and because our brains are designed
to attend to the flow of emotive cues that surround individual pronouncements,
our reactions to what someone says in our presence are continuously being modified
and do not always track with a literal interpretation of what was actually
mouthed. Despite the efforts of even the most rigid of moderators, those
reactions enter into the dynamics of all group discussions.
The metadata
People posture.
They do it all the time. They do it to convince others and themselves that they
truly are the person they project. The interesting thing is that we’re often
more capable of acknowledging – or at least more willing to acknowledge – the pretenses of others
than we are our own. Focus groups (as well as 12-step programs) make use of the
fact that, as social animals, we sense when other people are misrepresenting
themselves, perhaps because we are familiar with the same pattern in ourselves
but also because we are all expert at attending to nonverbal cues that
accompany the communication. Furthermore, it doesn’t take long in a physical
group setting for members to press each other to explain discrepancies between
what they literally say and what they seem to be saying when one takes into
account the metadata.
Unfortunately, much
of that metadata is missing when we interface digitally. In a face-to-face
setting, micro-expressions that would never be seen on screen are readily
apparent, such as eyes rolling back, a one-sided sneer, a particularly intense
rather than offhand delivery – all things that fine-tune the intent of one’s
words. Body language is vivid. Hand gesticulations, submissive posture, a
slight turn away from the listener or a cock of the head all add shades of
meaning. Sighs, snorts, giggles, huffs and murmurs emphasize the emotional
components of a viewpoint. Indeed, a host of marginal cues that may not be seen
or felt in an Internet transmission are obvious in a face-to-face setting –
perhaps to be considered at a low level of consciousness but nonetheless
pertinent to interpreting an intended meaning.
And I’m only mentioning here things that are overtly
apparent. Some cues that are precluded from Internet transmission entirely
(e.g., odors, flop sweat) can modify how we interpret what literally
spills from the mouth. Smell-O-Vision has been talked about for years but it
isn’t here yet.
Lost in transmission
It’s not just
odors, of course, that are lost in transmission. The sensing devices that feed
remote interviews are, by and large, fixed in their focus. Cameras are
generally trained on the face and upper torso and rarely offer acute details of
either. Microphones also tend to be focused to filter out extraneous noise,
blocking metadata in the process. Because almost all transmissions involve
duplex communication, there is very little by way of useful sidebar
information.
In a traditional,
face-to-face focus group, I can direct my visual or audio focus anywhere I want
at any time. If there’s a sidebar event taking place, I can divert my attention
from the primary conversation to the sidebar (and I can assure you that
sidebars are frequently more interesting and relevant than the primaries).
Doing this is all but impossible in a digital encounter where sidebar
information, if present, is generally too indistinct or garbled to track.
A complete story
So, it happens that
40+ years of conducting marketing research studies of all types have convinced
me that face-to-face inquiry is an essential part of truly understanding
people’s thoughts, feelings and dispositions toward the products, services and
communications we study in our work. To be sure, we can get a huge amount of
reliable and valid information by carefully collecting data via the Internet
but we won’t have a complete story until we lace in some of the richness that
comes only from sitting down across from someone in the physical world and
talking things through.
The problem, of
course, is that face-to-face work is expensive and time-consuming. It is
especially difficult when you need to talk to people who are geographically
dispersed. Still, I believe leaving out face-to-face work entirely is
akin to being the drunk who looks for his lost watch under a streetlamp because
that’s where the light’s best.
What I wish to advocate
here is that, as an industry, we develop hybrid approaches to research that
include components suited to digital research in addition to substantial
face-to-face work.
A goodly number of
our clients are already marrying digital and face-to-face approaches in ways that
transcend the sum of their parts to create new avenues of understanding. The
surface is just being scratched, with new ways of using smartphones and tablets
to gather personalized observations and bringing those observations into
face-to-face settings. I believe these approaches have enormous promise for
illuminating people’s attitudes and motivations.
Not just words
But no matter how elegantly it is
done, no matter how closely the medium mimics reality, I remain convinced that
if you don’t spend a good deal of time and energy on thoughtful discourse in
the physical presence of your customers, you are never going to understand
exactly what they are trying to tell you. Articulateness is not just a matter
of words. It also comes from the way words are packaged and no emoticons – no
matter how clever – can achieve the warmth of face-to-face interaction.
By Stephen Turner
Stephen Turner is chairman of Fieldwork Inc., a Chicago research company. He is based in Honolulu.
This article was originally published in December 2012 by Quirk’s Marketing Research Review. Article ID: 201206608