Crunching the numbers can expose myths

A recent article in the New York Times Magazine’s Freakonomics column, and one of my favorite books of the year, both remind me that a careful examination of data can dispel long-held myths. Neither is directly related to a particular marketing challenge. But they both inspire me to continue to goad my clients into thinking beyond the obvious. We can seize a strong competitive advantage by assuming nothing and testing our premises whenever possible.

The Freakonomics article is ostensibly about soccer and an odd correlation between player excellence and the month a player was born. Analyzing the birth months of some of Europe’s best soccer players, the researchers found that a disproportionate number were born in the first three months of the year. When they looked deeper, they realized there was a logical explanation: children born in these months were exposed to more months of coaching in their schools, which meant more repetition and more chances to excel.

It suggests that the Two P’s of practice and passion, as opposed to simply having “raw talent,” matter far more to excellence than is commonly believed. Thus the title of the Freakonomics article: “A Star Is Made.”

If you know me, you know I care little about sports. But after reading Moneyball by Michael Lewis, I was so inspired I bought three copies*. One to keep, one to pass around to co-workers, and a third to give to my father as a gift. The message of the Freakonomics article was that stars are made, not born. Similarly, the message of this book, about the unlikely, data-driven success strategy of the Oakland Athletics baseball team, is: “A winning baseball team is made, not bought.”

Read the book, and marvel at how Billy Beane, the general manager, refused to accept the groupthink of baseball scouts and the pull of the status quo. He wouldn’t listen when they told him how to identify promising players for his under-financed, under-performing team.

It’s a great read, and another reminder that looking at the data instead of listening to the way things have always been done can pay huge dividends.

*Thank you Bret Stasiak, my boss from my BVK/respond360 days, for letting me know about this wonderful book!

Internal search data is free, quantitative usability testing, if you use it

Even if I’ve never met you or visited your web site, I can diagnose with a fair amount of certainty what many users say about it. Whether you realize it or not, they don’t particularly enjoy visiting your site.

That’s because most people use web sites only out of necessity. And your web site really has only one responsibility to these people: To give them the information they value. Period.

Ideally this trade of “effort for information” should be short and sweet. No visitors to your site want to feel like they’re on a scavenger hunt. But that’s exactly what it often feels like, and it pisses them off. Thus, your site’s low conversion rates and high abandonment rates. How do I know about those? They’re about as predictable as inhaling and exhaling.

So how do you take some of the frustration out of using your web site? Simple. Fix your site’s confusing navigation and its improperly labeled and organized content.

And I suggest you start with the single easiest and best source for learning what’s missing on your site: the data from your internal search.

Think about it. If you have an internal search engine operating right now, the people who find your site the most frustrating are often typing out their frustration in that little text box. The sound of user dissatisfaction (dissatisfaction with your navigation, dissatisfaction with your content) is right there … loud and unequivocal. But it’s got to be captured and measured or this gold mine of information is lost.

Okay, here’s a shameless plug: my team at ec-connection and I build this system into many of our clients’ web specifications. By tabulating the search phrases that users type in, we get to see what’s frustrating them, or at the very least, what they want to see on the site and aren’t finding. With this valuable, free quantitative research, we can fix our clients’ navigation and content problems. And watch the searches, and the user pain they suggest, fall off.
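For the technically inclined, the tabulation itself can be dead simple. Below is a minimal sketch of the counting step, assuming a hypothetical log file (search_queries.log) that records one raw internal-search phrase per line; your own setup will differ, but the idea is just counting what people type.

from collections import Counter

def top_search_phrases(log_path, limit=25):
    """Return the most frequent internal-search phrases, normalized to lowercase."""
    counts = Counter()
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            phrase = line.strip().lower()
            if phrase:
                counts[phrase] += 1
    return counts.most_common(limit)

if __name__ == "__main__":
    # Print the top phrases with their frequencies, most common first.
    for phrase, hits in top_search_phrases("search_queries.log"):
        print(f"{hits:6d}  {phrase}")

Run something like that against a few weeks of logged searches, and the most common phrases, i.e., the things visitors can’t find through your navigation, float straight to the top.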

Many customers who made the same types of phone calls as you also bombed The World Trade Center

I’m not ordinarily a defender of the Bush Administration’s response to the World Trade Center attacks, but the database analysis proponent in me feels something should be clarified in the minds of most Americans. According to a recent NEWSWEEK poll, “53 percent of Americans think the NSA’s surveillance program ‘goes too far in invading people’s privacy.’” This, of course, refers to the collection of cell phone and other telephone records and the mining of those records for clues to possible terrorists.

The outcry, I think, is in part because when we think of phone surveillance we think of wire-tapping (or, in the case of cell phones, wireless-tapping). However, if I understand this situation correctly, the NSA used this vast database of phone call numbers (both of originators and recipients), along with call dates, times and lengths, to look for suspicious patterns that were similar to those found in known terrorists’ phone behaviors.

I know, I know. If you analyze for this type of activity, you can also find patterns in the activities of your political enemies. Imagine the blackmail potential! It could shut down Washington! (Hmmm … could the blackmail have already begun?)

But let’s assume for the moment that we could somehow shine some light on the activity, thus preventing such abuses. Is this data mining an invasion of privacy? I suspect it’s closer to the surveillance we’re all accustomed to — and appreciative of — in our quiet suburban neighborhoods.

Probable cause is the term used to justify a police officer pulling over a citizen for questioning. I would equate this database research to looking for probable cause. So how is the research done? It uses the same technique that marketers use to predict whether a consumer will like this product versus that one.

For instance, you buy a CD on Amazon, and the web site immediately says, “Other customers of ours who bought that CD also purchased these.” Then it lists three or four other, often surprisingly unrelated, artists, along with their latest CDs. If you have a big enough music collection, and predictable enough tastes, you’re surprised to find that you already love the work of one or two of those other artists. Amazing!

Amazon, and other large marketers using this profiling, let you know in advance that they looked into their database and found those correlations (through the statement, “Other customers of ours …”). What they don’t tell you is that usually, those data relationships are — on their own — too obscure or unrelated to be recognized in any way other than by using a sophisticated statistical regression analysis.
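For readers who like to see the mechanics, here is a bare-bones sketch of the co-occurrence counting behind a “customers who bought X also bought Y” feature. The handful of orders is made up, and real recommendation systems layer far more sophisticated statistics (the regression analysis mentioned above) on top of counts like these.

from collections import Counter

# A few made-up orders; each is the set of artists in one purchase.
orders = [
    {"Miles Davis", "John Coltrane", "Bill Evans"},
    {"Miles Davis", "Bill Evans"},
    {"Miles Davis", "Radiohead"},
    {"John Coltrane", "Bill Evans"},
]

def also_bought(item, orders, limit=3):
    """Count how often other items appear in orders that contain `item`."""
    co_counts = Counter()
    for order in orders:
        if item in order:
            co_counts.update(order - {item})
    return co_counts.most_common(limit)

print(also_bought("Miles Davis", orders))
# e.g. [('Bill Evans', 2), ('John Coltrane', 1), ('Radiohead', 1)]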

The same goes for this NSA action. I think a lot of Americans are concerned because they imagine an all-seeing computer is examining every single phone call they make or receive. I also suspect they are angry because now they have yet another privacy vulnerability to worry about, along with identity theft, spyware, etc.

But I suspect the process of profiling done by the NSA is more along the lines of the Amazon example. The predictive model takes into consideration thousands of weak correlations: possible coincidences that are only significant because, when added together, they match the behavior of known terrorists. (I would say convicted terrorists, but good old Mr. Moussaoui is about it, and that’s an awfully small sample to model against! Known domestic terrorists would include the men who died in their planes on 9/11, having made plenty of phone calls before they did.)

So, if that is the case, is this intrusive? That depends.

Is a police officer driving through your quiet residential neighborhood invading your neighborhood’s privacy when looking for probable cause to investigate a possible crime? This officer may not stop if one suspicious fact is noted about someone in your neighborhood. Maybe even two or three aren’t sufficient for probable cause. Each on its own may be too subtle, too similar to the behavior of those not breaking the law. But if there are enough suspicious facts concentrated around the behavior of, let’s say, that guy parked outside your door, then the officer will conclude the correlation is too great. The behavior and evidence surrounding that guy show too many similarities to those of convicted criminals. This behavior, taken as a whole, is too close to that of a burglar, let’s say.

The brain of that cop isn’t going to retain much information the next day, or even the next hour, about the non-suspicious behaviors he observed. In a similar way, I don’t think the NSA’s computers will be able to do much else but identify the behavior patterns they are programmed to sniff out.
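If it helps to picture that accumulation of weak evidence in concrete terms, here is a toy sketch of the idea: each observation is too weak on its own, but a simple weighted sum of them can cross a “probable cause” threshold once enough pile up. Every observation name, weight and threshold below is invented purely for illustration, and says nothing about how any real agency’s models actually work.

# Toy illustration: individually weak observations summed into a score
# that only crosses a "probable cause" threshold when enough accumulate.
# All observations, weights and the threshold are invented for the example.
OBSERVATION_WEIGHTS = {
    "parked_outside_same_house_for_hours": 0.2,
    "engine_running_with_lights_off": 0.2,
    "circled_the_block_repeatedly": 0.3,
    "matches_description_from_recent_burglaries": 0.4,
}
PROBABLE_CAUSE_THRESHOLD = 0.7

def looks_suspicious(observed):
    """Sum the weights of whichever weak observations were actually made."""
    score = sum(OBSERVATION_WEIGHTS.get(o, 0.0) for o in observed)
    return score >= PROBABLE_CAUSE_THRESHOLD

print(looks_suspicious({"engine_running_with_lights_off"}))  # False: one weak signal
print(looks_suspicious({
    "engine_running_with_lights_off",
    "parked_outside_same_house_for_hours",
    "matches_description_from_recent_burglaries",
}))  # True: together they cross the threshold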

Which brings me back to my original observation. How in the world did I become a defender of Bush? The answer is that the NSA, under his watch, found a non-intrusive way to comb this country for possible criminal activity. I only pray that there will now be enough judicial (and judicious) oversight to ensure that the profiling being done is for real enemies of the state, and not enemies of the administration and its incumbents.

What if the contents of your home page were ultimately controlled by Google?

I’ve noticed that the growing power of search engines has brought about a new way to look at the design of a commercial web site. The old approach was to design a site starting with the Home Page. That was the presumed entry page — at least much of the time.

Paradoxically, the new paradigm suggests we should design our sites with pages beyond our direct control in mind. 

Today I’ll focus on the entry pages to your site that a very important set of prospects uses. I’m talking about search engine results pages (SERPs) on important search engines, for specific, relevant search terms.

You know what you want to say on your home page. But what about what is said on a search results page? If you consider that you will be getting 10% to 15% of your total site traffic from search engines (a norm we’ve witnessed with many of our commercial sites), can you afford to ignore the influence that these SERPs have on consumers who click through to you? Or worse, the influence they have to cause other prospects not to click through?

I suggest we all regularly check to see what descriptions are showing up for our sites on important SERPs. What’s more, consider setting up a way to find correlations between the best descriptions in these organic listings and those consumers’ chances of converting from visitors to customers. It can be done, and could yield better ROI from those visits. Remember, these folks are pre-qualified and are often your very best prospects!
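Here is one bare-bones way such a correlation check might look, assuming hypothetical visit records that pair the organic search phrase (or landing page) that brought a visitor in with whether that visit converted. In practice those fields would come from your analytics package, and you would want far more visits before trusting the rates.

from collections import defaultdict

# Hypothetical visit records: the organic phrase that brought the visitor
# in, and whether that visit converted. Real data would come from analytics.
visits = [
    {"search_phrase": "affordable patio furniture", "converted": True},
    {"search_phrase": "affordable patio furniture", "converted": False},
    {"search_phrase": "patio furniture reviews", "converted": False},
    {"search_phrase": "patio furniture reviews", "converted": False},
]

def conversion_rate_by_phrase(visits):
    """Group visits by search phrase and compute each phrase's conversion rate."""
    totals = defaultdict(lambda: [0, 0])  # phrase -> [visit count, conversions]
    for visit in visits:
        totals[visit["search_phrase"]][0] += 1
        totals[visit["search_phrase"]][1] += int(visit["converted"])
    return {phrase: conversions / count for phrase, (count, conversions) in totals.items()}

for phrase, rate in sorted(conversion_rate_by_phrase(visits).items(),
                           key=lambda item: item[1], reverse=True):
    # Highest-converting phrases first.
    print(f"{rate:5.1%}  {phrase}")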

By the way, I’m talking here exclusively about “organic” results — the results that are generated by a search engine’s true search algorithms. Much has been written elsewhere about testing and tweaking text listings for pay-for-click ads. My point is, why not apply this same discipline to refining your organic results descriptions?

You may even eventually want to optimize the key pages that are most likely to be visited from these organic search click-throughs, to ensure that what is stated in the SERPs’ descriptions is restated on the landing page. It could be the difference between a new, satisfied customer and a frustrated, departing visitor.

Speaking of not frustrating your site visitors, my plan is to follow this post with one about the easiest and most sure-fire way to improve your site’s navigation. Stay tuned.