December 9, 2007 | No comments
That’s a fairly typical ad, which is now running on TV, and which is also on Glad’s web site. Looks like a clear majority would rather buy Glad’s fine trash bag than some other, lesser, bag. Right?
So what is the probability that a “consumer” would prefer a Glad bag? You’ll be forgiven if you said 70%. That is exactly what the advertiser wants you to think. But it is wrong, wrong, wrong. Why? Let’s parse the ad and see how you can learn to cheat from it.
The first notable comment is “over the other leading brand.” This heavily implies, but of course does not absolutely prove, that Glad commissioned a market research firm to survey “consumers” about what trash bag they preferred. The best way to do this is to ask people, “What trash bag do you prefer?”
But evidently, this is not what happened here. Here, the “consumer” was given a dichotomy, “Would you rather have Glad? Or this other particular brand?” Here, we have no idea what that other brand was, nor what was meant by “leading brand.” Do you suppose it’s possible that the advertiser gave in to temptation and chose, for his comparison bag, a truly crappy one? One that, in his opinion, is obviously inferior to Glad (but maybe cheaper)? It certainly is possible.
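To see how much a rigged head-to-head can inflate the number, here is a minimal simulation. Everything in it is an assumption for illustration: the five brands, their preference shares (Glad at 30%, the comparison “weak brand” at 5%), and the rule that a consumer forced to choose picks Glad unless the weak brand happens to be their favorite. None of these figures come from the ad.

```python
import random

random.seed(1)

# Hypothetical market: assumed overall preference shares.
# Glad is the outright favorite of only 30% of consumers in this sketch.
shares = {"Glad": 0.30, "Brand X": 0.25, "Brand Y": 0.25,
          "Brand Z": 0.15, "Weak Brand": 0.05}

brands = list(shares)
weights = list(shares.values())

def forced_choice(favorite):
    """Head-to-head against the weak brand: a consumer picks Glad
    unless the weak brand is actually their favorite."""
    return "Glad" if favorite != "Weak Brand" else "Weak Brand"

n = 100_000
favorites = random.choices(brands, weights=weights, k=n)

glad_outright = sum(f == "Glad" for f in favorites) / n
glad_head_to_head = sum(forced_choice(f) == "Glad" for f in favorites) / n

print(glad_outright)       # close to the assumed 0.30 share
print(glad_head_to_head)   # far higher: roughly 1 minus the weak brand's share
```

Under these made-up numbers, a brand that only 30% of consumers actually prefer wins the head-to-head about 95% of the time, simply because of the opponent it was matched against.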
So we already suspect that the 70% guess is off. But we’re not finished yet.
Continue reading “How to Exaggerate Your Results: Case study #2”
December 8, 2007 | No comments
In Part I of this post, we started with a typical problem: which of two advertising campaigns was “better” in terms of generating more sales. Campaigns A and B were each tested for 20 days, during which time sales data was collected. The mean sales during Campaign A was $421 and the mean sales during Campaign B was $440.
Campaign B looks better on this evidence, doesn’t it? But suppose instead of 20 days, we only ran the campaigns one day each, and that the sales for A was just $421 and that for B was $440. B is still better, but our intuition tells us that the evidence isn’t as strong because the difference might be due to something other than differences in the ad campaigns themselves. One day’s worth of data just isn’t enough to convince us that B is truly better. But is 20 days enough?
Maybe. How can we tell? This is the part that Statistics plays. And it turns out that this is no easy problem. But please stay with me, because failing to understand how to properly answer this question leads to the most common mistake made in statistics. If you routinely use statistical models to make decisions like this—“Which campaign should I go with?”, “Which drug is better?”, “Which product do customers really prefer?”—you’re probably making this mistake too.
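One classical way to quantify “is 20 days enough?” is a two-sample comparison. The sketch below simulates 20 days of sales per campaign and computes Welch’s t statistic; the true means (421 and 440) match the post’s figures, but the day-to-day spread (a standard deviation of 50) is an assumption for illustration, since the post does not report the variability.

```python
import math
import random
import statistics

random.seed(42)

# Hypothetical daily sales, 20 days per campaign. The means match the
# post; the sd = 50 spread is an assumed, illustrative number.
sales_a = [random.gauss(421, 50) for _ in range(20)]
sales_b = [random.gauss(440, 50) for _ in range(20)]

mean_a, mean_b = statistics.mean(sales_a), statistics.mean(sales_b)
var_a, var_b = statistics.variance(sales_a), statistics.variance(sales_b)

# Welch's t statistic: the difference in means scaled by its estimated
# standard error. A large |t| means the gap is hard to explain by
# day-to-day noise alone.
se = math.sqrt(var_a / len(sales_a) + var_b / len(sales_b))
t = (mean_b - mean_a) / se
print(round(t, 2))
```

Whether the resulting t is “big enough” depends entirely on how noisy daily sales are, which is exactly why one day of data convinces nobody and 20 days only maybe.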
In Part I, we started by assuming that the (observable) sales data could be described by probability models. A probability model gives the chance that the data takes any particular value. For example, we could calculate the probability that the sales in Campaign A was greater than $500. We usually write this using math symbols like this:
Pr(Sales in Campaign A > $500 | e)
Most of that formula should make sense to you, except for the right-hand side of it. The bar at the end, the “|”, is the “given” bar. It means that whatever appears to the right of it is accepted as true. The “e” is whatever evidence we might have, or think is true. We can ignore that part for the moment, because what we really want to know is
Pr(Sales in B > Sales in A | data collected)
But that turns out to be a question that is impossible to answer using classical statistics!
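The first probability above, Pr(Sales in Campaign A > $500 | e), is at least easy to compute once a model is assumed. Here is a minimal sketch under a normal model; the mean of 421 comes from the post, but the standard deviation of 50 is an assumed, illustrative number.

```python
import math

def normal_tail_prob(threshold, mu, sigma):
    """Pr(X > threshold) for X ~ Normal(mu, sigma), via the error function."""
    z = (threshold - mu) / (sigma * math.sqrt(2))
    return 0.5 * (1 - math.erf(z))

# Assumed model for Campaign A sales: mean 421 (from the post),
# sd 50 (illustrative assumption).
p = normal_tail_prob(500, 421, 50)
print(round(p, 3))  # ≈ 0.057
```

Note what the “given” bar buys us: the answer is only a probability *given* the normal model with those particular parameters. The second probability, Pr(Sales in B > Sales in A | data), is the one classical methods cannot deliver.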
Continue reading “Why most statistics don’t mean what you think they do: Part II.”
December 7, 2007 | 2 Comments
Here’s a common, classical statistics problem. Uncle Ted’s chain of Kill ’em and Grill ’em Venison Burgers tested two ad campaigns, A and B, and measured the sales of sausage sandwiches for 20 days under both campaigns. This was done, and it was found that mean(A) = 421, and mean(B) = 440. The question is: are the campaigns different?
In Part II of this post, I will ask the following, which is not a trick question: what is the probability that mean(A) < mean(B)? The answer will surprise you.
But for right now, I merely want to characterize the sales of sausages under Campaigns A and B. Rule #1 is always look at your data! So we start with some simple plots:
I will explain box and density plots elsewhere; in short, these pictures show the range and variability of the actual observed sales for the 20 days of the ad campaigns. Both plots show the range and frequency of the sales, but in different ways. Even if you don’t understand these plots well, you can see that the sales under the two campaigns were different. Let’s concentrate on Campaign A.
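A box plot, incidentally, is built from just five numbers. The sketch below computes them for a hypothetical 20 days of Campaign A sales; the mean of 421 comes from the post, while the spread (sd = 50) is an assumed, illustrative number.

```python
import random
import statistics

random.seed(7)

# Hypothetical 20 days of Campaign A sales (mean 421 from the post,
# sd 50 assumed for illustration).
sales_a = [random.gauss(421, 50) for _ in range(20)]

# A box plot is drawn from exactly these numbers: the three quartiles
# plus the minimum and maximum of the observed data.
q1, q2, q3 = statistics.quantiles(sales_a, n=4)
print(min(sales_a), q1, q2, q3, max(sales_a))
```

The density plot smooths the same 20 numbers into a curve, which is why both pictures tell the same story in different ways.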
This is where it starts to get hard, because we first need to understand that, in statistics, data is described by probability distributions, which are mathematical formulas that characterize pictures like those above. The most common probability distribution is the normal, the familiar bell-shaped curve.
The classical way to begin is to then assume that the sales, in A (and B too), follow a normal distribution. The plots give us some evidence that this assumption is not terrible—the data is sort of bell-shaped—but not perfectly so. But this slight deviation from the assumptions is not the problem, yet.
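One crude way to eyeball the “sort of bell-shaped” claim with numbers instead of pictures: under a normal model, about 68% of the data should fall within one standard deviation of the mean. This sketch checks that on hypothetical data (again assuming mean 421 and sd 50; only the mean comes from the post).

```python
import random
import statistics

random.seed(3)

# Hypothetical Campaign A sales: normal with mean 421 (from the post)
# and sd 50 (assumed for illustration).
sales = [random.gauss(421, 50) for _ in range(20)]

m = statistics.mean(sales)
s = statistics.stdev(sales)

# Crude bell-shape check: a normal model puts about 68% of the data
# within one standard deviation of the mean. With only 20 points the
# observed fraction will be noisy.
within_one_sd = sum(abs(x - m) <= s for x in sales) / len(sales)
print(round(within_one_sd, 2))
```

With only 20 observations, such checks are rough, which is part of why the normality assumption can look “not terrible” without being right.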
Continue reading “Why most statistics don’t mean what you think they do: Part I.”
December 6, 2007 | 2 Comments
My paper on this subject will finally appear in the Journal of Climate soon. You can see its status (temporarily, anyway) at this link.
You can download the paper here.
The gist is that the evidence shows that hurricanes have not increased in either number or intensity in the North Atlantic. I’ve only used data through 2006, which is to say, not this year’s. But if I had, then, since the number and intensity of storms this past year were nothing special, the evidence would be even more conclusive that not much is going on.
Now, I did find that there were some changes in certain characteristics of North Atlantic storms. There is some evidence that the probability that strong storms (what are called Category 4 or 5) evolve from ordinary hurricanes has increased. But there has also been an increase in storms not reaching hurricane level. Which is to say, the only clear signal is that there has been an increase in the variability of the intensity of tropical cyclones.
Of course, I do not say why this increase has happened. Well, I suggest why it has: changes in instrumentation quality and frequency since the late 1960s (which is when satellites first went up, allowing us to finally observe better). This is in line with what others, like Chris Landsea at the Hurricane Center, have found.
I have also run the same set of models on global hurricanes and found the same thing. I’m scheduled to give a talk on this at the American Meteorological Society’s annual meeting in January 2008 in New Orleans. That paper is here.