William M. Briggs

Statistician to the Stars!

My Complete And Utter Lack Of Business Sense

Hot statistician for sale. Act today!

I can derive the Omega equation, can tell you why clouds float, and I know what number comes after ℵ0, but I couldn’t sell fire to an Eskimo.

Not for lack of trying. That’s item one.

I don’t recall who, but a gentleman once commented on a post here that business knowledge and expertise are entirely different from scientific and engineering knowledge and expertise, and that scientists, bright fellows all, assume that because they have mastered group theory and Feynman diagrams, operating a business is a cinch.

It isn’t.

Item two is this:

Monsanto just bought The Climate Corp., née WeatherBill, for a cool nine-hundred-and-thirty million-with-an-M bucks. If you’re like me—and you may thank the good Lord you are not—your immediate thought was…those cheap so-and-sos couldn’t come up with a lousy extra seventy million?

Climate Corp. specializes in “Using ‘hyper-local weather monitoring,’ predictive models and other data to help farmers make optimal growing decisions”. In other words, taking the output from skillful climate models—those out a month or two have skill (in the technical sense)—and blending it with weather to produce better weather forecasts.

I had the same idea years back and published the general idea in Journal of Climate. (The difference between skillful and non-skillful forecasts is what convinced me climatologists ought not to cherish their creations as much as they do, incidentally.)

This idea—better, more precise and tailored weather forecasts—was so good that I and a couple of friends set up a company to sell them back in the 2000s. Called ourselves Gotham Risk Management. The technical side of the biz wasn’t difficult. Creating a database, building automatic models, a website of course and all that held no mystery for us.

But then came phone calls with potential clients. I’d open with, “Gbbbb, uh, bwwwpth, you see, um.” After confirming that I wasn’t a patient from the Speech Pathology Lab, my listener would beg to be told what the heck I wanted. I had an answer: “Buy our forecasts. You won’t be sorry.”

Shyness, as anybody who has ever been within one hundred standard English yards of me will confirm, is not my problem. Knowing how to convince people to part with their money is.

Here I had this tool that would let people price their weather insurance and climate derivatives more correctly, that would make them more money than it would cost, but I couldn’t convince anybody to try it. It didn’t help that the futures market in heating and cooling degree days in Chicago had frozen solid—literally, trading ceased entirely because of the naughtiness originating at Enron. But that’s no excuse.

Item three. So I had another terrific idea that would track people’s predictions and rate them in an optimal way. Doesn’t sound like much, until you tie it to something practical. How about sports betting? With a company called (at first) EdgeHogs, we tried.

It’s easy to make accurate sports predictions, but only for games which any clod can guess. University of Michigan v. St Peter’s Boys Junior Academy? Better get that one right. Tight games are hard. The optimal rating takes these kinds of things into account. We did all the big sports.

Everybody, including pros, entered guesses into the system and was rated, using a bevy of statistics about moneylines, Vegas odds, and all that.

So what? Well, exactly. The idea the guys with the money had (Yours Truly has an amount close to the subscript on the first kind of infinity mentioned above) was to make the business into a social media phenomenon. Get rated for bragging rights, win prizes. Didn’t fly. Social media is saturated.

Idea is still good, though, and should be applied in a way such that better predictions are produced from knowledge of how old ones fared. These superior ones can be sold. Money, baby. And it doesn’t have to be sports, but how about stocks? Or even weather predictions again?

Couldn’t sell this, either.

Item four. My latest mission has been to tell people who routinely use statistics, mainly marketing companies, that the results on which they rely are too certain. Being too certain means suboptimal and even bad decisions are made. And don’t people want to know that?

Well, my selling talent has proved itself again. Either that, or it turns out people really do want to be too certain.

Gist? There’s much truth in the old adage that those that can’t do, teach.


Everything Wrong With P-Values Under One Roof

Statistics is the only field in which men boast of their wee p-values.

Handy PDF of this post

See also: The Alternative To P-Values.

They are based on a fallacious argument.

Repeated in introductory texts, and begun by Fisher himself, are words very like these (adapted from Fisher, R. 1970. Statistical Methods for Research Workers. Oliver and Boyd, Edinburgh, fourteenth edition):

Belief in a null hypothesis as an accurate representation of the population sampled is confronted by a logical disjunction: Either the null hypothesis is false, or the p-value has attained by chance an exceptionally low value.

Fisher’s choice of words was poor. This is evidently not a logical disjunction, but can be made into one with slight surgery:

Either the null hypothesis is false and we see a small p-value, or the null hypothesis is true and we see a small p-value.

Stated another way, “Either the null hypothesis is true or it is false, and we see a small p-value.” Of course, the first clause of this proposition, “Either the null hypothesis is true or it is false”, is a tautology, a necessary truth, which transforms the proposition to “TRUE and we see a small p-value.” Or, in the end, Fisher’s dictum boils down to:

We see a small p-value.
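The reduction above can be written out as a short propositional calculation. Here H stands for “the null hypothesis is true” and S for “we see a small p-value” (the symbols are mine, introduced only to display the steps):

```latex
\begin{align*}
  &(\neg H \land S) \lor (H \land S) && \text{Fisher's disjunction, repaired}\\
  \equiv\; &(\neg H \lor H) \land S  && \text{distribute } \land \text{ over } \lor\\
  \equiv\; &\top \land S             && \neg H \lor H \text{ is a tautology}\\
  \equiv\; &S                        && \text{i.e., ``we see a small p-value.''}
\end{align*}
```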

In other words, a small p-value has no bearing on any hypothesis (unrelated to the p-value itself, of course). Making a decision because the p-value takes any particular value is thus always fallacious. The decision may be serendipitously correct, as indeed any decision based on any criterion might be, and it often is correct because experimenters are good at controlling their experiments, but it is still reached by a fallacy.

People believe them.

Whenever the p-value is less than the magic number, people believe or “act like” the alternate hypothesis is true, or very likely true. (The alternate hypothesis is the contradiction of the null hypothesis.) We have just seen this is fallacious. Compounding the error, the smaller the p-value is, the more likely people believe the alternate hypothesis true.

This is also despite the strict injunction in frequentist theory that no probability may be assigned to the truth of the alternate hypothesis. (Since the null is the contradiction of the alternate, putting a probability on the truth of the alternate also puts a probability on the truth of the null, which is also thus forbidden.) Repeat: the p-value is silent as the tomb on the probability the alternate hypothesis is true. Yet nobody remembers this, and all violate the injunction in practice.

People don’t believe them.

Whenever the p-value is less than the magic number, people are supposed to “reject” the null hypothesis forevermore. They do not. They argue for further testing, additional evidence; they say the result from just one sample is only a guide; etc., etc. This behavior tacitly puts a (non-numerical) probability on the alternate hypothesis, which is forbidden.

It is not the non-numerical bit that makes it forbidden, but the act of assigning any probability, numerical or not. The rejection is said to have a probability of being in error, but this is only for samples in general in “the long run”, and never for the sample at hand. If it were for the sample at hand, the p-value would be putting a probability on the truth of the alternate hypothesis, which is forbidden.

They are not unique: 1.

Test statistics, which are formed in the first step of the p-value hunt, are arbitrary, subject to whim, experience, culture. There is no unique or correct test statistic for any given set of data and model. Each test statistic will give a different p-value, none of which is preferred (except by pointing to evidence outside the experiment). Therefore, each of the p-values is “correct.” This is perfectly in line with the p-value having nothing to say about the alternate hypothesis, but it encourages bad and sloppy behavior on the part of p-value purveyors as they seek to find that which is smallest.
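A minimal sketch of the point, with the data invented for illustration: the same twelve paired differences, fed to two perfectly legitimate test statistics for the same null of “no effect” (a z-style mean test using the normal approximation, and an exact sign test), return very different p-values, one beneath the magic number and one nowhere near it. The helper functions are mine, not from any post.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_mean_test(d):
    """Two-sided p-value for mean(d) = 0, normal approximation."""
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)
    z = mean / math.sqrt(var / n)
    return 2.0 * (1.0 - phi(abs(z)))

def p_sign_test(d):
    """Exact two-sided sign test for median(d) = 0."""
    n = len(d)
    k = sum(1 for x in d if x > 0)
    tail = min(k, n - k)
    lower = sum(math.comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2.0 * lower)

# Invented paired differences (say, treatment minus control).
d = [0.8, -0.2, 1.1, 0.4, -0.5, 0.9, 0.3, 1.2, -0.1, 0.6, 0.7, -0.3]

p1 = p_mean_test(d)  # below the magic 0.05
p2 = p_sign_test(d)  # nowhere close
print(p1, p2)
```

Neither statistic is wrong; they simply measure departure from the null differently, and the data alone cannot say which to prefer.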

They are not unique: 2.

The probability model representing the data at hand is usually ad hoc; other models are possible. Each model gives different p-values for the same (or rather equivalent) null hypothesis. Just as with test statistics, each of these p-values is “correct,” etc.

They can always be found.

Increasing the sample size drives p-values lower. This is so well known in medicine that people quote the difference between “clinical” versus “statistical” significance. Strangely, this line is always applied to the other fellow’s results, never one’s own.
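A back-of-envelope illustration, with the numbers invented for the purpose: suppose a trivially small true effect of 0.05 standard deviations, so the null of “no effect” is false but negligibly so. For a known-variance z test the statistic is delta times the square root of n, and the p-value collapses as the sample grows:

```python
import math

def p_value(delta, n):
    """Two-sided p-value for a true standardized effect delta at
    sample size n (known-variance z test, computed exactly)."""
    z = delta * math.sqrt(n)
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

delta = 0.05  # a clinically meaningless effect: 1/20 of a standard deviation
for n in (100, 1_000, 10_000, 100_000):
    print(n, p_value(delta, n))
```

The p-value crosses the magic 0.05 threshold somewhere around n ≈ 1,500 even though the effect is, by construction, clinically nil: statistical significance purchased wholesale with sample size.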

They encourage magical thinking.

Few remember its definition, which is this: Given the model used and the test statistic dependent on that model and given the data seen and assuming the null hypothesis (tied to a parameter) is true, the p-value is the probability of seeing a test statistic larger (in absolute value) than the one actually seen if the experiment which generated the data were run an indefinite number of future times and where the milieu of the experiment is precisely the same except where it is “randomly” different. The p-value says nothing about the experiment at hand, by design.

Since that is a mouthful, all that is recalled is that if the p-value is less than the magic number, there is success, else failure. P-values work as charms do. “Seek and ye shall find a small p-value” is the aphorism on the lips of every researcher who delves into his data for the umpteenth time looking for that which will excite him. Since wee p-values are so easy to generate, his search will almost always be rewarded.
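The arithmetic behind “seek and ye shall find” can be sketched with a toy simulation (all numbers invented): every null below is true by construction, yet a researcher who takes twenty looks at null data stumbles on at least one wee p-value roughly two times in three, since 1 − 0.95^20 ≈ 0.64.

```python
import math
import random

random.seed(1)

def p_null_test(n=25):
    """Draw a sample for which the null ('mean is zero') is TRUE,
    then compute a two-sided z-style p-value for it."""
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    z = mean / math.sqrt(var / n)
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

researchers, looks = 2_000, 20
found = sum(
    any(p_null_test() < 0.05 for _ in range(looks))
    for _ in range(researchers)
)
# Roughly 1 - 0.95**20 ≈ 0.64; a bit higher here because the normal
# approximation is slightly liberal at n = 25.
print(found / researchers)
```

Every “discovery” in this simulation is, by construction, noise.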

They focus attention on the unobservable.

Parameters—the creatures which live inside probability models but which cannot be seen, touched, or tasted—are the bane of statistics. Inordinate attention is given them. People wrongly assume that the null hypotheses ascribed to parameters map precisely to hypotheses about observables. P-values are used to “fail to reject” hypotheses which nobody believes true; i.e. that the parameter in a regression is precisely, exactly, to infinite decimal places zero. Confidence in real-world observables must always be necessarily lower than confidence in parameters. Null hypotheses are never “accepted”, incidentally, because that would violate Fisher’s (and Popper’s) falsificationist philosophy.

They base decisions on what did not occur.

They calculate the probability of what did not happen on the assumption that what didn’t happen should be rare. As Jeffreys famously said: “What the use of P[-value] implies, therefore, is that a hypothesis that may be true may be rejected because it has not predicted observable results that have not occurred.”

Fans of p-values are strongly tempted to this fallacy.

If a man shows that a certain position you cherish is absurd or fallacious, you multiply your error by saying, “Sez you! The position you hold has errors, too. That’s why I’m going to still use p-values. Ha!” Regardless whether the position the man holds is keen or dull, you have not saved yourself from ignominy. Whether you adopt logical probability or Bayesianism or something else, you must still abandon p-values.

Confidence intervals.

No, confidence intervals are not better. That’s for another day.



Diversity! Update (Now With More Diversity)



Diversity. Diversity, diversity diversity diversity diversity. Diversity?

Diversity diversity diversity diversity diversity, diversity diversity diversity diversity diversity, diversity diversity. Diversity diversity diversity diversity? Diversity diversity diversity diversity: diversity diversity diversity.

Diversity diversity diversity, diversity diversity diversity. Diversity diversity diversity diversity diversity diversity.

“Diversity diversity diversity diversity diversity diversity; diversity diversity diversity, diversity diversity diversity.”


  1. Diversity, diversity.
  2. Diversity diversity diversity diversity.
    • Diversity?
    • Diversity: diversity.
  3. Diversity diversity diversity.
  4. Diversity.

Diversity diversity diversity—diversity? diversity!—diversity diversity diversity.


Diversity diversity diversity, diversity diversity diversity. Diversity diversity diversity diversity diversity diversity. Diversity diversity diversity diversity diversity diversity; diversity diversity diversity, diversity diversity diversity.

Diversity diversity diversity diversity diversity chicken.

Update Diversity, diversity, diversity, diversity.


On The Intersection Of Scientist And Politicians—Guest Post by An Engineer

When science and politics meet, Red is the result?

“An Engineer” is a gentleman who has practiced Civil, Mechanical and Industrial Engineering for more than 40 years. He has been granted patents for a number of practical inventions, and continues to invent while enjoying retirement. He also is a retired US Army Lieutenant Colonel who served proudly in the Corps of Engineers.

Scientists don’t make anything. They discover new knowledge, and the process of discovery is well organized but the outcome is messy, subject to wide debate, and extensive revision.

Politicians don’t make anything. They talk about problems, real or imagined, and make policies and laws to address those problems. Their process is well organized but the outcome is also messy, subject to wide debate, and extensive revision.

Here is the intersection. Politicians rely on scientists to give them a problem to talk about for there is a lot of talking in politics. Politicians then award more money to scientists to discover more problems. Hint: the best problems for both are those without solutions. For neither is equipped to solve problems, just to discover, talk and tax.

The top 259 scientists comprising the Intergovernmental Panel on Climate Change from 30 countries, mostly funded by politicians from the 30 countries, announced an important finding after 52 hours of non-stop revision and then danced in celebration when they released their latest giant climate report on September 26th.

Only a summary of the report was published—the million-word full version will follow—but over the last week “every single word” has been justified to 110 governments. Their conclusion: mankind is causing global warming with 95% confidence. The report is released just in time for a conference to forge political agreement in Paris in December.

What will the scientific-political-media event in Paris decide? Certainly there will be a lot of discussion about the damage mankind is doing to the planet. Likely there will be statistical affirmation showing a high probability of disaster looming in the future. The conferees may conclude, with high confidence, that people have the power and means to irreversibly alter the earth and its atmosphere by emitting carbon.

So when the conference report is adopted and implementation guidance is published, mankind will have acquired god-like power to change 5.97219 × 10^24 kg of mass, not including the atmosphere (pegged at 1 millionth of total earth mass), in the span of but 253 years (1760 to current). Who is responsible for the mile-thick glacial ice over Cincinnati 100,000 years ago?

Paris is cold in December—often, but not always. Assume there is a warm spell. After all there could be, for the only certainty we have about weather is its variability. And hundreds of millions of days of weather equals climate. Climate change is a well-established scientific premise. Not well established are predictability and causes. Here is a simple test. How accurate was last night’s low temperature forecast, and if inaccurate, what was the cause? Chart ‘predicted’ versus ‘actual’ for a month and note the variability. Variation defeats certainty, and without an identified cause prediction becomes a best guess—just like climate forecasting.

But politicians do not need to consider variability. They do not need, and most do not want, to understand science. They have paid for scientifically generated, peer-reviewed, “statistically sound” findings. And remember, the best political problem is one without a solution, and climate change will produce lots of sound bites for the media, who will scare us into compliance for the TAX. For politicians decide who pays for all this carbon which, if unchecked, will destroy civilization as we know it. Their only dilemma is figuring out how to tax us for comfortable homes, cars and the like. On reflection, there might be one problem politicians are capable of solving—deciding who pays the bill. I wonder where the money will go!


© 2015 William M. Briggs
