William M. Briggs

Statistician to the Stars!

Daily Links & Comments

@1 Amen! One should at least suspect [BS] when symbolism and other formal techniques that could easily be dispensed with without loss of rigor and with a great gain in readability are used anyway. Reminder: I have a sensitive spam filter. Link

@2 A plucky amateur dared to question a celebrated psychological finding. He wound up blowing the whole theory wide open. Link

@3 Under the Yeah, Sure heading. How 3,000 year age of empires was recreated by a simple equation: Scientists show how math can predict historical trends with 65% accuracy. Link

@4 Following close behind…Nearly 25 percent of Asia-Pacific men rapists: study. Link

@5 The Folly of Scientism. Link

Please prefix comments with “@X” to indicate which story you’re commenting on. I should hardly need to say that a link does not necessarily imply endorsement.


How To Properly Handle Proxy Time Series Reconstructions

This is all made-up data, so as not to hurt anybody’s feelings. Also, this is a sketch. Not everything can be done in 700 words.

We are interested in the time series T, which represents values of some thing taken at progressive time points (these needn’t be regular). But we can’t measure T. We can, however, measure a proxy of T, something “correlated” or associated with T, something which might causally be affected by T. What’s a proxy? Something like this:

Figure 1

Imagine the proxy is some chemical measurement inside tree rings, coral reefs, or whatever, and T is temperature. Somehow we have taken simultaneous measurements where both the proxy and T were available. Step one is to model the relationship, which is shown by the over-plotted line (a simple linear regression). Pretty good fit, no?

It ought to be, because this is an oracle model, which is to say, the model here is true because I picked it. In real life, the model itself is usually a guess, meaning everything that follows will paint a picture of confidence which evades us in reality.
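For the curious, here is a minimal sketch of that first step in plain Python. It is not the code behind the figures. The intercept, slope, noise level, sample size, and the function names `simulate_calibration` and `fit_line` are all made up for illustration. It generates proxy and T pairs from a known straight-line model, then recovers the line by ordinary least squares.

```python
import random

def simulate_calibration(n=50, intercept=2.0, slope=0.5, noise_sd=1.0, seed=1):
    """Made-up calibration data: T = intercept + slope * proxy + noise."""
    rng = random.Random(seed)
    proxy = [rng.uniform(0.0, 20.0) for _ in range(n)]
    temp = [intercept + slope * p + rng.gauss(0.0, noise_sd) for p in proxy]
    return proxy, temp

def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Because we picked the model ourselves, the fit should land near 2.0 and 0.5.
proxy, temp = simulate_calibration()
a_hat, b_hat = fit_line(proxy, temp)
print(f"fitted intercept {a_hat:.2f}, slope {b_hat:.2f}")
```

The fit looks good only because the data were generated from the very model being fitted, which is the oracle situation described above.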

Next thing is to guesstimate T where we have no T but where we have the proxy. Like this (the proxies aren’t shown, but I used the perfect model fitted above to predict T):

Figure 2

Very well, this looks like a reasonable prediction of T given new values of proxy (using the same regression). But every good scientist knows that error bars should accompany any prediction. Here’s what people using time series usually do:

Figure 3

The fuzziness comes from looking at the error, the plus-or-minus, of the relevant parameter inside the model (standard 95% bounds). Looks like a tight prediction, no? Even after taking into account the uncertainty of the parameter, we’re still pretty sure what T was. Right? You guessed it: wrong.

For that, we need this:

Figure 4

The wider bands show the plus-or-minus of T, the prediction interval of the real observable (same bounds). There is no use plotting the uncertainty of the parameter as above, because the parameter doesn’t exist. T exists. We want to know T. This is the best guess of T, our ostensible goal, and not of anything else.
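The difference between the two kinds of bands can be sketched with the textbook formulas for simple linear regression. This is an illustration under assumptions, not the calculation behind the figures: the data are made up, `interval_half_widths` is a hypothetical helper, and 1.96 stands in for the exact 95% quantile. The only difference between the two half-widths is the extra "1" under the square root, which carries the scatter of T itself around the line.

```python
import math
import random

def interval_half_widths(x, y, x0, z=1.96):
    """Return (guess of T, parameter half-width, predictive half-width) at x0.

    z=1.96 is a normal approximation to the 95% t quantile (an assumption).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    # Residual standard deviation with n - 2 degrees of freedom.
    s = math.sqrt(sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    param_half = z * s * math.sqrt(leverage)       # uncertainty in the fitted line
    pred_half = z * s * math.sqrt(1.0 + leverage)  # uncertainty in T itself
    return a + b * x0, param_half, pred_half

# Made-up calibration data from a known line, as in an oracle model.
rng = random.Random(2)
proxy = [rng.uniform(0.0, 20.0) for _ in range(50)]
temp = [2.0 + 0.5 * p + rng.gauss(0.0, 1.0) for p in proxy]

t_hat, param_half, pred_half = interval_half_widths(proxy, temp, x0=10.0)
print(f"T guess {t_hat:.2f}, parameter +/- {param_half:.2f}, predictive +/- {pred_half:.2f}")
```

With more calibration data the parameter half-width shrinks toward zero, while the predictive half-width never drops below roughly 1.96 times the residual scatter. Both numbers still assume the straight-line model is true.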

I would like to shout that previous paragraph right up next to your ear until I see you nod.

Notice how much, how dramatically, larger the intervals are? How much less certain we really, truly are? If you noticed that, you have done well. But don’t forget that this picture is too optimistic, because the proxy-T model was known. In real life, we won’t usually know this and so have to widen the final error bars.

By how much? Nobody knows. This is key. If we knew, then we could know the model and we wouldn’t have to widen the bars. But since we do not know the proxy-T model, we do not know how much to push out the envelope. Meaning that if we accept the numerical bounds as accurate just because they are numerical, we will be too certain. Worse, in our quantitative-induced euphoria, we’ll forget that we should be less certain. Not all probability is quantifiable.

Now another thing people like to do is to plot a straight line over the guesstimated T and speak of whether there was a “statistically significant” increase or decrease in T, or they’ll use the line to say “there has been an X average increase in T” or some such thing. This is almost always folly, not least because these judgments eschew the uncertainty we have been at pains to illuminate.
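One rough way to see what gets thrown away is to redraw the guesstimated T many times within its predictive uncertainty and refit the line each time. The sketch below does that. Everything in it is assumed for illustration: the guessed series, the predictive standard deviation of 1.0, the independent normal errors (real errors are probably correlated), and the helper names `slope` and `slope_spread`.

```python
import random

def slope(times, values):
    """Least-squares slope of values against times."""
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    stt = sum((t - t_bar) ** 2 for t in times)
    return sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values)) / stt

def slope_spread(times, t_hat, pred_sd, draws=2000, seed=3):
    """Central 95% range of trend slopes over simulated paths of T.

    Assumes independent normal predictive errors, which is a simplification.
    """
    rng = random.Random(seed)
    slopes = []
    for _ in range(draws):
        path = [v + rng.gauss(0.0, pred_sd) for v in t_hat]
        slopes.append(slope(times, path))
    slopes.sort()
    return slopes[int(0.025 * draws)], slopes[int(0.975 * draws) - 1]

times = list(range(30))
t_hat = [10.0 + 0.02 * t for t in times]  # made-up point guesses of T
point_slope = slope(times, t_hat)
lo, hi = slope_spread(times, t_hat, pred_sd=1.0)
print(f"slope of point guesses {point_slope:.3f}, range over plausible T {lo:.3f} to {hi:.3f}")
```

In this made-up case the line through the point guesses rises, yet the range of slopes consistent with the uncertainty in T straddles zero. Even that range is too narrow, since it takes the proxy-T model as known.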

Plus there is no reason in the world to do this unless you expect that straight line to skillfully predict future values of T. How do you know if this is true? Hint: you don’t. After all, something like this can happen:

Figure 5

The new T (over the entire period and not just the time of the proxy) was generated in advance (as were the proxies, which recall have a specific known relationship with T). I picked this one (T is kinda-sorta a “long-memory” time series) because of its vague resemblance to actual time series we have all seen before.
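A series in this spirit can be sketched as follows. It is not the generator used for the figure. It uses a strongly persistent AR(1) process as a stand-in, which is not true long memory, and the coefficient 0.95, the proxy relationship, and the function names are all assumptions made for illustration.

```python
import random

def persistent_series(n=300, phi=0.95, noise_sd=1.0, seed=4):
    """AR(1) series: each value is phi times the previous value plus noise."""
    rng = random.Random(seed)
    values = [rng.gauss(0.0, noise_sd)]
    for _ in range(n - 1):
        values.append(phi * values[-1] + rng.gauss(0.0, noise_sd))
    return values

def lag1_autocorrelation(values):
    """Sample correlation between each value and the one after it."""
    n = len(values)
    mean = sum(values) / n
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in values)
    return num / den

def make_proxy(temps, intercept=1.0, slope=2.0, noise_sd=0.5, seed=5):
    """Proxy with a known (made-up) linear relationship to T, plus noise."""
    rng = random.Random(seed)
    return [intercept + slope * t + rng.gauss(0.0, noise_sd) for t in temps]

temps = persistent_series()
proxies = make_proxy(temps)
print(f"lag-1 autocorrelation of T: {lag1_autocorrelation(temps):.2f}")
```

A series like this wanders for long stretches, so a straight line fitted to one stretch says little about where the series goes next.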


Daily Links & Comments

@1 Free concert tomorrow, Sunday, 27 October. 1:30 pm. Salve Regina: Music for the Heavenly Queen. St. Catherine of Siena. 68th Street between First and York. Gregorian Chant, Ave Maris Stella, etc., etc. The music there is good.

@2 Chemists show life on Earth was not a fluke. But “[h]ow life came about from inanimate sets of chemicals is still a mystery.” Link

@3 Tenured professor says, “That’s it. I’m outta here.” Why I Jumped Off The Ivory Tower. Link

@4 Keep the scourge of scientism out of schools. Why evidence-based teaching methods are a bad idea. Link

Please prefix your comments with “@X” to indicate which story you’re commenting on. I should hardly need to say that a link does not necessarily imply endorsement.


Rick Santorum Says Devil Pwns Hollywood

Santorum was out flogging a movie in which he has an interest, The Christmas Candle, and was heard to say of more traditional Hollywood fare that “the Devil for a long, long time has had this, these screens, for his playground and he isn’t going to give it up easily.”

I know what you’re thinking. What else explains Transformers, Transformers: Revenge of the Fallen, Transformers: Dark of the Moon, and the fell rumor of Transformers 4? What other than dark forces could be responsible for Indiana Jones and the Kingdom of the Crystal Skull and Dumb and Dumberer: When Harry Met Lloyd?

Could it be only a coincidence that Shia LaBeouf is an anagram for Oh! Abuse! Fail!? I think you know the answer.

Two words: Adam Sandler.

Yes, this is the twenty-first century (counting from when, pray?) and we’re supposed to pooh-pooh the idea of invisible forces guiding our lives (of course the Higgs field is real), but why else would Chuck Norris (Chuck freaking Norris!) have succumbed to plastic surgery? Say it isn’t so, Chuck! It’s so.

Now I, along with Justice Scalia and millions of others, am with Santorum on the reality of the Prince of Darkness. “And the Lord said unto Satan, Whence comest thou? Then Satan answered the Lord, and said, From going to and fro in the earth, and from walking up and down in it.” Frightening stuff.

But given its history and the grim determination of its denizens, it figures that the Devil doesn’t actually put in a lot of time in La La Land. It’s a cliché, but why mess with success? It’s the audiences that are the real worry.

One final item to prove Rick was right. Hal Needham has died! He took his final drive into the sunset last night. It was Needham, the stuntman of stuntmen, a man who broke over 50 bones, some of which were his own, who brought us such cinema classics as Smokey & the Bandit, Hooper, and Cannonball Run. God rest you, Mr Needham.

Update: I don’t know if anybody noticed, but this makes three weeks in a row…

© 2015 William M. Briggs
