Marcel Crok asked me to comment on the *peer-reviewed* paper “Tide gauge location and the measurement of global sea level rise” by Beenstock (yes) and others in the journal *Environmental and Ecological Statistics*. From the Abstract:

The location of tide gauges is not random. If their locations are positively (negatively) correlated with sea level rise (SLR), estimates of global SLR will be biased upwards (downwards). Using individual tide gauges obtained from the Permanent Service for Mean Sea Level during 1807–2010, we show that tide gauge locations in 2000 were independent of SLR as measured by satellite altimetry. Therefore these tide gauges constitute a quasi-random sample, and inferences about global SLR obtained from them are unbiased.

Random means unknown, therefore it is true the gauge locations are not random since we know where they are. But they obviously mean *random* in the classical mysticism sense where “randomness” is a (magical) property of a system and which must be present to bless statistical results. The authors are right to insist that if gauges are placed only in locations where the sea is likely to rise, and none where it isn’t, then the gauges will only show sea level rise. So what is wanted is *control* of gauge placement, not “random” placement, such that the gauges accurately estimate overall changes in sea level. I mean *control* in the engineering sense and not the incorrect sense used by most users of probability models, where “control” isn’t control at all.

So the authors gathered a subset of gauges where the control appears better. They call this a “quasi-random sample”, but that’s the old magical thinking and it can be ignored. About real-control, the authors say:

Although coverage has greatly improved, especially in the second half of the 20th century, there are still many locations where there are no tide gauges at all, such as the southern Mediterranean, while other locations are under-represented, especially Africa and South America. The geographic deficiency of tide gauge location is compounded by a temporal deficiency due to the limited number of tide gauge with long histories. Indeed, the number of tide gauges recording simultaneously at any given period is limited…

So measurement error is with us. No surprise to any but activists. Yet here comes a dodgy move: “To mitigate these spatial and temporal deficiencies, data imputation has been widely used.” By “imputation” they mean “guesses without their *predictive* uncertainty (and where parametric uncertainty is of no interest).” Imputation in this sense is smoothing, and we all know what happens when smoothed data is input into downstream analyses, right? Yes: over-certainty.
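To make the over-certainty concrete, here is a minimal sketch. It is not the authors' procedure: the readings are made-up numbers, and filling gaps with the observed mean is an assumed stand-in for whatever imputation scheme a given analysis uses. The point is only that the filled-in series reports a smaller naive standard error although no new measurement was taken.

```python
import math
import statistics

def standard_error(values):
    """Naive standard error of the mean: sample sd divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Made-up gauge readings (mm); None marks a month with no measurement.
readings = [7010, None, 7024, 6998, None, None, 7031, 7005, None, 7019]

# Keep only what was actually measured.
observed = [r for r in readings if r is not None]

# "Impute" every gap with the mean of the observed values (an assumed scheme).
fill = statistics.mean(observed)
imputed = [fill if r is None else r for r in readings]

se_observed = standard_error(observed)
se_imputed = standard_error(imputed)

print(f"observed only: n={len(observed)}, se={se_observed:.2f}")
print(f"mean-imputed:  n={len(imputed)}, se={se_imputed:.2f}")
```

The second standard error is smaller purely because guesses were counted as data. An honest treatment would carry the uncertainty of each guess forward instead.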

Okay, whatever. Did the sea rise or didn’t it?

To estimate SLR in specific locations over time we use statistical tests for non-stationarity. A time series is non-stationary when its sample moments depend on when the data are measured. Trending variables must be non-stationary because their means vary over time. Therefore, if a time series, such as sea level, happens to be stationary it cannot have a trend by definition…”augmented Dickey-Fuller” test (ADF) for stationarity…

No. This is the Deadly Sin of Reification. To tell if the sea has risen at any gauge, *all you have to do is look*. If last year the gauge read “low” and this year it said “high”, *then the sea has risen no matter what any model in the world says.* All that mumbo-jumbo about Dickey’s Fuller and stationarity applies to *fictional mathematical objects* and not reality itself. Hence reification.

And there is nothing in any statistical model which says what caused a change, in one or any number of gauges.

“Trends”, if they exist, merely depend on the definition—and nothing more. Trends exist if they meet the definition, else they don’t. Just look! The remainder of the methods section is given over to various tests and models to determine if a reified trend occurred. Wait. Isn’t showing a trend conditional on a model a definition of a trend? Yes. But then we must ask: is the model any good at making skillful predictions? If not, then this definition of trend is not useful. Is the model used here good at making predictions? I have no idea. The authors never made any.
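A small sketch of what "depends on the definition" means in practice. Both definitions below are hypothetical ones chosen for illustration, and the annual levels are made-up numbers. Neither definition comes from the paper.

```python
def trend_endpoints(series):
    """Definition A: the series went up if the last value exceeds the first."""
    return series[-1] > series[0]

def trend_majority_up(series):
    """Definition B: the series went up if more than half of the
    year-on-year changes are positive."""
    ups = sum(1 for a, b in zip(series, series[1:]) if b > a)
    return ups > (len(series) - 1) / 2

# Made-up annual mean levels: four small rises, then one large drop.
levels = [100, 101, 102, 103, 104, 90]

print("Definition A says up:", trend_endpoints(levels))
print("Definition B says up:", trend_majority_up(levels))
```

The same data meet one definition and fail the other. Which definition matters depends on the decision being made with the gauge.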

Some interesting just-look statistics of the number of gauges and missing data through time are shown. Then the findings:

The Augmented Dickey-Fuller statistics (ADF) show that in the vast majority of segments and tide gauges sea levels are not rising…

The number of tide gauges with statistically significant trends is even smaller using Phillips-Perron tests, according to which as few as 5 segments and 5 tide gauges have trends.

By “trends” they mean reified and not real.

Look. Gauges are usually placed where they are useful in some decision-making activity. Harbormaster or the Army Corps of Engineers wants to know what’s happening here or there. What happens at these gauges is what is useful, and not what happens in some model. *Unless* that model can be used to skillfully predict what will happen (or what is unknown, such as what happened in times historical), and that means forecasts with predictive bounds.
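One simple way to check forecasts that come with predictive bounds is to count how often the later observations land inside the stated intervals. The sketch below is only an illustration: the intervals and outcomes are made-up numbers, and `coverage` is a hypothetical helper name, not anything from the paper.

```python
def coverage(intervals, outcomes):
    """Fraction of outcomes that fall inside their forecast interval."""
    hits = sum(1 for (low, high), y in zip(intervals, outcomes) if low <= y <= high)
    return hits / len(outcomes)

# Made-up forecast intervals (say, nominal 80% bounds) for four future readings.
intervals = [(95, 105), (96, 106), (97, 107), (98, 108)]

# Made-up readings that were later observed at the gauge.
outcomes = [101, 107, 99, 103]

print("observed coverage:", coverage(intervals, outcomes))
```

If a model's stated 80% intervals capture far fewer than 80% of what actually happens, it is over-certain, whatever its test statistics say.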

I have no idea whether the models developed by the authors are useful for forecasting. They didn’t make any predictions. All they said was whether or not some model dial was in some place or other. They present some actual data plots, and some of these, by eye, look like they have positive trends, some negative, and some bounce around. One is at the head of this article.

The definition I used for trend was “by eye.” I have no idea if that’s the right definition for any decisions made at any of these locations. That’s what really counts.


This article is interesting, Dr. Briggs, and prompts me to pose a question to this group (just sending out to the ether and seeing whether the Gods of statistics can offer a response).

Oftentimes in principal component analysis (PCA), imputation is identified as the mechanism to fill in missing data. This is a “recommended” action, so far as I have studied. Yet, my gut reaction to this is that it is not un-biased. A number of recommended methods for imputation are advertised, and I understand their mathematics and reasoning (so far as it can be understood). But, dealing with raw observations as much as I do, I do not participate in imputing of values as I believe they are not quite “honest”. Most of my dealings are with raw data collection from patients through clinical trials and where there are missing data, we either cannot use that set of measurements or there is a valid reason for the missing data and we just move on.

Perhaps this is not the right forum for such a question, but I thought I would pose to this unique site.

Thank you.

John Z,

I have the same issues with missing data imputation. I don’t work in a field with human subjects, though, so I rarely encounter missing data.

Imputation of data requires a model or assumption about what the missing data must look like. One, if we knew what the missing data must look like, then it’s hardly missing. Just write in the number that we know it is. Two, if we don’t know what it’s like, then we are making an assumption and introducing error. A ‘good’ imputation method will need to bring that error all the way through the analysis.

To me, case deletion is the only real way to handle missing data. The fact that when faced with the question “How can I analyze this data that has holes in it?” people didn’t respond with “no data, no analysis” is a really good example of the kinds of reification and magical thinking going on in stats.

James —

Thank you for that rapid reply. Am in agreement with you. In experiences with human subjects it is difficult to employ a model for imputation in many cases with physiological data (such as patient respiratory rate reaction to opioid infusion) because the “model” is not fully described and is highly dependent on many dependent as well as independent variables. In other cases much more tractable (e.g.: heart rate response to vasodilator infusion), it may be possible to bound the response and, hence, infer the missing data. This, however, can be dangerous as “getting it wrong” can have dire effects.

Yet, in the former case mentioned, it simply is not possible to impute missing results without incurring bias or otherwise influencing the data. I am a strong advocate for letting the data tell the story. This does not mean that I do not look for trends. It simply means that when I see a trend, I seek base cause (e.g.: patient movement, change in drug delivery, or underlying etiological, demographic and physiologic mechanics).

Am happy to hear that I am not out on that branch all alone.

Thanks again.

James said, “A ‘good’ imputation method will need to bring that error all the way through the analysis.”

A double Amen to that. And it has to be the *predictive* and NOT *parametric* error.

I would first note that sea level rise does not prove AGW caused it. It only shows a rise in sea level. The cause is totally independent of this measurement.

To use my favorite quote, “Just because we can measure it does not mean we can control it.”

At one time, it was understood that the “control group” was to be influence free. In this case, the tidal gauges would be put where the sea level was not expected to rise. This checked the null hypothesis. I guess that’s not done anymore.

Isn’t this similar to what Monte Carlo and bootstrapping do? Taking existing limited data and pretending it can be made more comprehensive by taking small “random” samples and making a large data base?
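For readers unfamiliar with the bootstrap mentioned in this comment, a minimal sketch follows. The data are made-up numbers and `bootstrap_means` is a hypothetical helper. The sketch shows the mechanics only: resampling with replacement reuses the values already in hand and never produces a value that was not measured.

```python
import random
import statistics

def bootstrap_means(data, n_resamples, rng):
    """Means of n_resamples resamples, each drawn with replacement from data."""
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=len(data))  # same size as the original
        means.append(statistics.mean(resample))
    return means

rng = random.Random(0)  # seeded so the run can be reproduced
data = [3.1, 2.7, 4.0, 3.6, 2.9, 3.3]  # made-up measurements

means = bootstrap_means(data, 1000, rng)
one_resample = rng.choices(data, k=len(data))

print("one resample:", one_resample)
print("spread of resampled means:", min(means), "to", max(means))
```

Every resampled mean lies between the smallest and largest original value, so the procedure describes variability under the assumption that the sample stands in for the whole. It does not fill in places or times that were never measured.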

John Z and James: A thumbs up to both of you!

For all the discussion about analytical technique & associated shortcomings, the only real issue–not addressed at all by Briggs–is if the conclusions drawn by the authors are reasonable from the available data. The actual conclusions seem pretty straightforward:

First, the tide gauge data in PSMSL (Permanent Service on Mean Sea Levels ) may be used to obtain unbiased estimates of global sea level rise without the need for data reconstruction or imputation.

Second, there is no evidence from PSMSL of global sea level rise. In most locations sea levels are stable. In a minority of locations sea levels are rising, and in a smaller minority sea levels are falling.

Third, the claim that sea levels are rising globally (IPCC 2014) is an artifact induced by the use of imputed data.

Actually... all tests or imputation methods based on some form of sampling require knowledge of the real distribution of whatever it is we’re measuring. In the SLR testing case we don’t have that – the limited information we have comes from gauges located where they are, or have been, useful for other purposes.

What these people do in their paper is akin to estimating the rate of growth among human children by sampling physiotherapy records from geriatric centers in English speaking countries – to see that, imagine doing the right thing: using data from a randomly chosen subset of records, each an hourly record for several hundred years generated through exactly the same mechanism, and each applicable to a well defined, and large (>1E5), set of fixed grid positions covering all “sea level” bodies of water on earth.

According to Dr. Morner there is no sea level rise.

http://www.telegraph.co.uk/comment/columnists/christopherbooker/5067351/Rise-of-sea-levels-is-the-greatest-lie-ever-told.html

As a non-scientist, the only paper I have looked at on this subject is from 2010, by Nils-Axel Mörner. It has a discussion of tide gauges, geology, and methodology. 6 papers cite it but all seem to be pay-walled. I looked for criticism elsewhere.

Mörner has been a pariah since maybe 2007, apparently. He said then that the oceans had stopped rising completely, which seemed highly fringey. His 2010 paper treats that as just one dataset.

http://www.21stcenturysciencetech.com/Articles_2011/Winter-2010/Morner.pdf.

figure 4 from that paper. (figure 2 is also interesting)

http://i.imgur.com/zKkFyA7.jpg

What nobody has pointed out, here (& of course in the alarmist community) is that “rising sea level” in some places is really a proxy measure for a geologic process — the ground is settling lower (e.g. in subduction zones) giving the illusion that the water is getting deeper (“sea level rise” is itself a proxy measure for ‘how full the oceans are’).

That’s a distinction with a real difference.

Analyzing tide gauges to measure relative depth to a sinking or rising area of earthen geography omits a fundamental factor and is sure to lead to wrong conclusions — even if the statistical analysis were to meet Briggs’ standards the conclusions would still be wrong because they fundamentally misrepresent reality by implicitly assuming a fixed & stable coastline.

If you don’t understand & incorporate ALL the physics involved in an activity, you are doomed to produce a flawed model & analysis — no amount of statistical rigor, much less statistical philosophizing, can possibly offset such omissions.

I wrote to Beenstock twice, in December 2013, essentially to point out that he had been basing his analyses on assumptions that are unjustified—and that seem to be unjustifiable. I did not receive a reply.

It is annoying that he has continued on. It is really good, though, that you have posted about this.

Ken,

Are there not tide gauges that are GPS or otherwise elevation corrected for sinkage? I seem to remember reading about that somewhere on WUWT.

Obviously that should be accounted for, or the results are junk.
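The arithmetic behind such a correction can be sketched as follows. A tide gauge records the sea relative to the land it is bolted to, so the rate relative to a fixed reference is the gauge rate plus the land's vertical rate (uplift positive, sinking negative). The function name and all the numbers are hypothetical, chosen only to illustrate the sign convention.

```python
def absolute_rate(relative_rate_mm_yr, land_uplift_mm_yr):
    """Sea-level rate against a fixed reference: gauge rate plus land uplift.

    land_uplift_mm_yr is positive when the land rises and negative when it sinks.
    """
    return relative_rate_mm_yr + land_uplift_mm_yr

# Made-up case 1: gauge shows +3 mm/yr, but the land is sinking 2 mm/yr.
sinking_coast = absolute_rate(3.0, -2.0)

# Made-up case 2: gauge shows -6 mm/yr, but the land is rising 8 mm/yr.
rising_coast = absolute_rate(-6.0, 8.0)

print("sinking coast:", sinking_coast, "mm/yr")
print("rising coast:", rising_coast, "mm/yr")
```

In the second case the gauge shows the sea falling while the corrected rate is positive, which is why uncorrected gauge records at uplifting or subsiding sites cannot be read directly as changes in ocean volume.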

Gauges in the Baltic were placed there to monitor dropping sea levels due to rapid land uplift after the ice ages.

Italy has volcanic effects: see Lyell on the Pozzuoli temple of Serapis.

One of the subtexts to the Third Way method of analyzing statistics is that the question of “What is Reasonable” is opened up and deepened, beyond what can ever be accomplished via frequentism or traditional Bayesianism. Hence, analysis that seems abstruse or even irrelevant to a worthy commenter like Ken is in fact a better, surer way to get what he also wants: a representation of data that is reasonable.

Applying the Third Way automatically creates the possibility for *a demonstrably ‘more reasonable’ reasonableness* than had previously been possible. That in essence is why Third Way methods of analysis should always be used, and never frequentist or traditional Bayesian approaches.

Of course all three approaches, when done carefully, will at least occasionally provide pretty much the same answers. But only Third Way analysis can show why agreement among the approaches would be noted in a particular case, and also show why frequentism and traditional Bayesianism fail, when they do.

Third Way ‘abstruse’ or ‘arcane’ questioning really does mean more careful questioning — more demonstrably reasonable, reasonableness that takes greater public (and where possible, also more quantifiable) account of its uncertainty about its conclusions.

And regarding “missing” data, I vividly understand how it might be very difficult to be reasonable about it. The show must go on, after all. You’ve clawed your way to a grant, and now — well, reputable, *peer-reviewed* Authorities tell you of a number of academically-acceptable ways to ‘handle’ missing data. Especially ways that get you publishable results, where previously you had fewer.

I’ve always wanted to believe that there was some kind of sure-fire way to ‘handle’ (= hand-wave away) the problem of “missing cells” in an analysis. But I greatly fear that James is exactly right. And here Matt’s point to James, above, is critical. There could easily be an order of magnitude difference between predictive error and mere parametric error, even when parametric error is honestly carried right through the entire analysis.
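A rough illustration of how large that gap can be, under an assumed textbook setting that none of the commenters specified: a normal model with unknown mean, sample standard deviation `s`, and `n` observations. The uncertainty in the mean (parametric) scales as `s / sqrt(n)`, while the uncertainty in a new observation (predictive) scales as `s * sqrt(1 + 1/n)`. The values of `s` and `n` are made up.

```python
import math

def parametric_se(s, n):
    """Scale of uncertainty in the estimated mean."""
    return s / math.sqrt(n)

def predictive_scale(s, n):
    """Scale of uncertainty in a single new observation."""
    return s * math.sqrt(1 + 1 / n)

s, n = 1.0, 99  # made-up sample standard deviation and sample size

# Algebraically the ratio is sqrt(n + 1), so it grows with the sample size.
ratio = predictive_scale(s, n) / parametric_se(s, n)

print(f"parametric: {parametric_se(s, n):.3f}")
print(f"predictive: {predictive_scale(s, n):.3f}")
print(f"ratio:      {ratio:.2f}")
```

With about a hundred observations the predictive scale is already roughly ten times the parametric one, and more data widens the ratio rather than closing it, because the parametric error shrinks while the predictive error does not.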

Third Way analysis will ALWAYS give you “Flatter and Fatter” probability distributions. Its conclusions will ALWAYS look less ‘certain’ than now-common approaches to data. But that is why Third Way analysis is truly, provably, more reasonable than either frequentist or traditional Bayesian methods.

Walter Munk looked at sea level rise through tidal gauges some 15 years ago and concluded the data was biased toward sea level rise. The gauges were placed predominantly in areas most susceptible to rise through thermal expansion. This paper stimulated my interest in the biases of large geophysical data sets.

Sea level is the balance between the volume of water in the ocean basins (water input/output, thermal expansion, etc) and what volume the ocean basins have to hold this water. Sea floor spreading at an increasing rate will reduce volume of ocean basins and lead to sea level rise all other factors remaining steady.

Mr. Briggs,

None of the trend definitions you proposed in your old posts can be used for prediction. One of the purposes for postulating a trend is for prediction. However, to check the second-order stationarity of a time series, all you need to do is examine the time series plot with your eyes, and it doesn’t depend on any form of proposed trend though the time series may show a clear deterministic trend such as linear in time. Why is the stationarity important? It has to do with the prediction efficiency while trying to model the uncertainty. Not a short story, so please do a Google search or open up a time series textbook if interested.

James,

Throwing away any information/data is a sin in statistics, unless you have a very good reason to do so. That is one of the reasons why there is a vast number of studies about imputation methods. The main idea is that if we have incomplete data, there might be a way of using them with the information contained in the rest of the complete data and the knowledge of the data collecting process. No, there is no magic thinking going on in statistics.

JohnK,

Totally nonsense!!! If Briggs disagrees with my assessment here, he should speak up!!!

JH,

Say, that’s a lot of explanation marks. Just one or two and I wouldn’t have been convinced, but so many? Surely you must be right.

But, no. You’re wrong. Completely wrong. And sinful!!!!!! (Look at all those explanation marks!!!!!) To talk about stationarity is to engage in the Deadly Sin of Reification. It is to imagine the data itself are alive and have actual properties, whereas the data are just the data and have causes, and, as JohnK is so right to point out, the nature of these causes is our true goal.

Yours is the classical, mystical view of probability, which I have shown time and again to be wrong. It would take a book to show every aspect of wrongness. And I wrote one!!!!!!! (Aren’t explanation marks grand?)

I expect you’d say that much. *sigh* If you had explained what stationarity is first, I’d have tried to explain more.

Have you read C. Robert’s (xianblog) review of your Third-Way paper? How did you come up with the probability model to start with and what assumptions have you made in Section 3? The assumption of stationarity is imposed by the same line of reasoning…

Thinking that you might have drastically revised your paper after reading JohnK’s comments, I actually checked out your paper again. Well, you’ve changed the description of your data in Section 3- Example.

Hmmm… are the data observed or generated? If they are simulated, you are to report accordingly, and hopefully you have used a seed number so one can reproduce the same set of values. If they are observed, you are to describe the background accordingly as it is essential in postulating an appropriate model. Yes, I should not have to point this out!!!

You are the one who imagines others believing data are alive and are confused by cause. Your Third-Way paper does nothing of what JohnK wrote.

Oh… may I suggest that you check out some papers or books on the causation in epidemiology?

Which part of my comments represents the classical, mystical view of probability?

JH,

“Which part of my comments represents the classical, mystical view of probability?”

That question means I have failed in my job. I’ve written two or three hundred articles on this very subject, but none have yet been written in language clear enough to show you.

Yet you have stirred in me a new way of putting it. I’ll write a separate article. It shall be entitled (probably) “Data Do Not Have Means.”

Did I say data have means? Well, there is the so-called sample mean for data. Think of the *model* you’ve used in your third-way paper. What do you try to model and what is the prior placed on in Section 3 of your third-way paper? You cannot preach one thing and do another. Think of Stove’s fair coin example… model diagnostics. What are the differences between model and data?

Think of how you calculate the proportion of SSA in twins for your naive model in a previous post, though the naive model’d have no skill in predicting a twin’s sexual orientation. Sample, model, diagnostic, and generalization/prediction.

Repeating the words “wee pee values” and “no predictive skills” hundreds of times doesn’t mean you actually understand what they are.

And I agree even more with C Robert’s comment after you wrote “Data do not have mean.” Not that this statement is wrong, it’s just that … Well, I am going to stop here.

Hey JH, data do not “have” means.

Bwahahahahaha!

See the upcoming post this week (probably Tuesday).

I’ll be watching for the post. I have long believed data do have means. Reading why you believe this should be interesting.

Hey, Briggs,

I know, just like there are no such things as gays. Please spare me word games. There are no right triangles in reality either.

I could send you monthly average (mean) temperature anomalies from January 1950 to August, 2013. Next, you use your *logic probability* to produce two-year ahead forecasts, and we shall check the skill of your predictions. No point predictions allowed.

Think through it, just once!!!

Let’s see you accomplish *what can ever be accomplished via frequentism or traditional Bayesianism.* And show “why Third Way analysis is truly, provably, more reasonable than either frequentist or traditional Bayesian methods” and why “Third Way analysis will ALWAYS give you “Flatter and Fatter” probability distributions. Its conclusions will ALWAYS look less ‘certain’ than now-common approaches to data.”

Seriously, how much have you paid John K to say the above? Bwahahahahaha!

(JohnK, I apologize for making this joke, please please don’t take it personally. )

JH,

You’re bringing up same-sex attraction here? Dude.

Anyway, be patient. It will be Wednesday and not Tuesday when the “Data do not have means” column runs. I’m going to a talk on Monday which I’ll write about Tuesday.

JohnK,

In short, you’re right and JH is wrong.

Briggs,

Simply gave you an example of the word game you play! Dude!

Again,

I could send you monthly average (mean) temperature anomalies from January 1950 to August, 2013. Next, you use your logic probability to produce two-year ahead forecasts, and we shall check the skill of your predictions. No point predictions allowed.

Think through it, just once!!!

Well, Briggs, accepting the above challenge is one way for you to show you are correct in telling JohnK he is right. The Only Way!

JH: No one who is an actual statistician or scientist would ever make two-year ahead predictions. It’s impossible, given the chaotic characteristics of weather and our lack of knowledge of causes for various phenomena.