Read the introduction to this first. If you don’t, you will be lost, lost, lost.
Logical probability answer to B
The answer to B follows from A. The picture is of what the mean might have been given the assumptions used above. A 1/9 chance the mean was 69.5, 2/9 it was 70, 3/9 for 70.5, etc. No error bars? Well, no: none are needed. This picture is the complete answer.
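For readers who want to check the two-day picture by brute force, here is a minimal sketch. The two readings (70 and 71) are hypothetical, chosen only so the observed mean is 70.5 as the probabilities above imply; the error set and the 1/3 weights are the assumptions carried over from part one.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical readings; any two readings summing to 141 give the same picture.
observed = [70, 71]
errors = (-1, 0, 1)        # assumed: the true value is the reading -1, +0, or +1
weight = Fraction(1, 3)    # assumed: each of the three is equally likely

mean_dist = Counter()
for combo in product(errors, repeat=len(observed)):
    true_values = [r + e for r, e in zip(observed, combo)]
    mean = Fraction(sum(true_values), len(true_values))
    mean_dist[mean] += weight ** len(observed)

# Every possible mean with its exact probability.
for m in sorted(mean_dist):
    print(float(m), mean_dist[m])
```

Exact fractions are used so that nothing is rounded; the nine combinations collapse onto five possible means.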
“Error bars” are classical and come from assuming some sort of parameterized probability model, like a normal (where the probability of seeing any observation is always zero), which we did not use and do not need here. No test statistics, no p-values, no parameters, no priors, no posteriors. Just probability.
Notice that the problem is entirely discrete? It’s not because we only averaged two days, but because the nature of the evidence is discrete (homework: find a non-discrete real-life example; I won’t wait). It would still be discrete no matter what finite number of days on which we took our mean. Our answer is exact given the assumptions; it has been deduced.
What about the month of Maxes, say 30 days? Still discrete. Each daily Max can take three values with equal probability (given our assumptions), and each of these can be combined with each other daily Max so that the mean is comprised of 3^30 ≈ 2×10^14 possibilities. Still discrete, but what a number!
Actually, it’s not as bad as that because not that many unique combinations can occur. It could be that every single time Max was measured, it was low by a degree, or every time high by a degree, or something in between, including the time every measurement was spot on. That makes only 61 possible values the mean of Max can take (every possibility from adding -30 to +30). Quite a reduction!
Start from the left: Assume the average is from every measurement being one degree low, which is equivalent to the sum of the actual temperatures minus 30, all divided by 30. There’s only one way out of the 2×10^14 possibilities this can happen, so inverting that gives the probability. This is symmetric with every day being hot by one degree, which has the same probability.
Next: the temp could have been low 29 times and right once. That can happen 30 different ways, with a probability 30/3^30. This is also symmetric with 29 times high. Next: the temp could have been low 28 times and right twice, or low 29 times and high once, which is also symmetric (they all are).
You get the idea. All we need do is count the number of times each under or over could happen. A cute, eventually tedious, but not overwhelming combinatoric problem. Example: +/- 30 can happen just one way; +/- 29 can happen 30 ways (these are all each); +/- 28 can happen 465 ways; +/- 27 can happen 4930 ways; +/- 26 is 40,020 ways, and so on towards the peak at 0 (where the plus and minus errors balance). Summing all the different ways equals 3^30 (it must!). (Homework: what is the number of ways for +/- 0?)
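The counting can be handed to a computer. This is a sketch of one way to do it, not necessarily how the figures above were produced: add one day at a time and keep a running tally of how many ways each total error can arise. The function name is made up for illustration.

```python
def ways_by_total_error(n_days):
    """Return {total_error: number_of_ways}, each day contributing -1, 0, or +1."""
    ways = {0: 1}  # before any day is added there is one way to have total error 0
    for _ in range(n_days):
        nxt = {}
        for total, count in ways.items():
            for step in (-1, 0, 1):
                nxt[total + step] = nxt.get(total + step, 0) + count
        ways = nxt
    return ways

ways = ways_by_total_error(30)
for k in (30, 29, 28, 27, 26):
    print(k, ways[k], ways[-k])             # counts for +k and -k match by symmetry
print(len(ways), sum(ways.values()) == 3**30)
```

Dividing any count by 3^30 gives the probability of that total error, and `ways[0]` answers the homework.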
So I made up 30 days of Max temperatures somewhere around 70. Here’s the picture of what the mean can be, and the probabilities we deduced for these values given our assumptions.
My made-up Maxes were from 60 to 77. The computed average was 70.3666… The most likely value of the true mean is the same. The only values the mean could have been, given these conditions and data, are (rounded) 69.37, 69.4, …, 71.33, 71.37. This distribution is exact. There is an exact (to within roundoff in my calculations) probability of 0.087 for the mean to be 70.37. The others may be drawn from the figure.
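Here is a sketch of how the whole distribution of the mean can be rebuilt from the 30 made-up Maxes (the values are the ones I list in the comments below). It assumes the same equal-weight error of -1, 0, or +1 on each day; the variable names are only illustrative.

```python
from fractions import Fraction

t = [68, 72, 71, 70, 67, 68, 60, 66, 75, 70, 75, 77, 62, 68, 68,
     70, 75, 72, 77, 64, 71, 75, 74, 70, 69, 64, 72, 77, 71, 73]
n = len(t)

# Count the ways each total error (-n .. +n) can occur, one day at a time.
ways = {0: 1}
for _ in range(n):
    nxt = {}
    for total, count in ways.items():
        for step in (-1, 0, 1):
            nxt[total + step] = nxt.get(total + step, 0) + count
    ways = nxt

# Shift by the observed sum and divide by n to get the possible means.
observed_sum = sum(t)
mean_prob = {Fraction(observed_sum + k, n): Fraction(c, 3**n)
             for k, c in ways.items()}

values = sorted(mean_prob)
print([round(float(v), 2) for v in values[:2] + values[-2:]])  # the extremes
mode = max(mean_prob, key=mean_prob.get)
print(float(mode), float(mean_prob[mode]))                     # most likely mean
```

The second line printed can be compared with the most likely value and the probability quoted above.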
For fun, I can report there is a 95.54% chance that the mean is in the set 70.0667, 70.1, …, 70.633 (I can give you all the numbers, but they are beside the point). There is no reason in the world to pick 95.54% except that it is close to the classical magical (magical classical?) value.
Did you notice the language? I did not say that there is a 95.54% chance that the mean is “between” 70.0667 to 70.633, because that is false. For one, those words leave out the endpoints, which are real possibilities. For another, only the discrete values in the set are possible. The mean might have been 70.0667 or 70.1, but it was impossible (given our etc.) that it could have been, say, 70.09, or any other value not in the discrete set.
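The distinction between "in the set" and "between" can be made concrete. The sketch below assumes the same error model and uses only the fact that the 30 readings sum to 2111; it adds up the probability of the listed values and then checks whether 70.09 is a possible mean at all.

```python
from fractions import Fraction

n, observed_sum = 30, 2111   # 30 readings summing to 2111 (mean 70.3666...)

# Ways each total error can occur, built up one day at a time.
ways = {0: 1}
for _ in range(n):
    nxt = {}
    for total, count in ways.items():
        for step in (-1, 0, 1):
            nxt[total + step] = nxt.get(total + step, 0) + count
    ways = nxt

mean_prob = {Fraction(observed_sum + k, n): Fraction(c, 3**n)
             for k, c in ways.items()}

# The set runs from 70.0667 (= 2102/30) to 70.633 (= 2119/30), endpoints included.
lo, hi = Fraction(2102, 30), Fraction(2119, 30)
in_set = sum(p for m, p in mean_prob.items() if lo <= m <= hi)
print(float(in_set))

# 70.09 lies "between" the endpoints but is not a value the mean can take.
print(Fraction(7009, 100) in mean_prob)
```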
The red line on the picture, which is cut off and which actually extends from 68.8 to 71.9, is the classical “95% confidence interval” on the parameter of a normal distribution model. Notice that this extends beyond the actual possibilities. The definition of the confidence interval means—ready?—nothing for any particular set of data (except the true mean lies in the interval or not), but even if you took the Bayesian view (same as the frequentist here for a flat prior) the interval still only speaks of a parameter. And even if you integrated the parameters out (let he who readeth understand), you’d still be left with an interval, which gives probabilities for impossible values (actually it gives probability 0 for every value!).
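For comparison, here is a sketch of one way to get a classical interval like the red line. It assumes the usual normal-model recipe, mean plus or minus 1.96 standard errors, which is a guess at the recipe rather than a record of what was run; applied to the 30 Maxes listed in the comments, it lands on roughly the endpoints quoted above.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

t = [68, 72, 71, 70, 67, 68, 60, 66, 75, 70, 75, 77, 62, 68, 68,
     70, 75, 72, 77, 64, 71, 75, 74, 70, 69, 64, 72, 77, 71, 73]

z = NormalDist().inv_cdf(0.975)            # about 1.96
half_width = z * stdev(t) / sqrt(len(t))   # z times the standard error
lo, hi = mean(t) - half_width, mean(t) + half_width
print(round(lo, 1), round(hi, 1))

# Under the measurement-error assumptions the mean cannot stray more than
# one degree from the computed average, so the interval overshoots both ways.
possible_lo, possible_hi = mean(t) - 1, mean(t) + 1
print(lo < possible_lo, hi > possible_hi)
```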
Don’t worry if the last paragraph made little sense. The point is this: the results we have are exact, and not the result of a parameterized probability model. Our results are deduced given the assumptions we used, and not calculated via some ad hoc model.
What happens if we change the assumptions? We change the results! Of course we do. All probability (all logic) is conditional. Change the conditions, change the conclusion.
What this answer isn’t
Because of measurement error, we were not certain of the mean, which is what we wanted to know. But we are certain of what the mean could have been, and its chances.
The results are not a prediction of future values of Max temperature. They are a prediction of what the mean of Max temperatures was during those 30 days, which we don’t know (again) because of measurement error.
These results are not statements about actual past temperatures, which we already knew, up to measurement error.
The results are also not what Kip originally asked for, but the answers to those questions are discovered in just the same way as these.
I’ll do the logical probability example closest to binomial next.
Basically, you model what could have been observed based on what’s observed on a particular day, and then compute the probability distribution of the sample mean of max temperatures for 30 days.
This could have been written by a classical statistician who tries to show Lyapunov Central Limit Theorem by example! (http://en.wikipedia.org/wiki/Central_limit_theorem#Lyapunov_CLT)
Show me those 30 maximums, your data! What do your data tell you? Whip out a prediction of tomorrow’s max temperature based on the maximum temperatures from the past 30 days.
Logical probability? Deduced? Not via some ad hoc model? What (and where) is the assumption/premise of independence required in your calculations called in the context of logical probability?
With the information that “I know that my accuracy cannot exceed 0.5 degrees, because my thermometer is marked only in single degrees, and I estimate the temps up or down to the whole number as I see fit,” I would not assign a discrete uniform probability of 1/3 to each of the three values that a daily Max can take (after the observation).
JH,
Your comments show that you have not succeeded in breaking out of classical thinking. There is no “sample mean”. There is a computed mean (a fixed function) of certain observations subject to measurement error, of a kind we assumed (others are possible, as admitted; but even classical measurement error models need to make these assumptions). The word “mean” as I use it was specifically defined in part one. It does not take the definition classical statistics gives it. Besides, “samples” are frequentist thinking, which is of no interest here. I recommend re-reading part one in the spirit I intended. Do not try to shoehorn what I say into a classical framework. That way lies darkness.
The results presented here could not “have been written by a classical statistician who tries to show Lyapunov Central Limit Theorem.” A classical statistician might have tried something like this, but the two methods are not equivalent. My results are exact given the assumptions. The other is merely an approximation given the same assumptions I made and adding some which are not warranted.
These are the temperatures: t = c(68, 72, 71, 70, 67, 68, 60, 66, 75, 70, 75, 77, 62, 68, 68, 70, 75, 72, 77, 64, 71, 75, 74, 70, 69, 64, 72, 77, 71, 73)
There is no prediction of tomorrow’s high from these, for two big reasons: (1) this wasn’t the problem we were working on, (2) the answer is about a function (the numerical average) of actual observations. But stand by for a logical probability version of regression and time series. I’ll do this in a week or two (I have some traveling to do before then).
“Logical probability? Deduced?” Exactly, yes. Deduced, meaning error free given the assumptions we had. (All probability is conditional on assumptions/evidence/premises.)
“What(and where) is the assumption/premise of independence required…” It doesn’t exist. It is not needed. You have misunderstood the goal. Part one lays this out.
“I would not assign a discrete uniform probability of 1/3 …” Congratulations? Well, as said in part one (recall), the assumption about the measurement error characteristics of the instrument was one possibility. Since we had nothing else, I used the 1/3 assumption for illustration. Other assumptions are possible, and each would give a different answer. But the technique would be the same. And it’s the technique we’re interested in here, not Kip’s thermometer.
Re-read the warning about language I gave in part one. This is the second time you’ve referenced Wikipedia as a kind of trump card (but you’re playing a different game!) based on misunderstanding the terms.
This post is how a classical statistician would demonstrate the sampling distribution of the sample mean.
A classical statistician would show the histogram for a particular day, and then plot a histogram of the resulting probabilities (the calculations of which can be easily demonstrated in class) for two days, and finally, just like you’ve done here, construct the histogram for a sample size of 30 days. No approximation at all.
Students can see how the shape of the histograms approaches a bell shape as the number of days increases. Now, here, students would have to use approximation and their imagination. When theoretical proof is impossible, this is one way to show the theorem.
Not convinced by your measurement error model at all. It should start with the true variable that a surrogate (which represents your erroneously observed maximums in this case) is trying to measure. Your model doesn’t make sense. Well, you know whom to ask if you have questions on measurement error models, Bayesian or not… all conditional on information available. You won’t believe what I say, so I am going to save myself some time.
Referencing Wikipedia is not any kind of trump card! Please don’t try to project your own thinking and behavior onto others. I wish you could cite references. A book? A paper? Anything!
JH,
You’re just not understanding the problem. There is no reason in the world to display a histogram. This would just confuse things. There is no theorem here, either. I’m not sure I have it in my powers to explain it any more clearly, but I’ll try.
Assume zero measurement error. Then we would know the mean with 100% certainty. The “mean” is an exact function. It has zero to do with any theory in the world (except that one which says a mean is an exact function, i.e. the numerical average). No matter how many days I had, with zero measurement error, we know with certainty what the mean is.
Now assume measurement error in the way we did (as I’ve tried to explain, other ways are possible, but assume this one). Suppose we measured 62. Then we know the mean is 61 with probability 1/3, 62 with probability 1/3, and 63 with probability 1/3. This is an exact, precise, perfect answer given our assumptions (all probability is conditional on assumptions/premises/evidence).
What I did above is show how the same reasoning works for 30 instead of one day. But it’s exactly the same.
Go back to part one and look at the example where we only have two days (two measures). It is easy to work out entirely by hand (which I did). Doing 30 days requires a computer. That’s what the homework is.
Cite what? What for? We have worked out the answer. What do we need to cite?
Update. If the measurement error were not (1/3, 1/3, 1/3)—and recall we are just assuming these—but, say, (.1, .8, .1), meaning there is an 80% chance of showing the right temp but a 20% chance of showing 1 degree over and under, then if we measured 62 we know the mean is 61 with probability 0.1, 62 with probability 0.8, and 63 with probability 0.1. I’ll leave it as a homework problem to show what this does to the picture above.
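As a starting point for that homework, here is a sketch that takes the three error probabilities as an input, so the (1/3, 1/3, 1/3) and (.1, .8, .1) assumptions can be run side by side. The function name is made up for illustration, and floats are used for brevity (swap in exact fractions if preferred).

```python
def total_error_dist(n_days, probs):
    """Distribution of the total error over n_days.

    probs = (P(true = reading - 1), P(true = reading), P(true = reading + 1)).
    """
    dist = {0: 1.0}
    for _ in range(n_days):
        nxt = {}
        for total, p in dist.items():
            for step, q in zip((-1, 0, 1), probs):
                nxt[total + step] = nxt.get(total + step, 0.0) + p * q
        dist = nxt
    return dist

uniform = total_error_dist(30, (1/3, 1/3, 1/3))
peaked = total_error_dist(30, (0.1, 0.8, 0.1))

# Probability that the errors cancel exactly, under each assumption.
print(uniform[0], peaked[0])
```

Dividing each total error by 30 and adding the computed average turns either dictionary into the distribution of the mean. The set of possible values stays the same; only the probabilities attached to them change.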
All,
Don’t over-think this problem.
If I told you two numbers, say 2 and 4, and I asked you to compute their mean, would you be certain it is 3? Of course! We ask kids in 4th (is it 4th?) grade to do this daily. They do not need central limit theorems to calculate 2 plus 4 divided by 2.
But now if I told you I wanted to know the mean of two numbers, the first of which is either 2 or 3 each with probability 1/2, and the second is either 4 or 5 each with probability 1/2, then we know the mean is either (2+4)/2 with probability 1/4, (2+5)/2 with probability 1/4, (3+4)/2 with probability 1/4, or (3+5)/2 with probability 1/4.
That’s it! Just as we did in part one. The answer is exact, though fuzzy, because we don’t have just one number.
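The two-number example is small enough to enumerate directly. A minimal sketch, using exact fractions (the dictionary layout is just one convenient way to write down the stated probabilities):

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

first = {2: Fraction(1, 2), 3: Fraction(1, 2)}    # first number: 2 or 3
second = {4: Fraction(1, 2), 5: Fraction(1, 2)}   # second number: 4 or 5

mean_dist = defaultdict(Fraction)
for (a, pa), (b, pb) in product(first.items(), second.items()):
    mean_dist[Fraction(a + b, 2)] += pa * pb

for m in sorted(mean_dist):
    print(float(m), mean_dist[m])
```

Because (2+5)/2 and (3+4)/2 are the same number, their two 1/4 chances add, so the mean is 3.5 with probability 1/2.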
Keep all classical statistics out of your head when doing these problems. That stuff will just confuse you.
Briggs,
The probability plots shown in this post are called probability histograms!!!
And they are calculated based on the model you’ve postulated. I KNOW it requires a computer to calculate the probability for 30 days (but not for 2 days), hence only the probability histogram can be displayed for students for 30 days.
My point is: your calculations are no different from what classical statisticians would have done! A classical statistician would make the same conclusion as you do in this post… assuming that they accept your discrete uniform model.
The problem arises when a classical statistician tries to make inferences/predictions. Hence my question: what do your data tell you and how would you make prediction?
Let x be the true maximum temperature (m.t.), and w be the observed surrogate m.t. Kip wrote:
“I know that my accuracy cannot exceed 0.5 degrees, because my thermometer is marked only in single degrees, and I estimate the temps up or down to the whole number as I see fit,…”
I think Kip meant that he rounded his temperature readings to the nearest integer temperature. If so, the unobserved true x can be assumed to have a possible value ranging from w – 0.5 to w + 0.5, where, again, w is the observed surrogate. This is how I would start my error-in-variable model.
The movie “Hobbit” is on now, so I’d rather go back to it… watching it for the second time.
Since I don’t know any classical statistics that should be easy. What kind of statistics does statistical mechanics use? The canonical ensemble, both petite and grand. What you have done so far is straightforward enough, banal even, but I am sure that the denouement will rattle a few cages and anger the beast within. It’s one of those nights – turkey tomorrow! 🙂
JH,
I admit defeat. You have defeated me. There are just some things I cannot teach to all people. But, no, no, and as many noes as you like, this solution is not how a classical statistician would have solved it, as you yourself have demonstrated in your previous comments.
Briggs,
The point of my mentioning CLT is not that one needs the CLT to find the probability distribution of possible sample means. It is to let you know that your calculations are exactly what a classical statistician would do in order to show the theorem by example. Your setup in this post can serve that purpose beautifully.
(In the meantime, I am hoping that your readers might get a rough idea what a CLT is from your calculations and histograms. Let me not mention the reason why I specifically use Lyapunov CLT, instead of just CLT.)
There is no specific parameter to be estimated in the way the entire thing is set up by you here. No parameter called mu or beta at all…which usually is the cause of frequentist inferential problems.
The fact that I think your model doesn’t make sense based on Kip’s message has nothing to do with whether it’s classical statistics at all. Classical statisticians set up their models based on information available also.
Hence, I have no idea what you meant by saying that I have defeated you.
I have not tried to solve any question. Nor have I postulated any probability model relating x and w.
I have not said that your calculations are wrong based on your model either.
The true value of the central limit theorem is that it allows us to use the normal distribution as an approximation in cases where we do not know the true distribution and there are computational difficulties. How long will it take to find the probability distribution of possible means over 365 days?
Tom,
Since this took about the blink of an eye on my cheesy laptop, 10 times that long isn’t too long. Turns out there’s a formula in the end so this is pretty easy to do.
We’ll see when I do the “success” example that calculations are just as fast and what happens when you let n go to infinity. Parameters emerge: at infinity! We could take limits here, but since the problem is not of wide interest there doesn’t seem a lot of point.
Briggs, 3^30 and 3^365. Think about it.
Tom,
Oops. You’re right. But taking limits as approximations would be okay in cases like this. So would numerical approximations to the formula. The number of possible means would still be limited.
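To make the point about 365 days concrete: listing all 3^365 combinations is hopeless, but the number of distinct totals is small, so the counts can be built up a day at a time instead. This is a sketch of that idea under the same equal-weight error assumption, not necessarily the formula alluded to above; the function name is made up.

```python
def trinomial_row(n_days):
    """Counts of ways for each total error; index i means a total of i - n_days."""
    row = [1]
    for _ in range(n_days):
        new = [0] * (len(row) + 2)
        for i, count in enumerate(row):
            new[i] += count       # that day read one degree low
            new[i + 1] += count   # that day was spot on
            new[i + 2] += count   # that day read one degree high
        row = new
    return row

row = trinomial_row(365)
print(len(row))                   # number of possible values for the mean
print(sum(row) == 3**365)         # the counts still add up to every combination
```

Python integers are arbitrary precision, so the counts are exact even though 3^365 has well over a hundred digits. Dividing each count by 3^365 gives the exact probabilities.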
Briggs, my dear boy, think before you speak and take the time to understand a comment before criticizing.
I’ll come clean — Prior to asking my dumb questions here, I have been making trouble at Andy Revkin’s Dot Earth blog with this guest essay there:
http://dotearth.blogs.nytimes.com/2013/10/09/on-walking-dogs-and-global-warming-trends/
I did not claim to be a statistician (as you can see by my miserable performance here, it is obviously true). I only meant to write as a “practic-ian”.