Philosophy

Stats 101: Chapter 1

UPDATE: If you downloaded the chapter before 6 am on 4 May, please download another copy. An older version contained fonts that were not available on all computers, causing it to look like random gibberish when opened. It now just looks like gibberish.

I’ve been laying aside a lot of other work, and instead finishing some books I’ve started. The most important one is (working title only) Stats 601, a professional explanation of logical probability and statistics (I mean the modifier to apply to both fields). But nearly as useful will be Stats 101, the same sort of book, but designed for a (guided or self-taught) introductory course in modern probability and statistics.

I’m about 60% of the way through 101, but no chapter except the first is ready for public viewing. I’m not saying Chapter 1 is done, but it is mostly done.

I’d post the whole thing, but it’s not easy to do so because of the equations. Those of you who use Linux will know of latex2html, which is a fine enough utility, but since it turns all equations into images, documents don’t always end up looking especially beautiful or easy to work with.

So below is a tiny excerpt, with all of Chapter 1 available at this link. All questions, suggestions for clarifications, or queries about the homework questions are welcome.

Logic

1. Certainty & Uncertainty

There are some things we know with certainty. These things
are true or false given some evidence or just because they are
obviously true or false. There are many more things about which
we are uncertain. These things are more or less probable given
some evidence. And there are still more things of which nobody
can ever quantify the uncertainty. These things are nonsensical or
paradoxical.

First I want to prove to you there are things that are true,
but which cannot be proved to be true, and which are true based
on no evidence. Suppose some statement A is true (A might be
shorthand for “I am a citizen of Planet Earth”; writing just ‘A’ is
easier than writing the entire statement; the statement is everything
between the quotation marks). Also suppose some statement
B is true (B might be “Some people are frightfully boring”). Then
this statement: “A and B are true”, is true, right? But also true is
the statement “B and A are true”. We were allowed to reverse the
letters A and B and the joint statement stayed true. Why? Why
doesn’t switching make the new statement false? Nobody knows.
It is just assumed that switching the letters is valid and does not
change the truth of the statement. The operation of switching
does not change the truth of statements like this, but nobody will
ever be able to prove or explain why switching has this property.
If you like, you can say we take it on faith.

That there are certain statements which are assumed true
based on no evidence will not be surprising to you if you have
ever studied mathematics. The basis of all mathematics rests on
beliefs which are assumed to be true but cannot be proved to
be true. These beliefs are called axioms. Axioms are the base;
theorems, lemmas, and proofs are the bricks which build upon
the base using rules (like the switching statements rule) that are
also assumed true. The axioms and basic rules cannot, and can
never, be proved to be true. Another way to say this is, “We hold
these truths to be self-evident.”

Here is one of the axioms of arithmetic: For all natural
numbers x and y, if x = y, then y = x. Obviously true, right? It is just
like our switching statements rule above. There is no way to prove
this axiom is valid. From this axiom and a couple of others, plus
acceptance of some manipulation rules, all of mathematics arises.
There are other axioms (two, actually) that define probability.
Here, due to Cox (1961), is one of those axioms: The probability
of a statement on given evidence determines the probability of its
contradictory on the same evidence. I’ll explain these terms as we
go.
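
A minimal sketch of what this axiom looks like once it is made concrete. In the standard development that follows from Cox's axioms, the rule linking a statement to its contradictory turns out to be subtraction from 1. The function name and the example value of 1/4 are made up for illustration; they are not from the chapter.

```python
from fractions import Fraction

def prob_of_contradictory(p_a_given_e):
    """Return Pr(not A | E) given Pr(A | E), using the usual sum rule.

    Hypothetical helper: the axiom only says the second number is
    determined by the first; 1 - p is the conventional form it takes.
    """
    if not 0 <= p_a_given_e <= 1:
        raise ValueError("a probability must lie between 0 and 1 inclusive")
    return 1 - p_a_given_e

# Example value chosen purely for illustration.
print(prob_of_contradictory(Fraction(1, 4)))
```

Note that the evidence E never changes inside the function: both probabilities are conditional on the same evidence, as the axiom requires.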

It is the job of logic, probability, and statistics to quantify
the amount of certainty any given statement has. An example
of a statement which might interest us: “This new drug improves
memory in Alzheimer patients by at least ten percent.” How probable
is it that that statement is true given some specific evidence,
perhaps in the form of a clinical trial? Another statement: “This
stock will increase in price by at least two dollars within the next
thirty days.” Another: “Marketing campaign B will result in more
sales than campaign A.” In order to specify how probable these
statements are, we need evidence, which usually comes in the form
of data. Manipulating data to provide coherent evidence is why
we need statistics.

Manipulating data, while extremely important, is in some
sense only mechanical. We must always keep in mind that our
goal is to make sense of the world and to quantify the uncertainty
we have in given problems. So we will hold off on playing with data
for several chapters until we understand exactly what probability
really means.

2. Logic

We start with simple logic. Here is a classical logical argument,
slightly reworked:

All statistics books are boring.

Stats 101 is a statistics book.
_______________________________________________
Therefore, Stats 101 is boring.

The structure of this argument can be broken down as follows.
The two statements above the horizontal line are called premises;
they are our evidence for the statement below the line, which is
the conclusion. We can use the words “premises” and “evidence”
interchangeably. We want to know the probability that the conclusion
is true given these two premises. Given the evidence listed,
it is 1 (probability is a number between, and including, 0 and 1).
The conclusion is true given these premises. Another way to say
this is the conclusion is entailed by the premises (or evidence).

You are no doubt tempted to say that the probability of the
conclusion is not 1, that is, that the conclusion is not certain,
because, you say to yourself, statistics is nothing if not fun. But
that would be missing the point. You are not free to add to the
evidence (premises) given. You must assess the probability of the
conclusion given only the evidence provided.

This argument is important because it shows you that there
are things we can know to be true given certain evidence. Another
way to say this, which is commonly used in statistics, is that the
conclusion is true conditional on certain evidence.
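
One way to see "entailed by the premises" mechanically is to list every possible state of affairs and check that the conclusion holds in all of those the premises allow. The sketch below does this for the argument above. The two-book universe and all function names are assumptions made for illustration, not part of the chapter.

```python
from itertools import product

# Hypothetical universe of two books, each either a statistics book or not,
# and either boring or not.
BOOKS = ("Stats 101", "Some other book")

def worlds():
    """Yield every assignment of (is_stats_book, is_boring) to each book."""
    flags = list(product([False, True], repeat=2))
    for combo in product(flags, repeat=len(BOOKS)):
        yield dict(zip(BOOKS, combo))

def all_stats_books_boring(w):
    # Premise 1: every book that is a statistics book is boring.
    return all(boring for is_stats, boring in w.values() if is_stats)

def stats101_is_stats_book(w):
    # Premise 2.
    return w["Stats 101"][0]

def stats101_is_boring(w):
    # Conclusion.
    return w["Stats 101"][1]

def entailed(premises, conclusion):
    """True if the conclusion holds in every world where all premises hold."""
    consistent = [w for w in worlds() if all(p(w) for p in premises)]
    return bool(consistent) and all(conclusion(w) for w in consistent)

print(entailed([all_stats_books_boring, stats101_is_stats_book], stats101_is_boring))
print(entailed([stats101_is_stats_book], stats101_is_boring))
```

With both premises the conclusion holds in every remaining world, which is what probability 1 means here. Drop the first premise and the conclusion is no longer entailed: the answer depends on exactly which evidence you condition on.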

(To read the rest, Chapter 1 is available at this link.)


23 replies »

  1. I downloaded your Chapter 1. You need to pull it immediately – it is beyond embarrassing. I’ve never seen that many typos in one paragraph before – it’s like you typed it out on a typewriter and then scanned it into a computer using cheap OCR software, or perhaps the ‘s’ and ‘c’ characters in your PDF generator need to be repaired? Did you actually look at it even once before putting it on the web for other people to look at?

    “Certainty & Unyertainty”?

    “Aleo euppoe eome etatement B ie true” ???

  2. Tom,

    When I originally posted it, the file looked fine on both of my Linux boxes and on one Mac, but people with PCs were receiving a corrupted version. In fact, I looked at the file dozens of times, downloading it off my own page to make sure it worked.

    I’ve placed a new version, this time embedding all the fonts inside the document using a ghostscript utility. It still looks good here, but I don’t have a Windows machine until tomorrow. Would you mind having another go?

    Of course, there is always the chance that the version with the screwed up fonts will read better than the corrected version, so read at your own risk.

    Thanks for the heads up.

    Matt

  3. It looks fine on my MacOSX laptop now, which is the machine where I saw the problem previously. Thanks for fixing it so quickly.

    Also, I want to apologize for the harsh tone of my first comment. Looking at it again in the morning, after a cup of coffee, I’m ashamed that I couldn’t find a friendlier way of pointing out the problems I saw last night.

  4. Thank you. I understood the font problem immediately and while I could interpret it without difficulty, it was very annoying.

    Also thanks for the refreshing approach to introducing the concepts.

    I took intro stats on three occasions in my academic life – first while studying programming (COBOL, and FORTRAN), second while an undergrad in geology, and finally while a grad student in geology.

    Each course was different. The first two focussed entirely on social sciences and frankly bored me stiff. I received a second class, but lost the learning due to non-use. The final time was focused on bioscience data, but was at least partially relevant so I retain a bit more.

    I’ll let you know in a few years if your treatment is more successful.

  5. Matt

    Love the clarity of your explanations. However, there’s a minor character translation problem remaining. Under IE 7 at the home page, your typographer’s quotes are rendered as three characters: an “a” with a grave accent, a capital “C” with two strokes through it that I have forgotten the name of, and a box. The page of the same matter with comments renders the quotes as intended.

    On my Mac running OSX 10.5.1 running Safari, the reverse is true!

    Things haven’t improved any since I wrote about this back in 2000.

    http://www.sturmsoft.com/Writing/Old_ephemerides/20001029.htm#Saturday

    Best to use inch and foot marks instead of typographer’s quotes.

  6. Jonathan, Thanks for the tips. I originally cut and paste from the PDF file, but I’ve fixed the HTML.

    althos, I used pdflatex, so no dvi file. Are you still having trouble with the PDF?

    Deadwood, I’ll be waiting.

  7. On winXP the pdf looks good…. now I just have to finish reading it…. thanks for posting the corrected version.

  8. I had no problem reading the text at all using Firefox on my Linux machine. I will have to re-boot to get to my Windows system on this computer to see if IE and the Windows version of Firefox have a problem. If so, I will report back, but it seems that the text problems are under control.

    This version of Chapter 1 is much more readable, and brings the reader into a boring subject in a gentle way. I miss the equations, but, you will undoubtedly get to those later.

    Good job, and good writing style.

    bob

  9. Most people hate and fear statistics.

    I love statistics, and I loved your Chapter 1.

    ———————————————

    Therefore, don’t quit your day job.

  10. Any time two Ph.D.s get together, you have a paradox.

    In other words, you may need a co-author, or else develop multiple personalities.

    I would like to see Chapter 2. I guess I was a little dismayed by your contention that frequentist probability statements have no connection with reality, despite their inherent truth. That’s a little bit paradoxical, in my opinion. Are Bayesian statements more real but less truthful?

  11. I don’t mean to be flippant. It is very fine writing about a difficult subject. The question about reality and truth somewhat avoids the issue, though. Mathematical statistics (Bayesian or not), like the English language, are a symbolic representation of the “real” world. It’s all symbols, which have a reality of their own, but are used as models, whether in classical distributions or Cartesian displays. All premises, too, are inherently symbolic. The map is not the territory, but maps are useful nonetheless.

  12. Mike,

    Now we’re getting somewhere (and I don’t see you as flippant).

    You are interested in the probability of statement A—which is usually called an event in classical statistics—given evidence E.

    In frequentist statistics, you cannot say what Pr(A|E) is. At least, you cannot say so until you have re-run the “experiment” that would or would not let A obtain an infinite number of times.

    The relative frequency of times A obtains in that infinite sequence—and only in that infinite sequence—is defined to be the probability of A.

    If A is a statement, or event, that cannot happen an infinite number of times (for example A = “Mrs Clinton wins the Presidency in 2008”) then you are out of luck in classical probability. You just cannot answer the question. A has no probability.

    Another example: A = “A star will show on page 12 one day hence” where E = “There is a notebook with 52 pages, and today on just one I will draw a star, and three days hence I will destroy the notebook.” The logical probability of A given E is 1/52. But you are not allowed to say so in frequentist statistics. (The only reason to emphasize “destroy the notebook” is to highlight that this is a unique event; it is not needed.)

    True, A is just a map, but it is an exact one. You will look at the notebook at page 12, and either a star will be there or it won’t.

    Another example due to Stove: A = “Bob is a horse” given E = “Bob is a winged horse.” The logical probability of A given E is 1, but the argument schema, of course, has no relative frequency.

    The reason to bring up an argument schema, is that frequentists are always obliged to posit an infinite series of experiments, and so must convert your concrete argument A and E into an argument schema where other A’s and E’s—identical in every way except for the substitution of labels—are assumed to exist. How else can you get to infinity?

    So in the example above, you might like to say A_2, A_3, … are future notebooks which you will force me to either write a star on page 12 or not. I’ll certainly get a cramp as I get near drawing in an infinite number of notebooks.
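
A rough sketch of the contrast in the notebook example, under stated assumptions. The logical probability is read straight off the evidence: one star, 52 pages, nothing favouring any page. The relative frequency can only ever be computed over a finite run of imagined notebooks. Here a seeded pseudo-random generator stands in for the person drawing the star, which is an assumption of the sketch and not part of the example itself; the function name is made up.

```python
import random
from fractions import Fraction

PAGES = 52        # pages in the notebook, from the evidence E
TARGET_PAGE = 12  # the page named in statement A

# Logical probability of A given E: one favourable page out of 52.
logical_probability = Fraction(1, PAGES)

def relative_frequency(n_notebooks, seed=0):
    """Fraction of n simulated notebooks whose star lands on the target page.

    Assumption: each notebook's starred page is drawn uniformly at random.
    """
    rng = random.Random(seed)
    hits = sum(rng.randint(1, PAGES) == TARGET_PAGE for _ in range(n_notebooks))
    return hits / n_notebooks

print("logical probability:", logical_probability, "=", float(logical_probability))
for n in (100, 10_000, 100_000):
    print(n, "notebooks, relative frequency:", relative_frequency(n))
```

However large the run, the loop stops at a finite number of notebooks, so the relative frequency it reports is only ever an approximation to the limit the frequentist definition asks for.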

    Actually, I can see I am going too far. I think I will make this a posting for itself. This is just the kind of thing, incidentally, that’s going into the “Stats 601” book, although with more detail there.

  13. Hello Bill!
    Please GO:

    http://www.usefulinfo.co.uk/climate_change_global_warming.php

    and locate the large table with the t-tests. Could you explain to me if this method of stat analysis of the CET records is OK or not OK?

    I’m using this same method of stat analysis of temp records from several very remote weather stations. When I post my results over at Tim Lambert’s blog, everybody just dumps all over me and calls me names like stupid, fool, idiot etc.

    This is because I flat-out reject any and all hypotheses that the activities of man have any influence on climate whatsoever, and they do not like me at all, especially Bernard J, Barton Paul, Dano, etc.

    So far my results show no evidence of any global warming at this site, have detected the PDO shift from a cool to warm phase and back to a present cool phase quite accurately, and show that climate at this site is the same for 2000-06 interval as it was for the 1900-06 interval. Indeed I have found constant mean Tmax and Tmin intervals of 60-70 years at this site.

    I would like to post some results here for you to check out. However, I don’t know how the data will display, and you don’t have a preview button so I can check this.

    Have you checked out the late John Daly’s website at http://www.John-Daly.com? What is wrong with his work? Everybody over at Tim’s IOD calls him names also. Check out some of the temp-time plots in the “Station Temperature Data”, especially Death Valley.

  14. Harold,

    I am no fan of t-tests, especially when applied indiscriminately as they were in that table.

    I’ll talk about t-tests in my Chapter 10.

    Never saw Daly’s web site before. I see he is dead.

    Chapter 2 is just about ready.

  15. Thank you for the explanation re infinite limits. I have a friend who does not believe in calculus for exactly the same reason. And yet, calculus seems to work in many situations.

    Here is a good stat problem. Imagine an experiment with two rats, Andy and Barney. You feed Andy cat food and Barney dog food. Then Barney dies. What is the probability that dog food is poison to rats?

    Hint: there is both a frequentist and Bayesian answer to this problem, although both border on logical absurdity. One would really like to try the dog food out on more than one rat, just to add a little “confidence.” Also, it may surprise people how many allegedly scientific experiments are carried out with only two rats, i.e. two sampling units.

  16. Hello William!

    You didn’t answer my question. And why did you say these t-tests were applied indiscriminately?

    If I want to determine if the mean Tmax and Tmin temps for each month of 1900 are the same or different than that of 2000, what stat test(s) would you recommend?

    I’d pull out the 12 gauge pump shotgun (aka simple unpaired t-test) and blast the data with 0-0 buckshot (cf The Getaway) because it gets the job done and leaves no survivors.

  17. However, I do accept and understand (and appreciate) your example of one-of-a-kind future events, and the theoretical difficulty of assigning classical probabilities to those. But let’s not throw the baby out with the bath water. Gauss, like Newton and Euler, left us some fun stuff which is still useful in certain applications.

    (not including t-tests, which are mostly useless).

  18. Harold,

    Let me take a stab at your question because Matt is busy writing a book and I am busy wasting time. T-tests assume a normal (Gaussian) distribution (the old bell-shaped curve) which never exists in actual data (only in theoretical distributions of an infinity of averages). Since the (limited) temp data is not normally distributed, the t-test is inappropriate and gives misleading results.

    Blasting data with a shotgun is inelegant. Your flat-out rejection of GW hypotheses may or may not be correct, but the t-test doesn’t add any supporting evidence one way or another.

    Which is a conclusion one might draw from Matt’s Chapter 1, but I have confidence will be better explained in the future Chapters.

    PS Gauss was still a remarkable genius. I admire him, anyway.

  19. Hello Mike D!

    Thank you for your really nice polite reply. It is not what I get over at IOD, desmogblog, Gavin’s Garage, and many other sites in the Kingdom of the Warmers.

    Data is always limited especially for many lab and field experiments with animals and plants, and most of us doing bioassays and experiments with them don’t have the time and resources to do these with a large number of reps, but use t-test and reviewers of our papers on insect pheromone research don’t object.

    When testing decadal means for a sample interval of a month, I am using 300 numbers for each decade, and that would seem quite sufficient for a t-test. I actually never got around to checking if the distribution was normal.

    Although many consider a shotgun an inelegant weapon, Craig Venter used it on the human genome project and it got the job done. Presumably, he was in a real hurry so he could use the results to restore the hair of his youth and make billions with a sure-fire, gene-based cure for male baldness.

    My rejection of global warming and climate change is not based upon the usual climate metrics but on being alive for 64 years. Prior to this year, the climate in Metro Vancouver has been more or less the same, although the weather has been quite variable and strongly influenced by the ENSO. However, the winter of 1991-92 was an exception and was quite cold (like 0 to ca -10 deg C for 2 months) due to the Mt Pinatubo eruption.

    Starting this spring, my body sensors are detecting a much different type of air. It has a coolness that I never experienced and it feels really creepy.

    BTW: Did you check JD’s and AM’s websites using the posted links? What is your assessment of them? Over at IOD John D is called a crackpot and most don’t even know about AM.

  20. I am looking forward to STATS 601. I last had a statistics course over 30 years ago and obviously need an update.

    After reading STATS 101, I feel like Cotton Mather after he was told everything he learned about witches (statistics) is nonsense. Bummer. Five semesters of statistics courses and now I need to be re-educated in modern statistics.

    Regards,
    Ray
