## Stats 101: Chapter 1

#### UPDATE: If you downloaded the chapter before 6 am on 4 May, please download another copy. An older version contained fonts that were not available on all computers, causing it to look like random gibberish when opened. It now just looks like gibberish.

I’ve been laying aside a lot of other work, and instead finishing some books I’ve started. The most important one is (working title only) *Stats 601*, a professional explanation of logical probability and statistics (I mean the modifier to apply to both fields). But nearly as useful will be *Stats 101*, the same sort of book, but designed for a (guided or self-taught) introductory course in modern probability and statistics.

I’m about 60% of the way through *101*, but no chapter except the first is ready for public viewing. I’m not saying Chapter 1 is done, but it is mostly done.

I’d post the whole thing, but it’s not easy to do so because of the equations. Those of you who use Linux will know of `latex2html`, which is a fine enough utility, but since it turns all equations into images, documents don’t always end up looking especially beautiful or easy to work with.

So below is a tiny excerpt, with all of Chapter 1 available at this link. All questions, suggestions for clarifications, or queries about the homework questions are welcome.

### Logic

**1. Certainty & Uncertainty**

There are some things we know with certainty. These things are true or false given some evidence or just because they are obviously true or false. There are many more things about which we are uncertain. These things are more or less probable given some evidence. And there are still more things of which nobody can ever quantify the uncertainty. These things are nonsensical or paradoxical.

First I want to prove to you there are things that are true, but which cannot be proved to be true, and which are true based on no evidence. Suppose some statement A is true (A might be shorthand for “I am a citizen of Planet Earth”; writing just ‘A’ is easier than writing the entire statement; the statement is everything between the quotation marks). Also suppose some statement B is true (B might be “Some people are frightfully boring”). Then this statement: “A and B are true”, is true, right? But also true is the statement “B and A are true”. We were allowed to reverse the letters A and B and the joint statement stayed true. Why? Why doesn’t switching make the new statement false? Nobody knows. It is just assumed that switching the letters is valid and does not change the truth of the statement. The operation of switching does not change the truth of statements like this, but nobody will ever be able to prove or explain why switching has this property. If you like, you can say we take it on faith.
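
The switching rule can at least be checked mechanically, even if it cannot be explained. What follows is a small sketch in Python, not a proof: the function name `conjunction_commutes` is made up for illustration, and Python's own `and` already builds the rule in, so the check assumes the very thing it examines.

```python
from itertools import product


def conjunction_commutes() -> bool:
    """Return True if "A and B" has the same truth value as "B and A"
    for every possible assignment of True/False to A and B."""
    return all(
        (a and b) == (b and a)
        for a, b in product([True, False], repeat=2)
    )


if __name__ == "__main__":
    # Walks through all four combinations of A and B.
    print(conjunction_commutes())
```

All the sketch does is list the four cases and confirm that switching never changes the answer. It says nothing about why.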

That there are certain statements which are assumed true based on no evidence will not be surprising to you if you have ever studied mathematics. The basis of all mathematics rests on beliefs which are assumed to be true but cannot be proved to be true. These beliefs are called axioms. Axioms are the base; theorems, lemmas, and proofs are the bricks which build upon the base using rules (like the switching statements rule) that are also assumed true. The axioms and basic rules cannot, and can never, be proved to be true. Another way to say this is, “We hold these truths to be self-evident.”

Here is one of the axioms of arithmetic: For all natural numbers x and y, if x = y, then y = x. Obviously true, right? It is just like our switching statements rule above. There is no way to prove this axiom is valid. From this axiom and a couple of others, plus acceptance of some manipulation rules, all of mathematics arises.

There are other axioms (two, actually) that define probability. Here, due to Cox (1961), is one of those axioms: The probability of a statement on given evidence determines the probability of its contradictory on the same evidence. I’ll explain these terms as we go.
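
For readers who like symbols, here is a hedged preview of what that axiom says. The notation is an assumption made for this sketch and is not taken from the chapter: P(A | E) is the probability of statement A on evidence E, the barred A is its contradictory, and S is some fixed but unspecified function. The second line is the conventional choice of S used in the standard rules of probability.

```latex
% Cox's axiom: the probability of the contradictory is some function S
% of the probability of the statement, on the same evidence E.
P(\bar{A} \mid E) = S\bigl(P(A \mid E)\bigr)

% The conventional choice S(x) = 1 - x gives the familiar rule:
P(\bar{A} \mid E) = 1 - P(A \mid E)
```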

It is the job of logic, probability, and statistics to quantify the amount of certainty any given statement has. An example of a statement which might interest us: “This new drug improves memory in Alzheimer patients by at least ten percent.” How probable is it that that statement is true given some specific evidence, perhaps in the form of a clinical trial? Another statement: “This stock will increase in price by at least two dollars within the next thirty days.” Another: “Marketing campaign B will result in more sales than campaign A.” In order to specify how probable these statements are, we need evidence, which usually comes in the form of data. Manipulating data to provide coherent evidence is why we need statistics.

Manipulating data, while extremely important, is in some sense only mechanical. We must always keep in mind that our goal is to make sense of the world and to quantify the uncertainty we have in given problems. So we will hold off on playing with data for several chapters until we understand exactly what probability really means.

**2. Logic**

We start with simple logic. Here is a classical logical argument, slightly reworked:

All statistics books are boring.

Stats 101 is a statistics book.

_______________________________________________

Therefore, Stats 101 is boring.

The structure of this argument can be broken down as follows. The two statements above the horizontal line are called premises; they are our evidence for the statement below the line, which is the conclusion. We can use the words “premises” and “evidence” interchangeably. We want to know the probability that the conclusion is true given these two premises. Given the evidence listed, it is 1 (probability is a number between, and including, 0 and 1). The conclusion is true given these premises. Another way to say this is the conclusion is entailed by the premises (or evidence).
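
One way to see why the answer is 1 is to list every way the world could be and throw out the ways that contradict the premises. The sketch below does this in Python under two loud assumptions that are not in the text: the first premise is applied only to Stats 101 itself ("if Stats 101 is a statistics book, then it is boring"), and the remaining worlds are simply counted. The names `implies` and `conclusion_probability` are invented for illustration.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: "if p then q" is false only when p is true and q is false."""
    return (not p) or q


def conclusion_probability() -> float:
    """Fraction of worlds consistent with both premises in which the conclusion holds."""
    # Each world is a pair (is_stats_book, is_boring) describing Stats 101.
    worlds = list(product([True, False], repeat=2))

    # Premise 1 (applied to Stats 101): if it is a statistics book, it is boring.
    # Premise 2: Stats 101 is a statistics book.
    consistent = [
        (is_stats_book, is_boring)
        for is_stats_book, is_boring in worlds
        if implies(is_stats_book, is_boring) and is_stats_book
    ]

    # Conclusion: Stats 101 is boring.
    favourable = [world for world in consistent if world[1]]
    return len(favourable) / len(consistent)


if __name__ == "__main__":
    print(conclusion_probability())
```

Only one world survives the premises, and the conclusion is true in it, so no method of weighting the worlds could give anything other than 1. That is what entailment amounts to in this small setting.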

You are no doubt tempted to say that the probability of the conclusion is not 1, that is, that the conclusion is not certain, because, you say to yourself, statistics is nothing if not fun. But that would be missing the point. You are not free to add to the evidence (premises) given. You must assess the probability of the conclusion given only the evidence provided.

This argument is important because it shows you that there are things we can know to be true given certain evidence. Another way to say this, which is commonly used in statistics, is that the conclusion is true conditional on certain evidence.

(To read the rest, Chapter 1 is available at this link.)