BIOL 3110

Biostatistics

Phil

Ganter

301 Harned Hall

963-5782

Eyespot on the wing of a Polyphemus moth

Odds and Ends

Chapter 13 but not in text


Unit Organization

Testing for a difference between variances (s²)

There are two situations in which a comparison of two variances might be in order

Situation A - comparing an observed variance to an expected (= known) variance

Situation B - comparing two observed variances (neither of which is calculated or known from prior experience)

Situation A - Using the Chi-square distribution

The Chi-square distribution can be used to construct a confidence interval for a variance or standard deviation or to test the hypothesis that a sample variance does not differ from an expected or known variance.

Confidence interval for a sample variance

Procedure:
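The procedure can be sketched numerically. Assuming a made-up sample (n = 20, s² = 4.0) and critical values read from a standard chi-square table, a 95% confidence interval for the population variance looks like:

```python
# Hypothetical example: 95% CI for a population variance from a sample of
# n = 20 with sample variance s2 = 4.0.  The critical values below are
# read from a standard chi-square table for df = 19.
n = 20
s2 = 4.0
df = n - 1

chi2_lower = 8.907    # chi-square 0.025 quantile, df = 19 (left cutoff)
chi2_upper = 32.852   # chi-square 0.975 quantile, df = 19 (right cutoff)

# Because the chi-square distribution is asymmetric, the interval is not
# centered on s2:  CI = ( df*s2/chi2_upper , df*s2/chi2_lower )
ci_low = df * s2 / chi2_upper
ci_high = df * s2 / chi2_lower
```

Note that the lower table value goes into the upper confidence limit and vice versa, which is why the two tails must be looked up separately.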

Hypothesis test of the difference between a sample variance and a known variance

This situation arises when you have a sample from which you can calculate a sample variance, and you want to compare this value with a known variance to see if they really differ or if the difference is just due to sampling error

There are two ways you might know the value of a variance: theory or experience

There are three alternative hypotheses, each with its own variation of the test

Once again (see the confidence interval above), the chi-square distribution is asymmetric, so the two one-tailed tests use different chi-square critical values

Procedure:
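A numerical sketch of the test, with made-up values for the sample and the known variance (the table values are the same df = 19 cutoffs used for the confidence interval):

```python
# Hypothetical example: test H0 that the population variance equals a
# known sigma0^2 = 2.5, given a sample of n = 20 with sample variance 4.0.
n = 20
s2 = 4.0
sigma0_sq = 2.5
df = n - 1

# Test statistic: chi-square = (n - 1) * s^2 / sigma0^2
chi2_stat = df * s2 / sigma0_sq   # = 19 * 4.0 / 2.5 = 30.4

# Two-tailed test at alpha = 0.05: reject H0 if the statistic falls
# outside the 0.025 and 0.975 chi-square quantiles for df = 19 (table values)
chi2_lo, chi2_hi = 8.907, 32.852
reject = chi2_stat < chi2_lo or chi2_stat > chi2_hi   # here: fail to reject
```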

Situation B - Using the F distribution

We have seen the F distribution before in the lectures on ANOVA

Consider the previous use of the F distribution. 

To evaluate the effect of a factor on a response variable we used the ratio of mean squares, dividing the mean square due to the factor by the mean square due to random error

The ratio is the F statistic, which has a defined probability distribution, and we can compare our F value with a critical value that depends on a pre-defined alpha-level.

A mean square is a variance (look at the way in which it is calculated in Lecture 11a), so we are really comparing two variances when we calculate the F statistic in an ANOVA table

In other words, we have already compared two variances when we evaluated ANOVA results.  The ratio MS(factor)/MS(error) is from Lecture 11a and s1²/s2² is the general definition of the F statistic

So, to test for equal variances:

Two tailed or one tailed?

You will have to decide if the test is one or two tailed

To evaluate the F statistic, you compare it with the critical value from the F distribution with n1 - 1 (numerator) and n2 - 1 (denominator) degrees of freedom
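Putting the pieces together, here is a sketch with hypothetical sample variances; by convention the larger variance goes in the numerator for a two-tailed test, and the critical value below is an approximate table value:

```python
# Hypothetical two-sample test of equal variances.
s1_sq, n1 = 6.0, 16   # larger sample variance goes in the numerator
s2_sq, n2 = 2.0, 21

F = s1_sq / s2_sq                 # F = 3.0
df_num, df_den = n1 - 1, n2 - 1   # 15 and 20

# For a two-tailed test at alpha = 0.05 (larger variance on top), compare
# with F(0.025; 15, 20), which is about 2.57 in a standard F table
F_crit = 2.57
reject = F > F_crit               # here F = 3.0 > 2.57, so reject H0
```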

Calculating power

Remember what statistical power is -

The probability of rejecting H0 when H0 is false (i. e., when HA is true).

To test this, we need to know the distribution of ts when HA is true

Specifics of the test

Under HA the test statistic is approximately normally distributed, with mean shifted away from zero by the true difference divided by the standard error, and

standard deviation = 1
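As a sketch, assuming a one-tailed, one-sample z test with made-up values for the means, standard deviation, and sample size (under HA the statistic keeps standard deviation = 1 but its mean is shifted):

```python
import math

def norm_cdf(x):
    # Standard normal cumulative distribution via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Hypothetical one-sample z test, one-tailed, alpha = 0.05
mu0, muA = 0.0, 1.0     # null and alternative means (assumed)
sigma, n = 2.0, 25      # known standard deviation and sample size (assumed)

se = sigma / math.sqrt(n)        # standard error = 0.4
shift = (muA - mu0) / se         # 2.5: mean of the statistic under HA
z_crit = 1.645                   # one-tailed 5% cutoff (table value)

# Power = Pr(statistic > z_crit | HA), with the statistic ~ N(shift, 1)
power = 1.0 - norm_cdf(z_crit - shift)   # about 0.80 for these numbers
```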

More post-hoc tests for differences between levels in ANOVA analysis

Scheffé Test

The Scheffé test is conservative, in that it will reject the null hypothesis less often than the other tests provided here and in Lecture 11b

 

Duncan

Dunnett

SNK

Tukey

http://fsweb.berry.edu/academic/education/vbissonnette/tables/posthoc.pdf

http://departments.vassar.edu/~lowry/tabs.html#q

http://cse.niaes.affrc.go.jp/miwa/probcalc/s-range/index.html

 

Testing for a difference between proportions

Other uses for the Chi-square distribution

We have already gone over two uses above: the confidence interval for a variance and the test of a sample variance against a known variance
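Another use is the test for a difference between two proportions via a 2 x 2 contingency table. A sketch with hypothetical counts (30/100 successes in one group versus 45/100 in the other):

```python
# Chi-square test of homogeneity for two proportions (hypothetical counts).
# Rows are groups; columns are successes and failures.
obs = [[30, 70],
       [45, 55]]

row_tot = [sum(r) for r in obs]
col_tot = [sum(c) for c in zip(*obs)]
grand = sum(row_tot)

# Sum of (observed - expected)^2 / expected over all four cells
chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_tot[i] * col_tot[j] / grand
        chi2 += (obs[i][j] - expected) ** 2 / expected

# df = (rows - 1)(cols - 1) = 1; the 5% critical value is 3.841 (table value)
reject = chi2 > 3.841   # here chi2 = 4.8, so reject H0 of equal proportions
```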

A Bit More Probability

We have already had a very brief introduction to probability in Lecture 3 but we will formalize some basic concepts here and introduce some new ones.

Four basic rules of probability:

1.  The probability of an event, x, is expressed as a fraction between 0 and 1, inclusive

0 ≤ Pr(x) ≤ 1

2.  Impossible events have a probability of 0 and certain events have a probability of 1

3.  The sum of the probabilities of all possible events is 1

4.  The complement of an event (or set of events) is all other possible events (those not part of the set), and the probability of the complement of an event is 1 minus the probability of the event

Pr(complement of x) = 1 - Pr(x)

Adding Probabilities - We have (in Lecture 3) covered the way to add two mutually-exclusive events [Pr(A+B) = Pr(A) + Pr(B)] and how to add two events that are not mutually exclusive [Pr(A+B) = Pr(A) + Pr(B) - Pr(AB)].

Multiplying Probabilities - In Lecture 3, we introduced multiplying probabilities through the use of a probability tree.  To use the tree, we had to assume that the two events were independent events.

What if the outcome of one event affects the probability of a second event occurring?  We call these dependent events, not surprisingly, and we need a second formula for multiplying these events.

Dependence in the real world is often more subtle than this example.

So, why and how would we multiply dependent probabilities?  Let's consider a situation in which dependence applies.

Suppose you have a bag of M&M candy, say 10 pieces in the bag.  You are thinking of offering two friends a chance to reach in and choose a piece but are a bit worried.  You like the new blue colored pieces the best and will only offer the candy if the chance of losing two of the blue is sufficiently small.  If there are only 2 blues, how do we calculate the chance that both friends will take a blue (assume that neither can see the piece they are choosing) and leave you bereft of the choicest M&Ms?

The first draw yields a chance of 2/10, or 1/5

Given that the first draw took one of the precious blues, the chance that the second will also be a blue is 1/9. 

We had to reduce both the total number of blue pieces and the total number of pieces by 1 due to the outcome of the first draw.

The probability of both events occurring is then 1/5 x 1/9 or 1/45.  That's low enough for all but the most abject chocoholics and you would probably decide to share.

Suppose you were at a Christmas party where all 10 attendees brought a gift, each with a label indicating who brought it.  Who gets which gift will be decided by writing the attendee's names on identical slips of paper, putting them into a hat, and letting everyone take a slip and open the present they have chosen, even if it's the gift they brought.  During the party, you find out that there are two presents you would really like to have.  When the gift giving begins, you happen to be sitting so that you will be the third person to choose a present.  Before the first person chooses, it occurs to you that both of the desirable presents may be chosen by the time you choose and you ask yourself, "What's the probability of that!"

The chance of a desirable gift being drawn by the first to choose is 2/10 or 1/5.

Given that the first draw took one of the precious gifts, the chance that the second person to choose will take the other good gift is 1/9. 

The probability of both events occurring is then 1/5 times 1/9 or 1/45.  Once you realize this, you stop worrying.

We can formalize this by introducing a new wrinkle in our probability notation, Pr(B|A).  The line is vertical, so it does not indicate a fraction, and the expression is read "the probability of event B given that event A has occurred" or, more briefly, "the probability of B given A."

Pr(B|A) is called the conditional probability of B given A.

So, the multiplication of two dependent events is:

Pr (A and B) = Pr(AB) = Pr(A) x Pr(B|A)

To see if you understand, try calculating the following.  Two cards are drawn from a deck and are not replaced.  What is the probability of drawing two aces?  of drawing an ace and a king, in that order?  The answers are 4/52 x 3/51 or 12/2652 and 4/52 x 4/51 or 16/2652.

This logic can be extended to three dependent events.  What is the probability of drawing three aces in three cards?  An ace, then a king, then a queen?  The answers here are 4/52 x 3/51 x 2/50 or 24/132,600 and 4/52 x 4/51 x 4/50 or 64/132,600.
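These card-draw answers can be checked exactly with Python's fractions module, which multiplies the conditional probabilities without rounding:

```python
from fractions import Fraction

# Draws without replacement: each factor conditions on the previous draw.
two_aces = Fraction(4, 52) * Fraction(3, 51)        # 12/2652
ace_then_king = Fraction(4, 52) * Fraction(4, 51)   # 16/2652

three_aces = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)      # 24/132600
ace_king_queen = Fraction(4, 52) * Fraction(4, 51) * Fraction(4, 50)  # 64/132600
```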

The formulation above can be rearranged using some simple algebra:

Pr(B|A) = Pr(A and B) / Pr(A) = Pr(AB) / Pr(A)

Reading this in English produces "The probability of B given A is equal to the probability of both A and B occurring divided by the probability of A"

We need to do two things now.  One is to understand what was just said above by going through an example, and the other is to understand the implications of this formulation.  They are profound, a term I do not use lightly.

First, an example.  This will illustrate the formula and give you some additional experience in the use of a probability tree.  The probability of being born a male Drosophila is 1/2 (their sex determination system is similar to ours).  Suppose that the probability of a Drosophila having the ability to detoxify the insecticide DDT depends on its sex: 1/4 if it's a female but 1/2 if it's a male.

What's the chance that a newly deposited Drosophila egg will be resistant to DDT?

This answer requires the application of the first formulation, Pr(A and B) = Pr(AB) = Pr(A) x Pr(B|A), for each thread of outcomes that leads to a resistant fly.  Summing over both threads of the probability tree:

Pr(resistant) = (1/2)(1/4) + (1/2)(1/2) = 1/8 + 1/4 = 3/8
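The tree can be checked with exact arithmetic. The sketch below (using the probabilities stated above) also turns the conditional probability around with the Pr(B|A) = Pr(AB)/Pr(A) formula to ask what fraction of resistant flies are female:

```python
from fractions import Fraction

# Probability tree for the Drosophila example: sex first, then resistance.
p_female = Fraction(1, 2)
p_male = Fraction(1, 2)
p_resist_given_female = Fraction(1, 4)
p_resist_given_male = Fraction(1, 2)

# Pr(A and B) = Pr(A) * Pr(B|A), summed over both threads of the tree
p_resist = (p_female * p_resist_given_female
            + p_male * p_resist_given_male)   # 1/8 + 1/4 = 3/8

# Conditional probability the other way:
# Pr(female | resistant) = Pr(female and resistant) / Pr(resistant)
p_female_given_resist = (p_female * p_resist_given_female) / p_resist  # 1/3
```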

Terminology

Monte Carlo Simulation

This is the basic simulation technique used to simulate long-term real world outcomes based on immediate probabilities of outcomes.

Its uses are seen most readily in the description of the Monte Carlo procedure
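As an illustration of the Monte Carlo idea, the M&M example from above (2 blue pieces among 10, two draws without replacement, exact answer 1/45) can be simulated by repeating the random draws many times and counting outcomes:

```python
import random

# Monte Carlo estimate of the chance both friends draw a blue M&M.
random.seed(1)            # fixed seed so the run is reproducible
trials = 200_000
bag = ["blue"] * 2 + ["other"] * 8

both_blue = 0
for _ in range(trials):
    draw = random.sample(bag, 2)   # two draws without replacement
    if draw == ["blue", "blue"]:
        both_blue += 1

estimate = both_blue / trials      # should be close to 1/45, about 0.022
```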

Markov Chain Simulations

This is a useful method of predicting change in a system over time where there are different states for any member of the system and known probabilities of transitioning from one state to any other state during a given period of time

a simulation will begin with a set of individuals, each in one of the four possible states

an iteration will take the situation as it is and move individuals from state to state based on the probabilities of transitioning from the current state to the other states (including no change of state)

successive iterations will simulate the most probable outcome for that system at some time in the future
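A minimal sketch of the iteration, assuming a hypothetical two-state system and a made-up transition matrix (a real example might have four states, but the mechanics are the same):

```python
# Markov chain sketch: P[i][j] = Pr(move from state i to state j in one step).
# Each row must sum to 1 (hypothetical transition probabilities).
P = [[0.9, 0.1],
     [0.5, 0.5]]

state = [1.0, 0.0]   # start with everyone in state 0

# Each iteration multiplies the state distribution by the transition matrix
for _ in range(100):
    state = [sum(state[i] * P[i][j] for i in range(2)) for j in range(2)]

# After many iterations the distribution settles to the stationary
# distribution of the chain, which for this matrix is (5/6, 1/6)
```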

Markov Chain Monte Carlo Simulations (MCMC Simulations)

If individuals are moved from state to state using a Monte Carlo approach, we refer to the model as an MCMC model