BIOL 3110
Biostatistics, Phil Ganter, 301 Harned Hall, 963-5782
Basic Experimental Design - Chapter 08
Unit Organization
Problems
Problems for homework (assume all numbers to be preceded by 8.)
- 2, 3, 8, 15, 16, 20, 26, 27, 29, 34, 38
Suggested Problems
- any of the additional problems in the chapter (there are few enough to examine them all)
In an OBSERVATIONAL STUDY, the researcher gathers data by observing a situation that would occur without the researcher's presence or effort.
Statistical tests, like the t-test, are used here to detect differences among groups of observations, just as in experiments.
OBSERVATIONAL UNITS are the persons, things or situations that are observed.
VARIABLES are conditions that can take on more than one value during the experiment. Variation can be qualitative or quantitative.
A RESPONSE VARIABLE is the quantity or quality of interest that should change during the period of observation. There is often one but there may be more than one response variable in an observational study.
EXPLANATORY VARIABLES are the quantities or qualities that are measured by the observer to explain the changes in the response variable.
EXTRANEOUS VARIABLES are the quantities or qualities that are not measured by the observer but effect changes in the response variable.
Problems with Observational studies
Nonrandom selection of observations (sometimes non-independent)
Uncontrolled extraneous variables
These problems make it difficult to determine cause and effect relationships in observational studies.
We usually say that outcomes are ASSOCIATED, rather than that one causes the other.
By observing, we cannot tell whether one thing causes another or whether the purported cause simply precedes the effect, even if causation seems logical based on current beliefs.
SPURIOUS ASSOCIATION
Both cause and effect can be the effects of a third factor.
If A and B occur, with A preceding B, does A cause B (if A, then B)?
Not if C causes A and then C causes B (if C, then A and then B).
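The common-cause situation above can be sketched with a small simulation. This is only an illustration with made-up numbers: the function names, the sample size, and the noise model are assumptions, not part of the course material. C drives both A and B, A never enters the calculation of B, and yet A and B come out correlated until C's contribution is removed.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_common_cause(n=2000, seed=1):
    """C drives both A and B; A never appears in the formula for B."""
    rng = random.Random(seed)
    c = [rng.gauss(0, 1) for _ in range(n)]
    a = [ci + rng.gauss(0, 1) for ci in c]  # A = C + noise
    b = [ci + rng.gauss(0, 1) for ci in c]  # B = C + noise (no A term)
    return c, a, b


c, a, b = simulate_common_cause()
r_ab = pearson_r(a, b)

# Subtract C's contribution from A and B; what is left is independent noise.
resid_a = [ai - ci for ai, ci in zip(a, c)]
resid_b = [bi - ci for bi, ci in zip(b, c)]
r_ab_given_c = pearson_r(resid_a, resid_b)

print(f"r(A, B) = {r_ab:.2f}; after removing C: {r_ab_given_c:.2f}")
```

An observer who recorded only A and B would see a clear association, but it disappears once C is accounted for.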
CONFOUNDING
Confounding occurs when explanatory or extraneous variables are not independent of one another.
Example from my work.
Yeast communities are found in cacti from Ontario, Canada to Patagonia in Argentina.
Yeast communities are found in many different species of cacti.
Data exists from collections taken from many locales and many species of cacti.
Can we separate the effects that distance has on yeast communities from those that different species of cacti have?
No, for the most part. Many locations have only one species of cactus, so we cannot tell whether the differences found there are caused by different host plants or arise because the community is isolated by distance from other yeast communities.
Thus, in these studies, host species and collection locale are Confounded.
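One way to see where host species and locale are confounded is to tabulate which hosts were collected at each locale. The sketch below uses invented placeholder records (not real collection data) and hypothetical function names. Any locale with a single host species cannot separate the two effects.

```python
from collections import defaultdict


def hosts_per_locale(samples):
    """Map each locale to the set of host species collected there."""
    hosts = defaultdict(set)
    for locale, host in samples:
        hosts[locale].add(host)
    return dict(hosts)


def confounded_locales(samples):
    """Locales with only one host species: host and locale effects
    cannot be told apart using data from these locales."""
    by_locale = hosts_per_locale(samples)
    return sorted(loc for loc, hosts in by_locale.items() if len(hosts) == 1)


# Invented placeholder records: (locale, host species)
samples = [
    ("Locale 1", "Cactus species X"),
    ("Locale 1", "Cactus species X"),
    ("Locale 2", "Cactus species Y"),
    ("Locale 3", "Cactus species X"),
    ("Locale 3", "Cactus species Y"),
]

print(confounded_locales(samples))
```

Only a locale like "Locale 3" in this toy data, where two hosts grow side by side, lets us compare hosts while holding locale constant.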
We use the observational approach when the experiment is difficult, costly or impossible to perform.
CASE-CONTROL STUDIES
Case-control studies match up similar situations (each case is an observational unit) for comparisons, so that extraneous variables have less effect on the outcome.
EXPERIMENTS are studies where the investigator determines some or all of the important conditions affecting the outcome.
EXPERIMENTAL UNITS are the people, things, or situations studied in an experiment.
Working with humans presents special problems, both practical and ethical.
The practical problems are our focus of interest.
Humans can perceive the design of an experiment and may alter their response in light of that perception.
BLINDING is a fix for this problem that involves not allowing the experimental unit (the person) to know which level of the explanatory variable (or variables) they are experiencing.
The person who gathers the data may also affect the outcome of an experiment unfairly (even if unconsciously).
DOUBLE BLINDING is a fix for this problem that involves keeping both the subject of the experiment and those who gather the data from knowing which level of the explanatory variable applies to a particular observation.
Experimental Terminology
TREATMENT is an explanatory variable that is manipulated by the experimenter. There may be more than one in an experiment. Treatment variable is another name for an explanatory variable. It is the hypothesized cause for the effect measured by the response variable.
TREATMENT LEVEL is one of the quantities or qualities of the treatment to which the experimental units are exposed. There may be as few as two (never just one if one considers no manipulation as one of the levels, see below).
CONTROL is the treatment level that represents no manipulation. It is designed to measure or detect the outcome if no manipulation of the explanatory variable were done. It is often the zero treatment level.
Good controls are necessary if one is to be able to properly evaluate the null hypothesis, because control treatments (or zero treatment levels) show how much of the change in the response variable is due to extraneous or confounding variables rather than to the treatment.
NEGATIVE CONTROL is a control for the absence of a change in the response variable when no manipulation is done. For instance, if PCR is used to produce DNA when the template is added, there is a chance that other DNA may contaminate the procedure and produce a band even when the proper template is not there. A negative control would be a tube to which everything was done EXCEPT THE ADDITION OF THE TEMPLATE. It should produce no band in the subsequent gel.
POSITIVE CONTROL is a control for the ability of the response variable to change when a known manipulation is done. If the response depends on the detection of something (presence of a protein on a gel, release of light, etc.) then a positive control checks for the response when the experimenter adds protein to the procedure or induces light. For instance, running DNA size markers in one or more lanes will serve as a positive control that the gel worked and that DNA should have separated by size. Another example can be described for the PCR experiment above, in which the template you are searching for is added to one tube to be sure that, if the right template is found in an experimental unit, it will be amplified and appear on the gel as a band.
PLACEBO is a special control found in some experiments with people as the experimental units and it illustrates the subtlety of designing the right controls. Humans expect to get better when given a treatment or a pill. They may subsequently report recovery or actually experience recovery simply from that expectation, no matter whether or not the pill represents a non-zero treatment level. Thus, to control for the pill effect, pills had to be given to all in order to detect the effect of the treatment. However, this is just another example of a control.
HISTORICAL CONTROL is a control that is completed before the experimental manipulations are done. This is often necessary when one is treating people, because withholding treatment from someone is not ethical, so those not treated are those who had the illness before the new treatment was available.
There is a second flavor of historical control that is part of a Natural Experiment, which is explained in your ecology class.
BIAS is variation that is the result of a lack of randomness or independence. Many psychology experiments have been done at universities, with an over-abundance of students as subjects. Students may not represent a truly random sample of any population except university students, and that is probably not the population the researcher intended to investigate, so this may represent a bias. One might say that the tendency for people to react to a pill by feeling better is a bias. PANEL BIAS is a bias that results from the altered behavior of the people in an experiment. Once you tell them they are in an experiment and something of the rationale and expected outcomes, they may alter their behavior simply as a result of this knowledge.
Importance of Randomizing
We have discussed random allocation previously, but the importance of this is re-emphasized here.
The reason to do this is to eliminate bias in the match of experimental units to treatments. This is most effectively done in a COMPLETELY RANDOMIZED DESIGN, in which experimental units are assigned to a treatment level randomly, such that each unit has an equal chance of ending up in any of the groups.
This means that there may not be equal numbers of units assigned to each treatment level.
An acceptable departure from this is to randomly assign equal numbers of the pool of experimental units to each treatment level. Some statistical tests require, or work better, if all groups have the same number of units in them.
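Both allocation schemes described above can be sketched in a few lines. The unit names, treatment levels, and function names below are hypothetical, and the code is a minimal illustration rather than a prescribed procedure.

```python
import random
from collections import Counter


def completely_randomized(units, levels, rng):
    """Each unit independently gets a random level, so group sizes may differ."""
    return {unit: rng.choice(levels) for unit in units}


def balanced_randomized(units, levels, rng):
    """Shuffle the units, then deal them out in turn so that group sizes
    are equal (or differ by at most one)."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: levels[i % len(levels)] for i, unit in enumerate(shuffled)}


rng = random.Random(42)  # fixed seed only so the sketch is repeatable
units = [f"unit_{i}" for i in range(12)]
levels = ["control", "low", "high"]

simple_plan = completely_randomized(units, levels, rng)
balanced_plan = balanced_randomized(units, levels, rng)

print("completely randomized:", Counter(simple_plan.values()))
print("balanced:", Counter(balanced_plan.values()))
```

In the first scheme the group counts are left to chance. In the second, every level receives exactly four of the twelve units, but which four is still random.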
Haphazard is not Random
Much bias is not conscious, so simply not thinking about which unit to choose does not eliminate bias.
If you are choosing cattle for feeding experiments by going to the edge of the herd and grabbing the first cow you come to each time you choose, you are assuming that the cows are located in the herd randomly. If smaller, weaker cows are pushed to the edge, then you are picking them first and whichever treatment level is getting filled first will be filled with the smaller, weaker cows.
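The cattle example can be mimicked with a toy simulation. Everything here is invented for illustration: the herd size, the weights, and especially the assumption that lighter cows always stand nearer the edge, so that "grab the nearest cow" takes the herd in order of increasing weight.

```python
import random
from statistics import mean

rng = random.Random(7)

# Hypothetical herd of 40 cows, weights in kg.
weights = [rng.gauss(500, 40) for _ in range(40)]

# Haphazard choice: the nearest (lightest, by assumption) cows are taken
# first, so the first treatment level fills up with the lightest 20 animals.
edge_order = sorted(weights)
haphazard_a, haphazard_b = edge_order[:20], edge_order[20:]

# Random choice: shuffle the whole herd before splitting it in two.
shuffled = weights[:]
rng.shuffle(shuffled)
random_a, random_b = shuffled[:20], shuffled[20:]

haphazard_gap = mean(haphazard_b) - mean(haphazard_a)
random_gap = abs(mean(random_b) - mean(random_a))

print(f"difference in mean weight, haphazard groups: {haphazard_gap:.1f} kg")
print(f"difference in mean weight, random groups:    {random_gap:.1f} kg")
```

The haphazard groups differ in weight before any feed is given, which is exactly the kind of bias random allocation is meant to remove.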
A note on Observational and Experimental Units
In order to use statistics, one must define a population under study. Samples are taken from that population.
When designing an experiment or conducting an observational study, the experimental unit must be a representative of that population.
In the simplest cases, these units are members of the population being studied.
If you are interested in hibernation in squirrels, the experimental units are individual squirrels.
In some cases, this simple scenario becomes more complicated.
Sometimes one cannot use members of the actual population in an experiment.
Ethical requirements prevent many kinds of experimentation on humans.
In this situation, we often perform the experiments on a MODEL population.
The model population is assumed (or has been demonstrated) to respond to the treatments in the same manner as the real population of interest.
Rats, mice, monkeys, pigs, dogs, and other mammals have been used as models for physiological studies where the real population of interest is the human population.
Sometimes it is impractical, rather than unethical, to use the real population, and a model is employed.
In many cases (especially in ecological studies), the question of interest involves not just a population of organisms but also includes the environment in which those organisms reside.
Here, the experimental unit is both the organism and the environment, since the interaction is the object of the study.
I study yeast that grow on pockets of damaged cactus tissue. Although I am interested in the yeast in their natural habitat, it is difficult to do experiments in the field. Many experiments aimed at understanding how the yeast interact have been done on cactus tissue in glass vials in a laboratory. The assumption is that the vials are a reasonable model of the natural cactus rot.
Sometimes practical considerations interfere with the perfect experimental design. Here we discuss some strategies to overcome the problems caused by limited resources for experimentation.
Blocking (or Stratification)
An experimental BLOCK is a group of experimental units that are suspected or known to be more similar to one another than to other experimental units.
Blocks usually represent the effect of an extraneous variable we can't avoid, so we plan for it.
Suppose you have to keep your rats in a rack, and the rats can see that the light is brighter in the top row than in the middle or bottom rows. If you suspect that this could alter your rats' response to the treatment, then blocking is the answer.
RANDOMIZED BLOCKS DESIGN
This design assigns treatment levels randomly to the experimental units in each block.
If you are working on 2 tables in a room with windows, and the difference in light and temperature caused by this might reasonably have an effect on the response variable, then you might divide the room into two blocks (the table near the window and the one away). After doing this, assign treatment levels randomly to each of the experimental units on one table before moving on to the next table. Equal numbers of each treatment level occur in each block.
If you want to test the effect of fertilizer and you have two fields, would it be right to put fertilizer on one and not the other?
No, one field may be better than the other.
Consider each field a block and divide it into separate experimental units and assign fertilizer or control level to each unit randomly within each block.
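The fertilizer example can be laid out as a randomized blocks design in a short sketch. The plot names, the number of plots per field, and the function name are assumptions made for illustration.

```python
import random


def randomized_blocks(blocks, levels, rng):
    """Within each block, randomly assign treatment levels so that every
    level appears equally often in that block."""
    assignment = {}
    for block_name, units in blocks.items():
        if len(units) % len(levels) != 0:
            raise ValueError(f"{block_name}: units cannot be split evenly among levels")
        reps = len(units) // len(levels)
        labels = list(levels) * reps  # equal numbers of each level
        rng.shuffle(labels)           # random order within this block only
        for unit, level in zip(units, labels):
            assignment[unit] = (block_name, level)
    return assignment


rng = random.Random(3)  # fixed seed only so the sketch is repeatable
blocks = {
    "field_1": [f"field_1_plot_{i}" for i in range(6)],
    "field_2": [f"field_2_plot_{i}" for i in range(6)],
}
plan = randomized_blocks(blocks, ["fertilizer", "control"], rng)

for unit, (block_name, level) in plan.items():
    print(unit, block_name, level)
```

Because each field receives three fertilized and three control plots, a difference between the fields themselves cannot masquerade as a fertilizer effect.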
We have already considered replication, in part. Here we expand our look at a very important and difficult subject.
Replication is a necessary part of any statistical analysis.
A sample size of 1 is not enough from which to draw statistical inferences (a sample size of one is an ANECDOTE, not a sample, although many are persuaded by mere anecdotes).
PSEUDOREPLICATION (introduced in Chapter 6 but not given a name)
This is a mistake caused by failure to remember that one is trying to make a statement about a particular population.
Replicates are repeated observations of the members of that population, not repeated measurements from one member of the population.
Each experimental unit must represent a member of the population.
When the pseudoreplicates are repeated measurements from a single organism and we wish to know something about all organisms belonging to that species, pseudoreplication is easy to spot.
In other situations it can be harder to spot.
In fact, I believe that your book has committed the mistake on pages 334-337 in Example 8.25, Germination of Spores, in the very example meant to illustrate how not to pseudoreplicate!
What is the population the mycologist wishes to investigate?
It is a species of fungus that causes a disease of corn. This can only mean that they want to know something about all of the members of the species.
Notice that all of the spores come from a single fungus culture that was started from a single spore.
Can we conclude something about all of the members of this species from a single spore culture?
My feeling is that, no, we can't. It is a sample size of 1, and we need a sample size of at least 2 before we can do the statistics to answer the question. This was a poorly designed experiment which assumed one spore could represent an entire species.
All of the replicates defined by the authors (three plates per treatment) suffer from a lack of independence that can not be overcome. Read the last paragraph on page 337 before the section on Determination of Sample Size. There the authors clearly point out why one plate per treatment would not work. They forget that one individual per experiment will also not work.
The problem is that the authors mistook the plate for the experimental unit. In fact, the experimental unit is a member of the disease-causing species of fungus.
This is a problem that arises when the organism is clonal, so that it can be divided into an infinite number of what look like different individuals but are not, and many, many, many researchers have fallen into this trap.
The mistake is not in gathering the data. It is in the analysis.
NESTED OBSERVATIONS
Repeated observations from a single experimental unit, or even from a sub-unit.
Nested observations are part of a hierarchical design, which is an important type of experimental design.
NESTING allows you to estimate variation within the unit or subunit and that can be important in understanding differences at higher hierarchical levels.
Even when the design does not have to be hierarchical, nesting can be useful.
If there is a lot of within-unit variation, repeated samples from the unit can be combined into some measure of central tendency (the mean is the usual one chosen, but is not always the best), and the measure of central tendency can be used as the best estimate of the unit's true value.
Multiple measurements within an experimental unit do not become pseudoreplication until they are improperly incorporated into an analysis.
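The distinction between nesting and pseudoreplication can be sketched as follows, using invented measurements and a hypothetical helper name. The repeated measurements are collapsed to one value per experimental unit before any comparison is made, so the sample size used in the analysis is the number of units, not the number of measurements.

```python
from statistics import mean


def unit_means(measurements):
    """Collapse repeated measurements to one value per experimental unit."""
    return {unit: mean(values) for unit, values in measurements.items()}


# Invented data: three repeated measurements on each of four units.
measurements = {
    "unit_1": [4.1, 4.3, 4.2],
    "unit_2": [5.0, 4.8, 4.9],
    "unit_3": [3.9, 4.0, 4.1],
    "unit_4": [4.6, 4.4, 4.5],
}

per_unit = unit_means(measurements)
n_measurements = sum(len(values) for values in measurements.values())
n_for_analysis = len(per_unit)

print(f"{n_measurements} measurements, but n = {n_for_analysis} for the analysis")
```

Treating all twelve measurements as twelve independent replicates would be pseudoreplication. Using the four unit means keeps the nested data without overstating the sample size. The mean is used here only because it is the usual choice, not because it is always the best one.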
ERROR
Error, Variation, and Mistakes
We expect that, when we choose members of the population to be in a sample, they will differ from one another with respect to whatever we are measuring.
Understanding this discrepancy requires that we define two ideas, variation and error. In addition, error is often used to mean a mistake in common speech, so we need to separate statistical error from this sort of error.
MISTAKES are incorrect choices made by individuals and we will keep this idea separate from that of statistical error by using mistake for poor choices and error only in the statistical sense.
Notice that I used the word mistake when I described the problems with the textbook's discussion of pseudoreplication.
VARIATION is the difference between members of the population under study.
Variation and error arise from the fact that not all measurements are the same.
One way to measure variation in the population is with the standard deviation - this is not a type of error.
Notice (see the discussion of error below) that each data point is affected by measurement error, so standard deviations reflect both the true differences among experimental units and the measurement error associated with collecting the data.
ERROR arises when there is a difference between an estimated value and some actual (=true) value.
This error is not error in the sense of a mistake (as in the pseudoreplication problem above) but is an unavoidable consequence of our methods (part of the structure of our world).
MEASUREMENT ERROR is caused by the inaccuracy of our measurement method. Anyone who has used a balance knows about this sort of error.
In the book, this is called NONSAMPLING ERROR, but it is the same. They use examples from survey data and, in this case, the survey is the measurement method.
The book refers to survey problems like non-response bias, but we will not deal with these problems here. Using survey data has long been studied by sociologists and is too deep for us in this course.
Notice that each data point will be affected by this sort of error.
SAMPLING ERROR is caused by the inaccuracy introduced when using a sample instead of the entire population.
When we draw a sample from the population in a random fashion and calculate a mean from this, we acknowledge that the sample mean is the best estimate of the population mean, but we also recognize that it may differ from the actual mean.
The typical size of the difference between the sample mean and the true mean is estimated by the standard error, which is a form of statistical error because it is the result of the sampling procedure.
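A short simulation can make the standard error concrete. The population values below are invented, and the helper name is hypothetical. The sketch computes the usual standard error of the mean (sample standard deviation divided by the square root of the sample size) from one sample, then draws many samples to show how much sample means actually vary around the true mean.

```python
import random
from statistics import mean, stdev


def standard_error(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(sample) / len(sample) ** 0.5


rng = random.Random(5)
POP_MEAN, POP_SD, N = 100.0, 15.0, 25  # invented population and sample size

# What a researcher has: one sample, and the standard error computed from it.
one_sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
se_from_one_sample = standard_error(one_sample)

# What the researcher cannot normally see: the spread of many sample means.
sample_means = [
    mean(rng.gauss(POP_MEAN, POP_SD) for _ in range(N)) for _ in range(2000)
]
spread_of_means = stdev(sample_means)

print(f"standard error from one sample:   {se_from_one_sample:.2f}")
print(f"actual SD of 2000 sample means:   {spread_of_means:.2f}")
print(f"theoretical value, sigma/sqrt(n): {POP_SD / N ** 0.5:.2f}")
```

The standard error from a single sample is itself an estimate, so it will not match the other two numbers exactly, but it should be of the same size.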
When error is not the result of a random process, whether it is measurement or statistical error, it becomes BIAS.
Bias is systematic error, error that is not random.
If your scale is not zeroed, then all of the weights you take may be too large, so your estimate is biased towards over-estimating the weights.
If your sampling procedure is not random, you may pick individuals who all share some quality, even though not all members of the population have that quality. Since the sample is not a true reflection of the differences found in the population, this is a bias.
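The un-zeroed scale can be simulated to show how bias differs from random measurement error. The true weight, the offset, and the noise level are all invented for this sketch.

```python
import random
from statistics import mean

rng = random.Random(11)

TRUE_WEIGHT = 50.0  # grams (hypothetical object)
OFFSET = 2.0        # the un-zeroed scale reads 2 g heavy (hypothetical)


def weigh(true_weight, rng, offset=0.0, noise_sd=0.5):
    """One reading: true weight + systematic offset + random measurement error."""
    return true_weight + offset + rng.gauss(0, noise_sd)


zeroed = [weigh(TRUE_WEIGHT, rng) for _ in range(1000)]
unzeroed = [weigh(TRUE_WEIGHT, rng, offset=OFFSET) for _ in range(1000)]

print(f"mean of 1000 readings, zeroed scale:    {mean(zeroed):.2f} g")
print(f"mean of 1000 readings, un-zeroed scale: {mean(unzeroed):.2f} g")
```

Averaging many readings shrinks the random error on both scales, but the readings from the un-zeroed scale converge on the wrong value. More data does not fix bias.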
If a sample is biased, the statistical tests covered in this course and in your book are not applicable.
When can you use Statistical Inference?
STATISTICAL INFERENCE is the process of using statistics applied to a sample to INFER something about a population.
It is inference because you are using a particular case, the samples you draw, to say something general about the entire population.
When it is done correctly, you can infer cause and effect relationships.
The book has a nice summary of some of the points in this chapter under the heading "Scope of Inference"
Method of choosing experimental units (columns) by Sampling Method (rows)
| Sampling Method | Experimental units randomly chosen | Experimental units not randomly chosen |
| Random | Cause-Effect inference possible | Cause-Effect limited to significant sample patterns but can't conclude these reflect population patterns |
| Not Random | No Cause-Effect but sample patterns reflect population patterns | No statistical inference is possible |
Last updated March 23, 2006