Sunrise at Canis Bay Lake, Algonquin N.P., Ont, Canada

BIO 412

Principles of Ecology

Phil Ganter

302 Harned Hall

963-5782

Lab 3 Spreadsheet Graphics

Email me

Back to:

Academics Page
Tennessee State Home page
Bio 412 Page
Ganter home page

Introduction:

This lab is designed to be a brief introduction to the use of spreadsheet graphics. We will use the spreadsheet program available in the biology computer lab, MS Excel. However, as before, there are more similarities among spreadsheet programs' graphics than differences, so, once you learn to use this one, you should be able to figure out how to use other programs. Just keep in mind that there is a way to do what you want to do and don't give up until you find it.

This lab should be read when you are sitting in front of a computer with a spreadsheet program running. It was written for the MSExcel program, version 98, and was intended for a Macintosh computer. However, most of it will apply to earlier versions of the Excel program and to MS Excel running within a Windows operating system. Remember, if you are using Windows, it may be more convenient to resize the resolution of the monitor so that you see more of the spreadsheet at one time. If you can't remember how to do it, go back to the first lab.

Graphics are the most intuitive way of presenting or summarizing data. However, some rules apply. The first applies to labeling.

All graphs should have a title that explains what the graphic is intended to show.

All graphs and charts have axes and most axes are measured in some unit value. All axes should be labeled and the units clearly labeled.

In presenting your data from the labs, it is sometimes specifically requested that a graphic be used and sometimes even the kind of graphic is specified. Sometimes, the means of presenting the data is not specified and you will choose between a table and a graphic. No matter how specific the instructions, you must be familiar with the kinds of graphics, why a particular graphic is the correct choice for the particular data, and you must be able to interpret and draw conclusions from the graphic. In addition, there are always formatting decisions you will make. It is hoped that the following definitions, descnriptions, and exercises will help.

Some Useful Terms:

  • Frequency -- the number of times a value occurs in a dataset.
    • A Frequency Table has two columns
      • one for each unique value that a dataset contains (ignoring repeated values)
      • one for the number of times each unique value occurs
    • Frequency tables are the first step in constructing a histogram
    • Frequency tables are useful for some statistical calculations
    • Example of a frequency table (Frequency of different size fruit sampled from an orchard. The sizes refer to inches in circumference). You can see that there were 44 fruits found that were 4 inches in circumference.

Size

Frequency

1

22

2

34

3

56

4

44

5

16

 

  • Numeric and Categorical data are different kinds of data, and different graphical methods are used to display different kinds of data.
    • Categorical data is data that indicated an object belongs in a particular category, such as the sex of an individual, its genotype, or its collection locale
    • Numeric data involve numbers (what do you know - an obvious term!) and are either:
      • meristic (data that is counted -- i. e. number of individuals, Drosophila bristle number, etc.). Also called discrete data.
      • metric (measured values -- i. e. length, weight, etc.). Also called continuous data.
  • Grouping -- grouping data is done to scale data or to change it from nominal to categorical data. This can be necessary for many reasons. The following list is not exhaustive but does cover the two most common reasons for grouping.
    • If you have a lot of values with a frequency of 1 or 2, then grouping can gather similar values into groups with larger frequencies, which may make a much more useful graphic
    • when you want to make a histogram, but the data is metric (continuous), it may be that no two values are the same. Consider the situation where you carefully weigh 100 insects. If the scale is accurate, it is likely no two animals will have exactly the same weight. In this case grouping can form classes of weights, each data point can be placed into a category, and the frequency of weights in each category can be counted.

Excel Data Files:

  • Workbook - a workbook is an excel data file. It is what opens when you open excel.
  • Worksheet - look at the bottom of a workbook. There are several tabs there. Click on one and you go to another worksheet within the workbook. Think of them as pages in a book.
    • You can change the name of the worksheet (written on the tab) by double clicking on the name presently on the tab. This will select it and change the cursor to the text editing cursor (a vertical blinking line). You can now change the name to whatever you wish.
    • You can also change the order of the worksheets. Click on a tab, hold the mouse button down, and drag the tab to another position in the line of tabs. Release the button and the move is completed.
    • Worksheets are linked together, so that you can calculate a value in one worksheet from data on another worksheet. Try it by adding two values from one worksheet on a second worksheet.
      • Enter a 2 and a 3 in one worksheet.
      • Click on the tab for a second worksheet. Select a cell and type =
      • Click on the tab for the original worksheet and click on the cell with 2.
      • Go back to the original worksheet (some versions of Excel do this automatically) and type + in the cell with the summation. Do not click on the cell before typing, as the cursor is already in the correct cell. Just type +.
      • go back to the original data sheet and click on the cell with 3.
      • Finally, go to the worksheet with the summation and look at the formula. Notice that the cell references are preceded by a reference to the worksheet with the cell. Hit return (Mac) or enter (Windows) and the answer should appear in the cell.

Kinds of Graphics covered here:

Excel seemingly has an endless way of graphing things. Pies, Stars, Polar charts, 3-D, whatever. Most of them have limited utility and we will neither cover them here nor use them in the course. We will concentrate on the four types below.

  • Histogram
  • Bar Chart (horizontal or vertical bars)
  • Line Graph
  • Scatter Plot

Histograms and frequency tables:

Histograms are graphs which have a frequency of occurrence as the Y-axis and a categorical variable as the X-axis. The categorical variable can be converted from a numerical variable if it is discrete but continuous variables must be grouped before they are suitable for constructing histograms (discrete variables may need to be grouped also if there are two many of them or if the frequencies are too small)

Histograms are used to display how often a particular response occurs. Examples:

In many specie of birds, both the male and female feed the chicks. Is the effort comparable for both sexes? One way to demonstrate the effort would be to construct a histogram with sex as the X-axis and frequency of feeding visits to the chicks as the Y-axis.

We often hear that the heights of humans approximates a bell curve (or more accurately for scientists, a normal curve). To demonstrate this, one might measure 100 randomly chosen people of one sex. Then one should take the range of heights (which is a continuous variable) and make groups. Then count the frequency of heights in each group and make a graph. It should look like a bell curve if what everyone hears is true.

MSExcel will make histograms from raw data (data not in frequency format), but the program must have some extra tools added to it. Since there is no guarantee the these add-ons will be present all over campus, we will take a more labor intensive approach to making a histogram. It involves two separate steps: making the frequency table (this is the hard part) and then graphing the frequency table.

Frequency tables:

  • Please follow the directions below exactly and in exactly the same order. This is the most difficult command I know of in Excel but it can save lots of time if used correctly.
  • There is a frequency function, but do not use the insert menu to enter the command. You will do it by hand. First enter the following values in a column: 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 7, 7, 7. Start the data in cell A1 so that you finish in A16. If you do not use these cells, you will have to adjust what you type in when you are entering the frequency function. However, the function will actually take data that is in a row, or in a square or rectangular matrix, or in blocks that are not contiguous. A frequency table allows you to make the data more compact when there is a lot of it (as in the problems below).
  • take a look at the data - in this case it is easy to see that there are 5 categories (3, 4, 5, 6, and 7) and that each occurs with a frequency between 2 and 5. However, this is not always so easy to see. If the data is too numerous or involves continuous data, the best first step is to sort the data so that you can at least see the largest and smallest values.
  • before typing in the formula, you have to decide on what the categories will be. Here it is obvious, they should be 3, 4, 5 , 6, and 7. Enter these values in a column adjacent to and to the right of the data column and label it Categories in the top cell. In the cell to the right of the cell with "Categories" enter the word "Frequencies." This will act as the label for the column with the frequencies, which you will create next.
  • now you are ready to type in the function.
    • First select a column of empty cells that contains the total number of categories plus one more cell (in this case select 5 + 1 = 6 cells). You might want to do this in the column to the right of the column with the categories, under the "Frequencies" label
    • Second, while you are still "choosing" the cells to contain the frequencies, pull down the insert menu and click on "functions."
    • You will get a dialog box with the categories of functions on the left and the functions on the right. Click on the statistics category on the left and then move the slider until you see the "Frequency" function and choose it. You will get a dialog box.
    • There are two separate sets of data you need to enter into this dialog box before you can click the enter button. Click on the upper box (look for a blinking cursor in the box after the click). Choose the data to be counted for frequencies (move the dialog box out of the way if you need to and don't worry if the data is in more than one column). Now click on the lower box (look for a blinking cursor in the box after the click). Choose the categories you entered previously (move the dialog box out of the way if you need to). DO NOT HIT RETURN. If you do, you will have to start again. This function must be entered as an array function. To do this, hold down the apple key (shift and control keys on a Windows machine) and then hit the return button while holding them down.
  • you should see the frequency data in the selected columns (2, 3, 5, 3, 3, 0). The zero is the number in the "more than the last category" category. Since there are no values larger than 7, it is 0 here. To see what this means, select the last 7 in the data, type 8 and hit return. You should see the last frequency change to 1 and the one above it change to 2 from 3.
  • What about continuous data? Here the categories should be chosen depending on the largest and smallest values and the number of bars you want in the graph. You have to enter the upper value for each category and the program will place the data into categories by looking at successive entries in the category column.
    • Enter the values 3.23, 3.62, 4.32, 5.24, 5.33, 5.99, 6.01, 6.55 in cells D1 to D8
    • As the data is already sorted, we can see that the range is from about 3 to almost 7. A reasonable choice for categories in this case might be 4, 5, 6, and 7. I did not include 3 because there are no values lower than 3 (which would be the upper limit for the category). Enter the category values in a column next to the data (E1 to E4)
    • Select a column of 5 empty cells (F1 to F5) for the frequencies.
    • Follow the procedure above to insert the frequency function.
  • you should get the frequency table as the result (it should be 2, 1, 3, 2, and a 0 for the more than category)

Making the Bar Chart - this is described below

Bar charts:

Bar charts are graphs with categories along one axis and numerical values on the other axis. The bars indicate the value of each category. They can be horizontal or vertical, which is why I did not specify which axis gets the categorical data. You choose this by choosing the chart type when you make a chart. The categories can be types of things (species, colors of individuals, categories of tree height) and the values can be anything at all as long as they are numerical.

Bar charts are used to present visual comparisons between a small set of values (usually 10 or less, rarely as many as 30 values). Examples are the mean heights of humans by sex (2 values compared), average GPA of all TSU students by year (four values if all students are grouped into freshman, sophomore, junior, and senior categories).

To make a bar chart, you need to have the numerical data in an array (a series of cells in a row or in a column). If you want to label the categories, then you must also have them in an array. If you are willing to take the default labels (usually you are not), then you only need the array of values.

  • Go to cell I1 and enter the following values: 3.23, 3.62, 4.32, 5.24, 5.33 in a column from I1 to I5 (you can also copy them from cells D1 to D5). These are the mean weights in grams of individuals of 5 species of frog collected from the study site.
  • In cells H1 to H5 enter: Species A, Species B , Species C , Species D , Species E. These will be the category labels.

Before you start to make the chart, it is best to decide where it will be. Two options are available. It can be an entire, separate worksheet in the current workbook (the excel data file) or it can be a graphic in a worksheet already in the workbook. The choice is yours, but if it is a separate worksheet, then you must label the worksheet clearly so that I know which worksheet goes with which problem. In most cases, it is better to include the graphic as part of the worksheet with the data and any writing you may need to explain you answers.

  • Choose a place to anchor the graphic in the worksheet with the graphic. This is only a temporary choice, as you can re-size and move the graphic around in the workbook after you have created it. Cell H 7 will do fine here so click on that cell.
  • The graphing function is available as a icon on the toolbar at the top of the application window. It looks a bit like a bar chart. Click on it.
  • The first page asks you to choose the type of graphic. Choose either the bar or column chart (your choice, they only differ in the direction of the bars). Once you have made your choice, click next.
  • The second page asks which data to graph. Tell it by clicking on the line labeled and then clicking on cell H1 (don't let the mouse button up!) and dragging the cursor to cell I5 (now let up). The cell reference to the area should be in the line. Usually, the correct choice for the series is automatically chosen for you, so you can ignore the series buttons for now. Notice that there are two sub-pages available here and that you go between them using the tab a the top. Once you have made your selection, click next.
  • The third page of the dialog is where you make many of the choices about the format of the graph. There are several tabs and I will cover each below. Usually you must use the first three tabs, the fourth is commonly used, and the last two are rarely used.
    • Title -- sets title of overall graph and for each axis. Be sure you add titles, as they are as necessary as the data if you wish to communicate with others. Type in a descriptive overall title. Label each of the axes with what they are and with the units if the data is numeric. Notice that the labels are added to the graph as soon as you are through typing them. If you want to change something, just correct the line and the graph will be updated automatically.
    • Axes -- allows you to choose to display the axes labels or not. Usually you want both boxes checked. Try clicking on one to see what change it makes. You can always click on it a second time to undo the action.
    • Gridlines -- this allows you to format the gridlines. Usually, I prefer that they be off. They clutter the graph and do not really provide much information. Turn all gridlines off.
    • Legend -- this is useful only when you have more than one kind of data, as the program will give each a different bar color. In this case, there is only one kind of data, species weights, so it is not necessary. Click it off.
    • Data Labels -- this is an option rarely used. It will place labels near specific data points. This can be useful if you want to identify data points in other types of graphs (scatter plots) but is not usually useful here. You can play with it if you want.
    • Data Table -- this is also another rare option. Click on the box with "show table" next to it. It puts the data table in the graph. This might be useful if the graph were a separate worksheet, but it is not here and the data is close to the graph already. Unclick the box.
  • The last page allows you to choose where the graph will be. The top choice is a new worksheet. If you choose it, you should fill in a title for the worksheet. It will be used to label the tab that get you to the worksheet. Make your choice and click finish.

Now the graph appears in the spreadsheet (assuming you followed my advice and put it there). You can move and resize the chart. These rules apply to any chart or graph.

  • Move the cursor anywhere just to the inner edge of the chart. Click the mouse button and hold it down. You can now move it as long as you keep the mouse button down. Try it.
  • To resize the entire graph, you must have the object handles visible. These are little black boxes at the midpoints of the lines around the graph and on the corners. If there are no little boxes, you must click anywhere just to the inside of edge the graph to get them there. To resize, click on a little box, hold down the button, and drag the box. Boxes on the midpoints resize the graph in one dimension, boxes at the corners change both dimensions. Try it.
  • To resize just the graph within the larger area, click in the graph. You will get a fuzzy-lined box with little boxes. You can drag the entire graph around within the total graph area by click-and-dragging it. You can resize the graph by click-and-dragging one of the little boxes on the edge of the larger box.

You can also modify almost any characteristic of the chart. Some of these modifications are necessary, as the default choices are not very useful. You might not like the size of the font in the labels. The color of the bars may not meet with your approval. There is a box around the graph, which is unnecessary. One change that is required is to alter the background color of the chart. Why gray is the default is a mystery. To make changes to the graph, double click on the part of the graph you want to change. A dialog box will come up and you must navigate through it to effect the change you want. In addition, you can move things like a legend and the title around just be

Rather than describe how to do the changes, the occasion to make some changes will be used to illustrate a greater point, one that was mentioned in the first lab. The best way to learn about the capabilities of a program is to work with it. If you have to search for the way to do something, you will remember what you found much better than if you are simply told what to do. Below, in each of the problems, some format features for each graph are described. It will be your task to use the information in the paragraph above to discover the means to effect these changes. In doing so, you will gain a more intuitive appreciation for the larger principle of using canned programs. To use a program, don't be afraid to play with it. Try options. Explore. This flexibility is important. It is why this tutorial can be written for Excel running on a Macintosh and be useful to someone using a Windows machine. Things may not be exactly the same, but they are close enough that some play will allow you to complete each task. All errors can be corrected.

Line Graphs:

Line graphs in Excel are not what you might expect from the name. Many would define a line graph as a graph you get by plotting pairs of data points as X and Y coordinates and then connecting the dots with a line. Excel considers this sort of graphic to be a scatter plot (see below). A line graph is, in essence, a bar chart in which the bars have been replaced by dots and a line is used to connect the dot for each category. Thus, one axis of the chart is categorical in an Excel line graph. If both axes are numeric, then you need to choose a scatter plot to graph them in Excel.

Line graphs are used to show trends in categories of data. They are better than bar charts when you wish to imply that a data point somewhere in between two categories would have a numerical value somewhere in between the numerical values of the two categories. Sometimes this makes sense and sometimes not.

  • For categories in which the idea of an intermediate value makes no sense, then a line graph would be misleading. A bar chart has no information about intermediates and so it the appropriate choice. An example would be the bar chart of mean species weights you just created. Assuming that there are no hybrid frogs (intermediates between the categories), then it is misleading to imply what their weights would be.
  • If the idea of intermediates makes some sense, then a line chart is appropriate (although a bar chart may still be the appropriate choice, especially if you do not wish to imply anything about the intermediates). An example of an situation calling for a line graph might be a chart in which the mean weight of different age classes is being graphed. The categorical data are the age classes (perhaps 5 year classes for humans). In this case, the categories are really arbitrary divisions of a continuous variable: age. A bar chart would give the mean for all individuals in a five year period, but growth would look like a staircase in which individuals stay the same weight for five years and then suddenly shoot up to the next mean weight. A more realistic graph is a line graph, where successive means are connected by a line, implying that the mean weight will change in a more gradual fashion through time.

The creation of a line graph is identical to that of a bar chart, except that you choose the line graph option in the first page of the graphics dialog box.

Scatter Plots:

Scatter plots are what most think of when they read the word graph. They are the means of visualizing data pairs. Each data pair is considered to be a set of Cartesian coordinates, usually written as (x.y), which is where the names X and Y axes come from. You can plot more than one Y variable but only a single X variable.

Scatter plots are used to show relationships between two variables. How does the size of a bird relate to the sound power it can generate? How does the size of a countries' human population relate to the number of endangered bird species in the country (this scatter plot is figure 3.16, page 98 in your textbook). Relationships (often called trends in the data) can be negative (large X implies small Y), positive or no relationship at all.

  • Notice that a relationship between two variables does not always mean a cause and effect relationship. The relationship in Figure 3.16, which looks as though more people lead to more endangered species, may not indicate a cause and effect relationship. Perhaps (not likely) it is the other way around and more endangered species causes more people to be in the country (just switch the X and Y axes). Perhaps (more likely) there is a third factor causing the relationship. Here it might be land area. More land, more people (in general). More land, more species of birds. More species of birds, more threatened species of birds. Thus, when people and threatened bird species are graphed, you get a positive relationship.

To make a scatter plot, you must enter the data. Here we will use the data below. There are three variables. We will first plot the relationship between yearly rainfall (in inches) and total crop weight (tons per hectare). Then we will plot both total crop weight and damaged crop weight (also in tons per hectare).

Yearly Rainfall Total Crop Damaged Crop
22.3 107 23
24.3 124 25
21.5 115 27
26.9 136 43
32.1 149 99
30.4 151 89
27.2 134 56
27.5 133 47
29 145 80
24.9 126 29
  • Enter the data in an excel workbook. To prevent a lot of unnecessary trouble, make sure that the variable that will be the X-axis is on the left side of the data array, as above. The program will automatically assume that the leftmost column contains the values for the X-axis. You can change this later, but it is hard to do.
  • Choose a place to anchor the graphic in the worksheet with the graphic.
  • Click on the graphics tool.
  • Choose the XY scatter plot. Once you have made your choice, click next.
  • The second page asks which data to graph. Select the first two columns of data, including the labels on top. Once you have made your selection, click next.
  • Use the third page to make the following choices.
    • Title -- Give the chart and the axes labels (include units).
    • Axes --Turn all gridlines off.
    • Legend -- Click it off.
    • Data Labels -- no change
    • Data Table -- Not an option.
  • Choose to enter the graph as an object in the worksheet. Click finish.
  • Once you have the graph in front of you , you may notice some things need changing.
    • Change the scale of the X-axis so that it goes from 20 to 35 by 5.
    • Remove the box around the graph and remove the background.
    • Move and resize the graph within the box to get the most informative configuration.
    • Change the shape of the points to circles and make them larger.

Now we will do a second graph with both y variables

  • Move the first graph out of the way. Choose a place to anchor the graphic in the worksheet with the graphic.
  • Click on the graphics tool.
  • Choose the XY scatter plot. Once you have made your choice, click next.
  • The second page asks which data to graph. Select all three columns of data, including the labels on top. Once you have made your selection, click next.
  • Use the third page to make the following choices.
    • Title -- Give the chart and the axes labels (include units).
    • Axes --Turn all gridlines off.
    • Legend -- Leave it on. For once it is useful.
    • Data Labels -- no change
    • Data Table -- Not an option.
  • Choose to enter the graph as an object in the worksheet. Click finish.
  • Once you have the graph in front of you , you may notice some things need changing.
    • Change the scale of the X-axis so that it goes from 20 to 35 by 5.
    • Remove the box around the graph and remove the background.
    • Move the legend and resize the graph within the box to get the most informative configuration.
    • Change the shape of the points to circles (total crop) and make both symbols larger.
    • remove the box from around the legend.

Putting lines in scatter plots:

When you make a scatter plot, it is usual that the points are not connected by a line (this is done only in particular circumstances). They are, as the name suggests, a scattering of points. However, there may be a trend in the points, such that high values of X are associated with high values of Y. In the graphs above, there appeared to be a tendency for both Y variables to go up as rainfall went up (a positive relationship). How can we characterize this trend?

The usual way is to enter a trend line. Really, it should be a trend curve (a line is a special type of curve). In the crop graph, the trend looks like a straight line for the Total Crop data, but not so straight for the Damaged Crop trend. There are ways to estimate the best trend line for a relationship, but we can not go into them now. However, we can get the program to do this for us.

  • Click on one of the Total Crop data points. This should select all of the Total Crop data points.
  • Perhaps without your awareness, you got a different set of menu choices in the menu bar at the top when you chose to edit the graph. One of the new choices is Chart. Pull it down and choose "Add Trend line".
  • This brings up a dialog box with two tabs. One allows for a choice of trend line types. Almost all of the time, the simplest is the best and the simplest is the linear trend line (described by y = mx + b). Choose it for the total crop line.
  • The second tab allows you to name the trend line (do so if you want). If no name is entered, a default name is given and will appear in the legend. You can add the equation for the line (a good choice) and you can add something called r 2. R-square is a measure of how well the line and the data agree. More formally, it is the proportion of variation in the Y-axis data that is accounted for by the line.

Now we can choose the line for the Damaged Crop data.

  • Click on one of the Damaged Crop data points. This should select all of the Damaged Crop data points.
  • Pull Chart down and choose "Add Trend line".
  • Choose a trend line type for the Damaged Crop line. (Hint - only one goes both up and down, but you will have to modify the default options.)
  • Name the trend line, add the equation for the line and r2 (you have click on the other tab to do this).

Error Bars:

Yearly Rainfall Total Crop Damaged Crop Damaged Error
22.3 107 23 3.4
24.3 124 25 4.2
21.5 115 27 5.6
26.9 136 43 6.9
32.1 149 99 8.0
30.4 151 89 8.8
27.2 134 56 2.3
27.5 133 47 5.9
29 145 80 6.0
24.9 126 29 3.3

Often, we calculate not only a mean value for a category of data but also some measure of variation in the data for that category. In some cases, it is possible to display this information as part of the graph as error bars around the averages.

For line graphs and bar graphs, you need to enter the size of the error in a separate column so that the first entry in the error bar array is the error for the first value of the original data array. Go to the first bar chart data and enter these values: 0.62, 0.44, 0.51, 0.92, and 0.31. These numbers represent some measure of the variation within each species.

Double click on the bar (any bar) in the first chart you made. This will select all of the data in that category and open a dialog box on format. Choose the tab for Y error bars for vertical bars and X error bars for horizontal bars (it depends on which you want, usually it is y). In the display, choose the option for half error bars above (this is one of three options on how the error bars will appear - here the lower portion of the error bar would not add any more information and would make the graph "busier").

In the error amount section, there are several choices. Most are useless and you will often choose "Custom". Choose custom now and click in the + line (this corresponds to the error displayed above the data on the graph). Click in the box next to "Custom" so that you get a blinking cursor. Go to the column of errors (move the dialog box if you must) and select the errors you want. Hit Okay.

For scatter plots, the procedure is mostly the same except that you usually want the error bars to extend both above and below the data point. To do this, enter the new column of data in the scattergram data matrix (from the table above). To to the scattergram you made from this data, click on any of the data points and get the same dialog box. Make the appropriate choice in the display area of the Y error bar tab and enter the same column of errors twice in the error amount section: once in the + line and once in the - line. Hit Okay. If you want to explore, choose the x error bar and do the same thing (use the same error data).

Rules for Scaling:

A few remarks on how to scale axes. We will not cover scaling in which the values are transformed (such as taking the square root of the values before plotting the graph).

  • The axes should just hold the data from largest to smallest
  • There is no need to have the 0,0 X- and Y-axis intersection in the graph. As long as the axes are clearly labeled, it is not misleading to remove lots of blank space from the graph by re-scaling the axis.
  • If the range of values for one variable is very large, or if data points are mostly clustered in one part of the range, a more clear depiction of the trends may be had by transforming the data. A Transformation is a mathematical re-sizing of the data done before plotting the data and the same transformation procedure is applied to every data point for that variable. There are many transformations and we will not go them here except to point out a few as examples.
    • Data with a large range may be log transformed (take the log10 of each datum) so that a range from 10 to 100 becomes a range from 1 to 3.
    • Data with many small values and a few large values may be square root transformed. Here the few large values are greatly reduced and the small values are not changed by much. Consider the values 2, 3, and 144. After transformation, the values are 1.41, 1.73, and 12. The ratio of smallest to largest falls from 72 (=144/2) to 8.5 (=12/1.41)
  • If two Y variables are to be plotted, but one has much larger values, then it is permissible to transform one so that they have comparable ranges. Indicate the transformed data by changing the column label for that data so that the legend says that it is transformed data.

Problems:

Reminder: axes must be labeled and the labels must include the units (where appropriate)

1. Make a frequency table from the following data. This data is integer data, like you get when you are counting numbers of individuals (you cant get a fractional individual). Use the methodology described above to make the table (doing it by hand will not get you any credit). Label the two columns of the frequency table "Category Values" and "Frequencies". Remember that you can enter the data as a vector (row or column) or as a matrix (square or rectangular -as you see below). The values range from 1 to 17.

Data matrix        
1 2 8 9 12
1 2 8 9 12
1 3 8 9 12
1 3 8 9 12
1 3 8 9 12
4 3 8 9 12
4 3 8 9 12
4 3 8 9 17
5 4 8 9 12
5 4 8 9 13
5 4 11 8 13
5 4 11 8 12
5 4 11 10 12
6 7 11 10 12
6 10 11 10 12
7 10 13 10 15
7 10 13 10 16
7 10 13 10 15
14 10 13 17 11
 

Check your results by summing up all of the frequencies (use the summation function). What total did you get? What total should you get (and why)?

2. The data above have some gaps and some low-frequency values. Regrouping the data will produce a better graphic. Regroup the data by redoing the frequency table, but use the following category values:

0, 3, 6, 9, 15, and 18

Check your results by summing up all of the frequencies (use the summation function).

3. Make a frequency table from the continuous data (weights of fish) below. There is a wide size range, from hatchlings to large adults so the data ranges from 0.010 grams to 12.045 grams. Do two frequency tables, one with one gram category values and another with three gram category values.

Data Matrix

1.518 1.543 1.743
2.392 1.554 8.588
1.536 1.685 5.010
0.046 0.392 0.010
0.031 1.823 8.001
2.408 4.669 1.521
5.916 2.232 1.431
11.05 0.264 3.495
8.000 2.535 13
2.047 2.665 9.225
0.053 2.784 12.602
7.213 2.895 3.954
9.117 3.211 5.310
6.224 6.540 10.220
0.750 1.100 12.045

Check your results by summing up all of the frequencies (use the summation function) for each table.

4. Make a bar chart from the frequency table in problem 2. Before you make the graphic, it might be easier if you made some category labels. To do this, you will need blank cells next to the frequencies.

  • If they are blank, then type in category labels (like 1 to 2, 3 to 4, etc.) to the left of the appropriate frequency.
  • If they are not blank (many of you will have put the category values there), you should first insert cells between the frequencies and whatever is on the left side of them.
    • Do this by selecting the frequency cells and then pulling down Insert on the menu bar.
    • Choose cells and, when the dialog box come up to ask which way to move cells to make way for the new cells, choose over. Notice that the program updates all of the cell references so that no calculations are changed.
    • Now you can type in the labels (1 to 2, 3 to 4, etc.).

Insert always puts the new cells on the left hand side of the chosen cells and moves the new cells and everything to the right of them to the right (or on top and moves the old cells down if you choose down).

  • The bar chart should have these characteristics: Title, axes labels, no background color, appropriate scales for axes, appropriate labels for categories.

5. Make a bar chart of the following table, including error bars (label the axes and title the graph). The data are taken from a group of 20 partridges reared from the egg.

Age of Chick

(weeks)

Weight of Chick (grams)

Standard Error (grams)

1  27.4 7.3
3  48.8 7.2
5  62.5 8.7
7  121.3 9.8
9  149.7 10.4
11  244.2 12.1
  • The bar chart should have these characteristics: Title, axes labels, no background color, appropriate scales for axes, appropriate labels for categories, error bars over top of bars (not below).

Now, present the data as a line graph (including error bars).

  • The line graph should have these characteristics: Title, axes labels, no background color, appropriate scales for axes, appropriate labels for categories, error bars around data points.

 

6. The table below is data from experimental fields over a long time. It has the rainfall measured at the field, the total crop (in tons per hectare), the error associated with the total damage measurement and the tonnage of the portion of the crop that was damaged by insects to the point that it could not be sold (total tonnage = useable tonnage + damaged tonnage). Enter the data in your spreadsheet.

Year Yearly Rainfall (inches) Total Crop (Tons/Hectare) Damaged Crop (Tons/Hectare) Error (Tons/Hectare)
1 22.3 107 23 10
2 24.3 124 25 15
3 21.5 115 23 10
4 26.9 136 43 20
5 32.1 149 99 15
6 30.4 151 101 20
7 27.2 134 56 25
8 27.5 133 47 20
9 29 145 92 10
10 24.9 126 29 15

Graph the relationship between total crop and rainfall with error bars for total crop. Fit a linear trend line and present the equation for the line and the r2.

  • The scatter plot should have these characteristics: Title, axes labels, no background color, appropriate scales for axes, Y-error bars around data points, trend line with equation and r 2.

Graph the relationship between rainfall and both total crop and damaged crop. Fit trend lines and present equations for each line and the r2's (Hint: a linear trend line is not appropriate for the damaged crop data. Choose one with a better fit).

  • The scatter plot should have these characteristics: Title, axes labels, no background color, appropriate scales for axes, trend line with equation and r 2. No error bars here.

Problems 7 and 8 represent extensions of what you have learned above.

7. Planarians will lose weight when they are starving and, of course, will grow when they receive enough food. You have a series of aquarium tanks to which you will add some planaria and a constant supply of food. You also have a model of planarian growth in these tanks. Y is the predicted average total weight (grams) of planarians per tank after two months in the tanks and X is the total weight of planarians (grams) initially added to the experimental tanks. Graph the following formula that relates these two variables:

Graph them over the range of 0 to 10 grams added to the tank (the ranges you are considering using). Label the axes (with units) and title the graph. Answer this question. What is the break-even point at which the total weight of planaria at the end of two months is equal to the total weight initially added to the tank?

Hint: - you have to choose the X values to use. Start with 1, 2, etc. but put extra values in if the graph line shows a lot of change between any two points already plotted.

8. Make a chart that compares the effect of diet for each sex. Include the error term and label the axes and title the chart. (Hint: to receive credit, you will have to have all of the data on the same chart and this will involve entering the data in a different order than is presented below.)

Sex Diet Adult weight (grams) Standard Error (grams)
Male High Fat 492.4 17.6
  High Protein 532.3 21.4
  High Carbohydrate 481.2 12.3
Female High Fat 427.2 13.8
  High Protein 477.8 21.1
  High Carbohydrate 239.1 10.1

Draw some conclusions about the effect of both sex and diet (and the interaction between these factors) based on the chart.

How to submit this assignment:

The assignment will be submitted through the internet. It will not be accepted in any other format as there is no way to check formulas. If you forget how to submit, see the end of of the first lab on spreadsheets.

Last updated August 30, 2006