Data analysis, Lecture notes of Statistics

o Near the bottom, use the pull-down menu and select “Excel Add-Ins” and click ... Using one-way ANOVA in MS Excel ... Add standard error bars to the graph.

Typology: Lecture notes

2022/2023

Uploaded on 02/28/2023

anuprabha
anuprabha 🇺🇸

4.4

(18)

237 documents

1 / 11

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Data analysis
Data analysis in Excel using Windows 7/Office 2010
Open the “Data” tab in Excel
If “Data Analysis” is not visible along the top toolbar then do the following:
o Right click anywhere on the toolbar and select “Customize quick access
toolbar…
o On the left click on Add-Ins
o Near the bottom, use the pull-down menu and select “Excel Add-Ins” and click
Go” to bring up this menu:
o
o Select the “Analysis ToolPak” and click “OK”.
pf3
pf4
pf5
pf8
pf9
pfa

Partial preview of the text

Download Data analysis and more Lecture notes Statistics in PDF only on Docsity!

Data analysis

Data analysis in Excel using Windows 7/Office 2010

  • Open the “ Data” tab in Excel
  • If “ Data Analysis ” is not visible along the top toolbar then do the following: o Right click anywhere on the toolbar and select “ Customize quick access toolbar… ” o On the left click on “ Add-Ins ” o Near the bottom, use the pull-down menu and select “ Excel Add-Ins ” and click “ Go ” to bring up this menu:

o o Select the “ Analysis ToolPak ” and click “ OK ”.

Using one-way ANOVA in MS Excel

Introduction : When your observations fall into two or more categories of continuous or even discrete variables, you may be interested in asking if the groups differ from each other. Is fish diversity higher in phosphorus-enriched ponds than in low-phosphorus ponds? Does the abundance of forest-floor plants differ between clear-cut, tornado-damaged, and control plots of forest? Questions of this nature are answered using analysis of variance (ANOVA). It is worth mentioning that in the case of 2 categories you can run a t test or an ANOVA and the result will be the same.

Analysis :

  1. Organize your comparative data in adjacent columns (Table 1). There is no need to average them for analysis, and in fact averages will be calculated automatically during the ANOVA or t test.
  2. From the “ Data ” tab, select “ data analysis ” (this must be added from the “addin” menu; see previous section).
  3. Choose “ ANOVA single factor ”; click OK. Table 1 lists data from three habitats; so the factor of interest is habitat.
  4. Click the tiny red arrow by “ input range ” and highlight all of the data including the column headings. Click the “Columns” button and check the “Labels in first row” box.
  5. Select any of the output options that you like and hit “OK”
  6. The output from the fake data should look like this: Anova: Single Factor

SUMMARY Groups Count Sum Average Variance island 6 16 2.666667 1. mainland 6 28 4.666667 0. peninsula 6 17 2.833333 0.

ANOVA rce of Varia SS df MS F P-value F crit Between G 14.77778 2 7.388889 7.150538 0.006593 3. Within Gro 15.5 15 1.

Total 30.27778 17

Number of mammal species island mainland peninsula 2 5 3 3 4 2 3 6 4 5 5 3 1 4 3 2 4 2

Table 1. Fake data for ANOVA

Regression in MS Excel

Does blood pressure increase with age? Does shrub cover decrease with increasing canopy cover? Is there a relationship between phosphorus concentration and algal cell density in ponds? All of these questions can be addressed using regression.

Nature of the data All of the datasets described above are continuous ; that is to say, they vary over some range without breaks. They are not categorical (like male and female), that are not discrete (like number of people in a single car; you would not typically think about 3.5 people in a car). As the range of a discrete variable increases (number of plants per hectare for example), the larger number means that what in fact is a discrete variable can be treated as continuous.

Graphing We typically graph such datasets using a scatter plot (Figure 1). If we have a basis for considering for example that running speed impacts heart rate, then we would use running speed on the horizontal ( x ) axis, and heart rate on the vertical ( y ). In this case running speed is the independent variable. The dependent , or response variable is heart rate because we expect it to depend on, or respond to running speed. Figure 1. Fictional data representing the effects of running speed on heart rate.

70

80

90

100

110

120

130

140

150

160

170

0 2 4 6 8 10 12 14 16 Running Speed (mph)

Heart rate (bpm)

Analysis: We might look at the pattern on the right and perceive a pattern, or not! As is the case with all statistics, the point is to remove subjectivity and have firm criteria for claiming a relationship. The analysis one would use for this sort of question is regression. There are many forms of regression for relationships of different shapes, but for our purposes we are considering only linear regression. In other words we are asking only if, and how well a straight line can describe the relationship between variables. In excel under the Data tab, select data analysis, regression to bring up this window: The response variable goes in the Input Y Range and the independent variable goes in the Input X range. You can click on the tiny red arrow in each case and highlight the appropriate portion of the data (including labels). The output range simply is a place for the statistical output to go.

Output: Output from the preceding data set:

The number under Significance F is the p value. In this case the p value is greater than 0.05 and we can conclude that there is no relationship between running speed and heart rate.

SUMMARY OUTPUT

Regression Statistics Multiple R 0. R Square 0. Adjusted R -0. Standard E 33. Observatio 15

ANOVA df SS MS F ignificance F Regression 1 425.0892857 425.0893 0.384931 0. Residual 13 14356.24405 1104. Total 14 14781.

Coefficients Standard Error t Stat P-value Lower 95%Upper 95%ower 95.0%Upper 95.0% Intercept 130.5238095 18.05655676 7.22861 6.66E-06 91.51499 169.5326 91.51499 169. Running S -1.232142857 1.985956467 -0.62043 0.545699 -5.52254 3.058255 -5.52254 3.

Graphing

Figures in Community Ecology

All graphs, maps, photographs, and sketches are considered “Figures” and appear in a numbered sequence in the order cited in your paper. Any set of numbers and/or letters is considered a table and tables have their own numbered sequence (IE, even after three figures, your first table is still Table 1 ).

A good graph minimizes clutter and unnecessary ‘ink’. Use the MS Excel “Scatter Plot” option to make graphs displaying continuous data on the vertical and horizontal axis. The species area data for the upcoming lab report are a good example; area on the X axis; number of species on the Y axis. Remove all of the following items added by Microsoft excel: “Series 1”; background color; frames on right and top; grid lines; 3D effects.

Scatter plots

Figure 1. Illustrating the point that more sampling leads to more species observed. Connor & Simberloff (1978) analyzed data from collecting trips to the Galapagos Islands and found that number of collecting trips better explained number of species recorded than did island area, elevation, or isolation. Data extracted from Table 3 in Connor & Simberloff (1978).

The figure legend is always placed underneath and contains roughly a paragraph of information describing the figure content in sufficient detail that the figure stands alone. The

legend inserted by MS excel is useful only if two or more data sets are displayed on one graph using symbols. This figure contains data that span the nearly entire range presented. If we were presenting data from only the largest five islands we would adjust the horizontal axis to run from 20 to 40, and the vertical axis from 150 to 450. Note that the axis lines have been thickened and fonts enlarged beyond the default. Important : Graphs should not start at zero, zero if the data range fall between 75 and 85 (for example).

Bar graphs

We use bar graphs when presenting the averages of continuous variables (on the We use bar graphs when presenting the averages of continuous variables (on the Y axis) from one or more categories on the horizontal axis.

The bar height equals the average of the response variables for treatments 1, and treatments 2. The error bars above and below the average in this case equal standard error; calculate these values as: (standard deviation)/(square root of the number of samples). The scale is appropriate to the data; if the averages were 150 and 200, I might start the axis at 100 rather than zero. Important: You should replace the numbers on the horizontal axis with names of sites or treatments (see example under adding error bars handout).

Figure 1. Very detailed title, 3-4 lines;

place under the graph

Nicely labeled axis categories

Nicely labeled axis (with

units)

How to add the error bars to your bar graph : Lay your data out as illustrated below. In this case the fake data represent the average number of insect species found several samples taken from each of three locations in a stream.

Note:

  • Standard error values are underneath the graphed averages.
  • The graph has been moved in the spreadsheet so as not hide the numerical values.
  1. Click anywhere on the chart - this will reveal the “ Chart Tools ” at the top of the window. Click “ Layout
  2. Right click on any bar in the graph – 2 small windows will pop up – work in the smaller upper one. Click the little drop down arrow and select the data set to which you’d like to add error bars ( Series 1 unless you have renamed the data set).
  3. Now, go up to “ Chart Tools ” at the top and select “ Error Bars ”/ ” More error Bar Options ” (because all of the other options offered are, to be perfectly honest, fake).
  1. Click “ Custom ” and “ Specify Value ”.
  2. Next click the tiny red arrow in the box under “ Positive Error Bar ”; highlight the values

for the standard errors that are lined up under the averages. Hit “ Enter ”!

  1. Now, you would think that having selected “both”, that both the upper and lower error bars would be displayed; you would be wrong! Repeat the process for “ Negative Error Bars ”.
  2. Click “ Close ”.
  3. Truly beauteous error bars will now grace your bar graph!