Worksheet Correlation and Regression

Question 2

Janssen et al. (2007) studied the relationships between a variety of abiotic factors and benthic invertebrate abundance at sites on beaches along the Dutch coast. One of these abiotic factors was the height of each site relative to the average sea level of the area (NAP). Positive values of NAP indicate sites that are higher than the average sea level, whereas negative values indicate sites that are below it. The data are in the file sle251dutch.csv. The relevant variables are the response variable, richness (richness of invertebrate species), and the predictor variable, NAP.

Format of sle251dutch.csv data file

Site    NAP       richness
1        0.045    11
2       -1.036    10
3       -1.336    13
4        0.616    11
5       -0.684    10
..      ..        ..

Site        The number of the site where the samples were collected
NAP         Height of the site relative to the average sea level of the area (predictor variable)
richness    Richness of invertebrate species (response variable)

a)    Janssen et al. (1996) were interested in modeling the linear relationship between invertebrate richness (response) and the height of the site relative to the average sea level (predictor). List the following:
      The biological inference of interest
      The biological null hypothesis derived from above
      The statistical null hypothesis (H0) derived from above

b)    Draw a scatterplot of richness (y-axis) against NAP (x-axis). Draw boxplots for each variable as well. Is there any evidence of skewness in the distributions or of nonlinearity?

      To create the scatterplot in R:
      Graphs > Scatterplot
      Select x-variable (NAP) and y-variable (richness)
      Check Marginal boxplots and Least-squares line
      Unselect Smooth line and Show spread
      OK

c)    Fit the regression model richness = intercept + slope × NAP.
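In standard notation, the model in c) can be written roughly as below. The symbols are assumptions introduced here, not taken from the worksheet: β0 for the intercept, β1 for the slope, and ε_i for the residual at site i. The main null hypothesis tested in the regression output is that the slope equals zero.

```latex
\text{richness}_i = \beta_0 + \beta_1\,\text{NAP}_i + \varepsilon_i,
\qquad H_0\colon \beta_1 = 0
```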
      To fit the linear regression and create an ANOVA table in R:
      Statistics > Fit models > Linear regression...
      You can enter a name for the results object (Enter name for model:), but it's simplest to use the name that R provides.
      Select richness from the Response variable list
      Select NAP from the Explanatory variables list
      OK
      Models > Hypothesis tests > ANOVA table
      Select Partial, ignoring marginality (“Type III”)
      OK

      Examine the regression output and identify and interpret the following:

      Sample y-intercept
          Value (Estimate in the R output):
          Interpretation:
      Slope of regression line (NAP)
          Value (Estimate in the R output):
          Interpretation:
      t statistic for main H0 (regression slope equals zero)
          Value:
          Interpretation:
      P-value for main H0 (regression slope equals zero)
          Value:
          Interpretation:
      r² value (Multiple R-squared)
          Value:
          Interpretation:

d)    Complete the following ANOVA table from the regression analysis.

      Source of variation    SS    df    MS    F ratio
      Regression
      Residual
      Total                        44

      Note: To get the MS values from the output, divide each SS value by its df.

e)    What conclusions would you draw from the regression analysis (statistical and biological)?

f)    What invertebrate richness would you predict for a new site with an NAP of -2? Simply plug -2 into your regression equation and calculate the predicted richness.
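The menu steps in b) to f) correspond to ordinary R commands, so a minimal script sketch of the same workflow might look like the one below. It assumes that sle251dutch.csv is a comma-separated file with a header row and the column names Site, NAP and richness, and that it sits in the working directory. The object names (preview, dutch, fit and so on) are made up for illustration. If the file is not found, the script falls back to the five rows shown in the format preview so that it still runs; results from those five rows only demonstrate the mechanics and are not the answers to the worksheet.

```r
# Five rows copied from the format preview in the worksheet.
# This is NOT the full data set; it only keeps the script runnable.
preview <- data.frame(
  Site     = 1:5,
  NAP      = c(0.045, -1.036, -1.336, 0.616, -0.684),
  richness = c(11, 10, 13, 11, 10)
)

# Assumption: the real file is comma-separated with a header row.
csv_file <- "sle251dutch.csv"
dutch <- if (file.exists(csv_file)) read.csv(csv_file) else preview

# c) Fit richness = intercept + slope x NAP.
fit <- lm(richness ~ NAP, data = dutch)

# b) Scatterplot with the least-squares line, then a boxplot of each variable.
plot(richness ~ NAP, data = dutch)
abline(fit)
boxplot(dutch$NAP, main = "NAP")
boxplot(dutch$richness, main = "richness")

# c) Estimates, t statistics, P-values and Multiple R-squared.
print(summary(fit))

# d) ANOVA table. With a single predictor, the NAP row of this sequential
#    table gives the same F test as the "Type III" table from the menu.
tab <- anova(fit)
print(tab)

ss <- tab[["Sum Sq"]]      # regression SS, residual SS
df <- tab[["Df"]]          # regression df, residual df
ms <- ss / df              # MS = SS / df
f_ratio  <- ms[1] / ms[2]  # F = MS regression / MS residual
total_ss <- sum(ss)        # Total row: SS and df add up
total_df <- sum(df)

# f) Predicted richness at NAP = -2, via predict() and by hand.
pred    <- predict(fit, newdata = data.frame(NAP = -2))
b       <- coef(fit)
by_hand <- unname(b["(Intercept)"] + b["NAP"] * (-2))
print(pred)
print(by_hand)
```

The last two printed values should agree, which is a quick check that the regression equation was plugged in correctly by hand.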
