Turn in one PDF file. Other file formats will not be accepted.
Answer all questions inside this document and in the same order. Questions answered out of order will not be graded. All questions should appear in the homework.
Copy and paste tables/graphs from R below each question. You will need to delete blank lines as you do this. Be careful not to delete any questions or change the format of the numbered list.
There is a penalty of up to 10 points for not following any of the instructions above.
A Dataset on the Effectiveness of a Math Placement Exam
Download the comma-delimited file MathPlacement.csv and open it in R to answer the following questions. Read the supporting documentation MathPlacement.pdf to understand the contents of the file. The dataset is a sample of results from a Math Placement exam at a liberal arts college.
How many observations are there in the dataset?
How many variables are in the dataset?
Which variables are quantitative?
Report the Descriptive Statistics (Paste your output from R)
R code
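MathPlacement <- read.csv("MathPlacement.csv")  # load the file; assumes it is in the working directory
attach(MathPlacement)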
summary(MathPlacement)
Write one or two sentences describing the characteristics of the typical (average) individual in the sample. (Hint: Use the means or medians from the descriptive statistics.)
Is there an association between a student's rank in their high school class and the placement test score?
Create a scatter plot of PlcmtScore against Rank.
What is the correlation coefficient between PlcmtScore and Rank?
Show the distribution of grades and comment on the distribution.
Show the distribution of the placement score.
Battle of the sexes: Who is performing better in this dataset, females or males?
Construct a two way frequency distribution between variables Gender and Grade.
Construct a 100% Stacked Bar Chart of Gender and Grade with Gender on the horizontal axis and Grades as the categories.
Who is performing better, males or females? By a little or by much? Comment using the graphs/table in (a) and (b).
Calculate a 95% confidence interval for the population mean of SAT score in Math.
Calculate a 95% confidence interval for the population proportion of male students.
Calculate it by hand. Show your work.
Calculate it in R. Paste your output here.
R code
plot(Rank, PlcmtScore, main="Student's Adjusted Rank in High School and Placement Score")
R code
cor(Rank, PlcmtScore)
R code
gradedist <- table(Grade)
gradedistpct <- prop.table(gradedist)
gradedistpct
barplot(gradedistpct, main="Frequency Distribution of Grades")
R code
hist(PlcmtScore, main= "Distribution of Placement Score")
R code
Two-way Frequency Distribution with Column Percentages.
The next commands create “table1” and then a table with its column percentages (the share of each grade within each gender).
table1 <- table(Grade, Gender)
table1
ptable <- prop.table(table1, 2) # Column proportions
ptable
R code
Stacked Bar Chart
graph1 <- barplot(ptable, main="Graph #3: Gender and Grades", sub="100% Stacked", xlab="Gender", legend=rownames(table1), beside=FALSE, xpd=FALSE)
Percentage Clustered
graph2 <- barplot(ptable, main="Graph #4: Gender and Grades", sub="100% Clustered", xlab="Gender", legend=rownames(table1), beside=TRUE, xpd=FALSE)
Calculate it by hand. Show your work.
Calculate it in R. Paste your output here.
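For the two confidence-interval questions on the dataset, the R side might look roughly like the sketch below. The data frame is simulated so that the snippet runs on its own; the column names `SATM` and `Gender` and the "F"/"M" coding are assumptions, so check MathPlacement.pdf for the actual names and use `MathPlacement` in place of `dat`.

```r
# Simulated stand-in for MathPlacement (hypothetical columns and values)
set.seed(1)
dat <- data.frame(
  SATM   = round(rnorm(200, mean = 600, sd = 70)),
  Gender = sample(c("F", "M"), 200, replace = TRUE)
)

# 95% CI for the population mean: t interval, since sigma is unknown
ci_mean <- t.test(dat$SATM, conf.level = 0.95)$conf.int

# 95% CI for the population proportion of males: normal approximation,
# the same formula used in the by-hand calculation
n       <- nrow(dat)
p_hat   <- mean(dat$Gender == "M")
se_p    <- sqrt(p_hat * (1 - p_hat) / n)
ci_prop <- p_hat + c(-1, 1) * qnorm(0.975) * se_p

ci_mean
ci_prop
```

If you use `prop.test()` instead, expect a slightly different interval: it reports a Wilson-type interval with a continuity correction by default rather than the by-hand formula.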
Confidence Intervals of a Population Mean
Suppose you do not know the population mean fee charged to H&R Block customers last year. Instead, suppose you take a sample of size n=40 and find a sample mean of $350. Assume that fees are normally distributed with a population standard deviation of $100. (You will use the z-table for this exercise.)
Before conducting the survey, suppose that, based on your previous observations, your best guess for the population standard deviation of the fee charged to H&R Block customers is $120. With this assumption in mind, approximately what should your sample size be if you want:
Suppose you do not know the population mean fee charged to H&R Block customers last year. Instead, suppose you take a sample of size n=16 and find a sample mean of $350. Assume that fees are normally distributed with a sample standard deviation of $120. Note that you are given the sample standard deviation in this question, whereas question 2 gave the population standard deviation. (You will use the t-table for this exercise.)
Calculate the standard error of the sample mean.
Calculate the standard error (standard deviation) of the sample mean.
95% confidence interval for the population mean of fees at H&R Block.
Calculate the margin of error (MOE) using a 5% significance level.
Calculate the 95% confidence interval.
Write one complete sentence about the interpretation of the confidence interval.
90% confidence interval for the population mean of fees at H&R Block.
Calculate the margin of error (MOE) using a 10% significance level.
Calculate the 90% confidence interval.
Suppose an analyst believes that the population mean fee is equal to $300. Using a 90% confidence level, can we conclude the analyst is right? Why or why not?
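The z-table calculations can be checked in R with a short sketch like the one below. It assumes these parts refer to the problem with n = 40, a sample mean of 350 and a known population standard deviation of 100; the helper `z_ci` is a hypothetical name introduced only for this illustration.

```r
# Values from the problem statement: n = 40, sample mean 350, sigma = 100
n     <- 40
x_bar <- 350
sigma <- 100

se <- sigma / sqrt(n)   # standard error of the sample mean

# Hypothetical helper: z-based interval for a given confidence level
z_ci <- function(conf) {
  z   <- qnorm(1 - (1 - conf) / 2)   # critical value (z-table: 1.96 or 1.645)
  moe <- z * se                      # margin of error
  c(lower = x_bar - moe, upper = x_bar + moe, moe = moe)
}

round(se, 2)
round(z_ci(0.95), 2)
round(z_ci(0.90), 2)
```

R uses unrounded critical values, so the results can differ from z-table answers in the second decimal place. A hypothesized mean that falls outside the interval is not supported at that confidence level.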
Calculate the margin of error (MOE) using a 5% significance level.
Calculate the 95% confidence interval.
Write one complete sentence about the interpretation of the confidence interval.
Suppose an analyst believes that the population mean fee is equal to $250. Using a 95% confidence level, can we conclude the analyst is right? Why or why not?
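When only the sample standard deviation is available, the same steps use the t distribution with n - 1 degrees of freedom. The sketch below assumes these parts refer to the problem with n = 16, a sample mean of 350 and a sample standard deviation of 120.

```r
# Values from the problem statement: n = 16, sample mean 350, s = 120
n     <- 16
x_bar <- 350
s     <- 120

se     <- s / sqrt(n)              # standard error of the sample mean
t_crit <- qt(0.975, df = n - 1)    # 95% critical value with 15 degrees of freedom
moe    <- t_crit * se              # margin of error
ci     <- c(lower = x_bar - moe, upper = x_bar + moe)

round(c(se = se, t_crit = t_crit, moe = moe), 3)
round(ci, 2)
```

Because the t critical value is larger than the z value of 1.96, the interval is wider than a z interval built from the same standard error.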
Calculate the standard error (standard deviation) of the sample proportion.
95% confidence interval for the population proportion.
Calculate the margin of error (MOE) using a 5% significance level.
Calculate the 95% confidence interval.
Write one complete sentence about the interpretation of the confidence interval.
Suppose an analyst believes that the population proportion is equal to 20%. Using a 95% confidence level, can we conclude the analyst is right? Why or why not?
Margin-of-Error to be 2% and confidence level to be 95%?
Margin-of-Error to be 4% and confidence level to be 95%?
Margin-of-Error to be 2% and confidence level to be 99%?
95% confidence interval for the population mean of fees at H&R Block.
Confidence Intervals of a Population Proportion
Suppose 12% of the H&R Block customers are given a discount on the fee charged. Assume this is the true population proportion and that you plan to take a sample survey of 540 customers to further investigate this. (You will use the z-table for this exercise.)
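The standard error and margin of error for this proportion problem can be checked in R as sketched below. It takes the proportion 0.12 and the sample size 540 from the problem statement and centers the interval on 0.12, which is an assumption about how the parts are meant to be answered.

```r
# Values from the problem statement: p = 0.12, n = 540
p <- 0.12
n <- 540

se     <- sqrt(p * (1 - p) / n)    # standard error of the sample proportion
z_crit <- qnorm(0.975)             # 95% critical value (z-table: 1.96)
moe    <- z_crit * se              # margin of error
ci     <- c(lower = p - moe, upper = p + moe)

round(c(se = se, moe = moe), 4)
round(ci, 4)
```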
Suppose you have the following information:
z-test: Hypothesis Test About Population Mean with Population Standard Deviation (σ) Known.
5. Suppose you have the following information:
Complete the graph of the sampling distribution of the test statistic by drawing the rejection regions. Conclude: with the given information, do you Reject or Fail to Reject H0?
- Step 1:
- Step 2:
- Step 3:
- Step 4 (critical value):
- Decision Rule[1]:
- Step 5: Conclude, Reject or Fail to Reject H0?
[^3]: Note that this is a two-tailed hypothesis test and therefore the
significance level is divided by two to find the critical values.
Finish the 5-step procedure for hypothesis testing below. Calculate the z-test, find the critical value and write the decision rule. Complete the graph of the sampling distribution of the test statistic and conclude. Do you Reject or Fail to Reject H0?
- Step 1:
- Step 2:
- Step 3:
- Step 4 (critical value):
- Step 5:
7. Suppose you have the following information:
Finish the 5-step procedure for hypothesis testing below. Calculate the z-test, find the critical value and write the decision rule. Complete the graph of the sampling distribution of the test statistic and conclude. Do you Reject or Fail to Reject H0?
- Step 1:
- Step 2:
- Step 3:
- Step 4 (critical value):
- Step 5:
[1] Note that this decision rule uses |z|, the absolute value of z. It is equivalent to writing: Reject H0 if z > z(α/2) or z < -z(α/2), where z(α/2) is the critical value from Step 4. For the exercise, substitute the critical value you found. You can use either decision rule.
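The two-tailed z-test steps and the two equivalent decision rules can be sketched in R as follows. Every input value below is hypothetical and chosen only for illustration; replace them with the values given in each exercise.

```r
# Hypothetical inputs (not the assignment's values)
mu0   <- 100    # mean under the null hypothesis H0
x_bar <- 104    # sample mean
sigma <- 15     # known population standard deviation
n     <- 36     # sample size
alpha <- 0.05   # significance level

z      <- (x_bar - mu0) / (sigma / sqrt(n))   # test statistic
z_crit <- qnorm(1 - alpha / 2)                # two-tailed critical value (alpha split in two)

reject_abs   <- abs(z) > z_crit               # decision rule using |z|
reject_tails <- z > z_crit || z < -z_crit     # equivalent rule written with both tails

p_value <- 2 * pnorm(-abs(z))                 # two-tailed p-value

c(z = z, z_crit = z_crit, p_value = p_value)
c(reject_abs = reject_abs, reject_tails = reject_tails)
```

With these hypothetical inputs the test statistic is 1.6, which is inside the critical values, so both rules give the same decision: fail to reject H0.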