Course Project
Introduction
Your Instructor will provide you with a Case
description and data set in Doc Sharing.
PROJECT PART A:
Exploratory Data Analysis
· Open the files for the Course Project and the data set in Doc Sharing.
· For each of the five variables, process, organize, present, and summarize the
data. Analyze each variable by itself using graphical and numerical summaries.
Use MINITAB as much as possible, explaining what the printout tells you. You
may wish to use some of the following graphs: stem-and-leaf diagram,
frequency/relative frequency table, histogram, boxplot, dotplot, pie chart, bar
graph. Caution: not all of these are appropriate for every variable, nor are
they all necessary. More is not necessarily better. In addition, be sure to
find the appropriate measures of central tendency, the measures of dispersion,
and (for the quantitative variables) the shapes of the distributions. Where
appropriate, use the five-number summary (Min, Q1, Median, Q3, Max). Once
again, use MINITAB as appropriate, and explain what the results mean.
· Analyze the connections or relationships between the variables. There are 10
possible pairings of two variables. Use graphical as well as numerical summary
measures. Explain what you see. Be sure to consider all 10 pairings. Some
variables show clear relationships, while others do not.
· Prepare your report in Microsoft Word, integrating your graphs and tables
with text explanations and interpretations. Be sure that you have graphical and
numerical backup for your explanations and interpretations. Be selective in
what you include in the report. I’m not looking for a 20-page report on every
variable and every possible relationship (that’s 15 things to do: 5 variables
plus 10 pairings).
· In particular, what I want you to do is highlight what you see for three
individual variables. For each one, include no more than one graph, one or two
measures of central tendency and variability (as appropriate), the shape of the
distribution (for quantitative variables), and two or three sentences of
interpretation. For the 10 pairings, identify and report on only three, again
using graphical and numerical summaries (as appropriate), with interpretations.
Please note that at least one of your pairings must include the qualitative
variable and at least one of your pairings must not include the qualitative
variable.
· All DeVry University policies are in effect, including the plagiarism policy.
· Project Part A report is due by the end of Week 2.
· Project Part A is worth 100 total points. See grading rubric below.
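For the single-variable summaries described in the list above, the following is a minimal Python sketch that can serve as a cross-check on MINITAB output; it is not a substitute for the MINITAB work the assignment asks for. The data values are hypothetical, and the quartile rule used here (`method="exclusive"`) is an assumption that may not match MINITAB's interpolation exactly. The `skew_hint` field is only a rough heuristic based on comparing the mean with the median.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    data = sorted(values)
    # quantiles(n=4) returns the three quartile cut points.
    # Assumption: the "exclusive" rule; other packages may interpolate differently.
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return data[0], q1, median, q3, data[-1]


def describe(values):
    """Collect a few measures of center, spread, and a rough shape hint."""
    minimum, q1, median, q3, maximum = five_number_summary(values)
    mean = statistics.mean(values)
    if mean > median:
        hint = "right"      # mean pulled above the median
    elif mean < median:
        hint = "left"       # mean pulled below the median
    else:
        hint = "symmetric"
    return {
        "mean": mean,
        "median": median,
        "stdev": statistics.stdev(values),  # sample standard deviation
        "range": maximum - minimum,
        "iqr": q3 - q1,
        "skew_hint": hint,
    }


# Hypothetical values for one quantitative variable.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 15]
print(five_number_summary(sample))
print(describe(sample))
```

A histogram or boxplot is still needed to judge the shape properly; the numeric hint only flags which direction to look.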
Submission: The report including all relevant
graphs and numerical analysis along with interpretations.
Format for report:
A. Brief Introduction
B. Discuss your 1st individual variable, using a graph, a numerical summary,
and interpretation
C. Discuss your 2nd individual variable, using a graph, a numerical summary,
and interpretation
D. Discuss your 3rd individual variable, using a graph, a numerical summary,
and interpretation
E. Discuss your 1st pairing of variables, using a graph, a numerical summary,
and interpretation
F. Discuss your 2nd pairing of variables, using a graph, a numerical summary,
and interpretation
G. Discuss your 3rd pairing of variables, using a graph, a numerical summary,
and interpretation
H. Conclusion
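To see where the count of 10 pairings comes from, and how the two kinds of pairing (with and without the qualitative variable) might be summarized numerically, here is a small Python sketch. The variable names `V1` to `V5` and all data values are hypothetical placeholders, and treating `V5` as the single qualitative variable is an assumption; the real variables come from the data set in Doc Sharing.

```python
import math
from itertools import combinations

# Hypothetical variable names; V5 is assumed to be the qualitative variable.
variables = ["V1", "V2", "V3", "V4", "V5"]
qualitative = {"V5"}

pairings = list(combinations(variables, 2))
with_qualitative = [p for p in pairings if qualitative & set(p)]
without_qualitative = [p for p in pairings if not qualitative & set(p)]


def pearson_r(xs, ys):
    """Sample correlation coefficient for two quantitative variables."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def group_means(labels, values):
    """Mean of a quantitative variable within each level of a qualitative one."""
    groups = {}
    for label, value in zip(labels, values):
        groups.setdefault(label, []).append(value)
    return {label: sum(vs) / len(vs) for label, vs in groups.items()}


print(len(pairings), len(with_qualitative), len(without_qualitative))
print(pearson_r([1, 2, 3, 4], [2, 4, 5, 9]))                 # hypothetical data
print(group_means(["a", "b", "a", "b"], [10, 20, 14, 26]))   # hypothetical data
```

A correlation coefficient suits a pairing of two quantitative variables, while a comparison of group means (or side-by-side boxplots) suits a pairing that includes the qualitative variable.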
Project Part A Grading Rubric
Category | Points | % | Description |
Three Individual Variables – | 36 | 36 | graphical analysis, numerical |
Three Relationships – 15 pts. each | 45 | 45 | graphical analysis, numerical |
Communication Skills | 19 | 19 | writing, grammar, clarity, logic, |
Total | 100 | 100 | A quality paper will meet or |
Project Part B:
Hypothesis Testing and Confidence Intervals
Your Instructor will provide you with four
manager speculations, a.-d., in the Doc Sharing file.
1. Using the sample data, perform the hypothesis test for each of the above
situations to see if there is evidence to support your manager’s belief in each
case a.-d. In each case, use the Seven Elements of a Test of Hypothesis, in
Section 6.2 of your textbook, with the α provided by your Instructor in the Doc
Sharing materials, and explain your conclusion in simple terms. Also be sure to
compute the p-value and interpret it.
2. Follow this up by computing confidence intervals (the required confidence
level will be provided by your Instructor) for each of the variables described
in a.-d., and again interpret these intervals.
3. Write a report to your manager about the results, distilling them down in a
way that would be understandable to someone who does not know statistics. Clear
explanations and interpretations are critical.
4. All DeVry University policies are in effect, including the plagiarism policy.
5. Project Part B report is due by the end of Week 6.
6. Project Part B is worth 100 total points. See grading rubric below.
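The test statistic, p-value, and confidence interval requested in items 1 and 2 can be sketched in Python as a cross-check on Minitab. This sketch makes several assumptions that the assignment does not state: the speculation concerns a population mean, the sample is large enough for the normal approximation, and the hypothesized value `mu0`, the data, and the confidence level are all hypothetical. A speculation about a proportion, or a small sample, would call for a different procedure.

```python
import math
from statistics import NormalDist, mean, stdev


def z_test_mean(sample, mu0, alternative="two-sided"):
    """Large-sample z test of H0: mu = mu0. Returns (z, p_value)."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    z = (mean(sample) - mu0) / standard_error
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: mu > mu0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # Ha: mu < mu0
        p_value = std_normal.cdf(z)
    else:                             # Ha: mu != mu0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value


def z_confidence_interval(sample, level=0.95):
    """Large-sample confidence interval for the population mean."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    center = mean(sample)
    return center - z_crit * standard_error, center + z_crit * standard_error


# Hypothetical sample of 40 observations and a hypothetical claim mu0 = 49.
data = [48, 52] * 20
z, p = z_test_mean(data, mu0=49, alternative="greater")
print(z, p)
print(z_confidence_interval(data, level=0.95))
```

The decision rule is the usual one: reject the null hypothesis when the p-value is below the α supplied by the Instructor. For a two-sided test, the conclusion agrees with whether the hypothesized value falls outside the confidence interval at the matching level.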
Submission: The report from item 3, plus all of the relevant work done in the
hypothesis testing in item 1 (including Minitab output) and the confidence
intervals in item 2 (Minitab output), as an appendix.
Format for report:
A. Summary Report (about 1 paragraph on each of the speculations a.-d.)
B. Appendix with all of the steps in hypothesis testing (the format of the
Seven Elements of a Test of Hypothesis, in Section 6.2 of your textbook) for
each speculation a.-d., as well as the confidence intervals, the p-values, and
all Minitab output
Project Part B: Grading Rubric
Category | Points | % | Description |
Addressing each speculation – 20 | 80 | 80 | hypothesis test, interpretation, |
Summary report | 20 | 20 | 1 paragraph on each of the |
Total | 100 | 100 | A quality paper will meet or |
Project Part C:
Regression and Correlation Analysis
Your Instructor will specify for you the dependent variable and the independent
variables in your Case and data. Using MINITAB, perform the regression and
correlation analysis for the data by answering the following.
1. Generate a scatterplot for the specified dependent variable and the
specified independent variable, including the graph of the “best fit” line.
Interpret.
2. Determine the equation of the “best fit” line, which describes the
relationship between the dependent variable and the selected independent
variable.
3. Determine the coefficient of correlation. Interpret.
4. Determine the coefficient of determination. Interpret.
5. Test the utility of this regression model (use a two-tailed test with the α
provided by your Instructor). Interpret your results, including the p-value.
6. Based on your findings in items 1-5, what is your opinion about using the
designated independent variable to predict the designated dependent variable?
Explain.
7. Compute the confidence interval for beta-1 (the population slope), using the
confidence level specified by your Instructor. Interpret this interval.
8. Using an interval, estimate the mean of the dependent variable for a
selected value of the independent variable (to be provided by your Instructor).
Interpret this interval.
9. Using an interval, predict the particular value of the dependent variable
for a selected value of the independent variable (to be provided by your
Instructor). Interpret this interval.
10. What can we say about the value of the dependent variable for values of the
independent variable that are outside the range of the sample values? Explain
your answer.
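The quantities asked for in items 2 through 9 can be sketched in Python as a cross-check on MINITAB output. This is an illustration under stated assumptions, not the required method: the data are hypothetical, and the critical value `t_crit` is assumed to be looked up in a t table with n - 2 degrees of freedom for the confidence level the Instructor specifies, because the Python standard library has no t distribution.

```python
import math


def simple_regression(xs, ys):
    """Least-squares fit of y = b0 + b1*x with the usual summary statistics."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))           # estimated standard deviation of errors
    r = sxy / math.sqrt(sxx * syy)         # coefficient of correlation
    return {
        "n": n, "x_bar": x_bar, "sxx": sxx, "s": s,
        "b0": b0, "b1": b1, "r": r, "r2": r * r,
        "t_slope": b1 / (s / math.sqrt(sxx)),   # test statistic for H0: beta1 = 0
    }


def slope_interval(fit, t_crit):
    """Confidence interval for beta1; t_crit comes from a t table (df = n - 2)."""
    half_width = t_crit * fit["s"] / math.sqrt(fit["sxx"])
    return fit["b1"] - half_width, fit["b1"] + half_width


def intervals_at(fit, x0, t_crit):
    """Return (interval for the mean of y, prediction interval for one y) at x0."""
    y_hat = fit["b0"] + fit["b1"] * x0
    core = 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    mean_half = t_crit * fit["s"] * math.sqrt(core)
    pred_half = t_crit * fit["s"] * math.sqrt(1 + core)
    return (y_hat - mean_half, y_hat + mean_half), (y_hat - pred_half, y_hat + pred_half)


# Hypothetical data; 3.182 is the two-tailed 95% t value for df = 3.
fit = simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(fit["b0"], fit["b1"], fit["r"], fit["r2"], fit["t_slope"])
print(slope_interval(fit, t_crit=3.182))
print(intervals_at(fit, x0=3, t_crit=3.182))
```

The prediction interval for a single value is always wider than the interval for the mean at the same x0, and both widen as x0 moves away from the sample mean of x. For x0 outside the sampled range, the data give no evidence that the fitted line still applies, which is the concern raised in item 10.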
To try to improve the model, we will fit a multiple regression model that
predicts the dependent variable from all of the independent variables.
11. Using MINITAB, run the multiple regression analysis using the designated
dependent and independent variables. State the equation for this multiple
regression model.
12. Perform the Global Test for Utility (F-Test). Explain your conclusion.
13. Perform the t-test on each independent variable. Explain your conclusions
and clearly state how you should proceed. In particular, which independent
variables should be kept and which should be discarded? If any independent
variables are to be discarded, re-run the multiple regression, including only
the significant independent variables, and include the final Minitab output,
with interpretation.
14. Is this multiple regression model better than the linear model that we
generated in parts 1-10? Explain.
15. All DeVry University policies are in effect, including the plagiarism
policy.
16. Project Part C report is due by the end of Week 7.
17. Project Part C is worth 100 total points. See grading rubric below.
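For items 11 through 14, the sketch below shows one way to reproduce the coefficients, the global F statistic, the per-variable t statistics, and the adjusted R-squared in plain Python, as a cross-check on Minitab. It solves the normal equations directly, which is a simplification chosen for readability and is not necessarily how Minitab computes the fit. The data are hypothetical, and p-values are not computed here; the F and t statistics would be compared with table values at the Instructor's α.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def multiple_regression(rows, ys):
    """rows: one tuple of predictor values per observation; ys: responses."""
    x = [[1.0, *row] for row in rows]        # leading 1.0 is the intercept column
    n, p = len(x), len(x[0])                 # p counts the intercept as well
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(x, ys)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    y_bar = sum(ys) / n
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    sst = sum((y - y_bar) ** 2 for y in ys)
    k = p - 1                                # number of independent variables
    s2 = sse / (n - p)                       # mean square error
    t_stats = []
    for j in range(p):
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        c_jj = solve(xtx, unit)[j]           # j-th diagonal entry of (X'X)^-1
        t_stats.append(beta[j] / math.sqrt(s2 * c_jj))
    return {
        "beta": beta,
        "r2": 1 - sse / sst,
        "adj_r2": 1 - s2 / (sst / (n - 1)),
        "f_stat": ((sst - sse) / k) / s2,    # global test statistic, df = (k, n - p)
        "t_stats": t_stats,                  # one per coefficient, intercept first
    }


# Hypothetical data with two independent variables.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
ys = [5.2, 3.9, 11.1, 9.8, 17.3, 15.9]
print(multiple_regression(rows, ys))
```

When comparing this model with the one-variable model from items 1 through 10, adjusted R-squared is a common yardstick because plain R-squared never decreases when a variable is added.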
Summarize your results from 1-14 in a report
that is three pages or less in length and explains and interprets the results
in ways that are understandable to someone who does not know statistics.
Submission: The summary report + all of the work
done in 1-14 (Minitab Output + interpretations) as an appendix.
Format for report:
A. Summary Report
B. Points 1-14 addressed with appropriate output, graphs and interpretations.
Be sure to number each point 1-14.
Project Part C: Grading Rubric
Category | Points | % | Description |
Questions 1 – 12 and 14 – 5 pts. | 65 | 65 | addressed with appropriate output, |
Question 13 | 15 | 15 | addressed with appropriate output, |
Summary | 20 | 20 | writing, grammar, clarity, logic, |
Total | 100 | 100 | A quality paper will meet or |