BA 275 Comprehensive Project

Summer 2018

Project Idea: Using one data set (all major airline domestic flights departing from Oregon airports in 2016), you will use all the tools we learn in this class. The point of any statistics class (including this one!) is to better understand the world through data. Although we have many tools, all of them come back to one question: “How can we better understand our data?” In order to really understand these tools, we will repeatedly ask the question, “what patterns can we find in this Oregon flight data?”

Project Procedure: Each week, you will be asked to use a different tool or approach on this same data. Each week you will add a new section to an ongoing project report. By the end of the course, you will have used every tool we have to better understand this data.

Project Data: The data is available on Canvas. If you want to check it for yourself, it’s available as a free download here: https://www.transtats.bts.gov/DataIndex.asp

You may find the “glossary” to be of help in decoding the meanings of some of what you read: https://www.transtats.bts.gov/glossary.asp

Week 2: In class this week we’ve learned about “margins of error” and “confidence intervals”, which allow us to estimate not just quantities we care about but also our level of uncertainty about those quantities.

Additionally, we’ve learned about “hypothesis testing”, which allows us to answer yes/no questions with a certain confidence.

1) Open your data in Excel and answer the following in complete sentences.

a) Explain why this data is a population, rather than a sample. Remember that we can generally describe a population using a phrase like, “this is a list of all of ___________.”

b) Like last time, we’ll calculate confidence intervals using random samples of this data. Choose 3 sets of 30 rows at random (it’s fine to use the same random 30 rows you picked last week). For each sample of 30, find the sample average and then calculate a 90% confidence interval for the average flight departure delay, being sure to show your work clearly. How many of the three confidence intervals captured the true mean?

Hint 1: Excel’s =randbetween(2,4863) will make picking a random row easy, if you didn’t do that last week.
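If you would rather script the row selection than do it in Excel, the same idea can be sketched in Python. This is only an illustration: it assumes, as Hint 1 implies, that the flights occupy spreadsheet rows 2 through 4863, and the function name pick_random_rows is made up for this example. Unlike repeated calls to RANDBETWEEN, random.sample never picks the same row twice.

```python
import random

FIRST_ROW = 2      # first data row (row 1 is assumed to hold the column headers)
LAST_ROW = 4863    # last data row, taken from Hint 1

def pick_random_rows(n=30, seed=None):
    """Return n distinct spreadsheet row numbers chosen at random."""
    rng = random.Random(seed)  # pass a seed to get the same rows again later
    return sorted(rng.sample(range(FIRST_ROW, LAST_ROW + 1), n))

rows = pick_random_rows(30)
print(rows)
```

Passing a seed (for example pick_random_rows(30, seed=1)) reproduces the same 30 rows, which helps if you want to reuse a sample in a later week.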

Hint 2: If you’re using Word, Insert>Equation will make your life easier as you show your work! (Alt+= is the shortcut for inserting equations)
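The confidence interval calculation in part b) can be sketched in Python as a way to check your hand work. This is a sketch under assumptions, not the required method: it uses the sample standard deviation together with the normal critical value (about 1.645 for 90% confidence). If the class uses a t critical value for n = 30 instead, that value is about 1.699 and the interval comes out slightly wider. The function name and the delay values below are made up for illustration and are not rows from the Oregon data.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(delays, confidence=0.90):
    """Return (lower, upper) bounds for the mean delay, in the units of `delays`."""
    n = len(delays)
    x_bar = mean(delays)                  # sample average
    s = stdev(delays)                     # sample standard deviation (divides by n - 1)
    # Two-sided critical value: leave (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)              # margin of error
    return x_bar - margin, x_bar + margin

# Made-up departure delays in minutes (negative means the flight left early).
sample = [-3, 12, 0, 45, -5, 7, 2, -1, 30, 4,
          -2, 0, 18, -4, 6, 9, -6, 1, 60, 3,
          -1, 5, 0, 22, -3, 8, 11, -2, 4, 15]
low, high = confidence_interval(sample)
print(f"90% CI: {low:.1f} to {high:.1f} minutes")
```

To see whether an interval captured the true mean, compare it with the average of the entire departure delay column, which you can compute because the full data set is available.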

2) Let’s use this data to conduct a hypothesis test. Write your claim: you might want to start, “I’m testing the claim that the average delay of a flight departing {airport in Oregon} is less than _____________.” Don’t forget units of time! This means that your:

Null hypothesis:

and the Alternative hypothesis:

a) Set your significance level: alpha = ____%.

b) As before, we’ll work from a random sample of this data. Choose 30 rows at random (it’s fine to use one of the samples of 30 from earlier in this project). Now, rather than finding a confidence interval, conduct a hypothesis test using the equation

You must use both the critical value method and the p-value method. Do both methods lead to the same conclusion?
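The two methods can be compared side by side in a short Python sketch. It rests on several assumptions that may not match what we used in class: a left-tailed test (the claim is that the mean is less than some value), a test statistic of the form z = (sample mean - claimed mean) / (s / sqrt(n)), and a normal distribution for the critical value and p-value rather than a t distribution. The function name, the claimed value of 15 minutes, and the sample delays are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def left_tailed_z_test(delays, mu_0, alpha=0.05):
    """Test H1: mean < mu_0 using both the critical value and p-value methods."""
    n = len(delays)
    # Test statistic: how many standard errors the sample mean sits from mu_0.
    z = (mean(delays) - mu_0) / (stdev(delays) / sqrt(n))
    z_crit = NormalDist().inv_cdf(alpha)   # left-tail critical value (negative)
    p_value = NormalDist().cdf(z)          # area to the left of the test statistic
    return {
        "z": z,
        "z_crit": z_crit,
        "p_value": p_value,
        "reject_by_critical_value": z < z_crit,
        "reject_by_p_value": p_value < alpha,
    }

# Made-up departure delays in minutes, for illustration only.
sample = [-3, 12, 0, 45, -5, 7, 2, -1, 30, 4,
          -2, 0, 18, -4, 6, 9, -6, 1, 60, 3,
          -1, 5, 0, 22, -3, 8, 11, -2, 4, 15]
result = left_tailed_z_test(sample, mu_0=15, alpha=0.05)
for name, value in result.items():
    print(name, value)
```

Because the p-value is the tail area beyond the test statistic and alpha is the tail area beyond the critical value, the two comparisons are two views of the same decision rule.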