Transcription

M 225 Test 1 A

Name (1 point)

SHOW YOUR WORK FOR FULL CREDIT!

Problem    Max. Points    Your Points
Total      75

Multiple choice questions (1 point each)

1. Look at the following histogram for salaries of baseball players. What shape would you say the data take?
   a) bimodal
   b) left-skewed
   c) right-skewed
   d) symmetric
   e) uniform

2. For the distribution of major league baseball players' salaries in the previous question, which measures of center and spread are more appropriate?
   a) Mean and standard deviation
   b) Median and interquartile range
   c) Mean and interquartile range
   d) Median and standard deviation

3. Which one of the following variables is NOT categorical?
   a) whether or not an individual has a cell phone
   b) the color of Reese's Pieces candy
   c) the airfare to a selected city from LAX
   d) the occupational background of a Civil War general

4. If a distribution is skewed to the right,
   a) the mean is less than the median
   b) the median is less than the mean
   c) the mean and the median are equal

5. What percent of the observations in a distribution lie between the first quartile Q1 and the third quartile Q3?
   a) About 25%
   b) About 50%
   c) About 75%
   d) 100%

6. Which of the following is LEAST affected if an extreme outlier is added to your data?
   a) the median
   b) the mean
   c) the standard deviation
   d) the range

7. Which of these statements are FALSE?
   a) There is a strong linear relationship between gender and height because we found a correlation of [value missing].
   b) Plant height and leaf height were found to be negatively correlated because the correlation coefficient is [value missing].
   c) Since the correlation between X and Y is 0, this means there is no relationship at all between these two variables.

   d) All of the above.
   e) None of the above.

8. What is the median of the following data: 1, 5, 4, 8, 11, 3, 9, 15, 2
   a) 4
   b) 5
   c) 4
   d) [option missing]

9. What are all the values that a correlation r can possibly take?
   a) r ≥ 0
   b) 0 ≤ r ≤ 1
   c) -1 ≤ r ≤ 1
   d) r [remainder of option missing]

10. Several pieces of fruit from each tree in an orchard are selected. Identify the sampling technique.
   a) Cluster sample
   b) SRS
   c) Stratified sample
   d) Multistage sample

11. A sample of households in a community is selected at random from the telephone directory. In this community, 4% of households have no telephone and another 35% have unlisted telephone numbers. The sample will certainly suffer from
   a) Nonresponse
   b) Undercoverage
   c) False response

12. Mr. Marino has compiled a list of 1,348 students in his high school. He has selected a sample of students by choosing every 14th student on this list, starting with a randomly selected student. Which type of sampling is he using?
   a) random
   b) stratified
   c) cluster
   d) systematic

13. Does drinking coffee tend to increase a student's performance in school? To answer this, I walked down to the cafeteria and found 10 coffee-drinking undergraduates and 10 non-coffee-drinking undergraduates. I asked them for their cumulative GPA and compared the GPAs for the two groups. Is this an experiment or an observational study?
   a) experiment
   b) observational study

14. Which one of the following statements is NOT true?
   a) The only way the standard deviation can be 0 is when all the observations have the same value.
   b) The correlation coefficient has the same units as the data.
   c) The standard deviation has the same units as the data.
   d) If the z-score for a value x is less than -2, the value is called an unusual value.
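The ideas behind questions 6 and 8 can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module: it summarizes the nine values from question 8, then adds one extreme value and summarizes again. The added value 150 and the helper name `summarize` are assumptions made for illustration; they do not appear on the test.

```python
import statistics

# The nine observations listed in question 8.
data = [1, 5, 4, 8, 11, 3, 9, 15, 2]


def summarize(values):
    """Return the median, mean, sample standard deviation, and range."""
    return {
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "range": max(values) - min(values),
    }


before = summarize(data)
# 150 is a hypothetical extreme outlier, chosen only for illustration.
after = summarize(data + [150])

for name in before:
    print(f"{name:>6}: {before[name]:8.2f} -> {after[name]:8.2f}")
```

With this particular outlier the median moves by 1.5, while the mean, standard deviation, and range all change by much larger amounts.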

15. (3 points) Match each of the five scatterplots with its correlation.
   [Scatterplots and correlation values not reproduced.]
   b a e d c

16. TRUE or FALSE (1 point each)
For the following five statements consider this: Infant mortality rates per 1000 live births for the following regions are summarized with side-by-side boxplots:
   Region 1: Asia (South and East) and the Pacific
   Region 2: Europe and Central Asia
   Region 3: Middle East and North Africa
   Region 4: North and Central America and the Caribbean
   Region 5: South America
   Region 6: Sub-Saharan Africa
   [Boxplots not reproduced.]

   T  F  About 75% of the countries in Region 1 have an infant mortality rate less than about 75, while about 75% of the countries in Region 6 have an infant mortality rate more than about 75.
   T  F  The distribution of infant mortality rates in Region 4 is left-skewed.
   T  F  The median infant mortality rate in Region 4 is about the same as the first quartile of Region 3.
   T  F  The variability in the infant mortality rate is the smallest in Regions 2 and 5.
   T  F  The infant mortality rate in all countries except for one in Region 2 is lower than the mortality rate of 50% of the countries in Region [number missing].

17. (4 points) Explain briefly the difference between an observational study and an experiment.
In an observational study we just observe the subjects. We don't impose any treatments. In an experiment we randomly assign treatments to the subjects. We can't establish a cause-and-effect relationship using observational studies, but well-designed experiments can establish one. That's why we prefer to use experiments when possible.

18. (4 points) What do we mean by a double-blind experiment? Why do we use double-blind experiments at all?
A double-blind experiment is one in which neither the subjects nor the experimenter knows which group of subjects gets the treatment and which group gets the placebo. We like to use double-blind experiments because this is the best way to avoid bias.

19. The following data represent the price (in cents per pound) paid to 15 farmers for apples.
   [Data values missing.]
a. (1 point) Is the variable quantitative or categorical?
Quantitative
b. (1 point) Which of the following graphical displays is appropriate for these data: stemplot or bar graph?
Stemplot
c. (4 points) Create the graph you picked in the previous part. Describe the shape of the distribution.

The shape of the distribution is fairly symmetric and bimodal.
d. (5 points) Find the five-number summary, and check the data set for outliers using the 1.5(IQR) rule.
Five-number summary: Min: 16.4   Q1: 17.4   Med: 18.4   Q3: 19.2   Max: 20.1
IQR = Q3 - Q1 = 19.2 - 17.4 = 1.8
Q1 - 1.5(IQR) = 17.4 - 1.5(1.8) = 14.7
Q3 + 1.5(IQR) = 19.2 + 1.5(1.8) = 21.9
No data below 14.7, so no low outliers. No data above 21.9, so no high outliers. No outliers.

20. The heights of women aged 20 to 29 are approximately Normal with a mean of 64 inches and a standard deviation of 2.7 inches.
a. (3 points) How tall are those women who are in the middle 95%?
The middle 95% lies within two standard deviations of the mean:
64 - 2(2.7) = 58.6
64 + 2(2.7) = 69.4
Thus, the height of women in the middle 95% is between 58.6 and 69.4 inches.
b. (3 points) What percent of women in this age group are taller than 66.7 inches?
Since 66.7 is one standard deviation above the mean, the upper tail is 16%. Thus, 16% of the women are taller than 66.7 inches.
c. (3 points) How tall are those women who are in the shortest 2.5%?
The shortest 2.5% is the lower tail more than 2 standard deviations below the mean. That cutoff is 58.6 inches. Thus, the shortest 2.5% of the women are 58.6 inches or shorter.
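The arithmetic in 19d and 20 can be reproduced with a short script. This is a sketch only: it starts from the five-number summary reported above (the 15 raw prices are not available here) and uses `statistics.NormalDist` from the Python standard library for the Normal model. Variable names such as `lower_fence` and `heights` are illustrative choices, not part of the test.

```python
from statistics import NormalDist

# Question 19d: values taken from the reported five-number summary.
minimum, q1, q3, maximum = 16.4, 17.4, 19.2, 20.1

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr  # anything below this is a low outlier
upper_fence = q3 + 1.5 * iqr  # anything above this is a high outlier
has_outliers = minimum < lower_fence or maximum > upper_fence

print(f"IQR = {iqr:.1f}, fences = ({lower_fence:.1f}, {upper_fence:.1f})")
print("outliers present:", has_outliers)

# Question 20: Normal model with mean 64 inches and SD 2.7 inches.
heights = NormalDist(mu=64, sigma=2.7)
low = heights.mean - 2 * heights.stdev   # lower end of the middle 95%
high = heights.mean + 2 * heights.stdev  # upper end of the middle 95%
middle = heights.cdf(high) - heights.cdf(low)
taller = 1 - heights.cdf(66.7)

print(f"middle interval: {low:.1f} to {high:.1f} inches ({middle:.1%})")
print(f"taller than 66.7 inches: {taller:.1%}")
```

The exact Normal probabilities (about 95.4% and 15.9%) differ slightly from the 95% and 16% given in the answers, which come from the 68-95-99.7 rule.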

21. Consider the following two distributions. The first one (A) shows the distribution of the number of houseplants owned by a sample of 30 households in Los Angeles. The second one (B) shows, for a sample of 30 freshmen, the distribution of the number of girlfriends/boyfriends they have ever had.
   [Graphs A and B not reproduced.]
a. (3 points) Which distribution has the lower standard deviation and why?
Distribution A has the lower standard deviation because most of the values are around the mean. Only a few are far from the mean.
b. (2 points) What percent of households have one houseplant or none?
2 households have one plant, and 1 household has no plants. That is 3 out of 30: 3/30 = 0.1 = 10%. (See yellow bars on the graph.)
c. (2 points) Select the statement below that gives the most complete and correct statistical description of graph A.
   A. The bars go from 0 to 10, increasing in height to 4, then decreasing to 10. The tallest bar is at 4. There is a gap between 8 and 10.
   B. The distribution is normal, with a mean of about 4 and a standard deviation of about 1.
   C. Most households seem to have about 4 houseplants, but some have more or less. One household has 10 plants.
   D. The distribution of the number of houseplants is somewhat symmetric and bell-shaped, with a possible outlier at 10. The typical number of houseplants owned is about 4, and the overall range is 10 plants.
d. (1 point) Which of these graphs is the boxplot for distribution B?

22. The ages (in weeks) and the numbers of hours slept in a day by eight infants are given below.
   Age:          [values missing]
   Hours slept:  [values missing]
   Summary statistics (values missing): mean of x, s_x, mean of y, s_y, r
a. (2 points) Identify the explanatory and response variables.
Explanatory variable: Age
Response variable: Hours slept
b. (4 points) Display the data in a scatter plot, clearly labeling the axes, and describe the plot.
   [Scatter plot not reproduced: Hours slept (vertical axis) versus Age (horizontal axis).]
Form: linear
Direction: negative
Strength: moderately strong
Outliers: maybe one
c. (3 points) Find the equation of the least squares line, and sketch the line on the plot. Use three decimal digits in your answers.
Y = [intercept and slope missing] X
d. (3 points) Predict the number of hours slept for a 15-week-old infant.
Y = [values missing] (15) = [value missing]
Using the regression line, we can predict that a 15-week-old infant sleeps about [value missing] hours a day.

e. (2 points) One observation greatly affects the apparent relationship. Circle it, and indicate which of the following is the most likely value of r if this point is removed: [options missing]
g. (2 points) Would it be OK to use the regression line to predict the number of hours slept for a 60-week-old infant? Explain.
No, it wouldn't be OK. 60 is outside the range of the explanatory variable (which is 8 to 45 from the table), so it's not reliable to use the regression line for this prediction. That would be extrapolation.
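Parts c, d, and g of question 22 follow a standard pattern: build the least-squares line from the summary statistics (slope = r · s_y / s_x, intercept = mean of y − slope · mean of x), predict inside the observed range, and refuse to extrapolate. The sketch below illustrates that pattern. The summary statistics passed in are HYPOTHETICAL placeholders, because the exam's actual values did not survive transcription; only the age range 8 to 45 weeks comes from the answer to part g. The function names are likewise illustrative.

```python
def least_squares_line(x_mean, x_sd, y_mean, y_sd, r):
    """Return (slope, intercept) of the least-squares line from summary statistics."""
    slope = r * y_sd / x_sd
    intercept = y_mean - slope * x_mean
    return slope, intercept


def predict(slope, intercept, x, x_min, x_max):
    """Predict y at x, raising ValueError if x lies outside the observed range."""
    if not x_min <= x <= x_max:
        raise ValueError(f"x = {x} is outside [{x_min}, {x_max}]: extrapolation")
    return intercept + slope * x


# HYPOTHETICAL summary statistics, NOT the values from the exam.
slope, intercept = least_squares_line(
    x_mean=25.0, x_sd=12.0, y_mean=14.0, y_sd=0.6, r=-0.9
)
print(f"Y = {intercept:.3f} + ({slope:.3f}) X")

# Observed age range of 8 to 45 weeks, as stated in the answer to part g.
print(f"predicted hours at 15 weeks: {predict(slope, intercept, 15, 8, 45):.3f}")

try:
    predict(slope, intercept, 60, 8, 45)
except ValueError as err:
    print("refused:", err)
```

Raising an error for ages outside 8 to 45 weeks mirrors the reasoning in part g: the fitted line says nothing reliable about a 60-week-old infant.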
