University of California, Los Angeles Department of Statistics. Normal distribution

Statistics 100A
Instructor: Nicolas Christou

The normal distribution is the most important distribution. It describes well the distribution of random variables that arise in practice, such as the heights or weights of people, the total annual sales of a firm, exam scores, etc. It is also important for the central limit theorem, the approximation of other distributions such as the binomial, etc.

We say that a random variable \(X\) follows the normal distribution if the probability density function of \(X\) is given by

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty. \]

This is a bell-shaped curve. We write \(X \sim N(\mu, \sigma)\). We read: \(X\) follows the normal distribution (or \(X\) is normally distributed) with mean \(\mu\) and standard deviation \(\sigma\).

The normal distribution can be described completely by the two parameters \(\mu\) and \(\sigma\). As always, the mean is the center of the distribution and the standard deviation is the measure of the variation around the mean.

Shape of the normal distribution. Suppose \(X \sim N(5, 2)\).

[Figure: density f(x) of X ~ N(5, 2) plotted against x]
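As an illustration of the density formula, here is a minimal Python sketch that evaluates f(x) for the N(5, 2) example and cross-checks it against the standard library. The function name `normal_pdf` is an assumption made for this sketch and does not come from the handout.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at x, where sigma is the standard deviation."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# The curve peaks at the mean, x = mu = 5, for X ~ N(5, 2).
peak = normal_pdf(5, 5, 2)

# Cross-check the hand-written formula against the standard library.
library_peak = NormalDist(mu=5, sigma=2).pdf(5)

# Symmetry about the mean: points equally far from mu have equal density.
left, right = normal_pdf(3, 5, 2), normal_pdf(7, 5, 2)

print(round(peak, 4), round(library_peak, 4), math.isclose(left, right))
```

The peak height equals 1/(σ√(2π)), so a larger σ gives a lower and wider bell.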

The area under the normal curve is 1 (100%):

\[ \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \, dx = 1. \]

The normal distribution is symmetric about \(\mu\). Therefore, the area to the left of \(\mu\) is equal to the area to the right of \(\mu\) (50% each).

Useful rule (see figure above):
The interval \(\mu \pm 1\sigma\) covers the middle 68% of the distribution.
The interval \(\mu \pm 2\sigma\) covers the middle 95% of the distribution.
The interval \(\mu \pm 3\sigma\) covers the middle 99.7% (almost 100%) of the distribution.

Because the normal distribution is symmetric it follows that

\[ P(X > \mu + \alpha) = P(X < \mu - \alpha). \]

The normal distribution is a continuous distribution. Therefore, \(P(X \ge a) = P(X > a)\), because \(P(X = a) = 0\). Why?

How do we compute probabilities? Because the following integral has no closed-form solution,

\[ P(X > \alpha) = \int_{\alpha}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \, dx = \ldots \]

the computation of normal distribution probabilities can be done through the standard normal distribution \(Z\):

\[ Z = \frac{X - \mu}{\sigma}. \]

Theorem: Let \(X \sim N(\mu, \sigma)\). Then \(Y = \alpha X + \beta\) also follows the normal distribution, as follows:

\[ Y \sim N(\alpha\mu + \beta, \; |\alpha|\sigma). \]

Applying this theorem with \(\alpha = 1/\sigma\) and \(\beta = -\mu/\sigma\), we find that

\[ Z \sim N(0, 1). \]

It is said that the random variable \(Z\) follows the standard normal distribution, and we can find probabilities for the \(Z\) distribution from tables (see next pages).
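The rule for µ ± 1σ, 2σ, 3σ and the standardization step can be checked numerically. The sketch below uses Python's `statistics.NormalDist` in place of the printed table; the helper names `middle_area` and `prob_greater` are assumptions made for illustration.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # the standard normal distribution


def middle_area(k):
    """Area under the standard normal curve between -k and +k."""
    return Z.cdf(k) - Z.cdf(-k)


def prob_greater(a, mu, sigma):
    """P(X > a) for X ~ N(mu, sigma), computed by standardizing to Z."""
    z = (a - mu) / sigma  # number of standard deviations a lies from mu
    return 1 - Z.cdf(z)


for k in (1, 2, 3):
    print(f"mu +/- {k} sigma covers {middle_area(k):.4f} of the distribution")

# Symmetry: P(X > mu + d) equals P(X < mu - d); here mu = 5, sigma = 2, d = 1.5.
upper_tail = prob_greater(6.5, 5, 2)
lower_tail = NormalDist(5, 2).cdf(3.5)
print(round(upper_tail, 4), round(lower_tail, 4))
```

The three areas come out close to 0.6827, 0.9545 and 0.9973, which is why the rule is usually quoted as 68%, 95% and 99.7%.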

The standard normal distribution table:

[Table: standard normal distribution probabilities]

Example: Suppose the diameter of a certain car component follows the normal distribution with \(X \sim N(10, 3)\). Find the proportion of these components that have diameter larger than 13.4 mm. Or, if we randomly select one of these components, find the probability that its diameter will be larger than 13.4 mm.

\[ P(X > 13.4) = P(X - 10 > 13.4 - 10) = P\left(\frac{X - 10}{3} > \frac{13.4 - 10}{3}\right) = P(Z > 1.13) = 1 - 0.8708 = 0.1292. \]

We read the number 0.8708 from the table. First we find the value of z = 1.13 (first column and first row of the table; the first row gives the second decimal of the value of z). Therefore the probability that the diameter is larger than 13.4 mm is 12.92%.

Question: What is z? The value of z gives the number of standard deviations the particular value of X lies above or below the mean \(\mu\). In other words, \(X = \mu \pm z\sigma\), and in our example the value x = 13.4 lies 1.13 standard deviations above the mean. Of course z will be negative when the value of x is below the mean.

Example: Find the proportion of these components with diameter less than 5.1 mm.

\[ P(X < 5.1) = P\left(Z < \frac{5.1 - 10}{3}\right) = P(Z < -1.63) = 0.0516. \]

Here, the value x = 5.1 lies 1.63 standard deviations below the mean \(\mu\).

Finding percentiles of the normal distribution: Find the 25th percentile (or first quartile) of the distribution of X. In other words, find c such that \(P(X \le c) = 0.25\). First we find (approximately) the probability 0.25 in the table and we read the corresponding value of z. Here it is approximately z = -0.675. It is negative because the first quartile is below the mean. Therefore,

\[ x_{25} = 10 - 0.675(3) = 7.975. \]
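The car-component calculations can be reproduced without a table. The following is a sketch using `statistics.NormalDist`; the variable names are assumptions made for illustration. Because the code does not round z to two decimals, its results differ slightly from table-based values such as 0.1292.

```python
from statistics import NormalDist

# Diameter of the car component: X ~ N(10, 3), in mm.
X = NormalDist(mu=10, sigma=3)

# z-scores: how many standard deviations each value lies from the mean.
z_large = (13.4 - 10) / 3
z_small = (5.1 - 10) / 3

# P(X > 13.4) and P(X < 5.1), using the unrounded z-scores.
p_large = 1 - X.cdf(13.4)
p_small = X.cdf(5.1)

# 25th percentile: the value c with P(X <= c) = 0.25.
x25 = X.inv_cdf(0.25)

print(f"z = {z_large:.2f}, P(X > 13.4) = {p_large:.4f}")
print(f"z = {z_small:.2f}, P(X < 5.1)  = {p_small:.4f}")
print(f"25th percentile = {x25:.3f}")
```

`inv_cdf` plays the role of reading the table backwards: it takes a probability and returns the corresponding value of x.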

Normal distribution - Examples

Example 1
The lengths of the sardines received by a certain cannery are normally distributed with mean 4.62 inches and standard deviation 0.23 inch. What percentage of all these sardines is between 4.35 and 4.85 inches long?

Example 2
A baker knows that the daily demand for apple pies is a random variable which follows the normal distribution with mean 43.3 pies and standard deviation 4.6 pies. Find the demand which has probability 5% of being exceeded.

Example 3
Suppose that the height of UCLA female students has normal distribution with mean 62 inches and standard deviation 8 inches.
a. Find the height below which is the shortest 30% of the female students.
b. Find the height above which is the tallest 5% of the female students.

Example 4
A firm's marketing manager believes that total sales for next year will follow the normal distribution, with mean of $2.5 million and a standard deviation of $300,000.
a. What is the probability that the firm's sales will fall within $ of the mean?
b. Determine the sales level that has only a 9% chance of being exceeded next year.

Example 5
To avoid accusations of sexism in a college class equally populated by male and female students, the professor flips a fair coin to decide whether to call upon a male or female student to answer a question directed to the class. The professor will call upon a female student if a tails occurs. Suppose the professor does this 1000 times during the semester.
a. What is the probability that he calls upon a female student at least 530 times?
b. What is the probability that he calls upon a female student at most 480 times?
c. What is the probability that he calls upon a female student exactly 510 times?

Example 6
MENSA is an organization whose members possess IQs in the top 2% of the population.
a. If IQs are normally distributed, with mean 100 and a standard deviation of 16, what is the minimum IQ required for admission to MENSA?
b. If three individuals are chosen at random from the general population, what is the probability that all three satisfy the minimum requirement for MENSA?

Example 7
A manufacturing process produces semiconductor chips with a known failure rate of 6.3%. Assume that chip failures are independent of one another. You will be producing 2000 chips tomorrow.
a. Find the expected number of defective chips produced.
b. Find the standard deviation of the number of defective chips.
c. Find the probability (approximate) that you will produce less than 135 defects.

Exercise 8
Suppose that the height (X), in inches, of a 25-year-old man is a normal random variable with mean \(\mu = 70\) inches. If \(P(X > 79) = 0.025\), what is the standard deviation of this normal random variable?

Exercise 9
Suppose that the weight (X), in pounds, of a 40-year-old man is a normal random variable with standard deviation \(\sigma = 20\) pounds. If 5% of this population is heavier than 214 pounds, what is the mean \(\mu\) of this distribution?

Problem 10
At the Heinz ketchup factory the amounts which go into bottles of ketchup are supposed to be normally distributed with mean 36 oz. and standard deviation 0.1 oz. Once every 30 minutes a bottle is selected from the production line, and its contents are noted precisely. If the amount in the bottle goes below 35.8 oz. or above 36.2 oz., then the bottle will be declared out of control.
a. If the process is in control, meaning \(\mu = 36\) oz. and \(\sigma = 0.1\) oz., find the probability that a bottle will be declared out of control.
b. In the situation of (a), find the probability that the number of bottles found out of control in an eight-hour day (16 inspections) will be zero.
c. In the situation of (a), find the probability that the number of bottles found out of control in an eight-hour day (16 inspections) will be exactly one.
d. If the process shifts so that \(\mu = 37\) oz. and \(\sigma = 0.4\) oz., find the probability that a bottle will be declared out of control.

Problem 11
Suppose that a binary message (either 0 or 1) must be transmitted by wire from location A to location B. However, the data sent over the wire are subject to a channel noise disturbance, so to reduce the possibility of error, the value 2 is sent over the wire when the message is 1 and the value -2 is sent when the message is 0. If x, \(x = \pm 2\), is the value sent from location A, then R, the value received at location B, is given by R = x + N, where N is the channel noise disturbance. When the message is received at location B the receiver decodes it according to the following rule:
If \(R \ge 0.5\), then 1 is concluded.
If \(R < 0.5\), then 0 is concluded.
If the channel noise follows the standard normal distribution, compute the probability that the message will be wrong when decoded.

Normal distribution - Practice problems

Problem 1
The chickens of the Ornithes farm are processed when they are 20 weeks old. The distribution of their weights is normal with mean 3.8 lb and standard deviation 0.6 lb. The farm has created three categories for these chickens according to their weight: petite (weight less than 3.5 lb), standard (weight between 3.5 lb and 4.9 lb), and big (weight above 4.9 lb).
a. What proportion of these chickens will be in each category? Show these proportions on the normal distribution graph.
b. Find the 60th percentile of the distribution of the weight. In other words, find c such that \(P(X < c) = 0.60\).
c. Suppose that 5 chickens are selected at random. What is the probability that 3 of them will be petite?

Problem 2
A television cable company receives numerous phone calls throughout the day from customers reporting service troubles and from would-be subscribers to the cable network. Most of these callers are put on hold until a company operator is free to help them. The company has determined that the length of time a caller is on hold is normally distributed with a mean of 3.1 minutes and a standard deviation of 0.9 minutes. Company experts have decided that if as many as 5% of the callers are put on hold for 4.8 minutes or longer, more operators should be hired.
a. What proportion of the company's callers are put on hold for more than 4.8 minutes? Should the company hire more operators? Show these probabilities on a sketch of the normal curve.
b. At another cable company (length of time a caller is on hold follows the same distribution as before), 2.5% of the callers are put on hold for longer than x minutes. Find the value of x, and show this on a sketch of the normal curve.

Problem 3
Answer the following questions:
a. Suppose that the height (X), in inches, of a 25-year-old man is a normal random variable with mean \(\mu = 70\) inches. If \(P(X > 79) = 0.025\), what is the standard deviation of this normal random variable?
b. Suppose that the weight (X), in pounds, of a 40-year-old man is a normal random variable with standard deviation \(\sigma = 20\) pounds. If 5% of this population weigh less than 160 pounds, what is the mean \(\mu\) of this distribution?
c. Find an interval that covers the middle 95% of \(X \sim N(64, 8)\).

Problem 4
A bag of cookies is underweight if it weighs less than 500 grams. The filling process dispenses cookies with weight that follows the normal distribution with mean 510 grams and standard deviation 4 grams.
a. What is the probability that a randomly selected bag is underweight?
b. If you randomly select 5 bags, what is the probability that exactly 2 of them will be underweight?

Problem 5
Answer the following questions:
a. Suppose that X follows the normal distribution with mean \(\mu = 5\). If \(P(X > 9) = 0.2\), find the variance of X.
b. Let X be a normal random variable with mean \(\mu = 12\) and standard deviation \(\sigma = 2\). Find the 10th percentile of this distribution.
c. The weight X of watermelons is normally distributed with mean \(\mu = 10\) pounds and standard deviation \(\sigma = 2\) pounds. Find c such that P (X > c) =
d. The monthly return of a particular stock follows the normal distribution with mean 0.02 and standard deviation 0.1. Find the 85th percentile of this distribution.
e. Find the probability that the monthly return of the stock in question (d) will be larger than 0.2.
f. Find the probability that in one year (12 months), the return of the stock in question (d) will be larger than 0.2 in exactly 4 months. Assume that the returns are independent from month to month.
g. The annual rainfall X (in inches) at a certain region is normally distributed with mean \(\mu = 40\) inches and standard deviation \(\sigma = 4\) inches. What is the probability that, starting with this year, it will take more than 10 years before a year occurs having a rainfall of over 50 inches?
h. Let \(X \sim N(100, 20)\). Find P (X > 70 X < 90).

Problem 6
The diameters of apples from the Milo Farm follow the normal distribution with mean 3 inches and standard deviation 0.3 inch. Apples can be size-sorted by being made to roll over mesh screens. First the apples are rolled over a screen with mesh size 2.5 inches. This separates out all the apples with diameters < 2.5 inches. Second, the remaining apples are rolled over a screen with mesh size 3.2 inches. Find the proportion of apples with diameters < 2.5 inches, between 2.5 and 3.2 inches, and greater than 3.2 inches. Use only SOCR to find the answers and print the appropriate snapshots.

Normal distribution - Practice problems
Solutions

Problem 1
The chickens of the Ornithes farm are processed when they are 20 weeks old. The distribution of their weights is normal with mean 3.8 lb and standard deviation 0.6 lb. The farm has created three categories for these chickens according to their weight: petite (weight less than 3.5 lb), standard (weight between 3.5 lb and 4.9 lb), and big (weight above 4.9 lb).

a. What proportion of these chickens will be in each category? Show these proportions on the normal distribution graph.

Petite: \( P(X < 3.5) = P\left(Z < \frac{3.5 - 3.8}{0.6}\right) = P(Z < -0.50) = 0.3085. \)

Standard: \( P(3.5 < X < 4.9) = P\left(\frac{3.5 - 3.8}{0.6} < Z < \frac{4.9 - 3.8}{0.6}\right) = P(-0.5 < Z < 1.83) = 0.9664 - 0.3085 = 0.6579. \)

Big: \( P(X > 4.9) = P\left(Z > \frac{4.9 - 3.8}{0.6}\right) = P(Z > 1.83) = 1 - 0.9664 = 0.0336. \)

b. Find the 60th percentile of the distribution of the weight. In other words, find c such that \(P(X < c) = 0.60\).

From the z table, approximately z = 0.25. Therefore,

\[ 0.25 = \frac{x - 3.8}{0.6} \;\Rightarrow\; x = 3.8 + 0.25(0.6) = 3.95. \]

c. Suppose that 5 chickens are selected at random. What is the probability that 3 of them will be petite?

This is binomial with n = 5, p = 0.3085. Therefore,

\[ P(Y = 3) = \binom{5}{3} (0.3085)^3 (1 - 0.3085)^2 = 0.1404. \]

Problem 2
A television cable company receives numerous phone calls throughout the day from customers reporting service troubles and from would-be subscribers to the cable network. Most of these callers are put on hold until a company operator is free to help them. The company has determined that the length of time a caller is on hold is normally distributed with a mean of 3.1 minutes and a standard deviation of 0.9 minutes. Company experts have decided that if as many as 5% of the callers are put on hold for 4.8 minutes or longer, more operators should be hired.

a. What proportion of the company's callers are put on hold for more than 4.8 minutes? Should the company hire more operators? Show these probabilities on a sketch of the normal curve.

\[ P(X > 4.8) = P\left(Z > \frac{4.8 - 3.1}{0.9}\right) = P(Z > 1.89) = 1 - 0.9706 = 0.0294. \]

Since 2.94% is below the 5% threshold, the company does not need to hire more operators.

b. At another cable company (length of time a caller is on hold follows the same distribution as before), 2.5% of the callers are put on hold for longer than x minutes. Find the value of x, and show this on a sketch of the normal curve.

From the z table we find that z = 1.96. Therefore,

\[ 1.96 = \frac{x - 3.1}{0.9} \;\Rightarrow\; x = 3.1 + 1.96(0.9) = 4.864. \]
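The solutions to Problems 1 and 2 can be cross-checked with a short Python sketch. It relies only on the standard library; all variable names are assumptions made for illustration, and the results differ from table-based values in the third or fourth decimal because z is not rounded.

```python
from math import comb
from statistics import NormalDist

# Problem 1: chicken weights, X ~ N(3.8, 0.6) in lb.
weights = NormalDist(mu=3.8, sigma=0.6)
petite = weights.cdf(3.5)            # P(X < 3.5)
big = 1 - weights.cdf(4.9)           # P(X > 4.9)
standard = 1 - petite - big          # P(3.5 < X < 4.9)
c60 = weights.inv_cdf(0.60)          # 60th percentile

# Exactly 3 petite chickens out of 5: binomial with n = 5, p = petite.
p_three_petite = comb(5, 3) * petite**3 * (1 - petite) ** 2

# Problem 2: hold times, X ~ N(3.1, 0.9) in minutes.
hold = NormalDist(mu=3.1, sigma=0.9)
p_long = 1 - hold.cdf(4.8)           # P(X > 4.8)
hire_more = p_long >= 0.05           # the company's 5% rule
x_cut = hold.inv_cdf(1 - 0.025)      # 2.5% of callers wait longer than this

print(f"petite={petite:.4f} standard={standard:.4f} big={big:.4f}")
print(f"60th percentile={c60:.3f}, P(3 of 5 petite)={p_three_petite:.4f}")
print(f"P(X > 4.8)={p_long:.4f}, hire more: {hire_more}, x={x_cut:.3f}")
```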

Problem 3
Answer the following questions:

a. Suppose that the height (X), in inches, of a 25-year-old man is a normal random variable with mean \(\mu = 70\) inches. If \(P(X > 79) = 0.025\), what is the standard deviation of this normal random variable?

\[ 1.96 = \frac{79 - 70}{\sigma} \;\Rightarrow\; \sigma = \frac{9}{1.96} = 4.59. \]

b. Suppose that the weight (X), in pounds, of a 40-year-old man is a normal random variable with standard deviation \(\sigma = 20\) pounds. If 5% of this population weigh less than 160 pounds, what is the mean \(\mu\) of this distribution?

\[ -1.645 = \frac{160 - \mu}{20} \;\Rightarrow\; \mu = 160 + 20(1.645) = 192.9. \]

c. Find an interval that covers the middle 95% of \(X \sim N(64, 8)\).

We have 2.5% probability at each one of the two tails. Therefore,

\[ 1.96 = \frac{x - 64}{8} \;\Rightarrow\; x = 64 + 1.96(8) = 79.68, \]

\[ -1.96 = \frac{x - 64}{8} \;\Rightarrow\; x = 64 - 1.96(8) = 48.32. \]

Problem 4
A bag of cookies is underweight if it weighs less than 500 grams. The filling process dispenses cookies with weight that follows the normal distribution with mean 510 grams and standard deviation 4 grams.

a. What is the probability that a randomly selected bag is underweight?

\[ P(X < 500) = P\left(Z < \frac{500 - 510}{4}\right) = P(Z < -2.5) = 0.0062. \]

b. If you randomly select 5 bags, what is the probability that exactly 2 of them will be underweight?

\[ P(Y = 2) = \binom{5}{2} (0.0062)^2 (1 - 0.0062)^3 = 0.00038. \]
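Problems 3 and 4 run the standardization backwards: a tail probability fixes z, and z then gives the unknown σ, µ, or x. A minimal sketch follows, assuming P(X > 79) = 0.025 in part 3a (the value implied by z = 1.96 in the solution); the variable names are assumptions made for illustration.

```python
from math import comb
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

# 3a. P(X > 79) = 0.025 with mu = 70: solve z = (79 - 70) / sigma for sigma.
z_upper = Z.inv_cdf(1 - 0.025)           # about 1.96
sigma_height = (79 - 70) / z_upper

# 3b. P(X < 160) = 0.05 with sigma = 20: solve z = (160 - mu) / 20 for mu.
z_lower = Z.inv_cdf(0.05)                # about -1.645
mu_weight = 160 - 20 * z_lower

# 3c. Middle 95% of N(64, 8): cut off 2.5% in each tail.
low, high = 64 - z_upper * 8, 64 + z_upper * 8

# 4. Cookie bags, X ~ N(510, 4): underweight means X < 500.
p_under = NormalDist(510, 4).cdf(500)
p_two_of_five = comb(5, 2) * p_under**2 * (1 - p_under) ** 3

print(f"sigma={sigma_height:.2f}, mu={mu_weight:.1f}")
print(f"middle 95%: ({low:.2f}, {high:.2f})")
print(f"P(underweight)={p_under:.4f}, P(exactly 2 of 5)={p_two_of_five:.5f}")
```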

Normal approximation to binomial

Suppose that X follows the binomial distribution with parameters n and p. We can approximate binomial probabilities using the normal distribution as follows:

Calculate \(np\) and \(n(1-p)\). If both are \(\ge 5\), continue.
Compute \(\mu = np\) and \(\sigma = \sqrt{np(1-p)}\).

Here is how you can approximate binomial probabilities:

At least k successes: \( P(X \ge k) \approx P\left(Z > \frac{k - 0.5 - \mu}{\sigma}\right) \)
More than k successes: \( P(X > k) \approx P\left(Z > \frac{k + 0.5 - \mu}{\sigma}\right) \)
At most k successes: \( P(X \le k) \approx P\left(Z < \frac{k + 0.5 - \mu}{\sigma}\right) \)
Less than k successes: \( P(X < k) \approx P\left(Z < \frac{k - 0.5 - \mu}{\sigma}\right) \)
Exactly k successes: \( P(X = k) \approx P\left(\frac{k - 0.5 - \mu}{\sigma} < Z < \frac{k + 0.5 - \mu}{\sigma}\right) \)

Some comments:
The approximation is good if both \(np\) and \(n(1-p)\) are \(\ge 5\). These 2 requirements hold if n is large, or if n is not very large but \(p \approx 0.5\).
The \(\pm 0.5\) in the formulas above is the so-called continuity correction, and we should use it for a better approximation.

Example: A coin is flipped 1000 times. Find the probability that in these 1000 tosses we obtain at least 530 heads.

See figures below for the shape of the binomial distribution for different values of n and p.
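For the coin example, the "at least k" and "exactly k" formulas can be compared with exact binomial probabilities. The sketch below assumes a fair coin (p = 0.5), which lets the exact values be computed with integer arithmetic; the function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal


def approx_at_least(k, n, p):
    """Normal approximation to P(X >= k) with the continuity correction."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return 1 - Z.cdf((k - 0.5 - mu) / sigma)


def approx_exactly(k, n, p):
    """Normal approximation to P(X = k): area between k - 0.5 and k + 0.5."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return Z.cdf((k + 0.5 - mu) / sigma) - Z.cdf((k - 0.5 - mu) / sigma)


def exact_at_least_fair(k, n):
    """Exact P(X >= k) for a fair coin: count outcomes, divide by 2**n."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2**n


n, p = 1000, 0.5
assert n * p >= 5 and n * (1 - p) >= 5  # condition for using the approximation

approx = approx_at_least(530, n, p)
exact = exact_at_least_fair(530, n)
print(f"P(X >= 530): approx {approx:.4f}, exact {exact:.4f}")

approx_510 = approx_exactly(510, n, p)
exact_510 = comb(n, 510) / 2**n
print(f"P(X = 510): approx {approx_510:.4f}, exact {exact_510:.4f}")
```

With n = 1000 the two answers agree to about three decimal places, which illustrates why the continuity correction is worth applying.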

[Figure: binomial distribution, X ~ b(15, 0.55)]

Example: A manufacturing process produces semiconductor chips with a known failure rate of 6.3%. Assume that chip failures are independent of one another. You will be producing 2000 chips tomorrow.
a. Find the expected number of defective chips produced.
b. Find the standard deviation of the number of defective chips.
c. Find the probability (approximate) that you will produce less than 135 defects.
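One way to work this example is to treat the number of defective chips as binomial with n = 2000 and p = 0.063 and then apply the normal approximation with the continuity correction. The sketch below follows that approach; the variable names are assumptions made for illustration, and the handout's own worked solution is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist

n, p = 2000, 0.063

# Both n*p and n*(1-p) should be at least 5 for the approximation.
assert n * p >= 5 and n * (1 - p) >= 5

mu = n * p                        # a. expected number of defective chips
sigma = sqrt(n * p * (1 - p))     # b. standard deviation of the count

# c. "Less than 135" means X <= 134, so the corrected cutoff is 135 - 0.5.
z = (135 - 0.5 - mu) / sigma
p_less_135 = NormalDist().cdf(z)

print(f"mean={mu:.1f}, sd={sigma:.3f}, z={z:.2f}, P(X < 135)={p_less_135:.3f}")
```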
