COMPLETE Statistics Notes - Part 2 (got 4.0 in the course)

School: Boston College
Department: Economics
Course: ECON 1151
Professor: All
Semester: Spring

Study Guide: Midterm #2 Keely Henesey
Chapter 5: Continuous Random Variables & Probability Distributions
5.1: Continuous Random Variables
Cumulative Distribution Function: expresses the probability that continuous random variable X does not exceed the value of x, as a
function of x
F(x)=P(X≤x)
Probability That a Continuous Random Variable X Falls in a Specified Range:
P(a < X ≤ b) = F(b) − F(a)
Probability Density Function: a function f(x) with the following properties…
• f(x) > 0 for all values of x
• Area under the probability density function over all values of X within its range is equal to 1
• Probability that X lies between a and b (given a < b) is the area beneath f(x) between these points:
P(a ≤ X ≤ b) = ∫[a,b] f(x) dx
• Cumulative distribution function F(x₀) is the area under the probability density function up to x₀:
F(x₀) = ∫[xₘ, x₀] f(x) dx, where xₘ = minimum value of X
o Represents the probability of all values of X up to and including x₀
Areas Under Continuous Probability Density Functions: let X be a continuous random variable with probability density function f(x) and cumulative distribution function F(x)
• Total area under the curve f(x) is 1
• Derivative of the cumulative distribution function is the probability density function [F′(x) = f(x)]
o Area under the curve f(x) to the left of x₀ is F(x₀)
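A quick numerical sketch of the pdf/CDF relationships above. The density f(x) = 2x on [0, 1] is an illustrative example (not from the notes); the CDF is built as a Riemann-sum area, and its numerical derivative recovers the pdf.

```python
# Illustrative density (assumed for this sketch): f(x) = 2x on [0, 1]
def f(x):
    return 2 * x                  # f(x) >= 0 over the range

def F(x0, steps=100_000):
    """CDF as the area under f from the minimum value (0) up to x0."""
    dx = x0 / steps
    return sum(f(i * dx) * dx for i in range(steps))

total_area = F(1.0)               # area over the whole range -> 1
prob = F(0.8) - F(0.5)            # P(0.5 <= X <= 0.8) = F(0.8) - F(0.5)
h = 1e-6
deriv = (F(0.5 + h) - F(0.5)) / h # F'(x) recovers f(x) at x = 0.5
```

The checks confirm all three bullets: the total area is 1, an interval probability is a difference of CDF values, and F′(x) ≈ f(x).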
5.2: Expectations for Continuous Random Variables
Expected Value of Continuous Random Variable X: defined over range X ∈ [a,b]
μ = E[X] = ∫[a,b] x·f(x) dx
Variance of Continuous Random Variable X: defined over range X ∈ [a,b]
σ² = E[(X−μ)²] = ∫[a,b] (x−μ)²·f(x) dx    or equivalently    σ² = E[X²] − μ² = ∫[a,b] x²·f(x) dx − μ²
Expected Value of Functions of Continuous Random Variables: defined over range X ∈ [a,b]
E[g(X)] = ∫[a,b] g(x)·f(x) dx
Properties of Uniform Distributions: the probability is the same for all outcomes, defined over range X ∈ [a,b]
Probability Density Function: f(x) = 1/(b−a) for a ≤ X ≤ b
*Graph: horizontal line at height 1/(b−a) over the range
Cumulative Distribution Function: F(x₀) = f(x)·[x₀−a] = (x₀−a)/(b−a)
*Graph: line with constant slope
Expected Value: μ = E[X] = (a+b)/2
Variance: σ² = E[(X−μ)²] = (b−a)²/12
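A simulation sketch of the uniform-distribution facts above; a and b are illustrative values, not from the notes.

```python
import random
import statistics

# Illustrative uniform distribution on [a, b] (values assumed)
a, b = 2.0, 10.0
pdf_height = 1 / (b - a)               # f(x) = 1/(b-a)
mean_formula = (a + b) / 2             # E[X] = (a+b)/2 = 6
var_formula = (b - a) ** 2 / 12        # Var(X) = (b-a)^2/12 ~ 5.33

random.seed(0)
draws = [random.uniform(a, b) for _ in range(200_000)]
sim_mean = statistics.fmean(draws)     # should be close to 6
sim_var = statistics.pvariance(draws)  # should be close to 5.33
frac_below_4 = sum(x <= 4 for x in draws) / len(draws)  # F(4) = (4-2)/8 = 0.25
```

The simulated mean, variance, and CDF value all land near the closed-form answers.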
Properties for Linear Functions of a Continuous Random Variable: (same as those for discrete random variables)
• Let X be a continuous random variable with mean μ_X and variance σ²_X
• Define random variable W as W = a + bX
μ_W = E[a+bX] = a + bμ_X    σ²_W = Var(a+bX) = b²σ²_X    σ_W = |b|·σ_X
Mean & Variance of Special Linear Functions:
Special Case: Z = (X − μ_X)/σ_X, i.e., Z = a + bX with a = −μ_X/σ_X and b = 1/σ_X
Mean: E[Z] = 0    Variance: Var(Z) = 1
5.3: The Normal Distribution
Probability Density Function for a Normally Distributed Random Variable X:
f(x) = (1/√(2πσ²))·e^(−(x−μ)²/(2σ²)) for −∞ < x < ∞

5.4: Normal Distribution Approximation for Binomial Distribution
• If nP(1−P) > 5 … The normal distribution provides a good approximation for the binomial distribution
• If nP(1−P) < 5 … Use the binomial distribution to determine probabilities
• Mean of the Binomial Distribution: E[X] = μ = nP
• Variance of the Binomial Distribution: Var(X) = σ² = nP(1−P)
• If n is Big Enough, the Distribution of Random Variable Z is Approx. a Standard Normal Distribution…
Z = (X − E[X])/√Var(X) = (X − nP)/√(nP(1−P))
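A sketch comparing the exact binomial probability with its normal approximation; n = 100 and P = 0.4 are illustrative values (not from the notes), and the standard normal CDF is written with the error function.

```python
import math

# Illustrative values: n = 100 trials, probability of success P = 0.4
n, P = 100, 0.4
assert n * P * (1 - P) > 5            # rule of thumb above holds (24 > 5)

mu = n * P                            # E[X] = nP = 40
sigma = math.sqrt(n * P * (1 - P))    # sqrt(nP(1-P)) ~ 4.9

def norm_cdf(z):
    # Standard normal CDF via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Exact binomial P(X <= 45) vs. the normal approximation
exact = sum(math.comb(n, k) * P**k * (1 - P)**(n - k) for k in range(46))
approx = norm_cdf((45 - mu) / sigma)  # Z = (x - nP)/sqrt(nP(1-P))
```

The two probabilities agree to within a few hundredths; a continuity correction (using 45.5 instead of 45) would tighten the match further.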
Proportion Random Variable (p̂): computed by dividing the number of successes X by the sample size n
p̂ = X/n    μ_p̂ = P    σ²_p̂ = P(1−P)/n
• A proportion random variable can be used to compute probabilities for proportion or percentage intervals using an extension
of the normal distribution approximation for the binomial distribution
o Resulting mean and variance can be used with Normal distribution to compute desired probability
5.5: The Exponential Distribution
Exponential Random Variable: useful for waiting-line or queuing problems
T (t > 0)
• Used to Represent…
o Length of time until the end of a service time
o Length of time until the next arrival to a queuing process, beginning at an arbitrary time 0
• Exponential distribution differs from the normal distribution in two ways: (1) It is restricted to random variables with positive
values, and (2) Its distribution is not symmetric
Probability Density Function:
f(t) = λe^(−λt) for t > 0
• [λ]→ Mean Number of Independent Arrivals Per Time Unit
o [1/λ] → Mean Time Between Occurrences
• [t]→ Number of Time Units Until the Next Arrival
• Used to Represent… The probability that a success or arrival will occur during an interval of time t
Cumulative Distribution Function:
F(t) = 1 − e^(−λt) for t > 0
• [1/λ] → Mean
• [1/λ²] → Variance
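A small waiting-time sketch using the exponential CDF above; the rate λ = 3 arrivals per time unit is an illustrative value, not from the notes.

```python
import math

# Illustrative rate (assumed): lam = 3 arrivals per time unit
lam = 3.0
mean_wait = 1 / lam           # mean time between occurrences = 1/lam
variance = 1 / lam ** 2       # variance of the waiting time = 1/lam^2

def exp_cdf(t):
    # F(t) = 1 - e^(-lam*t): probability the next arrival occurs within t
    return 1 - math.exp(-lam * t)

p_within_half = exp_cdf(0.5)          # P(T <= 0.5)
p_longer_than_one = 1 - exp_cdf(1.0)  # P(T > 1) = e^(-lam)
```

Note the contrast with the normal distribution described above: all mass sits on t > 0 and the density is right-skewed, not symmetric.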
5.6: Jointly Distributed Continuous Random Variables
Joint Cumulative Probability Distribution Function: defines the probability that simultaneously X₁ is less than x₁, X₂ is less than x₂, and so on
F(x₁, x₂, …, x_K) = P(X₁ < x₁ ∩ X₂ < x₂ ∩ … ∩ X_K < x_K)
Linear Combination of Two Random Variables: if both X and Y are jointly normally distributed random variables, then the resulting random variable, W, is also normally distributed

W = aX + bY:
μ_W = E[W] = aμ_X + bμ_Y
σ²_W = a²σ²_X + b²σ²_Y + 2ab·Cov(X,Y) = a²σ²_X + b²σ²_Y + 2ab·Corr(X,Y)·σ_X·σ_Y

W = aX − bY:
μ_W = E[W] = aμ_X − bμ_Y
σ²_W = a²σ²_X + b²σ²_Y − 2ab·Cov(X,Y) = a²σ²_X + b²σ²_Y − 2ab·Corr(X,Y)·σ_X·σ_Y
See Page 214 for Example of a General Portfolio Analysis
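The W = aX + bY formulas above can be checked with a quick sketch; all numbers are illustrative (e.g. two portfolio weights a, b on asset returns X, Y), not from the notes.

```python
# Illustrative two-asset check of the linear-combination formulas
a, b = 0.6, 0.4                 # assumed weights
mu_x, mu_y = 10.0, 8.0          # assumed means of X and Y
sd_x, sd_y = 3.0, 2.0           # assumed standard deviations
corr_xy = 0.5                   # assumed Corr(X, Y)

cov_xy = corr_xy * sd_x * sd_y  # Cov(X,Y) = Corr(X,Y)*sd_x*sd_y = 3.0
mu_w = a * mu_x + b * mu_y      # E[W] = a*mu_x + b*mu_y
var_w = a**2 * sd_x**2 + b**2 * sd_y**2 + 2 * a * b * cov_xy
```

With these inputs, μ_W = 9.2 and σ²_W = 3.24 + 0.64 + 1.44 = 5.32; the positive covariance term raises the combined variance above the weighted sum of the individual variances.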
Chapter 6: Sampling & Sampling Distributions
6.1: Sampling from a Population
Random Sample (SRS): chosen by a process that selects a sample from a population in such a way that each member of the population has
the same probability of being selected, each selection is independent, and every possible sample of a given size has the same
probability of being chosen
• Insurance policy against allowing personal biases to influence the selection
SRSs Generally Achieve Greater Accuracy Than Spending the Resources to Measure Every Item Because…
1. Often very difficult/costly to obtain and measure every item in a population
2. Properly selected samples can be used to obtain measured estimates of actual population values
3. Using the probability distribution of sample statistics allows us to determine the error associated with the estimate of the
population parameter
Sample Information is Used To… Make INFERENCES about the parent population
• Can’t describe entire population based on a small SRS, but we CAN make educated inferences about important
characteristics of the population distribution
Sampling Distribution of a Statistic: distribution of the values of the statistic from ALL possible samples of a given size from the
same population
• Sampling distributions can be used to make an inference about its associated parameter
• Sampling distribution becomes concentrated closer to the population parameter as the sample size increases
6.2: Sampling Distributions of Sample Means
Sample Mean: let X₁, X₂, …, Xₙ denote a random sample from a population
X̄ = (1/n)·∑ Xᵢ, i = 1, …, n
Sampling Distribution of the Sample Mean: let X̄ denote the sample mean of a random sample of n independent observations from a population with mean μ and variance σ²
E[X̄] = μ
If N ≥ 10n … σ_X̄ = σ/√n
Otherwise … σ_X̄ = (σ/√n)·√((N−n)/(N−1))
• Because E[X̄] = μ, X̄ is an unbiased estimator of μ
• Each sampling distribution is centered on the population mean; as n increases, the distribution becomes more concentrated around the population mean because the standard error of the sample mean decreases
o Thus… the probability that the sample mean falls more than a fixed distance from the population mean decreases with increased n
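A simulation sketch of the standard-error result above: the spread of sample means shrinks like σ/√n. The population mean, standard deviation, and n are illustrative values, not from the notes.

```python
import math
import random
import statistics

random.seed(1)
mu, sigma, n = 50.0, 12.0, 36    # assumed population parameters and sample size

# Draw many samples of size n and record each sample mean
means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(20_000)]

se_theory = sigma / math.sqrt(n)     # sigma_xbar = 12/6 = 2.0
se_sim = statistics.pstdev(means)    # observed spread of the sample means
grand_mean = statistics.fmean(means) # E[xbar] = mu (unbiasedness)
```

The simulated standard deviation of the sample means sits near σ/√n = 2.0, and the mean of the sample means sits near μ = 50.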
Standard Normal Approximation Can Be Used If…
Z = (X̄ − μ)/σ_X̄
• Normally Distributed Parent Population (thus, normally distributed sampling distribution of sample means)
• Central Limit Theorem Holds (as n becomes large, regardless of the distribution of the parent population, the sampling distribution of the sample means will be normal)
o Because… The sum of the n random variables will be approximately normally distributed, and the mean is that sum divided by n; thus, it will be normally distributed as well
General Guidelines for C.L.T.
Population Distribution → Necessary n for a Normal Sample Mean Distribution
Normally Distributed → Any n
Symmetric Distribution → n ∈ [20, 25]
Uniform Distribution → n ≥ 25
Highly Skewed → n ≥ 50
Law of Large Numbers: regardless of the underlying probability distribution, the sample mean will approach the population mean as the sample size becomes large
lim(n→∞) X̄ = μ_pop
• As n becomes large, the variance becomes small, until eventually the distribution approaches a constant
Derivation of the Mean & Standard Deviation of the Sample Mean…
E[X̄] = E[(1/n)(X₁ + X₂ + … + Xₙ)] = (1/n)·∑ E[Xᵢ] = (1/n)·nμ = μ
• As more and more samples are drawn, the mean of the sample means approaches the true population mean
Var(X̄) = Var[(1/n)(X₁ + X₂ + … + Xₙ)] = (1/n²)·∑ Var(Xᵢ) = (1/n²)·nσ² = σ²/n ⇒ σ_X̄ = σ/√n
• As sample size increases, variance decreases (larger sample sizes result in more concentrated sampling distributions)
• If the sample size n is not a small fraction of the population size N, then the individual sample members are not distributed independently of one another → the observations are not selected independently
Var(X̄) = (σ²/n)·(N−n)/(N−1), so σ_X̄ = (σ/√n)·√((N−n)/(N−1))
• Finite Population Correction Factor : shows that the sample size, not the fraction of the population in the sample,
determines the precision of results (as measured by variance of sample mean) from a random sample
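A small sketch of the finite population correction factor above; the population size N, sample size n, and σ are illustrative values, not from the notes.

```python
import math

# Illustrative finite-population numbers (assumed):
# population of N = 1000, sample of n = 200, population sd sigma = 10
sigma, N, n = 10.0, 1000, 200

fpc = math.sqrt((N - n) / (N - 1))   # finite population correction factor
se_plain = sigma / math.sqrt(n)      # sigma/sqrt(n), ignoring the correction
se_corrected = se_plain * fpc        # sigma_xbar when n/N is not small
```

Since the sample here is 20% of the population, the correction shrinks the standard error noticeably; when n/N is tiny, fpc is close to 1 and σ/√n alone suffices.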
Acceptance Intervals: an interval within which a sample mean has a high probability of occurring, given that the population mean
and variance are known
μ ± z_(α/2)·σ_X̄
• Symmetric Acceptance Interval: can be used if X̄ has a normal distribution
o [1−α ]→ Probability That the Sample Mean is Included in the Interval
o [α]→ Probability That the Sample Mean is NOT Included in the Interval
• If the Sample Mean is Within the Acceptance Interval… We can accept the conclusion that the random sample came from
the population with the known population mean and variance
• If the Sample Mean is NOT Within the Acceptance Interval… We suspect that the population mean is not μ
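A sketch of building and using a symmetric acceptance interval; μ, σ, n, and the observed sample mean are illustrative values (not from the notes), and z = 1.96 is the standard normal critical value for α = 0.05.

```python
import math

# Illustrative known population parameters (assumed)
mu, sigma, n = 100.0, 15.0, 25
z_half_alpha = 1.96                  # z_{alpha/2} for alpha = 0.05

se = sigma / math.sqrt(n)            # sigma_xbar = 15/5 = 3.0
low = mu - z_half_alpha * se         # lower acceptance limit
high = mu + z_half_alpha * se        # upper acceptance limit

sample_mean = 104.0                  # an observed sample mean (assumed)
accept = low <= sample_mean <= high  # inside the interval -> accept
```

Here the interval is (94.12, 105.88), so a sample mean of 104.0 falls inside and we accept that the sample came from the stated population.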
Control Intervals & Their Application to Product Quality… (a control interval is an acceptance interval for sample means)
• Engineers work to achieve a small variance for product measurements that are directly related to product quality
• Once Small Variance is Achieved… Control interval is established that periodic random samples will be compared to
o If sample mean is within control interval it is concluded that the process is operating properly
6.3: Sampling Distributions of Sample Proportions
Sample Proportion: the sample proportion p̂ provides an estimate of parameter P, the proportion of population members that have the characteristic of interest; let X be the number of successes in a binomial sample of n observations with parameter P
p̂ = X/n
• [X] → The sum of a set of independent Bernoulli random variables, each with probability of success P
• [p̂] → The mean of a set of independent random variables
• Note… p̂ can be modeled as a normally distributed random variable if the CLT can be applied
Sampling Distribution of the Sample Proportion: let p̂ be the sample proportion of successes in a random sample from a population with proportion of success P
E[p̂] = P    σ_p̂ = √(P(1−P)/n)
• Because E[p̂] = P, p̂ is an unbiased estimator of P
• As the sample size increases, the standard deviation decreases (large samples → low variability)
• If nP(1−P) > 5, random variable Z is approximately standard normally distributed
Z = (p̂ − P)/σ_p̂
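A sketch of standardizing a sample proportion with the formulas above; P, n, and the observed success count are illustrative values, not from the notes.

```python
import math

# Illustrative values (assumed): P = 0.3, n = 400 observations
P, n = 0.3, 400
assert n * P * (1 - P) > 5            # normal approximation applies (84 > 5)

se_phat = math.sqrt(P * (1 - P) / n)  # sigma_phat = sqrt(P(1-P)/n)
x = 135                               # observed number of successes (assumed)
p_hat = x / n                         # sample proportion = 0.3375
z = (p_hat - P) / se_phat             # standardized sample proportion
```

The resulting z ≈ 1.64 can be compared against standard normal critical values, exactly as with the standardized sample mean.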
Chapter 7: Single Population Estimation
7.1: Properties of Point Estimators
Estimator of a Population Parameter: a random variable that depends on sample information (its value provides approximations of this unknown parameter) → a function of random variables
• Estimate: a specific value of the estimator of the population parameter → a single number
• Key Concept… An estimator is the process and the estimate is the result of that process
Point Estimator of a Population Parameter: a function of the sample information that produces a single number
• Point Estimate : the single number produced by the point estimator
Population Parameter | Point Estimator | Point Estimate | Unbiased? | Most Efficient?
μ | X̄ | x̄ | ✓ | ✓*
μ | Median | -- | ✓ | *