
Introduction

The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction and assessment. It defines the content knowledge, skills, and understandings that are measured by the Standards of Learning assessment. It provides additional guidance to teachers as they develop an instructional program appropriate for their students. It also assists teachers in their lesson planning by identifying essential understandings, defining essential content knowledge, and describing the intellectual skills students need to use. This Guide delineates in greater specificity the content that all teachers should teach and all students should learn.

The format of the Curriculum Guide facilitates teacher planning by identifying the key concepts, knowledge, and skills that should be the focus of instruction for each objective. The Curriculum Guide is divided into sections, including Essential Knowledge and Skills, Key Vocabulary, Resources, and Sample Instructional Strategies and Activities. The purpose of each section is explained below.

The opening section for each objective includes the objective, the focus or topic, and, in some cases, the foundational objectives that are being built upon.

Essential Knowledge and Skills: Each objective is expanded in this section. What each student should know and be able to do in each objective is outlined. This is not meant to be an exhaustive list, nor is it a list that limits what is taught in the classroom. This section is helpful to teachers when planning classroom assessments, as it is a guide to the knowledge and skills that define the objective.

Key Vocabulary: This section includes vocabulary that is key to the objective and is often the student's first introduction to new concepts and skills.

The section on essential understandings delineates the key concepts, ideas, and mathematical relationships that all students should grasp to demonstrate an understanding of the objectives.

A further section includes background information for the teacher. It contains content that is necessary for teaching the objective and may extend the teacher's knowledge of the objective beyond the current grade level. It may also contain definitions of key vocabulary to help facilitate student learning.

Resources: This section lists various resources that teachers may use when planning instruction. Teachers are not limited to only these resources.

Sample Instructional Strategies and Activities: This section lists ideas and suggestions that teachers may use when planning instruction.

Probability and Statistics in Prince William County is a semester course. Specific objectives have been selected from the VDOE Probability and Statistics Standards to meet the objectives of this semester course.

The Prince William County cross-content vocabulary terms that are in this course are: analyze, compare and contrast, conclude, evaluate, explain, generalize, question/inquire, sequence, solve, summarize, and synthesize.

The following chart lists the objectives for the Probability and Statistics Curriculum, organized by topic.

Topic                     Objectives
Descriptive Statistics    PS 1, PS 2, PS 3, PS 4
Data Collection           PS 8, PS 9
Probability               PS 11, PS 12, PS 13, PS 16
Inferential Statistics    PS 17

Descriptive Statistics

Virginia Standard PS.1
The student will analyze graphical displays of univariate data, including dotplots (line plots), stemplots (stem-and-leaf plots), and histograms, to identify and describe patterns and departures from patterns, using central tendency, spread, clusters, gaps, and outliers. Appropriate technology will be used to create graphical displays.

Essential Knowledge and Skills
The student will use problem solving, mathematical communication, mathematical reasoning, connections and representations to:
- Create and interpret graphical displays of data, including dotplots, stemplots and histograms.
- Examine graphs of data for clusters and gaps, and relate those phenomena to the data in context.
- Examine graphs of data for outliers, and explain the outlier(s) within the context of the data.
- Examine graphs of data, and identify the central tendency of the data as well as the spread. Explain the central tendency and the spread of the data within the context of the data.

Key Vocabulary
cluster, dotplot (line plot), gap, histogram, mean, measures of dispersion, median, mode, outliers, population, spread, stemplots (stem-and-leaf plot), univariate data

Essential Questions
- What are different methods by which data can be displayed?
- How do measures of dispersion describe data?

Essential Understandings
- Data are collected for a purpose and have meaning in a context.
- Measures of central tendency describe how the data cluster or group.
- Measures of dispersion describe how the data spread (disperse) around the center of the data.
- Graphical displays of data may be analyzed informally.
- Data analysis must take place within the context of the problem.

Univariate refers to an expression, equation, function, or polynomial of only one variable. Univariate data involve a single variable per case. A categorical variable places an individual into one of several groups or categories.
A quantitative variable takes numerical values for which arithmetic operations such as adding and averaging make sense. Quantitative data are often displayed in a histogram, and categorical data are often displayed in a bar chart.

A histogram is a bar graph that represents the frequency distribution of a data set. It has a horizontal scale that is quantitative and measures the data values, it has a vertical scale that measures the frequencies of the classes, and its consecutive bars must touch.

Population is the entire set of individuals from which samples are drawn. It is the set of people or things (units) that is being investigated.

A measure of central tendency is a value that represents a typical or central entry of a data set. The three most commonly used measures of central tendency are the mean, the median, and the mode.

The mean of a data set is the sum of the data entries divided by the number of entries. It is the balance point of a distribution. To find the mean of a data set, use one of the following formulas.

Population Mean: $\mu = \frac{\sum x}{N}$
Sample Mean: $\bar{x} = \frac{\sum x}{n}$

The lowercase Greek letter $\mu$ (pronounced mu) represents the population mean, and $\bar{x}$ (read as "x bar") represents the sample mean. Note that $N$ represents the number of entries in a population and $n$ represents the number of entries in a sample. The symbol $\Sigma$, for sum, means to add up all the values of $x$.
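The mean formula can be checked with a few lines of Python. This is a minimal sketch: the quiz scores and the helper name `mean` are illustrative assumptions, not data or notation from the guide. The arithmetic is the same for a population mean and a sample mean. Only the interpretation of the data set differs.

```python
# Hypothetical quiz scores, used only to illustrate the formula.
scores = [72, 85, 90, 68, 85]

def mean(values):
    """Return the sum of the data entries divided by the number of entries."""
    return sum(values) / len(values)

# 72 + 85 + 90 + 68 + 85 = 400, and 400 / 5 = 80.0
print(mean(scores))
```

If `scores` is every student in the class, the result plays the role of the population mean. If it is a sample from a larger group, it plays the role of the sample mean.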

The median is the midpoint of a distribution, the number such that half the data set is smaller and the other half is larger. To find the median of a distribution:
1. Arrange all data in the set in order of size, from smallest to largest.
2. If the number $n$ of data in the set is odd, the median $M$ is the center value in the ordered list. The location of the median is found by counting $\frac{n + 1}{2}$ data values up from the bottom of the list.
3. If the number $n$ of data in the set is even, the median $M$ is the average of the two center values in the ordered list. The location of the median is again $\frac{n + 1}{2}$ from the bottom of the list.

The mode is a peak of the distribution. There may be one, more than one, or no mode.

A cluster is a naturally occurring subgroup of a population used in sampling. On a plot, a cluster is a group of data clustering close to the same value, away from other groups.

A gap is a difference, as between two totals. On a plot, a gap is the space that separates clusters of data.

A stemplot (stem-and-leaf plot) is similar to a histogram but has the advantage that the graph still contains the original data values. In a stemplot, each number is separated into a stem (the entry's leftmost digits) and a leaf (the rightmost digit).

A dotplot is used to graph quantitative data. Each data entry is plotted using a point above its value on a horizontal axis. Like a stemplot, a dotplot shows how data are distributed, displays specific data entries, and reveals unusual data values (outliers).

An outlier is a data entry that is far removed from the other entries in the data set. It is a data value that falls outside the overall pattern of the graph.

Spread is the degree to which values in a distribution differ. Measures of variability, or spread, for quantitative variables include the standard deviation, interquartile range, and range.

Statisticians measure and analyze the dispersion (spread) of a data set about the mean in order to assist in making inferences about the population. One natural measure of spread would be the average of the deviations of each element from the mean, but those deviations always sum to zero. There are two methods to overcome this mathematical dilemma:
1. take the absolute value of the deviations before finding the average; or
2. square the deviations before finding the average.
The mean absolute deviation uses the first method, and the variance and standard deviation use the second.
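The two methods can be compared on a small data set. The sketch below is illustrative only: the data values and variable names are assumptions chosen so the arithmetic is easy to follow by hand.

```python
# Hypothetical data set; the mean is 25 / 5 = 5.0.
data = [2, 4, 4, 6, 9]
center = sum(data) / len(data)

# Deviations from the mean: -3, -1, -1, 1, 4. They always sum to zero.
deviations = [x - center for x in data]
print(sum(deviations))

# Method 1: average the absolute deviations (mean absolute deviation).
mean_absolute_deviation = sum(abs(d) for d in deviations) / len(data)

# Method 2: average the squared deviations (variance), then take the
# square root to return to the original units (standard deviation).
variance = sum(d ** 2 for d in deviations) / len(data)
standard_deviation = variance ** 0.5

print(mean_absolute_deviation, variance, standard_deviation)
```

Squaring weights the one large deviation (4) more heavily than the absolute value does, which is why the standard deviation here comes out larger than the mean absolute deviation.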

Descriptive Statistics

Virginia Standard PS.2
The student will analyze numerical characteristics of univariate data sets to describe patterns and departures from patterns, using mean, median, mode, variance, standard deviation, interquartile range, range, and outliers.

Essential Knowledge and Skills
The student will use problem solving, mathematical communication, mathematical reasoning, connections and representations to:
- Interpret mean, median, mode, range, interquartile range, variance, and standard deviation of a univariate data set in terms of the problem's context.
- Identify possible outliers, using an algorithm.
- Explain the influence of outliers on a univariate data set.
- Explain ways in which standard deviation addresses dispersion by examining the formula for standard deviation.

Key Vocabulary
deviation, dispersion, interquartile range, mean, median, mode, outlier, range, standard deviation, variance

Essential Questions
- Why is data collected?
- What is an outlier and how does it influence a data set?
- Do all dispersions contain an outlier?
- How are measures of central tendency used?
- What is meant by the spread of the data?

Essential Understandings
- Data are collected for a purpose and have meaning within a context.
- Analysis of the descriptive statistical information generated by a univariate data set should include the interplay between central tendency and dispersion as well as among specific measures.
- Data points identified algorithmically as outliers should not be excluded from the data unless sufficient evidence exists to show them to be in error.

The mean of a data set is the sum of the data entries divided by the number of entries.

The median of a data set is the value that lies in the middle of the data when the data set is ordered. If the data set has an odd number of entries, the median is the middle data entry. If the data set has an even number of entries, the median is the mean of the two middle data entries.
The mode of a data set is the data entry that occurs with the greatest frequency. If no data entry is repeated, the data set has no mode. If two entries occur with the same greatest frequency, each entry is a mode and the data set is called bimodal.

The range of a data set is the difference between the maximum and minimum data entries in the set.

Range = (Maximum data entry) - (Minimum data entry)

Deviation of a data entry in a population data set is the difference between the entry and the mean $\mu$ of the data set. More generally, it is the difference from the mean, $x - \bar{x}$, or from another measure of center.

Deviation of $x$ = $x - \mu$

Dispersion is the degree to which the values of a frequency distribution are scattered around some central point, usually the arithmetic mean or median.

The standard deviation is the positive square root of the variance of the data set. The greater the value of the standard deviation, the more spread out the data are about the mean. The lesser (closer to 0) the value of the standard deviation, the closer the data are clustered about the mean.
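The definitions of median, mode, and range translate directly into code. The following is a minimal sketch using Python's standard library. The data set and the helper `modes` are illustrative assumptions. The helper returns an empty list when no entry is repeated, to match the convention stated above that such a data set has no mode.

```python
from collections import Counter
from statistics import median

def modes(values):
    """Return every entry tied for the greatest frequency.

    Follows the convention that a data set with no repeated entry
    has no mode, so an empty list is returned in that case.
    """
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

# Hypothetical data; in order: 2, 3, 4, 7, 7, 9.
data = [3, 7, 7, 2, 9, 4]

data_median = median(data)         # even count: mean of 4 and 7
data_modes = modes(data)           # 7 occurs twice
data_range = max(data) - min(data) # 9 - 2

print(data_median, data_modes, data_range)
print(modes([1, 1, 2, 3, 3]))      # a bimodal data set
```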

Another characteristic of data is its level of measurement. The level of measurement determines which statistical calculations are meaningful. The four levels of measurement, in order from lowest to highest, are nominal, ordinal, interval, and ratio. The following table summarizes the four levels of measurement.

Level of Measurement                          Meaningful Calculations
Nominal (Qualitative data)                    Put in a category.
Ordinal (Qualitative or quantitative data)    Put in a category and put in order.
Interval (Quantitative data)                  Put in a category, put in order, and find differences between values.
Ratio (Quantitative data)                     Put in a category, put in order, find differences between values, and find ratios of values.

A variance is a measure of spread equal to the square of the standard deviation. The average of the squared deviations from the mean is known as the variance, and is another measure of the spread of the elements in a data set.

To calculate variance, use

$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$,

where $\mu$ represents the mean of the data set, $n$ represents the number of elements in the data set, and $x_i$ represents the $i$th element of the data set. (This is the formula that will be used on the new Algebra I SOL assessment and included on the formula sheet for the Algebra EOC SOL.)

The differences between the elements and the arithmetic mean are squared so that the differences do not cancel each other out when finding the sum. When squaring the differences, the units of measure are squared and larger differences are weighted more heavily than smaller differences.
In order to provide a measure of variation in terms of the original units of the data, the square root of the variance is taken, yielding the standard deviation.

The standard deviation is the positive square root of the variance of the data set. The greater the value of the standard deviation, the more spread out the data are about the mean. The lesser (closer to 0) the value of the standard deviation, the closer the data are clustered about the mean.

To calculate standard deviation, use

$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$,

where $\mu$ represents the mean of the data set, $n$ represents the number of elements in the data set, and $x_i$ represents the $i$th element of the data set. (This is the formula that will be used on the new Algebra I SOL assessment and included on the formula sheet for the Algebra EOC SOL.)

Often, textbooks will use two distinct formulas for standard deviation. In these formulas, the Greek letter $\sigma$, written and read "sigma", represents the standard deviation of a population, and $s$ represents the sample standard deviation. The population standard deviation can be estimated by calculating the sample standard deviation. The formulas for sample and population standard deviation look very similar, except that in the sample standard deviation formula, $n - 1$ is used instead of $n$ in the denominator. The reason for this is to account for the possibility of greater variability of data in the population than what is seen in the sample. When $n - 1$ is used in the denominator, the result is a larger number. Therefore, for the same data, the value calculated with the sample formula will be larger than the value calculated with the population formula. As sample sizes get larger ($n$ gets larger), the difference between the two values gets smaller. The use of $n - 1$ to calculate the sample standard deviation is known as Bessel's correction.
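The effect of dividing by n - 1 instead of n can be seen by applying both formulas to the same data. This sketch uses `pstdev` (divides by n) and `stdev` (divides by n - 1) from Python's standard `statistics` module. The data values are an illustrative assumption, chosen so that the population formula gives a whole number.

```python
from statistics import pstdev, stdev

# Hypothetical data: mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = pstdev(data)  # sqrt(32 / 8), which is 2
sample_sd = stdev(data)       # sqrt(32 / 7), slightly larger

print(population_sd, sample_sd)

# Repeating the data makes n larger, so the gap between the two
# formulas shrinks, as described in the text above.
big = data * 100
print(stdev(big) - pstdev(big))
```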

To locate the quartiles of a distribution, divide the ordered data into a lower and an upper half. Find the values that divide each half in half again. These two values, the lower quartile, $Q_1$, and the upper quartile, $Q_3$, together with the median, divide the data into fourths. The interquartile range, a measure of spread, is the distance between the upper and lower quartiles ($\mathrm{IQR} = Q_3 - Q_1$). The IQR spans the middle 50% of the data.

Outliers are unusual data values. Typically, a value is flagged as an outlier if it lies more than $1.5 \times \mathrm{IQR}$ below $Q_1$ or above $Q_3$. It is possible that a distribution will not contain outliers. For example, a small data set drawn from a roughly normal distribution often contains none.

Appropriate technology will be used to calculate statistics.
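The quartile and outlier procedure can be sketched as a short algorithm. The code below is one possible reading of it, with two labeled assumptions: when the number of entries is odd, the median itself is left out of both halves (textbooks and calculators differ on this), and the data set and function names are invented for illustration.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) by splitting the ordered data into halves.

    Assumption: with an odd number of entries, the middle entry is
    excluded from both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

def outliers(values):
    """Return entries more than 1.5 * IQR below Q1 or above Q3."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]

# Hypothetical data set with one unusually large value.
data = [1, 3, 4, 5, 5, 6, 7, 8, 30]
print(quartiles(data))  # Q1 = 3.5, median = 5, Q3 = 7.5
print(outliers(data))   # fences at -2.5 and 13.5, so only 30 is flagged
```

A value flagged this way is only a possible outlier. Whether it is an error or a genuine observation still has to be judged in the context of the data.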

Descriptive Statistics

Virginia Standard PS.3
The student will compare distributions of two or more univariate data sets, analyzing center and spread (within-group and between-group variations), clusters and gaps, shapes, outliers, or other unusual features.

Essential Knowledge and Skills
The student will use problem solving, mathematical communication, mathematical reasoning, connections and representations to:
- Compare and contrast two or more univariate data sets by analyzing measures of center and spread within a contextual framework.
- Describe any unusual features of the data, such as clusters, gaps, or outliers, within the context of the data.
- Analyze in context kurtosis and skewness in conjunction with other descriptive measures.

Key Vocabulary
kurtosis, measure of center, normal distribution, skewness, statistical tendency

Essential Questions
- How can unusual data be described?
- How is statistical tendency used?

Essential Understandings
- Data are collected for a purpose and have meaning in a context.
- Statistical tendency refers to typical cases but not necessarily to individual cases.

A measure of center is a single-number summary that measures the center of a distribution; usually the mean (or average) is used. Median and mode are also measures of center.

A normal distribution is a useful probability distribution that has a symmetric bell or mound shape and tails extending infinitely in both directions. Normal distributions are a family of probability models that assign probabilities to events as areas under a curve. The normal curves are symmetric and bell-shaped. A specific normal curve is completely described by giving its mean, $\mu$, and its standard deviation, $\sigma$.

Kurtosis is a measure of the concentration of a distribution about its mean.

Statistical tendency is a way in which something (data) typically behaves or happens or is likely to react, behave or happen.

Skewness is a measure of the asymmetry of a distribution about its mean.
If the mean equals the median, there is no skew and the distribution is typically symmetric. If the mean is smaller than the median, the distribution will typically be left skewed. If the mean is larger than the median, the distribution will typically be right skewed. Distributions are skewed when they show bunching at one end and a long tail stretching out in the other direction.

Data are useful only if they can be organized and presented so the meaning is clear. Two principles that are useful in exploring and analyzing data are:
1. Examine each variable by itself, and then look at the relationship among the variables.
2. Begin with a graph or graphs, then add numerical summaries of specific aspects of the data.

In any graph of data, the overall pattern can be described by its shape, center, and spread. Outliers fall outside the overall pattern.

Appropriate technology will be used to generate graphical displays.
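The comparison of mean and median described above can be written as a small check. This is a sketch of that rule of thumb only, not a formal skewness statistic. The function name and the three data sets are illustrative assumptions.

```python
from statistics import mean, median

def skew_direction(values):
    """Classify skew by comparing the mean with the median (rule of thumb)."""
    m, med = mean(values), median(values)
    if m > med:
        return "right skewed"
    if m < med:
        return "left skewed"
    return "no skew indicated"

# Hypothetical data sets.
long_right_tail = [20, 22, 25, 27, 30, 120]  # mean about 40.7, median 26
long_left_tail = [1, 9, 10, 10, 11, 12]      # mean about 8.8, median 10
balanced = [1, 2, 3, 4, 5]                   # mean 3, median 3

for name, values in [("long_right_tail", long_right_tail),
                     ("long_left_tail", long_left_tail),
                     ("balanced", balanced)]:
    print(name, skew_direction(values))
```

The single large value pulls the mean toward the long tail while the median barely moves, which is the bunching-and-tail pattern the text describes.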

Descriptive Statistics

Virginia Standard PS.4
The student will analyze scatterplots to identify and describe the relationship between two variables, using shape; strength of relationship; clusters; positive, negative, or no association; outliers; and influential points.

Essential Knowledge and Skills
The student will use problem solving, mathematical communication, mathematical reasoning, connections and representations to:
- Examine scatterplots of data, and describe skewness, kurtosis, and correlation within the context of the data.
- Describe and explain any unusual features of the data, such as clusters, gaps or outliers, within the context of the data.
- Identify influential data points (observations that have great effect on a line of best fit because of extreme x-values) and describe the effect of the influential points.

Key Vocabulary
bivariate data, correlation, influential point, kurtosis, regression, scatterplot

Essential Questions
- How can graphs be used to examine data?
- What is the role of outliers in data observations?
- What is strength of an association between two variables?

Essential Understandings
- A scatterplot serves two purposes:
  - to determine if there is a useful relationship between two variables, and
  - to determine the family of equations that describes the relationship.
- Data are collected for a purpose and have meaning in a context.
- Association between two variables considers both the direction and strength of the association.
- The strength of an association between two variables reflects how accurately the value of one variable can be predicted based on the value of the other variable.
- Outliers are observations with large residuals and do not follow the pattern apparent in the other data points.

A scatterplot is a plot that shows the relationship between two quantitative variables, usually with each case represented by a dot.

On a scatterplot, an influential point is a point that strongly influences the regression equation and correlation.
To judge a point's influence, fit a line and compute a correlation first with, and then without, the point in question.

Kurtosis is a descriptive property of distributions designed to indicate the general form of concentration around the mean.

A correlation is a numerical value between -1 and 1 inclusive that measures the strength and direction of a linear relationship between two variables.

An association between two variables is strong if there is little variation within each vertical strip (the conditional distribution of y given x). If there is a lot of variation, the relationship is weak.

A regression is the statistical study of the relationship between two (or more) quantitative variables, such as fitting a line to bivariate data.

Bivariate data are data that involve two variables per case. For quantitative variables, they are often displayed on a scatterplot.

Influential points are points that cause large changes in parameter estimates when they are deleted. For example, a substantially low score on a test (outlier) will affect the mean of the distribution. If it is deleted, the measures of central tendency and dispersion will change.
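The "with, and then without" comparison can be sketched in code. The guide does not give a formula for correlation, so the function below uses the standard Pearson correlation coefficient as an assumption. The data points, including the extreme point (15, 12), are invented for illustration.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient of paired data (standard formula)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical bivariate data; the last point has an extreme x-value.
points = [(1, 2), (2, 1), (3, 3), (4, 2), (5, 3), (15, 12)]

xs_all = [x for x, _ in points]
ys_all = [y for _, y in points]
r_with = correlation(xs_all, ys_all)

xs_trimmed, ys_trimmed = xs_all[:-1], ys_all[:-1]
r_without = correlation(xs_trimmed, ys_trimmed)

print(round(r_with, 3), round(r_without, 3))
```

Here the single extreme point raises the correlation from a moderate value to one close to 1, so it would be described as influential. The same with-and-without comparison applies to the fitted line.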

If a logical relationship exists between two variables, a graph is used to plot the available data. A scatterplot contains an x (independent or explanatory) value and a y (dependent or response) value. A scatterplot serves two purposes:
1. it helps to see if there is a useful relationship between the two variables; and
2. it helps to determine the type of equation to use to describe the relationship.

Appropriate technology will be used to generate scatterplots and identify outliers and influential points.

Data Collection

Virginia Standard PS.8
The student will describe the methods of data collection in a census, sample survey, experiment, and observational study and identify an appropriate method of solution for a given problem setting.

Essential Knowledge and Skills
The student will use problem solving, mathematical communication, mathematical reasoning, connections and representations to:
- Compare and contrast controlled experiments and observational studies and the conclusions one can draw from each.
- Compare and contrast population and sample and parameter and statistic.
- Identify biased sampling methods.
- Describe simple random sampling.
- Select a data collection method appropriate for a given context.

Key Vocabulary
biased, biased sampling, census, control group, experiment, observational study, parameter, population, sample survey, simple random sampling, statistic

Essential Questions
- What are the various methods of data collection?
- How does data collection affect conclusions for a problem?
- What are the differences between controlled experiments and observational studies?
- What determines whether a sample is biased?

Essential Understandings
- The value of a sample statistic varies from sample to sample if the simple random samples are taken repeatedly from the population of interest.
- Poor data collection can lead to misleading and meaningless conclusions.

An experiment is when a treatment is assigned to a person, animal, or object, to observe a response.

A control group is a group in an experiment that provides a standard for comparison to evaluate the effectiveness of a treatment; it is often given the placebo.

An observational study is a study in which the conditions of interest are already built into the units being studied and are not randomly assigned.

Population is the set of people or things (units) that is being investigated.

Census is a count or measure of the entire population.

A parameter is a summary number that describes a population (usually unknown) or a probability distribution.
It is a numerical description of a population characteristic.

A statistic is any function of a number of random variables, usually identically distributed, that may be used as an estimator for a population parameter. A statistic is a numerical description of a sample characteristic. The value of a statistic is known when a sample is taken, but it can change from sample to sample.

A sampling method is biased if it tends to give samples in which some characteristic of the population is underrepresented or overrepresented (biased sampling).

Sample selection bias (as in convenience sampling) is the extent to which a sampling procedure produces samples that tend to result in numerical summaries that are systematically too high or too low.

Simple random sampling selects the individuals in a sample by using some random process.

A sample survey is an investigation of one or more characteristics of a population.
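The difference between a parameter and a statistic can be illustrated by drawing several simple random samples from a known population. Everything in this sketch is an assumption made for illustration: the population is just the whole numbers 1 to 100, and the seed is fixed only so that the run is repeatable.

```python
import random
from statistics import mean

# Hypothetical population whose parameter (the mean) is known exactly.
population = list(range(1, 101))
parameter = mean(population)  # 50.5

# Draw five simple random samples of size 10 and compute each sample mean.
rng = random.Random(0)  # fixed seed, for repeatability only
sample_means = [mean(rng.sample(population, 10)) for _ in range(5)]

print("parameter:", parameter)
print("statistics:", sample_means)
```

The parameter is a single fixed number, while each sample produces its own value of the statistic, generally close to the parameter but not equal to it.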

Statistical Studies

Differences between the two types of statistical studies:

Observational (observe and measure but do not modify): Observe individuals and measure variables of interest but do not influence the responses. The purpose is to describe some group or situation.

Experimental (apply some treatment): Impose some treatment on individuals in order to observe their responses. The purpose of an experiment is to study whether the treatment causes a change in the response.

Data Collection

Virginia Standard PS.9
The student will plan and conduct a survey. The plan will address sampling techniques (e.g., simple random and stratified) and methods to reduce bias.

Essential Knowledge and Skills
The student will use problem solving, mathematical communication, mathematical reasoning, connections and representations to:
- Investigate and describe sampling techniques, such as simple random sampling, stratified sampling, and cluster sampling.
- Determine which sampling technique is best, given a particular context.
- Plan a survey to answer a question or address an issue.
- Given a plan for a survey, identify possible sources of bias, and describe ways to reduce bias.
- Design a survey instrument.
- Conduct a survey.

Key Vocabulary
biased sample, cluster, cluster sample, convenience sample, simple random sampling, stratified sampling, survey

Essential Questions
- What is required to plan and conduct a survey?
- What are sampling techniques and how do they reduce bias?

Essential Understandings
- The purpose of sampling is to provide sufficient information so that population characteristics may be inferred.
- Inherent bias diminishes as sample size increases.

To survey is to look over or examine in detail. A survey is a detailed collection of information. Surveys can be valuable in determining the attitude of a population about a candidate, product, or issue. The most common types of surveys are done by interview, mail, or telephone. In designing a survey, it is important to word the questions so they do not lead to biased results.

The design of a statistical study is biased if it systematically favors certain outcomes. A biased sample has a distribution that is not determined only by the population from which it is drawn, but also by some property that influences the distribution of the sample. Biased samples do not represent the entire population of the study. For example, an opinion poll might be biased by geographical location.
Another source of bias is voluntary response samples. These are biased because people with strong opinions are more likely to respond.

A cluster is a naturally occurring subgroup of a population used in cluster sampling. A cluster sample is used when a population falls into naturally occurring subgroups that have similar characteristics.

Simple random sampling is the process of collecting samples devised to avoid any interference from any shared property of, or relation between, the elements selected, so that the sample's distribution is affected only by that of the whole population and can therefore be taken to be representative of it.

A stratified sample is a sample that is not drawn at random from the whole population, but is drawn separately from a number of disjoint strata of the population in order to ensure a more representative sample. To achieve a stratified random sample, divide the units of the sampling frame into nonoverlapping subgroups and choose a simple random sample from each subgroup.

A type of sample that often leads to biased studies is a convenience sample. A convenience sample consists only of available members of the population or members of the population that are easiest to reach. For this reason, this is not a recommended sampling technique.
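The contrast between a simple random sample and a stratified random sample can be sketched with Python's `random` module. The sampling frame, the grade-level strata, the sample sizes, and the function names below are all hypothetical, chosen only to show the two procedures side by side.

```python
import random

def simple_random_sample(units, k, rng):
    """Choose k units at random from the whole sampling frame."""
    return rng.sample(units, k)

def stratified_random_sample(strata, k_per_stratum, rng):
    """Choose a simple random sample of k units from each stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

# Hypothetical sampling frame: students split into nonoverlapping strata.
strata = {
    "grade 9": [f"g9-{i}" for i in range(30)],
    "grade 10": [f"g10-{i}" for i in range(25)],
    "grade 11": [f"g11-{i}" for i in range(20)],
}
frame = [student for members in strata.values() for student in members]

rng = random.Random(42)  # fixed seed, for repeatability only
srs = simple_random_sample(frame, 9, rng)
stratified = stratified_random_sample(strata, 3, rng)

print(srs)
print(stratified)
```

The simple random sample may, by chance, over- or under-represent a grade. The stratified sample guarantees three students from every grade, which is the sense in which stratifying can make a sample more representative.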

Probability

Virginia Standard PS.11
The student will identify and describe two or more events as complementary, dependent, independent, and/or mutually exclusive.

Essential Knowledge and Skills
The student will use problem solving, mathematical communication, mathematical reasoning, connections and representations to:
- Define and give contextual examples of complementary, dependent, independent, and mutually exclusive events.
- Given two or more events in a problem setting, determine if the events are complementary, dependent, independent, and/or mutually exclusive.

Key Vocabulary
complement, dependent event, independent event, mutually exclusive

Essential Questions
- What is meant by mutually exclusive?
- What is meant by independent/dependent outcomes?
- How are events defined and what are examples of each?

Essential Understandings
- The complement of event A consists of all outcomes in which event A does not occur.
- Two events, A and B, are independent if the occurrence of one does not affect the probability of the occurrence of the other. If A and B are not independent, then they are said to be dependent.
- Events A and B are mutually exclusive if they cannot occur simultaneously.

The sum of the probabilities of all outcomes in a sample space is 1 or 100%. In the following examples, the rectangle represents the total probability of the sample space.

Two events A and B are mutually exclusive if A and B cannot occur at the same time.

[Figure: events A and B shown inside the sample-space rectangle. A and B are mutually exclusive.]

The complement of event E is the set of all outcomes in a sample space that are not included in event E. The complement of event E is denoted by E′ and is read as "E prime."

[Figure: circle E inside the sample-space rectangle, with E′ outside the circle.]

The area of the circle represents the probability of event E, and the area outside the circle represents the probability of the complement of event E.

Two events are independent if the occurrence of one of the events does not affect the probability of the occurrence of the other event. Events that are not independent are dependent events. An example of independent events is the roll of a die and the flip of a coin: the result of the roll does not change the probability of either result of the flip.
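The die-and-coin example can be checked directly by listing the sample space. The Python sketch below is an illustration only: the event names and helper functions are assumptions, and independence is tested with the multiplication rule for independent events, P(A and B) = P(A) · P(B).

```python
from fractions import Fraction
from itertools import product

# Sample space: one roll of a fair die together with one flip of a fair coin.
sample_space = set(product(range(1, 7), "HT"))

def prob(event):
    """Theoretical probability when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def complement(event):
    """All outcomes in the sample space that are not in the event."""
    return sample_space - event

def mutually_exclusive(a, b):
    """True when the two events share no outcome."""
    return not (a & b)

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

six = {o for o in sample_space if o[0] == 6}        # die shows 6
heads = {o for o in sample_space if o[1] == "H"}    # coin shows heads
even = {o for o in sample_space if o[0] % 2 == 0}   # die shows 2, 4, or 6
odd = complement(even)                              # die shows 1, 3, or 5

print(independent(six, heads))         # True: die and coin do not interact
print(mutually_exclusive(even, odd))   # True: a roll cannot be both
print(independent(six, even))          # False: knowing "even" changes P(six)
print(prob(even) + prob(odd))          # 1: an event and its complement
```

Mutually exclusive events with nonzero probability, such as "even" and "odd" here, are always dependent, because the occurrence of one makes the other impossible.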

Probability

Virginia Standard PS.12
The student will find probabilities (relative frequency and theoretical), including conditional probabilities for events that are either dependent or independent, by applying the Law of Large Numbers concept, the addition rule, and the multiplication rule.

Essential Knowledge and Skills
The student will use problem solving, mathematical communication, mathematical reasoning, connections and representations to:
Calculate relative frequency and expected frequency.
Find conditional probabilities for dependent, independent, and mutually exclusive events.

Key Vocabulary
addition rule
conditional probability
Law of Large Numbers
multiplication rule
relative frequency
theoretical probability

Essential Questions
How are probabilities calculated?
How is the Law of Large Numbers applied?

Essential Understandings
Data are collected for a purpose and have meaning in a context.
Venn diagrams may be used to find conditional probabilities.
The Law of Large Numbers states that as a procedure is repeated again and again, the relative frequency probability of an event tends to approach the actual probability.

Theoretical probability is used when each outcome in a sample space is equally likely to occur.

$P(E) = \frac{\text{Number of outcomes in } E}{\text{Total number of outcomes in sample space}}$

A conditional probability is the probability of an event occurring, given that another event has already occurred. The conditional probability of event B occurring, given that event A has occurred, is denoted by $P(B \mid A)$ and is read as "probability of B, given A."

To find the probability of two events occurring in sequence, use the multiplication rule: $P(A \text{ and } B) = P(A) \cdot P(B \mid A)$. If events A and B are independent, then the rule simplifies to: $P(A \text{ and } B) = P(A) \cdot P(B)$. To use the multiplication rule, first find the probability that the first event occurs, then find the probability that the second event occurs given that the first event has occurred, and then multiply these two probabilities.
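As a worked illustration of conditional probability and the multiplication rule, the Python sketch below finds the probability of drawing two aces in a row. The 52-card deck and all variable names are assumptions made for this example; the rule-based answer is cross-checked by counting every ordered pair of distinct cards.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical standard 52-card deck; rank 1 stands for an ace.
deck = [(rank, suit) for rank in range(1, 14) for suit in "SHDC"]

# Dependent events (no replacement): P(A and B) = P(A) * P(B | A).
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)   # one ace is already gone
p_both_by_rule = p_first_ace * p_second_ace_given_first

# Cross-check by counting ordered pairs of distinct cards.
ordered_pairs = list(permutations(deck, 2))
first_ace = [pair for pair in ordered_pairs if pair[0][0] == 1]
both_aces = [pair for pair in first_ace if pair[1][0] == 1]
p_both_by_count = Fraction(len(both_aces), len(ordered_pairs))
p_second_given_first_by_count = Fraction(len(both_aces), len(first_ace))

# Independent events (card replaced before the second draw):
# P(A and B) = P(A) * P(B).
p_both_with_replacement = Fraction(4, 52) * Fraction(4, 52)

print(p_both_by_rule)            # 1/221
print(p_both_by_count)           # 1/221
print(p_both_with_replacement)   # 1/169
```

Replacing the first card makes the two draws independent, so the second factor stays at 4/52 instead of dropping to 3/51.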
Two events A and B are mutually exclusive if A and B cannot occur at the same time. The addition rule states that the probability that event A or event B will occur is given by: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$. If events A and B are mutually exclusive, then the rule can be simplified to: $P(A \text{ or } B) = P(A) + P(B)$.

As the number of times a probability experiment is repeated increases, the empirical probability (relative frequency) of an event approaches the theoretical probability of the event (the Law of Large Numbers).

The relative frequency of a class is the portion or percentage of the data that falls in that class. To find the relative frequency of a class, divide the frequency f by the sample size n.

$\text{Relative frequency} = \frac{\text{Class frequency}}{\text{Sample size}} = \frac{f}{n}$
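The addition rule and the Law of Large Numbers can both be illustrated with a single fair die. The Python sketch below is a minimal example under assumed names: it first checks the addition rule for the events "even" and "greater than 4", then simulates rolls to show the relative frequency of a six settling near the theoretical value of 1/6. The simulated values depend on the random seed, so no specific output is claimed.

```python
import random
from fractions import Fraction

die = range(1, 7)
even = {x for x in die if x % 2 == 0}        # {2, 4, 6}
greater_than_4 = {x for x in die if x > 4}   # {5, 6}

def prob(event):
    """Theoretical probability for one roll of a fair die."""
    return Fraction(len(event), len(die))

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_or_by_rule = prob(even) + prob(greater_than_4) - prob(even & greater_than_4)
p_or_by_count = prob(even | greater_than_4)  # count the union directly

def relative_frequency_of_six(n_rolls, rng):
    """Relative frequency f / n of rolling a six in n simulated rolls."""
    sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
    return sixes / n_rolls

rng = random.Random(1)  # fixed seed so the sketch is repeatable
for n in (100, 10_000, 100_000):
    # The relative frequency tends toward 1/6 (about 0.1667) as n grows.
    print(n, relative_frequency_of_six(n, rng))
```

Subtracting P(A and B) prevents the outcome 6, which is both even and greater than 4, from being counted twice.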

Probability

Virginia Standard PS.13
The student will develop, interpret, and apply the binomial probability distribution for discrete random variables, including computing the mean and standard deviation for the binomial variable.

Essential Knowledge and Skills
The student will use problem solving, mathematical communication, mathematical reasoning, connections and representations to:
Develop the binomial probability distribution within a real-world context.
Calculate the mean and standard deviation for the binomial variable.
Use the binomial distribution to calculate probabilities associated with experiments for which there are only two possible outcomes.

Key Vocabulary
binomial distribution
probability distribution

Essential Questions
How are the mean and standard deviation calculated for a binomial variable?
What is a probability distribution?
What is the relationship between variance and standard deviation?
What is meant by binomial distribution?
How are binomial probabilities determined?
How can the binomial distribution be applied to real-world applications?

Essential Understandings
A probability distribution is a complete listing of all possible outcomes of an experiment together with their probabilities.
The procedure has a fixed number of independent trials.
A random variable assumes different values depending on the event outcome.
A probability distribution combines descriptive statistical techniques and probabilities to form a theoretical model of behavior.

A binomial experiment is a probability experiment that satisfies the following conditions:
1. The experiment is repeated for a fixed number n of trials, where each trial is independent of the other trials.
2. There are only two possible outcomes of interest for each trial. The outcomes can be classified as a success or as a failure.
3. The probability of a success is the same for each trial.
4. The random variable x counts the number of successful trials out of the n trials.
The parameters of a binomial distribution are n and p. If data are produced in a binomial setting, then the random variable X = number of successes is called a binomial random variable, and the probability distribution of X is called a binomial distribution. The distribution of the count X of successes in the binomial setting is the binomial distribution with parameters n and p. The parameter n is the number of observations, and p is the probability of a success on any one observation. The possible values of X are the whole numbers from 0 to n. As an abbreviation, X is B(n, p).

There are several ways to find the probability of x successes in n trials. One way is to use the binomial probability formula:

$P(x) = {}_nC_x \, p^x q^{n-x} = \frac{n!}{(n-x)! \, x!} \, p^x q^{n-x}$

where:
x = the number of successes in n trials
p = the probability of success in a single trial
q = the probability of failure in a single trial, $q = 1 - p$

The mean and the standard deviation of a binomial distribution can be computed using the formulas $\mu = np$ and $\sigma = \sqrt{np(1-p)}$.
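The binomial probability formula and the formulas for the mean and standard deviation translate directly into code. The Python sketch below is one possible rendering; the function names and the example of 10 fair coin flips are assumptions made for illustration.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

def binomial_mean(n, p):
    """Mean of a binomial distribution: mu = n * p."""
    return n * p

def binomial_sd(n, p):
    """Standard deviation of a binomial distribution: sqrt(n * p * (1 - p))."""
    return sqrt(n * p * (1 - p))

# Hypothetical example: X = number of heads in 10 flips of a fair coin.
n, p = 10, 0.5
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

print(distribution[5])       # 0.24609375, i.e. 252/1024
print(binomial_mean(n, p))   # 5.0
print(binomial_sd(n, p))     # sqrt(2.5), about 1.58
```

Because the dictionary lists every value of X from 0 to n, its probabilities sum to 1, as required of a probability distribution.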

Probability

Virginia Standard PS.16
The student will identify properties of a normal distribution and apply the normal distribution to determine probabilities, using a table or graphing calculator.

Essential Knowledge and Skills
The student will use problem solving, mathematical communication, mathematical reasoning, connections and representations to:
Identify the properties of a normal probability distribution.
Describe how the standard deviation and the mean affect the graph of the normal distribution.
Determine the probability of a given event, using the normal distribution.

Key Vocabulary
continuous probability distribution
normal curve
normal distribution

Essential Questions
What are the properties of a normal probability distribution?
How do the standard deviation and mean affect the graph of the normal distribution?
How is the probability of an event calculated?

Essential Understandings
Normal distribution curves form a family of symmetrical curves defined by the mean and the standard deviation.
Areas under the curve represent probabilities associated with continuous distributions.
The normal curve is a probability distribution and the total area under the curve is 1.

A continuous random variable has an infinite number of possible values that can be represented by an interval on the number line. Its probability distribution is called a continuous probability distribution.

A normal distribution is a continuous probability distribution for a random variable x. The graph of a normal distribution is called the normal curve and has the following properties:
1. The mean, median, and mode are equal.
2. The normal curve is bell shaped and is symmetric about the mean.
3. The total area under the normal curve is equal to one.
4. The normal curve approaches but never touches the x-axis as it extends farther and farther away from the mean.
5. Between $\mu - \sigma$ and $\mu + \sigma$ (in the center of the curve) the graph curves downward.
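The graph curves upward to the left of $\mu - \sigma$ and to the right of $\mu + \sigma$. The points at which the curve changes from curving upward to curving downward are called inflection points.

[Figure: normal curve with the inflection points marked; the x-axis is labeled -3σ, -2σ, -1σ, µ, 1σ, 2σ, 3σ, and brackets show that about 68%, 95%, and 99.7% of the area lies within 1, 2, and 3 standard deviations of the mean.]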
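Probabilities under a normal curve are usually read from a table or graphing calculator; as an alternative illustration, the Python sketch below uses the standard library's `statistics.NormalDist`. The mean of 100 and standard deviation of 15 are hypothetical values, and `prob_between` is a helper name chosen for this example.

```python
from statistics import NormalDist

def prob_between(dist, low, high):
    """Area under the normal curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)

# Hypothetical test scores: mean 100, standard deviation 15.
mu, sigma = 100, 15
scores = NormalDist(mu=mu, sigma=sigma)

# Area within 1, 2, and 3 standard deviations of the mean.
within = {k: prob_between(scores, mu - k * sigma, mu + k * sigma)
          for k in (1, 2, 3)}
for k, area in within.items():
    print(k, round(area, 4))   # about 0.6827, 0.9545, 0.9973

# Probability that a score is above 130 (two standard deviations high).
p_above_130 = 1 - scores.cdf(130)
print(round(p_above_130, 4))   # about 0.0228
```

The three areas match the 68%, 95%, and 99.7% figures shown on the normal curve, and they do not depend on the particular mean and standard deviation chosen.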

Inferential Statistics

Virginia Standard PS.17
The student, given data from a large sample, will find and interpret point estimates and confidence intervals for parameters. The parameters will include proportion and mean, difference between two proportions, and difference between two means (independent and paired).

Essential Knowledge and Skills
The student will use problem solving, mathematical communication, mathematical reasoning, connections and representations to:
Construct confidence intervals to estimate a population parameter, such as a proportion or the difference between two proportions; or a mean or the difference between two means.
Select a value for alpha (Type I error) for a confidence interval.
Interpret confidence intervals in the context of the data.
Explain the importance of random sampling for confidence intervals.
Calculate point estimates for parameters, and discuss the limitations of point estimates.

Key Vocabulary
confidence interval
parameter
point estimate
Type I error

Essential Questions
Why are confidence intervals and tests of significance important?
How is sampling used and why is it important?

Essential Understandings
A primary goal of sampling is to estimate the value of a parameter based on a statistic.
Confidence intervals use the sample statistic to construct an interval of values that one can be reasonably certain contains the true (unknown) parameter.
Confidence intervals and tests of significance are complementary procedures.
Paired comparisons experimental design allows control for possible effects of extraneous variables.

A parameter is a numerical description of a population characteristic. A statistic is a numerical description of a sample characteristic.

A Type I error is the error of rejecting the null hypothesis when it is in fact true. In a hypothesis test, the level of significance is the maximum allowable probability of making a Type I error. To decrease the probability of a Type I error, decrease the significance level.
Changing the sample size has no effect on the probability of a Type I error. The probability of a Type I error is denoted by α, the lowercase Greek letter alpha.

A point estimate is a single value estimate for a population parameter. The most unbiased point estimate of the population mean µ is the sample mean $\bar{x}$.

Using a point estimate and a margin of error, an interval estimate of a population parameter such as µ can be constructed. This interval estimate is called a confidence interval. The margin of error is calculated as the z-score for the given confidence level times the standard error. For proportions, the standard error can be computed using the formula $\sigma = \sqrt{\frac{p(1-p)}{n}}$. A confidence interval for a proportion is p plus or minus the margin of error.

A c-confidence interval for the population mean µ is $\bar{x} - E < \mu < \bar{x} + E$, where E is the margin of error. The probability that the confidence interval contains µ is c.
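The confidence interval calculations described above can be sketched in Python. This is a minimal illustration for large samples, not a prescribed method: the function names and the survey numbers are hypothetical, the z-score is obtained from `statistics.NormalDist` rather than a table, and the mean interval assumes the sample standard deviation s is an adequate stand-in for the population standard deviation.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence):
    """z-score that leaves alpha / 2 in each tail, with alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def proportion_interval(successes, n, confidence=0.95):
    """Sample proportion plus or minus z times sqrt(p(1 - p) / n)."""
    p_hat = successes / n
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_critical(confidence) * standard_error
    return p_hat - margin, p_hat + margin

def mean_interval(x_bar, s, n, confidence=0.95):
    """Sample mean plus or minus E, with E = z * s / sqrt(n)."""
    margin = z_critical(confidence) * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical survey: 520 of 1000 respondents answer "yes".
low_p, high_p = proportion_interval(520, 1000)
print(round(low_p, 3), round(high_p, 3))   # about 0.489 and 0.551

# Hypothetical sample: mean 50, standard deviation 10, size 100.
low_m, high_m = mean_interval(50, 10, 100)
print(round(low_m, 2), round(high_m, 2))   # about 48.04 and 51.96
```

Raising the confidence level increases the z-score and widens both intervals, while a larger sample size shrinks the standard error and narrows them.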
