Chi Square Test (GOF and Independence)

Name ___________________________ Per. _____

Chi Square Modeling Using M&M’s Candies

Introduction:

Have you ever wondered why the package of M&Ms you just bought never seems to have enough of your favorite color? Or why you always seem to get the package of mostly brown M&Ms? What’s going on at the Mars Company? Is the number of each color of M&Ms really different from one package to the next, or does the Mars Company do something to ensure that each package gets the correct number of each color of M&M? You’ve probably stayed up nights pondering this!

One way that we could determine if the Mars Co. is true to its word is to sample a package of M&Ms and do a type of statistical test known as a “goodness of fit” test. This type of statistical test allows us to determine whether any differences between our observed measurements (counts of colors from our M&M sample) and our expected values (what the Mars Co. claims) are simply due to chance (sampling error) or to some other reason (e.g. the Mars Co.’s sorters aren’t really doing a very good job of putting the correct number of M&Ms in each package). The goodness of fit test we will be doing today is called a Chi Square Analysis. This test is generally used when we are dealing with discrete data (i.e. count data, or non-continuous data). We will be calculating a statistic called chi square, or χ². We will be using a table to determine the probability of getting a particular χ² value. Remember, our probability values tell us how likely it would be to see differences at least as large as ours if chance alone (sampling error) were at work.

The Chi Square test (χ²) is often used in science to test whether the data you observe in an experiment match the data you would predict for that experiment. This investigation will help you use the Chi Square test by letting you practice it on a population of familiar objects, M&M candies.

Objectives: After this investigation you should be able to:

• write a null hypothesis that pertains to the investigation;
• determine the degrees of freedom (df) for an investigation;
• calculate the χ² value for a given set of data;
• use the critical values table to determine if the calculated value is equal to or less than the critical value;
• determine if the χ² value exceeds the critical value and if the null hypothesis is accepted or rejected.

M&M DATA (Individual)

Here are the percentages given by M&M on their website for each color.

• Brown = 12%
• Red = 12%
• Yellow = 15%
• Green = 15%
• Orange = 23%
• Blue = 23%

1) Open 2 bags of M&Ms. (If you do not have 2 bags of M&Ms, email me and I will send you a set of data.)
2) Separate the M&Ms into color categories and count the number of each color.
3) Record your M&M color totals in the data table.

Table 1

Brown       Red         Yellow      Green       Orange      Blue
_________   _________   _________   _________   _________   _________

Total Number of M&M’s _______________

4) Calculate the expected number of M&Ms of each color in your package by multiplying the total number of M&Ms in the package by the color percent listed above. For example, if your package contains 500 M&Ms and you want to find the expected number of red M&Ms, multiply 500 by 12% (500 x 0.12 = 60). Record your calculations in the data table.
5) Calculate the difference between the observed and expected numbers for each M&M color. Record your calculations in the data table.
6) Square the difference between the observed and expected. Record your calculations in the data table.
7) Divide the square of the difference by the expected. Record your calculations in the data table.
8) Total all the answers from step 7 to determine the chi-square (χ²) value. Record the chi-square (χ²) value in the data table.
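Steps 4 through 8 can also be checked with a short script. The following is a minimal Python sketch, not part of the original lab: the function name `chi_square_table` and the sample counts are hypothetical, chosen only for illustration, and the color proportions are the ones listed in this handout.

```python
# Claimed color proportions from this worksheet (they sum to 1.00).
CLAIMED = {
    "Brown": 0.12, "Red": 0.12, "Yellow": 0.15,
    "Green": 0.15, "Orange": 0.23, "Blue": 0.23,
}


def chi_square_table(observed):
    """Return (rows, chi_square) for a dict of observed counts per color.

    Each row is (color, o, e, o - e, (o - e)**2, (o - e)**2 / e),
    matching the columns of the worksheet's data table.
    """
    total = sum(observed.values())            # total number of M&Ms counted
    rows = []
    for color, share in CLAIMED.items():
        o = observed[color]
        e = total * share                     # step 4: expected count
        diff = o - e                          # step 5: observed minus expected
        sq = diff ** 2                        # step 6: squared difference
        rows.append((color, o, e, diff, sq, sq / e))  # step 7: divide by expected
    chi_square = sum(row[5] for row in rows)  # step 8: total of the last column
    return rows, chi_square


# HYPOTHETICAL counts, made up for illustration only (not real data).
example_counts = {"Brown": 14, "Red": 10, "Yellow": 16,
                  "Green": 18, "Orange": 20, "Blue": 22}

rows, chi_square = chi_square_table(example_counts)
for color, o, e, diff, sq, contrib in rows:
    print(f"{color:<7} o={o:>3}  e={e:6.2f}  o-e={diff:6.2f}  "
          f"(o-e)^2={sq:7.2f}  (o-e)^2/e={contrib:6.3f}")
print(f"chi-square = {chi_square:.3f}")
```

With these made-up counts the total is 100 candies and the χ² value comes out to about 1.77. Replace `example_counts` with your own counts to check your hand calculations.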

Table 2

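Colors    Observed (o)    4) Expected (e)    5) o - e    6) (o - e)²    7) (o - e)² / e
Brown     ____________    _______________    ________    ___________    _______________
Red       ____________    _______________    ________    ___________    _______________
Yellow    ____________    _______________    ________    ___________    _______________
Green     ____________    _______________    ________    ___________    _______________
Orange    ____________    _______________    ________    ___________    _______________
Blue      ____________    _______________    ________    ___________    _______________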

8) χ² = __________

Analysis Questions:

1. What are the null and alternative hypotheses?

Ho:

Ha:

Now you must determine the probability that a difference between the observed and expected values as large as yours (as summarized by the calculated value of chi-square) would occur simply by chance. To do this you will need to compare the calculated value of chi-square with the appropriate value from the Chi Square Distribution Table on the next page. Examine the table. Note the term “degrees of freedom.” For this statistical test the degrees of freedom is equal to the number of classes (color categories) minus one. Complete the following to determine the degrees of freedom for the M&M analysis:

# of color categories ______________ – 1 = degrees of freedom _______________

The reason why it is important to consider degrees of freedom is that the value of the chi-square statistic is calculated as the sum, over all classes, of the squared differences divided by the expected values. The natural increase in the value of chi-square with an increase in the number of classes must be taken into account.

Scan across the row corresponding to 5 degrees of freedom. Values of chi-square are given for several different probabilities ranging from 0.95 on the left to 0.001 on the right. Note that the chi-square value increases as the probability decreases. Notice that a chi-square value of 1.15 or larger would be expected by chance in 95% (0.95) of the cases, whereas one of 11.07 or larger would be expected in only 5% (0.05) of the cases. Use the chi-square value calculated and recorded on the data table to determine the probability for the M&M analysis. If the exact chi-square value is not listed in the table, estimate the probability. Record your answer below.
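The table lookup can be cross-checked numerically. The sketch below is an illustration, not part of the original lab: `chi2_sf` is a hypothetical helper name, and it computes the upper-tail probability of the chi-square distribution for whole-number degrees of freedom from a standard recurrence, using only the Python standard library. The printed table remains the tool this lab expects you to use.

```python
import math


def chi2_sf(x, df):
    """Probability that a chi-square variable with `df` degrees of freedom
    is at least `x` (assumes x >= 0 and df is a whole number >= 1)."""
    half = x / 2.0
    if df % 2 == 1:
        k = 1
        p = math.erfc(math.sqrt(half))   # exact upper tail for df = 1
    else:
        k = 2
        p = math.exp(-half)              # exact upper tail for df = 2
    # Each pass raises the degrees of freedom by 2:
    # tail(k + 2) = tail(k) + (x/2)^(k/2) * e^(-x/2) / Gamma(k/2 + 1)
    while k < df:
        p += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return p


num_categories = 6             # six M&M colors
df = num_categories - 1        # degrees of freedom = number of classes - 1

# Standard table entries for 5 degrees of freedom at p = 0.95 and p = 0.05.
for value in (1.145, 11.07):
    print(f"chi-square = {value:6.3f}, df = {df}: p = {chi2_sf(value, df):.3f}")
```

The two values in the loop should come back with probabilities of roughly 0.95 and 0.05. Passing your own χ² value to `chi2_sf(your_value, 5)` gives the probability you would otherwise estimate from the table.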

2. Draw your Chi Square curve and mark the critical value (p = 0.05).

3. What is the χ² value for your data? __________________

4. Is your null hypothesis accepted or rejected? Explain why or why not.
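For question 4, the comparison against the critical value can be written out as a tiny Python sketch. The function name `decide` and the two test values are hypothetical. The critical value 11.07 is the standard table entry for 5 degrees of freedom at p = 0.05. Statisticians usually say "fail to reject" where this worksheet says "accepted".

```python
CRITICAL_VALUE = 11.07  # chi-square table entry for df = 5 at p = 0.05


def decide(chi_square, critical_value=CRITICAL_VALUE):
    """Compare a calculated chi-square value with the critical value."""
    if chi_square > critical_value:
        # Differences this large are unlikely to be due to chance alone.
        return "reject the null hypothesis"
    # Differences are small enough to be explained by sampling error.
    return "fail to reject the null hypothesis"


print(decide(3.2))    # hypothetical small chi-square value
print(decide(15.8))   # hypothetical large chi-square value
```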