hw8_due20220404.pdf

Quan 2600. Statistics I

Homework 8. Sampling. Due 04/04.

Problem 1. Sampling

A researcher sets out to determine the share of time the urban population of the United States spends listening to dance music. They identify the following population groups: construction workers, gym goers, bar and night club patrons, and library readers.

(a) What is the name for all those population subgroups?

(b) What should the researcher ensure within each group and across their sample to preserve randomness?

Now the researcher changes their strategy in favor of pestering drivers about their musical preferences at three toll booths in LA, NY, and Chicago.

(c) What type of population subgroup are drivers on interstates in LA, NY, and Chicago, relative to the urban population of the United States?

(d) What condition ensures that these three groups constitute a random sample of urban US music listeners?

Problem 2. Standard error of the mean

You collect four samples, 𝑛 = 6 observations each, from the population of 40 (which is large). The means are 42, 52, 54, and 44.

(a) Calculate the expected sample mean, 𝐸(𝑥̅), as the average of the sample means. What does this, in conjunction with the central limit theorem, tell you about the population mean?

(b) Find the squared deviation of each sample mean from the expected sample mean. Then, find the standard deviation of the four sample means (treat them as a population). Given that those four values are sample means, what is the name for what you just found?
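
The arithmetic in (a) and (b) can be sketched in Python. This is a minimal illustration, not a required method: it assumes the four means are treated as a complete population, so it uses `statistics.pstdev` (which divides by N rather than N − 1). The variable names are illustrative.

```python
from statistics import mean, pstdev

# The four sample means given in the problem statement.
sample_means = [42, 52, 54, 44]

# (a) Expected sample mean: the average of the sample means.
expected_mean = mean(sample_means)

# (b) Squared deviation of each sample mean from the expected sample mean.
squared_devs = [(m - expected_mean) ** 2 for m in sample_means]

# Standard deviation of the sample means, treated as a population
# (pstdev divides the sum of squared deviations by N = 4).
sd_of_means = pstdev(sample_means)

print(expected_mean, squared_devs, round(sd_of_means, 3))
```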

(c) Find the z score of the sample mean 𝑥̅ = 37.8 and establish the probability that the next random sample mean will be lower than that. Draw the density bell curve of the situation (while at it, note that its width is determined by the SEM, not 𝜎).
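
One way to check the z score and the lower-tail probability in (c) is with the standard normal CDF from Python's standard library. The sketch below assumes the sampling distribution of the mean is normal; the values 48 and √26 are derived from the four sample means in the problem statement (their average and their population standard deviation), and the variable names are illustrative.

```python
from statistics import NormalDist

expected_mean = 48.0   # average of the sample means 42, 52, 54, 44
sem = 26 ** 0.5        # SD of those four means, treated as a population
x_bar = 37.8           # sample mean of interest

# z score: distance from the expected sample mean, in units of the SEM.
z = (x_bar - expected_mean) / sem

# Probability that the next sample mean falls below x_bar.
p_lower = NormalDist().cdf(z)

print(round(z, 3), round(p_lower, 4))
```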

(d) Use the idea of the standard error of the mean to find the population standard deviation, 𝜎. Treat the population as infinitely large; don’t worry about the correction term.

(e) Find the probability that a random observation in the population has a value lower than 𝑥 = 37.8 (in this case, the z score is not one of the seven values that we studied, so use Excel to find its CDF).

(f) Compare the probabilities of observing a sample mean lower than 𝑥̅ = 37.8 and a value in the population lower than 𝑥 = 37.8. Explain why you think they are different.
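
The comparison in (d) through (f) can be sketched by evaluating two normal CDFs with the same mean but different spreads. This is an illustration under stated assumptions: the population is taken to be normal and infinitely large (so 𝜎 = SEM · √𝑛 with no correction term), and 48 and √26 are derived from the four sample means in the problem. Passing the mean and spread directly to `NormalDist` stands in for Excel's CDF function.

```python
from math import sqrt
from statistics import NormalDist

mu = 48.0              # expected sample mean, used as the population mean
n = 6                  # observations per sample
sem = sqrt(26)         # standard error of the mean from part (b)
sigma = sem * sqrt(n)  # (d) population SD implied by SEM = sigma / sqrt(n)
x = 37.8

# Probability that a sample mean is below 37.8 (spread = SEM).
p_sample_mean = NormalDist(mu, sem).cdf(x)

# (e) Probability that a single observation is below 37.8 (spread = sigma).
p_single_obs = NormalDist(mu, sigma).cdf(x)

print(round(sigma, 2), round(p_sample_mean, 4), round(p_single_obs, 4))
```

The single-observation distribution is wider than the sampling distribution of the mean, so more of its area lies below 37.8.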

Problem 3. Sample proportion

Pokemon is the largest-revenue toy franchise in history. However, given all the competitors Amazon currently cites as the top sellers (LOL Surprise!, Harry Potter Lego, My Lovely Unicorn), doubts are creeping in.

You take samples of children, 𝑛 = 20 at a time, and it turns out that on average 𝐸(𝑝̅) = 0.35 of them would like a Pokemon. What is the probability that at most 𝑝̅ = 0.5 of the next random sample of 20 children care about Pokemon?

(a) Using 𝑝 = 𝐸(𝑝̅) = 0.35 and 𝑛, find the standard error of the proportion, SEP (hint: use formula 7.5 in the book).
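
Formula 7.5 is not reproduced in this handout. The sketch below assumes it is the usual standard error of a sample proportion, √(𝑝(1 − 𝑝)/𝑛), with 𝑝 taken to be the long-run proportion 0.35. Variable names are illustrative.

```python
from math import sqrt

p = 0.35   # expected sample proportion, used as the population proportion
n = 20     # children per sample

# Standard error of the proportion (assumed form of formula 7.5).
sep = sqrt(p * (1 - p) / n)

print(round(sep, 4))
```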

(b) Find the z score of 𝑝̅ = 0.5.

(c) Use Excel to find the CDF of 𝑝̅ = 0.5 (in math notation, 𝐹(𝑝̅ | 𝐸(𝑝̅), 𝑆𝐸𝑃)). Then interpret your result to determine the probability that in the next random sample, at most 50% of children will care about Pokemon.
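
As a cross-check on the Excel result, the z score and CDF can be sketched with Python's standard library. This assumes the normal approximation to the sampling distribution of the proportion and the usual standard error √(𝑝(1 − 𝑝)/𝑛); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p = 0.35       # expected sample proportion
n = 20         # children per sample
p_bar = 0.5    # sample proportion of interest

sep = sqrt(p * (1 - p) / n)   # standard error of the proportion

# z score of p_bar, in units of the SEP.
z = (p_bar - p) / sep

# CDF at p_bar: probability that at most half of the next sample
# cares about Pokemon, under the normal approximation.
p_at_most = NormalDist(p, sep).cdf(p_bar)

print(round(z, 3), round(p_at_most, 4))
```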