Psy 5003 Test 1

1. One hundred individuals are selected at random and classified according to gender and hair color. The result of the classification is given below.

           Brown   Blond   Black   Green

Females     25      17      15       3
Males       18      10      10       2

Find the following probabilities:

a. P(female and blond)

b. P(male)

c. P(blond)

d. P(green|male)

e. Are green hair and gender independent?
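Every part of Problem 1 reduces to a ratio of cell counts, row totals, and column totals. The following is a minimal Python sketch, assuming the table is read as a 2 x 4 table of counts; the helper names (joint, marginal_gender, marginal_color, conditional, independent) are illustrative and are not part of the test.

```python
from fractions import Fraction

# Counts copied from the table in Problem 1 (rows: gender, columns: hair color).
counts = {
    "female": {"brown": 25, "blond": 17, "black": 15, "green": 3},
    "male": {"brown": 18, "blond": 10, "black": 10, "green": 2},
}
total = sum(sum(row.values()) for row in counts.values())


def joint(gender, color):
    """P(gender and color): one cell divided by the grand total."""
    return Fraction(counts[gender][color], total)


def marginal_gender(gender):
    """P(gender): a row total divided by the grand total."""
    return Fraction(sum(counts[gender].values()), total)


def marginal_color(color):
    """P(color): a column total divided by the grand total."""
    return Fraction(sum(row[color] for row in counts.values()), total)


def conditional(color, gender):
    """P(color | gender) = P(gender and color) / P(gender)."""
    return joint(gender, color) / marginal_gender(gender)


def independent(color, gender):
    """True when P(gender and color) equals P(color) * P(gender) exactly."""
    return joint(gender, color) == marginal_color(color) * marginal_gender(gender)


print(joint("female", "blond"), marginal_gender("male"), marginal_color("blond"))
print(conditional("green", "male"), independent("green", "male"))
```

Exact fractions are used so that the independence check in part e is not affected by floating-point rounding.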

 

2. A dermatologist is interested in the effect of two lotions (A and B) on persons suffering from poison ivy rash. She selects at random 20 patients with lesions on both hands, and instructs the nurse to apply Lotion A to one hand and Lotion B to the other hand. The nurse is supposed to keep a record of which hand got which lotion, but she is not to tell the dermatologist. Five days after treatment, the dermatologist evaluates each patient. Fifteen times out of 20, Lotion A was more effective than B. What is the probability of obtaining this result by chance if indeed Lotions A and B are equally effective?
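One way to approach Problem 2 is to treat the 20 patients as independent trials with probability .5 that Lotion A looks better when the lotions are in fact equally effective. The sketch below assumes that binomial model. Because "this result" can be read either as exactly 15 successes or as 15 or more, it computes both; the function names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for a binomial(n, p) variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_upper_tail(k, n, p):
    """P(X >= k) for a binomial(n, p) variable."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


n, k, p = 20, 15, 0.5
p_exact = binom_pmf(k, n, p)            # exactly 15 of 20
p_at_least = binom_upper_tail(k, n, p)  # 15 or more of 20

print(round(p_exact, 4), round(p_at_least, 4))
```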

 

3. Define the following terms:

a. Sampling distribution of a statistic

b. Unbiased estimator

c. Population

d. Random sample

e. Statistic

f. Parameter

 

4. For the following data construct a stem-and-leaf plot and compute the median, mean, mode, variance, and standard deviation:

32 33 10 10 3 2 16 17 18 21 24 25 27
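The summary statistics in Problem 4 can be checked with Python's standard library. This is a sketch, assuming the variance and standard deviation are the sample (n - 1) versions, which the problem does not state; stem_and_leaf is a hypothetical helper that uses the tens digit as the stem.

```python
from statistics import mean, median, mode, stdev, variance

data = [32, 33, 10, 10, 3, 2, 16, 17, 18, 21, 24, 25, 27]


def stem_and_leaf(values):
    """Return one text line per stem (tens digit) with its sorted leaves (units digit)."""
    stems = {}
    for v in sorted(values):
        stems.setdefault(v // 10, []).append(v % 10)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(stems.items())
    ]


for line in stem_and_leaf(data):
    print(line)

print("median:", median(data))
print("mean:", mean(data))
print("mode:", mode(data))
print("variance (n - 1):", variance(data))
print("standard deviation (n - 1):", stdev(data))
```

If the population (n) versions are wanted instead, statistics.pvariance and statistics.pstdev can be substituted.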

 

5. Assume that x is normally distributed with mean zero and variance one and find:

a. P(x > 1.96)=

b. P(-5< x < a)= .50, a=

c. P(-1 < x < 1)=

d. P(-b < x < b)=.80, b=

e. P(x < 1.5)=

f. P(x > -1)=

 

6. Scores on the Oklahoma Inventory of Social Skills (OKISS) are normally distributed with mean 100 and standard deviation of 10.

a. If you select an individual at random, what is the probability his score will fall between 97 and 103?

b. If you select 10 individuals at random, what is the probability that the mean score will fall between 97 and 103?

c. Why is the probability for the mean higher than the probability for the single score? Explain.
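Parts a and b of Problem 6 differ only in the standard deviation used: 10 for a single score, and 10 divided by the square root of 10 for the mean of 10 scores. The sketch below assumes exactly that; prob_between is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 100, 10


def prob_between(lo, hi, mean, sd):
    """P(lo < X < hi) for a normal variable with the given mean and sd."""
    dist = NormalDist(mean, sd)
    return dist.cdf(hi) - dist.cdf(lo)


p_single = prob_between(97, 103, MU, SIGMA)           # one individual
p_mean = prob_between(97, 103, MU, SIGMA / sqrt(10))  # mean of 10 individuals

print(round(p_single, 4), round(p_mean, 4))
```

The second probability is larger because the sampling distribution of the mean is narrower than the distribution of single scores, so more of it falls inside the same interval.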

 

7. If a population has a mean of 50 and a variance of 100, for samples of size 16, the sampling distribution of the mean

a. has a mean of _______;

b. a variance of ________;

c. a standard deviation of ________; and

d. a shape like the ______________.

Psy 5003 Test 2

1. Answer each of the questions below assuming that the variable Z is normally distributed with a mean of zero and variance of one: (10pts.)

a. P(1.11 < Z <2.22)=

b. P(-.77 < Z < 1.00)=

c. P(Z < a)=.90, find a=

d. P(Z<-2 OR Z>1)=

e. P(Z>1 AND Z>1.5)=

 

2. Serum cholesterol levels in men aged 18 to 24 are normally distributed with a mean of 178.1 and a standard deviation of 40.7. All units are in mg/100 mL. Find the probability that: (30pts)

a. a randomly selected man aged 18-24 has a serum cholesterol level between 100 and 200?

b. the mean cholesterol level of 20 randomly selected men aged 18-24 is 179 or higher?

c. ten or more men (out of the 20 men) have serum cholesterol levels between 140 and 220?
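The three parts of Problem 2 use three different distributions: the population of individual scores, the sampling distribution of the mean, and a binomial count. The sketch below is one possible setup, assuming part c is handled by first finding the probability that one man falls between 140 and 220 and then treating the 20 men as independent binomial trials.

```python
from math import comb, sqrt
from statistics import NormalDist

MU, SIGMA, N = 178.1, 40.7, 20

population = NormalDist(MU, SIGMA)

# a. One randomly selected man falls between 100 and 200.
p_a = population.cdf(200) - population.cdf(100)

# b. The mean of 20 men is 179 or higher (standard error = sigma / sqrt(n)).
sampling = NormalDist(MU, SIGMA / sqrt(N))
p_b = 1 - sampling.cdf(179)

# c. Ten or more of the 20 men fall between 140 and 220.
p_in_range = population.cdf(220) - population.cdf(140)
p_c = sum(
    comb(N, k) * p_in_range**k * (1 - p_in_range) ** (N - k)
    for k in range(10, N + 1)
)

print(round(p_a, 4), round(p_b, 4), round(p_in_range, 4), round(p_c, 4))
```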

 

3. For the data below compute the mean, the median, and the variance (unbiased). After computing these statistics, construct a 95% confidence interval for the mean and a 95% confidence interval for the standard deviation of the population. Assume that the sample was drawn from a normal distribution. (25pts)

(44.9, 43.5, 47.1, 54.7, 56.1, 48.5, 45.9, 39.3, 55.9, 51.8, 56.7, 40.2, 49.7, 53.1, 44.4)

 

4. The tobacco industry closely monitors all surveys that involve smoking. One survey showed that among 785 randomly selected subjects who completed four years of college, 18.3% smoked. You work for the tobacco industry and have been asked to construct the 98% confidence interval for the proportion of smokers in this population. What is the interval? What is the margin of error for this estimate (plus or minus ____ %)? (10pts)
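For Problem 4, a common large-sample approach is the normal-approximation (Wald) interval, p plus or minus z times sqrt(p(1 - p)/n). The sketch below assumes that method; wald_interval is an illustrative name, and other interval methods would give slightly different limits.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(p_hat, n, conf):
    """Normal-approximation interval for a proportion: (lower, upper, margin)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # two-sided critical value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin


lower, upper, margin = wald_interval(0.183, 785, 0.98)
print(round(lower, 4), round(upper, 4), round(100 * margin, 2), "%")
```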

 

5. Fill in the blanks: (20pts)

a. As sample size increases the estimator gets closer and closer (in probability) to the parameter it is estimating. This estimator is called a __________________ estimator. When the expected value of an estimator is equal to the parameter it is estimating, the estimator is called an __________ estimator.

b. The standard error of the mean (the standard deviation of the sampling distribution of the mean) gets _________ as sample size increases. All other things being equal, the length of a confidence interval (on µ) based on n=25 would be _______________ than one based on n=250.

c. If I take an infinite (very large) number of random samples of size 50 from a population with mean 125 and variance of 100, the sampling distribution of the mean would have a mean of ________, a variance of ____________, and its shape would be _____________.

d. If X is normally distributed with a mean of 10 and variance of 9, then Y=3+5X is ____________ distributed with a mean of _________ and a variance of ___________.

 

6. You toss a coin 8 times and you get 4 heads. How would you construct a 93% confidence interval for the "true proportion of heads"? (Because np < 5, we do not recommend the normal approximation.) (5pts)
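One way to avoid the normal approximation in Problem 6 is an exact binomial (Clopper-Pearson) interval, which picks the limits so that each tail of the binomial distribution has probability (1 - .93)/2. The sketch below assumes that method, which the problem itself does not name; it solves for the limits by bisection using only the standard library, and all function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for a binomial(n, p) variable."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_root(f, lo, hi, tol=1e-10):
    """Find p in [lo, hi] with f(p) = 0, assuming f changes sign on the interval."""
    sign_lo = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, conf):
    """Exact binomial confidence limits for x successes in n trials."""
    tail = (1 - conf) / 2
    # Lower limit: the p at which P(X >= x) equals the tail probability.
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - tail, 0.0, 1.0
    )
    # Upper limit: the p at which P(X <= x) equals the tail probability.
    upper = 1.0 if x == n else bisect_root(
        lambda p: binom_cdf(x, n, p) - tail, 0.0, 1.0
    )
    return lower, upper


lower, upper = clopper_pearson(4, 8, 0.93)
print(round(lower, 4), round(upper, 4))
```

With 4 heads in 8 tosses the interval is symmetric about .5, which is a quick sanity check on the output.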

 

Psy 5003 Test 3

I. Define the following terms: (10 pts.)

Type I Error-

Type II Error-

Null Hypothesis-

 

II. Most engineering schools tend to be found in large state and private universities. You wish to see if starting salaries vary according to type of school attended. You collect a random sample of 20 newly graduated engineers, 10 from each type of school. The results are as follows: (25 pts.)

        State Universities   Private Universities
Mean    $42,299              $44,000
S       595                  605

a. State the null and alternative hypotheses.

b. Test the null hypothesis using an alpha level of .05.

c. What assumptions did you make to perform the test?
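A pooled-variance (independent samples) t test is one standard way to carry out part b. The sketch below assumes equal population variances and normally distributed salaries, which bears on part c. The critical value 2.101 is the two-tailed .05 entry for 18 degrees of freedom from a t table and is hard-coded; pooled_t is an illustrative name.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with a pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


t_stat, df = pooled_t(42299, 595, 10, 44000, 605, 10)

T_CRITICAL = 2.101  # two-tailed, alpha = .05, df = 18 (value taken from a t table)
reject_null = abs(t_stat) > T_CRITICAL

print(round(t_stat, 3), df, reject_null)
```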

 

III. In a study of leadership, a random sample of 100 military officers is given an empowerment scale. With business leaders, the scale is known to be normally distributed with a mean of 84 and a standard deviation of 6.5. The random sample of military officers had a mean of 83.2. (25 pts.)

a. Can we claim that military leaders are less empowering than business leaders? Set α = .05. (The smaller the score the less empowering the leader is.)

b. What is the power of this test against the alternative hypothesis that µ=82?

c. What is beta?
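The three parts of this problem can be sketched as a one-tailed z test followed by a power calculation. The code assumes the population standard deviation of 6.5 also applies to military officers and that the test is lower-tailed at the .05 level; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

MU0, SIGMA, N, ALPHA = 84, 6.5, 100, 0.05
SAMPLE_MEAN, MU_ALT = 83.2, 82

se = SIGMA / sqrt(N)  # standard error of the mean
standard_normal = NormalDist()

# a. Lower-tailed z test of H0: mu = 84 against H1: mu < 84.
z = (SAMPLE_MEAN - MU0) / se
z_crit = standard_normal.inv_cdf(ALPHA)
reject_null = z < z_crit

# b. Power: probability that the sample mean falls below the rejection
#    cutoff when the true mean is 82.
cutoff = MU0 + z_crit * se
power = NormalDist(MU_ALT, se).cdf(cutoff)

# c. Beta is the probability of a Type II error, the complement of power.
beta = 1 - power

print(round(z, 3), reject_null, round(power, 4), round(beta, 4))
```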

 

IV. An experimenter believes that women lose weight faster than men when dieting. To investigate this hypothesis, she collects data on a random sample of 15 husband-wife pairs who were dieting. Their weight losses after two weeks of dieting for the women and the men were as follows: (20 pts)

W 2.7 4.4 3.5 3.7 5.6 5.1 3.8 3.5 5.6 4.2 6.3 4.4 3.9 5.1 3.4
M 5.0 3.3 4.3 6.1 2.5 1.9 3.2 4.1 4.5 2.7 7.0 1.5 3.7 5.2 1.9

Did the wives lose significantly (α = .01) more weight than the husbands? (Make sure to state the Null and Alternative hypotheses.)
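Because each wife is matched with her husband, a paired (dependent samples) t test on the wife-minus-husband differences is one natural analysis. The sketch below assumes that design and a one-tailed test. The critical value 2.624 is the one-tailed .01 entry for 14 degrees of freedom from a t table and is hard-coded.

```python
from math import sqrt
from statistics import mean, stdev

wives = [2.7, 4.4, 3.5, 3.7, 5.6, 5.1, 3.8, 3.5, 5.6, 4.2, 6.3, 4.4, 3.9, 5.1, 3.4]
husbands = [5.0, 3.3, 4.3, 6.1, 2.5, 1.9, 3.2, 4.1, 4.5, 2.7, 7.0, 1.5, 3.7, 5.2, 1.9]

# Work with the within-pair differences (wife minus husband).
diffs = [w - m for w, m in zip(wives, husbands)]
n = len(diffs)
df = n - 1

mean_diff = mean(diffs)
se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
t_stat = mean_diff / se

T_CRITICAL = 2.624  # one-tailed, alpha = .01, df = 14 (value taken from a t table)
reject_null = t_stat > T_CRITICAL

print(round(mean_diff, 3), round(t_stat, 3), df, reject_null)
```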

 

V. An experimenter conducted three t-tests. Each test was computed on a different (independent) data set. He set alpha at the .05 level for each test. Assuming that all three null hypotheses were true, what is the probability that he made at least one Type I Error? (10 pts.)
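With independent tests, the probability of at least one Type I error is one minus the probability of making none. A short sketch, assuming independence as the problem states:

```python
ALPHA = 0.05
NUM_TESTS = 3

# P(no Type I error on one test) = 1 - alpha; the tests are independent,
# so the probabilities multiply across tests.
p_no_error = (1 - ALPHA) ** NUM_TESTS
p_at_least_one = 1 - p_no_error

print(round(p_at_least_one, 6))
```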

 

VI. Power is an important component of any well-designed experiment. If we do not have adequate power to detect the desired effect size, we can't make "good" decisions. Expand on this concept and tell me why power is important in testing hypotheses. (10 pts.)

 

Psy 5003 Final

For all problems involving calculations, please clearly indicate your final answer.

1. Define:

Correlation-

r² -

Least squares regression-

 

2. A researcher is interested in the relationship between large muscle group (x variable) and small muscle group (y variable) efficiency in a sample of 25 male college athletes. The data are as follows:

X (Large)   Y (Small)
73          64
76          59
72          60
77          53
73          58
75          57
78          53
71          58
76          55
78          56
72          57
74          64
77          60
73          61
76          57
71          56
74          60
75          60
72          67
77          55
78          56
73          57
75          60
74          62
78          55

Large: mean = 74.72 SD = 2.30

Small: mean = 58.40 SD = 3.46

COV(x,y) = -4.09

a) Calculate Pearson's r correlation coefficient.

b) What is the interpretation of the correlation coefficient in this problem?

c) Find the regression line for regressing y on x. Give the slope and intercept.

d) What is the predicted value for y when x = 68, x = 79?

e) Test the hypothesis that = 0 (=.05).

 

3. In 1995, Frisancho et al. reported a study of aerobic capacity of five groups of adults living in La Paz, Bolivia. The five groups are described below:

HARN- High altitude rural natives

HAUN- High altitude urban natives

AHAB- Bolivians of foreign ancestry who were acclimatized to high altitude since birth

AHAG- Bolivians of foreign ancestry who were acclimatized to high altitude during growth

AHAA- Non-Bolivians acclimatized to high altitude during adulthood.

Each participant performed an intensive exercise routine following which a number of energy-related measures were taken. The results for maximal aerobic capacity (VO2STDP) for the male participants are given below: (Aerobic capacity is the amount of oxygen intake per unit of body weight; higher numbers are preferable.)

Group   Mean    SD     n
HARN    48.24   5.93   40
HAUN    40.00   4.24   40
AHAB    40.43   4.77   40
AHAG    42.95   6.05   40
AHAA    36.69   7.70   40

 

Compute an ANOVA and enter your results in an ANOVA table. Make sure to indicate whether F is significant (α = .05). (Hint: Recall that the MSE is the average variance and the MSB is the (unbiased) variance of the means multiplied by n.) Then answer the questions below using HSD (Tukey's) procedure (α = .05).

a) Are some of the groups significantly different from High-altitude rural natives?

b) Are some of the groups significantly different from AHAA group?
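The hint in this problem translates directly into code when the group sizes are equal. The sketch below follows that hint for MSE and MSB, then applies Tukey's HSD = q * sqrt(MSE / n). The studentized range value Q_CRITICAL = 3.89 is an assumption: an approximate reading for 5 groups and 195 error degrees of freedom, lying between the usual table entries for 120 and infinite degrees of freedom, and it should be replaced with the value from whichever table is used. The helper differs_from is illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Group: (mean, SD), copied from the table; n = 40 per group.
groups = {
    "HARN": (48.24, 5.93),
    "HAUN": (40.00, 4.24),
    "AHAB": (40.43, 4.77),
    "AHAG": (42.95, 6.05),
    "AHAA": (36.69, 7.70),
}
n = 40
k = len(groups)

# MSE is the average of the group variances (equal n).
mse = mean([sd**2 for _, sd in groups.values()])
# MSB is n times the unbiased variance of the group means.
msb = n * variance([m for m, _ in groups.values()])

df_between = k - 1
df_error = k * (n - 1)
f_ratio = msb / mse

Q_CRITICAL = 3.89  # ASSUMED approximate studentized range value (k = 5, df = 195)
hsd = Q_CRITICAL * sqrt(mse / n)


def differs_from(reference):
    """Groups whose mean differs from the reference group's mean by more than HSD."""
    ref_mean = groups[reference][0]
    return [
        name
        for name, (m, _) in groups.items()
        if name != reference and abs(m - ref_mean) > hsd
    ]


print(df_between, df_error, round(msb, 2), round(mse, 2), round(f_ratio, 2))
print(round(hsd, 2), differs_from("HARN"), differs_from("AHAA"))
```

The F ratio still has to be compared against a tabled critical F with 4 and 195 degrees of freedom to complete the ANOVA table.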

 

4. When we move from the one-way ANOVA design to the two-way ANOVA, in essence what we are doing is partitioning MSB into three mean squares: MSC (columns), MSR (rows) and MSI (interaction). Granted, in a two-way ANOVA we have two independent variables. Nevertheless, in a two-way ANOVA we could "turn" the columns into rows and run a (large) one-way ANOVA. (In fact, this is what SAS does under the so-called Model test.) What are the advantages and disadvantages of running a two-way instead of a "large" one-way ANOVA?

 

5. Complete the two-way ANOVA table below and indicate whether each of the F tests is significant (α = .05). In addition, a) indicate the null and alternative hypotheses for each test, b) state the assumptions necessary to perform the ANOVA, and c) specify the model.

Sources       df     SS     MS     F
Columns       2      19     ____   ____
Rows          1      ____   9.50   ____
Interaction   ____   ____   ____   3.89
Error         12     20     ____   ____

Hypotheses:

Assumptions:

Model:
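The blank cells of the two-way table in Problem 5 follow from three identities: MS = SS / df, F = MS / MS(error), and interaction df = (column df) x (row df). The sketch below fills the table on that basis; the variable names are illustrative, and the comparison with critical F values is left to a table lookup.

```python
# Known entries from the table.
df_cols, ss_cols = 2, 19.0
df_rows, ms_rows = 1, 9.50
f_inter = 3.89
df_error, ss_error = 12, 20.0

# Error mean square is the denominator of every F ratio.
ms_error = ss_error / df_error

# Columns: MS and F follow from the given SS and df.
ms_cols = ss_cols / df_cols
f_cols = ms_cols / ms_error

# Rows: SS follows from the given MS and df.
ss_rows = ms_rows * df_rows
f_rows = ms_rows / ms_error

# Interaction: df is the product of the two main-effect dfs;
# MS and SS are recovered by working backward from the given F.
df_inter = df_cols * df_rows
ms_inter = f_inter * ms_error
ss_inter = ms_inter * df_inter

print("Columns    ", df_cols, ss_cols, round(ms_cols, 2), round(f_cols, 2))
print("Rows       ", df_rows, ss_rows, ms_rows, round(f_rows, 2))
print("Interaction", df_inter, round(ss_inter, 2), round(ms_inter, 2), f_inter)
print("Error      ", df_error, ss_error, round(ms_error, 2))
```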