# Stat 202: Lecture 3 (covers pp. 34-47)

Nathan VanHoudnos
9/29/2014

### Agenda

2. Checkpoint #2 results
3. Lecture 3 (covers pp. 34-47)

• 21 of you have already turned it in
• Rock on!
• Homework #2 will be released on Wednesday

### Checkpoint #2 results

• PSA: Max time on a checkpoint is 2 hours

• Average percent correct: 82%

• A little over ½ of you missed the same two questions

### Question #10, Checkpoint #2

Here again are the boxplots showing annual incomes (in thousands of dollars) for households in two cities.

Which city has a greater percentage of households with annual incomes between $50,000 and $80,000?

### Question #14, Checkpoint #2

Here again are the boxplots showing the real estate values of single family homes in 2 neighboring cities (in thousands of dollars).

Which city has a greater percentage of homes with real estate values between $55,000 and $85,000?

### Roles and Types of variables

Roles

• Explanatory variables explain, predict, or affect the response (independent variable)

• The response variables are the outcome (dependent variable)

Types

• Categorical represent categories or labels
• Quantitative represent numerical measurements

### Role-Type examples

Are the smoking habits of a person (yes, no) related to the person's gender?

• Gender: categorical explanatory
• Smoking habits: categorical response

Is there a relationship between gender and test scores on a particular standardized test?

• Gender: categorical explanatory
• Test score: quantitative response

### Role-Type examples

How well can we predict a student's freshman year GPA from his/her SAT score?

• SAT score: quantitative explanatory
• GPA: quantitative response

Can you predict a person's favorite type of music based on his/her IQ?

• IQ: quantitative explanatory
• music: categorical response

### Role-Type Classification

| Explanatory ↓ / Response → | Categorical | Quantitative |
|---|---|---|
| Categorical | $$C \rightarrow C$$ | $$C \rightarrow Q$$ |
| Quantitative | $$Q \rightarrow C$$ | $$Q \rightarrow Q$$ |

• $$Q \rightarrow C$$
  • not covered by introductory statistics
  • requires post high school mathematics
  • much of machine learning and “data science”
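
As a rough illustration of the classification, the sketch below labels each of the four example questions from the earlier slides with its role-type case. The names `VarType` and `role_type_case` are made up for this example and are not part of the course materials.

```python
from enum import Enum


class VarType(Enum):
    """Type of a variable: categorical (C) or quantitative (Q)."""
    CATEGORICAL = "C"
    QUANTITATIVE = "Q"


def role_type_case(explanatory: VarType, response: VarType) -> str:
    """Return the role-type case as 'explanatory -> response', e.g. 'C -> Q'."""
    return f"{explanatory.value} -> {response.value}"


C, Q = VarType.CATEGORICAL, VarType.QUANTITATIVE

# (explanatory type, response type) for the four example questions
examples = {
    "gender -> smoking habits": (C, C),
    "gender -> test score": (C, Q),
    "SAT score -> GPA": (Q, Q),
    "IQ -> favorite music": (Q, C),
}

for name, (explanatory, response) in examples.items():
    print(f"{name}: {role_type_case(explanatory, response)}")
```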

### $$C \rightarrow Q$$

Age by General Health

Compare the distribution of the response $$Q$$ for each category of the explanatory $$C$$

Data display

• side-by-side boxplots

Numeric summaries

• descriptive statistics
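
One way to get the numeric summaries is to compute a five-number summary of the response separately for each category, which is what side-by-side boxplots draw. This is a minimal sketch: the ages are invented for illustration and are not the survey data from the slide, and `five_number_summary` is a hypothetical helper. The quartile method (`"inclusive"`) is one of several conventions, so other software may give slightly different quartiles.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


# Invented ages (response Q), grouped by general health (explanatory C)
ages_by_health = {
    "excellent": [22, 28, 35, 40, 44, 51, 63],
    "poor": [41, 52, 58, 60, 66, 73, 80],
}

summaries = {
    group: five_number_summary(ages) for group, ages in ages_by_health.items()
}

for group, summary in summaries.items():
    print(group, summary)
```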

### $$C \rightarrow Q$$

Age by General Health

Example

• Center: The median respondent in poor health is 60 years old, while the median respondent in excellent health is 40 years old.
• Relative five-number summary: In fact, approximately ¾ of respondents in poor health are as old as or older than the oldest ¼ of respondents in excellent health.
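
The comparison of the two groups' quartiles can be checked numerically. The sketch below uses invented ages (not the survey data behind the slide) and tests whether the first quartile of one group sits at or above the third quartile of the other, then counts the share of the first group at or above that cutoff.

```python
import statistics

# Invented ages for illustration only
poor = [41, 52, 58, 60, 66, 73, 80]
excellent = [22, 28, 35, 40, 44, 51, 63]

q1_poor, _, _ = statistics.quantiles(poor, n=4, method="inclusive")
_, _, q3_excellent = statistics.quantiles(excellent, n=4, method="inclusive")

# If Q1 of "poor" is at or above Q3 of "excellent", then about 3/4 of the
# "poor" group is at least as old as the cutoff for the oldest 1/4 of
# the "excellent" group.
quartiles_separated = q1_poor >= q3_excellent

# Direct count: share of "poor" respondents at or above that cutoff
share = sum(age >= q3_excellent for age in poor) / len(poor)

print(quartiles_separated, round(share, 2))
```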

### $$C \rightarrow C$$

Are men and women just as likely to think their weight is about right?

Two-way table:
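
A two-way table counts how many individuals fall into each combination of the two categorical variables. As a sketch of how such a table could be tallied from raw responses, the code below uses a handful of invented (gender, body image) pairs. The category labels other than "about right" are assumptions, and the counts are not the course data.

```python
from collections import Counter

# Invented survey responses as (gender, body image) pairs
responses = [
    ("female", "about right"),
    ("female", "overweight"),
    ("female", "about right"),
    ("female", "underweight"),
    ("male", "about right"),
    ("male", "overweight"),
    ("male", "about right"),
    ("male", "about right"),
]

counts = Counter(responses)
genders = sorted({gender for gender, _ in responses})
images = sorted({image for _, image in responses})

# Rows: explanatory variable (gender); columns: response (body image)
two_way = {g: {b: counts[(g, b)] for b in images} for g in genders}

# Print the table with a row total in the last column
print(f"{'':<8}" + "".join(f"{b:>13}" for b in images) + f"{'total':>8}")
for g in genders:
    row = two_way[g]
    cells = "".join(f"{row[b]:>13}" for b in images)
    print(f"{g:<8}" + cells + f"{sum(row.values()):>8}")
```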

### $$C \rightarrow C$$

Marginal distribution of Body Image
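
The marginal distribution of one variable ignores the other variable: add the counts over every row of the two-way table, then divide by the grand total. A minimal sketch, assuming an invented two-way table (the counts and the category labels other than "about right" are hypothetical):

```python
# Invented two-way table: rows are gender, columns are body image
two_way = {
    "female": {"about right": 56, "overweight": 32, "underweight": 12},
    "male": {"about right": 60, "overweight": 20, "underweight": 20},
}

# Column totals: add the counts for each body image category over all genders
column_totals = {}
for row in two_way.values():
    for image, count in row.items():
        column_totals[image] = column_totals.get(image, 0) + count

grand_total = sum(column_totals.values())

# Marginal distribution of body image: each column total over the grand total
marginal = {image: total / grand_total for image, total in column_totals.items()}

for image, proportion in marginal.items():
    print(f"{image}: {proportion:.0%}")
```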