Stat 202: Lecture 3 (covers pp. 34-47)

Nathan VanHoudnos
9/29/2014

Agenda

  1. Homework comments
  2. Checkpoint #2 results
  3. Lecture 3 (covers pp. 34-47)

Homework comments

  • 21 of you have already turned it in
    • Rock on!
  • Homework #2 will be released on Wednesday

Checkpoint #2 results

  • PSA: Max time on a checkpoint is 2 hours

  • Average percent correct: 82%

  • A little over ½ of you missed the same two questions

Question #10, Checkpoint #2

Here again are the boxplots showing annual incomes (in thousands of dollars) for households in two cities.
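[Figure: side-by-side boxplots of annual household income (in thousands of dollars) for the two cities]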
Which city has a greater percentage of households with annual incomes between $50,000 and $80,000?

Question #14, Checkpoint #2

Here again are the boxplots showing the real estate values of single family homes in 2 neighboring cities (in thousands of dollars).
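[Figure: side-by-side boxplots of single-family home values (in thousands of dollars) for the two cities]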
Which city has a greater percentage of homes with real estate values between $55,000 and $85,000?
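
Both questions come down to reading percentages off a boxplot: each gap between adjacent landmarks (min, Q1, median, Q3, max) holds roughly 25% of the data. A minimal sketch of that reasoning in Python follows. The five-number summaries are hypothetical values made up for illustration and are not the values from the checkpoint figures; `percent_between` is an illustrative helper, not a library function.

```python
# Hypothetical five-number summaries (in thousands of dollars).
# These are NOT the values from the checkpoint figures.
city_a = {"min": 20, "q1": 50, "median": 65, "q3": 80, "max": 120}
city_b = {"min": 30, "q1": 50, "median": 80, "q3": 95, "max": 110}


def percent_between(summary, low, high):
    """Approximate percent of the data between two boxplot landmarks.

    A boxplot only pins down percentages at its five landmarks, so both
    endpoints must coincide with one of them. Each gap between adjacent
    landmarks holds about 25% of the data.
    """
    landmarks = [summary[k] for k in ("min", "q1", "median", "q3", "max")]
    if low not in landmarks or high not in landmarks:
        raise ValueError("endpoints must match boxplot landmarks")
    return 25 * (landmarks.index(high) - landmarks.index(low))


pct_a = percent_between(city_a, 50, 80)  # Q1 to Q3 in city A
pct_b = percent_between(city_b, 50, 80)  # Q1 to median in city B
print(pct_a, pct_b)
```

In this made-up case the same dollar range covers the whole box in one city but only half of the box in the other, so the percentages differ even though the endpoints are identical.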

Roles and Types of variables

Roles

  • Explanatory variables explain, predict, or affect the response (also called independent variables)

  • Response variables are the outcomes (also called dependent variables)

Types

  • Categorical variables represent categories or labels
  • Quantitative variables represent numerical measurements

Role-Type examples

Are the smoking habits of a person (yes, no) related to the person's gender?

  • Gender: categorical explanatory
  • Smoking habits: categorical response

Is there a relationship between gender and scores on a particular standardized test?

  • Gender: categorical explanatory
  • Test score: quantitative response

Role-Type examples

How well can we predict a student's freshman year GPA from his/her SAT score?

  • SAT score: quantitative explanatory
  • GPA: quantitative response

Can you predict a person's favorite type of music based on his/her IQ?

  • IQ: quantitative explanatory
  • Favorite type of music: categorical response

Role-Type Classification

                              Response
                              Categorical              Quantitative
Explanatory   Categorical     \( C \rightarrow C \)    \( C \rightarrow Q \)
              Quantitative    \( Q \rightarrow C \)    \( Q \rightarrow Q \)

  • \( Q \rightarrow C \)
    • not covered by introductory statistics
    • extremely important to business applications (predict decision to buy)
    • requires mathematics beyond the high school level
    • underlies much of machine learning and “data science”
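
The classification above can be sketched as a small lookup: given the type of the explanatory variable and the type of the response, return the cell of the table. The function name `classify` and the dictionary of examples are illustrative assumptions; the four variable pairs are the ones from the Role-Type examples.

```python
def classify(explanatory_type, response_type):
    """Return the role-type cell, e.g. 'C -> Q', for a pair of variable types."""
    codes = {"categorical": "C", "quantitative": "Q"}
    return f"{codes[explanatory_type]} -> {codes[response_type]}"


# The four examples from the lecture, keyed as "explanatory -> response".
examples = {
    "gender -> smoking habits": classify("categorical", "categorical"),
    "gender -> test score": classify("categorical", "quantitative"),
    "SAT score -> GPA": classify("quantitative", "quantitative"),
    "IQ -> favorite music": classify("quantitative", "categorical"),
}

for question, cell in examples.items():
    print(f"{question}: {cell}")
```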

$$C \rightarrow Q$$

[Figure: Age by General Health]

Compare the distribution of the response \( Q \) for each category of the explanatory \( C \)

Data display

  • side-by-side boxplots

Numeric summaries

  • descriptive statistics
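
As a rough illustration of the numeric summaries, the sketch below computes a five-number summary of the response for each category of the explanatory variable. The ages are made-up values, not the survey data behind the plot, and `five_number` is an illustrative helper. Quartile conventions differ between tools; this sketch uses the "inclusive" method of Python's `statistics.quantiles`, which is one of several reasonable choices.

```python
from statistics import quantiles


def five_number(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


# Hypothetical ages grouped by self-reported general health.
ages_by_health = {
    "excellent": [22, 30, 35, 40, 45, 52, 60],
    "poor": [38, 50, 55, 60, 66, 72, 80],
}

# One summary per category: the numbers a side-by-side boxplot would draw.
summaries = {group: five_number(ages) for group, ages in ages_by_health.items()}

for group, summary in summaries.items():
    print(group, summary)
```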

$$C \rightarrow Q$$

[Figure: Age by General Health]

Example

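  • Center: The median age of respondents reporting poor health is 60, while the median age of respondents reporting excellent health is 40.
  • Relative five-number summary: In fact, approximately ¾ of respondents in poor health are as old as or older than the oldest ¼ of respondents in excellent health (the first quartile of the poor group sits near the third quartile of the excellent group).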
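
The relative comparison in the last bullet can be checked directly: find the third quartile of one group, then count what share of the other group lies at or above it. This is a sketch on made-up ages, not the survey data; `share_at_or_above` is an illustrative helper, and the quartile uses the "inclusive" method of `statistics.quantiles`.

```python
from statistics import quantiles


def share_at_or_above(values, cutoff):
    """Fraction of values that are greater than or equal to cutoff."""
    return sum(v >= cutoff for v in values) / len(values)


# Hypothetical ages for two health categories.
excellent = [22, 30, 35, 40, 45, 52, 60, 64]
poor = [38, 50, 55, 60, 66, 72, 80, 85]

# Cutoff for the oldest quarter of the "excellent" group.
_, _, q3_excellent = quantiles(excellent, n=4, method="inclusive")

# Share of the "poor" group that is at least that old.
share = share_at_or_above(poor, q3_excellent)
print(q3_excellent, share)
```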

$$C \rightarrow C$$

Are men and women just as likely to think their weight is about right?

[Table: two-way table of gender by body image]

$$C \rightarrow C$$

Marginal distribution of Body Image
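
A minimal sketch of how a two-way table and the marginal distribution of the response could be computed from raw responses. The responses and the category labels ("about right", "overweight", "underweight") are assumptions made up for illustration, not the data from the lecture's table.

```python
from collections import Counter

# Hypothetical (gender, body image) responses, one pair per respondent.
responses = [
    ("female", "about right"), ("female", "overweight"),
    ("female", "about right"), ("female", "underweight"),
    ("male", "about right"), ("male", "overweight"),
    ("male", "about right"), ("male", "about right"),
]

# Two-way table: count of respondents in each (gender, body image) cell.
two_way = Counter(responses)

# Marginal distribution of body image: column totals divided by the
# grand total, ignoring gender entirely.
body_image_totals = Counter(image for _, image in responses)
n = len(responses)
marginal = {image: count / n for image, count in body_image_totals.items()}

for image, share in marginal.items():
    print(f"{image}: {share:.1%}")
```

The marginal distribution describes body image for everyone together; answering whether men and women differ requires comparing the distribution within each gender.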