Stat 202: Lecture 9 (covers pp. 118-123)

Nathan VanHoudnos
10/13/2014

Agenda

  1. Homework comments
  2. Checkpoint #10 results
  3. Lecture 9 (covers pp. 118-123)

Homework comments

Agenda

  1. Homework comments
  2. Checkpoint #10 results
  3. Lecture 9 (covers pp. 118-123)

Checkpoint #10 results

to fill in

Checkpoint #10 Question 5

Dogs are inbred for such desirable characteristics as blue eye color; but an unfortunate by-product of such inbreeding can be the emergence of characteristics such as deafness. A 1992 study of Dalmatians (by Strain and others, as reported in The Dalmatians Dilemma) found the following:

(i) 31% of all Dalmatians have blue eyes.
(ii) 38% of all Dalmatians are deaf.
(iii) 42% of blue-eyed Dalmatians are deaf.

Based on the results of this study, is “having blue eyes” independent of “being deaf”?

Checkpoint #10 Question 5

Write this out:

(i) 31% of all Dalmatians have blue eyes.
\[ P(B) = .31 \]

(ii) 38% of all Dalmatians are deaf.
\[ P(D) = .38 \]

(iii) 42% of blue-eyed Dalmatians are deaf.
\[ P(D|B) \text{ or } P(B|D)? \]

\[ P(D|B) = .42 \]

Checkpoint #10 Question 5

\[ \begin{aligned} P(B) & = .31 & P(D) & = .38 \\ P(D|B) & = .42 \end{aligned} \]

Based on the results of this study, is “having blue eyes” independent of “being deaf”?

  • a) No, since .31 * .38 is not equal to .42.
  • b) No, since .38 is not equal to .42.
  • c) No, since .31 is not equal to .42.

Write out the symbols…

Checkpoint #10 Question 5

\[ \begin{aligned} P(B) & = .31 & P(D) & = .38 \\ P(D|B) & = .42 \end{aligned} \]

Based on the results of this study, is “having blue eyes” independent of “being deaf”?

  • a) No, since \( P(B)P(D) \) is not equal to \( P(D|B) \).
  • b) No, since \( P(D) \) is not equal to \( P(D|B) \).
  • c) No, since \( P(B) \) is not equal to \( P(D|B) \).

Checkpoint #10 Question 5

\[ \begin{aligned} P(B) & = .31 & P(D) & = .38 \\ P(D|B) & = .42 \end{aligned} \]

Based on the results of this study, is “having blue eyes” independent of “being deaf”?

  • a) No, since \( P(B)P(D) \) is not equal to \( P(D|B) \).
  • b) No, since \( P(D) \) is not equal to \( P(D|B) \).
  • c) No, since \( P(B) \) is not equal to \( P(D|B) \).

Independence holds if and only if \[ P(D|B) = P(D) \] and since \( .42 \ne .38 \), b) is the correct choice.
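As a quick sanity check, the comparison can be written out numerically (a minimal Python sketch using the study's figures):

```python
# Probabilities reported in the Dalmatian study
p_B = 0.31            # P(B): blue eyes
p_D = 0.38            # P(D): deaf
p_D_given_B = 0.42    # P(D|B): deaf, given blue eyes

# Independence requires P(D|B) = P(D); here .42 != .38
print(p_D_given_B == p_D)  # False, so the events are dependent
```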

Agenda

  1. Homework comments
  2. Checkpoint #10 results
  3. Lecture 9 (covers pp. 118-123)

Probability Rules!

Let \( S \) be the sample space, \( A \) any event, \( A^c \) its complement, and \( B \) another event.

  1. \( 0 \le P(A) \le 1 \)

  2. \( P(S) = 1 \)

  3. \( P(A^c) = 1 - P(A) \)

  4. \( P(A \text{ or } B ) = P(A) + P(B) - P(A \text{ and } B) \)

  5. If and only if \( A \) and \( B \) are independent, then

    \[ P(A \text{ and } B ) = P(A) \times P(B) \]

General multiplication rule

Recall the definition of conditional probability:

\[ P(A|B) = \frac{P(A \text{ and } B )}{P(B)} \]

Therefore, for all events \( A \) and \( B \):

\[ P(A \text{ and } B )= P(A|B) \times P(B) \]

Why a generalization?

The general rule:

\[ P(A \text{ and } B ) = P(A|B) \times P(B) \]

Recall that \( A \) is independent of \( B \) if and only if

\[ P(A|B) = P(A) \]

Therefore, if \( A \) is independent of \( B \),

\[ \begin{aligned} P(A \text{ and } B ) & = P(A|B) P(B) \\ & = P(A) P(B) \end{aligned} \]

which is rule #5.

Probability Rules!

Let \( S \) be the sample space, \( A \) any event, \( A^c \) its complement, and \( B \) another event.

  1. \( 0 \le P(A) \le 1 \)

  2. \( P(S) = 1 \)

  3. \( P(A^c) = 1 - P(A) \)

  4. \( P(A \text{ or } B ) = P(A) + P(B) - P(A \text{ and } B) \)

  5. \( P(A \text{ and } B ) = P(A|B) P(B) \)

  6. Independent: if and only if \( P(A|B) = P(A) \).

General multiplication rule

Note that: \[ \begin{aligned} P(A|B) & = \frac{P(A \text{ and } B )}{P(B)} \\ P(B|A) & = \frac{P(A \text{ and } B )}{P(A)} \end{aligned} \]

Therefore:

\[ P(A \text{ and } B ) = P(A|B) P(B) = P(B|A) P(A) \]

Both factorizations are correct.
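Both factorizations can be checked with a small numeric example (the probabilities below are hypothetical, chosen only for illustration):

```python
# Hypothetical joint and marginal probabilities for events A and B
p_A_and_B = 0.12
p_A = 0.30
p_B = 0.40

p_A_given_B = p_A_and_B / p_B   # P(A|B) = 0.30
p_B_given_A = p_A_and_B / p_A   # P(B|A) = 0.40

# Both orderings of the general multiplication rule recover the joint
print(abs(p_A_given_B * p_B - p_A_and_B) < 1e-12)  # True
print(abs(p_B_given_A * p_A - p_A_and_B) < 1e-12)  # True
```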

More than two events:

In later courses you will work with objects like this:

\[ \begin{aligned} P(X, \mu, \text{ and } \sigma) & = P(X, \big\{ \mu \text{ and } \sigma \big\} ) \\ & = P(X|\big\{ \mu \text{ and } \sigma \big\}) P( \big\{ \mu \text{ and } \sigma \big\} ) \\ & = P(X| \mu, \sigma ) P(\mu|\sigma)P(\sigma) \end{aligned} \]

This chaining of the general multiplication rule is important for:

  • Hierarchical Linear Models (HLM)
  • All of Bayesian statistics

An example

In a certain region, one in every thousand (0.001) individuals is infected with HIV, the virus that causes AIDS. Tests for presence of the virus are fairly accurate but not perfect. If someone actually has HIV, the probability of testing positive is 0.95.

Let \( H \) denote the event of having HIV, and \( T \) the event of testing positive.

\[ \begin{aligned} P(H) & = ? & P(T) & = ? \\ P(H \text{ and } T) & = ? & P(H \text{ or } T) & = ? \\ P(H|T) & = ? & P(T|H) & = ? \end{aligned} \]

An example

In a certain region, one in every thousand (0.001) individuals is infected with HIV, the virus that causes AIDS. Tests for presence of the virus are fairly accurate but not perfect. If someone actually has HIV, the probability of testing positive is 0.95.

Let \( H \) denote the event of having HIV, and \( T \) the event of testing positive.

\[ \begin{aligned} P(H) & = 0.001 & P(T) & = ? \\ P(H \text{ and } T) & = ? & P(H \text{ or } T) & = ? \\ P(H|T) & = ? & P(T|H) & = 0.95 \end{aligned} \]

An example

What is the probability that someone chosen at random has HIV and tests positive?

\[ \begin{aligned} P(H) & = 0.001 & P(T) & = ? \\ P(H \text{ and } T) & = ? & P(H \text{ or } T) & = ? \\ P(H|T) & = ? & P(T|H) & = 0.95 \end{aligned} \]

We need: \[ \begin{aligned} P(H \text{ and } T) & = P(T|H)P(H) \\ & = (0.95) (0.001) = 0.00095 \end{aligned} \]

implying that approximately 1/10 of 1% of people will have HIV and will test positive for it.
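The arithmetic, as a short Python check (values taken from the problem statement above):

```python
p_H = 0.001          # P(H): prevalence of HIV
p_T_given_H = 0.95   # P(T|H): probability of testing positive given HIV

# General multiplication rule: P(H and T) = P(T|H) P(H)
p_H_and_T = p_T_given_H * p_H
print(round(p_H_and_T, 5))  # 0.00095
```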

A further example

A sales representative tells his friend that the probability of landing a major contract by the end of the week, resulting in a large commission, is .4. If the commission comes through, the probability that he will indulge in a weekend vacation in Bermuda is .9. Even if the commission doesn't come through, he may still go to Bermuda, but only with probability .3.

\[ \begin{aligned} P(C) & = ? & P(V) & = ? \\ P(V|C) & = ? & P(V|\text{not } C) & = ? \\ \end{aligned} \]

A further example

A sales representative tells his friend that the probability of landing a major contract by the end of the week, resulting in a large commission, is .4. If the commission comes through, the probability that he will indulge in a weekend vacation in Bermuda is .9. Even if the commission doesn't come through, he may still go to Bermuda, but only with probability .3.

\[ \begin{aligned} P(C) & = 0.40 & P(V) & = ? \\ P(V|C) & = 0.90 & P(V|\text{not } C) & = 0.3 \\ \end{aligned} \]

Probability Trees

“the probability of landing a major contract … is .4”

       +----0.40
      /       
     C        
    /         
---<
    \         
     not C    
      \       
       +----0.60

Probability Trees

“If the commission comes through, the probability [of a] vacation … is .9.”

                        /-----0.90
                       V
                      /
       +----0.40-----< 
      /               \
     C                 not V
    /                   \-------------[
---<
    \                  /--------------[
     not C            V
      \              /
       +----0.60----<         
                     \
                      not V
                       \--------------[

Probability Trees

“If the commission comes through, the probability [of a] vacation … is .9.”

                        /-----0.90
                       V
                      /
       +----0.40-----< 
      /               \
     C                 not V
    /                   \-----0.10
---<
    \                  /--------------[
     not C            V
      \              /
       +----0.60----<         
                     \
                      not V
                       \--------------[

Probability Trees

“Even if the commission doesn't come through, he may still go … with probability .3.”

                        /-----0.90
                       V
                      /
       +----0.40-----< 
      /               \
     C                 not V
    /                   \-----0.10
---<
    \                  /------0.30
     not C            V
      \              /
       +----0.60----<         
                     \
                      not V
                       \------0.70

Read off conditional probabilities

                   /--P(V|C)         = 0.90
                  V
          P(C)   /
       +--0.40--< 
      /          \
     C            not V     
    /              \--P(not V|C)     = 0.10
---<
    \              /--P(V|not C)     = 0.30
     not C        V       
      \          /
       +--0.60--<         
        P(not C) \
                  not V
                   \--P(not V|not C) = 0.70

A two-way probability table

      |  V   | not V | Total
------|------|-------|------
C     |      |       |  0.40
------|------|-------|------
not C |      |       |  0.60  
------|------|-------|------
Total |      |       |  1 

From the tree we have

  • \( P(V|C) \), \( P(\text{not } V|C) \),
  • \( P(V|\text{not } C) \) and \( P(\text{not } V|\text{not } C) \).

How do we get \( P(V \text{ and } C) \) etc. to put in the table?

E.g. Probability of a vacation?

There are two ways to take a vacation, with and without the commission:

\[ P(V) = P(V \text{ and } C) + P(V \text{ and not } C) \]

By the general multiplication rule we have:

\[ \begin{aligned} P(V \text{ and } C) & = P(V|C) P(C) \\ P(V \text{ and not } C) & = P(V|\text{not } C) P(\text{not } C) \end{aligned} \]

Therefore:

\[ P(V) = P(V|C) P(C) + P(V|\text{not } C) P(\text{not } C) \]
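Plugging in the salesman's numbers (a Python sketch; the values come from the problem statement):

```python
p_C = 0.40             # P(C): commission comes through
p_V_given_C = 0.90     # P(V|C): vacation given commission
p_V_given_notC = 0.30  # P(V|not C): vacation without commission

# Law of total probability: P(V) = P(V|C)P(C) + P(V|not C)P(not C)
p_V = p_V_given_C * p_C + p_V_given_notC * (1 - p_C)
print(round(p_V, 2))  # 0.54
```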

Probability of a vacation?

\( P(V) = P(V|C) P(C) + P(V|\text{not } C) P(\text{not } C) \)

       +-0.90 # P(V|C)P(C) = 0.90 * 0.40 
       V                   = 0.36
       |    
+-0.40-+ 
|      |
C      not V     
|      +-0.10
<
|      +-0.30 
not C  V      
|      |      
+-0.60-+      
       |      
       not V  
       +-0.70 

Probability of a vacation?

\( P(V) = P(V|C) P(C) + P(V|\text{not } C) P(\text{not } C) \)

       +-0.90 # P(V|C)P(C) = 0.90 * 0.40 
       V                   = 0.36
       |    
+-0.40-+ 
|      |
C      not V     
|      +-0.10
<
|      +-0.30 # P(V|not C)P(not C)  
not C  V                  = 0.60 * 0.30
|      |                  = 0.18
+-0.60-+      
       |      
       not V  
       +-0.70 

Probability of a vacation?

\( P(V) = P(V|C) P(C) + P(V|\text{not } C) P(\text{not } C) \)

       +-0.90 # P(V|C)P(C) = 0.90 * 0.40 
       V                   = 0.36
       |    
+-0.40-+ 
|      |
C      not V     
|      +-0.10
<
|      +-0.30 # P(V|not C)P(not C)  
not C  V                  = 0.60 * 0.30
|      |                  = 0.18
+-0.60-+         ______________________
       |        { Therefore:           }
       not V    { P(V) = 0.36 + 0.18   }
       +-0.70   {      = 0.54          }

A two-way probability table

After multiplying out the tree:

      |  V   | not V | Total
------|------|-------|------
C     | 0.36 |       |  0.40
------|------|-------|------
not C | 0.18 |       |  0.60  
------|------|-------|------
Total | 0.54 |       |  1 

A two-way probability table

And finding the rest:

      |  V   | not V | Total
------|------|-------|------
C     | 0.36 | 0.04  |  0.40
------|------|-------|------
not C | 0.18 | 0.42  |  0.60  
------|------|-------|------
Total | 0.54 | 0.46  |  1 
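The whole table can be reconstructed from the tree's conditional probabilities; this sketch fills in each cell and checks that the cells sum to 1:

```python
p_C, p_V_given_C, p_V_given_notC = 0.40, 0.90, 0.30

# Joint probabilities via the general multiplication rule
V_and_C = p_V_given_C * p_C               # 0.36
V_and_notC = p_V_given_notC * (1 - p_C)   # 0.18

# The remaining cells follow from the row totals
notV_and_C = p_C - V_and_C                # 0.04
notV_and_notC = (1 - p_C) - V_and_notC    # 0.42

# Column total for V, plus a check that the four cells sum to 1
p_V = V_and_C + V_and_notC                # 0.54
assert abs(V_and_C + V_and_notC + notV_and_C + notV_and_notC - 1) < 1e-12
```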

Summary thus far....

Two way tables (or Venn Diagrams)

  • when the problem gives \( P(A \text{ and } B) \) etc.

Probability trees

  • when the problem gives \( P(A|B) \) etc.

Can convert back-and-forth between them as needed.

An exercise

Suppose the friend left for a week. When the friend returned to the office, the salesman had left for Bermuda.

What is the probability that the salesman received a commission given that he is on vacation in Bermuda?

\[ P(C|V) = ? \]

From the probability table ...

\[ P(C|V) = ? \]

      |  V   | not V | Total
------|------|-------|------
C     | 0.36 | 0.04  |  0.40
------|------|-------|------
not C | 0.18 | 0.42  |  0.60  
------|------|-------|------
Total | 0.54 | 0.46  |  1 

\[ P(C|V) = \frac{P(C \text{ and } V)}{P(V)} = \frac{.36}{.54} = .667 \]

implying that there is a 67% chance that he received the commission.
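From the table, the computation is a single division (Python sketch using the table's cells):

```python
p_C_and_V = 0.36  # joint probability from the table
p_V = 0.54        # column total from the table

# Definition of conditional probability
p_C_given_V = p_C_and_V / p_V
print(round(p_C_given_V, 3))  # 0.667
```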

Tree to table is unsatisfying

The probability of landing a major contract by the end of the week is .4. If the commission comes through, the probability that he will vacation in Bermuda is .9. Even if the commission doesn't come through, he may still go to Bermuda, but only with probability .3.

\[ \begin{aligned} P(C) & = 0.40 & P(V) & = ? \\ P(V|C) & = 0.90 & P(V|\text{not } C) & = 0.3 \\ \end{aligned} \]

What is the probability that the salesman received a commission given that he is on vacation in Bermuda? \[ P(C|V) = ? \]

A better way

Recall: \[ \begin{aligned} P(A|B) & = \frac{P(A \text{ and } B )}{P(B)} \\ P(B|A) & = \frac{P(A \text{ and } B )}{P(A)} \\ P(A \text{ and } B ) & = P(A|B) P(B) = P(B|A) P(A) \end{aligned} \]

Therefore: \[ P(A|B) = \frac{P(B|A) P(A)}{P(B)} \]

This is Bayes' Rule.
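Bayes' Rule translates directly into a small helper function (a sketch; the function name is our own):

```python
def bayes(p_B_given_A, p_A, p_B):
    """Return P(A|B) = P(B|A) P(A) / P(B)."""
    return p_B_given_A * p_A / p_B

# Salesman example: P(C|V) = P(V|C) P(C) / P(V)
print(round(bayes(0.90, 0.40, 0.54), 3))  # 0.667
```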

Bayes' Rule and Total Probability

Bayes' Rule:

\[ P(A|B) = \frac{P(B|A) P(A)}{P(B)} \]

  • Allows you to reverse a conditional probability.

Law of Total Probability

\[ \begin{aligned} P(B) & = P(B \text{ and } A) + P(B \text{ and not } A) \\ & = P(B|A)P(A) + P(B|\text{not }A)P(\text{not }A) \end{aligned} \]

  • combine with Bayes' Rule to reverse a probability tree.

Salesman Reprise

\[ \begin{aligned} P(C) & = 0.40 & P(V) & = ? \\ P(V|C) & = 0.90 & P(V|\text{not } C) & = 0.3 \\ \end{aligned} \]

What is the probability that the salesman received a commission given that he is on vacation in Bermuda? \[ P(C|V) = ? \]

\[ \begin{aligned} P(C|V) & = \frac{P(V|C) P(C)}{P(V)} \\ P(V) & = P(V|C)P(C) + P(V|\text{not }C)P(\text{not } C) \end{aligned} \]

Salesman Reprise

\[ \begin{aligned} P(C) & = 0.40 & P(V) & = ? \\ P(V|C) & = 0.90 & P(V|\text{not } C) & = 0.3 \\ \end{aligned} \]

What is the probability that the salesman received a commission given that he is on vacation in Bermuda? \[ P(C|V) = ? \]

\[ \begin{aligned} P(C|V) & = \frac{ 0.90 * 0.40 }{P(V)} \\ P(V) & = .90 * 0.40 + 0.3 * (1 - 0.40) = 0.54 \end{aligned} \]

Salesman Reprise

\[ \begin{aligned} P(C) & = 0.40 & P(V) & = ? \\ P(V|C) & = 0.90 & P(V|\text{not } C) & = 0.3 \\ \end{aligned} \]

What is the probability that the salesman received a commission given that he is on vacation in Bermuda?

\[ \begin{aligned} P(C|V) & = \frac{ 0.90 * 0.40 }{0.54} \\ & = 0.667 \end{aligned} \]

implying that there is a 67% chance that he received the commission.

Summary

Recall that Stat 202 will let you solve probability problems your own way.

  • brute force: (i) make a probability tree, then (ii) make a table, then (iii) find the relevant conditional probability.

  • elegant: Bayes' Rule, the Law of Total Probability, and other probability rules.

A spy example

Polygraph (lie-detector) tests are often routinely administered to employees or prospective employees in sensitive positions. Lie detector results are “better than chance, but well below perfection.” Typically, the test may conclude someone is a spy 80% of the time when he or she actually is a spy, but 16% of the time the test will conclude someone is a spy when he or she is not.

Let us assume that 1 in 1,000, or .001, of the employees in a certain highly classified workplace are actual spies.

A spy example

The test may conclude someone is a spy 80% of the time when he or she actually is a spy, but 16% of the time it will conclude someone is a spy when he or she is not. Assume that 1 in 1,000, or .001, of employees are actual spies.

\[ P(S) = ? \]

\[ P(S) = 0.001 \]

\[ P(D|S) \text{ or } P(S|D)? \]

\[ P(D|S) = 0.80 \]

\[ P(D|\text{not }S) = ? \]

\[ P(D|\text{not }S) = .16 \]

A spy example

\[ P(S) = 0.001 \]

\[ P(D|S) = 0.80 \]

\[ P(D|\text{not }S) = .16 \]

If the polygraph detects a spy, are you convinced that the person is actually a spy?

\[ P(S|D) \text{ or } P(D|S)? \]

\[ P(S|D) = \frac{P(D|S)P(S)}{P(D)} \]

\[ P(D) = P(D|S)P(S) + P(D|\text{not }S)P(\text{not }S) \]

A spy example

Law of total probability: \[ \begin{aligned} P(D) & = P(D|S)P(S) + P(D|\text{not }S)P(\text{not }S) \\ & = .80 * .001 + .16 * (1 - .001) \\ & = .161 \end{aligned} \]

Bayes' Rule \[ \begin{aligned} P(S|D) & = \frac{P(D|S)P(S)}{P(D)} \\ & = \frac{ 0.80 * 0.001 }{ .161} \\ & = 0.005 \end{aligned} \]
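Both steps in Python (a sketch using the numbers above):

```python
p_S = 0.001            # P(S): prevalence of spies
p_D_given_S = 0.80     # P(D|S): test flags an actual spy
p_D_given_notS = 0.16  # P(D|not S): false-positive rate

# Law of total probability
p_D = p_D_given_S * p_S + p_D_given_notS * (1 - p_S)

# Bayes' Rule
p_S_given_D = p_D_given_S * p_S / p_D
print(round(p_D, 3))          # 0.161
print(round(p_S_given_D, 3))  # 0.005
```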

A spy example

If the polygraph detects a spy, are you convinced that the person is actually a spy?

\[ P(S|D) = 0.005 \]

implying that about one half of one percent of “detections” are actual spies.

Are you convinced?

\( P(S|D) \ne P(D|S) \), reprise

The order of conditioning matters:

\[ P(S|D) = 0.005 \]

implying that about one half of one percent of “detections” are actual spies.

\[ P(D|S) = 0.80 \]

implying that 80% of actual spies are detected by the test.

Careful attention is required!

Bayesian Statistics

Thomas Bayes

c. 1701-1761

  • Richard Price published Bayes' Rule after Bayes' death.

Pierre-Simon Laplace

1749-1827

  • Set the foundation for Bayesian Statistics.

An editorial comment