PROBLEMS ON CONFUSING ASPECTS OF PROBABILITY
1. Using Venn Diagrams
Suppose you are studying a sample of 200 butterflies taken from your garden. You notice that 40 of them are Orange and 90 of them have Blue spots close to the eye. Upon examining the Orange ones, you find that 3/4 of them have Blue spots.
Draw a Venn diagram, approximately to scale, representing the above data. For each of the following questions about this sample, give a numerical value where possible. If it is not possible to compute the answer exactly, does the data provided allow you to at least estimate it? If so, put in your estimate. If not, say why not.
Connect any pairs of probabilities that are inverses of one another with brackets.
a) P(O) = 40/200
b) P(B) = 90/200
c) P(O/B) = 30/90
d) P(B/O) = 30/40
e) P(not-O) = 160/200
f) P(not-B) = 110/200
g) P(not-O/B) = 60/90
h) P(not-B/O) = 10/40
i) P(O/not-B) = 10/110
j) P(B/not-O) = 60/160
Pairs of inverses are: c and d, g and j, h and i.
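The answers above can be checked by counting butterflies in the four regions of the Venn diagram. The following is a minimal sketch in Python; the helper `p` and the predicate names `O`, `B`, `not_O`, `not_B` are illustrative choices, not part of the problem.

```python
from fractions import Fraction

# Counts from the problem statement: 200 butterflies, 40 Orange, 90 with
# Blue spots, and 3/4 of the Orange ones have Blue spots.
total, orange, blue = 200, 40, 90
both = orange * 3 // 4                      # Orange with Blue spots

# The four regions of the Venn diagram, keyed by (is_orange, has_blue).
regions = {
    (True, True): both,
    (True, False): orange - both,
    (False, True): blue - both,
    (False, False): total - orange - blue + both,
}

def p(event, given=lambda o, b: True):
    """P(event / given), found by counting butterflies in the regions."""
    inside = sum(n for (o, b), n in regions.items() if given(o, b))
    hits = sum(n for (o, b), n in regions.items() if given(o, b) and event(o, b))
    return Fraction(hits, inside)

# Helper predicates (illustrative names, not from the problem).
O = lambda o, b: o
B = lambda o, b: b
not_O = lambda o, b: not o
not_B = lambda o, b: not b

print("a) P(O)       =", p(O))
print("c) P(O/B)     =", p(O, B))
print("d) P(B/O)     =", p(B, O))
print("g) P(not-O/B) =", p(not_O, B))
print("j) P(B/not-O) =", p(B, not_O))
print("h) P(not-B/O) =", p(not_B, O))
print("i) P(O/not-B) =", p(O, not_B))
```

Fraction reduces its results, so 30/90 is printed as 1/3. Each pair of inverses shares a numerator (the count in the overlapping region) and differs only in the denominator.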
2. Conjunction
Consider the following description of Bill and then some candidate statements about him. You do not need to rank order the options yourself. Instead, first specify any limitations the rules of probability would place on the rankings that you should assign if you were following the dictates of rationality. Secondly, describe what you would expect most people to actually do in comparison with what you have specified above.
"Bill is 34 years old. He is intelligent, but unimaginative, compulsive, and generally lifeless. In school, he was strong in mathematics but weak in social studies and humanities.
Please rank order the following statements by their probability, using 1 for the most probable and 8 for the least probable.
Bill is a physician who plays poker for a hobby
Bill is an architect
Bill is an accountant
Bill plays jazz for a hobby
Bill surfs for a hobby
Bill is a reporter
Bill is an accountant who plays jazz for a hobby
Bill climbs mountains for a hobby"
Answer: According to the rules of probability, a compound event (a conjunction) can never be more probable than either of the single events that make it up. For example, from the description of Bill, the probability that he is an accountant seems very high, while the probability that he plays jazz for a hobby seems rather low. Thus, the conjunction “Bill is an accountant who plays jazz for a hobby” should be ranked as no more probable than either “Bill is an accountant” or “Bill plays jazz for a hobby.” However, most people will rank the conjunction “Bill is an accountant who plays jazz for a hobby” between the two: below “Bill is an accountant” but above “Bill plays jazz for a hobby.”
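The limitation that probability places on a conjunction can be written as a pair of bounds. The sketch below is only an illustration: the function name `conjunction_bounds` and the two probabilities assigned to Bill are hypothetical values, not figures from the problem.

```python
def conjunction_bounds(p_a, p_b):
    """Range of values P(A and B) can take when only P(A) and P(B) are known."""
    lower = max(0.0, p_a + p_b - 1.0)   # the two events must overlap at least this much
    upper = min(p_a, p_b)               # a conjunction cannot exceed either component
    return lower, upper

# Hypothetical numbers, chosen only for illustration.
p_accountant = 0.60
p_jazz = 0.05

low, high = conjunction_bounds(p_accountant, p_jazz)
print(f"P(accountant and jazz) lies between {low:.2f} and {high:.2f}")
```

Whatever values are chosen, the upper bound is the smaller of the two component probabilities, so the conjunction can never outrank “Bill plays jazz for a hobby.”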
3. Bill again
Suppose you now learn that the little description of Bill was drawn at random from a file drawer that contains the combined notes of a Human Resources clerk who was interviewing people for two jobs, one in accounting and one in sales. Most of the candidates (75%) were looking for the sales job but a quarter of them wanted to be accountants. The clerk describes Bill as follows:
"Bill is 34 years old. He seems to be intelligent, but unimaginative, compulsive, and generally lifeless. In school, he was strong in mathematics but weak in social studies and humanities."
Using what you have learned in this course, how probable should you think it is that Bill is an accountant? Justify that answer. Do you think most people would agree with that answer?
Answer: The probability that Bill is an accountant should be only slightly greater than 25%. The description of Bill leads one to think that he is an accountant, but because 75% of the files in the drawer belonged to sales candidates, one should not conclude that the probability of Bill being an accountant is much greater than 25%. Most people probably would not agree with this answer, and would say that the probability of Bill being an accountant is much higher (say 75% or so). This is the result of giving too much weight to the description of Bill and ignoring the base rate: 75% of the applicants were looking for the sales job, and only 25% wanted to be accountants.
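How far the description should move the estimate away from the 25% base rate depends on how much more typical it is of accounting candidates than of sales candidates. The problem gives no such figure, so the likelihood ratios in this sketch are purely hypothetical; the function name `posterior` is also an illustrative choice.

```python
def posterior(prior, likelihood_ratio):
    """P(accountant / description), from the prior P(accountant) and the ratio
    P(description / accountant) divided by P(description / sales)."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

prior_accountant = 0.25   # from the problem: a quarter wanted to be accountants

# Hypothetical likelihood ratios; the problem does not supply one.
for ratio in (1, 1.5, 3, 9):
    print(f"ratio {ratio}: P(accountant) = {posterior(prior_accountant, ratio):.2f}")
```

Under these assumptions, an answer of 75% would require the description to be nine times as likely to be written about an accounting candidate as about a sales candidate.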
4. Bjorn Borg
This problem comes from an experiment conducted when Bjorn Borg was winning everything.
"Suppose Bjorn Borg reaches the Wimbledon finals in 1981. Please rank order the following outcomes from most to least likely:
Borg will win the match - 1
Borg will lose the first set - 2
Borg will win the first set but lose the match - 4
Borg will lose the first set but win the match" - 3
First, provide your own rank orderings to the above (even though you don't know much about Borg). Secondly, discuss what the cognitive psychologists were probably looking for when they designed the above question and what they most likely found.
Answer: Since Borg was winning everything at this time, his winning the match is the most likely outcome. His losing the first set is ranked next: it is a single event, and it covers both the case where he goes on to win the match and the case where he goes on to lose it, so it must be at least as probable as “Borg will lose the first set but win the match.” That conjunction combines the first and second most probable events and cannot be more probable than either, so it is ranked third. Finally, since “Borg will win the first set but lose the match” requires the unlikely event of Borg losing the match, it is ranked fourth.
When cognitive psychologists designed the above question, they were probably looking to see how people rank the probability of single events compared with compound events. One likely result is that people ranked “Borg will win the match” as most likely and “Borg will lose the first set” as least likely, but then ranked “Borg will lose the first set but win the match” as either the second or third most likely alternative. This contradicts the rules of probability, because a conjunction cannot be more probable than one of its components.
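The check that a ranking respects the conjunction rule can be sketched in a few lines. The outcome labels are shortened versions of the four statements, and the ranking called `typical` is only an illustration of the pattern described in the answer, not data from the experiment.

```python
# (conjunction, component) pairs among the four listed outcomes. The
# components of "win first set, lose match" are not on the list, so that
# conjunction is not constrained by the other three items.
constraints = [
    ("lose first set, win match", "win match"),
    ("lose first set, win match", "lose first set"),
]

def violations(ranking):
    """ranking maps outcome -> rank, where 1 is the most likely.
    Returns the pairs where a conjunction outranks one of its components."""
    return [(conj, comp) for conj, comp in constraints
            if ranking[conj] < ranking[comp]]

answer_key = {
    "win match": 1,
    "lose first set": 2,
    "lose first set, win match": 3,
    "win first set, lose match": 4,
}

# Illustrative ranking of the kind the answer expects many people to give.
typical = {
    "win match": 1,
    "lose first set, win match": 2,
    "win first set, lose match": 3,
    "lose first set": 4,
}

print(violations(answer_key))
print(violations(typical))
```

The ranking given in the answer produces no violations, while the illustrative one is flagged because it places the conjunction above “Borg will lose the first set.”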
5. The Hit and Run Cab
A cab was involved in a hit and run accident at night. Two cab companies, the Green and the Blue, operate in the city but Green has the bigger fleet, nine times bigger to be precise.
a) At first the police can find no witness. At this point what is a reasonable value to assign to the probability that the hit and run cab was Green? Explain briefly.
Since we don’t have any additional information, we must rely on the number of green and blue cabs in the city. Green’s fleet is nine times the size of Blue’s, so 9 out of every 10 cabs are green, and the probability of the cab being green is 90%.
b) The police now find a witness, but he is badly color blind and hasn't the foggiest idea what color the cab was. "But it was going too fast," he says. At this point, what is a reasonable probability to assign to the hypothesis that the cab was Green? Explain briefly.
90%, for the same reason as in a): this testimony tells us nothing about the color of the cab.
c) Another witness comes forward now and testifies that the cab was Blue. Lawyers for the cab company try to impugn her testimony by testing the reliability of the witness under the circumstances that existed on the night of the accident and concluded that the witness correctly identified each of the colors 80% of the time and failed 20% of the time.
Which, if any, of the following statements correctly summarizes part or all of the findings of the reliability test:
1) prob of reporting green, given that the cab was in fact green, is 80%
This is true. When we know the color of the cab, we know that her report is correct 80% of the time.
2) prob that the cab was in fact green, given that the witness reported it as such, is 80%
This is false (this is the inverse of Statement #1). The test tells us how accurate her report is when we know the color of the cab, but it does not by itself tell us how likely each color is given her report. The probability that the cab is green, given that the witness reported it as such, is not 80%. In determining the probability that the cab was green, we also have to take into consideration the fact that 90% of the cabs in the city are green.
3) you can believe what the witness says 80% of the time - if she says the cab was Blue, you can bet that it was blue and be right four times out of five.
This is also false, for the same reason as in #2.
4) 80% of the time, the witness reports the correct color - if the cab is a Blue one, four to one she'll call it Blue.
This is true, for the same reason as in #1.
5) If you put the witness out on a street corner on a night like that and had her watch the cabs go by, 80% of the time she would report green and 20% of the time she would report blue - on average.
This is false. Since 90% of the cabs are green and she is equally accurate in reporting both colors, how often she reports each color depends on the mix of cabs going by as well as on her accuracy.
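To see why statement 5 fails, the share of "green" reports can be worked out directly. This sketch assumes that the cabs passing the street corner mirror the fleet sizes (90% green, 10% blue), which the problem does not state explicitly; the variable names are illustrative.

```python
p_green = 0.9      # Green's fleet is nine times the size of Blue's
p_blue = 0.1
accuracy = 0.8     # the witness names the true color 80% of the time

# She reports green when a green cab is seen correctly
# or a blue cab is seen incorrectly, and likewise for blue.
p_report_green = p_green * accuracy + p_blue * (1 - accuracy)
p_report_blue = p_blue * accuracy + p_green * (1 - accuracy)

print(f"reports green: {p_report_green:.2f}")
print(f"reports blue:  {p_report_blue:.2f}")
```

Under that assumption she would report green about 74% of the time and blue about 26% of the time, not 80% and 20%.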
Extra Credit:
This is a standard Bayes’ Theorem problem. The base rate for Green is 90% but the Report that the cab was blue pulls in the opposite direction.
\[
P(G/R) = \frac{p(G) \times p(R/G)}{[p(G) \times p(R/G)] + [p(B) \times p(R/B)]}
       = \frac{0.9 \times 0.2}{[0.9 \times 0.2] + [0.1 \times 0.8]}
       = \frac{9}{13}
\]
The answer still favors Green but by much less than 90%.
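The arithmetic can be reproduced with exact fractions. This is a minimal sketch; the variable names are illustrative, and R stands for the witness's report that the cab was Blue.

```python
from fractions import Fraction

p_g = Fraction(9, 10)           # base rate for Green
p_b = Fraction(1, 10)           # base rate for Blue
p_r_given_g = Fraction(2, 10)   # she says Blue when the cab is Green (an error)
p_r_given_b = Fraction(8, 10)   # she says Blue when the cab is Blue (correct)

# Bayes' Theorem: probability the cab was Green given the report of Blue.
p_g_given_r = (p_g * p_r_given_g) / (p_g * p_r_given_g + p_b * p_r_given_b)

print(p_g_given_r)          # 9/13
print(float(p_g_given_r))
```

The result, 9/13, is roughly 69%: the report lowers the probability of Green from 90% but does not overturn it.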