(ok, so it grew to 13, but saying Top 13 just doesn't sound right!)
What's the difference between confounding and lurking variables?
What's the difference between independence and mutually exclusive?
What's the deal with adding variances? Var(2X) =?? Var(X) + Var(X)
I'm running out of time! What can I do (especially with inference for slope)?
Here is the shortest possible (and still honest) answer, given by Dan Teague: Unfortunately, there is no good/easy answer to any question in AP
Here are the two examples that most helped me explain r^2 to my students. They have been posted on the list many times over the years and I have lost track of the original author!
Height explains weight. Not totally, but roughly. Suppose r^2 is 75% for a dataset relating height and weight. We know that other things affect weight in addition to height, including genetics, diet and exercise. So we say that 75% of the variation in weight can be explained by the variation in height, but that 25% of that variation is due to other factors.

Suppose you are buying a pizza that is $7 plus $1.50 for each topping. Clearly Price = 7 + 1.50(# of toppings). Clearly r and r^2 are 1 and 100%. Does this mean that the number of toppings 100% determines my cost? No, clearly the $7 base price has a lot to do with the price! However, the variation in my price is 100% explained by the variation in the number of toppings I choose.
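If you want to check the pizza example numerically, here is a minimal sketch in Python. The helper name r_squared and the choice of 0 through 5 toppings are assumptions made just for illustration; only the $7 base price and the $1.50 per topping come from the example above.

```python
def r_squared(xs, ys):
    """Return the squared correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# Hypothetical orders: 0 through 5 toppings.
toppings = [0, 1, 2, 3, 4, 5]

# Price = 7 + 1.50(# of toppings), as in the pizza example.
prices = [7 + 1.50 * t for t in toppings]
pizza_r2 = r_squared(toppings, prices)

# Same pizzas with no base price at all: r^2 does not change.
prices_no_base = [1.50 * t for t in toppings]
no_base_r2 = r_squared(toppings, prices_no_base)

print(pizza_r2, no_base_r2)
```

Both values come out as 1, which is the point of the example: r^2 describes how much of the variation in price is accounted for by the variation in toppings, and the base price contributes no variation at all.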
Al Coons has an activity regarding this topic that is archived at this location.
Dan Teague gives a nice explanation of the math involved for r^2 on this archived post.
r^2 was discussed on the list on this date.
Yes!
Why? (especially when my calculator can do it for me and has all these fancy commands! Can't I have my students use those buttons?)
It's on the course description! And here's why:
The idea of transforming data to achieve linearity is a powerful and important idea. It is this idea we are teaching. Re-expressing data and dealing with it in its transformed and linear state is crucial, as is understanding how to back-transform to make an appropriate prediction.
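Here is one way the transform, fit, and back-transform steps might look in Python. This is only a sketch: the data are made up (a quantity that starts at 3 and doubles each year), the names fit_line, years and counts are hypothetical, and a base-10 log is just one possible re-expression.

```python
import math

def fit_line(xs, ys):
    """Least-squares line: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up exponential data: starts at 3 and doubles every year.
years = [0, 1, 2, 3, 4, 5]
counts = [3 * 2 ** t for t in years]

# Step 1: re-express the response so the relationship is linear.
log_counts = [math.log10(c) for c in counts]

# Step 2: fit the line to the transformed data.
intercept, slope = fit_line(years, log_counts)

# Step 3: predict on the log scale, then back-transform.
predicted_log = intercept + slope * 6
predicted_count = 10 ** predicted_log

print(round(predicted_count, 3))
```

The prediction for year 6 lands on 192 (that is, 3 times 2^6) only because the made-up data are perfectly exponential. The step students most often forget is the last one: the line predicts log(count), not count.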
Dave Bock discusses transformations for linearity on this archived post.
Paul Velleman gives a great post about confounding and lurking variables here.
Josh Zucker discusses the issue of extraneous variables and gives a list of links.
If two events are independent, the outcome of one does not affect the probability of the other. For example, whether or not it rains has no bearing on whether a coin lands heads or tails.
If two events are mutually exclusive, then when one happens the other cannot. For example, in picking one M&M from a bag, I can find the probability of drawing green or red. But if I draw green, I cannot draw red.
As we saw on the '02 MC #23, it is useful to notice that if two events are mutually exclusive, they affect each other quite powerfully: if one of them happens, the other CANNOT occur. Thus they are dependent.
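The distinction can be checked by brute force on a single roll of a fair die. The sketch below is an illustration only: the events chosen (even, at most two, odd) and the helper names are assumptions, not anything taken from the exam question.

```python
from fractions import Fraction

def prob(event, sample_space):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def independent(a, b, sample_space):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b, sample_space) == prob(a, sample_space) * prob(b, sample_space)

def mutually_exclusive(a, b):
    """True when the two events share no outcomes."""
    return len(a & b) == 0

die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
at_most_two = {1, 2}
odd = {1, 3, 5}

# Independent but NOT mutually exclusive: P(both) = 1/6 = (1/2)(1/3).
print(independent(even, at_most_two, die), mutually_exclusive(even, at_most_two))

# Mutually exclusive and therefore NOT independent: P(both) = 0, not (1/2)(1/2).
print(independent(even, odd, die), mutually_exclusive(even, odd))
```

The first pair prints True then False, the second prints False then True, matching the point that mutually exclusive events with nonzero probabilities are always dependent.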
Independence vs. mutually exclusive has been discussed on the list.
Floyd Bullard has submitted an awesome post about probability that can be read in the archives. I would strongly encourage rookies to read this post BEFORE teaching probability!
Here's a great explanation from Dave Bock:
For a short answer, try a thought experiment.
Let X represent the outcome when you roll a die. Then 2X represents
rolling one die and doubling the result. The possible outcomes are {2,
4, 6, 8, 10, 12}; they are equiprobable.
On the other hand, X+X represents rolling two dice (or one twice). Now
the possible outcomes are {2, 3, 4, 5, 6, 7, ..., 12}. Some are far
less likely than others. Clearly this is a very different situation.
You can actually calculate both variances, but first just think about
the distributions. It should be pretty obvious that X+X is unimodal and
symmetric, peaking around 7 with very low tails while 2X is uniform
across the same range. The two means are the same, but X+X has a
smaller variance than 2X.
When confronting these situations, students must learn to ask
themselves how many random values they are working with. One random
value multiplied by a constant behaves much differently from summing
several different random values.
I urge students to recognize that a random variable in Statistics is
not the same animal as a variable in algebra. In algebra what we call a
"variable" is really just an unspecified constant. With that
understanding, no matter what number I use for X I'll always substitute
that same value every time I see an X, so it must be true that X+X+X =
3X.
Then I put my Statistics hat on, declare them "random variables", and
pick up a die. I substitute the results of the first roll for the first
X, roll again for the value of the second X, etc. It's pretty clear now
that this X+X+X = 3X equation that seems so obvious in algebra is false
for random variables in Statistics. (One time the four values I
randomly rolled actually worked! The kids thought that was hilarious.
Their laughter at my bad luck clearly showed they understood the
issue.)
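For anyone who wants to actually calculate both variances from the thought experiment, here is a small sketch that enumerates the outcomes exactly. The helper name variance is an assumption; "X+X" is modeled, as in the explanation above, by two separate rolls.

```python
from fractions import Fraction
from itertools import product

def variance(outcomes):
    """Variance of a list of equally likely outcomes."""
    n = len(outcomes)
    mean = Fraction(sum(outcomes), n)
    return sum((o - mean) ** 2 for o in outcomes) / n

faces = range(1, 7)

# One roll.
var_x = variance(list(faces))

# One roll, doubled: 2X.
var_2x = variance([2 * f for f in faces])

# Two separate rolls, added: all 36 equally likely pairs.
var_x_plus_x = variance([a + b for a, b in product(faces, repeat=2)])

print(var_x, var_2x, var_x_plus_x)  # prints 35/12 35/3 35/6
```

So Var(2X) = 4 Var(X), while the sum of two independent rolls has variance Var(X) + Var(X) = 2 Var(X), half as large.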
Read a list discussion thread about adding random variables. Pete Flannigan-Hyde has written an article for AP Central about adding random variables.
Note that once you introduce inference, you can teach the last part of the year very quickly! Especially inference for slope, which is on the AP test.

For inference for slope, focusing on interpreting the computer output can save time.

Not getting into all the nitty-gritty details about homogeneity and independence can save time.

Following the pacing guide that comes with the textbooks can help avoid this problem to begin with, but if you're reading this, it may be too late! :o)

Starting cumulative review while finishing inference can eliminate the need for lots of days of review.

Reviewing regression while teaching inference for slope is a natural and helpful step for preparing for the exam.
The short answer is that most list contributors recommend that students show formulas, first with just variables and then with the numbers plugged in. It shows that the student understands what is going on, and it eliminates the concern that students would lose points if they accidentally plugged something into their calculator incorrectly.
A note about the t* for t-intervals. If a student uses technology for certain procedures (e.g., 1-sample with n = 167 or any 2-sample interval), the t* will not be on the table. It is OK to leave the formula with all the numbers plugged in and the t* just stays as a variable. OR a student can use a conservative approach that uses a t* that is on the table, but then they need to calculate their interval by hand so their answer matches the df they used.
If students and/or teacher really want to find the t*, they can use the inverse t function. If students have an 83, they need a t-inverse program. This program is legal (because it just matches the 84) and can be made by typing this simple little program:
Prompt N
Prompt A
solve(tcdf(X,1E99,N-1)-A,X,1)->K
Disp K
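For teachers who want to check a t* away from the calculator, the same idea (solve for the value whose upper-tail area equals A with N-1 degrees of freedom) can be sketched in plain Python. This is not what the calculator does internally; it is a rough stand-in that integrates the t density with Simpson's rule and then bisects. The function names are made up, and it assumes the tail area is strictly between 0 and 0.5.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(x, df, steps=2000):
    """P(T > x) for x >= 0, using Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_star(n, tail_area):
    """Critical value with upper-tail area tail_area for sample size n."""
    df = n - 1
    lo, hi = 0.0, 1.0
    # Widen the bracket until the tail beyond hi is smaller than the target.
    while upper_tail(hi, df) > tail_area:
        lo, hi = hi, hi * 2
    # Bisect: the tail area shrinks as the cutoff grows.
    for _ in range(50):
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > tail_area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 95% interval, n = 11 (df = 10): upper-tail area is 0.025.
print(round(t_star(11, 0.025), 3))
```

As with the calculator program, N is the sample size and A is the area in ONE tail, so a 95% interval uses A = 0.025. The df = 10 result can be checked against the table value of 2.228.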
A few other points about this:
For hypothesis tests and confidence intervals, the AP rubrics have (thus far!) required name OR formula. So students can get full credit without the formula.

Numerous multiple-choice problems on the '02 exam require formula understanding:

TI-talk is discouraged. Statements like normalcdf(1.2, 9999) are just not good communication. While showing a total by-hand formula is not required, good communication is. For example, on a binomial problem, students could write:

It has been frequently recommended on this list that students show z-score calculations and don't use technology to shortcut that step!
Charles Peltier has written an article for AP Central about pooling.
In short, we are assuming in our null hypothesis that p1 = p2. So then the question arises: which p do I use to compute the standard deviation of (p1 - p2)? The best solution is to form the pooled p. The pooled p is the weighted average of the two sample proportions, so it takes a compromise position. This is the best way to honor the assumption that p1 = p2 when calculating the standard deviation.
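As a sketch of the arithmetic, the pooled calculation might look like this in Python. The counts (30 of 100 and 20 of 100) are invented purely for illustration, and the function name is hypothetical.

```python
import math

def pooled_two_proportion_z(x1, n1, x2, n2):
    """Return (pooled p, standard error, z) for H0: p1 = p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled p: total successes over total trials, which is the
    # weighted average of the two sample proportions.
    p_pooled = (x1 + x2) / (n1 + n2)
    # The same pooled p is used for both groups in the standard error.
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    return p_pooled, se, z

# Invented counts: 30 successes out of 100 versus 20 out of 100.
p_pooled, se, z = pooled_two_proportion_z(30, 100, 20, 100)
print(p_pooled, round(se, 4), round(z, 3))
```

Here the pooled p is 0.25, halfway between 0.30 and 0.20 because the sample sizes are equal. With unequal sample sizes it would sit closer to the proportion from the larger sample.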
At first this problem seems impossible! How could a two-tailed test reject what a one-tailed test failed to reject!?!? Answer: if the one-tailed test shaded the wrong way! Only z = -1.98 is a sufficient value to reject in a two-tailed test. And if the one-tailed alternative is "greater than", so that the area to the right of z = -1.98 is shaded, then the one-tailed test fails to reject! Pretty tricky!
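The p-values make the trick easy to see. The sketch below assumes a significance level of 0.05, which is not stated above but is the level at which z = -1.98 just barely rejects in a two-tailed test.

```python
from statistics import NormalDist

z = -1.98
alpha = 0.05  # assumed significance level
standard_normal = NormalDist()

# Two-tailed: shade both tails beyond |z|.
p_two_sided = 2 * standard_normal.cdf(-abs(z))

# One-tailed, "greater than": shade everything to the RIGHT of z.
p_greater = 1 - standard_normal.cdf(z)

# One-tailed, "less than": shade everything to the LEFT of z.
p_less = standard_normal.cdf(z)

print(round(p_two_sided, 4), round(p_greater, 4), round(p_less, 4))
print(p_two_sided < alpha, p_greater < alpha)
```

The two-tailed p-value is a bit under 0.05, so that test rejects, while the "greater than" p-value is close to 1, so that test does not.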