Introduction: Mixing and Matching Random Variables

Welcome! In your previous statistics studies, you’ve likely looked at single random variables (like the result of rolling one die). But in the real world, things are rarely that simple. Often, we deal with combinations of different events. For example, your total commute time is the sum of your walking time and your bus time.

In this chapter, we are going to learn how to calculate the mean (expected value) and the variance (spread) when we add, subtract, or multiply discrete random variables. Don't worry if this seems tricky at first—we'll break it down into simple rules that work every time!

1. The Basics: Linear Transformations

Before we combine two different variables, let's quickly review what happens when we change just one variable by multiplying it by a number (scaling) or adding a constant (shifting).

If \(X\) is a random variable and \(a\) and \(b\) are constants:
1. The Mean: \(E(aX + b) = aE(X) + b\)
2. The Variance: \(Var(aX + b) = a^2Var(X)\)

Why does this happen?
Imagine every student in a class gets 5 extra marks on a test. The average (mean) goes up by 5. However, the spread (variance) stays exactly the same because everyone moved up together! If we double everyone's marks, the average doubles, but the spread increases by \(2^2 = 4\) times.

Quick Review:
- Adding a constant (\(b\)) affects the mean but not the variance.
- Multiplying by a constant (\(a\)) affects the mean by \(a\) and the variance by \(a^2\).
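We can check both rules exactly (no simulation needed) by enumerating a small distribution. The sketch below uses a fair six-sided die as \(X\) and exact fractions, with illustrative constants \(a = 2\), \(b = 5\):

```python
from fractions import Fraction

# A fair six-sided die: each face 1..6 with probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

a, b = 2, 5  # illustrative constants
# Transform the die: every outcome x becomes a*x + b (probabilities unchanged).
pmf_t = {a * x + b: p for x, p in pmf.items()}

assert mean(pmf_t) == a * mean(pmf) + b   # E(aX + b) = aE(X) + b
assert var(pmf_t) == a**2 * var(pmf)      # Var(aX + b) = a^2 Var(X)
```

Here \(E(X) = 3.5\) and \(Var(X) = \frac{35}{12}\), so the transformed die has mean \(2(3.5) + 5 = 12\) and variance \(4 \times \frac{35}{12} = \frac{35}{3}\).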

2. Combining Two Variables: The Mean Rule

The best news in this chapter is that Means are friendly! They behave exactly how you would expect them to, whether the variables are related or completely independent.

For any two random variables \(X\) and \(Y\):
\(E(aX + bY) = aE(X) + bE(Y)\)

Real-World Example:
If your earnings for the day are \(X\) and your friend's are \(Y\), your combined expected earnings are \(E(X + Y) = E(X) + E(Y)\). It doesn't matter whether you work at the same shop or different ones, or whether your shifts overlap—the mean rule holds regardless of any relationship between \(X\) and \(Y\)!

Key Takeaway: You can always just add or subtract means directly. \(E(X - Y) = E(X) - E(Y)\).
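To see that the mean rule really doesn't care about dependence, the sketch below computes \(E(X + Y)\) for two dice in two extreme cases: fully independent, and fully dependent (the second die always copies the first):

```python
from fractions import Fraction
from itertools import product

sixth = Fraction(1, 6)
faces = range(1, 7)

# Case 1: independent dice — the joint pmf is the product of the marginals.
E_indep = sum((x + y) * sixth * sixth for x, y in product(faces, faces))

# Case 2: fully dependent — the second die always shows the same face (Y = X).
E_dep = sum((x + x) * sixth for x in faces)

# The mean rule holds either way: E(X + Y) = E(X) + E(Y) = 3.5 + 3.5 = 7.
assert E_indep == 7
assert E_dep == 7
```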

3. Combining Two Variables: Covariance and Correlation

Before we look at the variance of combined variables, we need to understand how \(X\) and \(Y\) "talk" to each other. This is where Covariance and Correlation come in.

Covariance \(Cov(X, Y)\): This measures how much two variables change together.
- If \(Cov(X, Y)\) is positive, when \(X\) goes up, \(Y\) tends to go up.
- If \(Cov(X, Y)\) is negative, when \(X\) goes up, \(Y\) tends to go down.
- If \(Cov(X, Y)\) is zero, there is no linear relationship. (Careful: the variables may still be related in a non-linear way—zero covariance does not guarantee independence.)

Correlation (\(\rho\)): This is just a "standardized" version of covariance that always stays between -1 and 1. It is easier to interpret than covariance.
The formula relating them is:
\( \rho = \frac{Cov(X, Y)}{\sqrt{Var(X)Var(Y)}} \)

Did you know?

Correlation doesn't care about the units! Whether you measure height in cm or inches, the correlation between height and weight remains exactly the same.

4. The Variance of a Linear Combination

Calculating the variance of combined variables is a bit more complex because we have to account for how they interact.

The General Formula:
\(Var(aX + bY) = a^2Var(X) + b^2Var(Y) + 2abCov(X, Y)\)

Wait, what if we are subtracting?
If you are calculating \(Var(aX - bY)\), the formula becomes:
\(Var(aX - bY) = a^2Var(X) + b^2Var(Y) - 2abCov(X, Y)\)

Common Mistake Alert!
Students often forget the \(a^2\) and \(b^2\). Remember: variance is always related to the square of the multiplier. Even if you multiply a variable by -1, the variance is multiplied by \((-1)^2 = 1\). Variance is never negative—a spread can't be less than zero!
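The general formula can be checked exactly on a small dependent pair. The sketch below invents a joint distribution where \(X\) and \(Y\) are each 0 or 1 with probability \(\frac{1}{2}\) but tend to match (so \(Cov(X, Y) > 0\)), then compares the direct variance of \(aX + bY\) against the formula:

```python
from fractions import Fraction

F = Fraction
# A made-up dependent pair: joint pmf over (x, y). Matching outcomes
# (0,0) and (1,1) are more likely, so Cov(X, Y) is positive.
joint = {(0, 0): F(2, 5), (0, 1): F(1, 10), (1, 0): F(1, 10), (1, 1): F(2, 5)}

def E(f):
    """Expectation of f(X, Y) under the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

EX, EY = E(lambda x, y: x), E(lambda x, y: y)
VarX = E(lambda x, y: (x - EX) ** 2)
VarY = E(lambda x, y: (y - EY) ** 2)
Cov = E(lambda x, y: (x - EX) * (y - EY))

a, b = 3, -2  # illustrative constants (b < 0 covers the subtraction case too)
# Direct variance of aX + bY, straight from the joint distribution...
m = a * EX + b * EY
direct = E(lambda x, y: (a * x + b * y - m) ** 2)

# ...matches a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y) exactly.
assert direct == a**2 * VarX + b**2 * VarY + 2 * a * b * Cov
```

Notice that taking \(b\) negative automatically turns the \(+2abCov(X, Y)\) term into a subtraction, which is exactly where the subtraction formula above comes from.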

5. The Special Case: Independent Variables

If \(X\) and \(Y\) are independent (meaning one does not affect the other at all), then \(Cov(X, Y) = 0\). This makes our lives much easier!

For independent variables \(X\) and \(Y\):
1. \(E(aX + bY) = aE(X) + bE(Y)\)
2. \(Var(aX + bY) = a^2Var(X) + b^2Var(Y)\)
3. \(Var(aX - bY) = a^2Var(X) + b^2Var(Y)\)

Look closely at that last one!
Even when you subtract two independent variables, you add their variances.
Analogy: Imagine you are cutting a piece of wood. There is uncertainty in the length of the wood (\(X\)) and uncertainty in where you place your saw (\(Y\)). When you cut them, the total "error" or "wobble" (variance) in the final piece gets larger, not smaller, because you have two sources of randomness working against you!

Summary Table for Independent Variables:
- Operation: \(X + Y\) \(\rightarrow\) Mean: \(E(X)+E(Y)\) \(\rightarrow\) Variance: \(Var(X)+Var(Y)\)
- Operation: \(X - Y\) \(\rightarrow\) Mean: \(E(X)-E(Y)\) \(\rightarrow\) Variance: \(Var(X)+Var(Y)\)
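The "subtracting still adds the variances" result is easy to confirm by brute force. The sketch below enumerates all 36 equally likely outcomes of \(D = X - Y\) for two independent fair dice:

```python
from fractions import Fraction
from itertools import product

sixth = Fraction(1, 6)
faces = range(1, 7)

# Distribution of D = X - Y over all 36 equally likely (x, y) pairs.
mean_D = sum((x - y) * sixth * sixth for x, y in product(faces, faces))
var_D = sum((x - y - mean_D) ** 2 * sixth * sixth
            for x, y in product(faces, faces))

# Each die has variance 35/12, and subtracting still ADDS the variances:
# Var(X - Y) = 35/12 + 35/12 = 35/6.
assert mean_D == 0
assert var_D == 2 * Fraction(35, 12)
```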

6. Sum of \(n\) Independent Variables

Sometimes you aren't just adding \(X\) and \(Y\), but adding many observations of the same type of variable (like the total weight of 10 independent apples).

If \(X_1, X_2, ..., X_n\) are independent observations of the same variable \(X\):
\(E(X_1 + X_2 + ... + X_n) = nE(X)\)
\(Var(X_1 + X_2 + ... + X_n) = nVar(X)\)

Important Distinction:
There is a huge difference between \(X_1 + X_2\) (two different independent apples) and \(2X\) (one apple that we double the weight of).
- \(Var(X_1 + X_2) = Var(X) + Var(X) = 2Var(X)\)
- \(Var(2X) = 2^2Var(X) = 4Var(X)\)
Doubling one random measurement is much "riskier" (more variable) than taking two separate measurements and adding them!
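This distinction is another one worth verifying by enumeration. Using a fair die for \(X\), the sketch below compares the variance of two independent dice added together against the variance of one die whose score is doubled:

```python
from fractions import Fraction
from itertools import product

sixth = Fraction(1, 6)
faces = range(1, 7)

def var_of(values_probs):
    """Variance of a discrete distribution given as (value, probability) pairs."""
    m = sum(v * p for v, p in values_probs)
    return sum((v - m) ** 2 * p for v, p in values_probs)

# Two independent dice added together...
var_sum = var_of([(x1 + x2, sixth * sixth) for x1, x2 in product(faces, faces)])
# ...versus one die whose score is doubled.
var_double = var_of([(2 * x, sixth) for x in faces])

die_var = Fraction(35, 12)
assert var_sum == 2 * die_var      # Var(X1 + X2) = 2 Var(X)
assert var_double == 4 * die_var   # Var(2X) = 4 Var(X)
```

The doubled die is twice as variable as the pair of dice, because its one source of randomness is amplified rather than averaged against an independent second source.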

Step-by-Step: Solving a Typical Problem

Problem: Let \(X\) be the score on a red die and \(Y\) be the score on a blue die. Find the mean and variance of \(S = 3X - Y\), assuming the dice are independent.

Step 1: Identify the properties of \(X\) and \(Y\).
For a fair 6-sided die: \(E(X) = 3.5\) and \(Var(X) = \frac{35}{12} \approx 2.917\).

Step 2: Find the new Mean.
\(E(3X - Y) = 3E(X) - E(Y)\)
\(E(3X - Y) = 3(3.5) - 3.5 = 10.5 - 3.5 = 7\).

Step 3: Find the new Variance.
Since they are independent, use the \(a^2Var(X) + b^2Var(Y)\) rule.
\(Var(3X - Y) = 3^2Var(X) + (-1)^2Var(Y)\)
\(Var(3X - Y) = 9Var(X) + 1Var(Y)\)
\(Var(3X - Y) = 10 \times \frac{35}{12} = \frac{350}{12} \approx 29.17\).

Quick Review:
- Did you add the variances? Yes!
- Did you square the multipliers (3 and -1)? Yes!
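As a final sanity check, the worked answer can be confirmed by enumerating all 36 equally likely (red, blue) outcomes directly, without using any of the rules:

```python
from fractions import Fraction
from itertools import product

sixth = Fraction(1, 6)
faces = range(1, 7)

# Enumerate all 36 equally likely (red, blue) outcomes of S = 3X - Y.
outcomes = [(3 * x - y, sixth * sixth) for x, y in product(faces, faces)]

mean_S = sum(s * p for s, p in outcomes)
var_S = sum((s - mean_S) ** 2 * p for s, p in outcomes)

assert mean_S == 7                      # matches E(S) = 3(3.5) - 3.5 = 7
assert var_S == 10 * Fraction(35, 12)   # matches Var(S) = 9 Var(X) + Var(Y)
```

Brute-force enumeration agrees with the formulas exactly: \(E(S) = 7\) and \(Var(S) = \frac{350}{12} \approx 29.17\).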

Final Summary and Key Points

1. Means are easy: Just follow the signs in the equation.
2. Variances are tricky: Always square the constants, and always add the variance terms if the variables are independent.
3. Covariance matters: If variables are not independent, you must include the \(2abCov(X, Y)\) term.
4. Independence is your friend: It simplifies the variance formula by removing the covariance term.
5. \(n\) variables vs. \(n \times\) one variable: Adding \(n\) independent copies results in \(nVar(X)\), but multiplying one variable by \(n\) results in \(n^2Var(X)\).

Keep practicing these rules! Once you get used to squaring the constants for variance and checking for independence, these problems will become some of your favorite "quick wins" in the exam.