Expectation, variance, and covariance: measuring uncertainty
This document is optional, but strongly recommended.
Once uncertainty is represented through random variables, the next step is to quantify it.
Expectation, variance, and covariance are the core mathematical tools we use to describe:
- typical behavior,
- variability,
- and dependence.
They appear everywhere in forecasting, whether explicitly or implicitly.
1 Expectation: the long-run average
The expectation (or expected value) of a random variable describes its typical value in the long run.
For a discrete random variable X with probability mass function p(x),
\mathbb{E}[X] = \sum_x x \, p(x).
For a continuous random variable with density f(x),
\mathbb{E}[X] = \int_{-\infty}^{\infty} x \, f(x)\, dx.
Expectation is not what you expect to observe next.
It is what you would obtain on average, across many repetitions.
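As a concrete illustration, here is a minimal sketch (assuming NumPy is available; the fair-die example is ours, not from the course) contrasting the analytic expectation with the long-run average of simulated outcomes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Analytic expectation of a fair six-sided die: sum of x * p(x)
values = np.arange(1, 7)
pmf = np.full(6, 1 / 6)
analytic_mean = np.sum(values * pmf)         # 3.5

# The long-run average of many simulated rolls approaches the expectation,
# even though no single roll ever equals 3.5
rolls = rng.integers(1, 7, size=100_000)
print(analytic_mean, rolls.mean())           # 3.5 and roughly 3.5
```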
In forecasting, expectation often corresponds to:
- the point forecast,
- the mean of a forecast distribution,
- the baseline around which uncertainty is measured.
2 Linearity of expectation (this matters a lot)
One of the most important properties of expectation is linearity:
\mathbb{E}[aX + b] = a\,\mathbb{E}[X] + b,
for constants a and b.
Even more importantly,
\mathbb{E}[X + Y] = \mathbb{E}[X] + \mathbb{E}[Y],
regardless of whether X and Y are independent.
Linearity of expectation holds without independence.
This fact underlies many results in time series and forecasting.
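A small numerical check (again a NumPy sketch, with an artificial pair of variables) shows linearity holding even when the two variables are strongly dependent:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# X and Y are strongly dependent: Y is constructed directly from X
x = rng.normal(loc=2.0, scale=1.0, size=n)
y = 3 * x + rng.normal(size=n)

# The sample analogue of E[X + Y] = E[X] + E[Y] holds exactly,
# even though X and Y are far from independent
print((x + y).mean())         # roughly 8
print(x.mean() + y.mean())    # same value, up to floating-point rounding
```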
3 Variance: how uncertain is a random variable?
Expectation alone is not enough.
Two random variables may have the same mean but very different levels of uncertainty.
The variance of X measures how much values fluctuate around their expectation:
\operatorname{Var}(X) = \mathbb{E}\big[(X - \mu)^2\big], \qquad \mu = \mathbb{E}[X].
An equivalent and often more convenient expression is:
\operatorname{Var}(X) = \mathbb{E}[X^2] - \mu^2.
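To see why the two expressions agree, expand the square and apply linearity of expectation, treating \mu = \mathbb{E}[X] as a constant:
\operatorname{Var}(X) = \mathbb{E}\big[X^2 - 2\mu X + \mu^2\big] = \mathbb{E}[X^2] - 2\mu\,\mathbb{E}[X] + \mu^2 = \mathbb{E}[X^2] - \mu^2.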
Variance is measured in squared units.
The standard deviation \sigma = \sqrt{\operatorname{Var}(X)} restores the original scale.
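A quick numerical check (a NumPy sketch with an arbitrary normal sample) that the two variance formulas agree and that the standard deviation restores the original scale:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(loc=5.0, scale=2.0, size=200_000)

mu = x.mean()
var_definition = ((x - mu) ** 2).mean()      # E[(X - mu)^2]
var_shortcut = (x ** 2).mean() - mu ** 2     # E[X^2] - mu^2
sigma = np.sqrt(var_definition)              # back on the original scale

print(var_definition, var_shortcut, sigma)   # both near 4, sigma near 2
```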
In forecasting, variance captures:
- forecast uncertainty,
- volatility,
- the typical size of forecast errors.
4 How variance reacts to transformations
Variance behaves very differently from expectation under transformations.
For constants a and b,
\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X).
Adding a constant does nothing to variance.
Scaling a variable scales variance quadratically.
Variance is sensitive to scale.
This is why transformations (e.g., logarithms) can dramatically change model behavior.
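A short sketch (NumPy, with arbitrary constants a and b chosen only for illustration) of how shifting and scaling affect variance:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=0.0, scale=1.5, size=200_000)   # Var(X) is about 2.25

a, b = 4.0, 10.0
print(np.var(a * x + b))       # roughly a**2 * Var(X) = 16 * 2.25 = 36
print(a ** 2 * np.var(x))      # same value
print(np.var(x + b))           # shifting alone leaves variance unchanged
```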
5 Covariance: measuring dependence
Variance describes uncertainty of a single random variable.
Covariance describes how two random variables vary together.
For random variables X and Y,
\operatorname{Cov}(X, Y) = \mathbb{E}\big[(X - \mu_X)(Y - \mu_Y)\big] = \mathbb{E}[XY] - \mu_X \mu_Y.
- Positive covariance: large values of X tend to occur with large values of Y.
- Negative covariance: large values of X tend to occur with small values of Y.
If X and Y are independent, then
\operatorname{Cov}(X,Y) = 0.
The converse is not generally true: zero covariance does not imply independence.
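The following sketch (NumPy, with Y = X^2 as a standard counterexample) illustrates both points: independent variables have near-zero sample covariance, yet zero covariance does not imply independence:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000

# Independent variables: sample covariance is close to zero
x = rng.normal(size=n)
z = rng.normal(size=n)
print(np.cov(x, z)[0, 1])      # near 0

# Dependent but uncorrelated: Y = X^2 is completely determined by X,
# yet its covariance with X is still (approximately) zero
y = x ** 2
print(np.cov(x, y)[0, 1])      # also near 0
```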
6 Covariance in time series
In time series analysis, covariance appears in a very specific form.
Given a sequence \{Y_t\}, we define the lag-h autocovariance as:
\operatorname{Cov}(Y_t, Y_{t-h}).
This quantity measures how observations relate to their own past.
Autocovariance is the foundation of:
- autocorrelation,
- stationarity,
- AR and MA models.
If you understand covariance, you are already halfway to understanding time series models.
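As a rough sketch (NumPy; the simulated AR(1)-style series and the lag_cov helper are illustrative only, not part of the course material), the lag-h covariance of a series can be estimated directly from the data:

```python
import numpy as np

rng = np.random.default_rng(5)

# A simulated AR(1)-style series: each value depends on the previous one
n, phi = 50_000, 0.8
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal()

def lag_cov(series, h):
    """Sample covariance between the series and its own past at lag h."""
    centered = series - series.mean()
    if h == 0:
        return np.mean(centered * centered)
    return np.mean(centered[h:] * centered[:-h])

print([round(lag_cov(y, h), 2) for h in range(4)])
# For this series the lag-h covariance decays roughly geometrically in h
```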
7 Correlation (preview, not the full story)
Covariance depends on scale, which makes comparisons difficult.
The correlation coefficient rescales covariance:
\rho(X,Y) = \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}.
Correlation always lies between -1 and 1.
However, it measures only linear association, not general dependence.
This distinction becomes critical in time series.
Correlation deserves its own refresher — and it will get one.
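A final sketch (NumPy; the corr helper simply re-implements the formula above) shows that covariance depends on the scale of the data while correlation does not:

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(size=100_000)
y = 2 * x + rng.normal(size=100_000)

def corr(a, b):
    """Covariance rescaled by both standard deviations."""
    # ddof=1 matches the default normalization used by np.cov
    return np.cov(a, b)[0, 1] / np.sqrt(np.var(a, ddof=1) * np.var(b, ddof=1))

# Covariance changes when the data are rescaled; correlation does not
print(np.cov(x, y)[0, 1], np.cov(1000 * x, y)[0, 1])   # very different numbers
print(corr(x, y), corr(1000 * x, y))                   # both near 0.89
```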
8 Where this shows up in the course
Expectation, variance, and covariance appear throughout the course:
- point forecasts and forecast distributions,
- forecast error evaluation,
- residual diagnostics,
- autocorrelation functions,
- stationarity assumptions.
They are not optional background — they are the mathematical backbone of forecasting.
9 What you do not need yet
At this stage, you do not need:
- higher-order moments,
- distribution-specific formulas,
- closed-form derivations.
Those concepts matter, but only once the core quantities are fully internalized.
10 Takeaway
Expectation describes the center.
Variance describes uncertainty.
Covariance describes dependence.
Together, they define how time series behave.