To really understand confidence intervals, one must understand what probability actually is. In daily life we use the term “probability” in a somewhat vague and sometimes contradictory way. But vagueness has no place in hard science, so we have to define what we are talking about.
I will use the frequentist interpretation of probability, since that is what I am most familiar with and what is most commonly used in my field of work. There are other interpretations of probability, the most prominent being Bayesian probability, but I will not go into them here. If you do not know which school you belong to, chances are that you are a frequentist.
Frequentism defines the probability of an event as the ratio of the number of times that event happens to the total number of tries, if — and this is kind of important — you repeat your experiment infinitely many times. Or mathematically speaking (with $n_A$ the number of times event $A$ occurs in $n$ tries):

$$P(A) = \lim_{n \to \infty} \frac{n_A}{n}$$
This definition has a couple of important consequences:
- First of all, we obviously cannot measure the exact probability, since it is impossible to repeat an experiment ad infinitum. But we can get as close to the truth as we want by simply repeating the experiment very, very often.
- In this definition, the probability is an objective property of the process or experiment we are looking at. This is in contrast to the Bayesian interpretation, where probability is a measure of a more subjective degree of certainty about something.
- This definition — strictly speaking — makes it impossible to define a range of values that, for example, has a 90% probability of including the real value of a parameter of a theory. This is because the real parameter either lies within that range or it does not. You can ask the same question as often as you want; the answer will always be the same. So the probability, as defined above, of a parameter being in a given range of values is either 1 or 0, although we do not know which of the two it is.
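The first point above can be made concrete with a small simulation. The following is a minimal sketch, not anything prescribed by the definition itself: the biased coin and its “true” probability of 0.3 are made up purely for illustration, chosen so we can compare the measured relative frequency against a known answer.

```python
import random

random.seed(42)

# A biased coin with a known "true" probability of heads -- chosen here
# only so we can compare the estimate against it; in a real experiment
# this value is exactly what we do not know.
p_true = 0.3

checkpoints = {10, 100, 1_000, 10_000, 100_000}
heads = 0
for n in range(1, 100_001):
    if random.random() < p_true:
        heads += 1
    if n in checkpoints:
        # The relative frequency wanders at first, then settles near p_true.
        print(f"after {n:>6} tries: relative frequency = {heads / n:.4f}")

estimate = heads / 100_000
```

With 100,000 tries the relative frequency typically lands within a fraction of a percent of the true value — we never reach the limit, but we get as close as our patience allows.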
The last part is especially important for understanding confidence intervals. We cannot construct a range with any given probability of including the true parameter. But we can define an algorithm that produces an interval — the confidence interval (CI) — such that, if you repeat the whole experiment, the new, different interval has a certain probability of including the true value of the parameter. It might seem like useless semantics, but this is an important difference.
The probability that the constructed intervals include the true parameter is called the confidence level (CL). So let’s say we want to be fairly certain that our interval covers the actual value and choose a confidence level of 95%. That means we have to find an algorithm that turns our data into intervals such that, if we were to repeat the experiment over and over and over again, each repetition producing a different interval, the true value would be included in 95% of those intervals.
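This coverage property can itself be checked by simulation. The sketch below is my own illustration, not the algorithm this article later constructs: it assumes normally distributed data with a known standard deviation, builds the textbook interval mean ± 1.96·σ/√n in each repetition, and counts how often the (here known) true mean is covered.

```python
import math
import random

random.seed(1)

mu_true = 5.0   # the true parameter -- unknown in practice, fixed here for the demo
sigma = 2.0     # standard deviation, assumed known for simplicity
n = 50          # sample size per experiment
z = 1.96        # two-sided 95% quantile of the standard normal distribution

n_experiments = 10_000
covered = 0
for _ in range(n_experiments):
    # One full repetition of the experiment: draw data, build the interval.
    sample = [random.gauss(mu_true, sigma) for _ in range(n)]
    mean = sum(sample) / n
    half_width = z * sigma / math.sqrt(n)
    # Each repetition yields a different interval; count whether this one
    # happens to cover the true value.
    if mean - half_width <= mu_true <= mean + half_width:
        covered += 1

coverage = covered / n_experiments
print(f"empirical coverage: {coverage:.3f}")
```

The empirical coverage comes out close to 0.95 — each individual interval either covers the true value or it does not, but the procedure covers it in about 95% of repetitions, which is exactly the frequentist statement the confidence level makes.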
Now we know what a confidence interval is and what it is not. All that is left is to actually define an algorithm that produces the confidence intervals.