# A statistical analysis of a malfunctioning hot water heater

Calculating the accepted ranges for single p_{0} values is no problem, but p can take any value between 0 and 1, which is infinitely many possible values. We could limit the resolution and step through that range in increments of, say, 0.0001, but there is a more elegant way.

We need a simple measure that tells us whether or not a given p_{0} belongs to the confidence interval at our chosen confidence level. And for that we use the so-called p-value. Please do not confuse this with our single-shower-turning-cold probability p. A p-value is simply the probability that a given random distribution will produce a result that is equal to or *more extreme* than the measured result. Since you can either deviate to the positive extreme or the negative one, there are two p-values:

p_{low} = P(X ≤ k) and p_{high} = P(X ≥ k),

where k is the measured number of bad showers.

So if your measurement yields a result left of the mean of the distribution, you sum over all the probabilities left of your measured value, including the measured value itself. If that sum is smaller than or equal to the amount of probability we cut away for the accepted range, the measured value is outside that range and the tested p_{0} is outside the confidence interval. If we plot those p-values vs. all possible values of p, we can see at a glance which parts are inside and which parts are outside the CI.
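To make this concrete, here is a small sketch of how the two p-values can be computed with scipy. The numbers here are made up for illustration, not the actual shower data:

```python
from scipy.stats import binom

# Hypothetical example: N = 20 showers, k = 3 bad ones, testing p0 = 0.4
N, k, p0 = 20, 3, 0.4

p_low = binom(N, p0).cdf(k)      # P(X <= k): sum of all probabilities left of k, inclusive
p_high = binom(N, p0).sf(k - 1)  # P(X >= k): survival function, the right-hand tail

# At a 95% CL with 2.5% cut away on each side, this p0 would be
# rejected, because p_low falls below 0.025.
print(p_low, p_high)
```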

```python
# p-values
from scipy.stats import binom
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt

# Nc, Ncb (closed window: total and bad showers), No, Nob (open window)
# and p_range are defined earlier in the post.

plowc = []
phighc = []
plowo = []
for p in p_range:
    plowc.append(binom(Nc, p).cdf(Ncb))      # P(X <= Ncb): low-side p-value
    phighc.append(binom(Nc, p).sf(Ncb - 1))  # P(X >= Ncb): high-side p-value
    plowo.append(binom(No, p).cdf(Nob))

# interpolate to find where each p-value curve crosses its cut-off
plowc_inter = interp1d(*zip(*sorted(zip(plowc, p_range))))(0.025)
phighc_inter = interp1d(*zip(*sorted(zip(phighc, p_range))))(0.025)
plowo_inter = interp1d(*zip(*sorted(zip(plowo, p_range))))(0.05)

fig, ax = plt.subplots(1, 1)
ax.plot(p_range, plowc, label=r'$p_\mathrm{low,c}$')
ax.plot(p_range, phighc, label=r'$p_\mathrm{high,c}$')
ax.plot(p_range, plowo, label=r'$p_\mathrm{low,o}$')
ax.plot((p_range[0], p_range[-1]), (0.025, 0.025), 'k--')
ax.plot((plowc_inter, plowc_inter), (1e-2, 0.025), 'k--')
ax.plot((phighc_inter, phighc_inter), (1e-2, 0.025), 'k--')
ax.plot((p_range[0], p_range[-1]), (0.05, 0.05), 'k--')
ax.plot((plowo_inter, plowo_inter), (1e-2, 0.05), 'k--')
ax.set_title('final plot')
ax.set_xlabel('single bad shower probability $p$')
ax.set_ylabel('p-value')
ax.set_yscale('log')
ax.set_ylim((1e-2, 1))
ax.legend(loc='best')
ax.grid()
fig.savefig('Plots/p-values.png')
```

I left out p_{high} for the open window data, since we measured 0 bad showers there and the probability of getting 0 or more bad showers is 1, no matter the value of p. This means we cannot put a lower limit on p_{o}. That is okay, since our maximum likelihood estimate was 0 and you cannot get lower than that anyway.
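As a quick sanity check (with an arbitrary N, since the statement holds for any sample size), scipy confirms that the high-side p-value of a zero-count measurement is always 1:

```python
from scipy.stats import binom

# P(X >= 0) is 1 for any p and any N, so the high-side p-value
# of a measurement of 0 carries no information.
N = 10  # arbitrary; the statement holds for any N
for p in (0.01, 0.1, 0.5, 0.9):
    print(binom(N, p).sf(-1))  # sf(-1) = P(X > -1) = P(X >= 0) = 1.0
```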

Since we cannot put a lower limit on p_{o}, we only cut away the left part of the possible PMFs and thus have to check whether the p_{low}-value is smaller than 5%. This is the case for p_{o} > 13.3%. For p_{c} we want to calculate an upper and a lower limit, so we have to check where p_{low} and p_{high} are less than 2.5%, respectively. This leads to the following confidence intervals:
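For the zero-count open window data, the boundary even has a closed form: p_{low}(p) = P(X ≤ 0) = (1 - p)^{N_o}, so the 95% upper limit solves (1 - p)^{N_o} = 0.05. A quick sketch — the sample size of 21 is my illustrative assumption here, chosen only because it reproduces the quoted 13.3%:

```python
from scipy.stats import binom

No = 21  # illustrative sample size, not necessarily the real count
upper = 1 - 0.05 ** (1 / No)  # solves (1 - p)**No = 0.05
print(upper)  # roughly 0.133, i.e. p_o <= 13.3% at 95% CL

# cross-check against the binomial CDF:
print(binom(No, upper).cdf(0))  # returns 0.05 by construction
```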

As you can see, the confidence intervals for the closed window data and open window data overlap, even if only by a tiny bit. This means we cannot exclude, at this confidence level, that there is no connection between the window and the fire going out at all. But it looks like it will take only a few more showers to have enough statistics to fix that.

Speaking of the initial estimates: the notation above is quite common for values where you only have an upper or a lower limit, but it is less common for values where you have a given point estimate (our initial guess, the maximum likelihood estimate) and a confidence interval. In that case, you usually write it down like

p = p̂^{+Δp_{high}}_{-Δp_{low}}

or, if the CI is symmetric around the estimate:

p = p̂ ± Δp

Here the numbers after the estimate denote how far you can stray from it while staying inside a 68.26% confidence interval. That odd number corresponds to 1 standard deviation of a bell curve, or normal distribution. The normal distribution is the most important of them all, since pretty much every other distribution approaches it as you increase the number of samples. Likewise, 2 standard deviations, or sigmas, correspond to a 95.45% CL, and 3 sigmas include the true value 99.73% of the time. So if you want to convert a given estimate plus standard error into a 95% CI, you just have to multiply the errors by roughly 2. This is only exact if the random distribution actually is a normal distribution, but we are only estimating anyway, and if you have no idea what the real distribution looks like, the normal distribution tends to be a good guess. If we calculate the 68.26% CI for the closed window data we get
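These coverage numbers are easy to verify with scipy's normal distribution:

```python
from scipy.stats import norm

# coverage of +-k standard deviations under a normal distribution
for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f'{k} sigma: {coverage:.2%}')  # 68.27%, 95.45%, 99.73%

# the exact multiplier that turns a standard error into a 95% CI half-width:
print(norm.ppf(0.975))  # about 1.96
```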


**To conclude:**

- Naively calculating the probability by dividing the number of bad showers by the total number of showers is equal to the maximum likelihood estimation. It yields the single-shower-turning-cold probability that is most likely to produce the given result.
- Since a single number holds no information about how sure we can be about that estimate, we construct a confidence interval, CI, of which we can be fairly certain that the true value lies within it. The degree of certainty we demand is called the confidence level, CL. A CL of 95% means that of all such constructed CIs, 95% will include the true value of the parameter. Strictly speaking, it does *not* mean that the probability of the true value being in a given CI is 95%, even though people tend to take that shortcut when talking about CIs. The true value is fixed and either lies within that interval or not, so the (frequentist) probability is 1 or 0, even though we do not know which it is.
- The constructed CIs for open and closed window overlap, so we cannot exclude the possibility that the open window has nothing to do with the probability of the fire going out. The estimated values for p_{o} and p_{c} are:

I guess I will have to take a few more showers before I can confidently tell the plumber that my heater is broken, but opening a window fixes it. The only question remaining is: why do my posts here always turn out much longer than I initially intend them to be?