## Confidence levels and critical values

#### Confidence levels

In a statistical analysis, when you study a sample from a population and obtain a particular result, the confidence level refers to how much trust you can place in your experiment or analysis to yield results that match those of the actual population. In simple words, we can define the confidence level as the percentage of times that the experiment, if repeated, would yield a result that is truthful to the actual characteristics of the population being studied, based only on the sample analysed.

Therefore, a higher confidence level for a sample analysis means that the characteristics depicted in the study are more reliable and more likely to represent the actual population, while a very low confidence level means that the results are not to be trusted.

Having said this, it is important to know that a statistic with a 100% confidence level does not exist. Why? A 100% confidence level would mean that if you took a sample from a population (let's say you use random sampling methods), estimated a value for the whole population based on that sample, and repeated this same experiment over and over again, you would ALWAYS obtain the same result. As you may have guessed, that result would be the true value for the whole population, and unless your sample contains the whole population being studied, this is highly unlikely to happen.

In other words, a confidence level in no way refers to blind faith in your methods, but to fact-based, empirical trust that your methodology was carried out in the most efficient way and that your experiment is repeatable.

To demonstrate clearly how we can understand confidence levels and their critical values, let us make use of the empirical rule (also called the 68-95-99.7 rule), which gives the approximate percentage of data found in different regions of a normal distribution (the regions are usually marked off in standard deviations).

When using a standard normal distribution, the empirical rule is easy to understand as follows:

Figure 1 shows that:

- 68.26% of the data points in the distribution are found within one standard deviation from the mean.
- 95.44% of the data points in the distribution are within two standard deviations from the mean.
- 99.72% of the data points are within three standard deviations from the mean.
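These percentages can be checked numerically. The snippet below is a minimal sketch that assumes Python's standard-library `statistics.NormalDist` as a stand-in for the z-table; the helper name `area_within` is made up for this illustration. The computed areas are roughly 0.6827, 0.9545 and 0.9973, which differ slightly in the last digit from the table-rounded percentages quoted above.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k.

    Hypothetical helper for this illustration only.
    """
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Print the area within one, two and three standard deviations of the mean.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```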

If we think of these percentages as confidence levels, we can say that there is a 68.26% confidence level that a particular data point from this distribution is found within one standard deviation of the mean. The same can be said of the other percentages: there is a confidence level of 95.44% that a particular data point of this distribution is located within two standard deviations of the mean, and a confidence level of 99.72% that the point will be located within three standard deviations of the mean.

So now that we have a better idea of what a confidence level is, what are a confidence interval and a critical value?

**Confidence intervals**

A very simple definition of a confidence interval can be given by referring to the empirical rule above (figure 1): it is the range of values that comprises a particular confidence level. This is simple to remember. An interval is just a range of values of a particular parameter, so a confidence interval is the range of values that is believed to contain a specific parameter of the population being studied. In other words, it is the range that a confidence level covers, which is why a particular parameter value is likely to fall inside it.

Confusing? Just take a look at the figure below:

The percentages 68.26%, 95.44% and 99.72% shown in the normal distribution of figure 1 represent confidence levels, and they belong to what we call two-sided confidence intervals because their range starts and ends within the distribution. Take a look at figure 2: the confidence interval has a lower limit (-1) and an upper limit (1).

**What is a critical value?**

In general, a critical value is a particular point on the horizontal axis of a graph that divides the area of the graph into two pieces (not necessarily equal). In this case we will focus on critical values of z (also called z critical values), which means that we will be looking at critical values related to a z-score, and thus our graph will always be a standard normal distribution (z-distribution).

A critical value of z allows you to divide the area under the standard normal curve into two pieces, and thus, it can help you in the calculation of probabilities or any other related characteristics of the data points from the distribution.
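As a small illustration of this split, the sketch below takes a z-value and reports the area on each side of it. It assumes Python's standard-library `statistics.NormalDist` in place of a z-table, and `split_area` is a hypothetical helper name, not something defined in this lesson.

```python
from statistics import NormalDist


def split_area(z: float) -> tuple[float, float]:
    """Return (area to the left of z, area to the right of z).

    Hypothetical helper: the two areas always add up to 1, the total
    area under the standard normal curve.
    """
    left = NormalDist().cdf(z)
    return left, 1.0 - left


left, right = split_area(-1.0)
print(f"left of z = -1: {left:.4f}, right of z = -1: {right:.4f}")
```

For z = -1 the left piece is about 0.1587 and the right piece about 0.8413, so the two pieces are clearly not equal; only z = 0 splits the area in half.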

When a confidence interval delimits the area under the standard normal curve for a confidence level, either edge of the interval can serve as a critical value. If the critical value is known, we can calculate the probability (the confidence level) delimited by the interval; if the confidence level is given, we can find the critical value by looking for the z-score that produces the areas delimited by the interval.

How does this work? Let us explain:

Think of the empirical rule shown in figure 1. In this case, you can see that there is a confidence level of 0.6826 that a data point from this set will be located inside the confidence interval delimited by the cyan area under the curve. For this case, we know that the edges of this confidence interval are -1 and 1, but if only the percentage of 68.26% had been given to you, how would you know?

Well, if the confidence level in cyan occupies 68.26% of the total area under the curve (which is 1), it covers an area of 0.6826, leaving 0.3174 of the area divided into two pieces, one on each side.

Therefore, the tail area on the left is half of 0.3174, and the tail area on the right is the other half, so each of them has a value of 0.1587. To find the critical value, we look at the tail area on the left: this 0.1587 is the probability that a data point lies in this area, which is delimited by a certain z-score (or z-value).

To obtain this z-value, we just have to look at the z-tables and find the z-score that produces the probability value of 0.1587. So you can think of the z-table as a table of critical values, if you know how to use it! The z-tables are below for you to take a look at.

As you can see, the z-score which produces a probability value of 0.1587 is z = -1, which is correct! This is the critical value for a two-sided confidence interval with a confidence level of 0.6826. In other words, it is the value on the horizontal axis where the confidence interval starts.
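The same lookup can be reproduced in code. This is only a sketch, assuming Python's standard-library `statistics.NormalDist`, whose `inv_cdf` method plays the role of reading the z-table backwards (from an area to a z-score); the variable names are chosen for this illustration.

```python
from statistics import NormalDist

confidence_level = 0.6826

# Area outside the confidence interval, split evenly between both tails.
outside_area = 1.0 - confidence_level  # 0.3174
left_tail = outside_area / 2           # 0.1587

# inv_cdf returns the z-score whose left-hand area equals left_tail.
lower_critical_value = NormalDist().inv_cdf(left_tail)

print(f"left tail area: {left_tail:.4f}")
print(f"critical value: {lower_critical_value:.2f}")
```

Rounded to two decimals, as in a z-table, the critical value comes out as -1.00, matching the value read from the table.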

We know it is correct because we already knew this from the empirical rule. Do you think this example was redundant? Then let us take a look at the next section of our lesson, where the first example problem asks you to find critical values in the same way we just did, but now for different confidence levels.

#### How to find a critical value

The steps to find a critical value when knowing the confidence level are:

- Identify the limit (or limits) of the confidence interval.
- If the confidence interval belongs to the left-most side of the distribution, use the area proportion of the confidence level to find the corresponding z-value on the z-table.
  - This is your critical value.
- If you are looking at a two-sided confidence level centered at the mean, you need to calculate the area under the standard normal curve that doesn't belong to the confidence level (this area is called $\alpha$).
  - You will have half of it on the left and half of it on the right.
  - Calculate the value of $\alpha/2$ and use it to find the corresponding z-value on the z-table. This works because $\alpha/2$ is equal to the area under the curve in the left tail of the distribution.
  - This is your critical value (the value of z at which the confidence interval has its lower limit).
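The steps above can be sketched as a small function. This is an illustration under stated assumptions, not a prescribed method: it uses Python's standard-library `statistics.NormalDist` instead of a printed z-table, and the function name `critical_value` and its `two_sided` flag are hypothetical names introduced here.

```python
from statistics import NormalDist


def critical_value(confidence_level: float, two_sided: bool = True) -> float:
    """Find the z critical value for a confidence level (hypothetical helper).

    two_sided=True : interval centered at the mean; returns its lower limit.
    two_sided=False: interval covering the left-most side of the distribution;
                     returns the z-value with that much area to its left.
    """
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")

    if two_sided:
        # alpha is the area that does not belong to the confidence level;
        # half of it sits in the left tail.
        alpha = 1.0 - confidence_level
        return NormalDist().inv_cdf(alpha / 2)

    # Left-most interval: the confidence level itself is the left-hand area.
    return NormalDist().inv_cdf(confidence_level)


print(f"{critical_value(0.95):.2f}")                   # two-sided, lower limit
print(f"{critical_value(0.95, two_sided=False):.3f}")  # left-most interval
```

For a confidence level of 0.95, the two-sided lower limit is about -1.96, while the left-most interval ends at about 1.645, which shows why the first step (identifying the limits of the interval) matters.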

As you can see, critical values and confidence levels are strongly related to each other when studying probabilities in the standard normal curve. Also notice that, at this point, we don't really calculate critical values; it is more a matter of finding them using the z-tables.

Next you will have some examples where you can practice what we have mentioned so far.

__Example 1__

In this problem we will focus on finding the critical value corresponding to the following confidence levels:

**1. $\quad$ A confidence level of 0.50**

The first thing to do is to draw the standard normal curve, where the mean is equal to zero ($\mu = 0$) and the standard deviation is one ($\sigma = 1$).

In this case, we are looking for the critical value corresponding to a confidence level of 50%, or 0.5, which means that there is a 50% chance that a data point from our distribution falls inside the confidence interval.

Taking into account the empirical rule shown in figure 1, we can easily say that the limits for a confidence level of 0.5 must be closer than one standard deviation to the mean, since this level covers only the central 50% of the data points; therefore, this is what it looks like in the standard normal curve:

So, if we are looking for the critical value related to a confidence level of 0.5, then we are looking for the value of z at the left edge of the confidence interval for that confidence level! Now, how do we find that value?

Notice that since the confidence interval encloses 50% of the total area under the curve, and since this area is centered on the mean, each piece outside the confidence interval must account for 25% of the area under the curve. This means that there is a probability of 25% that a data point lies in the area to the left of the confidence interval, and we can use this bit of information to look for the z-score that produces this probability of 0.25.

EASY! Use the z-tables.
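If you want to check the value you read from the table, the lookup can be sketched in code. The snippet assumes Python's standard-library `statistics.NormalDist` as a substitute for the z-table; the variable names are made up for this example.

```python
from statistics import NormalDist

confidence_level = 0.50

# Each tail outside the centered interval holds half of the remaining area.
left_tail = (1.0 - confidence_level) / 2  # 0.25

# z-score with an area of 0.25 to its left: the lower limit of the interval.
z_lower = NormalDist().inv_cdf(left_tail)

# The standard normal curve is symmetric, so the upper limit mirrors it.
z_upper = -z_lower

print(f"lower limit: {z_lower:.2f}, upper limit: {z_upper:.2f}")
```

The lower limit comes out at about -0.67, which is indeed closer to the mean than one standard deviation, as the empirical rule suggested.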

__Example 2__

Theoretical interpretation of the confidence level and critical value

What would be the resulting critical value for:

- **A confidence level of 1?** $+\infty$
- **A confidence level of 0?** $-\infty$
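One way to read these answers, offered here as an assumption rather than as part of the original exercise, is to treat the confidence level as the area to the left of the critical value. The sketch below uses Python's standard-library `statistics.NormalDist` to show the z-value growing without bound as that area approaches 1 and falling without bound as it approaches 0. The areas 0 and 1 themselves cannot be passed to `inv_cdf`, which only accepts values strictly between them.

```python
from statistics import NormalDist

standard_normal = NormalDist()

# Left-hand areas approaching 1: the z-value keeps growing.
areas_near_one = (0.9, 0.99, 0.999999)
upper = [standard_normal.inv_cdf(a) for a in areas_near_one]

# Left-hand areas approaching 0: the z-value keeps falling.
areas_near_zero = (0.1, 0.01, 0.000001)
lower = [standard_normal.inv_cdf(a) for a in areas_near_zero]

for area, z in zip(areas_near_one + areas_near_zero, upper + lower):
    print(f"area to the left = {area}: z = {z:.3f}")
```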

This is the end of our lesson. Before you go, we recommend that you take a look at this handout on confidence intervals, which relates today's topic to our next lesson: margin of error.

This is it for today, see you in the next one!