Chapter 8 Comparing Data


Statisticians use several measures, such as the percentage difference, percentage change and percentage error, to compare measured values. Each one measures something different.

Percentage difference is the absolute difference between two values divided by the average of the two values, multiplied by 100%. It is typically used to understand how close two values are to one another.

If the data track how a value changes over time (comparing an old value to a new value), it is nearly always better to calculate the percentage change rather than the percentage difference.

This is the key difference between the two: percentage change measures change over time relative to the old value, while percentage difference measures how far apart two values are, with neither value treated as the baseline.

8.1 Percentage Difference

The percentage difference is used to compare two values.

\[ \textrm{Percentage Difference} = 100\% \frac{|\textrm{Value}_1 - \textrm{Value}_2|}{\frac{\textrm{Value}_1+\textrm{Value}_2}{2}}.\]

The | symbols in the formula above indicate that the absolute value should be taken.

Information

The absolute value of a calculation is its result with any negative sign dropped, so the answer is never negative.

For example:

\[ |3 - 2| = 1,\]

\[ |10 - 5| = 5,\]

\[ |7 - 9| = 2.\]

Seven minus nine is equal to -2, but when we take the absolute value (|7 - 9|) we ignore the negative sign and report the result as 2.

8.1.1 Example

One researcher produced 13 research reports in 2022; another produced 11. What is the percentage difference?

\[ \textrm{Percentage Difference} = 100\% \frac{|13 - 11|}{(\frac{13+11}{2})},\]

\[ \textrm{Percentage Difference} = 100\% \frac{2}{(\frac{24}{2})},\]

\[ \textrm{Percentage Difference} = 100\% \frac{2}{12} \approx 16.7\%.\]
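The calculation above can be sketched in Python. This is a minimal illustration rather than a definitive implementation: the function name `percentage_difference` and the choice to raise an error when the two values average to zero are assumptions, not part of the original text.

```python
def percentage_difference(value_1: float, value_2: float) -> float:
    """Absolute difference between two values, as a percentage of their average."""
    average = (value_1 + value_2) / 2
    if average == 0:
        # Assumption: treat this case as undefined rather than return a value.
        raise ValueError("percentage difference is undefined when the average is zero")
    return 100 * abs(value_1 - value_2) / average


# The research report example: 13 reports versus 11 reports.
result = percentage_difference(13, 11)
print(f"{result:.1f}%")  # 16.7%
```

Because the formula uses the absolute value and the average, swapping the two arguments gives the same answer.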

8.2 Percentage Change

Percentage change is about comparing old to new values. The formula for calculating a percentage change is given below:

\[ \textrm{Percentage Change} = 100\% \frac{\textrm{New Value} - \textrm{Old Value}}{\textrm{Old Value}}.\]

8.2.1 Example

The table below details mid-year population estimates for 2010 and 2020 for several Local Government Districts in Northern Ireland (NISRA 2021). What is the percentage change in the reported population of Banbridge between 2010 and 2020?

Mid Year Population Estimates, 2010-2020

| Local Government District | Persons (2010) | Persons (2020) |
|---|---|---|
| Craigavon | 92,242 | 103,341 |
| Dungannon | 57,263 | 63,552 |
| Lisburn | 119,442 | 129,485 |
| Armagh | 59,137 | 63,874 |
| Newry and Mourne | 99,136 | 106,813 |
| Magherafelt | 44,664 | 47,789 |
| Banbridge | 47,821 | 51,108 |

Use the formula above to calculate the percentage change for Banbridge:

\[ \textrm{Percentage Change} = 100\% \frac{\textrm{New Value} - \textrm{Old Value}}{\textrm{Old Value}},\]

\[ \textrm{Percentage Change} = 100\% \frac{51,108-47,821}{47,821} \approx 6.9\%.\]

A negative percentage change indicates a decrease, while a positive percentage change indicates an increase, so the population of Banbridge increased by 6.9% between 2010 and 2020.

The percentage change in the population of each area between 2010 and 2020 has been calculated using this formula and the results are detailed in the table below.

Mid Year Population Estimates, 2010-2020

| Local Government District | Persons (2010) | Persons (2020) | Percentage Change |
|---|---|---|---|
| Craigavon | 92,242 | 103,341 | 12.0% |
| Dungannon | 57,263 | 63,552 | 11.0% |
| Lisburn | 119,442 | 129,485 | 8.4% |
| Armagh | 59,137 | 63,874 | 8.0% |
| Newry and Mourne | 99,136 | 106,813 | 7.7% |
| Magherafelt | 44,664 | 47,789 | 7.0% |
| Banbridge | 47,821 | 51,108 | 6.9% |
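The percentage change column can be reproduced with a short Python sketch. The population figures are taken from the table; the function name `percentage_change`, the dictionary layout and the rounding to one decimal place are illustrative assumptions.

```python
def percentage_change(old_value: float, new_value: float) -> float:
    """Change from old_value to new_value, as a percentage of old_value."""
    if old_value == 0:
        # Assumption: treat a zero baseline as undefined.
        raise ValueError("percentage change is undefined when the old value is zero")
    return 100 * (new_value - old_value) / old_value


# (persons in 2010, persons in 2020) for each Local Government District.
populations = {
    "Craigavon": (92_242, 103_341),
    "Dungannon": (57_263, 63_552),
    "Lisburn": (119_442, 129_485),
    "Armagh": (59_137, 63_874),
    "Newry and Mourne": (99_136, 106_813),
    "Magherafelt": (44_664, 47_789),
    "Banbridge": (47_821, 51_108),
}

# Round to one decimal place, as in the table.
changes = {
    district: round(percentage_change(old, new), 1)
    for district, (old, new) in populations.items()
}

for district, change in changes.items():
    print(f"{district}: {change:.1f}%")
```

Unlike the percentage difference, the order of the arguments matters here: the old value is the baseline, so swapping the two values changes both the sign and the size of the result.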

8.3 Percentage Point Change

Note that subtracting one percentage from another gives the percentage point change rather than the percentage change.
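The distinction is easiest to see with numbers. The sketch below uses a hypothetical rate that rises from 10% to 12%; these figures are invented for illustration and do not come from the NISRA data.

```python
# Hypothetical rates, both expressed in percent.
old_rate = 10.0
new_rate = 12.0

# Subtracting one percentage from another gives the percentage point change.
point_change = new_rate - old_rate

# Applying the percentage change formula to the same two rates gives a different number.
pct_change = 100 * (new_rate - old_rate) / old_rate

print(f"Percentage point change: {point_change:.1f}")  # 2.0
print(f"Percentage change: {pct_change:.1f}%")         # 20.0%
```

The rate went up by 2 percentage points, which is a 20% increase relative to the old rate.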

8.4 Quantiles

Quantiles are the values that split a rank-ordered data set into some number of equal parts. Quartiles and quintiles are commonly used quantiles, which divide the data into four and five equal parts respectively.

Quartiles divide a rank-ordered data set into four equal parts. The values that separate the parts are called the first, second and third quartiles, and they are denoted by Q1, Q2 and Q3 respectively. 25% of the measurements or observations in the data set are less than or equal to the first quartile (Q1), 50% are less than or equal to the second quartile (Q2), and 75% are less than or equal to the third quartile (Q3).
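As a rough illustration, quartiles can be computed with `statistics.quantiles` from the Python standard library (Python 3.8 or later). The data set here, the integers 1 to 11, is hypothetical. Several conventions exist for interpolating quartiles, so other software may report slightly different values for the same data; this sketch uses the function's default method.

```python
import statistics

# Hypothetical data set: the integers 1 to 11, already rank ordered.
data = list(range(1, 12))

# n=4 requests the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4)
print(q1, q2, q3)  # 3.0 6.0 9.0

# Share of observations less than or equal to Q1. With only 11 values this
# can only approximate 25%.
share_at_or_below_q1 = sum(x <= q1 for x in data) / len(data)
print(f"{share_at_or_below_q1:.0%}")  # 27%
```

The second quartile coincides with the median of the data.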

8.5 Deciles and Percentiles

Deciles are another type of quantile. They divide the data into 10 equal parts, so instead of Q1, Q2 and Q3 we have decile 1 (D1) through to decile 9 (D9).

Percentiles divide the data into 100 equal parts, resulting in percentile 1 (P1) through to percentile 99 (P99). As with the other quantiles, the elements in the data set are first rank ordered from the smallest to the largest.

An observation at the 50th percentile (P50) is greater than or equal to 50 percent of the observations in the set, so it corresponds to the median value in the set.
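The same standard library function can sketch deciles and percentiles by changing the number of parts. The data set, the integers 1 to 99, is hypothetical, and the variable names are illustrative. As with quartiles, the exact values depend on the interpolation method, so this is one convention among several.

```python
import statistics

# Hypothetical data set: the integers 1 to 99, already rank ordered.
data = list(range(1, 100))

deciles = statistics.quantiles(data, n=10)       # D1 ... D9
percentiles = statistics.quantiles(data, n=100)  # P1 ... P99

print(len(deciles), len(percentiles))  # 9 99

# Lists are zero-indexed, so D5 is at index 4 and P50 is at index 49.
d5 = deciles[4]
p50 = percentiles[49]
print(d5, p50, statistics.median(data))  # 50.0 50.0 50
```

The fifth decile, the 50th percentile and the median all pick out the same middle value.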

Summary

The percentage difference is used to compare two values.

\[ \textrm{Percentage Difference} = 100\% \frac{|\textrm{Value}_1 - \textrm{Value}_2|}{\frac{\textrm{Value}_1+\textrm{Value}_2}{2}}.\]

The formula for calculating a percentage change is given below:

\[ \textrm{Percentage Change} = 100\% \frac{\textrm{New Value} - \textrm{Old Value}}{\textrm{Old Value}}.\]

Subtracting one percentage from another gives the percentage point change rather than the percentage change.

References

NISRA. 2021. “Population Totals (Administrative Geographies), 1991-2020.”