Measures of central tendency

13 July 2020 statistics

This is the first post where I’ll be making use of the newfangled MathML support included in modern browsers these days! The only downside, for some values of downside, is that you must write MathML by hand, rather than writing LaTeX and having something like MathJax translate it for you.

There is a great post by John Myles White on the three (are there more? I actually am not sure!) measures of central tendency, which are used to summarize distributions. The measures are the mode, the median, and the mean. Besides providing a summary, the differences between them can be used to understand the underlying distribution, which does not need to be known.

Besides being of course very useful, they also have something in common, very clearly presented in the above post. For a set of points x_i and a (single) summary s, let’s define three measures of discrepancy:

d_i = |x_i - s|^0,

d_i = |x_i - s|^1,

d_i = |x_i - s|^2

Now, if you take the sum over i, you get the error (E). If you minimize the error with respect to s, you’ll find the mode, median and mean respectively! (For the zeroth power this relies on the convention 0^0 = 0, so that an exact match contributes no discrepancy.) A plot helps you see this, and also helps you see why nobody ever takes the third power of the discrepancy: it would be hard (or trivial 😜) to minimize! What happens with higher, even orders, such as 4, or 6? Do they provide other meaningful measures of central tendency? I have no idea, because I had only half a statistics course… There is a follow-up post here, but I have not read it yet.
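To make the claim concrete without a plot, here is a minimal Python sketch that brute-forces the summary s minimizing the summed discrepancy for a given power p. Everything in it is my own illustration and not taken from the linked post: the sample `xs`, the helper names `total_error` and `minimizer`, and the grid resolution are all made up. It also assumes an integer-valued sample, so that the grid hits the data points exactly, and the 0^0 = 0 convention for the zeroth power.

```python
from statistics import mean, median, mode


def total_error(xs, s, p):
    """Sum over i of |x_i - s|^p, using the convention 0^0 = 0 when p == 0."""
    if p == 0:
        # Python evaluates 0 ** 0 as 1, so count the mismatches explicitly.
        return sum(1 for x in xs if x != s)
    return sum(abs(x - s) ** p for x in xs)


def minimizer(xs, p, steps_per_unit=100):
    """Brute-force the s that minimizes total_error on a regular grid.

    Assumes xs holds integers, so every data point lies exactly on the grid.
    """
    lo, hi = min(xs), max(xs)
    candidates = [
        k / steps_per_unit
        for k in range(lo * steps_per_unit, hi * steps_per_unit + 1)
    ]
    return min(candidates, key=lambda s: total_error(xs, s, p))


# Hypothetical sample: one repeated value and one outlier.
xs = [1, 2, 2, 3, 7]

for p in (0, 1, 2, 4):
    print(f"p = {p}: minimizer = {minimizer(xs, p)}")

print("mode, median, mean:", mode(xs), median(xs), mean(xs))
```

For this sample the minimizers for p = 0, 1 and 2 coincide with the mode, median and mean reported by the `statistics` module. The p = 4 case is only included to poke at the question about higher even orders: here it lands somewhat above the mean, pulled toward the outlier, but a single sample says nothing general about whether that is a meaningful measure.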