What are the Differences Between Mean and Median?

Measures of central tendency summarize a data set by a single central value, i.e., they give some sort of average value of the observations in a survey or an experiment. The three most basic measures of central tendency are the Mean, Median, and Mode. In this article, we are going to learn about Mean and Median, how to find them, their properties, and their applications.

Data and central tendencies

Data is a collection of observed values from a survey or an experiment. In many data sets the observations cluster towards the center, i.e., the frequency-observation graph peaks near the center. The values near the center therefore tend to give a lot of information about the data, and this is why we define central tendencies.

Central tendencies are quantities that describe a central value of a data set. A central tendency may or may not be one of the values in the data set.

Mean and Median

Mean is defined as the average value of the observations in a data set, i.e., it represents the center of the data set by an average taken over all the data points. There are three common types of mean in mathematics, Arithmetic mean, Geometric mean, and Harmonic mean, each corresponding to a different relation between the data points.

Two data points and their arithmetic mean form an arithmetic sequence, i.e., consecutive terms have a constant difference. Similarly, two data points and their geometric mean form a geometric sequence, and two data points and their harmonic mean form a harmonic sequence. For the sake of simplicity and practicality, we mainly focus on the arithmetic mean in statistics.
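
As a small worked illustration (the values a = 2 and b = 8 are chosen here only for convenience and are not part of the article's examples), the three means of two numbers are

    \[\begin{array}{l}{\rm{AM}} = \frac{{a + b}}{2} = \frac{{2 + 8}}{2} = 5\\{\rm{GM}} = \sqrt {ab}  = \sqrt {16}  = 4\\{\rm{HM}} = \frac{{2ab}}{{a + b}} = \frac{{32}}{{10}} = 3.2\end{array}\]

Here 2, 5, 8 is an arithmetic sequence (common difference 3), 2, 4, 8 is a geometric sequence (common ratio 2), and 2, 3.2, 8 is a harmonic sequence, since the reciprocals 0.5, 0.3125, 0.125 have a common difference of -0.1875.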

The arithmetic mean is defined as the average value. It is calculated by dividing the sum of all the observations by the total number of observations.

The median is the middle term of all the observations arranged in ascending or descending order.

How to find Mean and Median?

Mean

  1. Ungrouped data

To find the mean of discrete data, i.e., ungrouped data, we simply add all the observations and divide that sum by the number of observations, thus the formula for the mean of ungrouped data is given by, 

    \[{\rm{Mean}} = \frac{{\text{Sum of all the observations}}}{{\text{Number of observations}}}\]

  2. Grouped Data

To find the mean of grouped data, we have two different types of grouping of data.

  1. Simple Grouping: In the simple grouping of data, we simply group the data with the same value. This type of grouping is practical only for a small sample set. To find the mean in simple grouping we first find the total value each data point is contributing towards the data by multiplying the data values by their respective frequency, and then we add all these values found by the product of the data point and their frequency. Then that sum is divided by the total number of observations which is simply the sum of all the frequencies.

    \[{\rm{Mean}}\left( {\bar x} \right) = \frac{{\mathop \sum \limits_{i = 1}^n {f_i}{x_i}}}{{\mathop \sum \limits_{i = 1}^n {f_i}}}\]

Where, \(x_i\) are the observations, and \(f_i\) are their respective frequencies.

  2. Ranged Grouping: In the ranged grouping of data, we divide the total range of the data, i.e., the lowest value to the highest value, into small class intervals, and any value falling within a class interval is counted towards that class interval only.
    This type of grouping is practical for a large sample set. For small sample sets the result may differ from the mean found by simple grouping or from ungrouped data, since it approximates every data point in a given interval by a single point, the class mark (the midpoint of the interval).
    In a large data set the error of that approximation becomes relatively small and the mean found is a very good approximation of the actual mean. In ranged grouping, depending on the size of the data values, we can apply three different methods to find the mean, i.e., Direct method, Assumed mean method, and Step-Deviation method. Let’s learn more about each of these methods in later sections with examples.

The formula for mean using the direct method is,

    \[{\rm{Mean}}\left( {\bar x} \right) = \frac{{\mathop \sum \limits_{i = 1}^n {f_i}{x_i}}}{{\mathop \sum \limits_{i = 1}^n {f_i}}}\]

Where, \(x_i\) are the class marks, and \(f_i\) are their respective frequencies.

The formula for the Assumed mean method is

    \[{\rm{Mean}}\left( {\bar x} \right) = a + \frac{{\mathop \sum \limits_{i = 1}^n {f_i}{d_i}}}{{\mathop \sum \limits_{i = 1}^n {f_i}}}\]

Where, \(d_i = x_i - a\) are the differences between the class marks \(x_i\) and the assumed mean \(a\), and \(f_i\) are their respective frequencies.

The formula for the Step Deviation method is,

    \[{\rm{Mean}}\left( {\bar x} \right) = a + \frac{{\mathop \sum \limits_{i = 1}^n {f_i}{u_i}}}{{\mathop \sum \limits_{i = 1}^n {f_i}}} \times h\]

Where, \(u_i = \frac{x_i - a}{h}\) are the differences between the class marks \(x_i\) and the assumed mean \(a\) divided by the class width (height) \(h\), and \(f_i\) are their respective frequencies.
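
To see how ranged grouping approximates the data, the following minimal Python sketch bins a small sample into class intervals, applies the direct method to the class marks, and compares the result with the exact mean. The sample values, the class width of 5, and all variable names are assumptions made only for this illustration.

```python
# Illustrative sample and class layout (assumptions for this sketch).
data = [10, 13, 11, 14, 17, 16, 13, 11, 19, 12]
lowest_limit = 10
class_width = 5

# Count how many observations fall in each class [lower, lower + class_width).
class_frequencies = {}
for value in data:
    lower = lowest_limit + ((value - lowest_limit) // class_width) * class_width
    class_frequencies[lower] = class_frequencies.get(lower, 0) + 1

# Direct method: every observation in a class is replaced by the class mark.
grouped_mean = sum(
    f * (lower + class_width / 2) for lower, f in class_frequencies.items()
) / sum(class_frequencies.values())

# Exact mean of the ungrouped observations, for comparison.
exact_mean = sum(data) / len(data)

print(class_frequencies)          # {10: 7, 15: 3}
print(grouped_mean, exact_mean)   # 14.0 13.6
```

With only ten observations and two classes, the grouped mean (14.0) differs noticeably from the exact mean (13.6), which is the small-sample effect described above.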

Median

  1. Ungrouped Data

To find the median of ungrouped data we arrange the data in either ascending or descending order, then we select the middlemost value from the arranged data. If the number of observations is odd, we simply pick out the middlemost data point. If the number of observations is even, there are two middlemost data points, and the median is the arithmetic mean of those two points.

  2. Grouped Data

To find the median of grouped data we have two methods, i.e., numerical and graphical. Both methods are based on the cumulative frequencies of the classes.
We can make the cumulative frequency table in two ways, i.e., the ‘More than’ table and the ‘Less than’ table. The median found numerically from either table is the same, and we have a simple formula for finding it.
Graphically, the More than cumulative frequency table results in a More than Ogive and the Less than table in a Less than Ogive; the data value (x-coordinate) at the intersection of both ogives gives the median.

    \[{\rm{Median}} = l + \left( {\frac{{\frac{n}{2} - {c_f}}}{f}} \right) \times h\]

Where, \(l\) is the lower limit of the median class, \(f\) is the frequency of the median class, \(c_f\) is the cumulative frequency of the class preceding the median class (the number of observations below \(l\)), \(n\) is the total number of observations or simply the total frequency, and \(h\) is the width (height) of the class intervals. The median class is the class whose Less than cumulative frequency is just more than the value \(n/2\).
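
The two procedures for the median described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the sample values, and the class layout are hypothetical, and the grouped version assumes classes of equal width listed in ascending order.

```python
def median_ungrouped(observations):
    """Median of a non-empty list of numbers (ungrouped data)."""
    ordered = sorted(observations)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the single middlemost value.
        return ordered[middle]
    # Even count: arithmetic mean of the two middlemost values.
    return (ordered[middle - 1] + ordered[middle]) / 2


def median_grouped(lower_limits, frequencies, class_width):
    """Median of grouped data using Median = l + ((n/2 - c_f) / f) * h."""
    n = sum(frequencies)
    cumulative = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lower_limits, frequencies):
        # Median class: first class whose cumulative frequency exceeds n/2.
        if cumulative + f > n / 2:
            return lower + (n / 2 - cumulative) / f * class_width
        cumulative += f
    raise ValueError("frequencies must contain at least one positive count")


print(median_ungrouped([7, 3, 9, 1, 5]))  # 5
print(median_ungrouped([7, 3, 9, 1]))     # 5.0
# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with frequencies 5, 8, 12, 5.
print(round(median_grouped([0, 10, 20, 30], [5, 8, 12, 5], 10), 2))  # 21.67
```

In the grouped call, n/2 = 15 and the cumulative frequencies are 5, 13, 25, 30, so the median class is 20-30 with c_f = 13 and f = 12.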

Differences between Mean and Median

| Mean | Median |
| --- | --- |
| Mean is the average value of all the data points. The mean of a data set may or may not actually belong to the data set. | Median is the middle point of a data set. For an odd number of observations the median is one of the data points; for an even number it is the average of the two middle points and may not belong to the data set. |
| The Mean of a data set depends on all the data points. | Median of a data set does not actually depend on all the data points, just the center points. If every value above the median is increased and every value below it is decreased, the median remains the same. |
| The Mean of a data set is the most sensitive central tendency out of the three. Mean of the data set changes if any single value of the data set is changed. | Median of a data set is not too sensitive to changes in the data points. The median of a data set does not necessarily change when any one of the data points is changed. |
| Mean is the preferred central tendency when the data is normally distributed. | Median is the preferred central tendency when the data follows a skewed distribution. |
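
The difference in sensitivity can be illustrated with a short Python sketch using the standard `statistics` module. The values below are hypothetical and chosen only so that a single extreme observation is easy to spot.

```python
from statistics import mean, median

# Hypothetical observations.
original = [30, 32, 35, 38, 40]
# Same data with the largest value replaced by an extreme one.
with_outlier = [30, 32, 35, 38, 400]

mean_before, median_before = mean(original), median(original)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)
print(mean_after, median_after)
```

Changing one value moves the mean from 35 to 107, while the median stays at 35.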

Solved Examples

Example 1: Find the mean and median for the following data set.

    \[X = \{ 10,13,11,14,17,16,13,11,19,12\} \]

Solution:

The mean of the given data set is given by

    \[\begin{array}{l}\bar x = \frac{{\sum {{x_i}} }}{n}\\\bar x = \frac{{10 + 13 + 11 + 14 + 17 + 16 + 13 + 11 + 19 + 12}}{{10}}\\\bar x = \frac{{136}}{{10}}\\\bar x = 13.6\end{array}\]

And since the number of observations is even, the median is the arithmetic mean of the two middlemost terms in the arranged data set \(X^*\) given by

    \[{X^*} = \{ 10,11,11,12,13,13,14,16,17,19\} \]

Thus, median is given by

    \[\begin{array}{l}{\rm{Median}} = \frac{{13 + 13}}{2}\\{\rm{Median}} = \frac{{26}}{2}\\{\rm{Median}} = 13\end{array}\]
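
As a quick cross-check of Example 1, the same values can be passed to Python's standard `statistics` module. The variable names are only illustrative.

```python
from statistics import mean, median

# Data set X from Example 1.
X = [10, 13, 11, 14, 17, 16, 13, 11, 19, 12]

example_mean = mean(X)      # sum of observations / number of observations
example_median = median(X)  # average of the two middle values (even count)

print(example_mean)    # 13.6
print(example_median)  # 13.0
```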

Example 2: Find the mean of the following grouped data by all three methods and verify your answers. Also, find the median using More than cumulative frequency distribution.

| Class Intervals | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 |
| --- | --- | --- | --- | --- | --- |
| Frequency | 6 | 10 | 14 | 12 | 8 |

Solution:

To find the mean we will construct the following distribution table

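| Class Intervals | Class Marks \(x_i\) | Frequency \(f_i\) | \(f_i x_i\) | \(d_i = x_i - a\) | \(f_i d_i\) | \(u_i = \frac{d_i}{h}\) | \(f_i u_i\) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0-10 | 5 | 6 | 30 | -20 | -120 | -2 | -12 |
| 10-20 | 15 | 10 | 150 | -10 | -100 | -1 | -10 |
| 20-30 | 25 (= a) | 14 | 350 | 0 | 0 | 0 | 0 |
| 30-40 | 35 | 12 | 420 | 10 | 120 | 1 | 12 |
| 40-50 | 45 | 8 | 360 | 20 | 160 | 2 | 16 |
| Total | | \(\sum f_i = 50\) | \(\sum f_i x_i = 1310\) | | \(\sum f_i d_i = 60\) | | \(\sum f_i u_i = 6\) |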

Now, we can find the mean of this data

Direct method,

    \[\begin{array}{l}\bar x = \frac{{\sum {{f_i}} {x_i}}}{{\sum {{f_i}} }}\\\bar x = \frac{{1310}}{{50}}\\\bar x = 26.2\end{array}\]

Assumed mean method,

    \[\begin{array}{l}\bar x = a + \frac{{\sum {{f_i}} {d_i}}}{{\sum {{f_i}} }}\\\bar x = 25 + \frac{{60}}{{50}}\\\bar x = 25 + 1.2\\\bar x = 26.2\end{array}\]

Step deviation method

    \[\begin{array}{l}\bar x = a + \frac{{\sum {{f_i}} {u_i}}}{{\sum {{f_i}} }} \times h\\\bar x = 25 + \frac{6}{{50}} \times 10\\\bar x = 25 + 1.2\\\bar x = 26.2\end{array}\]

Hence, the mean of the given grouped data is 26.2, and we have verified the answer using all three methods.
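
The three calculations can also be reproduced with a short Python sketch. The class marks, frequencies, assumed mean a = 25, and class width h = 10 are taken from the table above; the variable names are only illustrative.

```python
class_marks = [5, 15, 25, 35, 45]   # x_i
frequencies = [6, 10, 14, 12, 8]    # f_i
a = 25                              # assumed mean
h = 10                              # class width

total_frequency = sum(frequencies)
pairs = list(zip(frequencies, class_marks))

# Direct method: sum(f_i * x_i) / sum(f_i)
direct = sum(f * x for f, x in pairs) / total_frequency

# Assumed mean method: a + sum(f_i * d_i) / sum(f_i), with d_i = x_i - a
assumed = a + sum(f * (x - a) for f, x in pairs) / total_frequency

# Step deviation method: a + (sum(f_i * u_i) / sum(f_i)) * h, with u_i = d_i / h
step = a + sum(f * ((x - a) / h) for f, x in pairs) / total_frequency * h

print(round(direct, 2), round(assumed, 2), round(step, 2))  # 26.2 26.2 26.2
```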

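Now, for the median using the More than cumulative frequency distribution, we have the following table,

| Class Intervals | Frequency | More Than | Cumulative Frequency |
| --- | --- | --- | --- |
| 0-10 | 6 | More than 0 | 50 |
| 10-20 | 10 | More than 10 | 44 |
| 20-30 (Median Class) | 14 (\(f\)) | More than 20 | 34 |
| 30-40 | 12 | More than 30 | 20 |
| 40-50 | 8 | More than 40 | 8 |
| | \(n = 50\) | | |

Here \(n/2 = 25\). Since 34 observations are more than 20 but only 20 observations are more than 30, the median lies in the class 20-30, which is therefore the median class. The formula needs \(c_f\), the cumulative frequency of the class preceding the median class, i.e., the number of observations below \(l = 20\). From the More than table this is \(n - 34 = 50 - 34 = 16\).

In the given table, we have the following values,

    \[l = 20,\;f = 14,\;n = 50,\;{c_f} = 16,\;h = 10\]

And the median is given by,

    \[{\rm{Median}} = l + \left( {\frac{{\frac{n}{2} - {c_f}}}{f}} \right) \times h\]

Substituting the values we have,

    \[\begin{array}{l}{\rm{Median}} = 20 + \left( {\frac{{25 - 16}}{{14}}} \right) \times 10\\{\rm{Median}} = 20 + \frac{{90}}{{14}}\\{\rm{Median}} \approx 20 + 6.43\\{\rm{Median}} \approx 26.43\end{array}\]

Hence, the median of the given grouped data is approximately 26.43.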

Conclusion

In this article, we have learned about Mean and Median as central tendencies. Mean is the average value of all the observations, whereas the median is the middlemost value of the ordered data.

When the data is normally distributed, we prefer to use the mean as the central tendency: the distribution is symmetric, so the mean and the median coincide, and the mean makes use of every observation. If the data follows a skewed distribution, we prefer the median as the central tendency, since the data is weighted to one of the sides and the extreme values on the longer side pull the mean away from the actual center.
