What are Percentiles?
Percentiles are the values that divide an ordered collection of data into 100 equal parts. The percentile rank of a given value is the percentage of values in the distribution that are less than or equal to it.
Percentiles are a tool used in statistics to comprehend and analyze data. The nth percentile of a data set is the value below which n percent of the data falls. Percentiles are commonly used in everyday life to comprehend numbers such as test results, health indicators, and other metrics.
The equation used to calculate the percentile is as follows:
$$n = \frac{P}{100} \times N$$
where
$P$ is the percentile,
$N$ is the number of values in the data set, and
$n$ is the ordinal rank of the given value.
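The ordinal-rank calculation can be sketched in Python as follows. This is a minimal illustration, assuming the rank is computed as (P / 100) × N and rounded up to the next whole position (the "nearest-rank" convention); the function name and the sample scores are hypothetical and not part of the original article.

```python
import math

def nearest_rank_percentile(values, p):
    """Return the p-th percentile of values using the ordinal rank (p / 100) * N."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be greater than 0 and at most 100")
    ordered = sorted(values)
    # Ordinal rank, rounded up to a whole position (an assumed convention).
    n = math.ceil(p * len(ordered) / 100)
    # Ranks are 1-based, list indices are 0-based.
    return ordered[n - 1]

scores = [15, 20, 35, 40, 50]  # hypothetical data
print(nearest_rank_percentile(scores, 40))  # rank 2 -> 20
```

Other conventions interpolate between neighbouring values, so library functions may return slightly different results for the same data.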
The median is also known as the 50th percentile, and it can be computed for ordinal, interval, and ratio variables.
Percentiles depend only on the data used to compute them and do not require any distributional assumptions. As a result, percentiles may serve as useful benchmarks for both normal and non-normal distributions, and they will always lie between the minimum and maximum of the observed data. The mean, by contrast, is sensitive to extreme values and is most informative for roughly symmetric distributions such as the normal distribution.
Quartiles are an important concept related to percentiles. They are the values that divide the data into quarters, and they are based on percentiles.
- The 25th percentile is the first quartile, commonly referred to as $Q_1$ or the lower quartile. Three-quarters of the scores are higher than this value, while the bottom quarter is lower.
- The 50th percentile is the second quartile, also referred to as $Q_2$ or the median. Half of the scores lie above it and half lie below it.
- The 75th percentile is the third quartile, commonly referred to as $Q_3$ or the upper quartile. The top quarter of the scores exceeds this value, whereas the bottom three-quarters do not.
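As a rough illustration, the three quartiles can be computed with Python's standard `statistics` module. The scores below are hypothetical, and `statistics.quantiles` uses its default "exclusive" interpolation method here, which is only one of several conventions; other methods can give slightly different quartiles.

```python
from statistics import median, quantiles

scores = [10, 20, 30, 40, 50, 60, 70]  # hypothetical data

# n=4 splits the data into four equal parts and returns the three cut points.
q1, q2, q3 = quantiles(scores, n=4)

print(q1, q2, q3)            # 20.0 40.0 60.0
print(q2 == median(scores))  # the second quartile is the median
```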
For example, if a student has a percentile score of 65 in a class of 100 students, he has performed as well as or better than 65 percent of the class (himself and 64 other students), and 35 students have performed better than him.
Percentiles in real-world applications
Percentiles are useful whenever a set of data has to be divided into manageable pieces.
- They are used to interpret scores such as SAT scores so that the students taking the tests can know how they are doing compared to other students.
- They are used in children’s growth charts that give an estimate of the height and weight of a child as compared to other children of the same age group.
- Percentiles are frequently used in descriptive statistics to examine vast volumes of financial data. Data are ranked using a percentile scale from 0 to 100 to achieve this. The percentile value for the lowest value is zero, and the percentile value for the highest value is one hundred. Between these ranges, all other values are given corresponding percentiles.
Percentiles and percentile ranks
A score’s percentile rank is the proportion of scores in its frequency distribution that are lower or equal to it.
Percentile rank formula:
$$R = \frac{P}{100} \times (N + 1)$$
where $P$ is the percentile and $N$ is the total number of items.
Steps to calculate the percentile rank are as follows:
- Find the percentile of the data set.
- Plug the values into the formula.
- Interpret the rank. For example, a student with a percentile rank of 92% has a test result that is higher than or equal to 92% of the reference group.
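The definition of percentile rank given at the start of this section (the share of scores that are lower than or equal to a given score) can be sketched directly in Python. The function name and the score list are hypothetical, chosen only for illustration.

```python
def percentile_rank(scores, value):
    """Percentage of scores that are lower than or equal to value."""
    if not scores:
        raise ValueError("scores must not be empty")
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # hypothetical data
print(percentile_rank(scores, 90))  # 8 of 10 scores are <= 90 -> 80.0
```

Some texts count only the scores strictly below the value; that variant replaces `<=` with `<` and gives a lower rank whenever the value itself appears in the data.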
Percentiles and data distribution
- Normal distribution
- For a normal distribution, a percentile is the value that has a particular proportion of the distribution below it.
- The percentile of a value is the percentage of the distribution that lies below its z-score, where the z-score is the number of standard deviations the value is from the mean.
- For a normal distribution, the mean is the 50th percentile.
- Skewed distribution
- For a skewed data set, we first sort the observations in order and then count through them to locate the percentiles.
- For skewed distributions, we use the interquartile range (IQR) rule to identify and remove outliers. The rule is as follows:
- Data points that fall above $Q_3 + 1.5 \times IQR$ or below $Q_1 - 1.5 \times IQR$ are treated as outliers, where $IQR = Q_3 - Q_1$.
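Both cases can be sketched with Python's standard library. The first function converts a z-score to a percentile using the standard normal distribution; the second applies the interquartile range rule, assuming the common multiplier of 1.5. The function names and the sample data are hypothetical, and `statistics.quantiles` uses its default interpolation method, which is one of several quartile conventions.

```python
from statistics import NormalDist, quantiles

def percentile_from_z(z):
    """Percentage of a normal distribution that lies below the given z-score."""
    return 100 * NormalDist().cdf(z)

def iqr_outliers(data, k=1.5):
    """Return the points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

print(percentile_from_z(0))  # the mean sits at the 50th percentile -> 50.0
print(iqr_outliers([10, 12, 13, 14, 15, 16, 100]))  # [100]
```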
Percentiles and group comparison
We can use percentiles to compare two different groups as long as we are comparing the same kind of data, for example the scores of two different classes on the same test.
For a given percentile, we can find the score that corresponds to it in each class and see which class is doing better. Conversely, for a given score, we can calculate its percentile in each class and see where that score sits within each class's range of scores.
For example, suppose two classes of the same size, A and B, have percentiles of 89 and 80 respectively for a score of 40 out of 50. Class B then has a larger share of high-performing students: the 80th percentile means that 20 percent of its students scored better than 40, whereas in class A only 11 percent of students scored better than 40.
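A comparison of this kind can be sketched as follows. The two score lists are hypothetical (they are not the classes from the example above), and the helper counts scores lower than or equal to the given score, which is an assumed convention.

```python
def percentile_rank(scores, value):
    """Percentage of scores that are lower than or equal to value."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)

# Hypothetical scores out of 50 for two classes of the same size.
class_a = [22, 25, 28, 30, 31, 33, 35, 38, 40, 45]
class_b = [30, 32, 35, 38, 40, 42, 44, 46, 47, 49]

score = 40
rank_a = percentile_rank(class_a, score)
rank_b = percentile_rank(class_b, score)

# A lower percentile for the same score means more students scored above it.
print(f"Class A: {rank_a} percentile, {100 - rank_a}% scored higher")
print(f"Class B: {rank_b} percentile, {100 - rank_b}% scored higher")
```

In this made-up data, the same score of 40 sits at a lower percentile in class B, so class B has the larger share of students above that score.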
Analyzing a cumulative frequency graph
To analyze a cumulative frequency graph, we first use quartiles. Let us see how to calculate the different quartiles from the graph.
- The first quartile: Organize the data into an ordered list and split it into two halves. If the total number of items in the data set is odd, omit the median (the element in the middle). The median of the lower half is the first quartile.
$\frac{n+1}{4}$ is the first quartile position, where $n$ is the number of data values.
- The second quartile: The median and the second quartile are identical, so finding one gives the other.
$\frac{n+1}{2}$ is the second quartile position.
- The third quartile: This is located in the same way as the first quartile, except that after splitting the data into two halves you take the upper half rather than the lower half. The median of the upper half is the third quartile.
$\frac{3(n+1)}{4}$ is the third quartile position.
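The median-of-halves procedure described above can be sketched in Python. The function name and the sample data are hypothetical; the sketch assumes the middle element is left out of both halves when the number of values is odd, as the text describes.

```python
from statistics import median

def quartiles_by_halves(data):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves."""
    ordered = sorted(data)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]
    # For an odd count, skip the middle element so it is in neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

print(quartiles_by_halves([3, 7, 8, 5, 12, 14, 21, 13, 18]))  # (6.0, 12, 16.0)
```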
Understanding where a value fits within a distribution of values using percentiles is a relatively intuitive process. In this article, we have learned how important percentiles are in statistics and how they are used in different types of distributions. We also saw how percentiles are used in real-life analyses. We learned how quartiles work and how they can be used in frequency graphs.
Example 1: The scores of 5 students in a test out of 10 are as follows,
Find the percentile of the student who scored 7.
Let’s arrange the data in ascending order.
We can see that there are 2 students who have scored less than 7.
And the total number of scores is 5.
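Therefore, using the formula:
$$\text{Percentile} = \frac{\text{number of scores below } 7}{\text{total number of scores}} \times 100 = \frac{2}{5} \times 100 = 40$$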
Example 2: Given a list of heights,
Find the 40th percentile of the data given.
Making an ordered list we have,
Finding the ordinal rank for the 40th percentile we have,
The third value in the ordered list is 120, so the 40th percentile is 120.
Example 3: Find the interquartile range for the following data.
Ordering the values from low to high given we get,
Now, we divide the data into two halves,
Finding the median of the first half, we get $Q_1$ to be 29, and the second half's median is 63, which is $Q_3$.
Using the formula $IQR = Q_3 - Q_1$,
we get $IQR = 63 - 29 = 34$.
Example 4: If a student has a rank of 5 out of 15 what is the student’s percentile rank?
The total number of scores below the rank is 10, which is $L$.
Therefore, the percentile rank is $\frac{L}{N} \times 100 = \frac{10}{15} \times 100 \approx 66.67$,
where $N$ is the total number of students.
Rounded to the nearest whole number, the percentile rank is 67.
Example 5: A table with its cumulative frequency is given. Find all the quartile positions.
Therefore, using the formulas, the positions are as follows:
Frequently asked questions (FAQs)
What is the median?
After all observations are organized in ascending order, from least to greatest, the median is the "middle value" in the group.
What is the mean?
A data set’s mean (average) is calculated by summing all of the numbers in the set, then dividing by the total number of values in the set.
What is the cumulative frequency?
The number of observations in a data collection that is above (or below) a specific value may be found using cumulative frequency.
What is the percentage?
A figure or ratio that may be stated as a fraction of 100 is a percentage.
What is a skewed distribution?
Because the data values fall off more abruptly on one side than the other, a skewed distribution is neither symmetric nor normal.
Written by Prerit Jain