A z-score, in statistics, is a value attached to a data point that measures how far that point lies from a central tendency, usually the arithmetic mean, in units of the standard deviation. A z-score of 0 means that the data point is equal to the mean. The z-score is positive when the data point is greater than the mean and negative when it is smaller.
Investors and statisticians use z-scores to pick out reliable data, meaning data in which a high percentage of points fall within a narrow z-score range. In normally distributed data, about 99.7% of points have a z-score between -3 and 3, but investors often prefer the narrower range -1.5 to 1.5 when looking for more reliable data.
Want to learn AP Statistics from experts? Explore Wiingy’s Online AP Statistics tutoring services to learn from top mathematicians and experts.
A z-score is a number associated with a particular data point in a given data set.
Let $X$ be a collection of $n$ points that represent a data set, with $\mu$ as its arithmetic mean and $\sigma$ as its standard deviation. Then we can calculate the z-score, $z_i$, of a data point $x_i$ using the following formula,
$$z_i = \frac{x_i - \mu}{\sigma}$$
Step-by-step procedure to find the z-score of a given data set.
Let $X = \{x_1, x_2, \dots, x_n\}$ represent a data set.
Then to find the z-score we first need to find the arithmetic mean using the following formula,
$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Once we have the mean, we can move on to finding the standard deviation, given by the formula,
$$\sigma = \sqrt{\mathrm{Var}(X)}$$
where $\mathrm{Var}(X)$ represents the variance of $X$, given by the formula,
$$\mathrm{Var}(X) = \mu_{X^2} - \mu^2$$
Here, $\mu_{X^2}$ represents the mean of the data set $X^2$, and the data set $X^2$ is given by $X^2 = \{x_1^2, x_2^2, \dots, x_n^2\}$, i.e., the collection of squares of all the points.
Once we have the variance, we can simply take its square root to get the standard deviation.
After we have both the mean and the standard deviation, we can find the z-score of every data point in $X$ using the formula,
$$z_i = \frac{x_i - \mu}{\sigma}$$
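The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `z_scores` and the sample data are hypothetical, and the variance is the population form (mean of squares minus square of the mean) described in the steps.

```python
import math

def z_scores(data):
    """Return the population z-score of every point in `data`."""
    n = len(data)
    mean = sum(data) / n                             # step 1: arithmetic mean
    mean_of_squares = sum(x * x for x in data) / n   # mean of the squared data set
    # step 2: variance = mean of squares - square of mean (clamped against rounding)
    variance = max(mean_of_squares - mean ** 2, 0.0)
    std_dev = math.sqrt(variance)                    # step 3: standard deviation
    if std_dev == 0:
        raise ValueError("z-scores are undefined when all points are equal")
    return [(x - mean) / std_dev for x in data]      # step 4: z-score of each point

# Hypothetical sample data with mean 5 and standard deviation 2
scores = z_scores([2, 4, 4, 4, 5, 5, 7, 9])
print([round(z, 2) for z in scores])
```

For this sample the mean is 5 and the standard deviation is 2, so the point 9 sits two standard deviations above the mean and the point 2 sits one and a half below it.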
Let’s see how to calculate the z-score of data using the following example, where $x$ is the data value and $f$ is its frequency.
$x$ | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 |
$f$ | 2 | 5 | 6 | 9 | 12 | 8 | 6 | 2 |
Let’s find the mean and standard deviation of the given data set,
$x$ | $f$ | $fx$ | $x^2$ | $fx^2$ |
5 | 2 | 10 | 25 | 50 |
10 | 5 | 50 | 100 | 500 |
15 | 6 | 90 | 225 | 1350 |
20 | 9 | 180 | 400 | 3600 |
25 | 12 | 300 | 625 | 7500 |
30 | 8 | 240 | 900 | 7200 |
35 | 6 | 210 | 1225 | 7350 |
40 | 2 | 80 | 1600 | 3200 |
$\sum f = 50$ | $\sum fx = 1160$ | $\sum fx^2 = 30750$ |
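Now we can find the mean,
$$\mu = \frac{\sum fx}{\sum f} = \frac{1160}{50} = 23.2$$
and the standard deviation,
$$\sigma = \sqrt{\frac{\sum fx^2}{\sum f} - \mu^2} = \sqrt{\frac{30750}{50} - (23.2)^2} = \sqrt{76.76} \approx 8.76$$
Thus, we have the values of both the mean and the standard deviation, and we can now substitute the values of $x$, $\mu$, and $\sigma$ in the following formula to find the respective z-scores.
$$z = \frac{x - \mu}{\sigma}$$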
$x$ | $f$ | $z$ |
5 | 2 | -2.08 |
10 | 5 | -1.51 |
15 | 6 | -0.94 |
20 | 9 | -0.37 |
25 | 12 | 0.21 |
30 | 8 | 0.78 |
35 | 6 | 1.35 |
40 | 2 | 1.92 |
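The worked example can be reproduced with a short script. This is a minimal sketch in Python, assuming the frequencies used in the working table (2, 5, 6, 9, 12, 8, 6, 2) and the population form of the variance; the variable names are illustrative only.

```python
import math

# Data values and their frequencies, as used in the working table (assumed)
values = [5, 10, 15, 20, 25, 30, 35, 40]
freqs = [2, 5, 6, 9, 12, 8, 6, 2]

total = sum(freqs)                                        # sum of f
sum_fx = sum(f * x for x, f in zip(values, freqs))        # sum of f*x
sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))   # sum of f*x^2

mean = sum_fx / total                     # frequency-weighted mean
variance = sum_fx2 / total - mean ** 2    # mean of squares minus square of mean
std_dev = math.sqrt(variance)

print(f"mean = {mean:.2f}, standard deviation = {std_dev:.2f}")
for x in values:
    z = (x - mean) / std_dev              # z-score of each distinct value
    print(f"x = {x:2d}  z = {z:+.2f}")
```

Because every occurrence of a value shares the same z-score, the z-score only needs to be computed once per distinct value; the frequencies matter only for the mean and the standard deviation.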
We can also use the z-score to find a good distribution.
Let X and Y be two data sets with 10 observations each, and let $Z_X$ and $Z_Y$ represent their respective z-scores.
Suppose distribution Y has z-scores in the range -2 to 2, while distribution X has z-scores in the range -3 to 3. Then the z-scores of Y are clustered more tightly around the mean, so distribution Y is the good distribution.
Advantages
Disadvantages
This article gives a brief insight into the concept of the z-score. A z-score is a value assigned to a data point in a distribution that represents how many standard deviations the data point lies from a central tendency, usually the arithmetic mean.
The z-score describes the position of a data point within a distribution independently of the units of the data. It also makes it possible to compare data sets collected under different conditions or with different numbers of observations. One limitation is that interpreting z-scores through fixed percentages assumes the data is approximately normally distributed, which is not always true.
Example 1: Find the z-scores for the following data:
X = 1, 1, 2, 2, 3, 3, 3, 4, 5, 6
Solution:
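First, we need to find the mean of the data.
$$\mu = \frac{1+1+2+2+3+3+3+4+5+6}{10} = \frac{30}{10} = 3$$
Next, we will find the standard deviation,
$$\sigma = \sqrt{\frac{\sum x_i^2}{n} - \mu^2} = \sqrt{\frac{114}{10} - 3^2} = \sqrt{2.4} \approx 1.55$$
Now that we have both the mean and the standard deviation, we can calculate the z-scores using the formula,
$$z_i = \frac{x_i - \mu}{\sigma}$$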
Then the respective z-scores are,
Z = -1.29, -1.29, -0.65, -0.65, 0, 0, 0, 0.65, 1.29, 1.94
Example 2: In the previous example, what percent of data lies within -1.5 to 1.5 z-score range? Can we call this distribution a good distribution?
Solution:
We have the z-scores of the given data as follows.
Z = -1.29, -1.29, -0.65, -0.65, 0, 0, 0, 0.65, 1.29, 1.94
Here the number of data points lying in the -1.5 to 1.5 z-score range is 9.
Thus 9/10, or 90%, of the data lies within the -1.5 to 1.5 z-score range. This comfortably exceeds the 68% criterion for a good distribution, so we can call this distribution a good distribution.
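The count in this example can be checked programmatically. The following is a minimal sketch in Python using the standard `statistics` module; the 68% threshold and the (-1.5, 1.5) range are the article's criterion for a good distribution, and the variable names are illustrative.

```python
import statistics

data = [1, 1, 2, 2, 3, 3, 3, 4, 5, 6]

mean = statistics.fmean(data)        # arithmetic mean
std_dev = statistics.pstdev(data)    # population standard deviation
z = [(x - mean) / std_dev for x in data]

# Share of points whose z-score lies strictly inside (-1.5, 1.5)
inside = [v for v in z if -1.5 < v < 1.5]
share = len(inside) / len(z)

print(f"{len(inside)} of {len(z)} points ({share:.0%}) lie within the range")
print("meets the 68% criterion:", share >= 0.68)
```

Only the largest value, 6, has a z-score outside the range, which is why nine of the ten points are counted.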
Example 3: If the distribution of data is given by the function , where n represents the position of the data point. Find the function that represents its z-score.
Solution:
Let’s assume that the data has 2k observations, then we have k observations at odd positions and k at even.
Let’s find the mean function.
Replacing 2k=n, we have the mean function,
Standard deviation is given by,
Replacing 2k = n,
Then, the function for z-score, is given by,
In this article, a good distribution is defined as a distribution in which at least 68% of the data points have a z-score in the range (-1.5, 1.5).
Investors and statisticians use the z-score to compare data from different companies over the past few years. They chart each company's stock market value changes, then look for data with a high mean, low variability, and z-scores densely clustered around the mean, i.e., a high percentage of data points within a certain z-score range. After these calculations, the company showing a high average, a dense z-score range, and low variability is assumed to be the most profitable and the best candidate for investment.
A large, normally distributed data set usually has about 99.7% of its data points with z-scores in the range -3 to 3, about 95% in the range -2 to 2, and about 68% in the range -1 to 1. The range -1.5 to 1.5 covers about 87% of the data points.
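These percentages follow from the standard normal distribution, where the probability of a z-score lying in (-k, k) equals erf(k/√2). The snippet below is a small sketch in Python that evaluates this with `math.erf`; the helper name `share_within` is hypothetical.

```python
import math

def share_within(k):
    """Probability that a standard normal z-score lies in (-k, k)."""
    return math.erf(k / math.sqrt(2))

for k in (1, 1.5, 2, 3):
    print(f"z-score within -{k} to {k}: {share_within(k):.1%}")
```

Running it gives roughly 68% for the range -1 to 1, 87% for -1.5 to 1.5, 95% for -2 to 2, and 99.7% for -3 to 3. These figures hold only for data that is close to normally distributed.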
In a normal distribution, about 99.7% of data points have a z-score within the range -3.0 to 3.0. But a higher or lower z-score is not necessarily bad. It all depends on the context and the type of data.
For example, suppose students’ scores in an exam are recorded. A student with a high (positive) z-score scored above the average of all the students, which is a good z-score in this context.
As another example, investors look for companies whose data has a high number of points with z-scores within the range -1.5 to 1.5, because z-scores that are too high or too low suggest the market isn’t stable, which is not good for investment.
Yes, we can find a z-score about other central tendencies. Mode and median are the two other central tendencies, but in practice they aren’t as reliable for this purpose as the mean. The mean defines the average of the data and, more or less, always represents its central point, whereas the mode can occur at the extreme ends of the data, and the median tends to shift toward whichever side of the data has the higher density.