Descriptive Statistics

Mean, median, mode, standard deviation, variance, and more from a dataset.

Dr. Ade Bamidele (Verified)

PhD Statistics, Fellow of the Royal Statistical Society

Statistician and data scientist with 15 years in applied statistics, probability theory and data visualisation across industry and academia.

About Descriptive Statistics

Descriptive statistics are summary measures that capture the central tendency, spread, and shape of a dataset, reducing potentially thousands of data points to a handful of numbers that tell the essential story. They form the foundation of data analysis in every quantitative discipline, from clinical trials measuring whether a drug reduces blood pressure, to A/B tests comparing conversion rates, to quality control monitoring manufacturing tolerances. Unlike inferential statistics (which draw conclusions about populations from samples), descriptive statistics simply describe what is in the data, and doing so precisely and completely is often more informative than people expect.

Three measures of central tendency each capture something different: the mean is the arithmetic average and is sensitive to outliers (a billionaire moving into a village raises the mean income dramatically); the median is the middle value when the data are sorted and is robust to outliers (it barely changes when the billionaire arrives); the mode is the most frequent value and is most useful for categorical or discrete data. The choice of which to report matters. Income and house price statistics are typically reported as medians rather than means precisely because those distributions are right-skewed, so the mean overstates what a typical person earns or pays.
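The billionaire example can be checked with a few lines of Python. This is a minimal sketch using the standard-library `statistics` module; the income figures are hypothetical values chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical annual incomes for a ten-person village (illustrative values only).
village = [28_000, 31_000, 33_000, 35_000, 36_000,
           38_000, 40_000, 42_000, 45_000, 52_000]

# The same village after one billionaire moves in.
with_billionaire = village + [1_000_000_000]

print("before:", mean(village), median(village))
print("after: ", round(mean(with_billionaire)), median(with_billionaire))
```

With these numbers the mean jumps from 38,000 to roughly 91 million, while the median only moves from 37,000 to 38,000.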

Spread is equally important. Two classes could have the same mean exam score of 65%, but one might range from 40–90% (high variance: students are at very different levels) and another from 60–70% (low variance: students are clustered around the same ability). Standard deviation quantifies this spread as the typical distance of data points from the mean (strictly, the square root of the average squared deviation); interquartile range (IQR) gives the width of the range covering the middle 50% of data and is outlier-resistant. Understanding spread is essential for interpreting means: a mean of 65% with an SD of 15 points tells a very different story than a mean of 65% with an SD of 3 points.
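To make the two-class comparison concrete, here is a small sketch with hypothetical scores for five students per class, chosen so that both classes average exactly 65. The values are assumptions for illustration, not data from this page.

```python
from statistics import mean, stdev

# Hypothetical scores: same mean, very different spread.
wide_class = [40, 55, 65, 75, 90]     # ranges from 40 to 90
narrow_class = [60, 63, 65, 67, 70]   # ranges from 60 to 70

for name, scores in [("wide", wide_class), ("narrow", narrow_class)]:
    # stdev() is the sample standard deviation (divides by n - 1).
    print(name, "mean =", mean(scores), "SD =", round(stdev(scores), 2))
```

Both classes report a mean of 65, but the sample standard deviations come out near 19 and near 4, which is the difference the mean alone hides.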

How it works

Mean (x̄) = Σx / n
Median = middle value when sorted (average of the two middle values if n is even)
Variance (s²) = Σ(x − x̄)² / (n − 1)  [sample]
Standard Deviation (s) = √variance
IQR = Q3 − Q1

Where

x̄: Sample mean, the arithmetic average of all values
n: Number of data points
Σ: Sum across all values
s²: Sample variance, the average squared deviation from the mean (divided by n − 1 for an unbiased estimate)
Q1 / Q3: 25th and 75th percentiles respectively; IQR = Q3 − Q1 covers the middle 50% of data
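The formulas above translate almost line for line into code. The following is a sketch, not this calculator's actual implementation; the function names and the `sample` flag are assumptions introduced for illustration. The IQR is left out because quartile conventions vary between tools.

```python
import math

def sample_mean(xs):
    """Mean = sum of values / number of values."""
    return sum(xs) / len(xs)

def median_value(xs):
    """Middle value when sorted; average of the two middle values if n is even."""
    s = sorted(xs)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

def variance(xs, sample=True):
    """Sum of squared deviations divided by n - 1 (sample) or n (population)."""
    m = sample_mean(xs)
    squared_deviations = sum((x - m) ** 2 for x in xs)
    divisor = len(xs) - 1 if sample else len(xs)
    return squared_deviations / divisor

def std_dev(xs, sample=True):
    """Standard deviation = square root of the variance."""
    return math.sqrt(variance(xs, sample))

# Hypothetical data for a quick check.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_mean(data), median_value(data))
print(variance(data), variance(data, sample=False))
print(std_dev(data, sample=False))
```

For this small dataset the population variance is 4 while the sample variance is about 4.57, which shows how much the choice of divisor matters when n is small.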

Worked example

Dataset: exam scores for 9 students: 45, 52, 61, 63, 63, 70, 74, 82, 94.

n = 9, Σx = 604, Mean = 604/9 ≈ 67.1.

Sorted: 45, 52, 61, 63, 63, 70, 74, 82, 94. Median = 5th value = 63.

Mode = 63 (appears twice).

Variance: the squared deviations from the mean (67.1) are (−22.1)², (−15.1)², (−6.1)², (−4.1)², (−4.1)², 2.9², 6.9², 14.9², 26.9², which sum to 488.41 + 228.01 + 37.21 + 16.81 + 16.81 + 8.41 + 47.61 + 222.01 + 723.61 ≈ 1788.9.

Sample variance = 1788.9 / (9 − 1) ≈ 223.6. Standard deviation = √223.6 ≈ 14.95.

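Quartiles depend on the convention used. Taking the median of each half with the overall median excluded: Q1 = (52 + 61)/2 = 56.5, Q3 = (74 + 82)/2 = 78, IQR = 78 − 56.5 = 21.5. Linear interpolation, another common convention, gives Q1 = 61, Q3 = 74, IQR = 13.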

Interpretation: The class averages 67.1 (mean) or 63 (median), with six of the nine students (two thirds) within one standard deviation (about 15 points) of the mean.
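The worked example can be reproduced with Python's standard-library `statistics` module (Python 3.8 or later for `quantiles`). This is a verification sketch, not the calculator's own code. Quartiles are shown under both of the module's conventions, because results depend on the method chosen and may not match a hand calculation that uses a different rule.

```python
from statistics import mean, median, mode, variance, stdev, quantiles

scores = [45, 52, 61, 63, 63, 70, 74, 82, 94]

print("mean    ", round(mean(scores), 1))      # 67.1
print("median  ", median(scores))              # 63
print("mode    ", mode(scores))                # 63
print("variance", round(variance(scores), 1))  # 223.6 (sample, n - 1)
print("SD      ", round(stdev(scores), 2))     # 14.95 (sample, n - 1)

# Two quartile conventions offered by statistics.quantiles.
q1_ex, _, q3_ex = quantiles(scores, n=4, method="exclusive")
q1_in, _, q3_in = quantiles(scores, n=4, method="inclusive")
print("exclusive:", q1_ex, q3_ex, "IQR =", q3_ex - q1_ex)  # 56.5 78.0 IQR = 21.5
print("inclusive:", q1_in, q3_in, "IQR =", q3_in - q1_in)  # 61.0 74.0 IQR = 13.0

# Count the scores within one standard deviation of the mean.
m, s = mean(scores), stdev(scores)
within_one_sd = [x for x in scores if abs(x - m) <= s]
print(len(within_one_sd), "of", len(scores))   # 6 of 9
```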

Tips to improve your result

  • 1.

    When your data has outliers or is skewed (income, house prices, time-to-complete), use median and IQR rather than mean and standard deviation. In strongly asymmetric distributions the median is usually more representative of the "typical" value than the mean.

  • 2.

    The "n − 1" in the sample standard deviation (Bessel's correction) is not an error. It corrects for the fact that dividing by n makes the variance computed from a sample systematically underestimate the population variance. Use n − 1 for sample data and n for population data. Many statistical packages, and this calculator, use n − 1 by default, but some libraries default to n, so check before comparing results.

  • 3.

    Outliers can be identified using the IQR rule: any value below Q1 − 1.5×IQR or above Q3 + 1.5×IQR is considered a mild outlier; any value below Q1 − 3×IQR or above Q3 + 3×IQR is an extreme outlier. The 1.5×IQR fences are the convention used to draw the whiskers in standard box-and-whisker plots.

  • 4.

    Standard deviation has the same units as the original data. Variance (SD²) has squared units, which is why SD is easier to interpret in context. If exam scores have SD = 15 points, that is directly meaningful; "variance = 225 points²" is not.

  • 5.

    The coefficient of variation (CV = SD / Mean × 100) allows comparing variability across datasets with different scales or units. A height dataset with mean 170 cm and SD 8 cm has CV = 4.7%; a weight dataset with mean 75 kg and SD 12 kg has CV = 16%. Weight is relatively more variable, a comparison the raw SDs cannot support because they are in different units.
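Tips 3 and 5 can be sketched in a few lines of Python. The helper names `iqr_fences` and `coefficient_of_variation` and the sample readings are hypothetical, introduced here for illustration. Quartiles use the `inclusive` method of `statistics.quantiles`, which is one convention among several, so fence values may differ slightly in other tools.

```python
from statistics import quantiles

def iqr_fences(data, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def coefficient_of_variation(sd, mean_value):
    """CV as a percentage: SD / Mean * 100."""
    return sd / mean_value * 100

# Hypothetical readings with one suspicious value.
readings = [12, 14, 15, 15, 16, 17, 18, 19, 45]

low, high = iqr_fences(readings)             # mild fences (k = 1.5)
x_low, x_high = iqr_fences(readings, k=3)    # extreme fences (k = 3)
mild = [x for x in readings if x < low or x > high]
extreme = [x for x in readings if x < x_low or x > x_high]
print("fences:", (low, high), "flagged:", mild, "extreme:", extreme)

# The height and weight figures from tip 5.
print(round(coefficient_of_variation(8, 170), 1))   # 4.7
print(round(coefficient_of_variation(12, 75), 1))   # 16.0
```

Here the fences are 10.5 and 22.5, so only the value 45 is flagged, and it also lies beyond the extreme fence of 27.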
