
The Sizzle and Pop of the Normal Distribution!

Ashish J. Edward

Updated: Oct 11, 2024

So, picture the Normal Distribution as the classic "bell curve." You've probably seen it before. It's smooth, symmetrical, and has that iconic hump in the middle. It's like the superstar of statistics, showing up everywhere from test scores to heights of people.



The Normal Distribution is all about how data is spread out. In a perfect Normal Distribution, most of the data is clustered around the mean (the average), and then it gradually tapers off as you move further away. It's like a mountain with the peak at the mean, and the slopes gently descending on both sides.





The Math Part (Don't Worry, It's Friendly!):


$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$







Where:

  • μ is the mean of the distribution.

  • σ is the standard deviation, and σ² is the variance.

  • e is the base of the natural logarithm, approximately equal to 2.71828.

  • π is the mathematical constant Pi, approximately equal to 3.14159.

But don't get bogged down by the formula; the key takeaway is that the mean tells you where the peak is, and the standard deviation tells you how wide the mountain is.
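To see the formula in action, here is a minimal Python sketch that evaluates f(x) directly. The function name `normal_pdf` is only an illustrative choice, not something from a library; Python's built-in `statistics.NormalDist` offers the same calculation through its `pdf` method.

```python
import math

def normal_pdf(x, mu, sigma):
    """Evaluate the normal density f(x) for mean mu and standard deviation sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coefficient * math.exp(exponent)

# The peak sits at the mean; a larger sigma gives a lower, wider "mountain".
peak_narrow = normal_pdf(0.0, mu=0.0, sigma=1.0)
peak_wide = normal_pdf(0.0, mu=0.0, sigma=2.0)
print(round(peak_narrow, 4), round(peak_wide, 4))
```

Doubling σ halves the height of the peak, which is the "wider mountain" idea in numbers.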


Question:

Suppose the weights of bags of apples at a local grocery store follow a Normal Distribution with a mean of 5 pounds and a standard deviation of 0.5 pounds.

What is the probability that a randomly selected bag of apples weighs between 4.5 and 5.5 pounds?


Solution:

To solve this, we'll need to calculate the Z-scores for both 4.5 and 5.5 pounds and then find the corresponding probabilities.

Step 1: Calculate the Z-scores

The Z-score formula is: Z = (X - µ)/σ

Where X is the value we are interested in, µ is the mean, and σ is the standard deviation.

For 4.5 pounds: Z_4.5 = (4.5 − 5)/0.5 = −1, and for 5.5 pounds: Z_5.5 = (5.5 − 5)/0.5 = 1

Step 2: Look Up the Probabilities : Using a Z-table (a standard statistical table that gives probabilities for different Z-scores), we can find the cumulative probabilities corresponding to these Z-scores, that is, the probability of falling at or below each one.

  • For Z = −1, the probability is approximately 0.1587, and for Z = 1, the probability is approximately 0.8413.

Step 3: Calculate the Probability for the Range : The probability that the weight falls between 4.5 and 5.5 pounds is the difference between these two probabilities:

P(4.5 < X < 5.5) = P(Z < 1) − P(Z < −1) = 0.8413 − 0.1587 = 0.6826


So, the probability that a randomly selected bag of apples weighs between 4.5 and 5.5 pounds is approximately 0.6826, or 68.26%.
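If you'd rather not reach for a Z-table, the same three steps can be sketched in Python with the standard library's `statistics.NormalDist`. The variable names are illustrative; the mean and standard deviation are the ones from the apple example.

```python
from statistics import NormalDist

weights = NormalDist(mu=5.0, sigma=0.5)  # bag weights in pounds

# Step 1: Z-scores for the two boundaries.
z_low = (4.5 - weights.mean) / weights.stdev
z_high = (5.5 - weights.mean) / weights.stdev

# Step 2: cumulative probabilities from the standard normal distribution.
standard = NormalDist()
p_low = standard.cdf(z_low)
p_high = standard.cdf(z_high)

# Step 3: probability of landing between the two boundaries.
prob_between = p_high - p_low
print(f"Z-scores: {z_low:+.1f}, {z_high:+.1f}")
print(f"P(4.5 < X < 5.5) = {prob_between:.4f}")
```

The unrounded result is about 0.6827; the 0.6826 figure comes from subtracting table values that are already rounded to four decimals.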


How to test if the data is “Normal”?


In the previous example, we assumed the data is normal. Let's say we have a data set and we want to check whether it is normally distributed. Here is how we go about it. We will use our trusted friend MS Excel to figure this out.


Let's consider a dataset of 20 exam scores: 85, 90, 88, 92, 87, 95, 89, 91, 86, 93, 94, 88, 90, 91, 92, 87, 89, 85, 86, 94


Here's how you can test for normality in Excel:


Create a Histogram

  • Copy the data into an Excel column (e.g., Column A).

  • Go to Insert > Charts > Histogram.

The “shape” of the data in Excel is not bell-shaped as in a Normal Distribution, which suggests the data is not normal. Let's look at other ways to determine whether the data is normal, using the same data set.
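For a quick look at the shape without Excel, here is a small Python sketch that prints a text chart of how often each score appears. It counts individual score values rather than using Excel's automatic bins, so it is a rough stand-in for the histogram, not a reproduction of it.

```python
from collections import Counter

scores = [85, 90, 88, 92, 87, 95, 89, 91, 86, 93,
          94, 88, 90, 91, 92, 87, 89, 85, 86, 94]

# Count how many times each score occurs.
counts = Counter(scores)

# One row per score from the lowest to the highest, one '#' per occurrence.
for score in range(min(scores), max(scores) + 1):
    print(f"{score:>3} | {'#' * counts[score]}")
```

No score appears more than twice, so the rows come out almost level instead of rising to a hump in the middle.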


Next, install the Analysis ToolPak add-in, which adds a Data Analysis tool to Excel.

File > Options > Add-ins > Select Excel Add-ins from Manage > Click Go > Select the Analysis ToolPak > Click OK

You will now see Data Analysis on the extreme right of the Excel ribbon when you select the Data tab.


Let's now look at a numerical way to check whether the data is normal, using the same data set.


Skewness and Kurtosis : Let's revisit what we learnt about skew in our tutorial on Descriptive Statistics.

Hope that was a good flashback. A skew close to 0 means the data is roughly symmetric, which is consistent with normality.


Let's talk about Kurtosis – it's a statistical measure that describes the shape of a data distribution, specifically focusing on the tails and the peak.


  • If the kurtosis is close to 3, it means the data follows a normal distribution pretty closely—think of it as the "Goldilocks zone," not too flat and not too peaky.

  • If it's greater than 3, the distribution has "heavy tails," meaning there are more outliers than in a normal distribution. This is called "leptokurtic."

  • If it's less than 3, the distribution is "light-tailed" or "platykurtic," meaning it has fewer outliers and a flatter shape than a normal distribution.


So, kurtosis is like a "tail-teller," giving you the inside scoop on how extreme values behave in your data set.


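Now, let's go back to the data set we plotted as a histogram earlier and calculate the SKEW and KURTOSIS.

Exam scores = 85, 90, 88, 92, 87, 95, 89, 91, 86, 93, 94, 88, 90, 91, 92, 87, 89, 85, 86, 94. In Excel, we paste them into cells A1:A20.

SKEW : =SKEW(A1:A20) = 0.134507741 – skew is close to 0

KURTOSIS : =KURT(A1:A20) = -1.085708654 – note that Excel's KURT returns excess kurtosis (kurtosis minus 3), so a normal distribution scores about 0 here rather than 3. A value near -1.09 is well below 0, which points to a flatter, lighter-tailed shape than a normal distribution. Hence, the data does not look normal, although 20 scores is a small sample to judge from.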
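To double-check these two numbers outside Excel, here is a Python sketch that follows the sample-adjusted formulas Microsoft documents for SKEW and KURT. The function names `excel_skew` and `excel_kurt` are made up for this illustration. KURT reports excess kurtosis, so a normal distribution scores about 0 on it.

```python
import math

scores = [85, 90, 88, 92, 87, 95, 89, 91, 86, 93,
          94, 88, 90, 91, 92, 87, 89, 85, 86, 94]

def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

def excel_skew(values):
    """Sample skewness, following the formula documented for Excel's SKEW."""
    n = len(values)
    mean = sum(values) / n
    s = sample_std(values)
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

def excel_kurt(values):
    """Sample excess kurtosis, following the formula documented for Excel's KURT."""
    n = len(values)
    mean = sum(values) / n
    s = sample_std(values)
    fourth = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

print(f"skew = {excel_skew(scores):.6f}")
print(f"excess kurtosis = {excel_kurt(scores):.6f}")
```

Both values should agree with the Excel results above to several decimal places.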


Practical problem solving using Normal distribution


Inventory Management for a Retail Store : Let's say you run a retail store and you're trying to figure out how many units of a popular product to keep in stock. You don't want to overstock and waste money, but you also don't want to understock and miss out on sales.


  • Gather Data : First, you collect sales data for the past few months. You find that you sell an average of 100 units per day with a standard deviation of 20 units.

  • Set Confidence Level : You decide you want to be 95% confident that you won't run out of stock.

  • Calculate Z-Score : For a 95% confidence level, the Z-score is approximately 1.96 (you can find this in a Z-table or use Excel's =NORM.S.INV(0.975) function). Why 0.975? A 95% confidence level here means capturing the middle 95% of the data. Since the standard normal distribution is symmetrical, that leaves 2.5% of the data in each tail, so you look up the Z-score that corresponds to 97.5% (100% - 2.5%). Note that running out of stock only involves the upper tail, so Z = 1.96 actually covers demand on about 97.5% of days; a strictly one-sided 95% target would use =NORM.S.INV(0.95), roughly 1.645.

  • Calculate Safety Stock : Safety Stock = Z-score x Standard Deviation = 1.96 x 20 = 39.2, rounded up to 40 units

  • Calculate Optimal Inventory Level = Average Sales + Safety Stock = 100 + 40 = 140 units


By using the normal distribution, you've calculated that keeping 140 units in stock for each day gives you at least 95% confidence of not running out. This means you're making a data-backed decision to optimize your inventory, balancing both cost and customer satisfaction. So, the next time you're placing an order, aim for 140 units to keep your shelves well-stocked and your customers happy, all while keeping costs in check.
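The inventory steps above can be sketched in Python as follows. This is a minimal illustration using the figures from the example; the variable names are assumptions, and `NormalDist().inv_cdf` plays the role of Excel's NORM.S.INV.

```python
import math
from statistics import NormalDist

mean_daily_sales = 100  # units per day
std_daily_sales = 20    # units per day

# Z-score for the middle 95% of the distribution (2.5% in each tail).
z = NormalDist().inv_cdf(0.975)

# Safety stock, rounded up to whole units, plus the average demand.
safety_stock = math.ceil(z * std_daily_sales)
stock_level = mean_daily_sales + safety_stock

# Share of days on which demand stays at or below the stock level.
demand = NormalDist(mean_daily_sales, std_daily_sales)
coverage = demand.cdf(stock_level)

print(f"Z = {z:.2f}, safety stock = {safety_stock}, stock level = {stock_level}")
print(f"Share of days covered = {coverage:.4f}")
```

The last line is a handy cross-check: it reports how often daily demand would stay within the chosen stock level if sales really are normally distributed.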


Managing Call Volumes in a Contact Centre : Imagine you're managing a customer service contact centre. You've noticed that during peak hours, customers are waiting too long to get connected to an agent, leading to poor customer satisfaction scores. You want to figure out how many agents you should have on duty during these peak hours to handle the call volume efficiently.


  • Gather Data : You collect data and find that you receive an average of 60 calls per hour during peak times, with a standard deviation of 10 calls.

  • Set Confidence Level : You decide you want to be 99% confident that every customer will be attended to without long wait times.

  • Calculate Z-Score : For a 99% confidence level, the Z-score is approximately 2.576 (you can find this in a Z-table or use Excel's =NORM.S.INV(0.995) function). This is the two-sided value: it leaves 0.5% of peak hours in the upper tail.

  • Calculate Safety Margin: Safety Margin = Z-score x Standard Deviation= 2.576 x 10 = 25.76, rounded up to 26 calls

  • Calculate Optimal Number of Agents = Average Calls + Safety Margin = 60 + 26 = 86 agents. This treats one agent as handling one call per hour; if your agents handle more calls than that in an hour, divide the 86 calls by that number to get the headcount.


By applying the normal distribution, you've determined that you need capacity for 86 calls per hour during peak times, which is 86 agents under the one-call-per-agent assumption, to be 99% confident of handling all incoming calls efficiently. This data-backed decision helps you improve customer satisfaction by reducing wait times, without overstaffing and incurring extra costs. So, for the next scheduling cycle, aim to have 86 agents on duty during peak hours to ensure a smooth and satisfying customer experience.
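Here is the same staffing calculation as a small Python function, so the inputs are easy to change. The function name `agents_needed` and the `calls_per_agent_per_hour` parameter are hypothetical additions for illustration; the default of 1 matches the example's implicit assumption that each agent takes one call per hour.

```python
import math
from statistics import NormalDist

def agents_needed(mean_calls, std_calls, confidence, calls_per_agent_per_hour=1.0):
    """Estimate peak-hour headcount from normally distributed hourly call volume."""
    # Two-sided Z-score, e.g. 0.99 -> inv_cdf(0.995), as in the example.
    tail = (1 - confidence) / 2
    z = NormalDist().inv_cdf(1 - tail)
    # Safety margin in calls, rounded up, added to the average volume.
    safety_margin = math.ceil(z * std_calls)
    call_capacity = mean_calls + safety_margin
    # Convert calls per hour into a whole number of agents.
    return math.ceil(call_capacity / calls_per_agent_per_hour)

print(agents_needed(60, 10, 0.99))
# Hypothetical throughput of 4 calls per agent per hour, for comparison.
print(agents_needed(60, 10, 0.99, calls_per_agent_per_hour=4))
```

With the default setting this reproduces the 86 from the example; the second call shows how quickly the headcount drops once agents can handle several calls an hour.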


Next, we cover the Poisson distribution... it's gonna get exciting now :)
