A Gentle Introduction to Student’s T-Test

Guinness is a tasty dark stout. Beyond its taste, it also contributed a fundamental tool to science and statistics. In this post, we are going to cover Student’s t-test, an important technique for drawing conclusions from small samples.

Guinness Beer by Majd Sheikh (Pexels)

Vlog

You can either continue reading this tutorial or switch to the following video. Both cover the same topics: the history of Student’s t-test, how the t-test works, and how to code it in Python from scratch.



Guinness Beers

In the past, brewers assessed the quality of beer with subjective techniques such as appearance, color and the scent of the hops. William Sealy Gosset was a scientist at the Guinness brewery in Dublin. He invented the t-test to handle quality control with small samples. That way, he could compare the yields of the barley varieties used in the brewing process.

Creator

However, Guinness did not allow its employees to publish papers. So, he published this technique in 1908 under the pseudonym “Student”. That is why the technique is still known as Student’s t-test.

Publication
William Sealy Gosset (Student)

Use-case

The International Bitterness Unit, or IBU for short, is a measure of a beer’s bitterness.

There are two types of beer yeast: lager and ale. Ale yeasts are top-fermenting and work at room temperature (15-21 °C), whereas lager yeasts are bottom-fermenting and work at lower temperatures (1-10 °C). The following video covers beer yeasts in depth.

So, do beer styles such as pilsener (a special kind of lager) and ale have a meaningful impact on IBU? If we had the IBU scores of lots of beers of each style, we could simply compare them. Can we still find out with a limited number of samples?

Suppose that you are going to drink just 10 Pilsners and 10 Ales. Can you then tell whether there is a difference in IBU between them? You can find out with the t-test!

Data set

We are going to use the data set published in the BeerStudy repo of David Stroud. It lists the IBU scores and beer styles of thousands of beers. Some records have no IBU score. We will discard them.

import pandas as pd

# dataset source: https://github.com/davestroud/BeerStudy/blob/master/Beers.csv
beers = pd.read_csv("Beers.csv")

# discard records with no IBU score
beers = beers[~beers["IBU"].isna()]

beers.head()
Beers data set

Collecting two independent samples

We are going to get the IBU scores for Czech Pilsener and American Pale Ale (APA).





sample_size = 10

sample_one = beers[beers["Style"] == "Czech Pilsener"]["IBU"].sample(sample_size, random_state=17)
sample_two = beers[beers["Style"] == "American Pale Ale (APA)"]["IBU"].sample(sample_size, random_state=17)

sample_one.plot(kind="kde", title="IBU", label="Sample 1", legend=True)
sample_two.plot(kind="kde", title="IBU", label="Sample 2", legend=True)

Both samples look roughly normally distributed. The Pilsener sample has a mean IBU of about 35, whereas the APA sample has a mean of about 43. Based on this distribution graph, pilsener and ale style beers seem to sit on different IBU scales. So, can we say that APA beers have a higher IBU score than Pilsener beers? Wait until the end of this tutorial before answering 🙂

Distributions of two beer styles

T-value

The t-test first requires computing the t-value. Its formula is shown below. Here, the x terms are the sample means, the s terms are the sample standard deviations, and the n terms are the sample sizes.

T-value
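
Written out from the definitions above, and consistent with the code in this section, the statistic for two independent samples can be expressed as follows. The subscripts 1 and 2 for the two samples are notation assumed here.

```latex
t = \frac{\left| \bar{x}_1 - \bar{x}_2 \right|}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```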

We have already stored the two independent samples as pandas series. Now, we can find the t-value.

import math

# absolute difference of the means, divided by the standard error of that difference
t_value = abs(sample_one.mean() - sample_two.mean())
t_value = t_value / math.sqrt((sample_one.std() ** 2) / sample_one.shape[0]
                              + (sample_two.std() ** 2) / sample_two.shape[0])

So, the t-value is 1.63 in our experiment.
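
If you want to try the calculation without pandas or the beer data set, here is a minimal sketch using only the standard library. The function name `t_value_of` and the two toy lists are made up for illustration; they are not taken from Beers.csv. `statistics.variance` uses the n - 1 denominator, the same convention as the default of pandas’ `Series.std()`.

```python
import math
import statistics


def t_value_of(first, second):
    """Return the absolute t-value of two independent samples."""
    # standard error of the difference between the two sample means
    standard_error = math.sqrt(
        statistics.variance(first) / len(first)
        + statistics.variance(second) / len(second)
    )
    return abs(statistics.mean(first) - statistics.mean(second)) / standard_error


# toy numbers for illustration only, not real IBU scores
first_sample = [1, 2, 3, 4, 5]
second_sample = [2, 4, 6, 8, 10]

print(round(t_value_of(first_sample, second_sample), 4))  # prints 1.8974
```

Because the absolute difference is taken, swapping the two samples gives the same t-value.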

T-test

In the next step, we compare the t-value we found against a threshold. That threshold depends on the number of samples in the two series and on the chosen significance level. We will use a t-distribution table to look it up.

# dataset source: https://github.com/serengil/tensorflow-101/blob/master/dataset/t_distribution.csv
t_dist = pd.read_csv("t_distribution.csv")
T-distribution

The degrees of freedom come from the number of samples in the two independent series. Each has 10 samples, so the degrees of freedom are 10 + 10 - 2 = 18 in our case.

degree_of_freedom = sample_one.shape[0] + sample_two.shape[0] - 2

Let’s find out the corresponding row in the t-table.

t_dist[t_dist["degrees_of_freedom"] == degree_of_freedom]
T-table rows for our case

The corresponding row lists thresholds for several significance levels, one per column. In science, p = 0.05 is the most common choice.

threshold = t_dist[t_dist["degrees_of_freedom"] == degree_of_freedom]["p_0.05"].values[0]

So, the threshold is 2.101.

Hypothesis

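We have found both the t-value and the threshold. The null hypothesis is that there is no real difference between the mean IBU of the two styles. If the t-value is less than the threshold, we cannot reject it: there is no statistically significant difference between the samples.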
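
In symbols, the decision rule can be sketched as follows. The notation is assumed here: \mu_1 and \mu_2 stand for the true mean IBU of each style, t is the value computed earlier, and t_{0.05, 18} is the table threshold for p = 0.05 and 18 degrees of freedom.

```latex
H_0 : \mu_1 = \mu_2 \qquad H_1 : \mu_1 \neq \mu_2

\text{reject } H_0 \text{ if } t \geq t_{0.05,\,18} = 2.101
```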





if t_value < threshold:
    print("there is no statistically significant difference between the samples")
else:
    print("there is a statistically significant difference between the samples")

This condition is satisfied in our case, since 1.63 is less than 2.101. So, there is no statistically significant difference between the pilsener and ale samples on IBU.
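
The same decision can also be expressed as a p-value: the probability of seeing a t-value at least this large if the two styles really had the same mean IBU. The sketch below is not part of the original study. It approximates the two-tailed p-value by integrating the density of the t-distribution with Simpson’s rule, using only the standard library. The function names `t_pdf` and `two_tailed_p_value` and the `steps` parameter are illustrative choices.

```python
import math


def t_pdf(x, dof):
    """Density of Student's t-distribution with dof degrees of freedom."""
    coefficient = math.gamma((dof + 1) / 2) / (
        math.sqrt(dof * math.pi) * math.gamma(dof / 2)
    )
    return coefficient * (1 + x * x / dof) ** (-(dof + 1) / 2)


def two_tailed_p_value(t_value, dof, steps=10_000):
    """Approximate P(|T| >= |t_value|); steps must be even for Simpson's rule."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    width = upper / steps
    total = t_pdf(0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # Simpson weights: 4 for odd, 2 for even points
        total += weight * t_pdf(i * width, dof)
    central_area = 2 * (width / 3) * total  # area between -|t| and +|t|
    return 1 - central_area


p_value = two_tailed_p_value(1.63, 18)
print(p_value > 0.05)  # True: not significant at p = 0.05
```

Comparing the p-value with 0.05 leads to the same conclusion as comparing the t-value with the table threshold, because the threshold is simply the t-value whose two-tailed p-value equals 0.05.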

Fact

The German purity law, or Reinheitsgebot, is a regulation limiting the ingredients of beer to water, barley, hops and yeast (yeast was added long after the initial law).

Hops are what increase the bitterness score of a beer. That is why IPA style beers taste so bitter: they use a large amount of hops. So, bitterness comes from hops, and the type of yeast by itself is not expected to affect IBU.

However, the distribution graph of pilsener and ale beers can be misleading, because the ale samples have a higher IBU score than the pilsener samples. The t-test shows that, with only 10 beers per style, this gap is small enough to be explained by chance, so the samples give no evidence of a difference in IBU between pilsener and ale beers.

Bonus: Dark Beers

If you enjoy this content, you might be interested in the history of dark beers: porter and stout.

Conclusion

So, we have applied Student’s t-test to a beer data set to find out whether there is a statistically significant difference between two styles of beer. Funnily enough, the t-test was developed for exactly this kind of purpose. You may enjoy dark beers more now that you know how the t-test was developed!

I pushed the source code of this study to GitHub. You can support this study by starring the repo.

Finally, my thanks to Orkun Sevinc, who taught me the t-test and its history at the Guinness brewery.

