A t-test is any hypothesis test where the test statistic follows a Student's t-distribution under the null hypothesis. In this version of a t-test, we test whether two independent samples could plausibly have been drawn from the same population, based on the means (and variances) of those samples. More specifically, this version of a t-test is used when:
You can use this calculator to estimate:
[1] It is a common mistake to try to calculate the power of a completed study based on the observed effect size. You need to know (or estimate) the true effect size to calculate the power of a study.
When conducting a t-test with two independent samples, the following assumptions are made about your data:
[1] This does not require your underlying data to be normally distributed. With larger samples, the Central Limit Theorem typically means the sample means will be approximately normally distributed.
[2] This assumption holds if the underlying data are normally distributed, but not necessarily if you are relying on the Central Limit Theorem for normally distributed sample means.
Significance Level (α): The probability of incorrectly rejecting the null hypothesis (H0: θ = 0; where θ = μ1 - μ2), also known as the false positive rate or the Type I error rate. An α of 0.05 (5%) means that if we repeated an experiment where we drew samples from the same population many times, we would expect to incorrectly reject the null hypothesis in 5% of cases. α can also be thought of as a measure of how extreme the observed difference in sample means has to be before we reject the null hypothesis. With an α of 0.05, we would reject the null hypothesis when observing a difference that we would expect to see 5% (or less) of the time when drawing two samples from the same population.
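The repeated-experiment reading of α can be checked with a small simulation. The sketch below is illustrative only: the function names, sample size, number of repetitions, and seed are assumptions, and it uses a normal critical value in place of the exact t critical value, which is a close approximation at this sample size.

```python
import random
from math import sqrt
from statistics import NormalDist


def pooled_t_statistic(a, b):
    """Two-sample t statistic using the pooled variance."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)  # sum of squared deviations, sample 1
    ss2 = sum((x - m2) ** 2 for x in b)  # sum of squared deviations, sample 2
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    if pooled_var == 0:
        return 0.0  # degenerate case: no variation in either sample
    return (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))


def simulate_false_positive_rate(n=100, reps=2000, alpha=0.05, seed=1):
    """Share of experiments that reject H0 when both samples share one population."""
    rng = random.Random(seed)
    # Assumption: normal critical value as a stand-in for the t critical value.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        if abs(pooled_t_statistic(a, b)) > critical:
            rejections += 1
    return rejections / reps


rate = simulate_false_positive_rate()
print(f"Observed false positive rate: {rate:.3f}")
```

With α set to 0.05, the printed rate should land close to 0.05, varying slightly with the seed and the number of repetitions.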
Statistical Power (1 - β): β is the probability that we will fail to reject the null hypothesis when the samples are drawn from different populations. This is also known as the false negative rate or the Type II error rate. Statistical power, or 1 - β, is therefore the probability that we will correctly reject the null hypothesis. In the same way that we can draw samples with different means from the same population, there is also a risk that we draw samples with very similar means from two different populations.
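To make the link between α, power, effect size, and sample size concrete, here is a minimal sketch based on the normal approximation to the two-sided, two-sample test with equal group sizes. It is not necessarily the method this calculator uses; an exact calculation would use the noncentral t-distribution and gives slightly larger sample sizes. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test for a true effect size d (Cohen's d)."""
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_group / 2)  # expected value of the test statistic
    # Probability of landing in either rejection region.
    return _Z.cdf(shift - z_crit) + _Z.cdf(-shift - z_crit)


def approx_n_per_group(d, power=0.8, alpha=0.05):
    """Approximate per-group sample size needed to reach the target power."""
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    z_power = _Z.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / d) ** 2)


print(approx_n_per_group(0.5))
print(round(approx_power(0.5, 64), 3))
```

For d = 0.5, α = 0.05, and a target power of 0.8, this approximation gives 63 per group. When d is 0, the approximate power reduces to α, which matches the definition of the significance level.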
Effect Size (Cohen's d): A standardized measure of the difference in the means (can be sample or population means depending on the context). The difference in means is divided by the pooled standard deviation of the two samples/populations to provide a metric, in units of standard deviations, that can be compared across studies. It can also be used directly in some calculations instead of the means and standard deviations of the samples.
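As a sketch of the definition above, Cohen's d can be computed from two samples by dividing the difference in sample means by the pooled standard deviation. The function name and the example data are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(sample1, sample2):
    """Difference in sample means divided by the pooled standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = (
        (n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)
    ) / (n1 + n2 - 2)
    return (mean(sample1) - mean(sample2)) / sqrt(pooled_var)


# Hypothetical data, for illustration only.
treatment = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]
d = cohens_d(treatment, control)
print(round(d, 4))
```

Here the means differ by 2 and the pooled variance is 2.5, so d is 2 / sqrt(2.5), or about 1.26 standard deviations.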