t-test Calculator

When to use a t-test, which t-test, how to do a t-test, p-value from t-test, t-test critical values, how to use our t-test calculator, one-sample t-test, two-sample t-test, paired t-test, t-test vs z-test.

Welcome to our t-test calculator! Here you can not only easily perform one-sample t-tests , but also two-sample t-tests , as well as paired t-tests .

Do you prefer to find the p-value from t-test, or would you rather find the t-test critical values? Well, this t-test calculator can do both! 😊

What does a t-test tell you? Take a look at the text below, where we explain what actually gets tested when various types of t-tests are performed. Also, we explain when to use t-tests (in particular, whether to use the z-test vs. t-test) and what assumptions your data should satisfy for the results of a t-test to be valid. If you've ever wanted to know how to do a t-test by hand, we provide the necessary t-test formula, as well as tell you how to determine the number of degrees of freedom in a t-test.

A t-test is one of the most popular statistical tests for location , i.e., it deals with the population(s) mean value(s).

There are different types of t-tests that you can perform:

  • A one-sample t-test;
  • A two-sample t-test; and
  • A paired t-test.

In the next section , we explain when to use which. Remember that a t-test can only be used for one or two groups . If you need to compare three (or more) means, use the analysis of variance ( ANOVA ) method.

The t-test is a parametric test, meaning that your data has to fulfill some assumptions :

  • The data points are independent; AND
  • The data, at least approximately, follow a normal distribution .

If your sample doesn't fit these assumptions, you can resort to nonparametric alternatives. Visit our Mann–Whitney U test calculator or the Wilcoxon rank-sum test calculator to learn more. Other possibilities include the Wilcoxon signed-rank test or the sign test.

Your choice of t-test depends on whether you are studying one group or two groups:

One sample t-test

Choose the one-sample t-test to check if the mean of a population is equal to some pre-set hypothesized value .

The average volume of a drink sold in 0.33 l cans — is it really equal to 330 ml?

The average weight of people from a specific city — is it different from the national average?

Choose the two-sample t-test to check if the difference between the means of two populations is equal to some pre-determined value when the two samples have been chosen independently of each other.

In particular, you can use this test to check whether the two groups are different from one another .

The average difference in weight gain in two groups of people: one group was on a high-carb diet and the other on a high-fat diet.

The average difference in the results of a math test from students at two different universities.

This test is sometimes referred to as an independent samples t-test , or an unpaired samples t-test .

A paired t-test is used to investigate the change in the mean of a population before and after some experimental intervention , based on a paired sample, i.e., when each subject has been measured twice: before and after treatment.

In particular, you can use this test to check whether, on average, the treatment has had any effect on the population .

The change in student test performance before and after taking a course.

The change in blood pressure in patients before and after administering some drug.

So, you've decided which t-test to perform. These next steps will tell you how to calculate the p-value from t-test or its critical values, and then which decision to make about the null hypothesis.

Decide on the alternative hypothesis :

Use a two-tailed t-test if you only care whether the population's mean (or, in the case of two populations, the difference between the populations' means) agrees or disagrees with the pre-set value.

Use a one-tailed t-test if you want to test whether this mean (or difference in means) is greater/less than the pre-set value.

Compute your T-score value :

Formulas for the test statistic in t-tests include the sample size , as well as its mean and standard deviation . The exact formula depends on the t-test type — check the sections dedicated to each particular test for more details.

Determine the degrees of freedom for the t-test:

The degrees of freedom are the number of observations in a sample that are free to vary as we estimate statistical parameters. In the simplest case, the number of degrees of freedom equals your sample size minus the number of parameters you need to estimate . Again, the exact formula depends on the t-test you want to perform — check the sections below for details.

The degrees of freedom are essential, as they determine the distribution followed by your T-score (under the null hypothesis). If there are d degrees of freedom, then the distribution of the test statistics is the t-Student distribution with d degrees of freedom . This distribution has a shape similar to N(0,1) (bell-shaped and symmetric) but has heavier tails . If the number of degrees of freedom is large (>30), which generically happens for large samples, the t-Student distribution is practically indistinguishable from N(0,1).

💡 The t-Student distribution owes its name to William Sealy Gosset, who, in 1908, published his paper on the t-test under the pseudonym "Student". Gosset worked at the famous Guinness Brewery in Dublin, Ireland, and devised the t-test as an economical way to monitor the quality of beer. Cheers! 🍺🍺🍺

Recall that the p-value is the probability (calculated under the assumption that the null hypothesis is true) that the test statistic will produce values at least as extreme as the T-score produced for your sample . As probabilities correspond to areas under the density function, p-value from t-test can be nicely illustrated with the help of the following pictures:

p-value from t-test

The following formulae say how to calculate p-value from t-test. By cdf t,d we denote the cumulative distribution function of the t-Student distribution with d degrees of freedom:

p-value from left-tailed t-test:

p-value = cdf t,d (t score )

p-value from right-tailed t-test:

p-value = 1 − cdf t,d (t score )

p-value from two-tailed t-test:

p-value = 2 × cdf t,d (−|t score |)

or, equivalently: p-value = 2 − 2 × cdf t,d (|t score |)

However, the cdf of the t-distribution is given by a somewhat complicated formula. To find the p-value by hand, you would need to resort to statistical tables, where approximate cdf values are collected, or to specialized statistical software. Fortunately, our t-test calculator determines the p-value from t-test for you in the blink of an eye!

Recall, that in the critical values approach to hypothesis testing, you need to set a significance level, α, before computing the critical values , which in turn give rise to critical regions (a.k.a. rejection regions).

Formulas for critical values employ the quantile function of t-distribution, i.e., the inverse of the cdf :

Critical value for left-tailed t-test: cdf t,d -1 (α)

critical region:

(-∞, cdf t,d -1 (α)]

Critical value for right-tailed t-test: cdf t,d -1 (1-α)

[cdf t,d -1 (1-α), ∞)

Critical values for two-tailed t-test: ±cdf t,d -1 (1-α/2)

(-∞, -cdf t,d -1 (1-α/2)] ∪ [cdf t,d -1 (1-α/2), ∞)

To decide the fate of the null hypothesis, just check if your T-score lies within the critical region:

If your T-score belongs to the critical region , reject the null hypothesis and accept the alternative hypothesis.

If your T-score is outside the critical region , then you don't have enough evidence to reject the null hypothesis.

Choose the type of t-test you wish to perform:

A one-sample t-test (to test the mean of a single group against a hypothesized mean);

A two-sample t-test (to compare the means for two groups); or

A paired t-test (to check how the mean from the same group changes after some intervention).

Two-tailed;

Left-tailed; or

Right-tailed.

This t-test calculator allows you to use either the p-value approach or the critical regions approach to hypothesis testing!

Enter your T-score and the number of degrees of freedom . If you don't know them, provide some data about your sample(s): sample size, mean, and standard deviation, and our t-test calculator will compute the T-score and degrees of freedom for you .

Once all the parameters are present, the p-value, or critical region, will immediately appear underneath the t-test calculator, along with an interpretation!

The null hypothesis is that the population mean is equal to some value μ 0 \mu_0 μ 0 ​ .

The alternative hypothesis is that the population mean is:

  • different from μ 0 \mu_0 μ 0 ​ ;
  • smaller than μ 0 \mu_0 μ 0 ​ ; or
  • greater than μ 0 \mu_0 μ 0 ​ .

One-sample t-test formula :

  • μ 0 \mu_0 μ 0 ​ — Mean postulated in the null hypothesis;
  • n n n — Sample size;
  • x ˉ \bar{x} x ˉ — Sample mean; and
  • s s s — Sample standard deviation.

Number of degrees of freedom in t-test (one-sample) = n − 1 n-1 n − 1 .

The null hypothesis is that the actual difference between these groups' means, μ 1 \mu_1 μ 1 ​ , and μ 2 \mu_2 μ 2 ​ , is equal to some pre-set value, Δ \Delta Δ .

The alternative hypothesis is that the difference μ 1 − μ 2 \mu_1 - \mu_2 μ 1 ​ − μ 2 ​ is:

  • Different from Δ \Delta Δ ;
  • Smaller than Δ \Delta Δ ; or
  • Greater than Δ \Delta Δ .

In particular, if this pre-determined difference is zero ( Δ = 0 \Delta = 0 Δ = 0 ):

The null hypothesis is that the population means are equal.

The alternate hypothesis is that the population means are:

  • μ 1 \mu_1 μ 1 ​ and μ 2 \mu_2 μ 2 ​ are different from one another;
  • μ 1 \mu_1 μ 1 ​ is smaller than μ 2 \mu_2 μ 2 ​ ; and
  • μ 1 \mu_1 μ 1 ​ is greater than μ 2 \mu_2 μ 2 ​ .

Formally, to perform a t-test, we should additionally assume that the variances of the two populations are equal (this assumption is called the homogeneity of variance ).

There is a version of a t-test that can be applied without the assumption of homogeneity of variance: it is called a Welch's t-test . For your convenience, we describe both versions.

Two-sample t-test if variances are equal

Use this test if you know that the two populations' variances are the same (or very similar).

Two-sample t-test formula (with equal variances) :

where s p s_p s p ​ is the so-called pooled standard deviation , which we compute as:

  • Δ \Delta Δ — Mean difference postulated in the null hypothesis;
  • n 1 n_1 n 1 ​ — First sample size;
  • x ˉ 1 \bar{x}_1 x ˉ 1 ​ — Mean for the first sample;
  • s 1 s_1 s 1 ​ — Standard deviation in the first sample;
  • n 2 n_2 n 2 ​ — Second sample size;
  • x ˉ 2 \bar{x}_2 x ˉ 2 ​ — Mean for the second sample; and
  • s 2 s_2 s 2 ​ — Standard deviation in the second sample.

Number of degrees of freedom in t-test (two samples, equal variances) = n 1 + n 2 − 2 n_1 + n_2 - 2 n 1 ​ + n 2 ​ − 2 .

Two-sample t-test if variances are unequal (Welch's t-test)

Use this test if the variances of your populations are different.

Two-sample Welch's t-test formula if variances are unequal:

  • s 1 s_1 s 1 ​ — Standard deviation in the first sample;
  • s 2 s_2 s 2 ​ — Standard deviation in the second sample.

The number of degrees of freedom in a Welch's t-test (two-sample t-test with unequal variances) is very difficult to count. We can approximate it with the help of the following Satterthwaite formula :

Alternatively, you can take the smaller of n 1 − 1 n_1 - 1 n 1 ​ − 1 and n 2 − 1 n_2 - 1 n 2 ​ − 1 as a conservative estimate for the number of degrees of freedom.

🔎 The Satterthwaite formula for the degrees of freedom can be rewritten as a scaled weighted harmonic mean of the degrees of freedom of the respective samples: n 1 − 1 n_1 - 1 n 1 ​ − 1 and n 2 − 1 n_2 - 1 n 2 ​ − 1 , and the weights are proportional to the standard deviations of the corresponding samples.

As we commonly perform a paired t-test when we have data about the same subjects measured twice (before and after some treatment), let us adopt the convention of referring to the samples as the pre-group and post-group.

The null hypothesis is that the true difference between the means of pre- and post-populations is equal to some pre-set value, Δ \Delta Δ .

The alternative hypothesis is that the actual difference between these means is:

Typically, this pre-determined difference is zero. We can then reformulate the hypotheses as follows:

The null hypothesis is that the pre- and post-means are the same, i.e., the treatment has no impact on the population .

The alternative hypothesis:

  • The pre- and post-means are different from one another (treatment has some effect);
  • The pre-mean is smaller than the post-mean (treatment increases the result); or
  • The pre-mean is greater than the post-mean (treatment decreases the result).

Paired t-test formula

In fact, a paired t-test is technically the same as a one-sample t-test! Let us see why it is so. Let x 1 , . . . , x n x_1, ... , x_n x 1 ​ , ... , x n ​ be the pre observations and y 1 , . . . , y n y_1, ... , y_n y 1 ​ , ... , y n ​ the respective post observations. That is, x i , y i x_i, y_i x i ​ , y i ​ are the before and after measurements of the i -th subject.

For each subject, compute the difference, d i : = x i − y i d_i := x_i - y_i d i ​ := x i ​ − y i ​ . All that happens next is just a one-sample t-test performed on the sample of differences d 1 , . . . , d n d_1, ... , d_n d 1 ​ , ... , d n ​ . Take a look at the formula for the T-score :

Δ \Delta Δ — Mean difference postulated in the null hypothesis;

n n n — Size of the sample of differences, i.e., the number of pairs;

x ˉ \bar{x} x ˉ — Mean of the sample of differences; and

s s s  — Standard deviation of the sample of differences.

Number of degrees of freedom in t-test (paired): n − 1 n - 1 n − 1

We use a Z-test when we want to test the population mean of a normally distributed dataset, which has a known population variance . If the number of degrees of freedom is large, then the t-Student distribution is very close to N(0,1).

Hence, if there are many data points (at least 30), you may swap a t-test for a Z-test, and the results will be almost identical. However, for small samples with unknown variance, remember to use the t-test because, in such cases, the t-Student distribution differs significantly from the N(0,1)!

🙋 Have you concluded you need to perform the z-test? Head straight to our z-test calculator !

What is a t-test?

A t-test is a widely used statistical test that analyzes the means of one or two groups of data. For instance, a t-test is performed on medical data to determine whether a new drug really helps.

What are different types of t-tests?

Different types of t-tests are:

  • One-sample t-test;
  • Two-sample t-test; and
  • Paired t-test.

How to find the t value in a one sample t-test?

To find the t-value:

  • Subtract the null hypothesis mean from the sample mean value.
  • Divide the difference by the standard deviation of the sample.
  • Multiply the resultant with the square root of the sample size.

90% confidence interval

Chilled drink, sample size.

  • Biology (100)
  • Chemistry (100)
  • Construction (144)
  • Conversion (292)
  • Ecology (30)
  • Everyday life (261)
  • Finance (569)
  • Health (440)
  • Physics (509)
  • Sports (104)
  • Statistics (182)
  • Other (181)
  • Discover Omni (40)

Hypothesis Testing Calculator

Related: confidence interval calculator, type ii error.

The first step in hypothesis testing is to calculate the test statistic. The formula for the test statistic depends on whether the population standard deviation (σ) is known or unknown. If σ is known, our hypothesis test is known as a z test and we use the z distribution. If σ is unknown, our hypothesis test is known as a t test and we use the t distribution. Use of the t distribution relies on the degrees of freedom, which is equal to the sample size minus one. Furthermore, if the population standard deviation σ is unknown, the sample standard deviation s is used instead. To switch from σ known to σ unknown, click on $\boxed{\sigma}$ and select $\boxed{s}$ in the Hypothesis Testing Calculator.

Next, the test statistic is used to conduct the test using either the p-value approach or critical value approach. The particular steps taken in each approach largely depend on the form of the hypothesis test: lower tail, upper tail or two-tailed. The form can easily be identified by looking at the alternative hypothesis (H a ). If there is a less than sign in the alternative hypothesis then it is a lower tail test, greater than sign is an upper tail test and inequality is a two-tailed test. To switch from a lower tail test to an upper tail or two-tailed test, click on $\boxed{\geq}$ and select $\boxed{\leq}$ or $\boxed{=}$, respectively.

In the p-value approach, the test statistic is used to calculate a p-value. If the test is a lower tail test, the p-value is the probability of getting a value for the test statistic at least as small as the value from the sample. If the test is an upper tail test, the p-value is the probability of getting a value for the test statistic at least as large as the value from the sample. In a two-tailed test, the p-value is the probability of getting a value for the test statistic at least as unlikely as the value from the sample.

To test the hypothesis in the p-value approach, compare the p-value to the level of significance. If the p-value is less than or equal to the level of signifance, reject the null hypothesis. If the p-value is greater than the level of significance, do not reject the null hypothesis. This method remains unchanged regardless of whether it's a lower tail, upper tail or two-tailed test. To change the level of significance, click on $\boxed{.05}$. Note that if the test statistic is given, you can calculate the p-value from the test statistic by clicking on the switch symbol twice.

In the critical value approach, the level of significance ($\alpha$) is used to calculate the critical value. In a lower tail test, the critical value is the value of the test statistic providing an area of $\alpha$ in the lower tail of the sampling distribution of the test statistic. In an upper tail test, the critical value is the value of the test statistic providing an area of $\alpha$ in the upper tail of the sampling distribution of the test statistic. In a two-tailed test, the critical values are the values of the test statistic providing areas of $\alpha / 2$ in the lower and upper tail of the sampling distribution of the test statistic.

To test the hypothesis in the critical value approach, compare the critical value to the test statistic. Unlike the p-value approach, the method we use to decide whether to reject the null hypothesis depends on the form of the hypothesis test. In a lower tail test, if the test statistic is less than or equal to the critical value, reject the null hypothesis. In an upper tail test, if the test statistic is greater than or equal to the critical value, reject the null hypothesis. In a two-tailed test, if the test statistic is less than or equal the lower critical value or greater than or equal to the upper critical value, reject the null hypothesis.

When conducting a hypothesis test, there is always a chance that you come to the wrong conclusion. There are two types of errors you can make: Type I Error and Type II Error. A Type I Error is committed if you reject the null hypothesis when the null hypothesis is true. Ideally, we'd like to accept the null hypothesis when the null hypothesis is true. A Type II Error is committed if you accept the null hypothesis when the alternative hypothesis is true. Ideally, we'd like to reject the null hypothesis when the alternative hypothesis is true.

Hypothesis testing is closely related to the statistical area of confidence intervals. If the hypothesized value of the population mean is outside of the confidence interval, we can reject the null hypothesis. Confidence intervals can be found using the Confidence Interval Calculator . The calculator on this page does hypothesis tests for one population mean. Sometimes we're interest in hypothesis tests about two population means. These can be solved using the Two Population Calculator . The probability of a Type II Error can be calculated by clicking on the link at the bottom of the page.

An open portfolio of interoperable, industry leading products

The Dotmatics digital science platform provides the first true end-to-end solution for scientific R&D, combining an enterprise data platform with the most widely used applications for data analysis, biologics, flow cytometry, chemicals innovation, and more.

hypothesis testing two means calculator

Statistical analysis and graphing software for scientists

Bioinformatics, cloning, and antibody discovery software

Plan, visualize, & document core molecular biology procedures

Electronic Lab Notebook to organize, search and share data

Proteomics software for analysis of mass spec data

Modern cytometry analysis platform

Analysis, statistics, graphing and reporting of flow cytometry data

Software to optimize designs of clinical trials

T test calculator

A t test compares the means of two groups. There are several types of two sample t tests and this calculator focuses on the three most common: unpaired, welch's, and paired t tests. Directions for using the calculator are listed below, along with more information about two sample t tests and help on which is appropriate for your analysis. NOTE: This is not the same as a one sample t test; for that, you need this One sample t test calculator .

1. Choose data entry format

Caution: Changing format will erase your data.

2. Choose a test

Help me choose

3. Enter data

Help me arrange the data

4. View the results

What is a t test.

A t test is used to measure the difference between exactly two means. Its focus is on the same numeric data variable rather than counts or correlations between multiple variables. If you are taking the average of a sample of measurements, t tests are the most commonly used method to evaluate that data. It is particularly useful for small samples of less than 30 observations. For example, you might compare whether systolic blood pressure differs between a control and treated group, between men and women, or any other two groups.

This calculator uses a two-sample t test, which compares two datasets to see if their means are statistically different. That is different from a one sample t test , which compares the mean of your sample to some proposed theoretical value.

The most general formula for a t test is composed of two means (M1 and M2) and the overall standard error (SE) of the two samples:

t test formula

See our video on How to Perform a Two-sample t test for an intuitive explanation of t tests and an example.

How to use the t test calculator

  • Choose your data entry format . This will change how section 3 on the page looks. The first two options are for entering your data points themselves, either manually or by copy & paste. The last two are for entering the means for each group, along with the number of observations (N) and either the standard error of that mean (SEM) or standard deviation of the dataset (SD) standard error. If you have already calculated these summary statistics, the latter options will save you time.
  • Choose a test from the three options: Unpaired t test, Welch's unpaired t test, or Paired t test. Use our Ultimate Guide to t tests if you are unsure which is appropriate, as it includes a section on "How do I know which t test to use?". Notice not all options are available if you enter means only.
  • Enter data for the test, based on the format you chose in Step 1.
  • Click Calculate Now and View the results. All options will perform a two-tailed test .

Performing t tests? We can help.

Sign up for more information on how to perform t tests and other common statistical analyses.

Common t test confusion

In addition to the number of t test options, t tests are often confused with completely different techniques as well. Here's how to keep them all straight.

Correlation and regression are used to measure how much two factors move together. While t tests are part of regression analysis, they are focused on only one factor by comparing means in different samples.

ANOVA is used for comparing means across three or more total groups. In contrast, t tests compare means between exactly two groups.

Finally, contingency tables compare counts of observations within groups rather than a calculated average. Since t tests compare means of continuous variable between groups, contingency tables use methods such as chi square instead of t tests.

Assumptions of t tests

Because there are several versions of t tests, it's important to check the assumptions to figure out which is best suited for your project. Here are our analysis checklists for unpaired t tests and paired t tests , which are the two most common. These (and the ultimate guide to t tests ) go into detail on the basic assumptions underlying any t test:

  • Exactly two groups
  • Sample is normally distributed
  • Independent observations
  • Unequal or equal variance?
  • Paired or unpaired data?

Interpreting results

The three different options for t tests have slightly different interpretations, but they all hinge on hypothesis testing and P values. You need to select a significance threshold for your P value (often 0.05) before doing the test.

While P values can be easy to misinterpret , they are the most commonly used method to evaluate whether there is evidence of a difference between the sample of data collected and the null hypothesis. Once you have run the correct t test, look at the resulting P value. If the test result is less than your threshold, you have enough evidence to conclude that the data are significantly different.

If the test result is larger or equal to your threshold, you cannot conclude that there is a difference. However, you cannot conclude that there was definitively no difference either. It's possible that a dataset with more observations would have resulted in a different conclusion.

Depending on the test you run, you may see other statistics that were used to calculate the P value, including the mean difference, t statistic, degrees of freedom, and standard error. The confidence interval and a review of your dataset is given as well on the results page.

Graphing t tests

This calculator does not provide a chart or graph of t tests, however, graphing is an important part of analysis because it can help explain the results of the t test and highlight any potential outliers. See our Prism guide for some graphing tips for both unpaired and paired t tests.

Prism is built for customized, publication quality graphics and charts. For t tests we recommend simply plotting the datapoints themselves and the mean, or an estimation plot . Another popular approach is to use a violin plot, like those available in Prism.

For more information

Our ultimate guide to t tests includes examples, links, and intuitive explanations on the subject. It is quite simply the best place to start if you're looking for more about t tests!

If you enjoyed this calculator, you will love using Prism for analysis. Take a free 30-day trial to do more with your data, such as:

  • Clear guidance to pick the right t test and detailed results summaries
  • Custom, publication quality t test graphics, violin plots, and more
  • More t test options, including normality testing as well as nested and multiple t tests
  • Non-parametric test alternatives such as Wilcoxon, Mann-Whitney, and Kolmogorov-Smirnov

Check out our video on how to perform a t test in Prism , for an example from start to finish!

Remember, this page is just for two sample t tests. If you only have one sample, you need to use this calculator instead.

We Recommend:

Analyze, graph and present your scientific work easily with GraphPad Prism. No coding required.

T-test for two Means – Unknown Population Standard Deviations

Instructions : Use this T-Test Calculator for two Independent Means calculator to conduct a t-test for two population means (\(\mu_1\) and \(\mu_2\)), with unknown population standard deviations. This test apply when you have two-independent samples, and the population standard deviations \(\sigma_1\) and \(\sigma_2\) and not known. Please select the null and alternative hypotheses, type the significance level, the sample means, the sample standard deviations, the sample sizes, and the results of the t-test for two independent samples will be displayed for you:

hypothesis testing two means calculator

The T-test for Two Independent Samples

More about the t-test for two means so you can better interpret the output presented above: A t-test for two means with unknown population variances and two independent samples is a hypothesis test that attempts to make a claim about the population means (\(\mu_1\) and \(\mu_2\)).

More specifically, a t-test uses sample information to assess how plausible it is for the population means \(\mu_1\) and \(\mu_2\) to be equal. The test has two non-overlapping hypotheses, the null and the alternative hypothesis.

The null hypothesis is a statement about the population means, specifically the assumption of no effect, and the alternative hypothesis is the complementary hypothesis to the null hypothesis.

Properties of the two sample t-test

The main properties of a two sample t-test for two population means are:

  • Depending on our knowledge about the "no effect" situation, the t-test can be two-tailed, left-tailed or right-tailed
  • The main principle of hypothesis testing is that the null hypothesis is rejected if the test statistic obtained is sufficiently unlikely under the assumption that the null hypothesis is true
  • The p-value is the probability of obtaining sample results as extreme or more extreme than the sample results obtained, under the assumption that the null hypothesis is true
  • In a hypothesis tests there are two types of errors. Type I error occurs when we reject a true null hypothesis, and the Type II error occurs when we fail to reject a false null hypothesis

How do you compute the t-statistic for the t test for two independent samples?

The formula for a t-statistic for two population means (with two independent samples), with unknown population variances shows us how to calculate t-test with mean and standard deviation and it depends on whether the population variances are assumed to be equal or not. If the population variances are assumed to be unequal, then the formula is:

On the other hand, if the population variances are assumed to be equal, then the formula is:

Normally, the way of knowing whether the population variances must be assumed to be equal or unequal is by using an F-test for equality of variances.

With the above t-statistic, we can compute the corresponding p-value, which allows us to assess whether or not there is a statistically significant difference between two means.

Why is it called t-test for independent samples?

This is because the samples are not related with each other, in a way that the outcomes from one sample are unrelated from the other sample. If the samples are related (for example, you are comparing the answers of husbands and wives, or identical twins), you should use a t-test for paired samples instead .

What if the population standard deviations are known?

The main purpose of this calculator is for comparing two population mean when sigma is unknown for both populations. In case that the population standard deviations are known, then you should use instead this z-test for two means .

Related Calculators

Chi-Square Test for Goodness of Fit

log in to your account

Reset password.

Statology

Statistics Made Easy

Two Sample t-test Calculator

t = -1.608761

p-value (one-tailed) = 0.060963

p-value (two-tailed) = 0.121926

' src=

Published by Zach

One reply to “two sample t-test calculator”.

good work zach

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Other calculators ...

Free statistical calculators, comparison of means calculator, description.

This procedure calculates the difference between the observed means in two independent samples. A significance value (P-value) and 95% Confidence Interval (CI) of the difference is reported. The P-value is the probability of obtaining the observed difference between the samples if the null hypothesis were true. The null hypothesis is the hypothesis that the difference is 0.

Required input

Computational notes.

Comparison of means: pooled standard deviation

The standard error se of the difference between the two means is calculated as:

Comparison of means: standard error of the difference between two means

The P-value is the area of the t distribution with n 1 + n 2 − 2 degrees of freedom, that falls outside ± t (see Values of the t distribution table).

When the P-value is less than 0.05 (P<0.05), the conclusion is that the two means are significantly different. Note that in MedCalc P-values are always two-sided (or two-tailed).

How to cite this page

hypothesis testing two means calculator

P-value Calculator

Statistical significance calculator to easily calculate the p-value and determine whether the difference between two proportions or means (independent groups) is statistically significant. T-test calculator & z-test calculator to compute the Z-score or T-score for inference about absolute or relative difference (percentage change, percent effect). Suitable for analysis of simple A/B tests.

Related calculators

  • Using the p-value calculator
  • What is "p-value" and "significance level"
  • P-value formula
  • Why do we need a p-value?
  • How to interpret a statistically significant result / low p-value
  • P-value and significance for relative difference in means or proportions

    Using the p-value calculator

This statistical significance calculator allows you to perform a post-hoc statistical evaluation of a set of data when the outcome of interest is difference of two proportions (binomial data, e.g. conversion rate or event rate) or difference of two means (continuous data, e.g. height, weight, speed, time, revenue, etc.). You can use a Z-test (recommended) or a T-test to find the observed significance level (p-value statistic). The Student's T-test is recommended mostly for very small sample sizes, e.g. n < 30. In order to avoid type I error inflation which might occur with unequal variances the calculator automatically applies the Welch's T-test instead of Student's T-test if the sample sizes differ significantly or if one of them is less than 30 and the sampling ratio is different than one.

If entering proportions data, you need to know the sample sizes of the two groups as well as the number or rate of events. These can be entered as proportions (e.g. 0.10), percentages (e.g. 10%) or just raw numbers of events (e.g. 50).

If entering means data, simply copy/paste or type in the raw data, each observation separated by comma, space, new line or tab. Copy-pasting from a Google or Excel spreadsheet works fine.

The p-value calculator will output : p-value, significance level, T-score or Z-score (depending on the choice of statistical hypothesis test), degrees of freedom, and the observed difference. For means data it will also output the sample sizes, means, and pooled standard error of the mean. The p-value is for a one-sided hypothesis (one-tailed test), allowing you to infer the direction of the effect (more on one vs. two-tailed tests ). However, the probability value for the two-sided hypothesis (two-tailed p-value) is also calculated and displayed, although it should see little to no practical applications.

Warning: You must have fixed the sample size / stopping time of your experiment in advance, otherwise you will be guilty of optional stopping (fishing for significance) which will inflate the type I error of the test rendering the statistical significance level unusable. Also, you should not use this significance calculator for comparisons of more than two means or proportions, or for comparisons of two groups based on more than one metric. If a test involves more than one treatment group or more than one outcome variable you need a more advanced tool which corrects for multiple comparisons and multiple testing. This statistical calculator might help.

    What is "p-value" and "significance level"

The p-value is a heavily used test statistic that quantifies the uncertainty of a given measurement, usually as a part of an experiment, medical trial, as well as in observational studies. By definition, it is inseparable from inference through a Null-Hypothesis Statistical Test (NHST) . In it we pose a null hypothesis reflecting the currently established theory or a model of the world we don't want to dismiss without solid evidence (the tested hypothesis), and an alternative hypothesis: an alternative model of the world. For example, the statistical null hypothesis could be that exposure to ultraviolet light for prolonged periods of time has positive or neutral effects regarding developing skin cancer, while the alternative hypothesis can be that it has a negative effect on development of skin cancer.

In this framework a p-value is defined as the probability of observing the result which was observed, or a more extreme one, assuming the null hypothesis is true . In notation this is expressed as:

p(x 0 ) = Pr(d(X) > d(x 0 ); H 0 )

where x 0 is the observed data (x 1 ,x 2 ...x n ), d is a special function (statistic, e.g. calculating a Z-score), X is a random sample (X 1 ,X 2 ...X n ) from the sampling distribution of the null hypothesis. This equation is used in this p-value calculator and can be visualized as such:

p value statistical significance explained

Therefore the p-value expresses the probability of committing a type I error : rejecting the null hypothesis if it is in fact true. See below for a full proper interpretation of the p-value statistic .

Another way to think of the p-value is as a more user-friendly expression of how many standard deviations away from the normal a given observation is. For example, in a one-tailed test of significance for a normally-distributed variable like the difference of two means, a result which is 1.6448 standard deviations away (1.6448σ) results in a p-value of 0.05.

The term "statistical significance" or "significance level" is often used in conjunction to the p-value, either to say that a result is "statistically significant", which has a specific meaning in statistical inference ( see interpretation below ), or to refer to the percentage representation the level of significance: (1 - p value), e.g. a p-value of 0.05 is equivalent to significance level of 95% (1 - 0.05 * 100). A significance level can also be expressed as a T-score or Z-score, e.g. a result would be considered significant only if the Z-score is in the critical region above 1.96 (equivalent to a p-value of 0.025).

    P-value formula

There are different ways to arrive at a p-value depending on the assumption about the underlying distribution. This tool supports two such distributions: the Student's T-distribution and the normal Z-distribution (Gaussian) resulting in a T test and a Z test, respectively.

In both cases, to find the p-value start by estimating the variance and standard deviation, then derive the standard error of the mean, after which a standard score is found using the formula [2] :

test statistic

X (read "X bar") is the arithmetic mean of the population baseline or the control, μ 0 is the observed mean / treatment group mean, while σ x is the standard error of the mean (SEM, or standard deviation of the error of the mean).

When calculating a p-value using the Z-distribution the formula is Φ(Z) or Φ(-Z) for lower and upper-tailed tests, respectively. Φ is the standard normal cumulative distribution function and a Z-score is computed. In this mode the tool functions as a Z score calculator.

When using the T-distribution the formula is T n (Z) or T n (-Z) for lower and upper-tailed tests, respectively. T n is the cumulative distribution function for a T-distribution with n degrees of freedom and so a T-score is computed. Selecting this mode makes the tool behave as a T test calculator.

The population standard deviation is often unknown and is thus estimated from the samples, usually from the pooled samples variance. Knowing or estimating the standard deviation is a prerequisite for using a significance calculator. Note that differences in means or proportions are normally distributed according to the Central Limit Theorem (CLT) hence a Z-score is the relevant statistic for such a test.

    Why do we need a p-value?

If you are in the sciences, it is often a requirement by scientific journals. If you apply in business experiments (e.g. A/B testing) it is reported alongside confidence intervals and other estimates. However, what is the utility of p-values and by extension that of significance levels?

First, let us define the problem the p-value is intended to solve. People need to share information about the evidential strength of data that can be easily understood and easily compared between experiments. The picture below represents, albeit imperfectly, the results of two simple experiments, each ending up with the control with 10% event rate treatment group at 12% event rate.

why p value and significance

However, it is obvious that the evidential input of the data is not the same, demonstrating that communicating just the observed proportions or their difference (effect size) is not enough to estimate and communicate the evidential strength of the experiment. In order to fully describe the evidence and associated uncertainty , several statistics need to be communicated, for example, the sample size, sample proportions and the shape of the error distribution. Their interaction is not trivial to understand, so communicating them separately makes it very difficult for one to grasp what information is present in the data. What would you infer if told that the observed proportions are 0.1 and 0.12 (e.g. conversion rate of 10% and 12%), the sample sizes are 10,000 users each, and the error distribution is binomial?

Instead of communicating several statistics, a single statistic was developed that communicates all the necessary information in one piece: the p-value . A p-value was first derived in the late 18-th century by Pierre-Simon Laplace, when he observed data about a million births that showed an excess of boys, compared to girls. Using the calculation of significance he argued that the effect was real but unexplained at the time. We know this now to be true and there are several explanations for the phenomena coming from evolutionary biology. Statistical significance calculations were formally introduced in the early 20-th century by Pearson and popularized by Sir Ronald Fisher in his work, most notably "The Design of Experiments" (1935) [1] in which p-values were featured extensively. In business settings significance levels and p-values see widespread use in process control and various business experiments (such as online A/B tests, i.e. as part of conversion rate optimization, marketing optimization, etc.).

    How to interpret a statistically significant result / low p-value

Saying that a result is statistically significant means that the p-value is below the evidential threshold (significance level) decided for the statistical test before it was conducted. For example, if observing something which would only happen 1 out of 20 times if the null hypothesis is true is considered sufficient evidence to reject the null hypothesis, the threshold will be 0.05. In such case, observing a p-value of 0.025 would mean that the result is interpreted as statistically significant.

But what does that really mean? What inference can we make from seeing a result which was quite improbable if the null was true?

Observing any given low p-value can mean one of three things [3] :

  • There is a true effect from the tested treatment or intervention.
  • There is no true effect, but we happened to observe a rare outcome. The lower the p-value, the rarer (less likely, less probable) the outcome.
  • The statistical model is invalid (does not reflect reality).

Obviously, one can't simply jump to conclusion 1.) and claim it with one hundred percent certainty, as this would go against the whole idea of the p-value and statistical significance. In order to use p-values as a part of a decision process external factors part of the experimental design process need to be considered which includes deciding on the significance level (threshold), sample size and power (power analysis), and the expected effect size, among other things. If you are happy going forward with this much (or this little) uncertainty as is indicated by the p-value calculation suggests, then you have some quantifiable guarantees related to the effect and future performance of whatever you are testing, e.g. the efficacy of a vaccine or the conversion rate of an online shopping cart.

Note that it is incorrect to state that a Z-score or a p-value obtained from any statistical significance calculator tells how likely it is that the observation is "due to chance" or conversely - how unlikely it is to observe such an outcome due to "chance alone". P-values are calculated under specified statistical models hence 'chance' can be used only in reference to that specific data generating mechanism and has a technical meaning quite different from the colloquial one. For a deeper take on the p-value meaning and interpretation, including common misinterpretations, see: definition and interpretation of the p-value in statistics .

    P-value and significance for relative difference in means or proportions

When comparing two independent groups and the variable of interest is the relative (a.k.a. relative change, relative difference, percent change, percentage difference), as opposed to the absolute difference between the two means or proportions, the standard deviation of the variable is different which compels a different way of calculating p-values [5] . The need for a different statistical test is due to the fact that in calculating relative difference involves performing an additional division by a random variable: the event rate of the control during the experiment which adds more variance to the estimation and the resulting statistical significance is usually higher (the result will be less statistically significant). What this means is that p-values from a statistical hypothesis test for absolute difference in means would nominally meet the significance level, but they will be inadequate given the statistical inference for the hypothesis at hand.

In simulations I performed the difference in p-values was about 50% of nominal: a 0.05 p-value for absolute difference corresponded to probability of about 0.075 of observing the relative difference corresponding to the observed absolute difference. Therefore, if you are using p-values calculated for absolute difference when making an inference about percentage difference, you are likely reporting error rates which are about 50% of the actual, thus significantly overstating the statistical significance of your results and underestimating the uncertainty attached to them.

In short - switching from absolute to relative difference requires a different statistical hypothesis test. With this calculator you can avoid the mistake of using the wrong test simply by indicating the inference you want to make.

    References

1 Fisher R.A. (1935) – "The Design of Experiments", Edinburgh: Oliver & Boyd

2 Mayo D.G., Spanos A. (2010) – "Error Statistics", in P. S. Bandyopadhyay & M. R. Forster (Eds.), Philosophy of Statistics, (7, 152–198). Handbook of the Philosophy of Science . The Netherlands: Elsevier.

3 Georgiev G.Z. (2017) "Statistical Significance in A/B Testing – a Complete Guide", [online] https://blog.analytics-toolkit.com/2017/statistical-significance-ab-testing-complete-guide/ (accessed Apr 27, 2018)

4 Mayo D.G., Spanos A. (2006) – "Severe Testing as a Basic Concept in a Neyman–Pearson Philosophy of Induction", British Society for the Philosophy of Science , 57:323-357

5 Georgiev G.Z. (2018) "Confidence Intervals & P-values for Percent Change / Relative Difference", [online] https://blog.analytics-toolkit.com/2018/confidence-intervals-p-values-percent-change-relative-difference/ (accessed May 20, 2018)

Cite this calculator & page

If you'd like to cite this online calculator resource and information as provided on the page, you can use the following citation: Georgiev G.Z., "P-value Calculator" , [online] Available at: https://www.gigacalculator.com/calculators/p-value-significance-calculator.php URL [Accessed Date: 11 Apr, 2024].

Our statistical calculators have been featured in scientific papers and articles published in high-profile science journals by:

springer

The author of this tool

Georgi Z. Georgiev

     Statistical calculators

hypothesis testing two means calculator

  • Calculators
  • Descriptive Statistics
  • Merchandise
  • Which Statistics Test?

T Test Calculator for 2 Dependent Means

The t -test for dependent means (also called a repeated-measures t -test, paired samples t -test, matched pairs t -test and matched samples t -test) is used to compare the means of two sets of scores that are directly related to each other. So, for example, it could be used to test whether subjects' galvanic skin responses are different under two conditions - first, on exposure to a photograph of a beach scene; second, on exposure to a photograph of a spider.

Requirements

  • The data is normally distributed
  • Scale of measurement should be interval or ratio
  • The two sets of scores are paired or matched in some way

Null Hypothesis

H 0 : U D = U 1 - U 2 = 0, where U D equals the mean of the population of difference scores across the two measurements.

hypothesis testing two means calculator

Library homepage

  • school Campus Bookshelves
  • menu_book Bookshelves
  • perm_media Learning Objects
  • login Login
  • how_to_reg Request Instructor Account
  • hub Instructor Commons
  • Download Page (PDF)
  • Download Full Book (PDF)
  • Periodic Table
  • Physics Constants
  • Scientific Calculator
  • Reference & Cite
  • Tools expand_more
  • Readability

selected template will load here

This action is not available.

Mathematics LibreTexts

9.2: Comparing Two Independent Population Means (Hypothesis test)

  • Last updated
  • Save as PDF
  • Page ID 125735

  • The two independent samples are simple random samples from two distinct populations.
  • if the sample sizes are small, the distributions are important (should be normal)
  • if the sample sizes are large, the distributions are not important (need not be normal)

The test comparing two independent population means with unknown and possibly unequal population standard deviations is called the Aspin-Welch \(t\)-test. The degrees of freedom formula was developed by Aspin-Welch.

The comparison of two population means is very common. A difference between the two samples depends on both the means and the standard deviations. Very different means can occur by chance if there is great variation among the individual samples. In order to account for the variation, we take the difference of the sample means, \(\bar{X}_{1} - \bar{X}_{2}\), and divide by the standard error in order to standardize the difference. The result is a t-score test statistic.

Because we do not know the population standard deviations, we estimate them using the two sample standard deviations from our independent samples. For the hypothesis test, we calculate the estimated standard deviation, or standard error , of the difference in sample means , \(\bar{X}_{1} - \bar{X}_{2}\).

The standard error is:

\[\sqrt{\dfrac{(s_{1})^{2}}{n_{1}} + \dfrac{(s_{2})^{2}}{n_{2}}}\]

The test statistic ( t -score) is calculated as follows:

\[\dfrac{(\bar{x}-\bar{x}) - (\mu_{1} - \mu_{2})}{\sqrt{\dfrac{(s_{1})^{2}}{n_{1}} + \dfrac{(s_{2})^{2}}{n_{2}}}}\]

  • \(s_{1}\) and \(s_{2}\), the sample standard deviations, are estimates of \(\sigma_{1}\) and \(\sigma_{1}\), respectively.
  • \(\sigma_{1}\) and \(\sigma_{2}\) are the unknown population standard deviations.
  • \(\bar{x}_{1}\) and \(\bar{x}_{2}\) are the sample means. \(\mu_{1}\) and \(\mu_{2}\) are the population means.

The number of degrees of freedom (\(df\)) requires a somewhat complicated calculation. However, a computer or calculator calculates it easily. The \(df\) are not always a whole number. The test statistic calculated previously is approximated by the Student's t -distribution with \(df\) as follows:

Degrees of freedom

\[df = \dfrac{\left(\dfrac{(s_{1})^{2}}{n_{1}} + \dfrac{(s_{2})^{2}}{n_{2}}\right)^{2}}{\left(\dfrac{1}{n_{1}-1}\right)\left(\dfrac{(s_{1})^{2}}{n_{1}}\right)^{2} + \left(\dfrac{1}{n_{2}-1}\right)\left(\dfrac{(s_{2})^{2}}{n_{2}}\right)^{2}}\]

We can also use a conservative estimation of degree of freedom by taking DF to be the smallest of \(n_{1}-1\) and \(n_{2}-1\)

When both sample sizes \(n_{1}\) and \(n_{2}\) are five or larger, the Student's t approximation is very good. Notice that the sample variances \((s_{1})^{2}\) and \((s_{2})^{2}\) are not pooled. (If the question comes up, do not pool the variances.)

It is not necessary to compute the degrees of freedom by hand. A calculator or computer easily computes it.

Example \(\PageIndex{1}\): Independent groups

The average amount of time boys and girls aged seven to 11 spend playing sports each day is believed to be the same. A study is done and data are collected, resulting in the data in Table \(\PageIndex{1}\). Each populations has a normal distribution.

Is there a difference in the mean amount of time boys and girls aged seven to 11 play sports each day? Test at the 5% level of significance.

The population standard deviations are not known. Let g be the subscript for girls and b be the subscript for boys. Then, \(\mu_{g}\) is the population mean for girls and \(\mu_{b}\) is the population mean for boys. This is a test of two independent groups, two population means.

Random variable: \(\bar{X}_{g} - \bar{X}_{b} =\) difference in the sample mean amount of time girls and boys play sports each day.

  • \(H_{0}: \mu_{g} = \mu_{b}\)  
  • \(H_{0}: \mu_{g} - \mu_{b} = 0\)
  • \(H_{a}: \mu_{g} \neq \mu_{b}\)  
  • \(H_{a}: \mu_{g} - \mu_{b} \neq 0\)

The words "the same" tell you \(H_{0}\) has an "=". Since there are no other words to indicate \(H_{a}\), assume it says "is different." This is a two-tailed test.

Distribution for the test: Use \(t_{df}\) where \(df\) is calculated using the \(df\) formula for independent groups, two population means. Using a calculator, \(df\) is approximately 18.8462. Do not pool the variances.

Calculate the p -value using a Student's t -distribution: \(p\text{-value} = 0.0054\)

This is a normal distribution curve representing the difference in the average amount of time girls and boys play sports all day. The mean is equal to zero, and the values -1.2, 0, and 1.2 are labeled on the horizontal axis. Two vertical lines extend from -1.2 and 1.2 to the curve. The region to the left of x = -1.2 and the region to the right of x = 1.2 are shaded to represent the p-value. The area of each region is 0.0028.

\[s_{g} = 0.866\]

\[s_{b} = 1\]

\[\bar{x}_{g} - \bar{x}_{b} = 2 - 3.2 = -1.2\]

Half the \(p\text{-value}\) is below –1.2 and half is above 1.2.

Make a decision: Since \(\alpha > p\text{-value}\), reject \(H_{0}\). This means you reject \(\mu_{g} = \mu_{b}\). The means are different.

Press STAT . Arrow over to TESTS and press 4:2-SampTTest . Arrow over to Stats and press ENTER . Arrow down and enter 2 for the first sample mean, \(\sqrt{0.866}\) for Sx1, 9 for n1, 3.2 for the second sample mean, 1 for Sx2, and 16 for n2. Arrow down to μ1: and arrow to does not equal μ2. Press ENTER . Arrow down to Pooled: and No . Press ENTER . Arrow down to Calculate and press ENTER . The \(p\text{-value}\) is \(p = 0.0054\), the dfs are approximately 18.8462, and the test statistic is -3.14. Do the procedure again but instead of Calculate do Draw.

Conclusion: At the 5% level of significance, the sample data show there is sufficient evidence to conclude that the mean number of hours that girls and boys aged seven to 11 play sports per day is different (mean number of hours boys aged seven to 11 play sports per day is greater than the mean number of hours played by girls OR the mean number of hours girls aged seven to 11 play sports per day is greater than the mean number of hours played by boys).

Exercise \(\PageIndex{1}\)

Two samples are shown in Table. Both have normal distributions. The means for the two populations are thought to be the same. Is there a difference in the means? Test at the 5% level of significance.

The \(p\text{-value}\) is \(0.4125\), which is much higher than 0.05, so we decline to reject the null hypothesis. There is not sufficient evidence to conclude that the means of the two populations are not the same.

When the sum of the sample sizes is larger than \(30 (n_{1} + n_{2} > 30)\) you can use the normal distribution to approximate the Student's \(t\).

Example \(\PageIndex{2}\)

A study is done by a community group in two neighboring colleges to determine which one graduates students with more math classes. College A samples 11 graduates. Their average is four math classes with a standard deviation of 1.5 math classes. College B samples nine graduates. Their average is 3.5 math classes with a standard deviation of one math class. The community group believes that a student who graduates from college A has taken more math classes, on the average. Both populations have a normal distribution. Test at a 1% significance level. Answer the following questions.

  • Is this a test of two means or two proportions?
  • Are the populations standard deviations known or unknown?
  • Which distribution do you use to perform the test?
  • What is the random variable?
  • What are the null and alternate hypotheses? Write the null and alternate hypotheses in words and in symbols.
  • Is this test right-, left-, or two-tailed?
  • What is the \(p\text{-value}\)?
  • Do you reject or not reject the null hypothesis?
  • Student's t
  • \(\bar{X}_{A} - \bar{X}_{B}\)
  • \(H_{0}: \mu_{A} \leq \mu_{B}\) and \(H_{a}: \mu_{A} > \mu_{B}\)

alt

  • h. Do not reject.
  • i. At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that a student who graduates from college A has taken more math classes, on the average, than a student who graduates from college B.

Exercise \(\PageIndex{2}\)

A study is done to determine if Company A retains its workers longer than Company B. Company A samples 15 workers, and their average time with the company is five years with a standard deviation of 1.2. Company B samples 20 workers, and their average time with the company is 4.5 years with a standard deviation of 0.8. The populations are normally distributed.

  • Are the population standard deviations known?
  • Conduct an appropriate hypothesis test. At the 5% significance level, what is your conclusion?
  • They are unknown.
  • The \(p\text{-value} = 0.0878\). At the 5% level of significance, there is insufficient evidence to conclude that the workers of Company A stay longer with the company.

Example \(\PageIndex{3}\)

A professor at a large community college wanted to determine whether there is a difference in the means of final exam scores between students who took his statistics course online and the students who took his face-to-face statistics class. He believed that the mean of the final exam scores for the online class would be lower than that of the face-to-face class. Was the professor correct? The randomly selected 30 final exam scores from each group are listed in Table \(\PageIndex{3}\) and Table \(\PageIndex{4}\).

Is the mean of the Final Exam scores of the online class lower than the mean of the Final Exam scores of the face-to-face class? Test at a 5% significance level. Answer the following questions:

  • Are the population standard deviations known or unknown?
  • What are the null and alternative hypotheses? Write the null and alternative hypotheses in words and in symbols.
  • Is this test right, left, or two tailed?
  • At the ___ level of significance, from the sample data, there ______ (is/is not) sufficient evidence to conclude that ______.

(See the conclusion in Example, and write yours in a similar fashion)

Be careful not to mix up the information for Group 1 and Group 2!

  • Student's \(t\)
  • \(\bar{X}_{1} - \bar{X}_{2}\)
  • \(H_{0}: \mu_{1} = \mu_{2}\) Null hypothesis: the means of the final exam scores are equal for the online and face-to-face statistics classes.
  • \(H_{a}: \mu_{1} < \mu_{2}\) Alternative hypothesis: the mean of the final exam scores of the online class is less than the mean of the final exam scores of the face-to-face class.
  • left-tailed

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.0011.

Figure \(\PageIndex{3}\).

  • Reject the null hypothesis

At the 5% level of significance, from the sample data, there is (is/is not) sufficient evidence to conclude that the mean of the final exam scores for the online class is less than the mean of final exam scores of the face-to-face class.

First put the data for each group into two lists (such as L1 and L2). Press STAT. Arrow over to TESTS and press 4:2SampTTest. Make sure Data is highlighted and press ENTER. Arrow down and enter L1 for the first list and L2 for the second list. Arrow down to \(\mu_{1}\): and arrow to \(\neq \mu_{1}\) (does not equal). Press ENTER. Arrow down to Pooled: No. Press ENTER. Arrow down to Calculate and press ENTER.

Cohen's Standards for Small, Medium, and Large Effect Sizes

Cohen's \(d\) is a measure of effect size based on the differences between two means. Cohen’s \(d\), named for United States statistician Jacob Cohen, measures the relative strength of the differences between the means of two populations based on sample data. The calculated value of effect size is then compared to Cohen’s standards of small, medium, and large effect sizes.

Cohen's \(d\) is the measure of the difference between two means divided by the pooled standard deviation: \(d = \dfrac{\bar{x}_{2}-\bar{x}_{2}}{s_{\text{pooled}}}\) where \(s_{pooled} = \sqrt{\dfrac{(n_{1}-1)s^{2}_{1} + (n_{2}-1)s^{2}_{2}}{n_{1}+n_{2}-2}}\)

Example \(\PageIndex{4}\)

Calculate Cohen’s d for Example. Is the size of the effect small, medium, or large? Explain what the size of the effect means for this problem.

\(\mu_{1} = 4 s_{1} = 1.5 n_{1} = 11\)

\(\mu_{2} = 3.5 s_{2} = 1 n_{2} = 9\)

\(d = 0.384\)

The effect is small because 0.384 is between Cohen’s value of 0.2 for small effect size and 0.5 for medium effect size. The size of the differences of the means for the two colleges is small indicating that there is not a significant difference between them.

Example \(\PageIndex{5}\)

Calculate Cohen’s \(d\) for Example. Is the size of the effect small, medium or large? Explain what the size of the effect means for this problem.

\(d = 0.834\); Large, because 0.834 is greater than Cohen’s 0.8 for a large effect size. The size of the differences between the means of the Final Exam scores of online students and students in a face-to-face class is large indicating a significant difference.

Example 10.2.6

Weighted alpha is a measure of risk-adjusted performance of stocks over a period of a year. A high positive weighted alpha signifies a stock whose price has risen while a small positive weighted alpha indicates an unchanged stock price during the time period. Weighted alpha is used to identify companies with strong upward or downward trends. The weighted alpha for the top 30 stocks of banks in the northeast and in the west as identified by Nasdaq on May 24, 2013 are listed in Table and Table, respectively.

Is there a difference in the weighted alpha of the top 30 stocks of banks in the northeast and in the west? Test at a 5% significance level. Answer the following questions:

  • Calculate Cohen’s d and interpret it.
  • Student’s-t
  • \(H_{0}: \mu_{1} = \mu_{2}\) Null hypothesis: the means of the weighted alphas are equal.
  • \(H_{a}: \mu_{1} \neq \mu_{2}\) Alternative hypothesis : the means of the weighted alphas are not equal.
  • \(p\text{-value} = 0.8787\)
  • Do not reject the null hypothesis

This is a normal distribution curve with mean equal to zero. Both the right and left tails of the curve are shaded. Each tail represents 1/2(p-value) = 0.4394.

Figure \(\PageIndex{4}\).

  • \(d = 0.040\), Very small, because 0.040 is less than Cohen’s value of 0.2 for small effect size. The size of the difference of the means of the weighted alphas for the two regions of banks is small indicating that there is not a significant difference between their trends in stocks.
  • Data from Graduating Engineer + Computer Careers. Available online at www.graduatingengineer.com
  • Data from Microsoft Bookshelf .
  • Data from the United States Senate website, available online at www.Senate.gov (accessed June 17, 2013).
  • “List of current United States Senators by Age.” Wikipedia. Available online at en.Wikipedia.org/wiki/List_of...enators_by_age (accessed June 17, 2013).
  • “Sectoring by Industry Groups.” Nasdaq. Available online at www.nasdaq.com/markets/barcha...&base=industry (accessed June 17, 2013).
  • “Strip Clubs: Where Prostitution and Trafficking Happen.” Prostitution Research and Education, 2013. Available online at www.prostitutionresearch.com/ProsViolPosttrauStress.html (accessed June 17, 2013).
  • “World Series History.” Baseball-Almanac, 2013. Available online at http://www.baseball-almanac.com/ws/wsmenu.shtml (accessed June 17, 2013).

Two population means from independent samples where the population standard deviations are not known

  • Random Variable: \(\bar{X}_{1} - \bar{X}_{2} =\) the difference of the sampling means
  • Distribution: Student's t -distribution with degrees of freedom (variances not pooled)

Formula Review

Standard error: \[SE = \sqrt{\dfrac{(s_{1}^{2})}{n_{1}} + \dfrac{(s_{2}^{2})}{n_{2}}}\]

Test statistic ( t -score): \[t = \dfrac{(\bar{x}_{1}-\bar{x}_{2}) - (\mu_{1}-\mu_{2})}{\sqrt{\dfrac{(s_{1})^{2}}{n_{1}} + \dfrac{(s_{2})^{2}}{n_{2}}}}\]

Degrees of freedom:

\[df = \dfrac{\left(\dfrac{(s_{1})^{2}}{n_{1}} + \dfrac{(s_{2})^{2}}{n_{2}}\right)^{2}}{\left(\dfrac{1}{n_{1} - 1}\right)\left(\dfrac{(s_{1})^{2}}{n_{1}}\right)^{2}} + \left(\dfrac{1}{n_{2} - 1}\right)\left(\dfrac{(s_{2})^{2}}{n_{2}}\right)^{2}\]

  • \(s_{1}\) and \(s_{2}\) are the sample standard deviations, and n 1 and n 2 are the sample sizes.
  • \(x_{1}\) and \(x_{2}\) are the sample means.

OR use the   DF to be the smallest of \(n_{1}-1\) and \(n_{2}-1\)

Cohen’s \(d\) is the measure of effect size:

\[d = \dfrac{\bar{x}_{1} - \bar{x}_{2}}{s_{\text{pooled}}}\]

\[s_{\text{pooled}} = \sqrt{\dfrac{(n_{1} - 1)s^{2}_{1} + (n_{2} - 1)s^{2}_{2}}{n_{1} + n_{2} - 2}}\]

  • The domain of the random variable (RV) is not necessarily a numerical set; the domain may be expressed in words; for example, if \(X =\) hair color, then the domain is {black, blond, gray, green, orange}.
  • We can tell what specific value x of the random variable \(X\) takes only after performing the experiment.

ListenData

Two Sample T-Test Calculator

Two sample t-test is used to check whether the means of two groups are significantly different from each other. For example, if you want to see if mean weight of males and females have statistically significant difference between them.

Independent T-test assumes that the two samples have equal variances. Welch's t-test is used if you have unequal variances.

Either enter raw data or summary information to calculate two sample t-test. You can directly paste data from MS Excel.

Enter Raw Data

Enter Summary Data

  • Scores are normally distributed within each of the two groups
  • Each score is sampled independently and randomly.
  • Data must be continuous

Want to create or adapt books like this? Learn more about how Pressbooks supports open publishing practices.

Inference for Comparing 2 Population Means (HT for 2 Means, independent samples)

More of the good stuff! We will need to know how to label the null and alternative hypothesis, calculate the test statistic, and then reach our conclusion using the critical value method or the p-value method.

The Test Statistic for a Test of 2 Means from Independent Samples:

[latex]t = \displaystyle \frac{(\bar{x_1} - \bar{x_2}) - (\mu_1 - \mu_2)}{\sqrt{\displaystyle \frac{s_1^2}{n_1} + \displaystyle \frac{s_2^2}{n_2}}}[/latex]

What the different symbols mean:

[latex]n_1[/latex] is the sample size for the first group

[latex]n_2[/latex] is the sample size for the second group

[latex]df[/latex], the degrees of freedom, is the smaller of [latex]n_1 - 1[/latex] and [latex]n_2 - 1[/latex]

[latex]\mu_1[/latex] is the population mean from the first group

[latex]\mu_2[/latex] is the population mean from the second group

[latex]\bar{x_1}[/latex] is the sample mean for the first group

[latex]\bar{x_2}[/latex] is the sample mean for the second group

[latex]s_1[/latex] is the sample standard deviation for the first group

[latex]s_2[/latex] is the sample standard deviation for the second group

[latex]\alpha[/latex] is the significance level , usually given within the problem, or if not given, we assume it to be 5% or 0.05

Assumptions when conducting a Test for 2 Means from Independent Samples:

  • We do not know the population standard deviations, and we do not assume they are equal
  • The two samples or groups are independent
  • Both samples are simple random samples
  • Both populations are Normally distributed OR both samples are large ([latex]n_1 > 30[/latex] and [latex]n_2 > 30[/latex])

Steps to conduct the Test for 2 Means from Independent Samples:

  • Identify all the symbols listed above (all the stuff that will go into the formulas). This includes [latex]n_1[/latex] and [latex]n_2[/latex], [latex]df[/latex], [latex]\mu_1[/latex] and [latex]\mu_2[/latex], [latex]\bar{x_1}[/latex] and [latex]\bar{x_2}[/latex], [latex]s_1[/latex] and [latex]s_2[/latex], and [latex]\alpha[/latex]
  • Identify the null and alternative hypotheses
  • Calculate the test statistic, [latex]t = \displaystyle \frac{(\bar{x_1} - \bar{x_2}) - (\mu_1 - \mu_2)}{\sqrt{\displaystyle \frac{s_1^2}{n_1} + \displaystyle \frac{s_2^2}{n_2}}}[/latex]
  • Find the critical value(s) OR the p-value OR both
  • Apply the Decision Rule
  • Write up a conclusion for the test

Example 1: Study on the effectiveness of stents for stroke patients [1]

In this study , researchers randomly assigned stroke patients to two groups: one received the current standard care (control) and the other received a stent surgery in addition to the standard care (stent treatment). If the stents work, the treatment group should have a lower average disability score . Do the results give convincing statistical evidence that the stent treatment reduces the average disability from stroke?

Since we are being asked for convincing statistical evidence, a hypothesis test should be conducted. In this case, we are dealing with averages from two samples or groups (the patients with stent treatment and patients receiving the standard care), so we will conduct a Test of 2 Means.

  • [latex]n_1 = 98[/latex] is the sample size for the first group
  • [latex]n_2 = 93[/latex] is the sample size for the second group
  • [latex]df[/latex], the degrees of freedom, is the smaller of [latex]98 - 1 = 97[/latex] and [latex]93 - 1 = 92[/latex], so [latex]df = 92[/latex]
  • [latex]\bar{x_1} = 2.26[/latex] is the sample mean for the first group
  • [latex]\bar{x_2} = 3.23[/latex] is the sample mean for the second group
  • [latex]s_1 = 1.78[/latex] is the sample standard deviation for the first group
  • [latex]s_2 = 1.78[/latex] is the sample standard deviation for the second group
  • [latex]\alpha = 0.05[/latex] (we were not told a specific value in the problem, so we are assuming it is 5%)
  • One additional assumption we extend from the null hypothesis is that [latex]\mu_1 - \mu_2 = 0[/latex]; this means that in our formula, those variables cancel out
  • [latex]H_{0}: \mu_1 = \mu_2[/latex]
  • [latex]H_{A}: \mu_1 < \mu_2[/latex]
  • [latex]t = \displaystyle \frac{(\bar{x_1} - \bar{x_2}) - (\mu_1 - \mu_2)}{\sqrt{\displaystyle \frac{s_1^2}{n_1} + \displaystyle \frac{s_2^2}{n_2}}} = \displaystyle \frac{(2.26 - 3.23) - 0)}{\sqrt{\displaystyle \frac{1.78^2}{98} + \displaystyle \frac{1.78^2}{93}}} = -3.76[/latex]
  • StatDisk : We can conduct this test using StatDisk. The nice thing about StatDisk is that it will also compute the test statistic. From the main menu above we click on Analysis, Hypothesis Testing, and then Mean Two Independent Samples. From there enter the 0.05 significance, along with the specific values as outlined in the picture below in Step 2. Notice the alternative hypothesis is the [latex]<[/latex] option. Enter the sample size, mean, and standard deviation for each group, and make sure that unequal variances is selected. Now we click on Evaluate. If you check the values, the test statistic is reported in the Step 3 display, as well as the P-Value of 0.00011.
  • Applying the Decision Rule: We now compare this to our significance level, which is 0.05. If the p-value is smaller or equal to the alpha level, we have enough evidence for our claim, otherwise we do not. Here, [latex]p-value = 0.00011[/latex], which is definitely smaller than [latex]\alpha = 0.05[/latex], so we have enough evidence for the alternative hypothesis…but what does this mean?
  • Conclusion: Because our p-value  of [latex]0.00011[/latex] is less than our [latex]\alpha[/latex] level of [latex]0.05[/latex], we reject [latex]H_{0}[/latex]. We have convincing statistical evidence that the stent treatment reduces the average disability from stroke.

Example 2: Home Run Distances

In 1998, Sammy Sosa and Mark McGwire (2 players in Major League Baseball) were on pace to set a new home run record. At the end of the season McGwire ended up with 70 home runs, and Sosa ended up with 66. The home run distances were recorded and compared (sometimes a player’s home run distance is used to measure their “power”). Do the results give convincing statistical evidence that the home run distances are different from each other? Who would you say “hit the ball farther” in this comparison?

Since we are being asked for convincing statistical evidence, a hypothesis test should be conducted. In this case, we are dealing with averages from two samples or groups (the home run distances), so we will conduct a Test of 2 Means.

  • [latex]n_1 = 70[/latex] is the sample size for the first group
  • [latex]n_2 = 66[/latex] is the sample size for the second group
  • [latex]df[/latex], the degrees of freedom, is the smaller of [latex]70 - 1 = 69[/latex] and [latex]66 - 1 = 65[/latex], so [latex]df = 65[/latex]
  • [latex]\bar{x_1} = 418.5[/latex] is the sample mean for the first group
  • [latex]\bar{x_2} = 404.8[/latex] is the sample mean for the second group
  • [latex]s_1 = 45.5[/latex] is the sample standard deviation for the first group
  • [latex]s_2 = 35.7[/latex] is the sample standard deviation for the second group
  • [latex]H_{A}: \mu_1 \neq \mu_2[/latex]
  • [latex]t = \displaystyle \frac{(\bar{x_1} - \bar{x_2}) - (\mu_1 - \mu_2)}{\sqrt{\displaystyle \frac{s_1^2}{n_1} + \displaystyle \frac{s_2^2}{n_2}}} = \displaystyle \frac{(418.5 - 404.8) - 0)}{\sqrt{\displaystyle \frac{45.5^2}{70} + \displaystyle \frac{35.7^2}{65}}} = 1.95[/latex]
  • StatDisk : We can conduct this test using StatDisk. The nice thing about StatDisk is that it will also compute the test statistic. From the main menu above we click on Analysis, Hypothesis Testing, and then Mean Two Independent Samples. From there enter the 0.05 significance, along with the specific values as outlined in the picture below in Step 2. Notice the alternative hypothesis is the [latex]\neq[/latex] option. Enter the sample size, mean, and standard deviation for each group, and make sure that unequal variances is selected. Now we click on Evaluate. If you check the values, the test statistic is reported in the Step 3 display, as well as the P-Value of 0.05221.
  • Applying the Decision Rule: We now compare this to our significance level, which is 0.05. If the p-value is smaller or equal to the alpha level, we have enough evidence for our claim, otherwise we do not. Here, [latex]p-value = 0.05221[/latex], which is larger than [latex]\alpha = 0.05[/latex], so we do not have enough evidence for the alternative hypothesis…but what does this mean?
  • Conclusion: Because our p-value  of [latex]0.05221[/latex] is larger than our [latex]\alpha[/latex] level of [latex]0.05[/latex], we fail to reject [latex]H_{0}[/latex]. We do not have convincing statistical evidence that the home run distances are different.
  • Follow-up commentary: But what does this mean? There actually was a difference, right? If we take McGwire’s average and subtract Sosa’s average we get a difference of 13.7. What this result indicates is that the difference is not statistically significant; it could be due more to random chance than something meaningful. Other factors, such as sample size, could also be a determining factor (with a larger sample size, the difference may have been more meaningful).
  • Adapted from the Skew The Script curriculum ( skewthescript.org ), licensed under CC BY-NC-Sa 4.0 ↵

Basic Statistics Copyright © by Allyn Leon is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Share This Book

Teach yourself statistics

Hypothesis Test: Difference Between Means

This lesson explains how to conduct a hypothesis test for the difference between two means. The test procedure, called the two-sample t-test , is appropriate when the following conditions are met:

  • The sampling method for each sample is simple random sampling .
  • The samples are independent .
  • Each population is at least 20 times larger than its respective sample .
  • The population distribution is normal.
  • The population data are symmetric , unimodal , without outliers , and the sample size is 15 or less.
  • The population data are slightly skewed , unimodal, without outliers, and the sample size is 16 to 40.
  • The sample size is greater than 40, without outliers.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

State the Hypotheses

Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis . The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.

The table below shows three sets of null and alternative hypotheses. Each makes a statement about the difference d between the mean of one population μ 1 and the mean of another population μ 2 . (In the table, the symbol ≠ means " not equal to ".)

The first set of hypotheses (Set 1) is an example of a two-tailed test , since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis. The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests , since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis.

When the null hypothesis states that there is no difference between the two population means (i.e., d = 0), the null and alternative hypothesis are often stated in the following form.

H o : μ 1 = μ 2

H a : μ 1 ≠ μ 2

Formulate an Analysis Plan

The analysis plan describes how to use sample data to accept or reject the null hypothesis. It should specify the following elements.

  • Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
  • Test method. Use the two-sample t-test to determine whether the difference between means found in the sample is significantly different from the hypothesized difference between means.

Analyze Sample Data

Using sample data, find the standard error, degrees of freedom, test statistic, and the P-value associated with the test statistic.

SE = sqrt[ (s 1 2 /n 1 ) + (s 2 2 /n 2 ) ]

DF = (s 1 2 /n 1 + s 2 2 /n 2 ) 2 / { [ (s 1 2 / n 1 ) 2 / (n 1 - 1) ] + [ (s 2 2 / n 2 ) 2 / (n 2 - 1) ] }

t = [ ( x 1 - x 2 ) - d ] / SE

  • P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a t statistic, use the t Distribution Calculator to assess the probability associated with the t statistic, having the degrees of freedom computed above. (See sample problems at the end of this lesson for examples of how this is done.)

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level , and rejecting the null hypothesis when the P-value is less than the significance level.

Test Your Understanding

In this section, two sample problems illustrate how to conduct a hypothesis test of a difference between mean scores. The first problem involves a two-tailed test; the second problem, a one-tailed test.

Problem 1: Two-Tailed Test

Within a school district, students were randomly assigned to one of two Math teachers - Mrs. Smith and Mrs. Jones. After the assignment, Mrs. Smith had 30 students, and Mrs. Jones had 25 students.

At the end of the year, each class took the same standardized test. Mrs. Smith's students had an average test score of 78, with a standard deviation of 10; and Mrs. Jones' students had an average test score of 85, with a standard deviation of 15.

Test the hypothesis that Mrs. Smith and Mrs. Jones are equally effective teachers. Use a 0.10 level of significance. (Assume that student performance is approximately normal.)

Solution: The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:

State the hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.

Null hypothesis: μ 1 - μ 2 = 0

Alternative hypothesis: μ 1 - μ 2 ≠ 0

  • Formulate an analysis plan . For this analysis, the significance level is 0.10. Using sample data, we will conduct a two-sample t-test of the null hypothesis.

SE = sqrt[(s 1 2 /n 1 ) + (s 2 2 /n 2 )]

SE = sqrt[(10 2 /30) + (15 2 /25] = sqrt(3.33 + 9)

SE = sqrt(12.33) = 3.51

DF = (10 2 /30 + 15 2 /25) 2 / { [ (10 2 / 30) 2 / (29) ] + [ (15 2 / 25) 2 / (24) ] }

DF = (3.33 + 9) 2 / { [ (3.33) 2 / (29) ] + [ (9) 2 / (24) ] } = 152.03 / (0.382 + 3.375) = 152.03/3.757 = 40.47

t = [ ( x 1 - x 2 ) - d ] / SE = [ (78 - 85) - 0 ] / 3.51 = -7/3.51 = -1.99

where s 1 is the standard deviation of sample 1, s 2 is the standard deviation of sample 2, n 1 is the size of sample 1, n 2 is the size of sample 2, x 1 is the mean of sample 1, x 2 is the mean of sample 2, d is the hypothesized difference between the population means, and SE is the standard error.

Since we have a two-tailed test , the P-value is the probability that a t statistic having 40 degrees of freedom is more extreme than -1.99; that is, less than -1.99 or greater than 1.99.

We use the t Distribution Calculator to find P(t < -1.99) is about 0.027.

  • If you enter 1.99 as the sample mean in the t Distribution Calculator, you will find the that the P(t ≤ 1.99) is about 0.973. Therefore, P(t > 1.99) is 1 minus 0.973 or 0.027. Thus, the P-value = 0.027 + 0.027 = 0.054.
  • Interpret results . Since the P-value (0.054) is less than the significance level (0.10), we cannot accept the null hypothesis.

Note: If you use this approach on an exam, you may also want to mention why this approach is appropriate. Specifically, the approach is appropriate because the sampling method was simple random sampling, the samples were independent, the sample size was much smaller than the population size, and the samples were drawn from a normal population.

Problem 2: One-Tailed Test

The Acme Company has developed a new battery. The engineer in charge claims that the new battery will operate continuously for at least 7 minutes longer than the old battery.

To test the claim, the company selects a simple random sample of 100 new batteries and 100 old batteries. The old batteries run continuously for 190 minutes with a standard deviation of 20 minutes; the new batteries, 200 minutes with a standard deviation of 40 minutes.

Test the engineer's claim that the new batteries run at least 7 minutes longer than the old. Use a 0.05 level of significance. (Assume that there are no outliers in either sample.)

Null hypothesis: μ 1 - μ 2 <= 7

Alternative hypothesis: μ 1 - μ 2 > 7

where μ 1 is battery life for the new battery, and μ 2 is battery life for the old battery.

  • Formulate an analysis plan . For this analysis, the significance level is 0.05. Using sample data, we will conduct a two-sample t-test of the null hypothesis.

SE = sqrt[(40 2 /100) + (20 2 /100]

SE = sqrt(16 + 4) = 4.472

DF = (40 2 /100 + 20 2 /100) 2 / { [ (40 2 / 100) 2 / (99) ] + [ (20 2 / 100) 2 / (99) ] }

DF = (20) 2 / { [ (16) 2 / (99) ] + [ (2) 2 / (99) ] } = 400 / (2.586 + 0.162) = 145.56

t = [ ( x 1 - x 2 ) - d ] / SE = [(200 - 190) - 7] / 4.472 = 3/4.472 = 0.67

where s 1 is the standard deviation of sample 1, s 2 is the standard deviation of sample 2, n 1 is the size of sample 1, n 2 is the size of sample 2, x 1 is the mean of sample 1, x 2 is the mean of sample 2, d is the hypothesized difference between population means, and SE is the standard error.

Here is the logic of the analysis: Given the alternative hypothesis (μ 1 - μ 2 > 7), we want to know whether the observed difference in sample means is big enough (i.e., sufficiently greater than 7) to cause us to reject the null hypothesis.

Interpret results . Suppose we replicated this study many times with different samples. If the true difference in population means were actually 7, we would expect the observed difference in sample means to be 10 or less in 75% of our samples. And we would expect to find an observed difference to be more than 10 in 25% of our samples Therefore, the P-value in this analysis is 0.25.

Library homepage

  • school Campus Bookshelves
  • menu_book Bookshelves
  • perm_media Learning Objects
  • login Login
  • how_to_reg Request Instructor Account
  • hub Instructor Commons
  • Download Page (PDF)
  • Download Full Book (PDF)
  • Periodic Table
  • Physics Constants

Scientific Calculator

  • Reference & Cite
  • Tools expand_more
  • Readability

selected template will load here

This action is not available.

Statistics LibreTexts

27: Hypothesis Test for a Population Mean Given Statistics Calculator

  • Last updated
  • Save as PDF
  • Page ID 8349

  • Larry Green
  • Lake Tahoe Community College

hypothesis test for a population mean given statistics calculator

Select if the population standard deviation, \(\sigma\), is known or unknown. Then fill in the standard deviation, the sample mean, \(\bar{x}\), the sample size, \(n\), the hypothesized population mean \(\mu_0\), and indicate if the test is left tailed, <, right tailed, >, or two tailed, \(\neq\).  Then hit "Calculate" and the test statistic and p-Value will be calculated for you.

Hypothesis testing for the mean Calculator

hypothesis testing two means calculator

State the null and alternative hypothesis:

Calculate our test statistic z:, determine rejection region:, you have 2 free calculationss remaining, what is the answer, how does the hypothesis testing for the mean calculator work, what 1 formula is used for the hypothesis testing for the mean calculator, what 7 concepts are covered in the hypothesis testing for the mean calculator.

  • alternative hypothesis
  • hypothesis testing for the mean
  • null hypothesis
  • test statistic

hypothesis testing two means calculator

An Automated Online Math Tutor serving 8.1 million parents and students in 235 countries and territories.

hypothesis testing two means calculator

Our Services

  • All Courses
  • Our Experts

Top Categories

  • Development
  • Photography
  • Refer a Friend
  • Festival Offer
  • Limited Membership
  • Scholarship

Scroll to Top

IMAGES

  1. Two Sample T-Test (Two Means)

    hypothesis testing two means calculator

  2. Hypothesis Testing with Two Independent Means

    hypothesis testing two means calculator

  3. Hypothesis Testing Formula

    hypothesis testing two means calculator

  4. Hypothesis Testing Population Mean

    hypothesis testing two means calculator

  5. Two proportion hypothesis test calculator

    hypothesis testing two means calculator

  6. Student's t-Test (t0, te & H0) Calculator, Formulas & Examples

    hypothesis testing two means calculator

VIDEO

  1. hypothesis testing two means part II

  2. Hypothesis Testing-Two Means Examples 2016

  3. Hypothesis Testing Two Means DEPENDENT Samples

  4. Hypothesis Testing Two Sample Test Chapter 10

  5. Hypothesis Testing, Difference of Means, Sigma Unknown, Critical Region(Traditional) Method

  6. COSM

COMMENTS

  1. Difference in Means Hypothesis Test Calculator

    Use the calculator below to analyze the results of a difference in sample means hypothesis test. Enter your sample means, sample standard deviations, sample sizes, hypothesized difference in means, test type, and significance level to calculate your results. You will find a description of how to conduct a two sample t-test below the calculator.

  2. T-Test Calculator for 2 Independent Means

    T-Test Calculator for 2 Independent Means. This simple t -test calculator, provides full details of the t-test calculation, including sample mean, sum of squares and standard deviation. A t -test is used when you're looking at a numerical variable - for example, height - and then comparing the averages of two separate populations or groups (e.g ...

  3. Two Population Calculator with Steps

    The calculator above computes confidence intervals and hypothesis tests for the difference between two population means. The simpler version of this is confidence intervals and hypothesis tests for a single population mean. For confidence intervals about a single population mean, visit the Confidence Interval Calculator.

  4. t-test Calculator

    A two-sample t-test (to compare the means for two groups); or. A paired t-test (to check how the mean from the same group changes after some intervention). Decide on the alternative hypothesis: Two-tailed; Left-tailed; or. Right-tailed. This t-test calculator allows you to use either the p-value approach or the critical regions approach to ...

  5. Hypothesis Testing Calculator with Steps

    Hypothesis Testing Calculator. The first step in hypothesis testing is to calculate the test statistic. The formula for the test statistic depends on whether the population standard deviation (σ) is known or unknown. If σ is known, our hypothesis test is known as a z test and we use the z distribution. If σ is unknown, our hypothesis test is ...

  6. T-Test Calculator for 2 Independent Means

    Enter the values for your two treatment conditions into the text boxes below, either one score per line or as a comma delimited list. Select your significance level and whether your hypothesis is one or two-tailed. Then give your data a final check, and press the "Calculate T and P Values" button. Treatment 1 ( X) Treatment 2 ( X) Significance ...

  7. T test calculator

    A t test compares the means of two groups. There are several types of two sample t tests and this calculator focuses on the three most common: unpaired, welch's, and paired t tests. Directions for using the calculator are listed below, along with more information about two sample t tests and help on which is appropriate for your analysis. NOTE: This is not the same as a one sample t test; for ...

  8. T-test for two Means

    The T-test for Two Independent Samples More about the t-test for two means so you can better interpret the output presented above: A t-test for two means with unknown population variances and two independent samples is a hypothesis test that attempts to make a claim about the population means (\(\mu_1\) and \(\mu_2\)).

  9. Two Sample t-test Calculator

    If this is not the case, you should instead use the Welch's t-test calculator. To perform a two sample t-test, simply fill in the information below and then click the "Calculate" button. Enter raw data Enter summary data. Sample 1. 301, 298, 295, 297, 304, 305, 309, 298, 291, 299, 293, 304. Sample 2.

  10. MedCalc's Comparison of means calculator

    Description. This procedure calculates the difference between the observed means in two independent samples. A significance value (P-value) and 95% Confidence Interval (CI) of the difference is reported. The P-value is the probability of obtaining the observed difference between the samples if the null hypothesis were true.

  11. Hypothesis Test Calculator

    Calculation Example: There are six steps you would follow in hypothesis testing: Formulate the null and alternative hypotheses in three different ways: H 0: θ = θ 0 v e r s u s H 1: θ ≠ θ 0. H 0: θ ≤ θ 0 v e r s u s H 1: θ > θ 0. H 0: θ ≥ θ 0 v e r s u s H 1: θ < θ 0.

  12. 10.29: Hypothesis Test for a Difference in Two Population Means (1 of 2

    Step 1: Determine the hypotheses. The hypotheses for a difference in two population means are similar to those for a difference in two population proportions. The null hypothesis, H 0, is again a statement of "no effect" or "no difference.". H 0: μ 1 - μ 2 = 0, which is the same as H 0: μ 1 = μ 2. The alternative hypothesis, H a ...

  13. P-value Calculator & Statistical Significance Calculator

    Statistical significance calculator to easily calculate the p-value and determine whether the difference between two proportions or means (independent groups) is statistically significant. T-test calculator & z-test calculator to compute the Z-score or T-score for inference about absolute or relative difference (percentage change, percent effect).

  14. T-Test Calculator for 2 Dependent Means

    Scale of measurement should be interval or ratio. The two sets of scores are paired or matched in some way. Null Hypothesis. H 0: U D = U 1 - U 2 = 0, where U D equals the mean of the population of difference scores across the two measurements. Equation. A T-test calculator that compares 2 dependent population means for statistical significance.

  15. 32: Two Independent Samples With Statistics Calculator

    Two Independent Samples with statistics Calculator. Enter in the statistics, the tail type and the confidence level and hit Calculate and the test statistic, t, the p-value, p, the confidence interval's lower bound, LB, and the upper bound, UB will be shown. Be sure to enter the confidence level as a decimal, e.g., 95% has a CL of 0.95. Back to ...

  16. 9.2: Comparing Two Independent Population Means (Hypothesis test)

    This is a test of two independent groups, two population means. Random variable: ˉXg − ˉXb = difference in the sample mean amount of time girls and boys play sports each day. H0: μg = μb. H0: μg − μb = 0. Ha: μg ≠ μb. Ha: μg − μb ≠ 0.

  17. Two Sample T-Test Calculator

    Two Sample T-Test Calculator. Two sample t-test is used to check whether the means of two groups are significantly different from each other. For example, if you want to see if mean weight of males and females have statistically significant difference between them. Independent T-test assumes that the two samples have equal variances.

  18. Hypothesis Testing: 2 Means (Independent Samples)

    Since we are being asked for convincing statistical evidence, a hypothesis test should be conducted. In this case, we are dealing with averages from two samples or groups (the home run distances), so we will conduct a Test of 2 Means. n1 = 70 n 1 = 70 is the sample size for the first group. n2 = 66 n 2 = 66 is the sample size for the second group.

  19. Hypothesis Test: Difference in Means

    The first step is to state the null hypothesis and an alternative hypothesis. Null hypothesis: μ 1 - μ 2 = 0. Alternative hypothesis: μ 1 - μ 2 ≠ 0. Note that these hypotheses constitute a two-tailed test. The null hypothesis will be rejected if the difference between sample means is too big or if it is too small.

  20. Test Statistic Calculator

    Test Statistic = ¯ x - μ 0 σ √n. Test Statistic = 66 - 40 4 √16. Test Statistic = 26 4 4. Test Statistic = 26 1. Test Statistic = 26. Now as the computed value is 26, find a Standardized test statistic calculator to verify it. With it, you will have the correct results in fractions of seconds.

  21. 27: Hypothesis Test for a Population Mean Given Statistics Calculator

    hypothesis test for a population mean given statistics calculator. Select if the population standard deviation, σ σ, is known or unknown. Then fill in the standard deviation, the sample mean, x¯ x ¯ , the sample size, n n, the hypothesized population mean μ0 μ 0, and indicate if the test is left tailed, <, right tailed, >, or two tailed ...

  22. Difference in Means Calculator

    This is then used in the calculation of the test statistic for the difference between the two means: T = ( x − y) − μ 0 s p 2 n x + s p 2 n y. We interpret the formula in the following way: s p 2 = ( n x − 1) s x 2 + ( n y − 1) s y 2 n x + n y − 2. In this formula, we have:

  23. Hypothesis testing for the mean Calculator

    Hypothesis testing for the mean Calculator: Hypothesis Testing for the Mean Calculator. 1-800-234-2933; [email protected]; Home; About Us; Download; Math Subjects; ... Since our null hypothesis is H 0: μ = 8, this is a two tailed test Checking our table of z-scores for α(left); = 0.005 and α(right); = 0.995, we get: