Difference between two population means

What if the assumption of normality is not satisfied? Hypothesis tests and confidence intervals for two means can answer research questions about two populations or two treatments that involve quantitative data. At the 5% level of significance, the data do not provide sufficient evidence that the mean GPAs of sophomores and juniors at the university are different. The first three steps are identical to those in Example \(\PageIndex{2}\). The mean glycosylated hemoglobin for the whole study population was 8.97 ± 1.87. Z = (0 - 1.91)/0.617 = -3.09. On the other hand, these data do not rule out that there could be important differences in the underlying pathologies of the two populations. A confidence interval for a difference between means is a range of values that is likely to contain the true difference between two population means with a certain level of confidence. Suppose we wish to compare the means of two distinct populations. In ecology, the occupancy-abundance (O-A) relationship is the relationship between the abundance of species and the size of their ranges within a region. Thus the null hypothesis will always be written \(H_0: \mu_1-\mu_2=D_0\). Construct a confidence interval to address this question. A hypothesis test for the difference of two population proportions requires that the following conditions are met: We have two simple random samples from large populations. \(H_0: \mu_1-\mu_2=0\) against \(H_a: \mu_1-\mu_2\neq 0\). The critical value is the value \(a\) such that \(P(T>a)=0.05\).
To learn how to perform a test of hypotheses concerning the difference between the means of two distinct populations using large, independent samples. \(H_1: \mu_1\neq \mu_2\): There is a difference between the two population means. The \(99\%\) confidence level means that \(\alpha =1-0.99=0.01\) so that \(z_{\alpha /2}=z_{0.005}\). For the unpooled test, the degrees of freedom are \(df=\dfrac{(n_1-1)(n_2-1)}{(n_2-1)C^2+(1-C)^2(n_1-1)}\), where \(C=\dfrac{\frac{s^2_1}{n_1}}{\frac{s^2_1}{n_1}+\frac{s^2_2}{n_2}}\). We arbitrarily label one population as Population \(1\) and the other as Population \(2\), and subscript the parameters with the numbers \(1\) and \(2\) to tell them apart. The explanatory variable is location (bottom or surface) and is categorical. The parameter of interest is \(\mu_d\). As before, we should proceed with caution. The hypotheses for a difference in two population means are similar to those for a difference in two population proportions. In Minitab, if you choose a lower-tailed or an upper-tailed hypothesis test, an upper or lower confidence bound will be constructed, respectively, rather than a confidence interval. The value 0 lies 3.09 standard deviations below the mean of this distribution. We, therefore, decide to use an unpooled t-test. If each population is normal, then the sampling distribution of \(\bar{x}_i\) is normal with mean \(\mu_i\), standard error \(\dfrac{\sigma_i}{\sqrt{n_i}}\), and estimated standard error \(\dfrac{s_i}{\sqrt{n_i}}\), for \(i=1, 2\). Let \(n_1\) be the sample size from population 1 and let \(s_1\) be the standard deviation of the sample from population 1. Later in this lesson, we will examine a more formal test for equality of variances. Since the mean \(\bar{x}_1\) of the sample drawn from Population \(1\) is a good estimator of \(\mu _1\) and the mean \(\bar{x}_2\) of the sample drawn from Population \(2\) is a good estimator of \(\mu _2\), a reasonable point estimate of the difference \(\mu _1-\mu _2\) is \(\bar{x}_1-\bar{x}_2\).
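The quantity \(C\) defined above feeds the degrees of freedom of the unpooled t-test. The following is a minimal sketch, assuming the usual Welch-Satterthwaite form of that formula; the function name `welch_df` and the sample values are illustrative assumptions, not part of the original text.

```python
def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the unpooled two-sample t-test.

    s1, s2 are sample standard deviations; n1, n2 are sample sizes.
    """
    v1 = s1 ** 2 / n1          # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2          # estimated variance of the second sample mean
    c = v1 / (v1 + v2)         # the quantity C from the text
    return (n1 - 1) * (n2 - 1) / ((n2 - 1) * c ** 2 + (n1 - 1) * (1 - c) ** 2)


# Hypothetical inputs: equal spreads, then clearly unequal spreads.
df_equal = welch_df(1.0, 10, 1.0, 10)
df_unequal = welch_df(1.0, 10, 3.0, 10)
print(df_equal, df_unequal)
```

With equal sample sizes and equal standard deviations the result coincides with the pooled value \(n_1+n_2-2\); as the spreads diverge, it falls toward the smaller of \(n_1-1\) and \(n_2-1\).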
In words, we estimate that the average customer satisfaction level for Company \(1\) is \(0.27\) points higher on this five-point scale than it is for Company \(2\). Given this, there are two options for estimating the variances for the independent samples: using pooled variances or using unpooled (separate) variances. When to use which? The Minitab output for the packing time example: Equal variances are assumed for this analysis. Do the populations have equal variance? Use the critical value approach. where \(D_0\) is a number that is deduced from the statement of the situation. And \(t^*\) follows a t-distribution with degrees of freedom equal to \(df=n_1+n_2-2\). Recall the zinc concentration example. Children who attended the tutoring sessions on Mondays watched the video with the extra slide. The first step is to state the null hypothesis and an alternative hypothesis. When each data value in one sample is matched with a corresponding data value in another sample, the samples are known as matched samples. We consider each case separately, beginning with independent samples. From an international perspective, the difference in US median and mean wealth per adult is over 600%. If this variable is not known, samples of more than 30 will have a difference in sample means that can be modeled adequately by the t-distribution. We want to compare whether people give a higher taste rating to Coke or Pepsi. The difference between the two sample proportions is 0.63 - 0.42 = 0.21. In order to widen this point estimate into a confidence interval, we first suppose that both samples are large, that is, that both \(n_1\geq 30\) and \(n_2\geq 30\).
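When equal variances are assumed, the statistic \(t^*\) with \(df=n_1+n_2-2\) can be sketched as follows. This is a minimal illustration, assuming the standard pooled-variance formula; the function name `pooled_t` and the summary statistics passed to it are hypothetical and do not come from the packing time example.

```python
import math


def pooled_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Pooled two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    # Pooled variance: a weighted average of the two sample variances.
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2 - d0) / se, df


# Hypothetical summary statistics for two samples of size 10.
t_star, df = pooled_t(5.0, 2.0, 10, 3.0, 2.0, 10)
print(t_star, df)
```

The resulting statistic would then be compared with a critical value from the t-distribution with `df` degrees of freedom.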
In the preceding few pages, we worked through a two-sample t-test for the calories and context example. Now, we can construct a confidence interval for the difference of two means, \(\mu_1-\mu_2\). In the context of the problem we say we are \(99\%\) confident that the average level of customer satisfaction for Company \(1\) is between \(0.15\) and \(0.39\) points higher, on this five-point scale, than that for Company \(2\). (The actual value is approximately \(0.000000007\).) The significance level is 5%. Trace metals in drinking water affect the flavor, and an unusually high concentration can pose a health hazard. The differences of the paired observations follow a normal distribution. For the zinc concentration problem, if you do not recognize the paired structure but mistakenly use the 2-sample procedure, the analysis ignores the pairing.
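The \(99\%\) interval quoted for the customer satisfaction example can be reproduced with a short sketch. It assumes the large-sample form \((\bar{x}_1-\bar{x}_2)\pm z_{\alpha/2}\sqrt{s_1^2/n_1+s_2^2/n_2}\) and uses the summary statistics that appear in this example's worked test statistic (means 3.51 and 3.24, standard deviations 0.51 and 0.52, sample sizes 174 and 355); the function name `large_sample_ci` is an illustrative assumption.

```python
import math
from statistics import NormalDist


def large_sample_ci(mean1, s1, n1, mean2, s2, n2, confidence):
    """Confidence interval for mu1 - mu2 from large, independent samples."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se


low, high = large_sample_ci(3.51, 0.51, 174, 3.24, 0.52, 355, 0.99)
print(round(low, 2), round(high, 2))
```

Rounded to two decimals, the endpoints agree with the interval from \(0.15\) to \(0.39\) stated in the text.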
The first three steps are identical to those in Example \(\PageIndex{2}\).

\[H_a: \mu _1-\mu _2>0\; \; @\; \; \alpha =0.01 \nonumber \]

\[Z=\frac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{\frac{s_{1}^{2}}{n_1}+\frac{s_{2}^{2}}{n_2}}}=\frac{(3.51-3.24)-0}{\sqrt{\frac{0.51^{2}}{174}+\frac{0.52^{2}}{355}}}=5.684 \nonumber \]

Figure \(\PageIndex{2}\): Rejection Region and Test Statistic for Example \(\PageIndex{2}\).

Here "large" means that the population is at least 20 times larger than the size of the sample.

95% CI for mu sophomore - mu juniors: (-0.45, 0.173). T-Test mu sophomore = mu juniors (vs not =): T = -0.92.
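The arithmetic behind the test statistic above can be checked with a brief sketch. It assumes the standard normal distribution as the reference distribution for large samples and an upper-tailed alternative at \(\alpha=0.01\), as in the example; the function name `two_sample_z` is an illustrative assumption.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Standardized test statistic for large, independent samples."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2 - d0) / se


z = two_sample_z(3.51, 0.51, 174, 3.24, 0.52, 355)

# Upper-tailed test: reject H0 when z exceeds the critical value z_{0.01}.
z_crit = NormalDist().inv_cdf(1 - 0.01)
reject = z > z_crit

# Upper-tail p-value, computed as Phi(-z) to avoid subtracting from 1.
p_value = NormalDist().cdf(-z)
print(round(z, 3), reject, p_value)
```

The statistic falls far inside the rejection region, and the tail probability is of the same order as the tiny value quoted in the text.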
The standardized test statistic is \[Z=\frac{(\bar{x}_1-\bar{x}_2)-D_0}{\sqrt{\frac{s_{1}^{2}}{n_1}+\frac{s_{2}^{2}}{n_2}}} \nonumber \] We are 99% confident that the difference between the two population mean times is between -2.012 and -0.167. First, we need to find the differences. The alternative is left-tailed so the critical value is the value \(a\) such that \(P(T<a)=0.05\).
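For paired samples, finding the differences is the first step, after which the analysis reduces to a one-sample t statistic on those differences. A minimal sketch follows, assuming the standard paired statistic \(t=\dfrac{\bar{d}-D_0}{s_d/\sqrt{n}}\) with \(n-1\) degrees of freedom; the function name `paired_t` and the ten pairs of measurements are hypothetical and are not the zinc concentration data.

```python
import math
from statistics import mean, stdev


def paired_t(first, second, d0=0.0):
    """Paired t statistic and degrees of freedom from matched samples."""
    diffs = [a - b for a, b in zip(first, second)]   # step 1: the differences
    n = len(diffs)
    d_bar = mean(diffs)                               # mean difference
    s_d = stdev(diffs)                                # sample sd of differences
    return (d_bar - d0) / (s_d / math.sqrt(n)), n - 1


# Hypothetical matched measurements on ten units.
first = [11, 12, 11, 12, 11, 12, 11, 12, 11, 12]
second = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
t_stat, df = paired_t(first, second)
print(t_stat, df)
```

With ten pairs there are \(9\) degrees of freedom, so for a two-sided test at the 5% level the statistic would be compared with the table value \(t_{0.025}=2.262\).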
