2 sample t test for sample sizes - 30,000 and 150,000









2












$begingroup$


I have two samples: one of 30,000 customers and the other of 150,000. I have to perform a two-sample t test on the conversion rates of the two groups. My question is: will the t test in this case be biased towards the smaller sample? If so, what is the correct approach to performing the test?

















$endgroup$







  • 7




    $begingroup$
    Samples of that size will almost certainly result in statistically significant findings, but the differences may not be of any practical significance. See here for another discussion about this: stats.stackexchange.com/questions/4075/…. What are the actual goals of your analysis too?
    $endgroup$
    – StatsStudent
    Apr 22 at 19:24











  • $begingroup$
    The test was to determine which list converts better for emails. One list came from a prediction model (30,000) and the other is the current list (150,000). We had set up an initial test framework, but given previous conversion rates (0.05%) the power analysis called for huge sample sizes to reach significance, so we decided to disregard the framework (our model could not have produced such a large sample without lowering its accuracy). Hence, we decided to send the emails to both lists and compute the results afterwards. We now have the conversions and are trying to establish whether or not the difference is significant.
    $endgroup$
    – Shivam Tiwari
    2 days ago










  • $begingroup$
    Are the 30,000 a selection of the customers predicted most likely to respond, taken from the larger list of 150,000? Can there be any overlap?
    $endgroup$
    – StatsStudent
    2 days ago










  • $begingroup$
    There were overlaps; we removed them from the current list of 150,000 (so that no customer received the same email twice). But when computing conversions we included the overlap in both lists (for a fair comparison). Please note that the test was to compare the conversion rates of the lists; the same email was sent to both lists.
    $endgroup$
    – Shivam Tiwari
    2 days ago
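
For the setup described in these comments, a minimal sketch of the comparison might look like the following. The conversion counts are hypothetical (the thread never states them) and the variable names are made up for illustration; only the list sizes come from the question. Each customer is coded as 1 (converted) or 0 (did not), and R's `t.test` is run on the two 0/1 vectors, which uses the Welch version by default.

```r
# Hypothetical conversion counts (NOT from the thread); list sizes are from the question.
conv_model   <- 30;  n_model   <- 30000    # list from the prediction model
conv_current <- 75;  n_current <- 150000   # current list

# Expand the counts into 0/1 outcome vectors, one entry per customer.
model_list   <- rep(c(1, 0), times = c(conv_model,   n_model   - conv_model))
current_list <- rep(c(1, 0), times = c(conv_current, n_current - conv_current))

# Welch two-sample t test on the conversion indicators (var.equal = FALSE is the default).
res <- t.test(model_list, current_list)

res$estimate   # the two observed conversion rates
res$conf.int   # confidence interval for the difference in rates
res$p.value
```

The sample means here are just the two conversion rates, so the confidence interval is directly an interval for the difference in rates. With binary outcomes, `prop.test` on the same counts is a common alternative and should give a broadly similar answer at these sample sizes.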


















hypothesis-testing statistical-significance t-test ab-test






edited Apr 22 at 17:53







Shivam Tiwari













asked Apr 22 at 17:40









3 Answers
7












$begingroup$

Maybe a couple of examples will help to illustrate some of the issues.



Suppose the two populations are $X \sim \mathsf{Norm}(\mu = 500, \sigma = 30)$
and $Y \sim \mathsf{Norm}(\mu = 501, \sigma = 20).$



If both sample sizes are $150,000,$ then there is sufficient power to detect
the small difference in means.



set.seed(422)
x = rnorm(150000, 500, 30)
y = rnorm(150000, 501, 20)
t.test(x, y)

Welch Two Sample t-test

data: x and y
t = -10.983, df = 261530, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.2042715 -0.8395487
sample estimates:
mean of x mean of y
499.9804 501.0023


If we use only the first 30,000 values in the first sample, results are
very nearly the same for most practical purposes.



t.test(x[1:30000], y)

Welch Two Sample t-test

data: x[1:30000] and y
t = -6.3728, df = 35463, p-value = 1.879e-10
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.5126269 -0.8010336
sample estimates:
mean of x mean of y
499.8455 501.0023


Here is a boxplot of the data used in the second t test (the wider box indicates a larger sample):



[Image: boxplots of x[1:30000] and y]



Issues of minimal concern:



  • Even though labeled as 'Welch t tests', sample sizes are sufficiently large
    that these are essentially t tests. Unless the data are very far from normal,
    we would still detect the small difference in means.


  • The power of the test is heavily dependent on the smaller sample size. But
    power is not a concern here.
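
To make the point about power and the smaller sample concrete, here is a rough sketch that approximates the power of the two-sided test with a normal approximation, which is reasonable at these sample sizes. The helper `welch_power` is a hypothetical name written for this illustration, not a function from R or any package; the means and standard deviations are those of the simulated populations above.

```r
# Approximate power of a two-sided two-sample test for a true mean difference
# `delta`, using the normal approximation (adequate for very large samples).
# `welch_power` is an illustrative helper, not part of base R.
welch_power <- function(delta, sd1, sd2, n1, n2, alpha = 0.05) {
  se     <- sqrt(sd1^2 / n1 + sd2^2 / n2)   # SE of the difference in means
  z_crit <- qnorm(1 - alpha / 2)
  shift  <- abs(delta) / se
  pnorm(-z_crit + shift) + pnorm(-z_crit - shift)
}

# Populations from the simulation: difference 1, sigma = 30 and 20.
power_equal   <- welch_power(1, 30, 20, 150000, 150000)
power_unequal <- welch_power(1, 30, 20,  30000, 150000)

# For contrast, the same 1:5 ratio with much smaller samples.
power_tiny    <- welch_power(1, 30, 20,    300,   1500)

c(equal = power_equal, unequal = power_unequal, tiny = power_tiny)
```

The standard error is dominated by the term for the smaller group, which is why shrinking that group costs power. At 30,000 versus 150,000 there is still plenty of power for this difference, whereas the much smaller illustrative samples have very little.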


Issues warranting attention:



  • With such large samples
    in the real world (not the simulation world),
    one is entitled to wonder whether data are truly simple random samples from
    their respective populations. Could smaller, more carefully collected samples provide better information?


  • Although we did not do a formal test to confirm that variances differ, it seems clear from the boxplot that they do. In the Welch test,
    it is OK for variances to differ. But would different variances have important practical implications?


  • Although the null hypothesis that the two population means are equal is soundly rejected with minuscule P-values, it is important to realize that "statistically significant" differences (by whatever definition) are not necessarily differences of practical importance or interest. For what purpose are you
    making the effort to check whether the means are different? And what do the results
    of the t test actually contribute to that purpose?
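
One way to look at practical rather than statistical significance is a standardized effect size. The sketch below reuses the simulated data from above and computes a Cohen's-d-style quantity, dividing the difference in means by the root of the average of the two sample variances. That choice of standardizer is an assumption made for illustration; other conventions exist.

```r
# Recreate the simulated samples used in the t tests above.
set.seed(422)
x <- rnorm(150000, 500, 30)
y <- rnorm(150000, 501, 20)
x_small <- x[1:30000]          # the 30,000-observation sample

# Raw difference in sample means.
diff_means <- mean(x_small) - mean(y)

# Standardize by the root of the average of the two variances
# (one of several possible conventions for unequal variances).
sd_avg <- sqrt((var(x_small) + var(y)) / 2)
d <- diff_means / sd_avg

d
```

The standardized difference is only a few hundredths of a standard deviation, even though the P-value is tiny. Whether a difference of that size matters is a question about the application, not about the test.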

















$endgroup$


























6












$begingroup$

I can hardly imagine any worthwhile effect size that requires such a large sample size to be decently powered. There's no "bias" of having unequal sample sizes$^1$. The only disadvantage is that the power of the test tends to be somewhat limited by the smaller group. For even very small effects, 30,000 observations may confer quite a powerful test.



$^1$ except if you inappropriately use the "equal variance" assumption, in which case the "pooled variance" estimate is more heavily weighted toward the larger group (not toward the smaller as you suggested).
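
The footnote can be illustrated with a short calculation. The group sizes are those in the question; the standard deviations (30 for the smaller group, 20 for the larger) are borrowed from the simulated example in the other answer and are an assumption, not real data.

```r
# Group sizes from the question; SDs are assumed values taken from the
# simulated example in the other answer.
n_small <- 30000;  sd_small <- 30
n_large <- 150000; sd_large <- 20

# Pooled variance: each group's variance is weighted by its degrees of freedom.
pooled_var <- ((n_small - 1) * sd_small^2 + (n_large - 1) * sd_large^2) /
  (n_small + n_large - 2)
pooled_sd <- sqrt(pooled_var)

# Share of the weight carried by the larger group.
weight_large <- (n_large - 1) / (n_small + n_large - 2)

# Standard error of the difference in means under each approach.
se_pooled <- pooled_sd * sqrt(1 / n_small + 1 / n_large)
se_welch  <- sqrt(sd_small^2 / n_small + sd_large^2 / n_large)

c(pooled_sd = pooled_sd, weight_large = weight_large,
  se_pooled = se_pooled, se_welch = se_welch)
```

Under these assumed values the pooled standard deviation sits much closer to the larger group's 20 than to the smaller group's 30, so the pooled standard error comes out smaller than the Welch one. That is the sense in which an inappropriate equal-variance assumption can mislead when the sample sizes are unequal.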

















$endgroup$




















    2












    $begingroup$

    I agree with most of what has been said so far, but I do not completely agree with the statement from @AdamO that "There's no "bias" of having unequal sample sizes".



    Unfortunately, we don't know the purpose of your study. But let's assume you are interested in gender differences in salary. We know that the population should be about 50% male and 50% female, so if you had drawn random samples and any missing data were MAR (missing at random), we would expect both groups to have approximately the same sample size. To put it differently, if the ratio between the sample sizes of the two groups is very different from their ratio in the population, this can indicate either that the samples are not random (which could cause bias) or that the missing values are not random (which could also cause bias). In the gender example, I would be surprised if someone reported such big differences in sample sizes. For example: Did more women refuse to answer questions about their salary? Are the women who responded representative, or did only those with a high salary answer the question? And so on. Non-random missing values would obviously cause bias here and make the results misleading.



    Thus the question I would ask myself is why the groups have unequal sample sizes. If there is a reasonable answer, like "There are more people without heart failure than people with heart failure", the data might be all right. But if you would expect equal sample sizes based on what you know about the groups in the population, there might be some bias, because the samples or the missing values do not seem to be random.






    $endgroup$












    • $begingroup$
      You seem to be conflating the idea of "bias" and "inefficient design". One is the property of a statistic (in this case the mean difference), the other is a property of a test. The mean difference is never biased no matter how imbalanced the sample. But the power of the test can suffer.
      $endgroup$
      – AdamO
      9 hours ago











    • $begingroup$
      @AdamO: I think I used the word "bias" differently, but since I explained what I mean, I guess it should be clear what problem, to my understanding, can arise if the groups are very unequal. I don't know what the correct word is for the misleading effect I describe. Please edit if you think it is necessary.
      $endgroup$
      – stats.and.r
      9 hours ago











    • $begingroup$
      you can't disagree with me based on a fundamentally incorrect understanding of a term.
      $endgroup$
      – AdamO
      9 hours ago










    • $begingroup$
      @AdamO: I don't disagree; I am just saying that I don't know another word for the problem that I describe. Please read my comment carefully. I would welcome it if you edited my answer. However, the definition found on Wikipedia, "Statistical bias is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated.", does agree with the way I use the word. On Wikipedia this effect is called "selection bias".
      $endgroup$
      – stats.and.r
      9 hours ago











    • $begingroup$
      @AdamO: I find your comment quite harsh and want to show you some definition of my "fundamentally incorrect understanding of a term". Maybe you simply never learnt this meaning of that term? See here: en.m.wikipedia.org/wiki/Selection_bias
      $endgroup$
      – stats.and.r
      9 hours ago











    Your Answer








    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "65"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );






    Shivam Tiwari is a new contributor. Be nice, and check out our Code of Conduct.









    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f404439%2f2-sample-t-test-for-sample-sizes-30-000-and-150-000%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    7












    $begingroup$

    Maybe a couple of examples will help to illustrate some of the issues.



    Suppose the two populations are $X sim mathsfNorm(mu = 500, sigma =30)$
    and $Y sim mathsfNorm(mu = 501, sigma = 20.)$



    If both sample sizes are $150,000,$ then there is sufficient power to detect
    the small difference in means.



    set.seed(422)
    x = rnorm(150000, 500, 30)
    y = rnorm(150000, 501, 20)
    t.test(x, y)

    Welch Two Sample t-test

    data: x and y
    t = -10.983, df = 261530, p-value < 2.2e-16
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
    -1.2042715 -0.8395487
    sample estimates:
    mean of x mean of y
    499.9804 501.0023


    If we use only the first 30,000 values in the first sample, results are
    very nearly the same for most practical purposes.



    t.test(x[1:30000], y)

    Welch Two Sample t-test

    data: x[1:30000] and y
    t = -6.3728, df = 35463, p-value = 1.879e-10
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
    -1.5126269 -0.8010336
    sample estimates:
    mean of x mean of y
    499.8455 501.0023


    Here is a boxplot of the data used in the second t test (the wider box indicates a larger sample):



    enter image description here



    Issues of minimal concern:



    • Even though labeled as 'Welch t tests', sample sizes are sufficiently large
      that these are essentially t tests. Unless the data are very far from normal,
      we would still detect the small difference in means.


    • The power of the test is heavily dependent on the smaller sample size. But
      power is not a concern here.


    Issues warranting attention:



    • With such large samples
      in the real world (not the simulation world),
      one is entitled to wonder whether data are truly simple random samples from
      their respective populations. Could smaller, more carefully collected samples provide better information?


    • Although we did not do a formal test to confirm that variances differ, it seems clear from the boxplot that they do. In the Welch test,
      it is OK for variances to differ. But would different variances have important practical implications?


    • Although the null hypothesis that the two population means are equal is soundly rejected with minuscule P-values, it is important to realize that "statistically significant" differences (by whatever definition) are not necessarily differences of practical importance or interest. For what purpose are you
      taking the effort of check whether means are different? And what do the results
      of the t test actually contribute to that purpose?






    share|cite|improve this answer











    $endgroup$








    • 1




      $begingroup$
      the test was to determine which list is better in conversion for emails.1 list was from a prediction model(30,000) and the other, the current list(150,000). We had set up an initial test frame but previous conversion rates(0.05%) and power analysis yielding huge sample sizes for significance, we decided to disregard the framework(our model could not have produced huge sample without lowering the accuracy). Hence, we decided to send the emails to both the lists and compute the results after. We have the conversions now and are trying to establish whether or not the difference is significant
      $endgroup$
      – Shivam Tiwari
      2 days ago















    7












    $begingroup$

    Maybe a couple of examples will help to illustrate some of the issues.



    Suppose the two populations are $X sim mathsfNorm(mu = 500, sigma =30)$
    and $Y sim mathsfNorm(mu = 501, sigma = 20.)$



    If both sample sizes are $150,000,$ then there is sufficient power to detect
    the small difference in means.



    set.seed(422)
    x = rnorm(150000, 500, 30)
    y = rnorm(150000, 501, 20)
    t.test(x, y)

    Welch Two Sample t-test

    data: x and y
    t = -10.983, df = 261530, p-value < 2.2e-16
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
    -1.2042715 -0.8395487
    sample estimates:
    mean of x mean of y
    499.9804 501.0023


    If we use only the first 30,000 values in the first sample, results are
    very nearly the same for most practical purposes.



    t.test(x[1:30000], y)

    Welch Two Sample t-test

    data: x[1:30000] and y
    t = -6.3728, df = 35463, p-value = 1.879e-10
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
    -1.5126269 -0.8010336
    sample estimates:
    mean of x mean of y
    499.8455 501.0023


    Here is a boxplot of the data used in the second t test (the wider box indicates a larger sample):



    enter image description here



    Issues of minimal concern:



    • Even though labeled as 'Welch t tests', sample sizes are sufficiently large
      that these are essentially t tests. Unless the data are very far from normal,
      we would still detect the small difference in means.


    • The power of the test is heavily dependent on the smaller sample size. But
      power is not a concern here.


    Issues warranting attention:



    • With such large samples
      in the real world (not the simulation world),
      one is entitled to wonder whether data are truly simple random samples from
      their respective populations. Could smaller, more carefully collected samples provide better information?


    • Although we did not do a formal test to confirm that variances differ, it seems clear from the boxplot that they do. In the Welch test,
      it is OK for variances to differ. But would different variances have important practical implications?


    • Although the null hypothesis that the two population means are equal is soundly rejected with minuscule P-values, it is important to realize that "statistically significant" differences (by whatever definition) are not necessarily differences of practical importance or interest. For what purpose are you
      taking the trouble to check whether the means are different? And what do the results
      of the t test actually contribute to that purpose?
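One way to put the last point in numbers is a standardized effect size. The sketch below computes Cohen's d for the populations assumed in the simulation, then estimates it from simulated data. The choice of Cohen's d and the equal-weight pooled standard deviation are assumptions made here for illustration, not something the test output reports.

```r
# Standardized effect size (Cohen's d) for the populations assumed above.
mu_x <- 500; sd_x <- 30
mu_y <- 501; sd_y <- 20

# With equal weighting, the pooled SD is the root mean of the two variances.
sd_pooled <- sqrt((sd_x^2 + sd_y^2) / 2)
d_pop <- (mu_y - mu_x) / sd_pooled
print(d_pop)

# The same quantity estimated from simulated data.
set.seed(422)
x <- rnorm(150000, mu_x, sd_x)
y <- rnorm(150000, mu_y, sd_y)
d_hat <- (mean(y) - mean(x)) / sqrt((var(x) + var(y)) / 2)
print(d_hat)
```

The population value is about 0.04, far below the 0.2 that is conventionally called a "small" effect, even though the P-values are minuscule.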

















    $endgroup$








    • 1




      $begingroup$
      The test was to determine which list converts better for emails. One list came from a prediction model (30,000) and the other is the current list (150,000). We had set up an initial test framework, but because previous conversion rates were low (0.05%) and the power analysis called for huge sample sizes to reach significance, we decided to disregard the framework (our model could not have produced such a large sample without lowering its accuracy). Hence, we decided to send the emails to both lists and compute the results afterwards. We now have the conversions and are trying to establish whether or not the difference is significant.
      $endgroup$
      – Shivam Tiwari
      2 days ago
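Since the outcome described in this comment is a conversion (yes/no) rather than a continuous measurement, a two-sample test of proportions is one natural option. The sketch below uses base R's `prop.test`. The conversion counts are hypothetical placeholders (the thread does not give the observed counts); only the list sizes come from the comment.

```r
# Hypothetical conversion counts; replace with the observed values.
conversions <- c(model_list = 24, current_list = 75)
emails_sent <- c(model_list = 30000, current_list = 150000)

# Two-sample test of equal proportions (chi-squared with continuity correction).
res <- prop.test(conversions, emails_sent)
print(res$estimate)   # observed conversion rates
print(res$conf.int)   # 95% CI for the difference in rates
print(res$p.value)
```

With counts this small, an exact test such as `fisher.test` on the corresponding 2x2 table is a possible cross-check.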
























    edited Apr 22 at 19:42

























    answered Apr 22 at 19:10









    BruceET





















    6












    $begingroup$

    I can hardly imagine any worthwhile effect size that requires such a large sample size to be decently powered. There's no "bias" of having unequal sample sizes$^1$. The only disadvantage is that the power of the test tends to be somewhat limited by the smaller group. For even very small effects, 30,000 observations may confer quite a powerful test.



    $^1$ except if you inappropriately use the "equal variance" assumption, in which case the "pooled variance" estimate is more heavily weighted toward the larger group (not toward the smaller as you suggested).
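To make the footnote concrete, the sketch below compares the pooled-variance standard error with the Welch standard error. The group sizes come from the question; the standard deviations (30 and 20) are borrowed from the simulated example in another answer and are illustrative only.

```r
# Illustrative values only: sizes from the question, SDs borrowed from the
# simulated example elsewhere in this thread.
n1 <- 30000;  s1 <- 30   # smaller group, larger spread
n2 <- 150000; s2 <- 20   # larger group, smaller spread

# Pooled variance: a weighted average dominated by the larger group.
var_pooled <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)

se_pooled <- sqrt(var_pooled * (1 / n1 + 1 / n2))  # equal-variance t test
se_welch  <- sqrt(s1^2 / n1 + s2^2 / n2)           # Welch t test
print(c(var_pooled = var_pooled, se_pooled = se_pooled, se_welch = se_welch))
```

In this configuration the pooled variance (about 483) sits much closer to 400 than to 900, so the pooled standard error comes out smaller than the Welch one and the equal-variance test would overstate the evidence.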

















    $endgroup$

























        edited Apr 22 at 19:49

























        answered Apr 22 at 17:49









        AdamO






















            2












            $begingroup$

            I agree with most of what has been said so far, but I do not completely agree with the statement from @AdamO that "There's no "bias" of having unequal sample sizes".



            Unfortunately, we don't know what the purpose of your study is. But let's assume you are interested in gender differences in salary. We know that the population should be about 50% male and 50% female, so if you had drawn random samples and any missing values were MAR (missing at random), we would expect both groups to have approximately the same sample size. To put it differently, if the ratio between the sample sizes of the two groups is very different from their ratio in the population, this can indicate either that the samples are not random (which could cause a bias) or that the missing values are not random (which could cause a bias, too). In the gender example, I would be surprised if someone reported such big differences in sample sizes. For example: Did more women refuse to answer questions about their salary? Are the women who responded representative, or did only those with a high salary answer the question? And so on. Non-random missing values would obviously cause a bias here and make the results misleading.



            Thus the question I would ask myself is why the groups have unequal sample sizes. If there is a reasonable answer like "There are more people without heart failure than people with heart failure", the data might be all right. But if you would expect equal sample sizes based on what you know about the groups in the population, there might be some bias, because the samples or the missing values seem not to be random.
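The salary example can be sketched as a small simulation. Everything here is hypothetical: the salary distribution, the response mechanism, and the parameter values are assumptions chosen only to show how non-random missingness can produce both unequal group sizes and a misleading mean difference.

```r
set.seed(1)
n <- 100000

# Hypothetical salaries: both groups share the same true mean,
# so the true difference in means is 0.
men   <- rnorm(n, mean = 50000, sd = 10000)
women <- rnorm(n, mean = 50000, sd = 10000)

# Assumed response mechanism: all men respond, but a woman's chance of
# responding increases with her salary (missing not at random).
p_respond <- plogis((women - 50000) / 10000 - 1.5)
responded <- runif(n) < p_respond

diff_full     <- mean(men) - mean(women)             # everyone observed
diff_observed <- mean(men) - mean(women[responded])  # respondents only

print(c(n_men = n, n_women = sum(responded)))
print(c(full = diff_full, observed = diff_observed))
```

With everyone observed, the estimated difference is close to the true value of 0. With only the respondents, the groups are very unequal in size and the estimate is strongly negative, even though nothing about the true salaries differs.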












            $endgroup$












            • $begingroup$
              You seem to be conflating the idea of "bias" and "inefficient design". One is the property of a statistic (in this case the mean difference), the other is a property of a test. The mean difference is never biased no matter how imbalanced the sample. But the power of the test can suffer.
              $endgroup$
              – AdamO
              9 hours ago











            • $begingroup$
              @AdamO: I think I used the word "bias" differently, but since I explained what I mean, it should be clear what problem I think can arise when the groups are very unequal. I don't know the correct word for the misleading effect I describe. Please edit if you think it is necessary.
              $endgroup$
              – stats.and.r
              9 hours ago











            • $begingroup$
              You can't disagree with me based on a fundamentally incorrect understanding of a term.
              $endgroup$
              – AdamO
              9 hours ago










            • $begingroup$
              @AdamO: I don't disagree; I am just saying that I don't know another word for the problem I describe. Please read my comment carefully. I would welcome an edit to my answer. That said, the definition found on Wikipedia, "Statistical bias is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated.", does agree with the way I use the word. On Wikipedia this effect is called "selection bias".
              $endgroup$
              – stats.and.r
              9 hours ago











            • $begingroup$
              @AdamO: I find your comment quite harsh and want to show you some definition of my "fundamentally incorrect understanding of a term". Maybe you simply never learnt this meaning of that term? See here: en.m.wikipedia.org/wiki/Selection_bias
              $endgroup$
              – stats.and.r
              9 hours ago
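The other side of this exchange, that the mean difference is unbiased under simple random sampling however unbalanced the groups are, can also be checked by simulation. The sketch below is an illustration under assumed values: the population parameters are borrowed from the simulated example in another answer, and the sample sizes are scaled down (300 vs 1500) to keep the run fast while preserving the 1:5 ratio.

```r
set.seed(2)
true_diff <- 500 - 501   # population means borrowed from the simulated example

# Repeatedly draw simple random samples with a 1:5 size ratio
# (300 vs 1500, scaled down from 30,000 vs 150,000 to keep this fast).
estimates <- replicate(2000, {
  x <- rnorm(300, mean = 500, sd = 30)
  y <- rnorm(1500, mean = 501, sd = 20)
  mean(x) - mean(y)
})

print(mean(estimates))  # average estimate across replications
print(sd(estimates))    # sampling variability of a single estimate
```

The average of the estimates lands close to the true difference of -1. The imbalance shows up in the spread of the estimates, not in their center, which is the distinction between bias and efficiency drawn in the comments.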























            edited 2 days ago






























            answered 2 days ago









            stats.and.r












            • $begingroup$
              You seem to be conflating the idea of "bias" and "inefficient design". One is the property of a statistic (in this case the mean difference), the other is a property of a test. The mean difference is never biased no matter how imbalanced the sample. But the power of the test can suffer.
              $endgroup$
              – AdamO
              9 hours ago











            • $begingroup$
              @AdamO: I think I used the word "bias" differently but since I explained what I mean I guess this should be okay to understand what problem to my understanding can arise if the groups are very unequal. I don't know what the word for the misleading effect I describe is correct. Please edit if you think it is necessary.
              $endgroup$
              – stats.and.r
              9 hours ago











            • $begingroup$
              you can't disagree with me based on a fundamentally incorrect understanding of a term.
              $endgroup$
              – AdamO
              9 hours ago










            • $begingroup$
              @AdamO: I don't disagree but just say that I don't know another word for the problem that I describe. Pleaae read my comment carefully. And I welcome it if you edit my answer. Although the definition found on wiki "Statistical bias is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated." does agree with my way using the word. On wikipedia this effect is called "selection bias".
              $endgroup$
              – stats.and.r
              9 hours ago











            • $begingroup$
              @AdamO: I find your comment quite harsh and want to show you some definition of my "fundamentally incorrect understanding of a term". Maybe you simply never learnt this meaning of that term? See here: en.m.wikipedia.org/wiki/Selection_bias
              $endgroup$
              – stats.and.r
              9 hours ago
















            • $begingroup$
              You seem to be conflating the idea of "bias" and "inefficient design". One is the property of a statistic (in this case the mean difference), the other is a property of a test. The mean difference is never biased no matter how imbalanced the sample. But the power of the test can suffer.
              $endgroup$
              – AdamO
              9 hours ago











            • $begingroup$
              @AdamO: I think I used the word "bias" differently but since I explained what I mean I guess this should be okay to understand what problem to my understanding can arise if the groups are very unequal. I don't know what the word for the misleading effect I describe is correct. Please edit if you think it is necessary.
              $endgroup$
              – stats.and.r
              9 hours ago











            • $begingroup$
              you can't disagree with me based on a fundamentally incorrect understanding of a term.
              $endgroup$
              – AdamO
              9 hours ago










            • $begingroup$
              @AdamO: I don't disagree but just say that I don't know another word for the problem that I describe. Pleaae read my comment carefully. And I welcome it if you edit my answer. Although the definition found on wiki "Statistical bias is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated." does agree with my way using the word. On wikipedia this effect is called "selection bias".
              $endgroup$
              – stats.and.r
              9 hours ago











            • $begingroup$
              @AdamO: I find your comment quite harsh and want to show you some definition of my "fundamentally incorrect understanding of a term". Maybe you simply never learnt this meaning of that term? See here: en.m.wikipedia.org/wiki/Selection_bias
              $endgroup$
              – stats.and.r
              9 hours ago















            $begingroup$
            You seem to be conflating the idea of "bias" and "inefficient design". One is the property of a statistic (in this case the mean difference), the other is a property of a test. The mean difference is never biased no matter how imbalanced the sample. But the power of the test can suffer.
            $endgroup$
            – AdamO
            9 hours ago





            $begingroup$
            You seem to be conflating the idea of "bias" and "inefficient design". One is the property of a statistic (in this case the mean difference), the other is a property of a test. The mean difference is never biased no matter how imbalanced the sample. But the power of the test can suffer.
            $endgroup$
            – AdamO
            9 hours ago













            $begingroup$
            @AdamO: I think I used the word "bias" differently but since I explained what I mean I guess this should be okay to understand what problem to my understanding can arise if the groups are very unequal. I don't know what the word for the misleading effect I describe is correct. Please edit if you think it is necessary.
            $endgroup$
            – stats.and.r
            9 hours ago


















            You can't disagree with me based on a fundamentally incorrect understanding of a term.
            – AdamO
            9 hours ago
















            @AdamO: I don't disagree; I am just saying that I don't know another word for the problem I describe. Please read my comment carefully. I would welcome it if you edited my answer. That said, the definition found on Wikipedia, "Statistical bias is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated.", does agree with my way of using the word. On Wikipedia this effect is called "selection bias".
            – stats.and.r
            9 hours ago




























            Shivam Tiwari is a new contributor. Be nice, and check out our Code of Conduct.















































