
Check if data (N datapoints) originate from known distribution


Assume we have $N$ data points $x_1, \ldots, x_N$ and a known distribution $F_0(x)$ (not a normal distribution), and we want to check whether the system under study behaves 'typically'.



I would like to quantify the probability that those $N$ data points originate from this distribution. For large sample sizes, there are various statistical hypothesis tests that could be applied (e.g. the chi-squared goodness-of-fit test). However, for small samples the chi-squared test breaks down, because the expected cell counts become too small for its approximation to hold.



Intuitively, it should be possible to detect that the data points deviate clearly from $F_0$, even with a small sample size. Is there a statistical test designed for these conditions?
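Because $F_0$ is fully specified, one generic option for small $N$ is to simulate the null distribution of a goodness-of-fit statistic directly, rather than rely on large-sample approximations. A minimal sketch in Python (the uniform $F_0$, the KS statistic, and all function names here are illustrative choices by the editor, not part of the original question):

```python
# Sketch: small-sample goodness-of-fit by simulating the null
# distribution of the KS statistic under a fully specified F0.
import numpy as np

def ks_stat(x, cdf):
    """Two-sided KS statistic sup|F_n - F0| for sample x."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    f0 = cdf(x)
    i = np.arange(1, n + 1)
    return max(np.max(i / n - f0), np.max(f0 - (i - 1) / n))

def mc_pvalue(x, cdf, sampler, n_sim=10000, rng=None):
    """Monte Carlo p-value: share of null samples whose KS statistic
    is at least as extreme as the observed one."""
    gen = np.random.default_rng(rng)
    n = len(x)
    d_obs = ks_stat(x, cdf)
    d_null = np.array([ks_stat(sampler(gen, n), cdf) for _ in range(n_sim)])
    return (1 + np.sum(d_null >= d_obs)) / (n_sim + 1)

# Illustration with F0 = Uniform(0, 1) and N = 10:
u_cdf = lambda t: np.clip(t, 0.0, 1.0)
u_sampler = lambda g, n: g.uniform(size=n)
x_typical = np.random.default_rng(1).uniform(size=10)      # drawn from F0
x_deviant = 0.9 + 0.1 * np.random.default_rng(2).uniform(size=10)
print(mc_pvalue(x_typical, u_cdf, u_sampler, rng=3))       # p for a typical sample
print(mc_pvalue(x_deviant, u_cdf, u_sampler, rng=3))       # near zero: all points in [0.9, 1]
```

The p-value is exact up to Monte Carlo error for any $N$, since no asymptotic distribution is invoked.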










hypothesis-testing mathematical-statistics






edited Jul 22 at 8:14









kjetil b halvorsen
35.9k • 9 gold badges, 90 silver badges, 278 bronze badges




asked Jul 22 at 5:23









SZafon
21 • 1 bronze badge














  • "Intuitively, it should be possible to detect if those data points are clearly deviating, even with a small sample size." Identifying the population from which a small sample was taken can be problematic. – BruceET, Jul 22 at 17:30

















2 Answers






















Maybe the Kolmogorov–Smirnov test with the small-sample correction provided by Jan Vrbik in: Vrbik, Jan (2018). "Small-Sample Corrections to Kolmogorov–Smirnov Test Statistic". Pioneer Journal of Theoretical and Applied Statistics. 15 (1–2): 15–23.

The correction itself is also described on the Wikipedia page for the Kolmogorov–Smirnov test: replace $D_N$ with

$$D_N + \frac{1}{6\sqrt{N}} + \frac{D_N - 1}{4N}$$

where $D_N$ is the standard Kolmogorov–Smirnov statistic.
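As an illustration of how the corrected statistic would be computed (Python used for the sketch; the function names are the editor's, and the correction is applied exactly as quoted above):

```python
# Sketch: Vrbik's small-sample correction to the KS statistic.
# Corrected value: D_N + 1/(6*sqrt(N)) + (D_N - 1)/(4*N); per the
# Wikipedia description, the corrected statistic is then referred
# to the asymptotic Kolmogorov distribution.
import numpy as np

def ks_statistic(x, cdf):
    """Standard two-sided KS statistic D_N = sup|F_N - F0|."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    f0 = cdf(x)
    i = np.arange(1, n + 1)
    return max(np.max(i / n - f0), np.max(f0 - (i - 1) / n))

def vrbik_corrected(d_n, n):
    """Apply the small-sample correction of Vrbik (2018)."""
    return d_n + 1.0 / (6.0 * np.sqrt(n)) + (d_n - 1.0) / (4.0 * n)
```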







          edited Jul 22 at 8:29

























          answered Jul 22 at 8:05









Łukasz Deryło
2,759 • 1 gold badge, 6 silver badges, 17 bronze badges




Use R to generate 10 observations from a standard uniform distribution:

    set.seed(722)  # for reproducibility
    x = runif(10)
    summary(x); sd(x)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.1270  0.4940  0.7454  0.6627  0.9070  0.9477
    [1] 0.293335  # SD


Use the Kolmogorov-Smirnov test to see if the sample is consistent with a standard uniform distribution. Appropriately, the answer is Yes, because the data were sampled from a standard uniform distribution: large P-value, no rejection.



                  ks.test(x, punif)

                  One-sample Kolmogorov-Smirnov test

                  data: x
                  D = 0.31507, p-value = 0.2217
                  alternative hypothesis: two-sided


Is the sample also consistent with $\mathsf{Norm}(0.5, \sqrt{1/12})$? The mean and variance match, but the shapes differ. Notice that the parameters (mean and standard deviation) are fully specified. Again consistent, but we know the normal distribution is not correct.



                   ks.test(x, pnorm, .5, sqrt(1/12))

                  One-sample Kolmogorov-Smirnov test

                  data: x
                  D = 0.36246, p-value = 0.1104
                  alternative hypothesis: two-sided


However, the K-S test easily rejects the hypothesis that this sample is from $\mathsf{Exp}(\text{rate}=2)$, which has mean $1/2 = 0.5$ but the wrong SD. This exponential distribution has almost 14% of its probability above $1$, but our sample has no observation above 0.948.



    ks.test(x, pexp, 2)   # supply the CDF pexp, not the density dexp
                          # (with dexp the test returned the impossible
                          #  value D = 1.5513, p-value < 2.2e-16)
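For comparison, the same three checks can be sketched in Python with scipy.stats.kstest (the random stream differs from R's, so the sample and the printed numbers will not match the output above):

```python
# Sketch: the three KS checks redone with SciPy (illustrative seed).
import numpy as np
from scipy import stats

rng = np.random.default_rng(722)          # not the same stream as R's set.seed
x = rng.uniform(size=10)                  # 10 standard-uniform draws

# Against Uniform(0,1): should usually not reject.
print(stats.kstest(x, stats.uniform.cdf))

# Against Norm(0.5, sqrt(1/12)): matching mean and SD, wrong shape.
print(stats.kstest(x, stats.norm(loc=0.5, scale=np.sqrt(1 / 12)).cdf))

# Against Exp(rate=2), i.e. scale = 1/2: wrong SD; with larger N
# this is rejected decisively.
print(stats.kstest(x, stats.expon(scale=0.5).cdf))
```

Note that `kstest` takes the CDF (here as a callable), mirroring R's requirement that `pexp`, not `dexp`, be supplied.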


                  Notes: (1) See other pages on this site and the Internet, including
                  the relevant Wikipedia page, which has a brief explanation of the test and some remarks about cases in which parameters must be estimated from data.



                  (2) Several well-known statistical software programs have
                  procedures that check a sample against a list of often used
                  distributions to estimate parameters and see if any distribution is a fit. Often these are called 'distribution ID' procedures
                  and sometimes they are restricted to non-negative data.



                  For example, when the distribution ID procedure in Minitab
                  is asked to compare the small sample above to normal, lognormal, Weibull, and gamma families, here are the parameter estimates:



    ML Estimates of Distribution Parameters

    Distribution    Location     Shape      Scale
    Normal*          0.66265              0.29334
    Lognormal*      -0.55937              0.66158
    Weibull                     2.62094   0.74268
    Gamma                       3.53947   0.18722

    * Scale: Adjusted ML estimate
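A rough analogue of such a 'distribution ID' pass can be sketched with SciPy's generic maximum-likelihood fit methods (this is not Minitab's procedure, and the sample here is an arbitrary illustration):

```python
# Sketch: fit several candidate families by maximum likelihood, then
# check each fit with a KS test. Because the parameters are estimated
# from the data, these p-values are optimistic (see note (1) above on
# estimated parameters).
import numpy as np
from scipy import stats

data = np.random.default_rng(722).uniform(size=10)   # illustrative sample

candidates = {
    "normal":    stats.norm,
    "lognormal": stats.lognorm,
    "weibull":   stats.weibull_min,
    "gamma":     stats.gamma,
}
for name, dist in candidates.items():
    params = dist.fit(data)                 # ML estimates
    p = stats.kstest(data, dist(*params).cdf).pvalue
    print(f"{name:10s} params={np.round(params, 4)} p={p:.3f}")
```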


And here are the corresponding probability plots, with P-values of Anderson-Darling goodness-of-fit tests in the legends. The data are clearly inconsistent with distributions in the lognormal family.

[Figure: probability plots for the four fitted families, with Anderson-Darling P-values in the legends]



(3) For very large sample sizes, Kolmogorov-Smirnov, Anderson-Darling and other goodness-of-fit tests can reject some distributions as not fitting---even when the fit might be good enough for some practical applications.







                  edited Jul 22 at 8:08

























                  answered Jul 22 at 7:56









BruceET
12k • 1 gold badge, 8 silver badges, 26 bronze badges




                      Thanks for contributing an answer to Cross Validated!

