
What does Fisher mean by this quote?


I keep seeing this famous quote everywhere, but fail to understand the emphasized part every single time.

A man who ‘rejects’ a hypothesis provisionally, as a matter of
habitual practice, when the significance is at the 1% level or higher,
will certainly be mistaken in not more than 1% of such decisions. For
when the hypothesis is correct he will be mistaken in just 1% of these
cases, and when it is incorrect he will never be mistaken in
rejection. [...] However, the calculation is absurdly academic, for in
fact no scientific worker has a fixed level of significance at which
from year to year, and in all circumstances, he rejects hypotheses; he
rather gives his mind to each particular case in the light of his
evidence and his ideas. It should not be forgotten that the cases
chosen for applying a test are manifestly a highly selected set, and
that the conditions of selection cannot be specified even for a single
worker; nor that in the argument used it would clearly be illegitimate
for one to choose the actual level of significance indicated by a
particular trial as though it were his lifelong habit to use just this
level.

(Statistical Methods and Scientific Inference, 1956, p. 42-45)

More specifically, I don't understand:

Why are the cases chosen for applying a test "highly selected"? Say you wonder whether the average height of people within an area is less than 165 cm, and decide to conduct a test. The standard procedure, as far as I know, is to draw a random sample of people from the area and measure their heights. How can this be highly selected?

Suppose the cases are highly selected. How is this related to the choice of the significance level? Consider again the example above: if your sampling method (which I suppose is what Fisher refers to as the conditions of selection) is skewed and somehow favors tall people, then the whole research is ruined, and a subjective determination of the significance level cannot save it.

Actually, I don't even know what "the actual level of significance indicated by a particular trial" refers to. Is it the $p$-value of that experiment, some preset value like the (in)famous 0.05, or something else?

edited Aug 8 at 6:07

asked Aug 8 at 6:01

nalzok


hypothesis-testing statistical-significance references experiment-design philosophical


3 Answers

Here is my paraphrase of what Fisher says in your bolded quote. It should not be forgotten that quite a lot goes into choosing which hypothesis to test, so much so that even for a single person's decisions you could not specify it all. It also should not be forgotten that, for the reasons stated above, you cannot choose every trial's significance level the same way, as a lifelong habit.

A scientific hypothesis is selected as worth testing against many other competing hypotheses because of the biases of the researcher and their current state of knowledge. The hypotheses are "highly selected", not the samples; the hypotheses are the cases where we apply tests.

The selection process of the hypotheses affects our significance level. If we are already quite sure of a hypothesis, a less stringent significance level is enough to satisfy us. If we are unsure, there is a higher burden of proof. Other factors come into play as well, such as a Type I error being worse than a Type II error in drug trials.

I think when he says "indicated by" he simply means "chosen for". Yes, it is a preset value where we reject the hypothesis if the p-value is more extreme.

answered Aug 8 at 7:44

Drew N


The cases to which Fisher is referring are not observations but tests. That is, we select hypotheses to test. We don't just test random hypotheses - we base them on observation, the literature, scientific theories and so on.

If you did test random hypotheses, then the proportion of times you are mistaken (in the first sentence of your quote) would be 1% (or whatever level is chosen). E.g. if we tested hypotheses like

The parity of a person's social security number is related to his IQ

Blond-haired people throw Frisbees better than dark-haired people

The time to getting an answer on Cross Validated is related to the number of syllables in your first name

and tested a whole bunch of them at 1%, we would reject the null about 1% of the time, and do so incorrectly. (Unless, of course, I am on to something with the above nonsense.)
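The "about 1% of the time" claim can be checked with a small simulation. This is only a sketch under stated assumptions: every null hypothesis is true, each test is a two-sided z-test on 30 normal observations with known standard deviation 1, and the function name `null_rejection_rate` and its parameters are made up for illustration.

```python
import random
from statistics import NormalDist, fmean


def null_rejection_rate(n_tests=20_000, n_obs=30, alpha=0.01, seed=0):
    """Fraction of two-sided z-tests that reject when every null is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()  # standard normal, used for the p-value
    rejections = 0
    for _ in range(n_tests):
        # Assumption: data have true mean 0 and known sd 1, so the
        # null hypothesis "mean = 0" holds in every simulated study.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        z = fmean(sample) * n_obs ** 0.5
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
        if p_value <= alpha:
            rejections += 1  # every rejection here is a type I error
    return rejections / n_tests


rate = null_rejection_rate()
print(f"rejected {rate:.2%} of true nulls at the 1% level")
```

The printed fraction should land close to the chosen level, and each of those rejections is mistaken by construction.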

I did once see an article about hair color and Frisbee throwing - and it found a difference! So, I call this sort of thing "Frisbee research".

But the part I like the best from the quote is this:

for in fact no scientific worker has a fixed level of significance at
which from year to year, and in all circumstances, he rejects
hypotheses; he rather gives his mind to each particular case in the
light of his evidence and his ideas.

He must be spinning in his grave.

answered Aug 8 at 13:10

Peter Flom♦

This is a good answer, but I hesitate to view "Frisbee research" as a bad thing. As long as the methodologies are employed properly (taking into account the effect size, etc.), I would consider the result plausible. I mean, it is believed that hair color has nothing to do with Frisbee throwing, but it was accepted that Earth is at the center of the universe until hundreds of years ago! We can criticize people for doing things wrong, but we shouldn't blame anyone for asking questions. That being said, I agree that some hypotheses are less useful than others, but still, they can be correct.
– nalzok Aug 9 at 0:09

And they can also be type I errors.
– Peter Flom♦ Aug 9 at 10:54

Trying to see the background of the quote, I came across a version of the book (I am not sure which edition it is) that has a slightly different quote:

https://archive.org/details/in.ernet.dli.2015.134555/page/n47

The attempts that have been made to explain the cogency of tests of significance in scientific research, by reference to hypothetical frequencies of possible statements, based on them, being right or wrong, thus seem to miss the essential nature of such tests. A man who "rejects" a hypothesis provisionally, as a matter of habitual practice, when the significance is at the 1% level or higher, will certainly be mistaken in not more than 1% of such decisions. For when the hypothesis is correct he will be mistaken in just 1% of these cases, and when it is incorrect he will never be mistaken in rejection. This inequality statement can therefore be made. However, the calculation is absurdly academic, for in fact no scientific worker has a fixed level of significance at which from year to year, and in all circumstances, he rejects hypotheses; he rather gives his mind to each particular case in the light of his evidence and his ideas. Further, the calculation is based solely on a hypothesis, which, in the light of the evidence, is often not believed to be true at all, so that the actual probability of erroneous decision, supposing such a phrase to have any meaning, may be much less than the frequency specifying the level of significance. To a practical man, also, who rejects a hypothesis, it is, of course, a matter of indifference with what probability he might be led to accept the hypothesis falsely, for in his case he is not accepting it.

This seems to me a criticism of using the mathematical expression of rejection probabilities (type I error rates) as if it were a rigorous argument. Those expressions often do not capture what is relevant, and neither are they rigorous.

Why are the cases chosen for applying a test "highly selected"?

This seems to relate to the sentence

Further, the calculation is based solely on a hypothesis, which, in
the light of the evidence, is often not believed to be true at all

We are not indifferent towards the hypothesis that is being tested, and often a hypothesis that is being tested is not believed to be true.

how is this related to the choice of the significance level?

This relates to

so that the actual probability of erroneous decision, supposing such a phrase to have any meaning, may be much less than the frequency specifying the level of significance

The significance level is just the frequency of making a mistake when the null hypothesis is true. But the actual frequency of making a mistake will be different (lower), because the hypothesis being tested is often not true.
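Fisher's inequality statement can be written out as a short worked equation. As a sketch, assume (this notation is not in the book) that a fraction $\pi_0$ of the hypotheses a worker tests are actually true and that he always rejects at level $\alpha$. A rejection is a mistake only when the hypothesis is true, so the long-run fraction of his decisions that are mistaken rejections is

$$
P(\text{mistaken rejection}) = \pi_0 \cdot \alpha + (1 - \pi_0) \cdot 0 = \pi_0 \alpha \le \alpha .
$$

With $\alpha = 0.01$ and, say, half of the tested hypotheses true ($\pi_0 = 0.5$), this is $0.005$; equality holds only when every tested hypothesis is true ($\pi_0 = 1$).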

what is "the actual level of significance indicated by a particular trial" referring to

I believe that this part refers to some sort of p-value hacking: changing the significance level, alpha, after the observations have been made so that it matches the observed p-value, and pretending that this was the cut-off value all along.
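If the passage is read this way, a toy simulation shows why the practice is illegitimate. It is a sketch under stated assumptions: the null hypothesis is true in every trial and the test statistic is continuous, so the p-value can be drawn directly as uniform on [0, 1); the name `rejection_rates` is hypothetical.

```python
import random


def rejection_rates(n_trials=10_000, preset_alpha=0.01, seed=1):
    """Compare a preset level with a level 'chosen' after seeing the p-value."""
    rng = random.Random(seed)
    preset = post_hoc = 0
    for _ in range(n_trials):
        # Assumption: under a true null with a continuous test statistic,
        # the p-value is uniformly distributed, so it is drawn directly.
        p_value = rng.random()
        if p_value <= preset_alpha:
            preset += 1  # honest rule: level fixed before the data
        claimed_alpha = p_value  # level set to match the observed p-value
        if p_value <= claimed_alpha:
            post_hoc += 1  # this comparison can never fail
    return preset / n_trials, post_hoc / n_trials


preset_rate, post_hoc_rate = rejection_rates()
print(f"preset level: {preset_rate:.2%} rejected, post hoc level: {post_hoc_rate:.2%} rejected")
```

With the level fixed in advance, true nulls are rejected at roughly the stated rate. With the level set to the observed p-value, every trial "rejects", so the claimed level says nothing about how often the worker is mistaken.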

edited Aug 8 at 11:18

answered Aug 8 at 11:10

Martijn Weterings



3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

A scientific hypothesis is selected as worth testing against many other competing hypotheses because of the biases of the researcher and their current state of knowledge. The hypotheses are "highly selected", not the samples; the hypotheses are the cases where we apply tests.

The selection process of the hypotheses affects our significance level. If we are very sure of a hypothesis, that should make the significance level less stringent to satisfy ourselves. If we are unsure there is higher burden of proof. Other factors come into play as well, such as Type I error being worse than Type II in drug trials.

I think when he says "indicated by" he simply means "chosen for". Yes, it is a preset value where we reject the hypothesis if the p-value is more extreme.

answered Aug 8 at 7:44

Drew N

3952 silver badges9 bronze badges

add a comment |

A scientific hypothesis is selected as worth testing against many other competing hypotheses because of the biases of the researcher and their current state of knowledge. The hypotheses are "highly selected", not the samples; the hypotheses are the cases where we apply tests.

The selection process of the hypotheses affects our significance level. If we are very sure of a hypothesis, that should make the significance level less stringent to satisfy ourselves. If we are unsure there is higher burden of proof. Other factors come into play as well, such as Type I error being worse than Type II in drug trials.

I think when he says "indicated by" he simply means "chosen for". Yes, it is a preset value where we reject the hypothesis if the p-value is more extreme.

answered Aug 8 at 7:44

Drew N

3952 silver badges9 bronze badges

add a comment |

A scientific hypothesis is selected as worth testing against many other competing hypotheses because of the biases of the researcher and their current state of knowledge. The hypotheses are "highly selected", not the samples; the hypotheses are the cases where we apply tests.

The selection process of the hypotheses affects our significance level. If we are very sure of a hypothesis, that should make the significance level less stringent to satisfy ourselves. If we are unsure there is higher burden of proof. Other factors come into play as well, such as Type I error being worse than Type II in drug trials.

I think when he says "indicated by" he simply means "chosen for". Yes, it is a preset value where we reject the hypothesis if the p-value is more extreme.

answered Aug 8 at 7:44

Drew N

3952 silver badges9 bronze badges

A scientific hypothesis is selected as worth testing against many other competing hypotheses because of the biases of the researcher and their current state of knowledge. The hypotheses are "highly selected", not the samples; the hypotheses are the cases where we apply tests.

The selection process of the hypotheses affects our significance level. If we are very sure of a hypothesis, that should make the significance level less stringent to satisfy ourselves. If we are unsure there is higher burden of proof. Other factors come into play as well, such as Type I error being worse than Type II in drug trials.

I think when he says "indicated by" he simply means "chosen for". Yes, it is a preset value where we reject the hypothesis if the p-value is more extreme.

answered Aug 8 at 7:44

Drew N

3952 silver badges9 bronze badges

answered Aug 8 at 7:44

Drew N

3952 silver badges9 bronze badges

answered Aug 8 at 7:44

Drew N

3952 silver badges9 bronze badges

answered Aug 8 at 7:44

Drew N

3952 silver badges9 bronze badges

add a comment |

If you did test random hypotheses, then the number of times you are mistaken (in the first sentence of your quote) would be 1% (or whatever value is chosen). E.g. if we tested hypotheses like

The parity of a person's social security number is related to his IQ

Blond haired people throw Frisbees better than dark haired people

The time to getting an answer on Cross Validated is related to the number of syllables in your first name.

And tested a whole bunch of them at 1%, we would reject the null about 1% of the time, and do so incorrectly. (Unless, of course, I am on to something with the above nonsense).

I did once see an article about hair color and Frisbee throwing - and it found a difference! So, I call this sort of thing "Frisbee research".

But the part I like the best from the quote is this:

for in fact no scientific worker has a fixed level of significance at
which from year to year, and in all circumstances, he rejects
hypotheses; he rather gives his mind to each particular case in the
light of his evidence and his ideas.

He must be spinning in his grave.

answered Aug 8 at 13:10

Peter Flom♦

80k13 gold badges116 silver badges225 bronze badges

3

$begingroup$
This is a good answer, but I'm hesitated to view "Frisbee research" as bad things. As long as the methodologies are employed properly (taking into account the effect size, etc), I would consider the result plausible. I mean, it is believed that hair color has nothing to do with Frisbee throwing, but it was accepted that Earth is at the center of the universe until hundreds of years ago! We can criticize people for doing things wrong, but we shouldn't blame anyone for asking questions. That being said, I agree that some hypotheses are less useful than others, but still, they can be correct.
$endgroup$
– nalzok
Aug 9 at 0:09

$begingroup$
And they can also be type I errors.
$endgroup$
– Peter Flom♦
Aug 9 at 10:54

add a comment |

If you did test random hypotheses, then the number of times you are mistaken (in the first sentence of your quote) would be 1% (or whatever value is chosen). E.g. if we tested hypotheses like

The parity of a person's social security number is related to his IQ

Blond haired people throw Frisbees better than dark haired people

The time to getting an answer on Cross Validated is related to the number of syllables in your first name.

And tested a whole bunch of them at 1%, we would reject the null about 1% of the time, and do so incorrectly. (Unless, of course, I am on to something with the above nonsense).

I did once see an article about hair color and Frisbee throwing - and it found a difference! So, I call this sort of thing "Frisbee research".

But the part I like the best from the quote is this:

for in fact no scientific worker has a fixed level of significance at
which from year to year, and in all circumstances, he rejects
hypotheses; he rather gives his mind to each particular case in the
light of his evidence and his ideas.

He must be spinning in his grave.

answered Aug 8 at 13:10

Peter Flom♦

80k13 gold badges116 silver badges225 bronze badges

3

$begingroup$
This is a good answer, but I'm hesitated to view "Frisbee research" as bad things. As long as the methodologies are employed properly (taking into account the effect size, etc), I would consider the result plausible. I mean, it is believed that hair color has nothing to do with Frisbee throwing, but it was accepted that Earth is at the center of the universe until hundreds of years ago! We can criticize people for doing things wrong, but we shouldn't blame anyone for asking questions. That being said, I agree that some hypotheses are less useful than others, but still, they can be correct.
$endgroup$
– nalzok
Aug 9 at 0:09

$begingroup$
And they can also be type I errors.
$endgroup$
– Peter Flom♦
Aug 9 at 10:54

add a comment |

If you did test random hypotheses, then the number of times you are mistaken (in the first sentence of your quote) would be 1% (or whatever value is chosen). E.g. if we tested hypotheses like

The parity of a person's social security number is related to his IQ

Blond haired people throw Frisbees better than dark haired people

The time to getting an answer on Cross Validated is related to the number of syllables in your first name.

And tested a whole bunch of them at 1%, we would reject the null about 1% of the time, and do so incorrectly. (Unless, of course, I am on to something with the above nonsense).

I did once see an article about hair color and Frisbee throwing - and it found a difference! So, I call this sort of thing "Frisbee research".

But the part I like the best from the quote is this:

for in fact no scientific worker has a fixed level of significance at
which from year to year, and in all circumstances, he rejects
hypotheses; he rather gives his mind to each particular case in the
light of his evidence and his ideas.

He must be spinning in his grave.

answered Aug 8 at 13:10

Peter Flom♦

80k13 gold badges116 silver badges225 bronze badges

If you did test random hypotheses, then the number of times you are mistaken (in the first sentence of your quote) would be 1% (or whatever value is chosen). E.g. if we tested hypotheses like

The parity of a person's social security number is related to his IQ

Blond haired people throw Frisbees better than dark haired people

The time to getting an answer on Cross Validated is related to the number of syllables in your first name.

And tested a whole bunch of them at 1%, we would reject the null about 1% of the time, and do so incorrectly. (Unless, of course, I am on to something with the above nonsense).

I did once see an article about hair color and Frisbee throwing - and it found a difference! So, I call this sort of thing "Frisbee research".

But the part I like the best from the quote is this:

for in fact no scientific worker has a fixed level of significance at
which from year to year, and in all circumstances, he rejects
hypotheses; he rather gives his mind to each particular case in the
light of his evidence and his ideas.

He must be spinning in his grave.

answered Aug 8 at 13:10

Peter Flom♦

80k13 gold badges116 silver badges225 bronze badges

answered Aug 8 at 13:10

Peter Flom♦

80k13 gold badges116 silver badges225 bronze badges

answered Aug 8 at 13:10

Peter Flom♦

80k13 gold badges116 silver badges225 bronze badges

answered Aug 8 at 13:10

Peter Flom♦

80k13 gold badges116 silver badges225 bronze badges

3

$begingroup$
This is a good answer, but I'm hesitated to view "Frisbee research" as bad things. As long as the methodologies are employed properly (taking into account the effect size, etc), I would consider the result plausible. I mean, it is believed that hair color has nothing to do with Frisbee throwing, but it was accepted that Earth is at the center of the universe until hundreds of years ago! We can criticize people for doing things wrong, but we shouldn't blame anyone for asking questions. That being said, I agree that some hypotheses are less useful than others, but still, they can be correct.
$endgroup$
– nalzok
Aug 9 at 0:09

$begingroup$
And they can also be type I errors.
$endgroup$
– Peter Flom♦
Aug 9 at 10:54

add a comment |


Trying to trace the background of the quote, I found an edition of the book (I am not sure which edition it is) in which the passage reads slightly differently:

https://archive.org/details/in.ernet.dli.2015.134555/page/n47

The attempts that have been made to explain the cogency of tests of significance in scientific research, by reference to hypothetical frequencies of possible statements, based on them, being right or wrong, thus seem to miss the essential nature of such tests. A man who "rejects" a hypothesis provisionally, as a matter of habitual practice, when the significance is at the 1% level or higher, will certainly be mistaken in not more than 1% of such decisions. For when the hypothesis is correct he will be mistaken in just 1% of these cases, and when it is incorrect he will never be mistaken in rejection. This inequality statement can therefore be made. However, the calculation is absurdly academic, for in fact no scientific worker has a fixed level of significance at which from year to year, and in all circumstances, he rejects hypotheses; he rather gives his mind to each particular case in the light of his evidence and his ideas. Further, the calculation is based solely on a hypothesis, which, in the light of the evidence, is often not believed to be true at all, so that the actual probability of erroneous decision, supposing such a phrase to have any meaning, may be much less than the frequency specifying the level of significance. To a practical man, also, who rejects a hypothesis, it is, of course, a matter of indifference with what probability he might be led to accept the hypothesis falsely, for in his case he is not accepting it.

Why are the cases chosen for applying a test "highly selected"?

This seems to relate to the sentence

Further, the calculation is based solely on a hypothesis, which, in
the light of the evidence, is often not believed to be true at all

We are not indifferent towards the hypothesis that is being tested, and often a hypothesis that is being tested is not believed to be true.

How is this related to the choice of the significance level?

This relates to

so that the actual probability of erroneous decision, supposing such a phrase to have any meaning, may be much less than the frequency specifying the level of significance

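The significance level is just the frequency of making a mistake (a wrong rejection) when the null hypothesis is true. But the null hypothesis is not true in every case that gets tested, and when it is false a rejection is never a mistake, so the actual frequency of mistaken rejections will be different (lower).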
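Fisher's inequality can be checked with a small simulation. The sketch below is only an illustration under assumptions that are not in the quote: a one-sided z-test, a made-up share of tested cases in which the null hypothesis is actually true (`prob_null_true`), and a made-up effect size for the remaining cases. It counts how often a rejection at level `alpha` is a mistake, as a fraction of all tests.

```python
import math
import random


def one_sided_p_value(z):
    """Upper-tail p-value of a standard normal test statistic."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def simulate_error_rate(n_tests=200_000, alpha=0.01, prob_null_true=0.3,
                        effect=3.0, seed=0):
    """Fraction of all tests that end in a mistaken rejection.

    prob_null_true and effect are hypothetical values chosen for
    illustration; they do not come from Fisher's text.
    """
    rng = random.Random(seed)
    false_rejections = 0
    for _ in range(n_tests):
        null_is_true = rng.random() < prob_null_true
        # Under the null the statistic is N(0, 1); otherwise it is shifted.
        mean = 0.0 if null_is_true else effect
        z = rng.gauss(mean, 1.0)
        rejected = one_sided_p_value(z) <= alpha
        # A rejection is only a mistake when the null was actually true.
        if rejected and null_is_true:
            false_rejections += 1
    return false_rejections / n_tests


if __name__ == "__main__":
    print("null true in 30% of cases:", simulate_error_rate())
    print("null always true:         ", simulate_error_rate(prob_null_true=1.0))
```

When the null is always true the rate should come out close to `alpha`; when it is true in only part of the cases the rate should be close to `prob_null_true * alpha`, which is the sense in which the actual frequency of erroneous decisions "may be much less than the frequency specifying the level of significance".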

What is "the actual level of significance indicated by a particular trial" referring to?

I believe this part refers to a form of p-hacking: changing the significance level, alpha, after the observations have been made so that it matches the observed p-value, and then pretending that this was the cut-off value all along.

edited Aug 8 at 11:18

answered Aug 8 at 11:10

Martijn Weterings

15.5k reputation, 23 silver badges, 67 bronze badges

