What to bootstrap for hypothesis testing
I have a small question about the concept behind hypothesis testing using the bootstrap. Assume that I need to evaluate the difference between the means of two independent populations, a and b. My doubt is the following:
Should I bootstrap each population separately and then take the difference of the means?
Mean[BOOT(a)-BOOT(b)]
Alternatively, should I compute the difference:
Mean(a)-Mean(b)
and then apply the bootstrap?
BOOT[Mean(a)-Mean(b)]
I wrote this code using the second approach:
set.seed(123)
a <- rnorm(100)
b <- rnorm(100)
hist(a)
hist(b)
# Element-wise differences: note that this pairs a[i] with b[i]
c = a-b
hist(c)
# Draw R bootstrap resamples of dati_oss and return the mean of each one
boot_1 = function(R, dati_oss) {
  n = length(dati_oss)
  media_boot = vector("numeric", R)
  for (i in seq_len(R)) {
    ind = sample(n, replace = TRUE)
    media_boot[i] = mean(dati_oss[ind])
  }
  return(media_boot)
}
res = boot_1(500000, c)
hist(res)
# Observed mean, bootstrap mean, bias, standard error, 95% percentile interval
stat = matrix(c(mean(c), mean(res), mean(res)-mean(c), sqrt(var(res)),
                as.vector(quantile(res, c(0.025, 0.975)))), 1, 6)
colnames(stat) = c("Observed", "Mean-boot", "Bias", "SE", "0.95LCI", "0.95UCI")
row.names(stat) = c("Mean")
stat
r hypothesis-testing bootstrap
edited Jun 10 at 14:58 by gung♦
asked Jun 10 at 14:14 by an.dr.ea
Did you try running your code? I get an error. Also, your code doesn't actually match your method 2. – gung♦ Jun 10 at 15:00
@gung where did you get the error? – an.dr.ea Jun 10 at 15:02
After running res=boot_1(500000,c). – gung♦ Jun 10 at 15:04
@MichaelM Thanks, I just wanted to know if approach 1 or 2 is correct. – an.dr.ea Jun 10 at 15:05
@gung I don't have this problem, try with fewer reps: res = boot_1(5000, c) – an.dr.ea Jun 10 at 15:07
2 Answers
The basic principle to apply, quoting @MichaelChernick, is: "Sampling with replacement behaves on the original sample the way the original sample behaves on a population."
Think about how you analyzed the original sample. You took the sample, calculated the mean of each of the 2 groups, and determined the difference between their mean values to get an estimate of the a-b difference.
So you proceed similarly with each bootstrapped resample: resample from the original sample, calculate the mean of each group, and determine the difference between the means of the two groups as represented in the resample. Do this a large number of times to estimate the distribution of a-b differences. Compare the mean of the bootstrapped a-b differences against the a-b difference found in the original sample to estimate the bias in the original a-b difference.
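A minimal R sketch of the procedure just described, assuming two independent samples that are resampled separately. The helper name boot_diff, the simulated data, and the choice of 5000 resamples are illustrative assumptions, not part of the original question or answer.

```r
set.seed(123)
a <- rnorm(100)  # simulated sample from population a (illustrative data)
b <- rnorm(100)  # simulated sample from population b (illustrative data)

# Hypothetical helper: resample each group with replacement, independently,
# and return the difference of the resampled means for each of R resamples.
boot_diff <- function(x, y, R = 5000) {
  replicate(R, mean(sample(x, replace = TRUE)) - mean(sample(y, replace = TRUE)))
}

obs_diff  <- mean(a) - mean(b)      # a-b difference in the original sample
boot_dist <- boot_diff(a, b)        # bootstrap distribution of a-b differences

bias_est <- mean(boot_dist) - obs_diff           # estimated bias
se_est   <- sd(boot_dist)                        # bootstrap standard error
ci       <- quantile(boot_dist, c(0.025, 0.975)) # 95% percentile interval
```

The statistic is recomputed from scratch inside every resample, which mirrors how the original sample was analyzed.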
Note that the way you design the resampling might depend on the original study design. If you had two independent populations from which you took samples, then the resampling should proceed comparably, within each of the populations. If you sampled from a mixed population in which individual cases were labeled as belonging to population a versus b, then you should resample from a pool of all the cases in the original sample.
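For the second design (one mixed sample with labeled cases), a sketch along the following lines may help. It assumes a hypothetical data frame dat with columns value and group; whole rows are resampled so that each case keeps its label, and the group sizes are allowed to vary from resample to resample.

```r
set.seed(123)
# Hypothetical mixed sample: each case carries a label for its population
dat <- data.frame(
  value = rnorm(200),
  group = rep(c("a", "b"), each = 100)
)

# Difference of group means computed from a (re)sampled data frame
diff_means <- function(d) {
  mean(d$value[d$group == "a"]) - mean(d$value[d$group == "b"])
}

R <- 2000
boot_pooled <- replicate(R, {
  idx <- sample(nrow(dat), replace = TRUE)  # resample cases from the pool
  diff_means(dat[idx, ])
})

obs_diff <- diff_means(dat)
quantile(boot_pooled, c(0.025, 0.975))      # 95% percentile interval
```

The only change from resampling within each group is which units are drawn: here the unit is a labeled case from the pooled sample rather than a value from one group.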
answered Jun 10 at 16:53 by EdM
Without going into the coding of it, consider what happens if you calculate Mean(a)-Mean(b) first (proposal 2). You calculate Mean(a) and get a single number, then you calculate Mean(b) and get a single number. Then you take your two numbers, and calculate Mean(a)-Mean(b) to get a single value for your difference. No matter how many times you sample/bootstrap this single value, you will get the same number. Try it out!
In contrast, if you repeatedly resample from your sample of population A and your sample of population B, and calculate the difference of the resampled means each time (proposal 1), you will get a slightly different value nearly every time you resample (assuming you don't re-set the random seed at the wrong point in your code), so you will get a range of values for the difference.
I would consider proposal 1 to be a form of bootstrapping, but I wouldn't say the same about proposal 2!
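To try it out, here is a small R sketch contrasting the two proposals. The variable names and the 1000 resamples are illustrative assumptions; the single value is resampled by index because sample() treats a length-one numeric argument specially.

```r
set.seed(123)
a <- rnorm(100)
b <- rnorm(100)

# Proposal 2: compute the difference once, then "resample" that single value
single_diff <- mean(a) - mean(b)
resampled_single <- replicate(1000, {
  idx <- sample.int(length(single_diff), replace = TRUE)  # always index 1
  mean(single_diff[idx])
})
length(unique(resampled_single))  # one distinct value: nothing varies

# Proposal 1: resample each group, then take the difference of the means
resampled_groups <- replicate(1000, {
  mean(sample(a, replace = TRUE)) - mean(sample(b, replace = TRUE))
})
sd(resampled_groups)              # positive: the differences now vary
```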
answered Jun 10 at 16:59 by Izy