Randomly generate a sorted set with uniform distributionGenerate a Monte Carlo sample from a PDF defined by a Fourier SeriesSample $x$ from $g(x)$Problems sampling from a $pdf$ over $SOleft(3right)$Uniformly distributed points over the surface of the standard simplexHow to generate random numbers with Laplace distribution using uniform distributionConditional Sample from Gaussian CopulaProbability of an unordered sample under weighted sampling without replacementSampling from a Mixture of DistributionsGenerating samples of a Gaussian distribution with independent components above a hypersurfaceHow to sample from a multivariate normal distribution in hyperbolic space?
Does Lawful Interception of 4G / the proposed 5G provide a back door for hackers as well?
LWC1513: @salesforce/resourceUrl modules only support default imports
51% attack - apparently very easy? refering to CZ's "rollback btc chain" - How to make sure such corruptible scenario can never happen so easily?
How can a layman easily get the consensus view of what academia *thinks* about a subject?
What is the limit on how high you can fly up?
What is the best way for a skeleton to impersonate human without using magic?
Quote from Leibniz
correct spelling of "carruffel" (fuzz, hustle, all that jazz)
What are the holes in files created with fallocate?
Is 12 minutes connection in Bristol Temple Meads long enough?
Can't find the release for this wiring harness connector
How do I tell my supervisor that he is choosing poor replacements for me while I am on maternity leave?
On studying Computer Science vs. Software Engineering to become a proficient coder
Can someone explain homicide-related death rates?
return tuple of uncopyable objects
Why does the headset man not get on the tractor?
What are the implications of the new alleged key recovery attack preprint on SIMON?
Is there any good reason to write "it is easy to see"?
Why is a set not a partition of itself?
Do I need to say 'o`clock'?
Is there ever any indication in the MCU as to how Spider-Man got his powers?
Was this character’s old age look CGI or make-up?
Why did the metro bus stop at each railway crossing, despite no warning indicating a train was coming?
What is the largest number of identical satellites launched together?
Randomly generate a sorted set with uniform distribution
Generate a Monte Carlo sample from a PDF defined by a Fourier SeriesSample $x$ from $g(x)$Problems sampling from a $pdf$ over $SOleft(3right)$Uniformly distributed points over the surface of the standard simplexHow to generate random numbers with Laplace distribution using uniform distributionConditional Sample from Gaussian CopulaProbability of an unordered sample under weighted sampling without replacementSampling from a Mixture of DistributionsGenerating samples of a Gaussian distribution with independent components above a hypersurfaceHow to sample from a multivariate normal distribution in hyperbolic space?
$begingroup$
I have an ordered set $S = langle S_1, S_2, .., S_M rangle$ from which I want to draw a sample of $N$ elements in such a way that the sample is non-strictly totally ordered (as with $leq$ and the integers), and all the possible occur with equal probability. The sample must be taken with repetition.
For example, let's say $S = langle 1, 2, 3, 4 rangle$ and $N=3$, the samples: $[1, 1, 1]$, $[1,2,3]$, $[2,3,3]$ would be valid, but $[3,2,1]$ or $[2,1,1]$ would be invalid.
A simple way to generate this set would be to just randomly sample from $S$, and then sort the resulting sequence. However, please note that the following approach is biased ($[1,1,1]$ is less likely to occur than $[1,2,3]$, for example).
This question is related to one of the answers given in this StackOverflow question:https://stackoverflow.com/questions/26467434/generating-random-number-in-sorted-order. Note that the algorithm proposed there is to generate such a sample without repetition, whereas I want my sample to be generated with repetition.
random sampling
$endgroup$
|
show 6 more comments
$begingroup$
I have an ordered set $S = langle S_1, S_2, .., S_M rangle$ from which I want to draw a sample of $N$ elements in such a way that the sample is non-strictly totally ordered (as with $leq$ and the integers), and all the possible occur with equal probability. The sample must be taken with repetition.
For example, let's say $S = langle 1, 2, 3, 4 rangle$ and $N=3$, the samples: $[1, 1, 1]$, $[1,2,3]$, $[2,3,3]$ would be valid, but $[3,2,1]$ or $[2,1,1]$ would be invalid.
A simple way to generate this set would be to just randomly sample from $S$, and then sort the resulting sequence. However, please note that the following approach is biased ($[1,1,1]$ is less likely to occur than $[1,2,3]$, for example).
This question is related to one of the answers given in this StackOverflow question:https://stackoverflow.com/questions/26467434/generating-random-number-in-sorted-order. Note that the algorithm proposed there is to generate such a sample without repetition, whereas I want my sample to be generated with repetition.
random sampling
$endgroup$
$begingroup$
Is it easy to compute precisely how biased this "simple" method is? (As an example, $[1,2,3]$ is six times more likely than it should be in your case.) If so then you can achieve an unbiased method by randomly rejecting some of the outputs of this "simple" method. This would probably be rather slow, but if you don't really need particularly high speed then that's not a problem. If you need faster than you'll need to be a bit more clever.
$endgroup$
– Ian
May 8 at 19:00
$begingroup$
In any case, does the simple algorithm "randomly pick $s_1$ from $S_1,dots,S_M$, randomly pick $s_2$ from $s_2,dots,S_M$, ..." give a biased result as well? (By the way, are you sure it makes sense to do this problem in the setting of just a partial order? For a partial order not all samples can be sorted...)
$endgroup$
– Ian
May 8 at 19:10
$begingroup$
@Ian, this "correction" method is something I've been thinking of. However, I couldn't figure out how to compute this bias in the general $N$, $M$ case. Any help with that would be much appreciated! As for your second comment, I do not understand the sample techniques you're describing. It appears to me as if you are just picking the elements in order, but that wouldn't be a random sample. Finally, by partial order, I meant something like <= for integers (which is actually the real problem I'm trying to solve). Please let me know if my naming is incorrect so I can adapt the question.
$endgroup$
– Setzer22
May 8 at 19:22
$begingroup$
If you're talking about a totally ordered set then the bias is straightforward to compute: a particular sample will have its probability multiplied by the number of permutations of that sample that exist, which is easily obtained by using factorials. As for my second idea, the point is to randomly select elements of your sample in order from among those that you're still allowed to select based on the sorting requirement. Again this is reliant on the order being total.
$endgroup$
– Ian
May 8 at 19:24
$begingroup$
@MarcusRitt Note that the interval is always closed, so $langle 1,2 rangle$ can indeed give $[1,1]$ in that approach.
$endgroup$
– Ian
May 8 at 19:26
|
show 6 more comments
$begingroup$
I have an ordered set $S = langle S_1, S_2, .., S_M rangle$ from which I want to draw a sample of $N$ elements in such a way that the sample is non-strictly totally ordered (as with $leq$ and the integers), and all the possible occur with equal probability. The sample must be taken with repetition.
For example, let's say $S = langle 1, 2, 3, 4 rangle$ and $N=3$, the samples: $[1, 1, 1]$, $[1,2,3]$, $[2,3,3]$ would be valid, but $[3,2,1]$ or $[2,1,1]$ would be invalid.
A simple way to generate this set would be to just randomly sample from $S$, and then sort the resulting sequence. However, please note that the following approach is biased ($[1,1,1]$ is less likely to occur than $[1,2,3]$, for example).
This question is related to one of the answers given in this StackOverflow question:https://stackoverflow.com/questions/26467434/generating-random-number-in-sorted-order. Note that the algorithm proposed there is to generate such a sample without repetition, whereas I want my sample to be generated with repetition.
random sampling
$endgroup$
I have an ordered set $S = langle S_1, S_2, .., S_M rangle$ from which I want to draw a sample of $N$ elements in such a way that the sample is non-strictly totally ordered (as with $leq$ and the integers), and all the possible occur with equal probability. The sample must be taken with repetition.
For example, let's say $S = langle 1, 2, 3, 4 rangle$ and $N=3$, the samples: $[1, 1, 1]$, $[1,2,3]$, $[2,3,3]$ would be valid, but $[3,2,1]$ or $[2,1,1]$ would be invalid.
A simple way to generate this set would be to just randomly sample from $S$, and then sort the resulting sequence. However, please note that the following approach is biased ($[1,1,1]$ is less likely to occur than $[1,2,3]$, for example).
This question is related to one of the answers given in this StackOverflow question:https://stackoverflow.com/questions/26467434/generating-random-number-in-sorted-order. Note that the algorithm proposed there is to generate such a sample without repetition, whereas I want my sample to be generated with repetition.
random sampling
random sampling
edited May 8 at 20:40
Jean Marie
32.7k42357
32.7k42357
asked May 8 at 18:57
Setzer22Setzer22
1625
1625
$begingroup$
Is it easy to compute precisely how biased this "simple" method is? (As an example, $[1,2,3]$ is six times more likely than it should be in your case.) If so then you can achieve an unbiased method by randomly rejecting some of the outputs of this "simple" method. This would probably be rather slow, but if you don't really need particularly high speed then that's not a problem. If you need faster than you'll need to be a bit more clever.
$endgroup$
– Ian
May 8 at 19:00
$begingroup$
In any case, does the simple algorithm "randomly pick $s_1$ from $S_1,dots,S_M$, randomly pick $s_2$ from $s_2,dots,S_M$, ..." give a biased result as well? (By the way, are you sure it makes sense to do this problem in the setting of just a partial order? For a partial order not all samples can be sorted...)
$endgroup$
– Ian
May 8 at 19:10
$begingroup$
@Ian, this "correction" method is something I've been thinking of. However, I couldn't figure out how to compute this bias in the general $N$, $M$ case. Any help with that would be much appreciated! As for your second comment, I do not understand the sample techniques you're describing. It appears to me as if you are just picking the elements in order, but that wouldn't be a random sample. Finally, by partial order, I meant something like <= for integers (which is actually the real problem I'm trying to solve). Please let me know if my naming is incorrect so I can adapt the question.
$endgroup$
– Setzer22
May 8 at 19:22
$begingroup$
If you're talking about a totally ordered set then the bias is straightforward to compute: a particular sample will have its probability multiplied by the number of permutations of that sample that exist, which is easily obtained by using factorials. As for my second idea, the point is to randomly select elements of your sample in order from among those that you're still allowed to select based on the sorting requirement. Again this is reliant on the order being total.
$endgroup$
– Ian
May 8 at 19:24
$begingroup$
@MarcusRitt Note that the interval is always closed, so $langle 1,2 rangle$ can indeed give $[1,1]$ in that approach.
$endgroup$
– Ian
May 8 at 19:26
|
show 6 more comments
$begingroup$
Is it easy to compute precisely how biased this "simple" method is? (As an example, $[1,2,3]$ is six times more likely than it should be in your case.) If so then you can achieve an unbiased method by randomly rejecting some of the outputs of this "simple" method. This would probably be rather slow, but if you don't really need particularly high speed then that's not a problem. If you need faster than you'll need to be a bit more clever.
$endgroup$
– Ian
May 8 at 19:00
$begingroup$
In any case, does the simple algorithm "randomly pick $s_1$ from $S_1,dots,S_M$, randomly pick $s_2$ from $s_2,dots,S_M$, ..." give a biased result as well? (By the way, are you sure it makes sense to do this problem in the setting of just a partial order? For a partial order not all samples can be sorted...)
$endgroup$
– Ian
May 8 at 19:10
$begingroup$
@Ian, this "correction" method is something I've been thinking of. However, I couldn't figure out how to compute this bias in the general $N$, $M$ case. Any help with that would be much appreciated! As for your second comment, I do not understand the sample techniques you're describing. It appears to me as if you are just picking the elements in order, but that wouldn't be a random sample. Finally, by partial order, I meant something like <= for integers (which is actually the real problem I'm trying to solve). Please let me know if my naming is incorrect so I can adapt the question.
$endgroup$
– Setzer22
May 8 at 19:22
$begingroup$
If you're talking about a totally ordered set then the bias is straightforward to compute: a particular sample will have its probability multiplied by the number of permutations of that sample that exist, which is easily obtained by using factorials. As for my second idea, the point is to randomly select elements of your sample in order from among those that you're still allowed to select based on the sorting requirement. Again this is reliant on the order being total.
$endgroup$
– Ian
May 8 at 19:24
$begingroup$
@MarcusRitt Note that the interval is always closed, so $langle 1,2 rangle$ can indeed give $[1,1]$ in that approach.
$endgroup$
– Ian
May 8 at 19:26
$begingroup$
Is it easy to compute precisely how biased this "simple" method is? (As an example, $[1,2,3]$ is six times more likely than it should be in your case.) If so then you can achieve an unbiased method by randomly rejecting some of the outputs of this "simple" method. This would probably be rather slow, but if you don't really need particularly high speed then that's not a problem. If you need faster than you'll need to be a bit more clever.
$endgroup$
– Ian
May 8 at 19:00
$begingroup$
Is it easy to compute precisely how biased this "simple" method is? (As an example, $[1,2,3]$ is six times more likely than it should be in your case.) If so then you can achieve an unbiased method by randomly rejecting some of the outputs of this "simple" method. This would probably be rather slow, but if you don't really need particularly high speed then that's not a problem. If you need faster than you'll need to be a bit more clever.
$endgroup$
– Ian
May 8 at 19:00
$begingroup$
In any case, does the simple algorithm "randomly pick $s_1$ from $S_1,dots,S_M$, randomly pick $s_2$ from $s_2,dots,S_M$, ..." give a biased result as well? (By the way, are you sure it makes sense to do this problem in the setting of just a partial order? For a partial order not all samples can be sorted...)
$endgroup$
– Ian
May 8 at 19:10
$begingroup$
In any case, does the simple algorithm "randomly pick $s_1$ from $S_1,dots,S_M$, randomly pick $s_2$ from $s_2,dots,S_M$, ..." give a biased result as well? (By the way, are you sure it makes sense to do this problem in the setting of just a partial order? For a partial order not all samples can be sorted...)
$endgroup$
– Ian
May 8 at 19:10
$begingroup$
@Ian, this "correction" method is something I've been thinking of. However, I couldn't figure out how to compute this bias in the general $N$, $M$ case. Any help with that would be much appreciated! As for your second comment, I do not understand the sample techniques you're describing. It appears to me as if you are just picking the elements in order, but that wouldn't be a random sample. Finally, by partial order, I meant something like <= for integers (which is actually the real problem I'm trying to solve). Please let me know if my naming is incorrect so I can adapt the question.
$endgroup$
– Setzer22
May 8 at 19:22
$begingroup$
@Ian, this "correction" method is something I've been thinking of. However, I couldn't figure out how to compute this bias in the general $N$, $M$ case. Any help with that would be much appreciated! As for your second comment, I do not understand the sample techniques you're describing. It appears to me as if you are just picking the elements in order, but that wouldn't be a random sample. Finally, by partial order, I meant something like <= for integers (which is actually the real problem I'm trying to solve). Please let me know if my naming is incorrect so I can adapt the question.
$endgroup$
– Setzer22
May 8 at 19:22
$begingroup$
If you're talking about a totally ordered set then the bias is straightforward to compute: a particular sample will have its probability multiplied by the number of permutations of that sample that exist, which is easily obtained by using factorials. As for my second idea, the point is to randomly select elements of your sample in order from among those that you're still allowed to select based on the sorting requirement. Again this is reliant on the order being total.
$endgroup$
– Ian
May 8 at 19:24
$begingroup$
If you're talking about a totally ordered set then the bias is straightforward to compute: a particular sample will have its probability multiplied by the number of permutations of that sample that exist, which is easily obtained by using factorials. As for my second idea, the point is to randomly select elements of your sample in order from among those that you're still allowed to select based on the sorting requirement. Again this is reliant on the order being total.
$endgroup$
– Ian
May 8 at 19:24
$begingroup$
@MarcusRitt Note that the interval is always closed, so $langle 1,2 rangle$ can indeed give $[1,1]$ in that approach.
$endgroup$
– Ian
May 8 at 19:26
$begingroup$
@MarcusRitt Note that the interval is always closed, so $langle 1,2 rangle$ can indeed give $[1,1]$ in that approach.
$endgroup$
– Ian
May 8 at 19:26
|
show 6 more comments
2 Answers
2
active
oldest
votes
$begingroup$
It's enough to pick $N$ random elements from $ 1, 2, ldots, M+N-1 $ without replacement and then do a postprocessing step. Say you pick $T_1 < T_2 < ldots < T_N$; then let $S_K = T_K - K + 1$. For example with $M = 4, N = 3$, this is like picking $3$ random elements from $1 ,2 , ldots, 6$. So for example you might pick $T = langle 1, 4, 5 rangle$ and then $S = langle 1 - 1 + 1, 4 - 2 + 1, 5 - 3 + 1 rangle = langle 1, 3, 3 rangle$.
So you need an algorithm for picking random subsets of a given size.
To pick a random subset of size $k$ of the set $1, 2, ldots, n$, there's a nice recursive algorithm. Such a set includes $n$ with probability $k/n$. If it includes $n$, then take the set to be a subset of $1, 2, ldots, n-1$ of size $k-1$, with $n$ adjoined; if it does not include $n$, then the remainder is a subset of $1, 2, ldots, n-1 $ of size $k$. (I learned this algorithm from the late Herb Wilf's notes "East Side, West Side", available online at https://www.math.upenn.edu/~wilf/eastwest.pdf - see page 16 for code. It's in Maple, but should be reasonably understandable.)
$endgroup$
$begingroup$
Must the $T_K$ be sorted? If not then $T_2$ could be $1$ and then you have a problem.
$endgroup$
– Ian
May 8 at 20:25
1
$begingroup$
Also you should obviously have $T_N$ not $T_M$.
$endgroup$
– Ian
May 8 at 20:27
$begingroup$
This is a very elegant solution! As @Ian mentioned, the first random subset must be sorted for this to work, but other than that it worked fine and was very easy to implement in code, thanks!
$endgroup$
– Setzer22
May 8 at 20:57
$begingroup$
I intended for the $T_K$ to be sorted, but it's not obvious from the notation. I will edit accordingly.
$endgroup$
– Michael Lugo
May 9 at 1:59
add a comment |
$begingroup$
I would just draw $3$ random numbers and only accept ones in ascending order, that way you have equal probabilities (i.e. rejection method as discussed)
Otherwise,
Sample the triples by using monte carlo. I.e. count how many different combinations are permissible, and accept each with equal probability according to different values of a $U[0,1]$, i.e. give $111$ the range $[0,1/n]$, $211$ is $[1/n, 2/n]$. As long as you have some sensible ordering on the set of possible sequences you are in business. You can use for loops to achieve this.
$endgroup$
1
$begingroup$
For moderately large $N,M$, the first approach is very slow due to acceptance being rare, while the second approach is slow to get started because there will be many such sorted sequences to handle.
$endgroup$
– Ian
May 8 at 21:56
$begingroup$
Hmm yes agree first approach is kind of standard rejection sampling with low acceptance. For second one you could probably just compute $nU$ with $U$ the standard uniform and then take integer part. Then you can just pass this to an array containing all of the sorted sequences in order.
$endgroup$
– George Dewhirst
May 8 at 22:25
$begingroup$
Can make function from naturals $leq n$ to sequences by doing a recursive function
$endgroup$
– George Dewhirst
May 8 at 23:10
add a comment |
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "69"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3218854%2frandomly-generate-a-sorted-set-with-uniform-distribution%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
It's enough to pick $N$ random elements from $ 1, 2, ldots, M+N-1 $ without replacement and then do a postprocessing step. Say you pick $T_1 < T_2 < ldots < T_N$; then let $S_K = T_K - K + 1$. For example with $M = 4, N = 3$, this is like picking $3$ random elements from $1 ,2 , ldots, 6$. So for example you might pick $T = langle 1, 4, 5 rangle$ and then $S = langle 1 - 1 + 1, 4 - 2 + 1, 5 - 3 + 1 rangle = langle 1, 3, 3 rangle$.
So you need an algorithm for picking random subsets of a given size.
To pick a random subset of size $k$ of the set $1, 2, ldots, n$, there's a nice recursive algorithm. Such a set includes $n$ with probability $k/n$. If it includes $n$, then take the set to be a subset of $1, 2, ldots, n-1$ of size $k-1$, with $n$ adjoined; if it does not include $n$, then the remainder is a subset of $1, 2, ldots, n-1 $ of size $k$. (I learned this algorithm from the late Herb Wilf's notes "East Side, West Side", available online at https://www.math.upenn.edu/~wilf/eastwest.pdf - see page 16 for code. It's in Maple, but should be reasonably understandable.)
$endgroup$
$begingroup$
Must the $T_K$ be sorted? If not then $T_2$ could be $1$ and then you have a problem.
$endgroup$
– Ian
May 8 at 20:25
1
$begingroup$
Also you should obviously have $T_N$ not $T_M$.
$endgroup$
– Ian
May 8 at 20:27
$begingroup$
This is a very elegant solution! As @Ian mentioned, the first random subset must be sorted for this to work, but other than that it worked fine and was very easy to implement in code, thanks!
$endgroup$
– Setzer22
May 8 at 20:57
$begingroup$
I intended for the $T_K$ to be sorted, but it's not obvious from the notation. I will edit accordingly.
$endgroup$
– Michael Lugo
May 9 at 1:59
add a comment |
$begingroup$
It's enough to pick $N$ random elements from $ 1, 2, ldots, M+N-1 $ without replacement and then do a postprocessing step. Say you pick $T_1 < T_2 < ldots < T_N$; then let $S_K = T_K - K + 1$. For example with $M = 4, N = 3$, this is like picking $3$ random elements from $1 ,2 , ldots, 6$. So for example you might pick $T = langle 1, 4, 5 rangle$ and then $S = langle 1 - 1 + 1, 4 - 2 + 1, 5 - 3 + 1 rangle = langle 1, 3, 3 rangle$.
So you need an algorithm for picking random subsets of a given size.
To pick a random subset of size $k$ of the set $1, 2, ldots, n$, there's a nice recursive algorithm. Such a set includes $n$ with probability $k/n$. If it includes $n$, then take the set to be a subset of $1, 2, ldots, n-1$ of size $k-1$, with $n$ adjoined; if it does not include $n$, then the remainder is a subset of $1, 2, ldots, n-1 $ of size $k$. (I learned this algorithm from the late Herb Wilf's notes "East Side, West Side", available online at https://www.math.upenn.edu/~wilf/eastwest.pdf - see page 16 for code. It's in Maple, but should be reasonably understandable.)
$endgroup$
$begingroup$
Must the $T_K$ be sorted? If not then $T_2$ could be $1$ and then you have a problem.
$endgroup$
– Ian
May 8 at 20:25
1
$begingroup$
Also you should obviously have $T_N$ not $T_M$.
$endgroup$
– Ian
May 8 at 20:27
$begingroup$
This is a very elegant solution! As @Ian mentioned, the first random subset must be sorted for this to work, but other than that it worked fine and was very easy to implement in code, thanks!
$endgroup$
– Setzer22
May 8 at 20:57
$begingroup$
I intended for the $T_K$ to be sorted, but it's not obvious from the notation. I will edit accordingly.
$endgroup$
– Michael Lugo
May 9 at 1:59
add a comment |
$begingroup$
It's enough to pick $N$ random elements from $ 1, 2, ldots, M+N-1 $ without replacement and then do a postprocessing step. Say you pick $T_1 < T_2 < ldots < T_N$; then let $S_K = T_K - K + 1$. For example with $M = 4, N = 3$, this is like picking $3$ random elements from $1 ,2 , ldots, 6$. So for example you might pick $T = langle 1, 4, 5 rangle$ and then $S = langle 1 - 1 + 1, 4 - 2 + 1, 5 - 3 + 1 rangle = langle 1, 3, 3 rangle$.
So you need an algorithm for picking random subsets of a given size.
To pick a random subset of size $k$ of the set $1, 2, ldots, n$, there's a nice recursive algorithm. Such a set includes $n$ with probability $k/n$. If it includes $n$, then take the set to be a subset of $1, 2, ldots, n-1$ of size $k-1$, with $n$ adjoined; if it does not include $n$, then the remainder is a subset of $1, 2, ldots, n-1 $ of size $k$. (I learned this algorithm from the late Herb Wilf's notes "East Side, West Side", available online at https://www.math.upenn.edu/~wilf/eastwest.pdf - see page 16 for code. It's in Maple, but should be reasonably understandable.)
$endgroup$
It's enough to pick $N$ random elements from $ 1, 2, ldots, M+N-1 $ without replacement and then do a postprocessing step. Say you pick $T_1 < T_2 < ldots < T_N$; then let $S_K = T_K - K + 1$. For example with $M = 4, N = 3$, this is like picking $3$ random elements from $1 ,2 , ldots, 6$. So for example you might pick $T = langle 1, 4, 5 rangle$ and then $S = langle 1 - 1 + 1, 4 - 2 + 1, 5 - 3 + 1 rangle = langle 1, 3, 3 rangle$.
So you need an algorithm for picking random subsets of a given size.
To pick a random subset of size $k$ of the set $1, 2, ldots, n$, there's a nice recursive algorithm. Such a set includes $n$ with probability $k/n$. If it includes $n$, then take the set to be a subset of $1, 2, ldots, n-1$ of size $k-1$, with $n$ adjoined; if it does not include $n$, then the remainder is a subset of $1, 2, ldots, n-1 $ of size $k$. (I learned this algorithm from the late Herb Wilf's notes "East Side, West Side", available online at https://www.math.upenn.edu/~wilf/eastwest.pdf - see page 16 for code. It's in Maple, but should be reasonably understandable.)
edited May 9 at 1:59
answered May 8 at 20:14
Michael LugoMichael Lugo
18.5k33577
18.5k33577
$begingroup$
Must the $T_K$ be sorted? If not then $T_2$ could be $1$ and then you have a problem.
$endgroup$
– Ian
May 8 at 20:25
1
$begingroup$
Also you should obviously have $T_N$ not $T_M$.
$endgroup$
– Ian
May 8 at 20:27
$begingroup$
This is a very elegant solution! As @Ian mentioned, the first random subset must be sorted for this to work, but other than that it worked fine and was very easy to implement in code, thanks!
$endgroup$
– Setzer22
May 8 at 20:57
$begingroup$
I intended for the $T_K$ to be sorted, but it's not obvious from the notation. I will edit accordingly.
$endgroup$
– Michael Lugo
May 9 at 1:59
add a comment |
$begingroup$
Must the $T_K$ be sorted? If not then $T_2$ could be $1$ and then you have a problem.
$endgroup$
– Ian
May 8 at 20:25
1
$begingroup$
Also you should obviously have $T_N$ not $T_M$.
$endgroup$
– Ian
May 8 at 20:27
$begingroup$
This is a very elegant solution! As @Ian mentioned, the first random subset must be sorted for this to work, but other than that it worked fine and was very easy to implement in code, thanks!
$endgroup$
– Setzer22
May 8 at 20:57
$begingroup$
I intended for the $T_K$ to be sorted, but it's not obvious from the notation. I will edit accordingly.
$endgroup$
– Michael Lugo
May 9 at 1:59
$begingroup$
Must the $T_K$ be sorted? If not then $T_2$ could be $1$ and then you have a problem.
$endgroup$
– Ian
May 8 at 20:25
$begingroup$
Must the $T_K$ be sorted? If not then $T_2$ could be $1$ and then you have a problem.
$endgroup$
– Ian
May 8 at 20:25
1
1
$begingroup$
Also you should obviously have $T_N$ not $T_M$.
$endgroup$
– Ian
May 8 at 20:27
$begingroup$
Also you should obviously have $T_N$ not $T_M$.
$endgroup$
– Ian
May 8 at 20:27
$begingroup$
This is a very elegant solution! As @Ian mentioned, the first random subset must be sorted for this to work, but other than that it worked fine and was very easy to implement in code, thanks!
$endgroup$
– Setzer22
May 8 at 20:57
$begingroup$
This is a very elegant solution! As @Ian mentioned, the first random subset must be sorted for this to work, but other than that it worked fine and was very easy to implement in code, thanks!
$endgroup$
– Setzer22
May 8 at 20:57
$begingroup$
I intended for the $T_K$ to be sorted, but it's not obvious from the notation. I will edit accordingly.
$endgroup$
– Michael Lugo
May 9 at 1:59
$begingroup$
I intended for the $T_K$ to be sorted, but it's not obvious from the notation. I will edit accordingly.
$endgroup$
– Michael Lugo
May 9 at 1:59
add a comment |
$begingroup$
I would just draw $3$ random numbers and only accept ones in ascending order, that way you have equal probabilities (i.e. rejection method as discussed)
Otherwise,
Sample the triples by using monte carlo. I.e. count how many different combinations are permissible, and accept each with equal probability according to different values of a $U[0,1]$, i.e. give $111$ the range $[0,1/n]$, $211$ is $[1/n, 2/n]$. As long as you have some sensible ordering on the set of possible sequences you are in business. You can use for loops to achieve this.
$endgroup$
1
$begingroup$
For moderately large $N,M$, the first approach is very slow due to acceptance being rare, while the second approach is slow to get started because there will be many such sorted sequences to handle.
$endgroup$
– Ian
May 8 at 21:56
$begingroup$
Hmm yes agree first approach is kind of standard rejection sampling with low acceptance. For second one you could probably just compute $nU$ with $U$ the standard uniform and then take integer part. Then you can just pass this to an array containing all of the sorted sequences in order.
$endgroup$
– George Dewhirst
May 8 at 22:25
$begingroup$
Can make function from naturals $leq n$ to sequences by doing a recursive function
$endgroup$
– George Dewhirst
May 8 at 23:10
add a comment |
$begingroup$
I would just draw $3$ random numbers and only accept ones in ascending order, that way you have equal probabilities (i.e. rejection method as discussed)
Otherwise,
Sample the triples by using monte carlo. I.e. count how many different combinations are permissible, and accept each with equal probability according to different values of a $U[0,1]$, i.e. give $111$ the range $[0,1/n]$, $211$ is $[1/n, 2/n]$. As long as you have some sensible ordering on the set of possible sequences you are in business. You can use for loops to achieve this.
$endgroup$
1
$begingroup$
For moderately large $N,M$, the first approach is very slow due to acceptance being rare, while the second approach is slow to get started because there will be many such sorted sequences to handle.
$endgroup$
– Ian
May 8 at 21:56
$begingroup$
Hmm yes agree first approach is kind of standard rejection sampling with low acceptance. For second one you could probably just compute $nU$ with $U$ the standard uniform and then take integer part. Then you can just pass this to an array containing all of the sorted sequences in order.
$endgroup$
– George Dewhirst
May 8 at 22:25
$begingroup$
Can make function from naturals $leq n$ to sequences by doing a recursive function
$endgroup$
– George Dewhirst
May 8 at 23:10
add a comment |
$begingroup$
I would just draw $3$ random numbers and only accept ones in ascending order, that way you have equal probabilities (i.e. rejection method as discussed)
Otherwise,
Sample the triples by using monte carlo. I.e. count how many different combinations are permissible, and accept each with equal probability according to different values of a $U[0,1]$, i.e. give $111$ the range $[0,1/n]$, $211$ is $[1/n, 2/n]$. As long as you have some sensible ordering on the set of possible sequences you are in business. You can use for loops to achieve this.
$endgroup$
I would just draw $3$ random numbers and only accept ones in ascending order, that way you have equal probabilities (i.e. rejection method as discussed)
Otherwise,
Sample the triples by using monte carlo. I.e. count how many different combinations are permissible, and accept each with equal probability according to different values of a $U[0,1]$, i.e. give $111$ the range $[0,1/n]$, $211$ is $[1/n, 2/n]$. As long as you have some sensible ordering on the set of possible sequences you are in business. You can use for loops to achieve this.
edited May 8 at 20:00
answered May 8 at 19:48
George DewhirstGeorge Dewhirst
2,471125
2,471125
1
$begingroup$
For moderately large $N,M$, the first approach is very slow due to acceptance being rare, while the second approach is slow to get started because there will be many such sorted sequences to handle.
$endgroup$
– Ian
May 8 at 21:56
$begingroup$
Hmm yes agree first approach is kind of standard rejection sampling with low acceptance. For second one you could probably just compute $nU$ with $U$ the standard uniform and then take integer part. Then you can just pass this to an array containing all of the sorted sequences in order.
$endgroup$
– George Dewhirst
May 8 at 22:25
$begingroup$
Can make function from naturals $leq n$ to sequences by doing a recursive function
$endgroup$
– George Dewhirst
May 8 at 23:10
add a comment |
1
$begingroup$
For moderately large $N,M$, the first approach is very slow due to acceptance being rare, while the second approach is slow to get started because there will be many such sorted sequences to handle.
$endgroup$
– Ian
May 8 at 21:56
$begingroup$
Hmm yes agree first approach is kind of standard rejection sampling with low acceptance. For second one you could probably just compute $nU$ with $U$ the standard uniform and then take integer part. Then you can just pass this to an array containing all of the sorted sequences in order.
$endgroup$
– George Dewhirst
May 8 at 22:25
$begingroup$
Can make function from naturals $leq n$ to sequences by doing a recursive function
$endgroup$
– George Dewhirst
May 8 at 23:10
1
1
$begingroup$
For moderately large $N,M$, the first approach is very slow due to acceptance being rare, while the second approach is slow to get started because there will be many such sorted sequences to handle.
$endgroup$
– Ian
May 8 at 21:56
$begingroup$
For moderately large $N,M$, the first approach is very slow due to acceptance being rare, while the second approach is slow to get started because there will be many such sorted sequences to handle.
$endgroup$
– Ian
May 8 at 21:56
$begingroup$
Hmm yes agree first approach is kind of standard rejection sampling with low acceptance. For second one you could probably just compute $nU$ with $U$ the standard uniform and then take integer part. Then you can just pass this to an array containing all of the sorted sequences in order.
$endgroup$
– George Dewhirst
May 8 at 22:25
$begingroup$
Hmm yes agree first approach is kind of standard rejection sampling with low acceptance. For second one you could probably just compute $nU$ with $U$ the standard uniform and then take integer part. Then you can just pass this to an array containing all of the sorted sequences in order.
$endgroup$
– George Dewhirst
May 8 at 22:25
$begingroup$
Can make function from naturals $leq n$ to sequences by doing a recursive function
$endgroup$
– George Dewhirst
May 8 at 23:10
$begingroup$
Can make function from naturals $leq n$ to sequences by doing a recursive function
$endgroup$
– George Dewhirst
May 8 at 23:10
add a comment |
Thanks for contributing an answer to Mathematics Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3218854%2frandomly-generate-a-sorted-set-with-uniform-distribution%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
$begingroup$
Is it easy to compute precisely how biased this "simple" method is? (As an example, $[1,2,3]$ is six times more likely than it should be in your case.) If so then you can achieve an unbiased method by randomly rejecting some of the outputs of this "simple" method. This would probably be rather slow, but if you don't really need particularly high speed then that's not a problem. If you need faster than you'll need to be a bit more clever.
$endgroup$
– Ian
May 8 at 19:00
$begingroup$
In any case, does the simple algorithm "randomly pick $s_1$ from $S_1,dots,S_M$, randomly pick $s_2$ from $s_2,dots,S_M$, ..." give a biased result as well? (By the way, are you sure it makes sense to do this problem in the setting of just a partial order? For a partial order not all samples can be sorted...)
$endgroup$
– Ian
May 8 at 19:10
$begingroup$
@Ian, this "correction" method is something I've been thinking of. However, I couldn't figure out how to compute this bias in the general $N$, $M$ case. Any help with that would be much appreciated! As for your second comment, I do not understand the sample techniques you're describing. It appears to me as if you are just picking the elements in order, but that wouldn't be a random sample. Finally, by partial order, I meant something like <= for integers (which is actually the real problem I'm trying to solve). Please let me know if my naming is incorrect so I can adapt the question.
$endgroup$
– Setzer22
May 8 at 19:22
$begingroup$
If you're talking about a totally ordered set then the bias is straightforward to compute: a particular sample will have its probability multiplied by the number of permutations of that sample that exist, which is easily obtained by using factorials. As for my second idea, the point is to randomly select elements of your sample in order from among those that you're still allowed to select based on the sorting requirement. Again this is reliant on the order being total.
$endgroup$
– Ian
May 8 at 19:24
$begingroup$
@MarcusRitt Note that the interval is always closed, so $langle 1,2 rangle$ can indeed give $[1,1]$ in that approach.
$endgroup$
– Ian
May 8 at 19:26