How to find the three closest (nearest) values within a vector?
I would like to find the three closest numbers in a vector. Something like:

    v <- c(10, 23, 25, 26, 38, 50)
    c <- findClosest(v, 3)
    c
    # 23 25 26

I tried sort(colSums(as.matrix(dist(v))))[1:3], which kind of works, but it selects the three numbers with the minimum overall distance to all the other values, not the three numbers closest to each other.

There is already an answer for MATLAB (How to find two closest (nearest) values within a vector in MATLAB?), but I do not know how to translate it to R:

    % find the index with the minimal difference in A
    minDiffInd = find(abs(diff(A)) == min(abs(diff(A))));
    % extract this index and its neighbor index from A
    val1 = A(minDiffInd);
    val2 = A(minDiffInd + 1);

Tags: r
asked Jul 31 at 8:24 by Terry, edited Jul 31 at 8:46 by Sotos
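
To illustrate why the colSums(dist(...)) attempt in the question does not pick the closest group, here is a small sketch on a made-up vector (w is an assumption chosen for illustration, it is not from the question). Ranking by total distance favours values near the middle of the data rather than values that are close to each other:

```r
# Hypothetical example vector: two tight clusters and one central value
w <- c(1, 2, 3, 50, 98, 99, 100)

# Total distance from each value to all the others
total_dist <- colSums(as.matrix(dist(w)))

# The three values with the smallest total distance
picked <- w[order(total_dist)[1:3]]
print(picked)
# [1] 50  3 98
```

The central value 50 wins even though it is far from everything, which is why the answers below work on gaps between sorted neighbours instead.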
If you replace find with which (and use [ for array/matrix indexing), the MATLAB answer will work in R, but it obviously only finds the closest 2. Can you clarify what exactly you mean by "finding the closest values in a vector"? The MATLAB answer only works if the vector is sorted; is that a fair assumption? Your title says "two" but your example uses "3", which is it? A solution working for arbitrary n is much harder than one that only works for 2. The MATLAB answer does not extend to more than 2 numbers; is that why you're asking?
– antoine-sac, Jul 31 at 8:34

Hi, yes, I fixed the title. In my case I need three. Following your suggestion I have adapted the code from MATLAB and it works, but it only finds the two closest numbers. How should I adapt it to also find the third? The vector can be sorted; it is just a group of numbers and I have to pick the three closest replicas.
– Terry, Jul 31 at 8:47
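
The translation suggested in the first comment (find becomes which, parentheses become square brackets) could look roughly like this. It is only a sketch: the variable names are carried over from the MATLAB snippet, and the vector is sorted first, as the comment assumes:

```r
A <- sort(c(10, 23, 25, 26, 38, 50))

# Index (or indices, in case of ties) of the smallest gap between neighbours
gaps <- abs(diff(A))
minDiffInd <- which(gaps == min(gaps))

# The value at that index and its right-hand neighbour
val1 <- A[minDiffInd]
val2 <- A[minDiffInd + 1]
c(val1, val2)
# [1] 25 26
```

As noted in the comments, this only finds the closest pair; the answers below generalise to n values.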
4 Answers
My assumption is that for the n nearest values, the only thing that matters is the difference v[i] - v[i - (n - 1)] in the sorted vector, i.e. the range of each window of n consecutive sorted values. That is, we need to find the minimum of diff(x, lag = n - 1L).

    findClosest <- function(x, n) {
      x <- sort(x)
      # Start of the window with the smallest range, then take n values from there
      x[seq.int(which.min(diff(x, lag = n - 1L)), length.out = n)]
    }

    findClosest(v, 3L)
    # [1] 23 25 26
answered Jul 31 at 10:14 by Cole, edited Aug 1 at 3:00
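
One way to see why the lagged difference is enough: in a sorted vector, the gaps between consecutive values inside a window of n elements add up to the difference between the window's last and first element. The following sketch checks this on the question's vector (roll_gap_sum is a hypothetical name introduced here, not part of the answer):

```r
v <- c(10, 23, 25, 26, 38, 50)
n <- 3L
x <- sort(v)

# Range of each window of n consecutive sorted values
lagged <- diff(x, lag = n - 1L)

# Sum of the n - 1 consecutive gaps inside each window, computed explicitly
gaps <- diff(x)
roll_gap_sum <- sapply(seq_len(length(gaps) - (n - 2L)),
                       function(i) sum(gaps[i:(i + n - 2L)]))

all.equal(lagged, roll_gap_sum)
# [1] TRUE
```

So minimising the lagged difference and minimising the sum of L1 gaps within a window select the same group.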
Let's define "nearest numbers" as "numbers with minimal sum of L1 distances". You can achieve what you want with a combination of diff and a windowed sum.

You could write a much shorter function, but I wrote it step by step to make it easier to follow.

    v <- c(10, 23, 25, 26, 38, 50)

    #' Find the n nearest numbers in a vector
    #'
    #' @param v Numeric vector
    #' @param n Number of nearest numbers to extract
    #'
    #' @details "Nearest numbers" defined as the numbers which minimise the
    #'   within-group sum of L1 distances.
    #'
    findClosest <- function(v, n) {
      # Sort and remove NA
      v <- sort(v, na.last = NA)

      # Compute L1 distances between closest points. We know each point is next to
      # its closest neighbour since we sorted.
      delta <- diff(v)

      # Compute sum of L1 distances on a rolling window with n - 1 elements.
      # Why n - 1? Because we are looking at deltas, and 2 deltas ~ 3 elements.
      withingroup_distances <- zoo::rollsum(delta, k = n - 1)

      # Now it's simply finding the group with minimum within-group sum
      # and working out the elements
      group_index <- which.min(withingroup_distances)
      element_indices <- group_index + 0:(n - 1)

      v[element_indices]
    }

    findClosest(v, 2)
    # 25 26
    findClosest(v, 3)
    # 23 25 26
answered Jul 31 at 8:47 by antoine-sac
Thanks, I have implemented it and it works great! Thanks also for the explanation, I understood the logic behind it.
– Terry, Jul 31 at 10:50

Interestingly, this solution can very easily be extended to use another norm such as L2 instead of L1, if you want to penalise larger gaps more. For example, (10,20,30) and (50,55,70) are equally near according to L1 (10+10=5+15) but the first group is better according to L2 (10^2+10^2 < 5^2+15^2).
– antoine-sac, Aug 1 at 7:04

Very interesting. Actually, I think I am going to give it a try, since I would like to find the three numbers with minimum variance between them. L1 does not allow that; L2 instead would allow me to select the group with minimum variance. Thanks very much!
– Terry, Aug 1 at 16:24

You're welcome, you just have to use delta^2 in the rollsum.
– antoine-sac, Aug 1 at 16:29
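
A minimal sketch of the L2 variant discussed in these comments, written in base R so that it does not need zoo. The name findClosestL2 and the cumsum-based rolling sum are assumptions made here for illustration; they are not code from the answer:

```r
findClosestL2 <- function(v, n) {
  stopifnot(n >= 2, length(v) >= n)
  v <- sort(v, na.last = NA)

  # Squared gaps between neighbouring sorted values
  delta2 <- diff(v)^2

  # Rolling sum over n - 1 squared gaps, via differences of a cumulative sum
  k <- n - 1
  cs <- c(0, cumsum(delta2))
  window_sums <- cs[(k + 1):length(cs)] - cs[1:(length(cs) - k)]

  # First window with the smallest sum of squared gaps
  start <- which.min(window_sums)
  v[start + 0:(n - 1)]
}

# The two groups from the comment tie under L1 (20 vs 20)
# but not under L2 (200 vs 250)
findClosestL2(c(10, 20, 30, 50, 55, 70), 3)
# [1] 10 20 30
```

On the question's vector this returns the same group as the L1 version, 23 25 26.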
An idea is to use the zoo library to do a rolling operation, i.e. (v is already sorted here):

    library(zoo)

    # Each row holds the sum of gaps of a window followed by the window's values
    m1 <- rollapply(v, 3, by = 1, function(i) c(sum(diff(i)), c(i)))
    m1[which.min(m1[, 1]), ][-1]
    #[1] 23 25 26

Or make it into a function:

    findClosest <- function(vec, n) {
      require(zoo)
      vec1 <- sort(vec)
      m1 <- rollapply(vec1, n, by = 1, function(i) c(sum(diff(i)), c(i)))
      return(m1[which.min(m1[, 1]), ][-1])
    }

    findClosest(v, 3)
    #[1] 23 25 26
answered Jul 31 at 8:46 by Sotos, edited Jul 31 at 8:51
A base R option: first sort the vector, then subtract every i-th element from the (i + n - 1)-th element of the sorted vector, and select the group with the minimum difference.

    closest_n_vectors <- function(v, n) {
      v1 <- sort(v)
      # Range of each window of n consecutive sorted values
      inds <- which.min(sapply(head(seq_along(v1), -(n - 1)), function(x)
        v1[x + n - 1] - v1[x]))
      v1[inds:(inds + n - 1)]
    }

    closest_n_vectors(v, 3)
    #[1] 23 25 26
    closest_n_vectors(c(2, 10, 1, 20, 4, 5, 23), 2)
    #[1] 1 2
    closest_n_vectors(c(19, 23, 45, 67, 89, 65, 1), 2)
    #[1] 65 67
    closest_n_vectors(c(19, 23, 45, 67, 89, 65, 1), 3)
    #[1] 1 19 23

In case of a tie, this returns the group with the smallest values, since which.min returns the first minimum.
BENCHMARKS
Since we have got quite a few answers, it is worth doing a benchmark of all the solutions till now
set.seed(1234)
x <- sample(100000000, 100000)
identical(findClosest_antoine(x, 3), findClosest_Sotos(x, 3),
closest_n_vectors_Ronak(x, 3), findClosest_Cole(x, 3))
#[1] TRUE
microbenchmark::microbenchmark(
antoine = findClosest_antoine(x, 3),
Sotos = findClosest_Sotos(x, 3),
Ronak = closest_n_vectors_Ronak(x, 3),
Cole = findClosest_Cole(x, 3),
times = 10
)
#Unit: milliseconds
# expr min lq mean median uq max neval cld
#antoine 148.751 159.071 163.298 162.581 167.365 181.314 10 b
# Sotos 1086.098 1349.762 1372.232 1398.211 1453.217 1553.945 10 c
# Ronak 54.248 56.870 78.886 83.129 94.748 100.299 10 a
# Cole 4.958 5.042 6.202 6.047 7.386 7.915 10 a
1
@Cole I am not sure aboutcld
either but I have it in my output. Yes, @Rui's solution was notidentical
. I didn't check that earlier.
– Ronak Shah
Jul 31 at 11:18
add a comment |
A base R option, idea being we first sort
the vector and subtract every i
th element with i + n - 1
element in the sorted vector and select the group which has minimum difference.
closest_n_vectors <- function(v, n)
v1 <- sort(v)
inds <- which.min(sapply(head(seq_along(v1), -(n - 1)), function(x)
v1[x + n -1] - v1[x]))
v1[inds: (inds + n - 1)]
closest_n_vectors(v, 3)
#[1] 23 25 26
closest_n_vectors(c(2, 10, 1, 20, 4, 5, 23), 2)
#[1] 1 2
closest_n_vectors(c(19, 23, 45, 67, 89, 65, 1), 2)
#[1] 65 67
closest_n_vectors(c(19, 23, 45, 67, 89, 65, 1), 3)
#[1] 1 19 23
In case of tie this will return the numbers with smallest value since we are using which.min
.
BENCHMARKS
Since we have got quite a few answers, it is worth doing a benchmark of all the solutions till now
set.seed(1234)
x <- sample(100000000, 100000)
identical(findClosest_antoine(x, 3), findClosest_Sotos(x, 3),
closest_n_vectors_Ronak(x, 3), findClosest_Cole(x, 3))
#[1] TRUE
microbenchmark::microbenchmark(
antoine = findClosest_antoine(x, 3),
Sotos = findClosest_Sotos(x, 3),
Ronak = closest_n_vectors_Ronak(x, 3),
Cole = findClosest_Cole(x, 3),
times = 10
)
#Unit: milliseconds
# expr min lq mean median uq max neval cld
#antoine 148.751 159.071 163.298 162.581 167.365 181.314 10 b
# Sotos 1086.098 1349.762 1372.232 1398.211 1453.217 1553.945 10 c
# Ronak 54.248 56.870 78.886 83.129 94.748 100.299 10 a
# Cole 4.958 5.042 6.202 6.047 7.386 7.915 10 a
A base R option, idea being we first sort
the vector and subtract every i
th element with i + n - 1
element in the sorted vector and select the group which has minimum difference.
closest_n_vectors <- function(v, n)
v1 <- sort(v)
inds <- which.min(sapply(head(seq_along(v1), -(n - 1)), function(x)
v1[x + n -1] - v1[x]))
v1[inds: (inds + n - 1)]
closest_n_vectors(v, 3)
#[1] 23 25 26
closest_n_vectors(c(2, 10, 1, 20, 4, 5, 23), 2)
#[1] 1 2
closest_n_vectors(c(19, 23, 45, 67, 89, 65, 1), 2)
#[1] 65 67
closest_n_vectors(c(19, 23, 45, 67, 89, 65, 1), 3)
#[1] 1 19 23
In case of tie this will return the numbers with smallest value since we are using which.min
.
BENCHMARKS
Since we have got quite a few answers, it is worth doing a benchmark of all the solutions till now
set.seed(1234)
x <- sample(100000000, 100000)
identical(findClosest_antoine(x, 3), findClosest_Sotos(x, 3),
closest_n_vectors_Ronak(x, 3), findClosest_Cole(x, 3))
#[1] TRUE
microbenchmark::microbenchmark(
antoine = findClosest_antoine(x, 3),
Sotos = findClosest_Sotos(x, 3),
Ronak = closest_n_vectors_Ronak(x, 3),
Cole = findClosest_Cole(x, 3),
times = 10
)
#Unit: milliseconds
# expr min lq mean median uq max neval cld
#antoine 148.751 159.071 163.298 162.581 167.365 181.314 10 b
# Sotos 1086.098 1349.762 1372.232 1398.211 1453.217 1553.945 10 c
# Ronak 54.248 56.870 78.886 83.129 94.748 100.299 10 a
# Cole 4.958 5.042 6.202 6.047 7.386 7.915 10 a
edited Jul 31 at 11:27
answered Jul 31 at 8:54
Ronak Shah
74k, 11 gold badges, 48 silver badges, 83 bronze badges
If you replace find with which (and use [ for array/matrix indexing), the MATLAB answer will work in R, but obviously it only finds the closest 2. Can you clarify what exactly you mean by "finding the closest values in a vector"? The MATLAB answer only works if the vector is sorted; is that a fair assumption? Your title says "two" but your example uses "3"; which is it? A solution working for arbitrary n is much harder than one that only works for 2. The MATLAB answer does not extend to more than 2 numbers; is that why you're asking?
– antoine-sac
Jul 31 at 8:34
Hi, yes, I fixed the title. In my case I need three. Following your suggestion, I adapted the code from MATLAB and it works, but it only finds the two closest numbers. How should I adapt it to also find the third? The vector can be sorted; it is just a group of numbers, and I have to pick the three closest replicas.
– Terry
Jul 31 at 8:47