Rank groups within a grouped sequence of TRUE/FALSE and NA


I have a little nut to crack.

I have a data.frame like this:

 group criterium
1 A NA
2 A TRUE
3 A TRUE
4 A TRUE
5 A FALSE
6 A FALSE
7 A TRUE
8 A TRUE
9 A FALSE
10 A TRUE
11 A TRUE
12 A TRUE
13 B NA
14 B FALSE
15 B TRUE
16 B TRUE
17 B TRUE
18 B FALSE

structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE, 
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE, 
TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA, 
-18L))

And I want to rank the runs of TRUE in column criterium in ascending order while disregarding FALSE and NA. The goal is to have a unique identifier for each run of TRUE within each level of group.

So the result should look like:

 group criterium goal
1 A NA NA
2 A TRUE 1
3 A TRUE 1
4 A TRUE 1
5 A FALSE NA
6 A FALSE NA
7 A TRUE 2
8 A TRUE 2
9 A FALSE NA
10 A TRUE 3
11 A TRUE 3
12 A TRUE 3
13 B NA NA
14 B FALSE NA
15 B TRUE 1
16 B TRUE 1
17 B TRUE 1
18 B FALSE NA

I'm sure there is a relatively easy way to do this; I just can't think of one. I experimented with dense_rank() and other dplyr window functions, but to no avail.

Thanks for the help!

asked yesterday by Humpelstielzchen (edited yesterday)

you can just about grab what you need with this work of beauty; as.numeric(as.factor(cumsum(is.na(d$criterium^NA)) + d$criterium^NA)) -- just needs to be applied by group

– user20650
yesterday

that is a really funny solution. Very good job!

– Humpelstielzchen
yesterday

In your example all of group A comes first, then group B. We don't need to handle cases with group=A, criterium=TRUE interspersed with group=B, criterium=TRUE?

– smci
yesterday

No, when group A stops so stops the sequence for group A.

– Humpelstielzchen
yesterday

But I'm suggesting if you construct an example with group=A, criterium=TRUE followed by group=B, criterium=TRUE (with no FALSE's in-between), would that get a new 'goal' number or not? Some of the answers here will fail because they don't group-by group or consider the discontinuity in group.

– smci
yesterday
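The one-liner from the first comment can be applied per group with base R's ave(). What follows is only a sketch, not something posted in the thread: the helper name rank_true_runs and the row-index trick are illustrative assumptions, and d is the question's data rebuilt with a character group column.

```r
# Data from the question (group stored as character here for brevity).
d <- data.frame(
  group = rep(c("A", "B"), times = c(12, 6)),
  criterium = c(NA, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE,
                TRUE, TRUE, TRUE, NA, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# Hypothetical helper wrapping the one-liner from the comment.
# x^NA is 1 for TRUE and NA for FALSE/NA, because 1^NA is 1 in R.
rank_true_runs <- function(x) {
  one_or_na <- x^NA
  # Every non-TRUE row bumps the counter, so each run of TRUE gets its own
  # value; as.factor() then compresses those values to 1, 2, 3, ...
  as.integer(as.factor(cumsum(is.na(one_or_na)) + one_or_na))
}

# ave() hands each group's row indices to FUN, so numbering restarts per group.
d$goal <- ave(seq_len(nrow(d)), d$group,
              FUN = function(i) rank_true_runs(d$criterium[i]))
d
```

Passing row indices rather than the logical column itself keeps ave() working on integers throughout, which avoids relying on type coercion inside ave().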

4 Answers

Another data.table approach:

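library(data.table)
setDT(dt)  # dt is the data.frame from the question; setDT converts it by reference
# Step 1: cr = run id over the whole criterium column.
# Step 2: on the TRUE rows only (NA in i counts as FALSE), renumber cr within each group.
dt[, cr := rleid(criterium)][
 (criterium), goal := rleid(cr), by=.(group)]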
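For readers without data.table, the same two-step idea can be sketched in base R. This is an illustration under assumptions, not the answer's code: run_id stands in for the cr column (built with rle(), which unlike rleid() puts every NA in its own run, harmless here because only TRUE rows are numbered), and match(id, unique(id)) stands in for the second rleid() call.

```r
# Data from the question (group stored as character here for brevity).
dat <- data.frame(
  group = rep(c("A", "B"), times = c(12, 6)),
  criterium = c(NA, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE,
                TRUE, TRUE, TRUE, NA, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# Step 1: one id per run of identical values over the whole column.
runs <- rle(dat$criterium)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Step 2: keep only the TRUE rows and renumber their run ids within each group.
is_true <- !is.na(dat$criterium) & dat$criterium
dat$goal <- NA_integer_
dat$goal[is_true] <- ave(run_id[is_true], dat$group[is_true],
                         FUN = function(id) match(id, unique(id)))
dat
```

Because the renumbering happens inside each group, a run of TRUE that continues across a group boundary still starts again at 1 in the next group in this sketch.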

answered yesterday by chinsoon12

Tried with rleid but didn't get it to work. (+1)

– markus
yesterday

works for me. And seems to be the most elegant answer.

– Humpelstielzchen
yesterday


Maybe I have over-complicated this but one way with dplyr is

library(dplyr)

df %>%
 mutate(temp = replace(criterium, is.na(criterium), FALSE), 
 temp1 = cumsum(!temp)) %>%
 group_by(temp1) %>%
 mutate(goal = +(row_number() == which.max(temp) & any(temp))) %>%
 group_by(group) %>%
 mutate(goal = ifelse(temp, cumsum(goal), NA)) %>%
 select(-temp, -temp1)

# group criterium goal
# <fct> <lgl> <int>
# 1 A NA NA
# 2 A TRUE 1
# 3 A TRUE 1
# 4 A TRUE 1
# 5 A FALSE NA
# 6 A FALSE NA
# 7 A TRUE 2
# 8 A TRUE 2
# 9 A FALSE NA
#10 A TRUE 3
#11 A TRUE 3
#12 A TRUE 3
#13 B NA NA
#14 B FALSE NA
#15 B TRUE 1
#16 B TRUE 1
#17 B TRUE 1
#18 B FALSE NA

We first replace the NAs in the criterium column with FALSE (temp) and take the cumulative sum of its negation (temp1), so every FALSE starts a new temp1 block. We group_by temp1 and assign 1 to the first TRUE value in each block. Finally, grouping by group, we take the cumulative sum of those flags for TRUE values and return NA for FALSE and NA values.
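To see the intermediate columns this explanation refers to, here is a small base R trace on group A's values only. It is a sketch under assumptions: ave() replaces group_by(temp1), start is a hypothetical name for the flag column, and the final group_by(group) step is dropped because only one group is shown.

```r
# Group A's criterium values from the question.
criterium <- c(NA, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE,
               TRUE, TRUE, TRUE)

temp  <- replace(criterium, is.na(criterium), FALSE)  # NA -> FALSE
temp1 <- cumsum(!temp)                                # block id, increases at every FALSE

# Flag the first TRUE inside each temp1 block; blocks without any TRUE get no flag.
start <- ave(temp, temp1,
             FUN = function(v) seq_along(v) == which.max(v) & any(v))

# Running count of flags, kept only where criterium is TRUE.
goal <- ifelse(temp, cumsum(start), NA)

temp1  # 1 1 1 1 2 3 3 3 4 4 4 4
goal   # NA 1 1 1 NA NA 2 2 NA 3 3 3
```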

answered yesterday by Ronak Shah

A pure base R solution: we can create a custom function via rle and use it per group, i.e.

f1 <- function(x) {
  x[is.na(x)] <- FALSE                       # treat NA like FALSE
  rle1 <- rle(x)
  y <- rle1$values                           # TRUE for runs of TRUE
  rle1$values[!y] <- 0                       # FALSE runs become 0
  rle1$values[y] <- cumsum(rle1$values[y])   # TRUE runs become 1, 2, 3, ...
  return(inverse.rle(rle1))
}

do.call(rbind,
  lapply(split(df, df$group), function(i) {
    i$goal <- f1(i$criterium)
    # blank out the rows that are not TRUE
    i$goal <- replace(i$goal, is.na(i$criterium) | !i$criterium, NA)
    i
  }))

Of course, if you want, you can apply it via dplyr, i.e.

library(dplyr)

df %>% 
 group_by(group) %>% 
 mutate(goal = f1(criterium), 
 goal = replace(goal, is.na(criterium)|!criterium, NA))

which gives,

# A tibble: 18 x 3
# Groups: group [2]
 group criterium goal
 <fct> <lgl> <dbl>
 1 A NA NA
 2 A TRUE 1
 3 A TRUE 1
 4 A TRUE 1
 5 A FALSE NA
 6 A FALSE NA
 7 A TRUE 2
 8 A TRUE 2
 9 A FALSE NA
10 A TRUE 3
11 A TRUE 3
12 A TRUE 3
13 B NA NA
14 B FALSE NA
15 B TRUE 1
16 B TRUE 1
17 B TRUE 1
18 B FALSE NA

answered yesterday by Sotos (edited yesterday)

A data.table option using rle

library(data.table)
DT <- as.data.table(dat)
DT[, goal := {
 r <- rle(replace(criterium, is.na(criterium), FALSE))
 r$values <- with(r, cumsum(values) * values)
 out <- inverse.rle(r)
 replace(out, out == 0, NA)
}, by = group]
DT
# group criterium goal
# 1: A NA NA
# 2: A TRUE 1
# 3: A TRUE 1
# 4: A TRUE 1
# 5: A FALSE NA
# 6: A FALSE NA
# 7: A TRUE 2
# 8: A TRUE 2
# 9: A FALSE NA
#10: A TRUE 3
#11: A TRUE 3
#12: A TRUE 3
#13: B NA NA
#14: B FALSE NA
#15: B TRUE 1
#16: B TRUE 1
#17: B TRUE 1
#18: B FALSE NA

step by step

When we call r <- rle(replace(criterium, is.na(criterium), FALSE)) we get an object of class rle (shown here for the whole column, without grouping):

r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : logi [1:9] FALSE TRUE FALSE TRUE FALSE TRUE ...

We manipulate the values component in the following way

r$values <- with(r, cumsum(values) * values)
r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : int [1:9] 0 1 0 2 0 3 0 4 0

That is, we replaced each TRUE with the cumulative count of TRUE runs so far and set the FALSEs to 0. Now inverse.rle returns a vector in which each of the values is repeated lengths times

out <- inverse.rle(r)
out
# [1] 0 1 1 1 0 0 2 2 0 3 3 3 0 0 4 4 4 0

This is almost what OP wants but we need to replace the 0s with NA

replace(out, out == 0, NA)

In the data.table call this is done for each group, so the numbering restarts at 1 in group B.
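The walkthrough output above runs on the whole column, which is why group B's run shows up as 4 there. The per-group behaviour can be checked in base R. A sketch, assuming a hypothetical wrapper number_true_runs around the steps just shown and ave() in place of data.table's by = group:

```r
# Data from the question (group stored as character here for brevity).
dat <- data.frame(
  group = rep(c("A", "B"), times = c(12, 6)),
  criterium = c(NA, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE,
                TRUE, TRUE, TRUE, NA, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# The steps from the walkthrough, wrapped in a (hypothetical) helper.
number_true_runs <- function(x) {
  r <- rle(replace(x, is.na(x), FALSE))
  r$values <- with(r, cumsum(values) * values)
  out <- inverse.rle(r)
  replace(out, out == 0, NA)
}

# Whole column at once: group B's run continues the count and gets 4.
ungrouped <- number_true_runs(dat$criterium)

# Per group: the count restarts, so group B's run gets 1.
grouped <- ave(seq_len(nrow(dat)), dat$group,
               FUN = function(i) number_true_runs(dat$criterium[i]))

cbind(dat, ungrouped, grouped)
```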

data

dat <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE, 
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE, 
TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA, 
-18L))

answered yesterday by markus (edited yesterday)

Wow, impressive. Thanks for introducing me to rle and inverse.rle. Greetings to Leipzig.

– Humpelstielzchen
yesterday

@Humpelstielzchen You're welcome. Will try to simplify and explain the logic a bit.

– markus
yesterday

Thanks! I was dissecting your answer just like that. Your answer taught me the most. But chinsoon12 is just a devil of a fellow. ^^

– Humpelstielzchen
yesterday


4 Answers
4

active

oldest

votes

4 Answers
4

active

oldest

votes

Another data.table approach:

library(data.table)
setDT(dt)
dt[, cr := rleid(criterium)][
 (criterium), goal := rleid(cr), by=.(group)]

answered yesterday

chinsoon12

9,93611420

1

Tried with rleid but didn't get it to work. (+1)

– markus
yesterday

works for me. And seems to be the most elegant answer.

– Humpelstielzchen
yesterday

add a comment |

Another data.table approach:

library(data.table)
setDT(dt)
dt[, cr := rleid(criterium)][
 (criterium), goal := rleid(cr), by=.(group)]

answered yesterday

chinsoon12

9,93611420

1

Tried with rleid but didn't get it to work. (+1)

– markus
yesterday

works for me. And seems to be the most elegant answer.

– Humpelstielzchen
yesterday

add a comment |

Another data.table approach:

library(data.table)
setDT(dt)
dt[, cr := rleid(criterium)][
 (criterium), goal := rleid(cr), by=.(group)]

answered yesterday

chinsoon12

9,93611420

Another data.table approach:

library(data.table)
setDT(dt)
dt[, cr := rleid(criterium)][
 (criterium), goal := rleid(cr), by=.(group)]

answered yesterday

chinsoon12

9,93611420

answered yesterday

chinsoon12

9,93611420

answered yesterday

chinsoon12

9,93611420

answered yesterday

chinsoon12

9,93611420

1

Tried with rleid but didn't get it to work. (+1)

– markus
yesterday

works for me. And seems to be the most elegant answer.

– Humpelstielzchen
yesterday

add a comment |

1

Tried with rleid but didn't get it to work. (+1)

– markus
yesterday

works for me. And seems to be the most elegant answer.

– Humpelstielzchen
yesterday

Tried with rleid but didn't get it to work. (+1)

– markus
yesterday

works for me. And seems to be the most elegant answer.

– Humpelstielzchen
yesterday

add a comment |

Maybe I have over-complicated this but one way with dplyr is

library(dplyr)

df %>%
 mutate(temp = replace(criterium, is.na(criterium), FALSE), 
 temp1 = cumsum(!temp)) %>%
 group_by(temp1) %>%
 mutate(goal = +(row_number() == which.max(temp) & any(temp))) %>%
 group_by(group) %>%
 mutate(goal = ifelse(temp, cumsum(goal), NA)) %>%
 select(-temp, -temp1)

# group criterium goal
# <fct> <lgl> <int>
# 1 A NA NA
# 2 A TRUE 1
# 3 A TRUE 1
# 4 A TRUE 1
# 5 A FALSE NA
# 6 A FALSE NA
# 7 A TRUE 2
# 8 A TRUE 2
# 9 A FALSE NA
#10 A TRUE 3
#11 A TRUE 3
#12 A TRUE 3
#13 B NA NA
#14 B FALSE NA
#15 B TRUE 1
#16 B TRUE 1
#17 B TRUE 1
#18 B FALSE NA

answered yesterday

Ronak Shah

46.3k104268

add a comment |

Maybe I have over-complicated this but one way with dplyr is

library(dplyr)

df %>%
 mutate(temp = replace(criterium, is.na(criterium), FALSE), 
 temp1 = cumsum(!temp)) %>%
 group_by(temp1) %>%
 mutate(goal = +(row_number() == which.max(temp) & any(temp))) %>%
 group_by(group) %>%
 mutate(goal = ifelse(temp, cumsum(goal), NA)) %>%
 select(-temp, -temp1)

# group criterium goal
# <fct> <lgl> <int>
# 1 A NA NA
# 2 A TRUE 1
# 3 A TRUE 1
# 4 A TRUE 1
# 5 A FALSE NA
# 6 A FALSE NA
# 7 A TRUE 2
# 8 A TRUE 2
# 9 A FALSE NA
#10 A TRUE 3
#11 A TRUE 3
#12 A TRUE 3
#13 B NA NA
#14 B FALSE NA
#15 B TRUE 1
#16 B TRUE 1
#17 B TRUE 1
#18 B FALSE NA

answered yesterday

Ronak Shah

46.3k104268

add a comment |

Maybe I have over-complicated this but one way with dplyr is

library(dplyr)

df %>%
 mutate(temp = replace(criterium, is.na(criterium), FALSE), 
 temp1 = cumsum(!temp)) %>%
 group_by(temp1) %>%
 mutate(goal = +(row_number() == which.max(temp) & any(temp))) %>%
 group_by(group) %>%
 mutate(goal = ifelse(temp, cumsum(goal), NA)) %>%
 select(-temp, -temp1)

# group criterium goal
# <fct> <lgl> <int>
# 1 A NA NA
# 2 A TRUE 1
# 3 A TRUE 1
# 4 A TRUE 1
# 5 A FALSE NA
# 6 A FALSE NA
# 7 A TRUE 2
# 8 A TRUE 2
# 9 A FALSE NA
#10 A TRUE 3
#11 A TRUE 3
#12 A TRUE 3
#13 B NA NA
#14 B FALSE NA
#15 B TRUE 1
#16 B TRUE 1
#17 B TRUE 1
#18 B FALSE NA

answered yesterday

Ronak Shah

46.3k104268

Maybe I have over-complicated this but one way with dplyr is

library(dplyr)

df %>%
 mutate(temp = replace(criterium, is.na(criterium), FALSE), 
 temp1 = cumsum(!temp)) %>%
 group_by(temp1) %>%
 mutate(goal = +(row_number() == which.max(temp) & any(temp))) %>%
 group_by(group) %>%
 mutate(goal = ifelse(temp, cumsum(goal), NA)) %>%
 select(-temp, -temp1)

# group criterium goal
# <fct> <lgl> <int>
# 1 A NA NA
# 2 A TRUE 1
# 3 A TRUE 1
# 4 A TRUE 1
# 5 A FALSE NA
# 6 A FALSE NA
# 7 A TRUE 2
# 8 A TRUE 2
# 9 A FALSE NA
#10 A TRUE 3
#11 A TRUE 3
#12 A TRUE 3
#13 B NA NA
#14 B FALSE NA
#15 B TRUE 1
#16 B TRUE 1
#17 B TRUE 1
#18 B FALSE NA

answered yesterday

Ronak Shah

46.3k104268

answered yesterday

Ronak Shah

46.3k104268

answered yesterday

Ronak Shah

46.3k104268

answered yesterday

Ronak Shah

46.3k104268

add a comment |

A pure Base R solution, we can create a custom function via rle, and use it per group, i.e.

f1 <- function(x) 
 x[is.na(x)] <- FALSE
 rle1 <- rle(x)
 y <- rle1$values
 rle1$values[!y] <- 0
 rle1$values[y] <- cumsum(rle1$values[y])
 return(inverse.rle(rle1))



do.call(rbind, 
 lapply(split(df, df$group), function(i)i$goal <- f1(i$criterium); 
 i$goal <- replace(i$goal, is.na(i$criterium)))

Of course, If you want you can apply it via dplyr, i.e.

library(dplyr)

df %>% 
 group_by(group) %>% 
 mutate(goal = f1(criterium), 
 goal = replace(goal, is.na(criterium)|!criterium, NA))

which gives,

# A tibble: 18 x 3
# Groups: group [2]
 group criterium goal
 <fct> <lgl> <dbl>
 1 A NA NA
 2 A TRUE 1
 3 A TRUE 1
 4 A TRUE 1
 5 A FALSE NA
 6 A FALSE NA
 7 A TRUE 2
 8 A TRUE 2
 9 A FALSE NA
10 A TRUE 3
11 A TRUE 3
12 A TRUE 3
13 B NA NA
14 B FALSE NA
15 B TRUE 1
16 B TRUE 1
17 B TRUE 1
18 B FALSE NA

edited yesterday

answered yesterday

Sotos

31.4k51741

add a comment |

A pure Base R solution, we can create a custom function via rle, and use it per group, i.e.

f1 <- function(x) 
 x[is.na(x)] <- FALSE
 rle1 <- rle(x)
 y <- rle1$values
 rle1$values[!y] <- 0
 rle1$values[y] <- cumsum(rle1$values[y])
 return(inverse.rle(rle1))



do.call(rbind, 
 lapply(split(df, df$group), function(i)i$goal <- f1(i$criterium); 
 i$goal <- replace(i$goal, is.na(i$criterium)))

Of course, If you want you can apply it via dplyr, i.e.

library(dplyr)

df %>% 
 group_by(group) %>% 
 mutate(goal = f1(criterium), 
 goal = replace(goal, is.na(criterium)|!criterium, NA))

which gives,

# A tibble: 18 x 3
# Groups: group [2]
 group criterium goal
 <fct> <lgl> <dbl>
 1 A NA NA
 2 A TRUE 1
 3 A TRUE 1
 4 A TRUE 1
 5 A FALSE NA
 6 A FALSE NA
 7 A TRUE 2
 8 A TRUE 2
 9 A FALSE NA
10 A TRUE 3
11 A TRUE 3
12 A TRUE 3
13 B NA NA
14 B FALSE NA
15 B TRUE 1
16 B TRUE 1
17 B TRUE 1
18 B FALSE NA

edited yesterday

answered yesterday

Sotos

31.4k51741

add a comment |

A pure Base R solution, we can create a custom function via rle, and use it per group, i.e.

f1 <- function(x) 
 x[is.na(x)] <- FALSE
 rle1 <- rle(x)
 y <- rle1$values
 rle1$values[!y] <- 0
 rle1$values[y] <- cumsum(rle1$values[y])
 return(inverse.rle(rle1))



do.call(rbind, 
 lapply(split(df, df$group), function(i)i$goal <- f1(i$criterium); 
 i$goal <- replace(i$goal, is.na(i$criterium)))

Of course, If you want you can apply it via dplyr, i.e.

library(dplyr)

df %>% 
 group_by(group) %>% 
 mutate(goal = f1(criterium), 
 goal = replace(goal, is.na(criterium)|!criterium, NA))

which gives,

# A tibble: 18 x 3
# Groups: group [2]
 group criterium goal
 <fct> <lgl> <dbl>
 1 A NA NA
 2 A TRUE 1
 3 A TRUE 1
 4 A TRUE 1
 5 A FALSE NA
 6 A FALSE NA
 7 A TRUE 2
 8 A TRUE 2
 9 A FALSE NA
10 A TRUE 3
11 A TRUE 3
12 A TRUE 3
13 B NA NA
14 B FALSE NA
15 B TRUE 1
16 B TRUE 1
17 B TRUE 1
18 B FALSE NA

edited yesterday

answered yesterday

Sotos

31.4k51741

A pure Base R solution, we can create a custom function via rle, and use it per group, i.e.

f1 <- function(x) 
 x[is.na(x)] <- FALSE
 rle1 <- rle(x)
 y <- rle1$values
 rle1$values[!y] <- 0
 rle1$values[y] <- cumsum(rle1$values[y])
 return(inverse.rle(rle1))



do.call(rbind, 
 lapply(split(df, df$group), function(i)i$goal <- f1(i$criterium); 
 i$goal <- replace(i$goal, is.na(i$criterium)))

Of course, If you want you can apply it via dplyr, i.e.

library(dplyr)

df %>% 
 group_by(group) %>% 
 mutate(goal = f1(criterium), 
 goal = replace(goal, is.na(criterium)|!criterium, NA))

which gives,

# A tibble: 18 x 3
# Groups: group [2]
 group criterium goal
 <fct> <lgl> <dbl>
 1 A NA NA
 2 A TRUE 1
 3 A TRUE 1
 4 A TRUE 1
 5 A FALSE NA
 6 A FALSE NA
 7 A TRUE 2
 8 A TRUE 2
 9 A FALSE NA
10 A TRUE 3
11 A TRUE 3
12 A TRUE 3
13 B NA NA
14 B FALSE NA
15 B TRUE 1
16 B TRUE 1
17 B TRUE 1
18 B FALSE NA

edited yesterday

answered yesterday

Sotos

31.4k51741

edited yesterday

answered yesterday

Sotos

31.4k51741

answered yesterday

Sotos

31.4k51741

answered yesterday

Sotos

31.4k51741

add a comment |

A data.table option using rle

library(data.table)
DT <- as.data.table(dat)
DT[, goal := 
 r <- rle(replace(criterium, is.na(criterium), FALSE))
 r$values <- with(r, cumsum(values) * values) 
 out <- inverse.rle(r) 
 replace(out, out == 0, NA)
, by = group]
DT
# group criterium goal
# 1: A NA NA
# 2: A TRUE 1
# 3: A TRUE 1
# 4: A TRUE 1
# 5: A FALSE NA
# 6: A FALSE NA
# 7: A TRUE 2
# 8: A TRUE 2
# 9: A FALSE NA
#10: A TRUE 3
#11: A TRUE 3
#12: A TRUE 3
#13: B NA NA
#14: B FALSE NA
#15: B TRUE 1
#16: B TRUE 1
#17: B TRUE 1
#18: B FALSE NA

step by step

When we call r <- rle(replace(criterium, is.na(criterium), FALSE)) we get an object of class rle

r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : logi [1:9] FALSE TRUE FALSE TRUE FALSE TRUE ...

We manipulate the values compenent in the following way

r$values <- with(r, cumsum(values) * values)
r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : int [1:9] 0 1 0 2 0 3 0 4 0

That is, we replaced TRUEs with the cumulative sum of values and set the FALSEs to 0. Now inverse.rle returns a vector in which values will repeated lenghts times

out <- inverse.rle(r)
out
# [1] 0 1 1 1 0 0 2 2 0 3 3 3 0 0 4 4 4 0

This is almost what OP wants but we need to replace the 0s with NA

replace(out, out == 0, NA)

This is done for each group.

data

dat <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE, 
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE, 
TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA, 
-18L))

edited yesterday

answered yesterday

markus

15.5k11336

Wow, impressive. Thanks for introducing me to rleand inverse.rle. Gruß nach Leipzig.

– Humpelstielzchen
yesterday

1

@Humpelstielzchen Gern geschehen. Will try to simplify and explain the logic a bit.

– markus
yesterday

Thanks! I was dissecting your answer just like that. Your answer taught me the most. But chinsoon12 is just a Teufelskerl. ^^

– Humpelstielzchen
yesterday

add a comment |

A data.table option using rle

library(data.table)
DT <- as.data.table(dat)
DT[, goal := 
 r <- rle(replace(criterium, is.na(criterium), FALSE))
 r$values <- with(r, cumsum(values) * values) 
 out <- inverse.rle(r) 
 replace(out, out == 0, NA)
, by = group]
DT
# group criterium goal
# 1: A NA NA
# 2: A TRUE 1
# 3: A TRUE 1
# 4: A TRUE 1
# 5: A FALSE NA
# 6: A FALSE NA
# 7: A TRUE 2
# 8: A TRUE 2
# 9: A FALSE NA
#10: A TRUE 3
#11: A TRUE 3
#12: A TRUE 3
#13: B NA NA
#14: B FALSE NA
#15: B TRUE 1
#16: B TRUE 1
#17: B TRUE 1
#18: B FALSE NA

step by step

When we call r <- rle(replace(criterium, is.na(criterium), FALSE)) we get an object of class rle

r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : logi [1:9] FALSE TRUE FALSE TRUE FALSE TRUE ...

We manipulate the values compenent in the following way

r$values <- with(r, cumsum(values) * values)
r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : int [1:9] 0 1 0 2 0 3 0 4 0

That is, we replaced TRUEs with the cumulative sum of values and set the FALSEs to 0. Now inverse.rle returns a vector in which values will repeated lenghts times

out <- inverse.rle(r)
out
# [1] 0 1 1 1 0 0 2 2 0 3 3 3 0 0 4 4 4 0

This is almost what OP wants but we need to replace the 0s with NA

replace(out, out == 0, NA)

This is done for each group.

data

dat <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE, 
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE, 
TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA, 
-18L))

edited yesterday

answered yesterday

markus

15.5k11336

Wow, impressive. Thanks for introducing me to rleand inverse.rle. Gruß nach Leipzig.

– Humpelstielzchen
yesterday

1

@Humpelstielzchen Gern geschehen. Will try to simplify and explain the logic a bit.

– markus
yesterday

Thanks! I was dissecting your answer just like that. Your answer taught me the most. But chinsoon12 is just a Teufelskerl. ^^

– Humpelstielzchen
yesterday
