Rank groups within a grouped sequence of TRUE/FALSE and NA


I have a little nut to crack.

I have a data.frame like this:

   group criterium
1      A        NA
2      A      TRUE
3      A      TRUE
4      A      TRUE
5      A     FALSE
6      A     FALSE
7      A      TRUE
8      A      TRUE
9      A     FALSE
10     A      TRUE
11     A      TRUE
12     A      TRUE
13     B        NA
14     B     FALSE
15     B      TRUE
16     B      TRUE
17     B      TRUE
18     B     FALSE

structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE, 
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE, 
TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA, 
-18L))

I want to rank the runs of consecutive TRUE values in column criterium in ascending order, disregarding FALSE and NA. The goal is to have a unique run identifier within each level of group.

So the result should look like:

   group criterium goal
1      A        NA   NA
2      A      TRUE    1
3      A      TRUE    1
4      A      TRUE    1
5      A     FALSE   NA
6      A     FALSE   NA
7      A      TRUE    2
8      A      TRUE    2
9      A     FALSE   NA
10     A      TRUE    3
11     A      TRUE    3
12     A      TRUE    3
13     B        NA   NA
14     B     FALSE   NA
15     B      TRUE    1
16     B      TRUE    1
17     B      TRUE    1
18     B     FALSE   NA

I'm sure there is a relatively easy way to do this; I just can't think of one. I experimented with dense_rank() and other dplyr window functions, but to no avail.

edited Apr 17 at 22:04 by TylerH

asked Apr 10 at 6:47 by Humpelstielzchen

You can just about get what you need with this work of beauty: as.numeric(as.factor(cumsum(is.na(d$criterium^NA)) + d$criterium^NA)). It just needs to be applied by group.
– user20650, Apr 10 at 8:45

That is a really funny solution. Very good job!
– Humpelstielzchen, Apr 10 at 8:49

In your example all of group A comes first, then group B. So we don't need to handle cases where rows with group=A, criterium=TRUE are interspersed with rows with group=B, criterium=TRUE?
– smci, Apr 10 at 8:50

No. When group A ends, so does the sequence for group A.
– Humpelstielzchen, Apr 10 at 8:51

But I'm suggesting that if you construct an example with group=A, criterium=TRUE followed by group=B, criterium=TRUE (with no FALSE values in between), would that get a new 'goal' number or not? Some of the answers here will fail because they don't group by group or consider the discontinuity in group.
– smci, Apr 10 at 8:53
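The scenario raised in the last comment can be checked on a small made-up example. The following is only a sketch in base R: `run_id` is a hypothetical helper written for this illustration (it is not taken from any of the answers), and the six-row data frame is invented so that group A ends with TRUE and group B starts with TRUE.

```r
# Hypothetical helper: number the runs of TRUE in a logical vector,
# returning NA for FALSE and NA entries.
run_id <- function(x) {
  x <- !is.na(x) & x                      # treat NA like FALSE
  starts <- x & !c(FALSE, head(x, -1))    # TRUE where a run of TRUE begins
  ifelse(x, cumsum(starts), NA_integer_)
}

# Made-up data: group A ends with TRUE, group B starts with TRUE
d <- data.frame(
  group     = c("A", "A", "A", "B", "B", "B"),
  criterium = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)
)

# Ignoring `group`: the second run of TRUE spans the A/B boundary
d$ungrouped <- run_id(d$criterium)
# 1 NA 2 2 NA 3

# Applying the helper per group: the numbering restarts in group B
d$grouped <- ave(d$criterium, d$group, FUN = run_id)
# 1 NA 2 1 NA 2
```

Only the grouped variant matches what the asker describes, where the sequence for group A stops when group A stops.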

r dplyr data.table rank

4 Answers

Another data.table approach:

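library(data.table)
setDT(dt)  # dt is assumed to hold the example data.frame; converted by reference
# cr: run id over the whole column; then, on the TRUE rows only,
# re-number the runs within each group
dt[, cr := rleid(criterium)][
  (criterium), goal := rleid(cr), by = .(group)]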

answered Apr 10 at 8:20 by chinsoon12

Maybe I have over-complicated this but one way with dplyr is

library(dplyr)

df %>%
 mutate(temp = replace(criterium, is.na(criterium), FALSE), 
 temp1 = cumsum(!temp)) %>%
 group_by(temp1) %>%
 mutate(goal = +(row_number() == which.max(temp) & any(temp))) %>%
 group_by(group) %>%
 mutate(goal = ifelse(temp, cumsum(goal), NA)) %>%
 select(-temp, -temp1)

# group criterium goal
# <fct> <lgl> <int>
# 1 A NA NA
# 2 A TRUE 1
# 3 A TRUE 1
# 4 A TRUE 1
# 5 A FALSE NA
# 6 A FALSE NA
# 7 A TRUE 2
# 8 A TRUE 2
# 9 A FALSE NA
#10 A TRUE 3
#11 A TRUE 3
#12 A TRUE 3
#13 B NA NA
#14 B FALSE NA
#15 B TRUE 1
#16 B TRUE 1
#17 B TRUE 1
#18 B FALSE NA

We first replace the NAs in the criterium column with FALSE (temp) and take the cumulative sum of its negation (temp1), so each run of TRUE shares a temp1 value with the non-TRUE row just before it. We then group_by temp1 and assign 1 to the first TRUE value in each of those blocks. Finally, grouping by group, we take the cumulative sum of these flags for TRUE rows and return NA for FALSE and NA rows.
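The intermediate values described above can be inspected one step at a time. This is a sketch only: it swaps in base R's `ave()` for the dplyr `group_by()`/`mutate()` calls so it runs without extra packages, rebuilds the example data with a character `group` column, and the name `first_true` is made up for this illustration.

```r
# Example data from the question (group as character instead of factor)
df <- data.frame(
  group = rep(c("A", "B"), c(12, 6)),
  criterium = c(NA, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE,
                TRUE, TRUE, TRUE, NA, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# Step 1: NA counts as FALSE
temp <- replace(df$criterium, is.na(df$criterium), FALSE)

# Step 2: the counter increases at every non-TRUE row, so a run of TRUE
# shares its value with the non-TRUE row just before it
temp1 <- cumsum(!temp)
# 1 1 1 1 2 3 3 3 4 4 4 4 5 6 6 6 6 7

# Step 3: flag the first TRUE within each temp1 block
first_true <- ave(temp, temp1, FUN = function(v) {
  seq_along(v) == which.max(v) & any(v)
})

# Step 4: per group, accumulate the flags; keep the count on TRUE rows only
goal <- ifelse(temp, ave(as.integer(first_true), df$group, FUN = cumsum), NA)
# NA 1 1 1 NA NA 2 2 NA 3 3 3 NA NA 1 1 1 NA
```

Note that `temp1` is computed over the whole column, not per group, so this walk-through relies on each group starting with a non-TRUE row, as it does in the example data.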

answered Apr 10 at 7:24 by Ronak Shah

A pure base R solution: we can create a custom function using rle and apply it per group, i.e.

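f1 <- function(x) {
  x[is.na(x)] <- FALSE                       # treat NA like FALSE
  rle1 <- rle(x)
  y <- rle1$values
  rle1$values[!y] <- 0                       # FALSE runs get 0
  rle1$values[y] <- cumsum(rle1$values[y])   # TRUE runs get 1, 2, ...
  return(inverse.rle(rle1))
}

do.call(rbind,
  lapply(split(df, df$group), function(i) {
    i$goal <- f1(i$criterium)
    # set the FALSE and NA rows back to NA, as in the dplyr version
    i$goal <- replace(i$goal, is.na(i$criterium) | !i$criterium, NA)
    i
  }))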

Of course, if you want, you can apply it via dplyr, i.e.

library(dplyr)

df %>% 
 group_by(group) %>% 
 mutate(goal = f1(criterium), 
 goal = replace(goal, is.na(criterium)|!criterium, NA))

which gives,

# A tibble: 18 x 3
# Groups: group [2]
 group criterium goal
 <fct> <lgl> <dbl>
 1 A NA NA
 2 A TRUE 1
 3 A TRUE 1
 4 A TRUE 1
 5 A FALSE NA
 6 A FALSE NA
 7 A TRUE 2
 8 A TRUE 2
 9 A FALSE NA
10 A TRUE 3
11 A TRUE 3
12 A TRUE 3
13 B NA NA
14 B FALSE NA
15 B TRUE 1
16 B TRUE 1
17 B TRUE 1
18 B FALSE NA

edited Apr 10 at 9:59

answered Apr 10 at 7:29 by Sotos

A data.table option using rle

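library(data.table)
DT <- as.data.table(dat)
DT[, goal := {
  r <- rle(replace(criterium, is.na(criterium), FALSE))  # runs, with NA treated as FALSE
  r$values <- with(r, cumsum(values) * values)           # TRUE runs -> 1, 2, ...; FALSE runs -> 0
  out <- inverse.rle(r)                                  # expand back to one value per row
  replace(out, out == 0, NA)
}, by = group]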
DT
# group criterium goal
# 1: A NA NA
# 2: A TRUE 1
# 3: A TRUE 1
# 4: A TRUE 1
# 5: A FALSE NA
# 6: A FALSE NA
# 7: A TRUE 2
# 8: A TRUE 2
# 9: A FALSE NA
#10: A TRUE 3
#11: A TRUE 3
#12: A TRUE 3
#13: B NA NA
#14: B FALSE NA
#15: B TRUE 1
#16: B TRUE 1
#17: B TRUE 1
#18: B FALSE NA

step by step

When we call r <- rle(replace(criterium, is.na(criterium), FALSE)), we get an object of class rle:

r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : logi [1:9] FALSE TRUE FALSE TRUE FALSE TRUE ...

We manipulate the values component in the following way:

r$values <- with(r, cumsum(values) * values)
r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : int [1:9] 0 1 0 2 0 3 0 4 0

That is, we replaced the TRUEs with the cumulative sum of values and set the FALSEs to 0. Now inverse.rle returns a vector in which each element of values is repeated lengths times:

out <- inverse.rle(r)
out
# [1] 0 1 1 1 0 0 2 2 0 3 3 3 0 0 4 4 4 0

This is almost what OP wants but we need to replace the 0s with NA

replace(out, out == 0, NA)

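In the data.table call this is done for each group (by = group), so the numbering restarts at 1 in group B. The outputs shown in this walk-through were computed on the whole column, which is why the last run is numbered 4 there.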
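To see the per-group effect in isolation, the same steps can be run on the criterium values of group B alone. This is a small sketch in base R; the names `crit_B` and `goal_B` are made up for the illustration.

```r
# criterium values of group B only (rows 13 to 18 of the example data)
crit_B <- c(NA, FALSE, TRUE, TRUE, TRUE, FALSE)

r <- rle(replace(crit_B, is.na(crit_B), FALSE))
r$lengths
# [1] 2 3 1

r$values <- with(r, cumsum(values) * values)
r$values
# [1] 0 1 0

out <- inverse.rle(r)
goal_B <- replace(out, out == 0, NA)
goal_B
# [1] NA NA  1  1  1 NA
```

Within group B there is a single run of TRUE, so it is numbered 1 rather than 4.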

data

dat <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE, 
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE, 
TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA, 
-18L))

edited Apr 10 at 11:34

answered Apr 10 at 7:26 by markus

Wow, impressive. Thanks for introducing me to rle and inverse.rle. Greetings to Leipzig.
– Humpelstielzchen, Apr 10 at 7:59

@Humpelstielzchen You're welcome. Will try to simplify and explain the logic a bit.
– markus, Apr 10 at 8:02

Thanks! I was dissecting your answer just like that. Your answer taught me the most. But chinsoon12 is just a hell of a guy. ^^
– Humpelstielzchen, Apr 10 at 8:36

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55606323%2frank-groups-within-a-grouped-sequence-of-true-false-and-na%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

4 Answers
4

active

oldest

votes

4 Answers
4

active

oldest

votes

Another data.table approach:

library(data.table)
setDT(dt)
dt[, cr := rleid(criterium)][
 (criterium), goal := rleid(cr), by=.(group)]

answered Apr 10 at 8:20

chinsoon12

10.1k11420

add a comment |

Another data.table approach:

library(data.table)
setDT(dt)
dt[, cr := rleid(criterium)][
 (criterium), goal := rleid(cr), by=.(group)]

answered Apr 10 at 8:20

chinsoon12

10.1k11420

add a comment |

Another data.table approach:

library(data.table)
setDT(dt)
dt[, cr := rleid(criterium)][
 (criterium), goal := rleid(cr), by=.(group)]

answered Apr 10 at 8:20

chinsoon12

10.1k11420

Another data.table approach:

library(data.table)
setDT(dt)
dt[, cr := rleid(criterium)][
 (criterium), goal := rleid(cr), by=.(group)]

answered Apr 10 at 8:20

chinsoon12

10.1k11420

answered Apr 10 at 8:20

chinsoon12

10.1k11420

answered Apr 10 at 8:20

chinsoon12

10.1k11420

answered Apr 10 at 8:20

chinsoon12

10.1k11420

add a comment |

Maybe I have over-complicated this but one way with dplyr is

library(dplyr)

df %>%
 mutate(temp = replace(criterium, is.na(criterium), FALSE), 
 temp1 = cumsum(!temp)) %>%
 group_by(temp1) %>%
 mutate(goal = +(row_number() == which.max(temp) & any(temp))) %>%
 group_by(group) %>%
 mutate(goal = ifelse(temp, cumsum(goal), NA)) %>%
 select(-temp, -temp1)

# group criterium goal
# <fct> <lgl> <int>
# 1 A NA NA
# 2 A TRUE 1
# 3 A TRUE 1
# 4 A TRUE 1
# 5 A FALSE NA
# 6 A FALSE NA
# 7 A TRUE 2
# 8 A TRUE 2
# 9 A FALSE NA
#10 A TRUE 3
#11 A TRUE 3
#12 A TRUE 3
#13 B NA NA
#14 B FALSE NA
#15 B TRUE 1
#16 B TRUE 1
#17 B TRUE 1
#18 B FALSE NA

answered Apr 10 at 7:24

Ronak Shah

51.4k104370

add a comment |

Maybe I have over-complicated this but one way with dplyr is

library(dplyr)

df %>%
 mutate(temp = replace(criterium, is.na(criterium), FALSE), 
 temp1 = cumsum(!temp)) %>%
 group_by(temp1) %>%
 mutate(goal = +(row_number() == which.max(temp) & any(temp))) %>%
 group_by(group) %>%
 mutate(goal = ifelse(temp, cumsum(goal), NA)) %>%
 select(-temp, -temp1)

# group criterium goal
# <fct> <lgl> <int>
# 1 A NA NA
# 2 A TRUE 1
# 3 A TRUE 1
# 4 A TRUE 1
# 5 A FALSE NA
# 6 A FALSE NA
# 7 A TRUE 2
# 8 A TRUE 2
# 9 A FALSE NA
#10 A TRUE 3
#11 A TRUE 3
#12 A TRUE 3
#13 B NA NA
#14 B FALSE NA
#15 B TRUE 1
#16 B TRUE 1
#17 B TRUE 1
#18 B FALSE NA

answered Apr 10 at 7:24

Ronak Shah

51.4k104370

add a comment |

Maybe I have over-complicated this but one way with dplyr is

library(dplyr)

df %>%
 mutate(temp = replace(criterium, is.na(criterium), FALSE), 
 temp1 = cumsum(!temp)) %>%
 group_by(temp1) %>%
 mutate(goal = +(row_number() == which.max(temp) & any(temp))) %>%
 group_by(group) %>%
 mutate(goal = ifelse(temp, cumsum(goal), NA)) %>%
 select(-temp, -temp1)

# group criterium goal
# <fct> <lgl> <int>
# 1 A NA NA
# 2 A TRUE 1
# 3 A TRUE 1
# 4 A TRUE 1
# 5 A FALSE NA
# 6 A FALSE NA
# 7 A TRUE 2
# 8 A TRUE 2
# 9 A FALSE NA
#10 A TRUE 3
#11 A TRUE 3
#12 A TRUE 3
#13 B NA NA
#14 B FALSE NA
#15 B TRUE 1
#16 B TRUE 1
#17 B TRUE 1
#18 B FALSE NA

answered Apr 10 at 7:24

Ronak Shah

51.4k104370

Maybe I have over-complicated this but one way with dplyr is

library(dplyr)

df %>%
 mutate(temp = replace(criterium, is.na(criterium), FALSE), 
 temp1 = cumsum(!temp)) %>%
 group_by(temp1) %>%
 mutate(goal = +(row_number() == which.max(temp) & any(temp))) %>%
 group_by(group) %>%
 mutate(goal = ifelse(temp, cumsum(goal), NA)) %>%
 select(-temp, -temp1)

# group criterium goal
# <fct> <lgl> <int>
# 1 A NA NA
# 2 A TRUE 1
# 3 A TRUE 1
# 4 A TRUE 1
# 5 A FALSE NA
# 6 A FALSE NA
# 7 A TRUE 2
# 8 A TRUE 2
# 9 A FALSE NA
#10 A TRUE 3
#11 A TRUE 3
#12 A TRUE 3
#13 B NA NA
#14 B FALSE NA
#15 B TRUE 1
#16 B TRUE 1
#17 B TRUE 1
#18 B FALSE NA

answered Apr 10 at 7:24

Ronak Shah

51.4k104370

answered Apr 10 at 7:24

Ronak Shah

51.4k104370

answered Apr 10 at 7:24

Ronak Shah

51.4k104370

answered Apr 10 at 7:24

Ronak Shah

51.4k104370

add a comment |

A pure Base R solution, we can create a custom function via rle, and use it per group, i.e.

f1 <- function(x) 
 x[is.na(x)] <- FALSE
 rle1 <- rle(x)
 y <- rle1$values
 rle1$values[!y] <- 0
 rle1$values[y] <- cumsum(rle1$values[y])
 return(inverse.rle(rle1))



do.call(rbind, 
 lapply(split(df, df$group), function(i)i$goal <- f1(i$criterium); 
 i$goal <- replace(i$goal, is.na(i$criterium)))

Of course, If you want you can apply it via dplyr, i.e.

library(dplyr)

df %>% 
 group_by(group) %>% 
 mutate(goal = f1(criterium), 
 goal = replace(goal, is.na(criterium)|!criterium, NA))

which gives,

# A tibble: 18 x 3
# Groups: group [2]
 group criterium goal
 <fct> <lgl> <dbl>
 1 A NA NA
 2 A TRUE 1
 3 A TRUE 1
 4 A TRUE 1
 5 A FALSE NA
 6 A FALSE NA
 7 A TRUE 2
 8 A TRUE 2
 9 A FALSE NA
10 A TRUE 3
11 A TRUE 3
12 A TRUE 3
13 B NA NA
14 B FALSE NA
15 B TRUE 1
16 B TRUE 1
17 B TRUE 1
18 B FALSE NA

edited Apr 10 at 9:59

answered Apr 10 at 7:29

Sotos

32.2k51843

add a comment |

A pure Base R solution, we can create a custom function via rle, and use it per group, i.e.

f1 <- function(x) 
 x[is.na(x)] <- FALSE
 rle1 <- rle(x)
 y <- rle1$values
 rle1$values[!y] <- 0
 rle1$values[y] <- cumsum(rle1$values[y])
 return(inverse.rle(rle1))



do.call(rbind, 
 lapply(split(df, df$group), function(i)i$goal <- f1(i$criterium); 
 i$goal <- replace(i$goal, is.na(i$criterium)))

Of course, If you want you can apply it via dplyr, i.e.

library(dplyr)

df %>% 
 group_by(group) %>% 
 mutate(goal = f1(criterium), 
 goal = replace(goal, is.na(criterium)|!criterium, NA))

which gives,

# A tibble: 18 x 3
# Groups: group [2]
 group criterium goal
 <fct> <lgl> <dbl>
 1 A NA NA
 2 A TRUE 1
 3 A TRUE 1
 4 A TRUE 1
 5 A FALSE NA
 6 A FALSE NA
 7 A TRUE 2
 8 A TRUE 2
 9 A FALSE NA
10 A TRUE 3
11 A TRUE 3
12 A TRUE 3
13 B NA NA
14 B FALSE NA
15 B TRUE 1
16 B TRUE 1
17 B TRUE 1
18 B FALSE NA

edited Apr 10 at 9:59

answered Apr 10 at 7:29

Sotos

32.2k51843

add a comment |

A pure Base R solution, we can create a custom function via rle, and use it per group, i.e.

f1 <- function(x) 
 x[is.na(x)] <- FALSE
 rle1 <- rle(x)
 y <- rle1$values
 rle1$values[!y] <- 0
 rle1$values[y] <- cumsum(rle1$values[y])
 return(inverse.rle(rle1))



do.call(rbind, 
 lapply(split(df, df$group), function(i)i$goal <- f1(i$criterium); 
 i$goal <- replace(i$goal, is.na(i$criterium)))

Of course, If you want you can apply it via dplyr, i.e.

library(dplyr)

df %>% 
 group_by(group) %>% 
 mutate(goal = f1(criterium), 
 goal = replace(goal, is.na(criterium)|!criterium, NA))

which gives,

# A tibble: 18 x 3
# Groups: group [2]
 group criterium goal
 <fct> <lgl> <dbl>
 1 A NA NA
 2 A TRUE 1
 3 A TRUE 1
 4 A TRUE 1
 5 A FALSE NA
 6 A FALSE NA
 7 A TRUE 2
 8 A TRUE 2
 9 A FALSE NA
10 A TRUE 3
11 A TRUE 3
12 A TRUE 3
13 B NA NA
14 B FALSE NA
15 B TRUE 1
16 B TRUE 1
17 B TRUE 1
18 B FALSE NA

edited Apr 10 at 9:59

answered Apr 10 at 7:29

Sotos

32.2k51843

A pure Base R solution, we can create a custom function via rle, and use it per group, i.e.

f1 <- function(x) 
 x[is.na(x)] <- FALSE
 rle1 <- rle(x)
 y <- rle1$values
 rle1$values[!y] <- 0
 rle1$values[y] <- cumsum(rle1$values[y])
 return(inverse.rle(rle1))



do.call(rbind, 
 lapply(split(df, df$group), function(i)i$goal <- f1(i$criterium); 
 i$goal <- replace(i$goal, is.na(i$criterium)))

Of course, If you want you can apply it via dplyr, i.e.

library(dplyr)

df %>% 
 group_by(group) %>% 
 mutate(goal = f1(criterium), 
 goal = replace(goal, is.na(criterium)|!criterium, NA))

which gives,

# A tibble: 18 x 3
# Groups: group [2]
 group criterium goal
 <fct> <lgl> <dbl>
 1 A NA NA
 2 A TRUE 1
 3 A TRUE 1
 4 A TRUE 1
 5 A FALSE NA
 6 A FALSE NA
 7 A TRUE 2
 8 A TRUE 2
 9 A FALSE NA
10 A TRUE 3
11 A TRUE 3
12 A TRUE 3
13 B NA NA
14 B FALSE NA
15 B TRUE 1
16 B TRUE 1
17 B TRUE 1
18 B FALSE NA

edited Apr 10 at 9:59

answered Apr 10 at 7:29

Sotos

32.2k51843

edited Apr 10 at 9:59

answered Apr 10 at 7:29

Sotos

32.2k51843

answered Apr 10 at 7:29

Sotos

32.2k51843

answered Apr 10 at 7:29

Sotos

32.2k51843

add a comment |

A data.table option using rle

library(data.table)
DT <- as.data.table(dat)
DT[, goal := 
 r <- rle(replace(criterium, is.na(criterium), FALSE))
 r$values <- with(r, cumsum(values) * values) 
 out <- inverse.rle(r) 
 replace(out, out == 0, NA)
, by = group]
DT
# group criterium goal
# 1: A NA NA
# 2: A TRUE 1
# 3: A TRUE 1
# 4: A TRUE 1
# 5: A FALSE NA
# 6: A FALSE NA
# 7: A TRUE 2
# 8: A TRUE 2
# 9: A FALSE NA
#10: A TRUE 3
#11: A TRUE 3
#12: A TRUE 3
#13: B NA NA
#14: B FALSE NA
#15: B TRUE 1
#16: B TRUE 1
#17: B TRUE 1
#18: B FALSE NA

step by step

When we call r <- rle(replace(criterium, is.na(criterium), FALSE)) we get an object of class rle

r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : logi [1:9] FALSE TRUE FALSE TRUE FALSE TRUE ...

We manipulate the values compenent in the following way

r$values <- with(r, cumsum(values) * values)
r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : int [1:9] 0 1 0 2 0 3 0 4 0

That is, we replaced TRUEs with the cumulative sum of values and set the FALSEs to 0. Now inverse.rle returns a vector in which values will repeated lenghts times

out <- inverse.rle(r)
out
# [1] 0 1 1 1 0 0 2 2 0 3 3 3 0 0 4 4 4 0

This is almost what OP wants but we need to replace the 0s with NA

replace(out, out == 0, NA)

This is done for each group.

data

dat <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE, 
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE, 
TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA, 
-18L))

edited Apr 10 at 11:34

answered Apr 10 at 7:26

markus

16.4k11336

Wow, impressive. Thanks for introducing me to rleand inverse.rle. Gruß nach Leipzig.

– Humpelstielzchen
Apr 10 at 7:59

1

@Humpelstielzchen Gern geschehen. Will try to simplify and explain the logic a bit.

– markus
Apr 10 at 8:02

Thanks! I was dissecting your answer just like that. Your answer taught me the most. But chinsoon12 is just a Teufelskerl. ^^

– Humpelstielzchen
Apr 10 at 8:36

add a comment |

A data.table option using rle

library(data.table)
DT <- as.data.table(dat)
DT[, goal := 
 r <- rle(replace(criterium, is.na(criterium), FALSE))
 r$values <- with(r, cumsum(values) * values) 
 out <- inverse.rle(r) 
 replace(out, out == 0, NA)
, by = group]
DT
# group criterium goal
# 1: A NA NA
# 2: A TRUE 1
# 3: A TRUE 1
# 4: A TRUE 1
# 5: A FALSE NA
# 6: A FALSE NA
# 7: A TRUE 2
# 8: A TRUE 2
# 9: A FALSE NA
#10: A TRUE 3
#11: A TRUE 3
#12: A TRUE 3
#13: B NA NA
#14: B FALSE NA
#15: B TRUE 1
#16: B TRUE 1
#17: B TRUE 1
#18: B FALSE NA

step by step

When we call r <- rle(replace(criterium, is.na(criterium), FALSE)) we get an object of class rle

r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : logi [1:9] FALSE TRUE FALSE TRUE FALSE TRUE ...

We manipulate the values compenent in the following way

r$values <- with(r, cumsum(values) * values)
r
#Run Length Encoding
# lengths: int [1:9] 1 3 2 2 1 3 2 3 1
# values : int [1:9] 0 1 0 2 0 3 0 4 0

That is, we replaced TRUEs with the cumulative sum of values and set the FALSEs to 0. Now inverse.rle returns a vector in which values will repeated lenghts times

out <- inverse.rle(r)
out
# [1] 0 1 1 1 0 0 2 2 0 3 3 3 0 0 4 4 4 0

This is almost what OP wants but we need to replace the 0s with NA

replace(out, out == 0, NA)

This is done for each group.

data

dat <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE, 
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE, 
TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA, 
-18L))

edited Apr 10 at 11:34

answered Apr 10 at 7:26

markus

16.4k11336

Wow, impressive. Thanks for introducing me to rleand inverse.rle. Gruß nach Leipzig.

– Humpelstielzchen
Apr 10 at 7:59

1

@Humpelstielzchen Gern geschehen. Will try to simplify and explain the logic a bit.

– markus
Apr 10 at 8:02

Thanks! I was dissecting your answer just like that. Your answer taught me the most. But chinsoon12 is just a Teufelskerl. ^^

– Humpelstielzchen
Apr 10 at 8:36

add a comment |



