R: Plot Survival curve with types of death as stacked area chart -


i have dataset right censored containing information of life times , different types of deaths given sample , want produce plot of survival curve (with actual values calculated sample , not model estimation) different types of death stacked area chart, this: desired plot

how can accomplish in r?

the dataset this:

death type time event 1       type3   81     1 2          na  868     0 3       type3 1022     1 4          na  868     0 5          na  868     0 6          na  868     0 7          na  868     0 8          na  887     0 9       type3  156     1 10         na  868     0 11         na  868     0 12         na  868     0 13      type3  354     1 14      type3  700     1 15      type3  632     1 16         na  868     0 17      type1  308     1 18         na 1001     0 19         na 1054     0 20         na 1059     0 21      type3  120     1 22         na  732     0 23      type3  543     1 24      type1  379     1 25         na  613     0 26         na 1082     0 27      type3  226     1 28      type2    1     0 29         na  976     0 30         na 1000     0 31         na  706     0 32         na 1015     0 33      type3  882     1 34         na 1088     0 35         na  642     0 36      type3  953     1 37         na 1068     0 38         na  819     0 39         na 1029     0 40      type3   34     1 41         na 1082     0 42      type3  498     1 43         na  923     0 44         na 1041     0 45      type3  321     1 46         na  557     0 47         na  628     0 48      type3  197     1 49      type3  155     1 50         na  955     0 

where death type na indicates censored data, time time of death or time of censoring, , event 1 dead , 0 censored. (this format required 'survfit' have actual start , end times dates)

(now, 50 points wouldn't possible construct such curve, data has lot more rows wouldn't fit here).

it's ugly bit of code, gets idea in. didn't take time figure out how add legend. please note kind of figure, while interesting in concept, isn't going mirror km curve. honest, if you're going present data way, makes more sense stacked bars @ fixed time points.

please note, i'm pretty sure there flaws lurking in code. comes no warranty, might started.

survdata <- structure(list(row.names = c("", "", "", "", "", "", "", "",  "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",  "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",  "", "", "", "", "", "", "", "", "", ""), death = 1:50, type = c("type3",  na, "type3", na, na, na, na, na, "type3", na, na, na, "type3",  "type3", "type3", na, "type1", na, na, na, "type3", na, "type3",  "type1", na, na, "type3", "type2", na, na, na, na, "type3", na,  na, "type3", na, na, na, "type3", na, "type3", na, na, "type3",  na, na, "type3", "type3", na), time = c(81l, 868l, 1022l, 868l,  868l, 868l, 868l, 887l, 156l, 868l, 868l, 868l, 354l, 700l, 632l,  868l, 308l, 1001l, 1054l, 1059l, 120l, 732l, 543l, 379l, 613l,  1082l, 226l, 1l, 976l, 1000l, 706l, 1015l, 882l, 1088l, 642l,  953l, 1068l, 819l, 1029l, 34l, 1082l, 498l, 923l, 1041l, 321l,  557l, 628l, 197l, 155l, 955l), event = c(1l, 0l, 1l, 0l, 0l,  0l, 0l, 0l, 1l, 0l, 0l, 0l, 1l, 1l, 1l, 0l, 1l, 0l, 0l, 0l, 1l,  0l, 1l, 1l, 0l, 0l, 1l, 0l, 0l, 0l, 0l, 0l, 1l, 0l, 0l, 1l, 0l,  0l, 0l, 1l, 0l, 1l, 0l, 0l, 1l, 0l, 0l, 1l, 1l, 0l)), .names = c("row.names",  "death", "type", "time", "event"), class = "data.frame", row.names = c(na,  -50l))  library(dplyr) library(zoo) library(rcolorbrewer) survdatasummary <-    arrange(survdata, time, type) %>%   mutate(type = ifelse(is.na(type), "alive", type)) %>%   group_by(time) %>%   #* count number of each type @ each time point   summarise(n_at_time = n(),             alive_at_time = sum(type == "alive"),             type1_at_time = sum(type == "type1"),             type2_at_time = sum(type == "type2"),             type3_at_time = sum(type == "type3")) %>%   ungroup() %>%   mutate(n_alive = sum(n_at_time) - cumsum(lag(n_at_time, default = 0)),          #* proportion of each type          p_type1_at_time = type1_at_time / n_alive,          p_type2_at_time = type2_at_time / n_alive,          p_type3_at_time = type3_at_time / n_alive,          #* convert 0 na          p_type1_at_time = ifelse(p_type1_at_time == 0, na, p_type1_at_time),          p_type2_at_time = ifelse(p_type2_at_time == 0, na, p_type2_at_time),          p_type3_at_time = ifelse(p_type3_at_time == 0, na, p_type3_at_time),          #* fill nas last known value          p_type1_at_time = na.locf(p_type1_at_time, false),          p_type2_at_time = na.locf(p_type2_at_time, false),          p_type3_at_time = na.locf(p_type3_at_time, false),          #* make leading nas 0          p_type1_at_time = ifelse(is.na(p_type1_at_time), 0, p_type1_at_time),          p_type2_at_time = ifelse(is.na(p_type2_at_time), 0, p_type2_at_time),          p_type3_at_time = ifelse(is.na(p_type3_at_time), 0, p_type3_at_time),          #* calculate cumulative proportions          p_alive_at_time = 1 - p_type1_at_time - p_type2_at_time - p_type3_at_time,          cump_type1_at_time = p_alive_at_time + p_type1_at_time,          cump_type2_at_time = cump_type1_at_time + p_type2_at_time,          cump_type3_at_time = cump_type2_at_time + p_type3_at_time,          #* following time using geom_rect          next_time = lead(time)) %>%  pal <- brewer.pal(4, "prgn") ggplot(survdatasummary,        aes(xmin = time,            xmax = next_time)) +    geom_rect(aes(ymin = 0, ymax = p_alive_at_time), fill = pal[1]) +    geom_rect(aes(ymin = p_alive_at_time, ymax = cump_type1_at_time), fill = pal[2]) +    geom_rect(aes(ymin = cump_type1_at_time, ymax = cump_type2_at_time), fill = pal[3]) +    geom_rect(aes(ymin = cump_type2_at_time, ymax = cump_type3_at_time), fill = pal[4]) 

enter image description here


Comments

Popular posts from this blog

python - pip install -U PySide error -

arrays - C++ error: a brace-enclosed initializer is not allowed here before ‘{’ token -

cytoscape.js - How to add nodes to Dagre layout with Cytoscape -