transposing data and sequence mining most common patterns in rows

Asked by ATF on 15 August 2018 at 19:33

I have a data frame that looks like this:

              SFOpID Number MAGroupID
1 0032A00002cgs3XQAQ      1        99
2 0032A00002cgs3XQAQ      1        79
3 003F000001vyUGKIA2      2         8
4 0032A00002btWE6QAM      3        97
5 0032A00002btWE6QAM      3        86
6 0032A00002btWE6QAM      3        35

I need to transpose it so that it looks like this:

              SFOpID Number MAGroupID
1 0032A00002cgs3XQAQ      1        99  79
3 003F000001vyUGKIA2      2         8

Then I need to generate counts for the five most common sequences. For example, 12 people (SFOpID) have the 97 86 35 sequence, but only 4 people have the 99 79 sequence. I think this may be possible with the arulesSequences package, doing something like the following:

library(arulesSequences)

# Read the example basket file that ships with arulesSequences
x <- read_baskets(con = system.file("misc", "zaki.txt", package = "arulesSequences"),
                  info = c("sequenceID", "eventID", "SIZE"))
as(x, "data.frame")

The goal is to have output that looks like this:

       items sequenceID eventID SIZE
 1      {C,D}          1      10    2
 2    {A,B,C}          1      15    3
 3    {A,B,F}          1      20    3
 4  {A,C,D,F}          1      25    4
 5    {A,B,F}          2      15    3

The only difference is that, for items, each entry would be a sequence like {99, 79} or {97, 86, 35}.
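For reference, the zaki.txt example file stores one basket per line as sequenceID, eventID, SIZE, followed by the items. Below is a minimal base R sketch of writing the question's data in that layout. It assumes one basket per person, so eventID is fixed at 1, and the names `groups`, `basket_lines` and `basket_file` are made up for illustration. Whether the resulting file suits a given mining call has not been checked here.

```r
# Toy data copied from the question (only the columns needed here).
df <- data.frame(
  Number    = c(1, 1, 2, 3, 3, 3),
  MAGroupID = c(99, 79, 8, 97, 86, 35)
)

# Split MAGroupID by person; split() keeps the original row order in each group.
groups <- split(df$MAGroupID, df$Number)

# Build one line per person: sequenceID, eventID, SIZE, then the items.
# eventID is fixed at 1 (assumption: a single basket per person).
basket_lines <- vapply(names(groups), function(id) {
  items <- groups[[id]]
  paste(id, 1, length(items), paste(items, collapse = " "))
}, character(1), USE.NAMES = FALSE)

# Write the lines to a temporary file and show what was written.
basket_file <- tempfile(fileext = ".txt")
writeLines(basket_lines, basket_file)
print(readLines(basket_file))
```

The path in `basket_file` could then be tried as the `con` argument of the `read_baskets` call shown in the question, in place of the `system.file(...)` path.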

**Nar** · Answer 1 · 2018-08-15T22:49:03+00:00

You can use group_by and then nest to collect each group's values into one list. The list can then be converted to text. Here is an example:

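  library(dplyr)
  library(tidyr)  # nest() comes from tidyr, not dplyr

  code <- read.csv("code.csv", stringsAsFactors = FALSE)

  # Assumes columns 2 to 4 are SFOpID, Number and MAGroupID.
  # Group by person so that nest() collects that person's MAGroupID values.
  output <- code[, 2:4] %>%
    group_by(SFOpID, Number) %>%
    nest()
  output$data <- as.character(output$data)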
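The counting step the question asks for (the five most common sequences) could be sketched in base R as follows. This is only an illustration on the six rows shown in the question, not the answer author's method. It assumes a sequence is a person's MAGroupID values in their original row order, and the names `seqs`, `counts` and `top5` are made up for the example.

```r
# Toy data copied from the question.
df <- data.frame(
  SFOpID = c("0032A00002cgs3XQAQ", "0032A00002cgs3XQAQ", "003F000001vyUGKIA2",
             "0032A00002btWE6QAM", "0032A00002btWE6QAM", "0032A00002btWE6QAM"),
  Number = c(1, 1, 2, 3, 3, 3),
  MAGroupID = c(99, 79, 8, 97, 86, 35),
  stringsAsFactors = FALSE
)

# One row per person: paste that person's MAGroupID values into one string,
# keeping the order in which the rows appear.
seqs <- aggregate(MAGroupID ~ SFOpID + Number, data = df,
                  FUN = function(x) paste(x, collapse = " "))
names(seqs)[3] <- "sequence"

# Count how many people share each sequence, then keep the five most common.
counts <- sort(table(seqs$sequence), decreasing = TRUE)
top5 <- head(counts, 5)

print(seqs)
print(top5)
```

On the six sample rows every sequence occurs once, so the ranking only becomes informative on the full data set.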
