How to apply a specific function to a range of columns (applying it to each column separately) in R?

Question

Asked by A. Stefanov on 10 January 2017 at 11:14

This is what my data looks like (it is SNP data):

AA CC CA GG  
GA CA CC GG  
GG CCCC CAA GG  
CA GG CC GC

This is what I want it to look like after case 2 (row 3 is removed because column 2 has more than two characters, and every column is split in two):

A A C C C A G G  
G A C A C C G G  
C A G G C C G C

Case 1
What I use at the moment:

mydata <- mydata[which(!nchar(as.character(mydata[,5]))>2),]
mydata <- mydata[which(!nchar(as.character(mydata[,6]))>2),]
mydata <- mydata[which(!nchar(as.character(mydata[,7]))>2),]

I want it to be:

mydata <- mydata[which(!nchar(as.character(mydata[,5:7]))>2),]

The problem is that the function is evaluated on columns 5:7 as a whole, and every row gets deleted. I want the same filtering, but applied to each column separately rather than to all of them together.
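
A minimal sketch of the per-column check in base R. Calling as.character() on several data frame columns at once deparses each column into one long string, so nchar() returns one large value per column instead of one length per cell, which is likely why every row is dropped. The toy data, its column names, and the range in `cols` are assumptions for illustration; the real table would use 5:7.

```r
# Toy data standing in for the SNP table (column names are made up)
mydata <- data.frame(
  V1 = c("AA", "GA", "GG", "CA"),
  V2 = c("CC", "CA", "CCCC", "GG"),
  V3 = c("CA", "CC", "CAA", "CC"),
  V4 = c("GG", "GG", "GG", "GC"),
  stringsAsFactors = FALSE
)

cols <- 2:3  # hypothetical range; 5:7 in the question

# Apply nchar() to each column separately (one logical vector per column),
# then combine the vectors with OR: TRUE means at least one of the
# checked cells in that row is longer than 2 characters.
too_long <- Reduce(
  `|`,
  lapply(mydata[cols], function(x) nchar(as.character(x)) > 2)
)

# Keep only the rows where no checked column is too long
filtered <- mydata[!too_long, , drop = FALSE]
```

Reduce() over lapply() keeps the result a plain logical vector even when the table has a single row, which is why it is used here instead of sapply() followed by rowSums().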

Case 2
My code uses these libraries:

library(dplyr)
library(splitstackshape)

Run for each column, this splits the cells. This is the version for column 6:

data2$V6 = as.character(data2$V6)
data2 <- cSplit(data.frame(data2 %>% rowwise() %>%
mutate(V6 = V6, V6n = paste(unlist(strsplit(V6, "")),
collapse = ','))), "V6n", ",")
data2$V5 <- NULL

I do the same for every column. The problem is that I want to do it for all columns at once. A potential solution would be some kind of loop, but I can't make it work. Any help will be appreciated.


**David Arenburg** · Accepted Answer · 2017-01-10T12:22:58+00:00

Here's a fully vectorized solution that produces your desired output:

## Paste the columns together so that each row becomes a single string
tmp <- do.call(paste0, mydata)

## Remove too long rows, split and rbind
do.call(rbind, strsplit(tmp[nchar(tmp) == 2 * ncol(mydata)], "", fixed = TRUE))
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,] "A"  "A"  "C"  "C"  "C"  "A"  "G"  "G" 
# [2,] "G"  "A"  "C"  "A"  "C"  "C"  "G"  "G" 
# [3,] "C"  "A"  "G"  "G"  "C"  "C"  "G"  "C"

This will result in a matrix, but it can easily be converted to a data.frame if needed.
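
As a rough sketch of that conversion, the matrix can be passed to as.data.frame(). The toy data and the `_a`/`_b` column suffixes below are assumptions for illustration, not part of the original answer.

```r
# Toy data standing in for the SNP table (column names are made up)
mydata <- data.frame(
  V1 = c("AA", "GA", "GG", "CA"),
  V2 = c("CC", "CA", "CCCC", "GG"),
  V3 = c("CA", "CC", "CAA", "CC"),
  V4 = c("GG", "GG", "GG", "GC"),
  stringsAsFactors = FALSE
)

# Same steps as in the answer: one string per row, drop rows whose
# total length is not 2 characters per column, split into single letters
tmp <- do.call(paste0, mydata)
alleles <- do.call(
  rbind,
  strsplit(tmp[nchar(tmp) == 2 * ncol(mydata)], "", fixed = TRUE)
)

# Convert the character matrix to a data.frame
result <- as.data.frame(alleles, stringsAsFactors = FALSE)

# Hypothetical naming scheme: two allele columns per original column
names(result) <- paste0(rep(names(mydata), each = 2), c("_a", "_b"))
```

Note that the length test looks at the whole row, so a row with one cell that is too short and another that is too long by the same amount would still pass. If that can occur in the data, a per-column nchar() check before pasting would avoid it.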
