data format and uploading issue

68 views Asked by At

Here is a snippet of data I have to analyze:

2014 log[1]: id="1" sys="Sec" sub="fill" severity="inf" dort="30" 2014 log[2]: id="2" sys="Sec" sub="fill" severity="inf" name="dropped"

Excuse me if this is an obvious question, but what kind of data is this, and how can I get it into a csv format to analyze? So far I have tried using R by specifying whitespace as the delimiter, but that didn't get me what I need:

table1<-read.table('data.txt', sep="", fill=TRUE, header = FALSE)

Ideally, "id", "sys", "sub", "severity", "dort", and "name"would be the column headers. Each row contains about the same number of variables, but some rows do not contain variables that other rows do. For those instances, I would like there to be "NULL". Ex: In row two in the data above, under the dort column I would like it to say "NULL"

1

There are 1 answers

3
Prem On BEST ANSWER

Hope this helps!

id="1" sys="Sec" sub="fill" severity="inf" dort="30" 
id="2" sys="Sec" sub="fill" severity="inf" name="dropped"

Let's consider that test.txt has above mentioned data then

library(rjson)
library(plyr)

#prepare json data
txt_file <- readLines("test.txt")
json_file <- lapply(txt_file, function(x) paste0('{"', gsub(',', ',"',gsub("=",'":',gsub('"\\s+', '",', gsub("^\\s+|\\s+$","",x)))), '}'))
json_file <- paste0("[", paste(json_file, collapse=","), "]")

#read json data
json_data <- fromJSON(json_file)

#convert it to dataframe
df <- rbind.fill(lapply(json_data,function(y){as.data.frame(t(y),stringsAsFactors=FALSE)}))
df[df=='NULL'] <- NA
df

Output is:

  id sys  sub severity dort    name
1  1 Sec fill      inf   30      NA
2  2 Sec fill      inf   NA dropped