How to save hdf5 as a txt or csv in R?


Using the info here, I looked at the structure of my HDF5 file:

# biocLite() has been retired; current Bioconductor releases install packages via BiocManager
install.packages("BiocManager")
BiocManager::install("rhdf5")

library(rhdf5)

> str(h5ls("C:/Users/durraniu/hd5_file"))
'data.frame':   400 obs. of  5 variables:
 $ group : chr  "/" "/data" "/data" "/data" ...
 $ name  : chr  "data" "ACC_State" "ACC_State_Frames" "ACC_Voltage" ...
 $ otype : Factor w/ 15 levels "H5I_FILE","H5I_GROUP",..: 2 5 5 5 5 5 5 5 5 5 ...
 $ dclass: chr  "" "INTEGER" "INTEGER" "FLOAT" ...
 $ dim   : chr  "" "1 x 1" "1" "15869 x 1" ...

Some details:

> head(h5ls("C:/Users/durraniu/hd5_file"))
  group                           name       otype  dclass       dim
0     /                           data   H5I_GROUP                  
1 /data                      ACC_State H5I_DATASET INTEGER     1 x 1
2 /data               ACC_State_Frames H5I_DATASET INTEGER         1
3 /data                    ACC_Voltage H5I_DATASET   FLOAT 15869 x 1
4 /data CFS_Accelerator_Pedal_Position H5I_DATASET   FLOAT 15869 x 1
5 /data     CFS_Auto_Transmission_Mode H5I_DATASET INTEGER    28 x 1
> tail(h5ls("C:/Users/durraniu/hd5_file"))
      group        name       otype  dclass dim
394 /header   numvalues H5I_DATASET INTEGER 246
395 /header        rate H5I_DATASET INTEGER 246
396 /header        type H5I_DATASET  STRING 246
397 /header       units H5I_DATASET  STRING 246
398 /header varrateflag H5I_DATASET INTEGER 246
399       /        info   H5I_GROUP            

I want to explore and analyze the data, but I don't want to work in the HDF5 format. Can I convert the file to a data frame, or to a set of data frames? Can I save the data as txt or csv file(s)? I am comfortable working with data frames in R.

Answer by LyzandeR (best answer)

Here is an example, adapted from the documentation of ?h5write.

First, I create a sample .h5 file to demonstrate how to read the data back into R:

library(rhdf5)
h5createFile("ex_ls_dump.h5")
# create groups
h5createGroup("ex_ls_dump.h5","foo")
h5createGroup("ex_ls_dump.h5","foo/foobaa")
B = array(seq(0.1,2.0,by=0.1),dim=c(5,2,2))
attr(B, "scale") <- "liter"
h5write(B, "ex_ls_dump.h5","foo/B")

This writes an array to disk in .h5 format.

If I do:

> str(h5ls("ex_ls_dump.h5"))
'data.frame':   3 obs. of  5 variables:
 $ group : chr  "/" "/foo" "/foo"
 $ name  : chr  "foo" "B" "foobaa"
 $ otype : Factor w/ 15 levels "H5I_FILE","H5I_GROUP",..: 2 5 2
 $ dclass: chr  "" "FLOAT" ""
 $ dim   : chr  "" "5 x 2 x 2" ""

what I get is a listing of the file's contents (groups, dataset names, types and dimensions), not the data itself. From the documentation:

Lists the content of an HDF5 file.

To read the data itself, use h5read. The dataset in my .h5 file was written as an array, so it comes back as an array:

E = h5read("ex_ls_dump.h5","foo/B")

> E
, , 1

     [,1] [,2]
[1,]  0.1  0.6
[2,]  0.2  0.7
[3,]  0.3  0.8
[4,]  0.4  0.9
[5,]  0.5  1.0

, , 2

     [,1] [,2]
[1,]  1.1  1.6
[2,]  1.2  1.7
[3,]  1.3  1.8
[4,]  1.4  1.9
[5,]  1.5  2.0

> is.array(E)
[1] TRUE
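
A CSV file is two-dimensional, so an array with more than two dimensions, like this one, has to be flattened before it can be saved. Below is a minimal sketch in base R, assuming a "long" layout with one row per cell is acceptable. The column names (dim1, dim2, dim3, value) and the output file name are my own choices, and the array is recreated directly so that the snippet runs without the .h5 file:

```r
# Same values as the array written to ex_ls_dump.h5 above
B <- array(seq(0.1, 2.0, by = 0.1), dim = c(5, 2, 2))

# One row of indices per cell, in the same order as as.vector(B)
idx <- arrayInd(seq_along(B), dim(B))
colnames(idx) <- paste0("dim", seq_along(dim(B)))

# Long format: index columns plus the cell value
long <- data.frame(idx, value = as.vector(B))

# Write to a temporary directory; change the path as needed
out_file <- file.path(tempdir(), "B.csv")
write.csv(long, out_file, row.names = FALSE)
```

For a plain vector or a matrix, as.data.frame(x) followed by write.csv is enough; the index columns are only needed for three or more dimensions.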

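Therefore, just read each dataset of your file into R using h5read; it will be returned in the same form in which it was written. Judging by the dim column of your h5ls output (for example 15869 x 1), most of your datasets are probably numeric vectors or one-column matrices, which you can convert with as.data.frame and save with write.csv or write.table.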
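
Since the file in the question holds several hundred datasets, a loop over the h5ls listing may be more practical than reading them one by one. The following is a rough sketch, not a tested solution for that particular file: export_h5_to_csv is a hypothetical helper name, the output naming scheme (group and dataset name joined by underscores) is an assumption, and compound or otherwise unusual datasets may need special handling.

```r
library(rhdf5)

# Hypothetical helper: write every dataset of an HDF5 file to its own CSV
export_h5_to_csv <- function(h5_path, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

  # Keep only the rows of the listing that describe datasets (not groups)
  contents <- h5ls(h5_path)
  datasets <- contents[as.character(contents$otype) == "H5I_DATASET", ]

  for (i in seq_len(nrow(datasets))) {
    # Build the full path, e.g. "/data/ACC_Voltage" (root group is "/")
    grp  <- sub("/$", "", datasets$group[i])
    path <- paste0(grp, "/", datasets$name[i])

    x <- h5read(h5_path, path)

    if (length(dim(x)) > 2) {
      # Arrays with 3+ dimensions: long format, one row per cell
      df <- data.frame(arrayInd(seq_along(x), dim(x)), value = as.vector(x))
    } else {
      # Vectors, matrices and data frames convert directly
      df <- as.data.frame(x)
    }

    # "/data/ACC_Voltage" becomes "data_ACC_Voltage.csv"
    file_name <- paste0(gsub("/", "_", sub("^/", "", path)), ".csv")
    write.csv(df, file.path(out_dir, file_name), row.names = FALSE)
  }

  invisible(datasets)
}

# Example call (the output directory name is an assumption):
# export_h5_to_csv("C:/Users/durraniu/hd5_file", "hd5_csv")
```

One CSV per dataset is used here because the datasets have different lengths (1, 28, 246, 15869, ...), so they do not fit into a single rectangular table without padding.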