R: melting a data.frame to use in ggplot2 for multiple y-value plot


I would like to plot multiple Y-values against a time-variable. I don't have a complete data frame, as I didn't collect all the Y-values to start with. I believe I need to melt() the data frame before passing it to ggplot2, but that seems to fail because the "id" column isn't always filled in. Here's what I've tried, with the badly-broken ggplot at the end:

    library("ggplot2")
    library("reshape2")

    #Example data
    Date=c("2015-06-09", "2015-06-09", "2015-06-09")
    Time=c("07:00:01",   "08:00:01",   "09:00:01")
    src=c(47420413232,   47519749372,  47571637056)
    dest=c(NA,           NA,           49738231808)
    df = data.frame(Date, Time, src, dest)

    # process into a real "time" type, throw away the previous types
    df$time =  as.POSIXlt(paste(df$Date, df$Time))
    df$Date = NULL
    df$Time = NULL

    # review df to check it's sane
    df

    # melt the data to transform it for plotting
    molten_space = melt(df, id.vars="time")

    # now look at molten_space for what I think goes wrong
    molten_space


    # doesn't work at all, but basically I want to plot both src(y-value) and dest(y-value) against time(x-value)

    ggplot(data=molten_space, aes(x=time, y=value, group="time")) + geom_line() + scale_y_continuous("Space Used")
    #ggplot(data=molten_space, aes(x=time, y=Bytes, group="time")) + geom_line() + scale_y_continuous("Space Used", labels=f2si)

There is 1 answer

Backlin (best answer)

There are two problems here.

The first problem is that you use as.POSIXlt, which stores each date-time as a list of components, instead of as.POSIXct, which stores it as a plain numeric vector. melt cannot do its job properly with the list-based column as the id variable. Use this instead:

    df$time <- as.POSIXct(paste(df$Date, df$Time))
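To make the difference between the two classes concrete, here is a small base R sketch. The fixed tz = "UTC" is an assumption added here so the result does not depend on the local time zone; the variable names are only illustrative.

```r
# The same timestamp stored in both date-time classes
stamp <- "2015-06-09 07:00:01"
lt <- as.POSIXlt(stamp, tz = "UTC")  # tz = "UTC" is an assumption for reproducibility
ct <- as.POSIXct(stamp, tz = "UTC")

# POSIXlt is a list of named components (sec, min, hour, ...)
is.list(unclass(lt))     # TRUE
names(unclass(lt))[1:3]  # "sec" "min" "hour"

# POSIXct is a single number: seconds since 1970-01-01 UTC
is.numeric(unclass(ct))  # TRUE
length(unclass(ct))      # 1
```

Because a POSIXct column is an ordinary numeric vector underneath, it behaves like any other data frame column when it is reshaped.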

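The second problem is the group="time" aesthetic. Grouping on the variable that is already used for the x-axis doesn't make much sense, and because "time" is quoted, ggplot2 treats it as a constant string, so src and dest end up in a single group. After re-running melt with the corrected time column, map the melted variable column to color instead, which draws one line per series: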

    ggplot(data=molten_space, aes(x=time, y=value, color=variable)) +
        geom_point() + geom_line() + scale_y_continuous("Space Used")
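Putting both fixes together, the whole script might look like the sketch below. It assumes ggplot2 and reshape2 are installed. The na.rm = TRUE argument to melt is an addition that is not part of the answer above; it simply drops the dest values that were never collected, so ggplot2 does not have to discard them itself.

```r
library("ggplot2")
library("reshape2")

# Example data from the question
df <- data.frame(
  Date = c("2015-06-09", "2015-06-09", "2015-06-09"),
  Time = c("07:00:01", "08:00:01", "09:00:01"),
  src  = c(47420413232, 47519749372, 47571637056),
  dest = c(NA, NA, 49738231808)
)

# POSIXct keeps the time column an ordinary vector that melt() can use as an id
df$time <- as.POSIXct(paste(df$Date, df$Time))
df$Date <- NULL
df$Time <- NULL

# Long format: one row per (time, variable) pair.
# na.rm = TRUE is an optional addition that removes the missing measurements.
molten_space <- melt(df, id.vars = "time", na.rm = TRUE)

# One colored series per original column
p <- ggplot(molten_space, aes(x = time, y = value, color = variable)) +
  geom_point() +
  geom_line() +
  scale_y_continuous("Space Used")
print(p)
```

With this example data, dest has only one observed value, so it shows up as a single point without a connecting line.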


Side note

as.POSIXlt is typically used when you want to extract individual components of a date-time, such as the hour or the day of the week.
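As a brief sketch of that use, single fields can be read with $. The timestamp is the first one from the question, and tz = "UTC" is an assumption added here so the values do not depend on the local time zone.

```r
lt <- as.POSIXlt("2015-06-09 07:00:01", tz = "UTC")  # tz is an assumption

lt$hour         # 7
lt$min          # 0
lt$sec          # 1
lt$mday         # 9
lt$mon + 1      # 6    (mon counts months from 0)
lt$year + 1900  # 2015 (year counts from 1900)
lt$wday         # 2    (0 is Sunday, so 2 is Tuesday)
```

All fields can also be listed at once with unlist, as in the output from the answer: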

    R> unlist(as.POSIXlt(df$time[1]))
        sec    min   hour   mday    mon   year   wday   yday  isdst   zone gmtoff
        "1"    "0"    "7"    "9"    "5"  "115"    "2"  "159"    "1" "CEST" "7200"