How can I plot subsets of temporal data?


I read in my data and subset it to keep only rows with an entry of 4 or 5 in a column called CODE. Next, I subset that result to look at a particular species. Then I made sure the entries in the DATE column were read as dates instead of as a factor (which was the default). Finally, I plot two of the columns against each other:

ph<-read.csv(url("http://luq.lternet.edu/data/lterdb88/data/Lfdp1-ElVerdePhenology.txt"))
ftsd<-subset(ph, ph$CODE %in% c("4","5"))
DACEXC<-subset(ftsd, ftsd$SPECIES %in% "DACEXC")
DACEXC$DATE<-as.Date(DACEXC$DATE, format="%m/%d/%y")
plot(DACEXC$DATE,DACEXC$NUMBER)

The data goes from 1992 until 2007, and I'd like to plot one year at a time. I'll be doing this for a whole lot of species, but I can't figure out how to do it. I've tried a whole slew of things, including limiting the x-axis or trying to make a subset of only one year, but haven't figured it out. I've tried some of the following ideas:

plot(DACEXC$DATE,DACEXC$NUMBER, xlim=c(1992,1993))
plot(DACEXC$DATE,DACEXC$NUMBER, xlim=c(01/01/1992,12/31/1992))
plot(DACEXC$DATE,DACEXC$NUMBER, xlim=c(1992:1993))

DACEXC92<-subset(DACEXC92, DATE==1992)
DACEXC92
[1] DATE    BASKET  SPECIES CODE    NUMBER 
<0 rows> (or 0-length row.names)

The above yields an empty data frame as does the below, and none of my attempts at making conditional arguments have been successful.

DACEXC92<-subset(DACEXC92, DATE==04/01/92)
DACEXC92
[1] DATE    BASKET  SPECIES CODE    NUMBER 
<0 rows> (or 0-length row.names)

Any ideas for how to plot only one year at a time, or for how to make a subset of each year?


There are 2 answers

mdsumner (best answer)

Convert the date to a proper date-time class (POSIXct or Date), then use the tools available for that class.

 DACEXC$DATE <- as.POSIXct(strptime(DACEXC$DATE, "%Y-%m-%d"))

(as.Date(DACEXC$DATE) or as.POSIXct(DACEXC$DATE) could probably be used but I like to do it explicitly since it's easier to understand what is wrong when a different format is used).
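As a small illustration of why an explicit format helps, the sketch below parses two made-up date strings (the `raw` vector is hypothetical, not taken from the question's file) with a matching and a mismatched format. A mismatch shows up immediately as NA values rather than as silently wrong dates.

```r
# Hypothetical date strings in month/day/two-digit-year form.
raw <- c("04/01/92", "12/15/93")

# Format matches the strings, so both values parse.
good <- as.Date(raw, format = "%m/%d/%y")

# Format does not match the strings, so both values come back as NA.
bad <- as.Date(raw, format = "%Y-%m-%d")

print(good)
print(bad)
```

Checking `anyNA()` on the converted column right after parsing is a cheap way to catch a wrong format string early.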

Extract the year component from the POSIXlt representation, and equate to a specific year:

 with(DACEXC[as.POSIXlt(DACEXC$DATE)$year + 1900 == 1993, ], plot(DATE, NUMBER))

Or within a range of years:

# Parentheses are required: %in% binds more tightly than +
with(DACEXC[(as.POSIXlt(DACEXC$DATE)$year + 1900) %in% 1993:1995, ], 
     plot(DATE, NUMBER))

There are lots of options, once the data are in a good DateTime format, including subsetting with character representations like format(DACEXC$DATE, "%Y") == "1993".

See ?strptime for the format details, and ?DateTimeClasses for the big picture.
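To get one plot per year, one option is to split the data frame on the formatted year and loop over the pieces. This is a minimal sketch on invented data: the `demo` data frame and its values are assumptions that only mimic the DATE and NUMBER columns from the question.

```r
# Hypothetical stand-in for DACEXC: monthly dates over three years, with counts.
set.seed(1)
demo <- data.frame(
  DATE   = seq(as.Date("1992-01-15"), as.Date("1994-12-15"), by = "month"),
  NUMBER = rpois(36, lambda = 5)
)

# Split into a list of data frames, one element per calendar year.
by_year <- split(demo, format(demo$DATE, "%Y"))

# Draw one plot per year, titled with the year.
for (yr in names(by_year)) {
  with(by_year[[yr]], plot(DATE, NUMBER, main = yr))
}
```

The same loop could be wrapped in a function and applied to each species subset in turn.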

Joris Meys

Make sure your xlim values are dates:

with(DACEXC,
  plot(DATE,NUMBER, 
     xlim=as.Date(c("1992-01-01","1992-12-31"))
  )
)

which gives a plot whose x-axis covers 1992 only (image not preserved).

Notice that this only changes the x-axis limits and does not subset the data, so points from the next year that fall within the axis padding are still visible. If you want to work with the years directly, you can also use the chron package:

library(chron)
DACEXC92 <- DACEXC[years(DACEXC$DATE)==1992,]
with(DACEXC92,plot(DATE,NUMBER))

which gives you the desired data frame and a plot of the 1992 data only (image not preserved).
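For reference, here is a short sketch of why the attempts in the question return nothing. It relies on two facts about base R: a Date is stored as a count of days since 1970-01-01, and an unquoted 04/01/92 is ordinary division. The vector `d` is made up for illustration.

```r
# Two hypothetical dates, one in 1992 and one in 1993.
d <- as.Date(c("1992-04-01", "1993-04-01"))

# The underlying values are day counts since 1970-01-01, not years.
day_counts <- as.numeric(d)

# Unquoted, this is arithmetic: 4 divided by 1 divided by 92.
not_a_date <- 04/01/92

# Comparing a Date with 1992 compares day counts, so nothing matches.
matches_number <- d == 1992

# Comparing the formatted year, or a Date range, selects the intended rows.
matches_year  <- format(d, "%Y") == "1992"
matches_range <- d >= as.Date("1992-01-01") & d <= as.Date("1992-12-31")

print(matches_number)
print(matches_year)
print(matches_range)
```

Either logical vector at the end can be used to index the rows of the data frame, which avoids the extra package if chron is not otherwise needed.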