I am plotting two series from two files. File 1 holds the raw data, while File 2 holds the same data smoothed with a Fourier transform.

File 1:

> dput(dat)
structure(list(Average = c(0.290625, 1.803125, 0.05625, 0, 0.290625, 
0.434375, 0.878125, 0.028125, 0.075, 0, 0.396875, 0.38125, 0.5, 
0.3, 0.4, 0.38125, 0.1, 0.35, 0.46875, 1.01875, 0.2375, 0.94375, 
1.5125, 1.36875, 3.378125, 0.20625, 0.109375, 0.590625, 0.371875, 
0.76875, 0.171875, 0.373341836734375, 0.20625, 0.43125, 0.40625, 
0.39375, 1.121875, 0.3875, 2.16875, 0.3, 0.3625, 1.23125, 0.44375, 
0.25625, 0.896875, 1.253125, 1.13125, 0.81875, 0.85, 0.228125, 
0.8, 0.88125, 0.4625, 1.646875, 0.53125, 1.53125, 2.34375, 0.20625, 
2.99375, 0.8125, 0.5875, 0.26875, 1.859375, 0.053125, 0.95, 0.9625, 
1.134375, 2.41875, 0.2125, 0.565625, 0.675, 0.440625, 2.571875, 
1.55, 1.55625, 0.709375, 0.390625, 1.025, 0.715625, 2.4125, 1.234375, 
0.303125, 0.815625, 1.665625, 4.3875, 4.0375, 5.0875, 1.265625, 
5.128125, 4.39375, 1.475, 1.3375, 2.534375, 1.3125, 6.309375, 
2.675, 1.809375, 1.75625, 2.075, 1.171875, 1.496875, 5.946875, 
2.50625, 2.65, 1.7, 2.6, 2.05, 3.659375, 5.203125, 0.71875, 1.5125, 
6.2, 3.559375, 2.1375, 5.2875, 9.375, 4.83125, 5.378125, 3.915625, 
3.63125, 4.221875, 5.09375, 7.209375, 6.525, 7.296875, 4.440625, 
5.559375, 6.440625, 6.921875, 12.1125, 10.196875, 7.940625, 5.128125, 
6.678125, 9.728125, 5.70625, 6.709375, 25.5625, 15.3, 6.490625, 
10.321875, 14.203125, 11.459375, 13.553125, 15.903125, 14.603125, 
19.315625, 25.240625, 19.546875, 10.896875, 10.4875, 14.0625, 
17.33125, 21.6125, 17.13125, 18.175, 16.846875, 8.484375, 10.809375, 
6.246875, 11.421875, 7.265625, 9.209375, 14.175, 8.66875, 8.5875, 
8.7375, 18.325, 15.88125, 10.159375, 11.9875, 12.83125, 21.175, 
31.146875, 18.740625, 14.359375, 20.2375, 26.640625, 16.35, 25.496875, 
32.378125, 22.7625, 10.55625, 15.4875, 17.484375, 52.04375, 22.840625, 
24.1375, 21.875, 29.95, 43.153125, 25.75625, 26.15625, 24.128125, 
14.1875, 13.590625, 25.08125, 20.31875, 21.628125, 17.834375, 
13.375, 15.734375, 10.86875, 29.55625, 21.753125, 41.03125, 33.934375, 
25.40625, 30.71875, 31.3375, 31.059375, 36.25, 31.10625, 38.875, 
30.96875, 32.196875, 23.49375, 34.7125, 50.0625, 30.9625, 23.6625, 
20.71875, 13, 24.125, 23.621875, 19.671875, 27.978125, 20.965625, 
17.590625, 29.703125, 39.609375, 40.053125, 44.803125, 15.425, 
32.30625, 26.3875, 23.309375, 28.7625, 43.9, 31.021875, 54.571875, 
54.403125, 31.728125, 22.859375, 28.65625, 21.49375, 26.95625, 
24.2375, 15.040625, 15.5625, 17.11875, 19.565625, 25.928125, 
18.7125, 17.675, 17.103125, 14.43125, 15.659375, 17.51875, 10.378125, 
21.98125, 24.46875, 22.603125, 33.06875, 32.328125, 21.6125, 
12.184375, 8.484375, 21.428125, 16.5875, 14.753125, 7.6375, 7.98125, 
12.43125, 13.2125, 5.715625, 23.38125, 12.203125, 28.646875, 
23.490625, 14.590625, 30.828125, 11.640625, 8.775, 5.3125, 4.228125, 
5.175, 36.371875, 11.765625, 3.7125, 4.521875, 18.20625, 21, 
9.28125, 3.54375, 5.021875, 19.31875, 8.91875, 9.078125, 8.025, 
25.971875, 27.328125, 18.74375, 13.415625, 7.1625, 5.290625, 
4.434375, 2.44375, 6.596875, 2.278125, 2.828125, 1.1875, 3.825, 
5.01875, 8.05, 4.35, 3.478125, 2.7375, 1.00625, 3.98125, 3.475, 
1.84375, 0.45, 1.08125, 1.490625, 0.628125, 5.15625, 0.896875, 
5.8125, 7.79375, 3.909375, 1.40625, 0.225, 2.7125, 1.278125, 
1.29354838709677, 4.21290322580645, 1.68387096774194, 
1.19032258064516, 
0.567741935483871, 0.951612903225806, 0.948387096774193, 
0.493548387096774, 
1.00322580645161, 0.709677419354839, 1.19354838709677, 
0.135483870967742, 
0.412903225806452, 0.896774193548387, 0.032258064516129, 
0.329032258064516, 
0.441935483870968, 1.53870967741936, 0.935483870967742, 
0.009677419354839, 
1.36774193548387, 0.509677419354839, 1.08387096774194, 
0.280645161290323, 
0.606451612903226, 1.05161290322581, 1.88387096774194, 
0.206451612903226, 
1.00645161290323, 0.309677419354839, 0.225806451612903)), class = 
"data.frame", row.names = c(NA, 
-366L))

File 2: this file is a bit long, so I uploaded it at the following link:

Link to data

CODE

library(lubridate)
library(ggplot2)
library(extrafont)
loadfonts()
inDir <- "."
imgDir <- "."
dat <- read.csv("test.csv", header=TRUE, sep=",")
dat2 <- read.csv("fourier.csv", header=TRUE, sep=",")
dat <- data.frame(dat)
dat$date <- seq(as.Date("2000-01-01"), as.Date("2000-12-31"), "day")
p <- ggplot(dat, aes(date, Average))
p <- p + geom_bar(colour="black", size=0.15, width=1, stat="identity")
p <- p + theme(panel.background=element_rect(fill="white"),
               panel.border=element_rect(colour="black", fill=NA, size=1),
               axis.line.x=element_line(colour="black"),
               axis.line.y=element_line(colour="black"),
               axis.text=element_text(size=15, colour="black", family="serif"),
               axis.title=element_text(size=15, colour="black", family="serif"),
               legend.position="top", legend.key=element_rect(fill='white'),
               plot.margin=unit(c(0.5,0.5,0.5,0.5), "cm"))
p <- p + scale_y_continuous(breaks=seq(0,70,by=10), limits=c(0,70), expand=c(0,0))

p <- p + geom_line(data=dat2, aes(time,y), color='red', size=1)

p <- p + scale_x_date(date_breaks="1 month", date_labels="%b", expand=c(0,0))
p <- p + labs(x = "Month", y = "Average Daily Rainfall (mm/day)")

outImg <- paste0(imgDir,"/","fourier.png")
ggsave(outImg,p,width=6,height=5)

PROBLEM

This line causes the error:

p <- p + geom_line(data=dat2,aes(time,y),color='red',size=1)

Error: Invalid input: date_trans works with objects of class Date only
Execution halted

I cannot overlay the second file because it doesn't follow the date axis. It has a different time step after the smoothing.

EXPECTED OUTPUT

The output should be a bar plot (gray) with the smoothed time series drawn over it as a line (red). The x-axis should show the months (dates).

I'll appreciate any suggestions on how to do this correctly in R. Many thanks!

1 Answer

Peter Smittenaar On Best Solutions

This is what your data looks like before plotting:

> str(dat)
'data.frame':   366 obs. of  2 variables:
 $ Average: num  0.2906 1.8031 0.0563 0 0.2906 ...
 $ date   : Date, format: "2000-01-01" "2000-01-02" "2000-01-03" "2000-01-04" ...
> str(dat2)
'data.frame':   36600 obs. of  3 variables:
 $ amp : int  1 2 3 4 5 6 7 8 9 10 ...
 $ time: num  1 1.01 1.02 1.03 1.04 1.05 1.06 1.07 1.08 1.09 ...
 $ y   : num  0.898 0.897 0.896 0.894 0.893 ...

Your smoothed curve is upsampled 100-fold from your original data. When you call

p <- p + geom_line(data=dat2,aes(time,y),color='red',size=1)

the time variable is just 1, 1.01, 1.02, etc., which cannot be placed on an x-axis that expects values of class Date. One thing you can do is downsample the Fourier data frame so that it has one row per day:

dat2.downsampled = dat2[seq(1, nrow(dat2), 100), ]
nrow(dat2.downsampled)
[1] 366

then plot

dat2.downsampled$date <- dat$date  # both data frames now have 366 rows
p <- p + geom_line(data=dat2.downsampled,aes(date, y),color='red',size=1)

which gives you the bar plot with the red smoothed line drawn on top.
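
Since fourier.csv is not reproduced here, the following is only a minimal sketch of the downsampling approach on made-up data. The data frames raw and smooth, and the sine curve that fills them, are hypothetical stand-ins for dat and dat2. The only things taken from the question are the shapes: 366 daily rows, and 36600 smoothed rows at a step of 0.01.

```r
library(ggplot2)

# Made-up stand-ins for test.csv and fourier.csv (assumed shapes only)
dates <- seq(as.Date("2000-01-01"), as.Date("2000-12-31"), by = "day")
raw <- data.frame(date = dates,
                  Average = 20 * sin(pi * seq_along(dates) / 366)^2)

# 100 points per day: time = 1.00, 1.01, ..., 366.99
smooth <- data.frame(time = 1 + (0:36599) / 100)
smooth$y <- 20 * sin(pi * smooth$time / 366)^2

# Keep every 100th row, i.e. the rows where time is a whole day number
smooth_daily <- smooth[seq(1, nrow(smooth), by = 100), ]
smooth_daily$date <- dates  # both have 366 rows, in the same order

# linewidth needs a recent ggplot2; older versions use size for line width
p <- ggplot(raw, aes(date, Average)) +
  geom_col(fill = "grey60", width = 1) +
  geom_line(data = smooth_daily, aes(date, y), colour = "red", linewidth = 1) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b")
```

Storing the dates as a column of the downsampled data frame, rather than reaching into another data frame from inside aes(), keeps each layer tied to its own data.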

You could also add a Date variable to dat2 that takes 36600 steps to get through your 1-year range, but that seems like more hassle to me.
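
If you do want to keep all 36600 points, one way to sketch that alternative is to turn time into a fractional Date. This assumes that time counts days starting at 1 on 2000-01-01, which the question does not state explicitly. dat2_demo is a hypothetical stand-in for dat2.

```r
# Hypothetical stand-in for dat2: time runs 1.00, 1.01, ..., 366.99
dat2_demo <- data.frame(time = 1 + (0:36599) / 100)

# Day 1 maps to 2000-01-01; the fractional part of time becomes
# a fraction of a day (Date objects are stored as numeric day counts)
dat2_demo$date <- as.Date("2000-01-01") + (dat2_demo$time - 1)

class(dat2_demo$date)
range(dat2_demo$date)
```

With a real y column present, the layer would then be geom_line(data=dat2_demo, aes(date, y), ...). As far as I know scale_x_date accepts fractional Date values, but I have not checked this against your file.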

Hope that helps.