I have a data frame of sensor data
I have a data frame as follows:
pressure datetime
4.848374 2016-04-12 10:04:00
4.683901 2016-04-12 10:04:32
5.237860 2016-04-12 10:13:20
Now I would like to fit an ARIMA model to the data for forecasting.
Since the data is not sampled uniformly, I have aggregated it on an hourly basis, which looks as follows:
datetime pressure
"2016-04-19 00:00:00 BST" 5.581806
"2016-04-19 01:00:00 BST" 4.769832
"2016-04-19 02:00:00 BST" 4.769832
"2016-04-19 03:00:00 BST" 4.553711
"2016-04-19 04:00:00 BST" 6.285599
"2016-04-19 05:00:00 BST" 5.873414
The pressure for every hour looks like below:
But I can't create a ts object, as I am not sure what the frequency should be for hourly data.
Your question has already been answered in the comments, but to reiterate: set the frequency to 24, because with hourly data one seasonal period (a day) contains 24 observations.
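As a minimal sketch, the six hourly means from the question can be turned into a ts object like this. The variable names are only illustrative, and a real series would of course need far more than six points before a seasonal model makes sense.

```r
# Hourly means copied from the aggregated table in the question
pressure <- c(5.581806, 4.769832, 4.769832, 4.553711, 6.285599, 5.873414)

# frequency = 24: one seasonal cycle (a day) spans 24 hourly observations
pressure_ts <- ts(pressure, frequency = 24)
frequency(pressure_ts)
```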
As for your next point about fixing the dates in your plot, let's start with some example data.
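The example data could look something like the sketch below. The values are simulated (a daily sine cycle plus noise), and the seed, the seven-day length, and the UTC time zone are assumptions made for illustration rather than anything taken from the question.

```r
set.seed(42)  # arbitrary seed so the simulated values are reproducible

# Hypothetical data: 7 days of hourly readings with a daily cycle plus noise
n_hours <- 24 * 7
hourlyPressure <- data.frame(
  datetime = seq(as.POSIXct("2016-04-19 00:00:00", tz = "UTC"),
                 by = "hour", length.out = n_hours),
  pressure = 5 + sin(2 * pi * seq_len(n_hours) / 24) + rnorm(n_hours, sd = 0.3)
)
head(hourlyPressure)
```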
Now we can convert the hourlyPressure data to a ts() object (let's ignore the dates for a moment).
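A sketch of that conversion, using a simulated stand-in for the pressure column (the numbers are assumptions, not real readings). Note that the resulting time index counts days, not timestamps, which is why the dates need separate handling when plotting.

```r
# Stand-in for the pressure column: 7 days of simulated hourly values
set.seed(42)
pressure <- 5 + sin(2 * pi * (1:168) / 24) + rnorm(168, sd = 0.3)

pressure_ts <- ts(pressure, frequency = 24)

# The time index is in units of days: observation 25 starts day 2
time(pressure_ts)[c(1, 25)]
```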
Now fit your ARIMA model. In this example I will use the auto.arima() function from the forecast package, since finding the best ARIMA model is not the focus here (although auto.arima() is a convenient and reasonably robust way of selecting a model for your data).
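A minimal sketch of the fit, assuming the forecast package is installed. The series is again simulated, so the model that auto.arima() selects here says nothing about the real sensor data.

```r
library(forecast)  # provides auto.arima()

# Simulated hourly series with a daily cycle (stand-in for the real data)
set.seed(42)
pressure_ts <- ts(5 + sin(2 * pi * (1:168) / 24) + rnorm(168, sd = 0.3),
                  frequency = 24)

# Let auto.arima() search over candidate (seasonal) ARIMA orders
fit <- auto.arima(pressure_ts)
print(fit)
```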
You can then plot the data against the appropriate dates by passing the datetime values as the x argument of plot().
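For instance, something along these lines should give a date axis instead of the ts index. The datetime vector and the simulated values are assumptions standing in for the example data.

```r
# Simulated stand-in: hourly timestamps and matching pressure values
set.seed(42)
n <- 168
datetime <- seq(as.POSIXct("2016-04-19 00:00:00", tz = "UTC"),
                by = "hour", length.out = n)
pressure_ts <- ts(5 + sin(2 * pi * seq_len(n) / 24) + rnorm(n, sd = 0.3),
                  frequency = 24)

# Supplying POSIXct values as x makes plot() draw a date-time axis
plot(x = datetime, y = as.numeric(pressure_ts), type = "l",
     xlab = "Date", ylab = "Pressure")
```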
It is a little more difficult when we forecast the data and want to plot the original series together with the forecasts while keeping the dates correct.
To plot the original data and the forecasts with the correct dates, extend the datetime vector past the last observation by the forecast horizon and plot the combined values against it.
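One way this could be sketched is shown below. The 24-hour horizon, the simulated series, and all variable names are assumptions for illustration; forecast() and auto.arima() come from the forecast package.

```r
library(forecast)

set.seed(42)
n <- 168  # hours of (simulated) history
h <- 24   # forecast horizon in hours, chosen arbitrarily
datetime <- seq(as.POSIXct("2016-04-19 00:00:00", tz = "UTC"),
                by = "hour", length.out = n)
pressure_ts <- ts(5 + sin(2 * pi * seq_len(n) / 24) + rnorm(n, sd = 0.3),
                  frequency = 24)

fc <- forecast(auto.arima(pressure_ts), h = h)

# Forecast timestamps continue hourly after the last observed timestamp
fc_time <- seq(datetime[n] + 3600, by = "hour", length.out = h)

all_time <- c(datetime, fc_time)
all_vals <- c(as.numeric(pressure_ts), as.numeric(fc$mean))
plot(all_time, all_vals, type = "l", xlab = "Date", ylab = "Pressure")
```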
This plots the original data together with the forecasts, and the dates are correct. However, as in the forecast package's own plots, we may want the forecasts to be drawn in blue.
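A possible way to do that is to set up an empty plot covering both ranges and then draw the observed and forecast segments separately. This is only a sketch on simulated data; the horizon and names are assumptions, and prediction intervals are left out.

```r
library(forecast)

set.seed(42)
n <- 168
h <- 24
datetime <- seq(as.POSIXct("2016-04-19 00:00:00", tz = "UTC"),
                by = "hour", length.out = n)
obs <- 5 + sin(2 * pi * seq_len(n) / 24) + rnorm(n, sd = 0.3)

fc <- forecast(auto.arima(ts(obs, frequency = 24)), h = h)
fc_mean <- as.numeric(fc$mean)
fc_time <- seq(datetime[n] + 3600, by = "hour", length.out = h)

# type = "n" sets up axes wide enough for both series without drawing anything
plot(c(datetime, fc_time), c(obs, fc_mean), type = "n",
     xlab = "Date", ylab = "Pressure")
lines(datetime, obs)                    # observed data in the default colour
lines(fc_time, fc_mean, col = "blue")   # point forecasts in blue
```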