I want to find a way to build a custom pandas.tseries.offsets
class at 1 second frequency for trading hours. The main requirement here is that the time offset object would be smart enough to know the next second of '2015-06-18 16:00:00' would be '2015-06-19 09:30:00 or 09:30:01', and the time delta computed from these two time stamps would be exactly 1s (custom offset 1s, similar to BDay(1)
for business day frequency) instead of the duration of closing hours.
The reason is that when plotting the pd.Series for intraday data across a few trading days, see the simulated example below, there are lots of 'step line' (linear interpolation) between close and next day open prices to represent the time duration of closing hours. Is there a way to get rid of this? I've look at the source codes of pandas.tseries.offsets
and find pd.tseries.offsets.BusinessHour
and pd.tseries.offsets.BusinessMixin
may help but I don't know how to use those.
import pandas as pd
import numpy as np
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
# set as 'constant' object shared by all codes in this script
BDAY_US = CustomBusinessDay(calender=USFederalHolidayCalendar())
sample_freq = '5min'
dates = pd.date_range(start='2015-01-01', end='2015-01-31', freq=BDAY_US).date
# exculde the 09:30:00 as it is included in the first time bucket
times = pd.date_range(start='09:30:00', end='16:00:00', freq=sample_freq).time[1:]
time_stamps = [dt.datetime.combine(date, time) for date in dates for time in times]
s = pd.Series(np.random.randn(len(time_stamps)).cumsum() + 100, index=time_stamps)
s.plot()
Another way I can think of to partially fix this problem is to first reset_index()
to get the default consecutive integer index for each row, and then calculating the difference between consecutive integer index as time(in seconds) elapsed. Plotting integer index as x-axis and then relabel them to appropriate time labels. Could someone show me how to do it with matplotlib
as well?
Thanks for Jeff's comments. I just check the on-line docs for BusinessHour()
and find it may useful in my case. Another follow-up question(s): the BusinessHour
is in hour frequency, is there a way to make it at 1s frequency? also, how to combine it with CustomBusinessDay
object?
To use BusinessHour()
from pandas.tseries.offsets import *
bhour = BusinessHour(start='09:30', end='16:00')
time = pd.Timestamp('2015-06-18 15:00:00')
print(time)
2015-06-18 15:00:00
# hourly increment works nicely
print(time + bhour * 1)
2015-06-19 09:30:00
# but not at minute or second frequency
print(time + Minute(61))
2015-06-18 16:01:00
print(time + Second(60*60 + 1))
2015-06-18 16:00:01
Many many thanks, and any help would be highly appreciated.
As I mention in the comment, you potentially have two different problems
I've given a solution that will account for 1 as this seems to be your immediate issue. If you need 2, or both - let us know in comments:
1. Plot the points with business days adjacent
Most graphs in
matplotlib
can have index formatters applied to the axes via theticker
API. I'll adapt this example to suit your caseThis gives an output as follows, first at zoomed out natural scale, second zoomed in so you can see the transition between Friday 16:00 and Monday 09:00