Get last date in each month of a time series pandas

Question

12k views Asked by ikemblem At 09 June 2015 at 22:21

Currently I'm generating a DatetimeIndex using a certain function, zipline.utils.tradingcalendar.get_trading_days. The time series is roughly daily but with some gaps.

My goal is to get the last date in the DatetimeIndex for each month.

.to_period('M') and .to_timestamp('M') don't work, since they give the last calendar day of each month rather than the last date actually present in the index for that month.

As an example, if this is my time series, I would want to select '2015-05-29', while the last day of the month is '2015-05-31'.

['2015-05-18', '2015-05-19', '2015-05-20', '2015-05-21', '2015-05-22', '2015-05-26', '2015-05-27', '2015-05-28', '2015-05-29', '2015-06-01']

There are 6 answers

Condla On 09 June 2015 at 23:05

My strategy would be to group by month and then select the "maximum" of each group:

If "dt" is your DatetimeIndex object:

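last_dates_of_the_month = []
# DatetimeIndex.groupby returns a dict mapping each month number to its dates
dt_month_group_dict = dt.groupby(dt.month)
for month in dt_month_group_dict:
    # the latest date that occurs within this month
    last_date = max(dt_month_group_dict[month])
    last_dates_of_the_month.append(last_date)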

The list "last_dates_of_the_month" contains the last occurring date of each month in your dataset. You can use this list to create a DatetimeIndex in pandas again (or whatever you want to do with it).
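
Note that grouping by month alone puts the same month of different years into one group. A minimal sketch of a variant that groups by year and month together, assuming a hypothetical sample index `dt` built with `pd.bdate_range` (the sample dates are not from the original post):

```python
import pandas as pd

# Hypothetical sample data: business days spanning two calendar years.
dt = pd.bdate_range("2014-12-01", "2015-02-13")

# Group the dates by (year, month) and keep the latest date in each group.
last_per_month = dt.to_series().groupby([dt.year, dt.month]).max()

# Turn the result back into a DatetimeIndex.
last_dates = pd.DatetimeIndex(last_per_month.values)
print(last_dates)
```

The groups come back sorted by year and then month, so the resulting index is in chronological order.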

MMCM_ On 21 August 2015 at 08:04

Maybe the answer is not needed anymore, but while searching for an answer to the same question I found maybe a simpler solution:

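import pandas as pd

# business days only, so every sample date is a weekday
sample_dates = pd.date_range(start='2010-01-01', periods=100, freq='B')
# keep the dates that fall on the last calendar day of their month
month_end_dates = sample_dates[sample_dates.is_month_end]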
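
One caveat worth checking: `is_month_end` tests whether a date is the last calendar day of its month, not whether it is the last date present in the index. A small sketch using the dates from the question (the variable names are illustrative):

```python
import pandas as pd

# Dates from the question; 2015-05-31 falls on a Sunday and is absent.
dates = pd.DatetimeIndex([
    "2015-05-18", "2015-05-19", "2015-05-20", "2015-05-21", "2015-05-22",
    "2015-05-26", "2015-05-27", "2015-05-28", "2015-05-29", "2015-06-01",
])

# is_month_end is True only for calendar month ends, so no date is selected.
calendar_month_ends = dates[dates.is_month_end]
print(len(calendar_month_ends))
```

So this approach only finds a month's last date when that date happens to be the calendar month end.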

Maxim On 21 February 2018 at 18:17

This is an old question, but none of the existing answers here is perfect. This is the solution I came up with (assuming that the date is a sorted index). It can even be written in one line, but I split it for readability:

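# month of each row, and month of the row that follows it
month1 = pd.Series(apple.index.month)
month2 = pd.Series(apple.index.month).shift(-1)
# True where the next row belongs to a different month
mask = (month1 != month2)
apple[mask.values].head(10)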

A few notes:

Shifting a datetime series requires another pd.Series instance.
Boolean mask indexing requires .values.

By the way, when the dates are business days, it is easier to use resampling: apple.resample('BM'), followed by an aggregation such as .last() (newer pandas versions spell this alias 'BME').
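
The mask approach above can be tried end to end on a small frame. A minimal sketch, where the `close` column and the sample dates are assumptions standing in for `apple`:

```python
import pandas as pd

# Hypothetical stand-in for `apple`: any frame with a sorted DatetimeIndex.
idx = pd.DatetimeIndex([
    "2015-05-27", "2015-05-28", "2015-05-29",
    "2015-06-01", "2015-06-02",
])
apple = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

# A row is the last of its month when the next row's month differs.
# The final row has no successor (NaN after the shift), so it is kept too.
month = pd.Series(apple.index.month)
mask = month != month.shift(-1)

last_rows = apple[mask.values]
print(last_rows)
```

Because the final row is always selected, the last entry may belong to a month that is still incomplete.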

Guido Guerra On 06 August 2020 at 15:21

Try this to create a new diff column, where the value 1 marks each row at which the month changes from the previous row.

df['diff'] = np.where(df['Date'].dt.month.diff() != 0, 1, 0)
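
Since `diff()` compares each row with the previous one, the flag lands on the first row of each month (and on the very first row, whose difference is NaN). A sketch of a variant that flags the last row of each month instead by comparing with the next row; the `Date` column follows the answer, while the `is_last` column name and the sample data are assumptions:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, sorted by date.
df = pd.DataFrame({"Date": pd.to_datetime([
    "2015-05-28", "2015-05-29", "2015-06-01", "2015-06-02",
])})

# diff(-1) subtracts the next row's month; a nonzero (or NaN) result means
# the next row belongs to a different month or does not exist.
df["is_last"] = np.where(df["Date"].dt.month.diff(-1) != 0, 1, 0)
print(df)
```

Filtering with `df[df["is_last"] == 1]` then yields the last available row of each month.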

user3570984 On 24 May 2019 at 20:56

Suppose your data frame has a DatetimeIndex (the original post showed it in an image labeled "original dataframe").

Then the following code gives you the last available row of each month.

df_monthly = df.reset_index().groupby([df.index.year, df.index.month], as_index=False).last().set_index('index')

(The original post showed the result in a second image labeled "transformed_dataframe".)

This one-line snippet does the job :)
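
For comparison, here is a self-contained sketch of the same idea that skips the reset_index round trip by grouping on the monthly period and taking the last row of each group. The sample frame and the `value` column are assumptions, and it relies on the rows being sorted by date:

```python
import pandas as pd

# Hypothetical frame with a DatetimeIndex; the column name is illustrative.
idx = pd.DatetimeIndex([
    "2015-05-28", "2015-05-29", "2015-06-01", "2015-06-02",
])
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=idx)

# Group rows by calendar month (the period carries the year as well) and
# keep the last row of each group; tail(1) preserves the original index.
df_monthly = df.groupby(df.index.to_period("M")).tail(1)
print(df_monthly)
```

Unlike `.last()`, which takes the last non-null value per column, `tail(1)` returns the last row as it is.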

**ikemblem** · Accepted Answer · 2015-06-10T12:15:02+00:00

Condla's answer came closest to what I needed. However, since my time index stretches over more than one year, I needed to group by both year and month and then select the maximum date. Below is the code I ended up with.

import pandas as pd

# tempTradeDays is the initial DatetimeIndex
dateRange = []
# group the dates by year first
dictYears = tempTradeDays.groupby(tempTradeDays.year)
for yr in dictYears.keys():
    # then group each year's dates by month
    yearDays = pd.DatetimeIndex(dictYears[yr])
    tempYear = yearDays.groupby(yearDays.month)
    for m in tempYear.keys():
        dateRange.append(max(tempYear[m]))
# Index.order() was removed from pandas; sort_values() replaces it
dateRange = pd.DatetimeIndex(dateRange).sort_values()
