Algorithm used by java.time.Period.between() when dealing with different-length months?

When using java.time.Period.between() across months of varying lengths, why does the code below report different results depending on the direction of the operation?

import java.time.LocalDate;
import java.time.Period;
class Main {
  public static void main(String[] args) {
    LocalDate d1 = LocalDate.of(2019, 1, 30);
    LocalDate d2 = LocalDate.of(2019, 3, 29); 
    Period period = Period.between(d1, d2);
    System.out.println("diff: " + period.toString());
    // => P1M29D
    Period period2 = Period.between(d2, d1);
    System.out.println("diff: " + period2.toString());
    // => P-1M-30D
  }
}

Live repl: https://repl.it/@JustinGrant/BigScornfulQueryplan#Main.java

Here's how I'd expect it to work:

2019-01-30 => 2019-03-29

  1. Add one month to 2019-01-30 => 2019-02-30, which is constrained to 2019-02-28
  2. Add 29 days to get to 2019-03-29

This matches Java's result: P1M29D

(reversed) 2019-03-29 => 2019-01-30

  1. Subtract one month from 2019-03-29 => 2019-02-29, which is constrained to 2019-02-28
  2. Subtract 29 days to get to 2019-01-30

But Java returns P-1M-30D here. I expected P-1M-29D.
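
The manual steps I expected for the reversed direction can be checked directly. This is only a sketch of those steps, not of what Period.between does internally; the class name ExpectedSteps and the helper minusOneMonth are made up for illustration.

```java
import java.time.LocalDate;

class ExpectedSteps {
  // Step 1: subtract one month. 2019-02-29 does not exist, so the
  // result is constrained to the last valid day, 2019-02-28.
  static LocalDate minusOneMonth(LocalDate date) {
    return date.minusMonths(1);
  }

  public static void main(String[] args) {
    LocalDate d2 = LocalDate.of(2019, 3, 29);
    LocalDate afterMonth = minusOneMonth(d2);
    System.out.println(afterMonth);               // 2019-02-28
    // Step 2: subtract 29 days to land on the start date.
    System.out.println(afterMonth.minusDays(29)); // 2019-01-30
  }
}
```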

The reference docs say:

The period is calculated by removing complete months, then calculating the remaining number of days, adjusting to ensure that both have the same sign. The number of months is then split into years and months based on a 12 month year. A month is considered if the end day-of-month is greater than or equal to the start day-of-month. For example, from 2010-01-15 to 2011-03-18 is one year, two months and three days.

Maybe I'm not reading this carefully enough, but I don't think this text fully explains the divergent behavior that I'm seeing.

What am I misunderstanding about how java.time.Period.between is supposed to work? Specifically, what is expected to happen when the intermediate result of "removing complete months" is an invalid date?

Is the algorithm documented in more detail elsewhere?

Answer by ernest_k

TL;DR

The algorithm I see in the source (copied below) does not seem to assume that a Period between two dates is as accurate as the number of days between the same two dates. (I even suspect Period is not meant to be used in calculations on continuous time values.)

It computes the difference in months and the difference in days separately, then adjusts them so that both have the same sign. The resulting period is built from these two values.

The main challenge is that adding two months to LocalDate.of(2019, 1, 28) is not the same thing as adding (31 + 28) days or (28 + 31) days to that date. It's simply adding 2 months to LocalDate.of(2019, 1, 28), which gives LocalDate.of(2019, 3, 28).

In other words, in the context of LocalDate, Periods represent an accurate number of months (and derived years), but the days component depends on the lengths of the months involved in the computation.
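
To illustrate that a month count does not correspond to a fixed day count, here is a minimal sketch that measures how many days "two months" covers from different start dates. The class and method names are hypothetical; only plusMonths and ChronoUnit.DAYS.between come from java.time.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class MonthsVersusDays {
    // Number of calendar days covered by adding two months to the given start date.
    static long daysInTwoMonthsFrom(LocalDate start) {
        return ChronoUnit.DAYS.between(start, start.plusMonths(2));
    }

    public static void main(String[] args) {
        System.out.println(daysInTwoMonthsFrom(LocalDate.of(2019, 1, 28))); // 59
        System.out.println(daysInTwoMonthsFrom(LocalDate.of(2019, 3, 28))); // 61
        System.out.println(daysInTwoMonthsFrom(LocalDate.of(2019, 7, 28))); // 62
    }
}
```

The months value is the same in all three cases, while the number of days it stands for changes with the calendar position.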


This is the source I'm seeing (java.time.LocalDate.until(ChronoLocalDate) is ultimately doing the job):

public Period until(ChronoLocalDate endDateExclusive) {
    LocalDate end = LocalDate.from(endDateExclusive);
    long totalMonths = end.getProlepticMonth() - this.getProlepticMonth();  // safe
    int days = end.day - this.day;
    if (totalMonths > 0 && days < 0) {
        totalMonths--;
        LocalDate calcDate = this.plusMonths(totalMonths);
        days = (int) (end.toEpochDay() - calcDate.toEpochDay());  // safe
    } else if (totalMonths < 0 && days > 0) {
        totalMonths++;
        days -= end.lengthOfMonth();
    }
    long years = totalMonths / 12;  // safe
    int months = (int) (totalMonths % 12);  // safe
    return Period.of(Math.toIntExact(years), months, days);
}

As can be seen, the sign adjustment is made when the month difference has a different sign from the day difference (and yes, they're computed separately). Both totalMonths > 0 && days < 0 and totalMonths < 0 && days > 0 are applicable in your examples (one to each calculation).
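
To follow the two branches on your dates, here is a sketch that mirrors the source above using only public LocalDate methods (the private day field and getProlepticMonth() are not accessible from outside). The class name UntilTrace and the method trace are my own; this is a reading aid, not the JDK implementation.

```java
import java.time.LocalDate;
import java.time.Period;
import java.time.temporal.ChronoField;

class UntilTrace {
    // Mirrors the branches of LocalDate.until using public accessors only.
    static Period trace(LocalDate start, LocalDate end) {
        long totalMonths = end.getLong(ChronoField.PROLEPTIC_MONTH)
                - start.getLong(ChronoField.PROLEPTIC_MONTH);
        int days = end.getDayOfMonth() - start.getDayOfMonth();
        if (totalMonths > 0 && days < 0) {
            // Forward case: drop one month from the count, then count real
            // days from the intermediate date to the end date.
            totalMonths--;
            LocalDate calcDate = start.plusMonths(totalMonths);
            days = (int) (end.toEpochDay() - calcDate.toEpochDay());
        } else if (totalMonths < 0 && days > 0) {
            // Backward case: no intermediate date is built; the length of
            // the end date's month is subtracted from the day difference.
            totalMonths++;
            days -= end.lengthOfMonth();
        }
        return Period.of(Math.toIntExact(totalMonths / 12), (int) (totalMonths % 12), days);
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(2019, 1, 30);
        LocalDate d2 = LocalDate.of(2019, 3, 29);
        // months 2 -> 1, days counted from 2019-02-28 to 2019-03-29
        System.out.println(trace(d1, d2)); // P1M29D
        // months -2 -> -1, days (30 - 29) - 31
        System.out.println(trace(d2, d1)); // P-1M-30D
    }
}
```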

It just happens that when the difference in months is positive, the period's days are computed from epoch days, which produces an accurate result. It could still be affected when the new end date has to be clipped to fit the month length, such as in:

jshell> LocalDate.of(2019, 1, 31).plusMonths(1)
$42 ==> 2019-02-28

But this can't happen in your example, because you simply can't supply an invalid end date to the method, as in

// Period.between(LocalDate.of(2019, 1, 31), LocalDate.of(2019, 2, 30))

which is what it would take for the number of days in the resulting period to be clipped.
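
A small sketch of why such an end date can never reach Period.between: LocalDate.of rejects it with a DateTimeException before any period is computed. The class InvalidEndDate and the helper isRejected are hypothetical names for illustration.

```java
import java.time.DateTimeException;
import java.time.LocalDate;

class InvalidEndDate {
    // Returns true if LocalDate.of rejects the year/month/day combination.
    static boolean isRejected(int year, int month, int day) {
        try {
            LocalDate.of(year, month, day);
            return false;
        } catch (DateTimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected(2019, 2, 30)); // true
        System.out.println(isRejected(2019, 2, 28)); // false
    }
}
```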

When the time difference in months is negative, however, it happens:

// Both calls below return 2019-04-30, so the 1-day difference between the inputs is lost
jshell> LocalDate.of(2019, 5, 30).plusMonths(-1)
$50 ==> 2019-04-30

jshell> LocalDate.of(2019, 5, 31).plusMonths(-1)
$51 ==> 2019-04-30

And, using periods and local dates:

jshell> Period.between(LocalDate.of(2019, 3, 31), LocalDate.of(2019, 2, 28))
$39 ==> P-1M-3D //3 days? It didn't look at February's length (2<3 && 28<31)

jshell> Period.between(LocalDate.of(2019, 3, 31), LocalDate.of(2019, 1, 31))
$40 ==> P-2M

In your case (second call), -30 is the result of (30 - 29) - 31, where 31 is the number of days in January.
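
One consequence worth checking: because the -30 comes from January's length rather than from walking back through February, adding the reversed period to its start date does not return to the original date. A minimal sketch, with the hypothetical helper applyDifference:

```java
import java.time.LocalDate;
import java.time.Period;

class RoundTrip {
    // Computes the period from one date to another and adds it back to the first date.
    static LocalDate applyDifference(LocalDate from, LocalDate to) {
        return from.plus(Period.between(from, to));
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(2019, 1, 30);
        LocalDate d2 = LocalDate.of(2019, 3, 29);
        // P1M29D: 2019-01-30 + 1 month = 2019-02-28, + 29 days = 2019-03-29
        System.out.println(applyDifference(d1, d2)); // 2019-03-29
        // P-1M-30D: 2019-03-29 - 1 month = 2019-02-28, - 30 days = 2019-01-29
        System.out.println(applyDifference(d2, d1)); // 2019-01-29
    }
}
```

The forward direction lands on the end date, while the reversed direction ends one day before 2019-01-30.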

I think the short story here is not to use Period for time value calculations. In the context of time, I suppose month is a notional unit. Periods will work well when a month is defined as an abstract period (such as in calculations of monthly rent payments), but they'll usually fail when it comes to continuous time.
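
If what you need is a direction-independent amount of elapsed time, one option is to count days instead of building a Period. A minimal sketch using ChronoUnit.DAYS.between; the class and method names are hypothetical:

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class DayCount {
    // Signed number of days from start (inclusive) to end (exclusive).
    static long daysBetween(LocalDate start, LocalDate end) {
        return ChronoUnit.DAYS.between(start, end);
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(2019, 1, 30);
        LocalDate d2 = LocalDate.of(2019, 3, 29);
        System.out.println(daysBetween(d1, d2)); // 58
        System.out.println(daysBetween(d2, d1)); // -58
    }
}
```

Here the two directions differ only in sign, since no month arithmetic is involved.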