Understanding large difference in cache hits

32 views Asked by At

I'm working on a Leetcode hard-tagged dynamic programming problem. My solution is 16x slower than a solution I found in the discussion section, and I'd like to better understand why.

Historically the recurrence relation I've used in similar problems is to advance the item index by 1 and either keep or skip the item. That's what the fast solution does. I thought that looping through items would be similar (I'm pretty sure I've seen that approach used with success elsewhere), and since I have less practice with that technique, I thought I would apply it to this problem. That's what I did in my much slower solution, and I think that's the source of the slowness. Are these two approaches not practically equivalent?

Here is the fast code that I didn't write. It takes about half a second.

from typing import List
from functools import cache

class Solution:
    def profitableSchemes(self, n: int, minProfit: int, group: List[int], profit: List[int]) -> int:

        @cache
        def dfs(i, members, cur_profit):

            if i >= len(profit):
                if cur_profit >= minProfit and members <= n:
                    return 1
                else:
                    return 0

            ans = 0
            ans += dfs(i + 1, members, cur_profit)

            if members + group[i] <= n:
                ans += dfs(i + 1, members + group[i], min(cur_profit + profit[i], minProfit))

            return ans

        answer = dfs(0, 0, 0) % (10 ** 9 + 7)
        print(dfs.cacehe_info())

        return answer


solution = Solution()
answer = solution.profitableSchemes(
    100, 100,
    [2,5,36,2,5,5,14,1,12,1,14,15,1,1,27,13,6,59,6,1,7,1,2,7,6,1,6,1,3,1,2,11,3,39,21,20,1,27,26,22,11,17,3,2,4,5,6,18,4,14,1,1,1,3,12,9,7,3,16,5,1,19,4,8,6,3,2,7,3,5,12,6,15,2,11,12,12,21,5,1,13,2,29,38,10,17,1,14,1,62,7,1,14,6,4,16,6,4,32,48],
    [21,4,9,12,5,8,8,5,14,18,43,24,3,0,20,9,0,24,4,0,0,7,3,13,6,5,19,6,3,14,9,5,5,6,4,7,20,2,13,0,1,19,4,0,11,9,6,15,15,7,1,25,17,4,4,3,43,46,82,15,12,4,1,8,24,3,15,3,6,3,0,8,10,8,10,1,21,13,10,28,11,27,17,1,13,10,11,4,36,26,4,2,2,2,10,0,11,5,22,6]
)

Here is my slow code. It takes about 8.5 seconds.

from typing import List
from functools import cache


class Solution:
    def profitableSchemes(self, n: int, minProfit: int, group: List[int], profit: List[int]) -> int:
        self.modulus = 10 ** 9 + 7
        self.groups = group
        self.profit = profit
        self.min_profit = minProfit
        self.n = n
        answer = self.solve(-1, n, 0) % self.modulus
        if minProfit == 0:
            answer += 1
        return answer

    @cache
    def solve(self, previous_crime_index, remaining_members, accumulated_profit):
        if remaining_members <= 0 or previous_crime_index == self.n - 1:
            return 0
        answer = 0
        for crime_index in range(previous_crime_index+1, len(self.groups)):
            capped_profit = min(accumulated_profit + self.profit[crime_index], self.min_profit)
            answer += self.solve(
                crime_index, remaining_members - self.groups[crime_index],
                capped_profit
            )
            answer = answer % self.modulus
            if self.profit[crime_index] + accumulated_profit >= self.min_profit and remaining_members - self.groups[crime_index] >= 0:
                answer += 1
        return answer % self.modulus


solution = Solution()
answer = solution.profitableSchemes(
    100, 100,
    [2,5,36,2,5,5,14,1,12,1,14,15,1,1,27,13,6,59,6,1,7,1,2,7,6,1,6,1,3,1,2,11,3,39,21,20,1,27,26,22,11,17,3,2,4,5,6,18,4,14,1,1,1,3,12,9,7,3,16,5,1,19,4,8,6,3,2,7,3,5,12,6,15,2,11,12,12,21,5,1,13,2,29,38,10,17,1,14,1,62,7,1,14,6,4,16,6,4,32,48],
    [21,4,9,12,5,8,8,5,14,18,43,24,3,0,20,9,0,24,4,0,0,7,3,13,6,5,19,6,3,14,9,5,5,6,4,7,20,2,13,0,1,19,4,0,11,9,6,15,15,7,1,25,17,4,4,3,43,46,82,15,12,4,1,8,24,3,15,3,6,3,0,8,10,8,10,1,21,13,10,28,11,27,17,1,13,10,11,4,36,26,4,2,2,2,10,0,11,5,22,6]
)

print(solution.solve.cache_info())

The fast method has 692860 cache hits, while mine has 24868771. That's a 35x difference! I would have expected similar cache hits. I ran python -m cProfile on both functions, and got similar stats. The only reason for the difference I can think of is that the loop approach and that the +1 item index approach are not equally efficient.

0

There are 0 answers