Dynamic programming: how to design an algorithm when there are two factors to consider?


I have the following problem and only a vague idea of how to approach it:

Consider a tape storage problem. Given n files of lengths l1, ..., ln and the frequencies f1, ..., fn with which they are accessed, where the frequencies sum to 1 and 0 < fi < 1, "optimal" means minimizing the average retrieval time of the files. For example, suppose there are two files of lengths 10 and 4 with frequencies 0.8 and 0.2. If file 1 is stored first, the average retrieval time is 10 × 0.8 + 14 × 0.2 = 10.8.

Design an algorithm to find the optimal order and prove that it works.

My thoughts: files with larger frequencies and larger lengths should go toward the front of the order, but which factor should be given higher priority?
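The worked example can be checked with a short Python sketch (the function name `average_retrieval_time` is my own):

```python
def average_retrieval_time(files):
    """files: list of (length, frequency) pairs in storage order."""
    total, end = 0.0, 0
    for length, freq in files:
        end += length           # offset at which this file's read finishes
        total += end * freq     # retrieval time weighted by access frequency
    return total

print(average_retrieval_time([(10, 0.8), (4, 0.2)]))  # file 1 first, ~10.8
print(average_retrieval_time([(4, 0.2), (10, 0.8)]))  # file 2 first, ~12.0
```

So storing file 1 first is indeed the better of the two orders.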


There are 3 answers

Jay Bosamiya (BEST ANSWER)

There is no need for Dynamic Programming for this problem. It is a simple sorting problem, with a slight twist.

Compute frequency / length for each file; this gives how frequently each unit of the file's length is accessed. Sort the files by this ratio in descending order, since each file is "penalized" by the lengths of the files stored before it.

Proof:

Assume the order is fixed somehow. Swapping any two adjacent files leaves the contribution of every other file unchanged; only the two files at the swap location are affected. Call the left file in the swap x and the right one y. The change in average retrieval time caused by the swap is ly·fx − lx·fy. If fx/lx < fy/ly, then ly·fx < lx·fy, so ly·fx − lx·fy < 0, meaning the average retrieval time decreased and the swap is an improvement. Keep performing such swaps whenever fx/lx < fy/ly for some adjacent pair; eventually no such swap is possible, and that order must be optimal.

Swapping adjacent elements whenever the left ratio is smaller than the right is just bubble sort into descending order, so in practice any sorting algorithm can be used instead.
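The resulting algorithm is a single sort by the ratio; a minimal Python sketch (function names are my own):

```python
def optimal_order(files):
    """files: list of (length, frequency) pairs.
    Sort by frequency/length in descending order."""
    return sorted(files, key=lambda lf: lf[1] / lf[0], reverse=True)

def average_retrieval_time(files):
    """Average retrieval time of files stored in the given order."""
    total, end = 0.0, 0
    for length, freq in files:
        end += length           # offset where this file's read finishes
        total += end * freq
    return total

# Question's example: the ratios are 0.8/10 = 0.08 and 0.2/4 = 0.05,
# so file 1 goes first, giving average retrieval time ~10.8.
order = optimal_order([(10, 0.8), (4, 0.2)])
```

The total running time is dominated by the sort, i.e. O(n log n).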

Younghyo Kim

Order the files so that the lengths are ascending (l1 < l2 < l3 < l4 < ... < ln) and the frequencies are descending, because

average retrieval time ART1 =
(l1)*f1 +
(l1+l2)*f2 +
(l1+l2+l3)*f3 +
...

Now assume we swap l1 and l2, breaking the ascending order of lengths:

ART2 =
(l2)*f1 + 
(l2+l1)*f2 + 
(l1+l2+l3)*f3 + 
...

If ART1 < ART2, the swap made the situation worse. So we check whether ART1 − ART2 < 0:

ART1-ART2 = 
{ (l1)*f1 + (l1+l2)*f2 }
- 
{ (l2)*f1 + (l2+l1)*f2 }

= l1*f1-l2*f1
= (l1-l2)*f1.

From the assumption l1 < l2, we conclude ART1 − ART2 < 0.

So the lengths should be in ascending order.

Similarly, by swapping f1 and f2 you can prove that the frequencies should be in descending order.
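The swap arithmetic above can be verified numerically; a small Python check (the concrete numbers are my own, chosen only so that l1 < l2):

```python
l1, l2, l3 = 3, 5, 8            # lengths with l1 < l2
f1, f2, f3 = 0.5, 0.3, 0.2     # frequencies in descending order

ART1 = l1*f1 + (l1+l2)*f2 + (l1+l2+l3)*f3   # original order
ART2 = l2*f1 + (l2+l1)*f2 + (l1+l2+l3)*f3   # l1 and l2 swapped

# The (l1+l2)*f2 and later terms cancel, leaving (l1 - l2)*f1 < 0,
# so the original (ascending-length) order has the smaller ART.
print(ART1 - ART2, (l1 - l2) * f1)
```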

div

DP will work but is not a good solution for this problem: while a brute-force method checking all orders takes O(n!) time, a DP method takes time proportional to the number of distinct subproblems. Suppose we first make each of the n possible choices for the first file; we are then left with n subproblems of size (n − 1), all different. For DP to work, the optimal solution to the main problem must follow from optimal solutions to the subproblems. Say we know the solutions to the size-(n − 1) problems; then

total average retrieval time = (length of first chosen file × 1) + (retrieval time for the (n − 1)-size subproblem)

Because of the above recurrence, overlapping subproblems emerge, but the total number of distinct subproblems is

nC1 + nC2 + ... + nCn = 2^n − 1,

still exponential.

So DP reduces the complexity, but the approach is still exponential in nature.
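The subset DP this answer describes can be sketched in Python (an illustration with my own naming; O(2^n · n) time and O(2^n) memory, exponential as noted, so it is only viable for small n):

```python
def min_average_retrieval_time(lengths, freqs):
    """Bitmask DP: dp[s] = minimal weighted retrieval time when
    exactly the files in subset s are stored at the front of the tape."""
    n = len(lengths)
    INF = float('inf')

    # prefix_len[s] = total length of the files in subset s
    prefix_len = [0] * (1 << n)
    for s in range(1, 1 << n):
        for i in range(n):
            if s & (1 << i):
                prefix_len[s] = prefix_len[s ^ (1 << i)] + lengths[i]
                break

    dp = [INF] * (1 << n)
    dp[0] = 0.0
    for s in range(1 << n):
        if dp[s] == INF:
            continue
        for i in range(n):
            if not s & (1 << i):  # append file i after subset s
                t = s | (1 << i)
                cost = dp[s] + (prefix_len[s] + lengths[i]) * freqs[i]
                if cost < dp[t]:
                    dp[t] = cost
    return dp[(1 << n) - 1]

# Matches the greedy answer on the question's example: ~10.8
print(min_average_retrieval_time([10, 4], [0.8, 0.2]))
```

On the question's two-file instance this agrees with the sort-by-ratio answer, which finds the same optimum in O(n log n).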