The following problem occurred to me while thinking about how to implement a linker for 80286 programs:
Given an integer n and a set of sections where each section has a size no larger than n, find a partitioning of S with the least number of partitions such that in each partition the sum of the section-sizes does not exceed n.
This problem appears when trying to link object files into a program for the 80286 architecture in 16 bit protected mode. This architecture has segments of up to 65536 bytes each and complex programs need to be split into multiple segments. The linker tries to arrange the program into the least possible number of segments since each segment corresponds to one entry in the operating system's segment descriptor table. Please consider this problem without considering the fact that the 16 bit protected mode is obsolete.
As a complication of this problem, one could consider the call-graph of the program to link. Considering the case that each change of the code segment has a little penalty (the CPU has to look up the new segment in the segment descriptor table), the linker could try to group functions into the same section that call one-another. When the call graph is weighted by the expected number of calls, the linker could try to find a partitioning of the program into segments so that the edges of the call graph that cross segment boundaries have the lowest sum / sum of squares.
Questions
- I have the hunch that the original problem might easily be NP-hard, but I'm not sure how to show that. Is it NP-hard?
- Is there a fast algorithm to approximate a good solution?
- Is there an algorithm to solve the second problem or to approximate a good solution?
- I appreciate pointers to further material on the topic of distributing code into segments.