To begin with, it should be noted that "as close as possible" can be defined in several different ways, so let's assume we're talking about a sum-of-square-errors metric for the sake of simplicity.
Depending on your number of values and subsets, the brute-force solution may not actually be out of reach. If n is the number of values, and k is the number of subsets of m values each (such that m * k = n), the total number of ways to partition the values into subsets is

$$\frac{1}{k!}\binom{n}{m}\binom{n-m}{m}\cdots\binom{m}{m}$$

which simplifies to

$$\frac{n!}{(m!)^{k}\,k!}$$

For relatively small n, or small k, this might come out to reasonable values for exhaustive search.
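To get a sense of the magnitudes, here is a quick Python check of that count (the function name partition_count is mine, just for illustration):

```python
from math import factorial

def partition_count(n, k):
    """Number of ways to split n values into k unlabeled subsets of m = n // k each."""
    m = n // k
    assert m * k == n
    # n! / ((m!)^k * k!) -- always an integer, so // is exact
    return factorial(n) // (factorial(m) ** k * factorial(k))

print(partition_count(6, 2))     # 10 -- small enough to enumerate exhaustively
print(partition_count(500, 20))  # astronomically large (hundreds of digits)
```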
Though for many values, of course, it doesn't. It's still possible to do heuristically better than Potjeh's suggestion, however. I tried that with 500 random numbers between 100 and 200, divided into 20 subsets, and the initial sum-of-square-errors was ~36.1. After applying a simple and fast adjustment algorithm, I got it down to ~0.08. The algorithm (sketched in code after the list below) is simply:
- For each distinct pair (a, b) of subsets of values:
- For each pair of items (i, j) with i in a and j in b:
- If swapping i and j between a and b decreases the global error, do so.
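Here's a minimal sketch of one pass of that adjustment in Python, under the assumption that the subsets are lists of numbers and that the error is measured as the squared deviation of each subset's sum from the mean subset sum; all names are mine:

```python
def adjust(subsets):
    """One pass of the swap heuristic: for every pair of subsets and every
    cross pair of items, swap whenever it lowers the sum-of-square-errors.
    Mutates `subsets` in place; returns the error after the pass."""
    target = sum(map(sum, subsets)) / len(subsets)  # ideal subset sum
    sums = [sum(s) for s in subsets]                # track sums incrementally

    def err(s):  # squared deviation of one subset sum from the target
        return (s - target) ** 2

    for a in range(len(subsets)):
        for b in range(a + 1, len(subsets)):
            for i in range(len(subsets[a])):
                for j in range(len(subsets[b])):
                    x, y = subsets[a][i], subsets[b][j]
                    d = y - x  # how much sums[a] grows if we swap x and y
                    if err(sums[a] + d) + err(sums[b] - d) < err(sums[a]) + err(sums[b]):
                        subsets[a][i], subsets[b][j] = y, x
                        sums[a] += d
                        sums[b] -= d
    return sum(err(s) for s in sums)
```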
As the nested loops make clear, one pass of this algorithm scales as O(n²), which should be not just workable but fast for any realistic n. A single pass isn't strictly optimal, though, because running the adjustment again can often reduce the error further; still, it stabilizes very quickly, within two passes for every set I've tried it on so far. I'm not sure I can formally prove that the solution is optimal once it has stabilized, but I'd think it is.