Merge strategy



This isn't Java-specific, but I this would be a good group to ask...

I have directory containing many files of varying sizes. An external process adds small files periodically. Small files need to merged into large files so as to reduce the total number of files in the directory. We want to reduce the total number of files as quickly as possible.

What's the best merge strategy, assuming that the time require to merge two or more files is roughly proportional to their size?

Clearly it makes sense to merge small files together first before going after large files. One strategy would be to just merge the smallest X files on each pass. This is not optimal, though, if the smallest X files contains one or more very large files.

Assume that 1) there is a maximum number of files we want to involve in any given merge operation, and 2) there is a maximum file size.

Thoughts?
.