This is OpenCM libdiff. It is a partial replacement for the diff library used by CVS, which is licensed under the GPL. The problem is that the GPL and the OpenSSL license are incompatible. OpenCM needs OpenSSL, and we don't want to be subject to the GPL anyway, so replacing libdiff is the best solution.
This library is NOT a full replacement for GNU diff or for the CVS libdiff. It implements only those parts of the diff logic that OpenCM needs, and it does NOT implement many of the options of the conventional diff program. For example, the diff printer only knows how to print unified diffs, and the implementation does not currently support many of the whitespace control options of the original. Some of these would be easy additions, some harder.
The biggest impediments to using this as a general purpose diff library are:
- It currently relies on garbage collection to free its temporary storage, and
- It relies on the OpenCM buffer object implementation.
Both of these should be straightforward to change, but doing so hasn't been an objective for OpenCM. Perhaps it should be, but in the final analysis C isn't the right programming language for this kind of thing to begin with.
The sources of information for this implementation are the AT&T UNIX 32v diff implementation, which was released under the BSD license by SCO, and a paper by Eugene Myers:
Eugene W. Myers: An O(ND) Difference Algorithm and its Variations
The Myers algorithm is also the one used by GNU diff (gdiff), and is considerably faster than the O(N^2) algorithm used by the original AT&T diff implementation.
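
As a rough illustration of the O(ND) idea, here is a minimal sketch of the greedy forward search from the Myers paper. It is NOT taken from the OpenCM sources: the function name myers_distance and its signature are hypothetical, it uses plain malloc/free rather than garbage collection or the OpenCM buffer objects, and it only computes the length D of the shortest edit script (insertions plus deletions) rather than producing a diff.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of OpenCM libdiff.
 * Returns the length D of the shortest edit script (insertions plus
 * deletions) that turns lines a[0..n) into lines b[0..m), or -1 if
 * memory cannot be allocated. */
int myers_distance(const char *const *a, int n, const char *const *b, int m)
{
    assert(n >= 0 && m >= 0);

    int max = n + m;          /* D can never exceed n + m */
    int offset = max + 1;     /* shifts diagonal k into a valid array index */
    int result = -1;

    /* v[offset + k] is the furthest x reached so far on diagonal k = x - y.
     * Diagonals -max-1 .. max+1 are addressable; calloc zeroes them all,
     * which supplies the v[1] = 0 starting value from the paper. */
    int *v = calloc((size_t)(2 * max + 3), sizeof *v);
    if (v == NULL)
        return -1;

    for (int d = 0; d <= max && result < 0; d++) {
        for (int k = -d; k <= d; k += 2) {
            int x;

            /* Extend whichever neighbouring diagonal got further:
             * coming from k+1 is an insertion (x unchanged),
             * coming from k-1 is a deletion (x advances by one). */
            if (k == -d || (k != d && v[offset + k - 1] < v[offset + k + 1]))
                x = v[offset + k + 1];
            else
                x = v[offset + k - 1] + 1;

            int y = x - k;

            /* Follow the "snake": consume matching lines at no cost. */
            while (x < n && y < m && strcmp(a[x], b[y]) == 0) {
                x++;
                y++;
            }

            v[offset + k] = x;

            if (x >= n && y >= m) {   /* reached the end of both inputs */
                result = d;
                break;
            }
        }
    }

    free(v);
    return result;
}
```

For the example used in the Myers paper, with a holding the lines A, B, C, A, B, B, A and b holding the lines C, B, A, B, A, C, the call myers_distance(a, 7, b, 6) returns 5. The work done is proportional to (n + m) * D, which is why the approach is fast when the two inputs are mostly the same.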
