Re: how to compare two version strings in java



On Mon, 28 Jul 2008, Eric Sosman wrote:

> Stefan Ram wrote:
>> anywherenotes@xxxxxxxxx writes:
>>> Is there a good way of doing it?
>>
>> When the version numbers are limited to 4 positions,
>> with values ranging from 0 to 99 for each:
>
> There's no universally-agreed syntax for version strings. I'm composing this message on Thunderbird "2.0.0.16", my Java version is "1.6.0_07" (also known as "1.6.0_07-b06"), I use an O/S whose version is "5.10" and whose kernel patch level is "Generic_127127-11".

Okay, rambling follows.

The Mac, in the classical period at least, had a well-defined standard for version numbering, used for system components and applications, both Apple and third-party, based on the NumVersion struct in MacTypes.h:

struct NumVersion {
    UInt8 majorRev;        /* 1st part of version number in BCD */
    UInt8 minorAndBugRev;  /* 2nd & 3rd part of version number share a byte */
    UInt8 stage;           /* stage code: dev, alpha, beta, final */
    UInt8 nonRelRev;       /* revision level of non-released version */
};

This basically gives you a three-part version number, major.minor.bug, where the major version is two digits, and the minor and bug are one digit each (like, say, 10.5.4). Apple had conventions about when to increment each number, but I don't think it followed them very rigorously.

As well as that, you get a flag which indicates if this is a proper release, or a development, alpha, beta, or release candidate version, and a revision number within that scope, which ranges from 0 to 255, although by convention, there aren't revisions within a release, only at earlier stages.

The format for writing this out was to have the dotted major.minor.bug part, then a short code for the stage, and a number for the revision level. So, you might have the following sequence of versions:

10.5.5d1
10.5.5d2
10.5.5d3
10.5.5a1
10.5.5a2
10.5.5b1
10.5.5b2
10.5.5rc1
10.5.5

With the last one being the release stage.
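
A rough Python sketch of that written form follows. The numeric stage codes (0x20, 0x40, 0x60, 0x80) are what I understand MacTypes.h to define and are not quoted in this post, so treat them as an assumption. The name format_version is hypothetical, and the "rc" rendering for a final-stage version with a non-zero revision simply follows the sequence listed above.

```python
# Stage codes assumed from MacTypes.h (not quoted in the post).
STAGE_LETTERS = {0x20: "d", 0x40: "a", 0x60: "b"}
FINAL_STAGE = 0x80


def format_version(major, minor, bug, stage, rev):
    """Render NumVersion-style fields as text, e.g. '10.5.5b2'."""
    text = "%d.%d.%d" % (major, minor, bug)
    if stage == FINAL_STAGE:
        # Revision 0 is the real release; anything else is shown as a
        # release candidate, following the sequence in the post.
        return text if rev == 0 else "%src%d" % (text, rev)
    return "%s%s%d" % (text, STAGE_LETTERS[stage], rev)
```

For example, format_version(10, 5, 5, 0x60, 2) gives "10.5.5b2".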

A nifty feature of this structure was that it fitted in 32 bits and was laid out most-to-least significant (on a big-endian machine, anyway), which means that if you reinterpret it as a 32-bit unsigned integer, the ordering relation is preserved. That makes comparison as easy as a cast - no parsing or elementwise comparison needed!
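
To make the integer trick concrete, here is a minimal Python sketch that packs the four bytes the way a big-endian machine would see them. The function name pack_num_version is hypothetical, the stage codes are assumed from MacTypes.h rather than quoted in this post, and the revision is treated as a plain byte.

```python
# Stage codes assumed from MacTypes.h (not quoted in the post).
DEVELOP, ALPHA, BETA, FINAL = 0x20, 0x40, 0x60, 0x80


def pack_num_version(major, minor, bug, stage, rev):
    """Pack the fields into one 32-bit unsigned value, most significant first."""
    if not (0 <= major <= 99 and 0 <= minor <= 9 and 0 <= bug <= 9):
        raise ValueError("version part out of range")
    if not 0 <= rev <= 255:
        raise ValueError("revision out of range")
    major_bcd = ((major // 10) << 4) | (major % 10)  # two BCD digits
    minor_and_bug = (minor << 4) | bug               # one BCD digit each
    return (major_bcd << 24) | (minor_and_bug << 16) | (stage << 8) | rev
```

Comparing two versions is then an ordinary integer comparison, for example pack_num_version(10, 5, 5, ALPHA, 1) < pack_num_version(10, 5, 5, BETA, 1).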

There was a hiccup, though, in that there weren't actually separate stage values for release candidate and release - a genuine release was marked by having a revision number of zero. That means that a naive comparison will make release versions look older than all the release candidates. Fairly simple to work around, but annoying. It would have been really simple to define a new stage constant for release candidates, in between the beta and release values. For some reason, this was never done.
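
One possible workaround, sketched in Python under the assumptions stated in this post (release candidates share the final stage code and carry a non-zero revision; a real release has revision zero). The name release_aware_key is hypothetical and the final stage code 0x80 is assumed from MacTypes.h.

```python
FINAL_STAGE = 0x80  # assumed from MacTypes.h, not quoted in the post


def release_aware_key(packed):
    """Sort key for a packed 32-bit version that ranks a real release
    above every release candidate of the same version."""
    prefix = packed >> 8      # major, minor/bug and stage bytes
    stage = prefix & 0xFF
    rev = packed & 0xFF
    if stage == FINAL_STAGE and rev == 0:
        rev = 256             # one past the largest possible revision
    return (prefix, rev)
```

With this key, sorted(versions, key=release_aware_key) puts 10.5.5rc1 before 10.5.5, whereas a plain integer sort puts it after.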

Still, it was kind of nifty.

> I see no solution for the O.P. other than to know things about the syntax of the version strings he's interested in, to parse them into their constituent pieces, and to compare piece by piece. The java.util.regex package may be helpful for the parsing, but I know of no way to write a "universal" regex for all styles of versioning.

I got the impression that the OP *did* know the specific format; it was a string of decimal numbers, separated by full stops. But you're right, even if he does know that, there's nothing for it but to parse and compare.

In python, FWIW:

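def compareVersions(aStr, bStr):
    # Parse each dotted string into a tuple of ints; tuples compare elementwise.
    a, b = map(lambda s: tuple(map(int, s.split("."))), (aStr, bStr))
    # cmp() no longer exists in Python 3; this yields the same -1, 0 or 1.
    return (a > b) - (a < b)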

It may take a few more lines of code in java.
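
For a format that isn't plain dotted numbers, the same parse-then-compare approach needs a pattern written for that one syntax. As a sketch only, here is a Python regex for strings shaped like the "1.6.0_07" and "1.6.0_07-b06" examples quoted earlier. The pattern and the name parse_java_version are assumptions made for illustration, not a description of how Java versions are officially defined.

```python
import re

# Hypothetical pattern for strings such as "1.6.0_07" or "1.6.0_07-b06";
# an assumption about this one format, not a general version grammar.
JAVA_VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:_(\d+))?(?:-b(\d+))?")


def parse_java_version(text):
    """Return (major, minor, micro, update, build) as a tuple of ints."""
    match = JAVA_VERSION.fullmatch(text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % text)
    # A missing update or build number counts as 0.
    return tuple(int(part) if part is not None else 0
                 for part in match.groups())
```

The resulting tuples compare piece by piece, so parse_java_version("1.6.0_07") < parse_java_version("1.6.0_10") holds, and a string in some other style, such as "2.0.0.16", is rejected rather than guessed at.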

tom

--
The glass is twice as big as it needs to be.