Re: Maximum String size in Java?
From: Randy Howard (randyhoward_at_FOOverizonBAR.net)
Date: 03/04/05
- Next message: Rob Thorpe: "Re: Variation of bit-counting"
- Previous message: Ney André de Mello Zunino: "Architecture-switch to Web Services; newbie needs advice"
- In reply to: CBFalconer: "Re: Maximum String size in Java?"
- Next in thread: CBFalconer: "Re: Maximum String size in Java?"
- Reply: CBFalconer: "Re: Maximum String size in Java?"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Fri, 04 Mar 2005 16:12:39 GMT
In article <422876A1.66D5BA5@yahoo.com>, cbfalconer@yahoo.com says...
> Randy Howard wrote:
> > And I found that it was slower at 12. That's pretty contrived, for
> > most applications anyway. Neither is slow. In fact, 31/33 is
> > *slightly* faster for very short strings, and considerably slower
> > for long strings, so on average, SFH bakes it in the performance
> > category.
>
> Will it if the alignment problem is handled? I don't know.
So comment out the shortcut and find out.
> When you are pushing performance, the only thing that counts is the
> statistics on the data you are actually using. I said this several
> times in several forms. My opinion is that the use of long strings
> as keys is relatively rare.
My opinion is that agreeing that they're never long than 5 characters
is pretty rare.
> > Well, when you have collisions in the hash value itself, as seen with
> > 31/33, that's not going to change when you % that into a table (or
> > whatever method you use). I forget the exact detail, but I think it
> > was something really ugly, like "Ab" and "aB" hashing to the same
> > value. Also, I noticed some long sentence length strings that had
> > only one or two characters changed hashed to very nearly the same
> > hash value, which wasn't expected.
>
> Hashing to the 'nearly same' value is no problem for hash table
> use.
I think you ignored Paul's claim that something like 3000 collisions,
not near collisions, occur with trivially short strings using 31/33.
That's in the same league with hash = 1.
> > Apart from the major weakness of being easy to find collisions
> > using it perhaps.
>
> I don't concede that fact. That awaits tests on real data.
I note that you didn't choose to pay attention to Paul's algebraic
argument about this, I wonder why.
> > Not being harmed is not the same as "can't be improved upon". It
> > may turn out that in real world testing, there is no difference
> > between them, or one of them outright slays the other.
>
> Where did I ever make such a claim? I will not concede that an
> improvement that is totally masked by i/o is worthwhile.
Who said it was masked by I/O? Furthermore, if the improvement
is on the collision side, with performance coming along in the
bargain, that's farcical at best.
-- Randy Howard (2reply remove FOOBAR) "Making it hard to do stupid things often makes it hard to do smart ones too." -- Andrew Koenig
- Next message: Rob Thorpe: "Re: Variation of bit-counting"
- Previous message: Ney André de Mello Zunino: "Architecture-switch to Web Services; newbie needs advice"
- In reply to: CBFalconer: "Re: Maximum String size in Java?"
- Next in thread: CBFalconer: "Re: Maximum String size in Java?"
- Reply: CBFalconer: "Re: Maximum String size in Java?"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|