Re: Find/Replace in TStringList loses lots of data
- From: "alanglloyd@xxxxxxx" <alanglloyd@xxxxxxx>
- Date: 9 Aug 2005 23:05:26 -0700
As you have got a really large number of files which are themselves
quite large, I would seriously consider using a stream (as Bruce
suggests) and a Boyer-Moore-Horspool algorithm for searching.
For the string you quote this would be roughly 9 times faster. It beats
a character-by-character search because it starts by checking the Nth
file character (where N is the search string length) against all the
characters in the search string. If that character is not in the search
string, it jumps on N characters. If it is in the search string, it
jumps on by the appropriate amount to line that character up, and then
looks for a match on the last character (for example, if it found an "X"
it would jump on 4 characters). Once it finds the last search string
character, it searches backwards, checking each individual character of
the search string in turn. If all of them match, it has found the word.
Basically one sets up a 256-element array of byte values, indexed by
character code. Each element of that array is a jump value: mainly the
search string length, but appropriately smaller values for the
characters that occur in the search string. Then it's a matter of
indexing the array with the Nth character and jumping on by the
corresponding element value.
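To make the jump table and the backwards check concrete, here is a minimal sketch of the Horspool variant described above. It is written in Python purely for illustration and is not the Delphi code offered in this post. The names build_jump_table and horspool_find are made up for the example, and it assumes the file contents have already been read into a bytes buffer.

```python
def build_jump_table(pattern: bytes) -> list:
    """Return 256 jump values, one per possible byte value."""
    n = len(pattern)
    # Default: the byte does not occur in the pattern, so jump the full length.
    table = [n] * 256
    # Bytes that occur before the last position get a shorter jump: the
    # distance from their rightmost such occurrence to the end of the pattern.
    for i in range(n - 1):
        table[pattern[i]] = n - 1 - i
    return table


def horspool_find(data: bytes, pattern: bytes, start: int = 0) -> int:
    """Return the index of the first match at or after start, or -1."""
    n = len(pattern)
    if n == 0:
        return start if start <= len(data) else -1
    table = build_jump_table(pattern)
    pos = start
    while pos + n <= len(data):
        # Compare backwards, starting from the last character of the window.
        i = n - 1
        while i >= 0 and data[pos + i] == pattern[i]:
            i -= 1
        if i < 0:
            return pos  # every character matched
        # Jump on by the table entry for the byte under the window's last slot.
        pos += table[data[pos + n - 1]]
    return -1
```

For a hypothetical search string such as b"EXAMPLE" (length 7), the table holds 5 for "X", 1 for "L" and 7 for any byte not in the string, so most windows are skipped after a single comparison. For find/replace one would call horspool_find repeatedly, restarting just past each hit.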
This sounds like a lot of coding, but the speed increase is surprising.
I have code if you're interested.
Alan Lloyd
- References:
- Find/Replace in TStringList loses lots of data
- From: Ryan
- Re: Find/Replace in TStringList loses lots of data
- From: Bruce Roberts