Re: Finding substring in character array



On Thu, 18 Sep 2008, Roedy Green wrote:

On Thu, 18 Sep 2008 15:21:28 +0200, Hakan <H.L@xxxxxxxxxxxx> wrote,
quoted or indirectly quoted someone who said :

The problem is that some of the original files are really large. The runtime runs out of heap space when allocating a String for that much text. It works with a character array instead.

Why would a char[] take up significantly less space? A string is just a char[] with a few extra fields.

I'm guessing his problem is in allocating it. The trouble with a String is that there's no way to make one without already having all the characters in an array (or other container), which then gets copied into the String. So at the moment of construction you are holding two copies of the text. If you have 1 MB of heap, the biggest String you can make holds about 512 kB of character data, whereas you could fill nearly all of the 1 MB with a char[]. Roughly speaking, of course.
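To illustrate the copy, here is a minimal sketch. The class and method names are made up for this example. It shows that the String built from a char[] keeps its own storage, so the source array and the String's copy both exist at the same time:

```java
// Hypothetical demo class; the names are illustrative, not from the thread.
final class StringCopyDemo {

    // Builds a String from a char[], then mutates the array.
    // The String is unaffected, because the constructor copied the characters.
    static String demo() {
        char[] chars = {'a', 'b', 'c'};
        String s = new String(chars); // copies the contents of chars
        chars[0] = 'z';               // changes only the array
        return s;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "abc"
    }
}
```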

I think the real solution to the OP's problem is not to read the whole file into a char[]. Either use a streaming parser, or memory-map the file.
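As a rough sketch of the streaming idea (not a full parser), something like the following searches a Reader in fixed-size chunks, so only one chunk plus a small overlap is ever in memory. The class name, method name, and chunkSize parameter are assumptions made up for this example. The scan inside each chunk is deliberately naive to keep it short.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper; the names and the API are illustrative only.
final class StreamSearch {

    // Returns the char offset of the first occurrence of pattern in the
    // stream, or -1 if it does not occur. Reads at most chunkSize new
    // chars at a time (assumed to be at least 1).
    static long indexOf(Reader in, String pattern, int chunkSize) {
        int m = pattern.length();
        if (m == 0) return 0;
        // Room for one chunk plus the m-1 chars carried over between reads.
        char[] buf = new char[Math.max(chunkSize, 1) + m - 1];
        int filled = 0; // number of valid chars in buf
        long base = 0;  // stream offset of buf[0]
        try {
            while (true) {
                int read = in.read(buf, filled, buf.length - filled);
                if (read == -1) return -1; // end of stream, no match
                filled += read;
                // Check every window that fits entirely in the buffer.
                for (int i = 0; i + m <= filled; i++) {
                    int j = 0;
                    while (j < m && buf[i + j] == pattern.charAt(j)) j++;
                    if (j == m) return base + i;
                }
                // Keep the last m-1 chars so a match that straddles
                // two reads is still found.
                int keep = Math.min(m - 1, filled);
                System.arraycopy(buf, filled - keep, buf, 0, keep);
                base += filled - keep;
                filled = keep;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Reader r = new StringReader("hello streaming world");
        System.out.println(indexOf(r, "world", 4)); // prints 16
    }
}
```

For a real file you would pass a buffered file reader with the right charset instead of the StringReader used here.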

If he insists on searching a char[], I would suggest he read up on the Boyer-Moore string searching algorithm. It's not that complicated, and in practice it is usually very fast, because it can skip over stretches of the text without examining every character. That should beat String.indexOf, which I believe uses a naive search.
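A minimal sketch of the idea, using the Horspool simplification of Boyer-Moore (bad-character shifts only, no good-suffix rule) rather than the full algorithm. The class and method names are invented for this example. The shift table covers every char value, which costs about 256 kB per search, and a leaner table is possible if the alphabet is known to be small.

```java
import java.util.Arrays;

// Hypothetical helper; the name and API are illustrative, not from the thread.
final class HorspoolSearch {

    // Returns the index of the first occurrence of pattern in text,
    // or -1 if it does not occur.
    static int indexOf(char[] text, char[] pattern) {
        int n = text.length;
        int m = pattern.length;
        if (m == 0) return 0;
        if (m > n) return -1;

        // shift[c] = how far the window may slide when the text char under
        // the pattern's last position is c. Chars not in the pattern
        // (ignoring its last position) allow a full-length slide.
        int[] shift = new int[Character.MAX_VALUE + 1];
        Arrays.fill(shift, m);
        for (int i = 0; i < m - 1; i++) {
            shift[pattern[i]] = m - 1 - i;
        }

        int pos = 0;
        while (pos <= n - m) {
            // Compare right to left within the current window.
            int j = m - 1;
            while (j >= 0 && text[pos + j] == pattern[j]) j--;
            if (j < 0) return pos;
            pos += shift[text[pos + m - 1]];
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] text = "the quick brown fox".toCharArray();
        System.out.println(indexOf(text, "brown".toCharArray())); // prints 10
        System.out.println(indexOf(text, "cat".toCharArray()));   // prints -1
    }
}
```

Whether this actually beats String.indexOf on a given JVM and input is worth measuring. The gain grows with pattern length, since longer patterns allow longer skips.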

tom

--
got EXPERTISE in BADASS BRAIN FREEZE