Re: aligning text with space-normalized text



Steven Bethard wrote:
John Machin wrote:

If "work" is meant to detect *all* possibilities of 'chunks' not having been derived from 'text' in the described manner, then it doesn't work -- all information about the positions of the whitespace is thrown away by your code.

For example, text = 'foo bar', chunks = ['foobar']


This doesn't match the (admittedly vague) spec

That is *exactly* my point -- it is not valid input, and you are not reporting all cases of invalid input; you have an exception where the non-spaces are impossible, but no exception where whitespaces are impossible.
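To make that concrete, here is a minimal sketch of a check that raises in both cases: impossible non-spaces and impossible whitespace. It is not the code under discussion in this thread. The name `align` is made up for illustration, and the sketch assumes the chunks appear in order with nothing but whitespace between them.

```python
import re


def align(text, chunks):
    """Return the (start, end) span of each chunk in text.

    Raises ValueError if a chunk could not have been produced from
    text "as if by ' '.join(chunk.split())".
    Assumption: chunks occur in order, separated only by whitespace.
    """
    spans = []
    pos = 0
    for chunk in chunks:
        words = chunk.split()
        # A chunk that is not already space-normalized is invalid input.
        if chunk != ' '.join(words):
            raise ValueError('chunk is not space-normalized: %r' % (chunk,))
        # Each single space in the chunk must correspond to one or more
        # whitespace characters in text; adjacent letters in the chunk
        # must be adjacent in text.
        body = r'\s+'.join(re.escape(word) for word in words)
        pattern = re.compile(r'\s*(' + body + ')')
        match = pattern.match(text, pos)
        if match is None:
            raise ValueError('chunk %r does not fit text at offset %d'
                             % (chunk, pos))
        spans.append(match.span(1))
        pos = match.end()
    return spans
```

With this sketch, align('foo bar', ['foobar']) raises ValueError, because the chunk has no space where the text has one, while align('foo bar', ['foo', 'bar']) is accepted.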



which said that chunks are created "as if by ' '.join(chunk.split())". For the text:
    'foo bar'
the possible chunk lists should be something like:
    ['foo bar']
    ['foo', 'bar']
If it helps, you can think of chunks as lists of words, where the words have been ' '.join()ed.
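As an illustration of that reading of the spec, the sketch below enumerates every chunk list it allows for a given text. The function name `possible_chunk_lists` is hypothetical, and it assumes the chunks together cover every word of the text, which the spec as quoted does not state outright.

```python
from itertools import combinations


def possible_chunk_lists(text):
    """List every way to group the words of text into consecutive chunks.

    Assumption: the chunks together cover all the words of text.
    Each chunk is the ' '.join() of one run of consecutive words.
    """
    words = text.split()
    n = len(words)
    results = []
    # Choose k cut points among the n - 1 gaps between words.
    for k in range(n):
        for cuts in combinations(range(1, n), k):
            bounds = [0] + list(cuts) + [n]
            results.append([' '.join(words[a:b])
                            for a, b in zip(bounds, bounds[1:])])
    return results


print(possible_chunk_lists('foo bar'))
# prints [['foo bar'], ['foo', 'bar']]
```

Note that ['foobar'] never appears in the result, since no grouping of words can remove the whitespace between 'foo' and 'bar'.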

If it helps, you can re-read my message.


STeVe