Re: 80x86 tasm search for string in txt file
- From: "Rod Pemberton" <do_not_have@xxxxxxxxxxxxx>
- Date: Thu, 8 Nov 2007 23:29:42 -0500
"Šarūnas Kazlauskas" <referas@xxxxxxxxx> wrote in message
news:1194526026.446721.61440@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> hya,
> i have a little problem. the program must be able to find a string,
> given via param., in a txt file and if found -> print the WHOLE line
> and its number if that string(max 10chars) was found in that line.
> problems begin when string begins at the end of the buffer and ends at
> the beginning of the next buffer.
> read buffer is 20
> write buffer is 100
> what can you suggest?
It does sound like homework... This is really more of an implementation
problem than an assembly issue, isn't it?
Actually, a slight variation on this was a computer club programming problem
when I was in HS many years ago. This was in BASIC on an Apple II with a
5 1/4" disk. The routine was similar to the fourth description below. It was
quite fast all things considered. Of course, I was the only one I knew
using structured programming in BASIC at the time...
I'm surprised that even simple text parsing would be part of an assembly
class considering all the other problems of learning assembly. This is more
suited to a HLL like Pascal or C than assembly.
Anyway, since it's an implementation issue, I'll describe solving just your
boundary problem without using any x86 assembly...
The first suggestion is to save the last 10 chars of the data from the prior
read. I.e., use two buffers, 10 and 20 in length, which together act as a
buffer of 30. This lets you match a param of up to 10 chars which is split
across two reads. Some logic is necessary to switch to the other buffer when
comparing, and you have to copy the last ten chars on each read. I.e., either
match one character at a time or compute offsets for the two pieces of the
string to match.
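A minimal sketch of this first idea, in Python rather than x86 assembly since the point is the implementation and not the instructions. The names find_offsets, read_size, tail and base are made up for illustration. One deliberate deviation: it carries over len(param) - 1 chars instead of a fixed 10, so a match that sits entirely in the carried-over part is not reported twice.

```python
import io

def find_offsets(stream, param, read_size=20):
    """Yield the absolute offset of every occurrence of param in stream."""
    if not param:
        raise ValueError("param must not be empty")
    keep = len(param) - 1   # longest piece of param that can wait for the next read
    tail = b""              # chars carried over from the previous read
    base = 0                # absolute file offset of tail[0]
    while True:
        chunk = stream.read(read_size)
        if not chunk:
            break
        window = tail + chunk            # the combined "10 + 20" buffer
        pos = window.find(param)
        while pos != -1:
            yield base + pos
            pos = window.find(param, pos + 1)
        new_tail = window[-keep:] if keep else b""
        base += len(window) - len(new_tail)
        tail = new_tail

data = b"the quick brown fox jumps over the lazy dog\n"
# "fox jumps" starts at offset 16 and crosses the 20-byte read boundary
print(list(find_offsets(io.BytesIO(data), b"fox jumps")))
```

Printing the whole line and its number would still need a line counter and the line start, which this sketch leaves out.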
The second, if you know what you are doing: you could read 20 into position
10 of a 30 byte buffer. Or, you could read 20 into position 80 of a 100
byte buffer, i.e., your write buffer. This would need fewer calculations.
Once you've found the last character of param, you only need to subtract the
length of param to find the start location to match the entire string.
Since the length of param is always less than 30 or 100, specifically 10 or
less, you won't underflow. You have to save the read data by shifting it down
by 20 between reads so the next read doesn't overwrite it. You may still have
the problem of matching param as a substring of larger text. I.e., you'll
match 'cat' in 'location'.
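The single-buffer variant might look roughly like this in Python. READ, KEEP and stream_contains are hypothetical names, and the sketch only answers "is param in the file at all"; it makes no attempt at the whole-word check. New data always lands at position KEEP, and the last KEEP chars are shifted down in front of it before the next read.

```python
import io

READ = 20   # read size from the question
KEEP = 10   # maximum length of param, so also the amount of history to keep

def stream_contains(stream, param):
    """Return True if param occurs anywhere in stream."""
    if not 0 < len(param) <= KEEP:
        raise ValueError("param must be 1..KEEP chars long")
    buf = bytearray(KEEP + READ)    # one 30-byte buffer
    start = KEEP                    # first valid byte; no history before the first read
    while True:
        chunk = stream.read(READ)
        if not chunk:
            return False
        end = KEEP + len(chunk)
        buf[KEEP:end] = chunk       # read "into position 10"
        if buf.find(param, start, end) != -1:
            return True
        # shift the last KEEP valid bytes down so the next read can't overwrite them
        hist = bytes(buf[start:end][-KEEP:])
        start = KEEP - len(hist)
        buf[start:KEEP] = hist

text = b"x" * 19 + b"needle" + b"y" * 30   # "needle" crosses the first read boundary
print(stream_contains(io.BytesIO(text), b"needle"))
print(stream_contains(io.BytesIO(b"location"), b"cat"))  # the substring problem
```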
Third, if you can't use more buffers, then you'll have to use 1) flags to
indicate that you had a partial match in each read and 2) the length of the
match in each read to ensure that you didn't skip matching any characters.
I.e., simplified:
  if (((match_partial_1=true) and (match_partial_2=true)) and
      (length_match_1+length_match_2=length_param)) then (match_is_found)
You'll need to have a matching method which can be split, i.e., match one
character at a time. And, you still have the substring issue, which can be
solved by more flags. I.e.,
  if ((char_before_first_char_of_param_not_text_char=true) and
      (char_after_last_char_of_param_not_text=true)) then (code_above)
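One way to read the flags-and-lengths idea, sketched in Python. The helper names partial_lengths and completes are invented here. Only a few small integers survive from one read to the next: the lengths of the pieces of param that matched at the very end of the previous read. All candidate lengths are kept, not just the longest, because a param that repeats its own start (e.g. 'aab') can continue from a shorter piece.

```python
def partial_lengths(chunk, param):
    """Every length_match_1 such that chunk ends with the first length_match_1 chars of param."""
    return [k for k in range(1, len(param)) if chunk.endswith(param[:k])]

def completes(next_chunk, param, lengths):
    """True if the rest of param (length_match_2 chars) starts next_chunk for some carried length."""
    return any(next_chunk.startswith(param[k:]) for k in lengths)

# 'cat' split as 'c' + 'at' across two reads
carried = partial_lengths(b"we saw the c", b"cat")
print(carried, completes(b"at on the mat", b"cat", carried))
```

Matches that lie entirely inside one read still have to be found by the normal in-buffer search; this only covers the split case.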
Fourth, there is a simpler flags method if you compare on a char by char
basis. Get rid of the buffer, and then put it back later. I'd recommend
coding to solve the issue using a read of one (1) char and a compare of one
char, setting flags as you go.

If the char doesn't match the first char of param, you compare the next char
against the first char of param. If the char does match the first char of
param, you compare the next char against the second char of param. If the
next char doesn't match the second char of param, you compare the next char
against the first char of param, i.e., reset the index into param to the
first char. If the next char does match the second char of param, you compare
the next char against the third char of param. Repeat until all chars match.

On the first match, you'll also check that the prior char is non-text and set
a flag. This requires saving the last char for comparison. When the index
into param equals the length of param, you've matched all the characters in
param to the text stream. You may want to read one past to prevent substring
matching, i.e., a post-match non-text flag. And, you'll need to check the
pre-match non-text flag too.

Once that works, install another procedure to get one char from a buffer of
20 (instead of reading one char) and refill the buffer with a read of 20 when
it's at the end of the buffer. From the example below, you'll see my
description above has a slight error or two, a missing step, stuff I didn't
notice, etc. I didn't feel like correcting the paragraph since it captures
the basic idea.
I.e., if param is 'cat' and text is 'zcacats'
  z compare c - no match, index=0
  c compare c - match, index=0 set to 1, prior char is text
  a compare a - match, index=1 set to 2
  c compare t - no match, index=2 reset to 0
  must do compare again for first char: (missing step)
  c compare c - match, index=0 set to 1
  a compare a - match, index=1 set to 2
  t compare t - match, index=2 set to 3, index equals length of param
  match:
  s read - post char is text
  result: substring match, post and prior are text - so not an exact match
I.e., if param is 'cat' and text is '+cat!'
  ...
  result: exact match, post and prior are non-text
IIRC, this can be implemented with a loop and an if, maybe two of one or
both..., and some vars... No, I didn't work it out completely today. Anyway,
it can be done simply, but I think it might take more work in assembly than
in a HLL.
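For what it's worth, the char-by-char method could be sketched in Python along these lines. The names is_text, chars and scan are made up, and treating "text" as "letter or digit" is an assumption. The sketch includes the re-compare after a reset (the "missing step"). It keeps the simple reset to index 0, so a param that repeats its own start can be missed (e.g. 'aab' in 'aaab'); a KMP-style failure table would be needed to cover that.

```python
import io

def is_text(ch):
    """Assumption: a 'text' char is a letter or digit; '' (start of file) is not."""
    return ch.isalnum()

def chars(stream, read_size=20):
    """Hand out one char at a time, refilling from reads of read_size."""
    while True:
        chunk = stream.read(read_size)
        if not chunk:
            return
        yield from chunk

def scan(stream, param):
    """Yield (start_offset, exact) per match; exact means non-text chars on both sides."""
    if not param:
        raise ValueError("param must not be empty")
    index = 0         # how many chars of param have matched so far
    prior = ""        # the previously read char
    pre_ok = False    # pre-match flag: char before the match is non-text
    pending = None    # (start, pre_ok) of a full match waiting for its post char
    for offset, ch in enumerate(chars(stream)):
        if pending is not None:
            yield pending[0], pending[1] and not is_text(ch)
            pending = None
        if ch != param[index]:
            index = 0                 # reset, then compare again below
        if ch == param[index]:
            if index == 0:
                pre_ok = not is_text(prior)
            index += 1
            if index == len(param):
                pending = (offset - len(param) + 1, pre_ok)
                index = 0
        prior = ch
    if pending is not None:           # match ran up to end of file
        yield pending

print(list(scan(io.StringIO("zcacats"), "cat")))   # substring match
print(list(scan(io.StringIO("+cat!"), "cat")))     # exact match
```

Because chars() hides the 20-char reads, the matching loop never sees the buffer boundary at all, which is the whole point of this variant.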
Rod Pemberton