Re: divide & conquer algorithm to implement string matching
- From: Pascal Bourguignon <pjb@xxxxxxxxxxxxxxxxx>
- Date: Tue, 29 May 2007 17:21:15 +0200
misterio <giagimari@xxxxxxxxx> writes:
On 29 Mag, 14:58, Richard Heathfield <r...@xxxxxxxxxxxxxxx> wrote:
Maybe I'm missing something, but I really don't see how d&c works forI'm not trying to use divide & conquer to find any real benefit from
you in this situation.
it in this case,I'm just trying to understand how to properly use it
for this problem,because I need to understand this to prepare an
exam.Usually I can use divide & conquer to solve problems,but this
particular one is giving me some headaches.If I would have wanted to
use an effective algorithm to implement string matching,I would have
used the KMP algorithm.
Thank you for any kind of advice that you can give me.
Of course, to apply any rule of thumb, you have to adapt it to the
current circumstances.
Instead of "dividing" by cutting the search string exactly in two, you
can cut it so as to avoid missing any occurences.
pattern: 1011
string: 10100101110001010110
cut into:
string 1: 10100101110
string 2: 110001010110
Actually, you don't really need to cut the string itself, only the
range you search.
string 1: 10100101110
^ ^
| |
start1 end1
string 2: 110001010110
^ ^
| |
start2 end2
(defun d&c-search (pattern string start end)
(if (= start end)
(and (string= pattern string :start2 start :end2 (+ start (length pattern)))
start)
(or (d&c-search pattern string start (truncate (+ start end) 2))
(d&c-search pattern string (1+ (truncate (+ start end) 2)) end))))
(defun my-search (pattern string)
(d&c-search pattern string 0 (- (length string) (length pattern) -1)))
C/USER[15]> (trace d&c-search)
;; Tracing function D&C-SEARCH.
(D&C-SEARCH)
C/USER[16]> (my-search "abc" "xyzxyzabcxyyzya")
1. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '0 '13)
2. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '0 '6)
3. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '0 '3)
4. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '0 '1)
5. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '0 '0)
5. Trace: D&C-SEARCH ==> NIL
5. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '1 '1)
5. Trace: D&C-SEARCH ==> NIL
4. Trace: D&C-SEARCH ==> NIL
4. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '2 '3)
5. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '2 '2)
5. Trace: D&C-SEARCH ==> NIL
5. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '3 '3)
5. Trace: D&C-SEARCH ==> NIL
4. Trace: D&C-SEARCH ==> NIL
3. Trace: D&C-SEARCH ==> NIL
3. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '4 '6)
4. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '4 '5)
5. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '4 '4)
5. Trace: D&C-SEARCH ==> NIL
5. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '5 '5)
5. Trace: D&C-SEARCH ==> NIL
4. Trace: D&C-SEARCH ==> NIL
4. Trace: (D&C-SEARCH '"abc" '"xyzxyzabcxyyzya" '6 '6)
4. Trace: D&C-SEARCH ==> 6
3. Trace: D&C-SEARCH ==> 6
2. Trace: D&C-SEARCH ==> 6
1. Trace: D&C-SEARCH ==> 6
6
C/USER[17]>
But of course, this is taking the "divide" part much too literally! :-)
You don't need to cut the problem in two equal parts. You can just
divide the problem in two smaller problems, and one may be exactly the
same as the original problem but for some monotonous progression.
So if the problem is to search a substring in a string, you can divide
it in these two simplier problems:
- test if a string is a prefix of another string,
- search a substring in a _smaller_ string.
(defun my-search (pattern string)
(cond
((< (length string) (length pattern)) ; part of testing pattern as a prefix.
nil) ; If string is shorter, no.
((string= pattern string :end2 (length pattern)) ; prefix? position = 0
0)
(t ; now, search the pattern in a smaller string, starting from 1:
(let ((pos (my-search pattern (subseq string 1))))
(when pos (1+ pos))))))
Of course, here we copy a lot of tail substrings, so we can apply the
same trick as above, and only keep an index to the start of the
"smaller" string:
(defun d&c-search (pattern string start)
(cond
((< (- (length string) start) (length pattern))
nil)
((string= pattern string :start2 start :end2 (+ start (length pattern)))
start)
(t
(d&c-search pattern string (1+ start)))))
(defun my-search (pattern string)
(d&c-search pattern string 0))
--
__Pascal Bourguignon__ http://www.informatimago.com/
NOTE: The most fundamental particles in this product are held
together by a "gluing" force about which little is currently known
and whose adhesive power can therefore not be permanently
guaranteed.
.
- Follow-Ups:
- Re: divide & conquer algorithm to implement string matching
- From: misterio
- Re: divide & conquer algorithm to implement string matching
- References:
- divide & conquer algorithm to implement string matching
- From: misterio
- Re: divide & conquer algorithm to implement string matching
- From: Richard Heathfield
- Re: divide & conquer algorithm to implement string matching
- From: misterio
- divide & conquer algorithm to implement string matching
- Prev by Date: Re: divide & conquer algorithm to implement string matching
- Next by Date: Re: pop up... ;(
- Previous by thread: Re: divide & conquer algorithm to implement string matching
- Next by thread: Re: divide & conquer algorithm to implement string matching
- Index(es):
Relevant Pages
|