Re: divide & conquer algorithm to implement string matching



The real problem to correctly implement this algorithm is to find a
way to calculate the exact indices when applying the divide & conquer
approach,and tat is why in the previous implementation some of these
indices were missing.With the following code I solved this problem by
calculating a fake array dimension augmented by 2 units,and that gives
me the total coverage of the indices.Now the problem is that I need to
verify is this approach is too particular and in case how to extend it
to any lenght of the base string and the pattern to search.Now I'll
start to search a solution to this,but I think it should be possible
to find such a math formula if I maintain the divide sequence costant
(ie.dividing always by 2),or maybe it could be even possible that this
approach can work flowessly for any kind of lenghts,I still don't
know,I need to try.Anyway,in the meantime,this is the code :
(P. S. I simplified the printf code to print just the 1st index of
every sequence,so that you can see if there's one missing)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int count_sequence(char arr[], int arrlen);

int main(int argc,char **argv)
{
char arr[] = "01101011000110101100";
printf("number of sequences found = %d\n",
count_sequence(arr, strlen(arr)));
return 0;
}

#define SEQUENCE "0110"

void static cnt_seq_rec(char arr[], int start, int end,
int arrlen, int *cnt, int *visited)
{
int slice = end - start;
if (slice > 4) {
int mid = (start + end) / 2;
cnt_seq_rec(arr, start, mid, arrlen, cnt, visited);
cnt_seq_rec(arr, mid + 1, end, arrlen, cnt, visited);
} else {
if (((start + 3) >= arrlen) || visited[start])
return;
printf("sequence[%d] = %c%c%c%c\n", start, arr[start],
arr[start + 1], arr[start + 2], arr[start + 3]);
if (!strncmp(SEQUENCE, &arr[start], 4))
(*cnt)++;
visited[start] = 1;
}
}

int count_sequence(char arr[], int arrlen)
{
int cnt = 0;
int fake_arrlen = arrlen + 2;
int *visited = (int *)malloc(sizeof(int) * arrlen);
for (int i = 0; i < arrlen; ++i)
visited[i] = 0;
cnt_seq_rec(arr, 0, fake_arrlen, arrlen, &cnt, visited);
cnt_seq_rec(arr, 1, fake_arrlen + 1, arrlen, &cnt, visited);
cnt_seq_rec(arr, 2, fake_arrlen + 2, arrlen, &cnt, visited);
cnt_seq_rec(arr, 3, fake_arrlen + 3, arrlen, &cnt, visited);
free(visited);
return cnt;
}




.



Relevant Pages