Re: [PHP] case and accent - insensitive regular expression?



On Sat, Jul 12, 2008 at 10:29 AM, tedd <tedd.sperling@xxxxxxxxx> wrote:
At 9:36 AM +0200 7/12/08, Giulio Mastrosanti wrote:

Hi,
I have a php page that asks user for a key ( or a list of keys ) and then
shows a list of items matching the query.

every item in the list shows its data, and the list of keys it has ( a
list of comma-separated words )

I would like to highlight, in the list of keys shown for every item, the
words matching the query.

this can be easily achieved with a search and replace: for every search
word, I search for it in the key list and replace it, adding a style tag to
highlight it, for example to show it in red:

if ( @stripos($keylist, $keysearch) !== false ) {
    $keylist = str_ireplace($keysearch, '<span style="color: #FF0000">'.$keysearch.'</span>', $keylist);
}

but I have a problem with accented characters:

i have mysql with character encoding utf8, and all the php pages are
declared as utf8

mysql is configured to perform queries in a case- and accent-insensitive
way.
this means that if you search for the word 'cafe', the returned rows include
those whose keyword list contains 'cafe', but also 'café' with the accent. (
I think it has to do with 'collation' settings, but I'm not investigating at
the moment because the way it works is OK for me ).

now my problem is to find a way ( I imagine with some kind of regular
expression ) to achieve in php a search and replace accent-insensitive, so
that i can find the word 'cafe' in a string also if it is 'café', or 'CAFÉ',
or 'CAFE', and vice-versa.

hope the problem is clear and well-explained in english,

thank you for any tip,

Giulio

Giulio:

Three things:

1. Your English is fine.

2. Try using mb_ereg_replace()

http://www.php.net/mb_ereg_replace

Place the accents you want to change in that and change them to whatever you
want.

3. Change:

<span style="color: #FF0000">'.$keysearch.'</span>'

to

<span class="keysearch">'.$keysearch.'</span>'

and add

.keysearch
{
color: #FF0000;
}

to your css.
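
A minimal sketch of what point 2 might look like, assuming the mbstring
extension is loaded and the text is UTF-8. The function name fold_accents and
the (deliberately short) accent table are made up for illustration. Note that
this rewrites the accented characters in the string, which is useful for
comparing two strings but changes the text that gets displayed.

```php
<?php
// Assumption: mbstring is available and all strings are UTF-8.
mb_regex_encoding('UTF-8');

// Hypothetical helper: replace accented letters with their base letter.
function fold_accents(string $text): string
{
    // Only a few letters are listed here; extend the table as needed.
    $map = [
        '[àáâãäå]' => 'a',
        '[ÀÁÂÃÄÅ]' => 'A',
        '[èéêë]'   => 'e',
        '[ÈÉÊË]'   => 'E',
    ];

    foreach ($map as $pattern => $base) {
        $text = mb_ereg_replace($pattern, $base, $text);
    }

    return $text;
}

echo fold_accents('café'), "\n";
```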

Cheers,

tedd

I may be mistaken (and if I am, then just ignore this as ignorant
rambling), but I don't think he's wanting to replace the accented
characters in the original string. I think he's just wanting the
pattern to find all variations of the same string and highlight them
without changing them. For example, his last paragraph would look like
this:

[quote]
now my problem is to find a way ( I imagine with some kind of regular
expression ) to achieve in php a search and replace
accent-insensitive, so that i can find the word '<span
class="keysearch">cafe</span>' in a string also if it is '<span
class="keysearch">café</span>', or '<span
class="keysearch">CAFÉ</span>', or '<span
class="keysearch">CAFE</span>', and vice-versa.
[/quote]

The best I can think of right now is something like this:

<?php

function highlight_search_terms($word, $string) {
    // Escape regex metacharacters, including the '/' delimiter.
    $search = preg_quote($word, '/');

    // Expand each base letter into a class of its accented variants.
    // strtr() makes a single pass, so inserted classes are not re-expanded.
    $search = strtr($search, array(
        'a' => '[aàáâãäå]',
        'e' => '[eèéêë]',
        /* repeat for each possible accented character */
    ));

    // 'u' treats pattern and subject as UTF-8; 'i' ignores case.
    return preg_replace('/\b' . $search . '\b/iu',
        '<span class="keysearch">$0</span>', $string);
}

$string = "now my problem is to find a way ( I imagine with some kind
of regular expression ) to achieve in php a search and replace
accent-insensitive, so that i can find the word 'cafe' in a string
also if it is 'café', or 'CAFÉ', or 'CAFE', and vice-versa.";


echo highlight_search_terms('cafe', $string);

?>
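
The function above expands the letters of an unaccented search word, so it
does not cover the "vice-versa" case (searching for 'café' and finding
'cafe'). One possible way around that, as a rough sketch only: fold the
search word to its base letters first, then expand it. The names
accent_insensitive_fragment and highlight_terms are made up for this example,
the accent table is incomplete, and it assumes UTF-8 text plus the mbstring
extension for mb_strtolower(). All search words go into one pattern so that a
later word cannot match inside markup added for an earlier one.

```php
<?php
// Hypothetical helper: turn a search word into a regex fragment that
// matches the word with or without accents.
function accent_insensitive_fragment(string $word): string
{
    // Base letter => class of variants. Incomplete on purpose.
    $classes = [
        'a' => '[aàáâãäå]',
        'e' => '[eèéêë]',
        'i' => '[iìíîï]',
        'o' => '[oòóôõö]',
        'u' => '[uùúûü]',
        'c' => '[cç]',
        'n' => '[nñ]',
    ];

    // Build the reverse map (variant => base letter) from the same table.
    $fold = [];
    foreach ($classes as $base => $class) {
        $variants = preg_split('//u', trim($class, '[]'), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($variants as $variant) {
            $fold[$variant] = $base;
        }
    }

    // 'CAFÉ' -> 'café' -> 'cafe', so accented search words work too.
    $folded = strtr(mb_strtolower($word, 'UTF-8'), $fold);

    // Quote first, then expand; strtr() with an array makes a single pass.
    return strtr(preg_quote($folded, '/'), $classes);
}

// Hypothetical helper: wrap every match of any search word in a span.
function highlight_terms(array $words, string $text): string
{
    $words = array_filter($words, 'strlen'); // ignore empty search words
    if (!$words) {
        return $text;
    }

    $fragments = array_map('accent_insensitive_fragment', $words);
    $pattern = '/\b(?:' . implode('|', $fragments) . ')\b/iu';

    return preg_replace($pattern, '<span class="keysearch">$0</span>', $text);
}

echo highlight_terms(['café'], "cafe, café, CAFÉ, CAFE"), "\n";
```

The matched text is inserted back with $0, so the original spelling and case
of each keyword are left untouched.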


Andrew

