Re: What exactly is this simple regex doing?

From: Angie Ahl (angie_at_fivegeeks.com)
Date: 04/17/04


Date: Sat, 17 Apr 2004 13:16:36 +0100
To: "B. Fongo" <perl@fongo.de>

I see people have explained the regex itself but not how it's doing
what you want.

It's actually removing everything up to, but not including, the first dot.

s/// is used to find and replace, so this little regex searches your
string (e.g. http://www.domain4you.com), finds everything up to the
first dot (in this case the "http://www" part) and replaces it with
nothing.

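Here's a minimal sketch of that substitution. The helper name
strip_to_first_dot and the sample URLs are my own, picked just for
illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: remove everything before the first dot.
sub strip_to_first_dot {
    my ($url) = @_;
    (my $name = $url) =~ s/^[^.]+//;   # delete the leading run of non-dot characters
    return $name;
}

for my $url ('http://www.domain4you.com', 'http://www.domain-house.de') {
    print strip_to_first_dot($url), "\n";
}
```

Note that a string with no dot at all (say "localhost") is emptied
completely, because the whole string is a run of non-dot characters.
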
Now I have a little regex question:

Why was ^[^\.]+ suggested rather than ^.*?\. as a pattern?

I'll explain the difference to assist:

^[^\.]+
^ means start at the beginning of the string
[^\.] means match any character except a . (the backslash is optional
inside a character class)
+ means repeat that match one or more times, as many as possible in fact.

^.*?\.
^ means start at the beginning of the string
.*? means match any character as few times as possible, stopping as soon
as the next part of the pattern can match
\. means a literal . (the next part of the pattern)

So that pattern I just gave means start at the beginning and match any
characters up to and including the first dot. Unlike the first pattern,
it consumes the dot as well.
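
A quick way to check what each pattern removes is to run them against
the same string. This is only a sketch: the helper strip_with and the
sample URL are mine, and the third pattern (with a lookahead) is an
extra variant that wasn't suggested in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: delete the first match of a compiled pattern
# from a copy of the text and return the copy.
sub strip_with {
    my ($pattern, $text) = @_;
    (my $copy = $text) =~ s/$pattern//;
    return $copy;
}

my $url = 'http://www.domain4you.com';

print strip_with(qr/^[^.]+/,     $url), "\n";   # stops before the dot
print strip_with(qr/^.*?\./,     $url), "\n";   # removes the dot as well
print strip_with(qr/^.*?(?=\.)/, $url), "\n";   # lookahead leaves the dot in place
```

The first and third lines print ".domain4you.com"; the second prints
"domain4you.com" without the leading dot.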

I would have thought the latter pattern would be more efficient, but I
don't know the mind of regex so couldn't say for sure.
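
One way to find out rather than guess is to time both with the core
Benchmark module. The sketch below is an assumption-laden test, not a
verdict: the sub names and the sample URL are mine, the lazy version
puts the dot back so both produce the same result, and the numbers will
vary with the Perl version and the input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $url = 'http://www.domain4you.com';   # sample input, chosen for illustration

# Negated character class: consume the non-dot characters in one greedy run.
sub with_class {
    (my $s = $url) =~ s/^[^.]+//;
    return $s;
}

# Lazy quantifier: match through the first dot, then put the dot back.
sub with_lazy {
    (my $s = $url) =~ s/^.*?\././;
    return $s;
}

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, {
    negated_class => \&with_class,
    lazy_dot      => \&with_lazy,
});
```

As far as I understand it, the negated class can grab its characters in
one pass, while the lazy quantifier has to check after every character
whether the rest of the pattern matches yet, so I'd expect the first to
be at least as fast. On strings this short the difference is unlikely to
matter.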

HTH

Angie

On 16 Apr 2004, at 16:35, B. Fongo wrote:

>
> This regex by Rob works all right, but I can't follow exactly how it
> truncates an absolute URL from the first character to the one before
> the dot.
>
> It returns exactly what is expected (.domain4you.com from
> http://www.domain4you.com), but I can't easily understand it.
>
> Please, I'm not pulling anyone's leg. Just explain it if you can.
>
>
> foreach (@domains) {
> my $name = $_;
> $name =~ s/^[^\.]+//;
> print $name;
> }
>
>
> -----Original Message-----
> From: B. Fongo [mailto:perl@fongo.de]
> Sent: Thursday, April 15, 2004 7:29 PM
> To: beginners@perl.org
> Subject: Regex to match domain for cookie
>
>
> How do I match a domain name starting from the dot?
>
> # Match something like these
> ".domain4you.co.uk"
> ".domain-house.de"
>
>
> This is what I have:
>
>
> @domains = ("http://www.domain.com ", "http://www.domain4you.co.uk",
> "http://www.domain-house.de", "https//rrp.cash-day.com"
> );
>
>
> foreach (@domains){
>
> $_ =~ /^\D ([\.A-Za-z0-9]+[\.\D])$/; # What is wrong here?
> # Need ".domain.com", but I get "ww.domain.com"
> $x = $1;
> print "$x";
>
>
> }
>
>
>
> Babs


