Re: C# programmer wants to learn assembly?? plz help



¬a\/b <al@xxx> wrote...
On Fri, 29 Dec 2006 20:56:29 -0600, David Jones wrote:

Here's a counter-example: let's say you need to verify that an email
address in a string has the format something@xxxxxxxxxxx? (string +
'@' + string + '.' + string, with only one @ in the string) See how
long it takes you to write it in assembly.


[snip long code]

This was exactly my point. There's *no* way you wrote that faster than
the regular expression I wrote.


Then compare to ECMAScript: /^[^\s\@]+\@[^\s\@]+\.[^\s\@]+$/.test(str)

does "ecmascript" see the not allowed chars in string like "é" ?

Why is this not allowed? AFAIK, this is a valid character in an email
address, and besides, that wasn't part of the input validation
specification I listed anyway.

does "ecmascript" see the not allowed chars in string like ".." or
"@@" ?

It does not catch ".." since that was not part of the specification. It
*will* catch "@@" since it *IS* part of the specification -- it's in the
[^\s\@] blocks.

If you wanted to catch "..", all you'd have to do is add another test:

!/\.\./.test(...)

... and combine the results with the other expression. Wow, that took
me so long to add. HLLs are *SO* unproductive! :)

Moreover, say you wanted to restrict the trailing domain part (after the
final '.') to be 2-4 characters only (to prevent the user entering
something@xxxxxxxxxxxxxxx) -- all you'd have to do is change the final +
to {2,4} and you're done.
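
For what it's worth, here is a rough sketch of how the base test, the ".." test, and the {2,4} tweak might be combined. The patterns are the ones from this post; the function name looksLikeEmail and the strictTld flag are made up for illustration and are not part of the original specification.

```javascript
// Pattern from the post: string + '@' + string + '.' + string, only one '@'.
const basic = /^[^\s\@]+\@[^\s\@]+\.[^\s\@]+$/;
// Same pattern with the final + changed to {2,4}, as described above.
const shortTld = /^[^\s\@]+\@[^\s\@]+\.[^\s\@]{2,4}$/;
// Extra test for two consecutive dots anywhere in the string.
const doubleDot = /\.\./;

// Hypothetical helper: the name and the strictTld flag are illustrative only.
function looksLikeEmail(str, strictTld = false) {
  const pattern = strictTld ? shortTld : basic;
  return pattern.test(str) && !doubleDot.test(str);
}

console.log(looksLikeEmail("user@example.com"));          // true
console.log(looksLikeEmail("a..b@example.com"));          // false (contains "..")
console.log(looksLikeEmail("user@example.museum", true)); // false (last part too long)
```

Keeping the ".." check as a separate expression leaves each pattern short enough to read at a glance.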


does "ecmascript" see "1@c." or "1@.qw" or "@a.b" like wrongs?

Yes, all of these will fail the test (meaning they won't be accepted),
which you can easily verify by writing some quick JavaScript:

(1) "1@c." will fail due to the trailing [^\s\@]+$ part of the
expression.
(2) "1@.qw" will fail due to the middle [^\s\@]+
(3) "@a.b" will fail due to the leading ^[^\s\@]+

The + quantifier requires at least one matching character.
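
The quick JavaScript check mentioned above could look something like this. The variable names are arbitrary, and the fourth input ("1@c.qw") is an extra well-formed address added here only for contrast.

```javascript
// The email pattern from the post.
const emailPattern = /^[^\s\@]+\@[^\s\@]+\.[^\s\@]+$/;

// The three inputs asked about, plus one well-formed address for contrast.
const inputs = ["1@c.", "1@.qw", "@a.b", "1@c.qw"];

for (const input of inputs) {
  // Prints the input followed by whether the pattern accepts it.
  console.log(JSON.stringify(input), emailPattern.test(input));
}
// "1@c."   false
// "1@.qw"  false
// "@a.b"   false
// "1@c.qw" true
```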


Guess which one would take longer for me to write.... (hint: if I wrote

you have to see the time for the bug-free version

This code works exactly to the specification I gave, which is exactly
the kind of specification you would get from a client. My point wasn't
that regular expressions are a proper way to perform actual
RFC-compliant email validation -- they aren't. My point was that a language
with regular expressions is much better at pattern matching than raw
assembly, which demonstrates that raw assembly is not always the best
tool for the job.

As another example, check phone number fields:

/^((\(\d{3}\))\s?|\d{3}\/\s?|\d{3}\.?)?\d{3}[\-\.]?\d{4}$/.test(str)

This accepts phone numbers in any of the following formats:

(202) 555-1212
(202)555-1212
202/555-1212
202.555.1212
2025551212
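
A short sketch to check the five formats against that expression. The last two inputs are extra cases added here, not from the list above; they may be the sort of gap that makes a single expression awkward.

```javascript
// The single phone-number pattern from the post.
const phonePattern = /^((\(\d{3}\))\s?|\d{3}\/\s?|\d{3}\.?)?\d{3}[\-\.]?\d{4}$/;

const samples = [
  "(202) 555-1212",
  "(202)555-1212",
  "202/555-1212",
  "202.555.1212",
  "2025551212",
];

// True only if every listed format is accepted.
const allAccepted = samples.every((s) => phonePattern.test(s));
console.log(allAccepted); // true

// Extra cases (not from the post): mixed separators get through,
// while a hyphen after the area code is rejected.
console.log(phonePattern.test("202.555-1212")); // true
console.log(phonePattern.test("202-555-1212")); // false
```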

Doing this completely in a single expression is hard (the one above
isn't complete), so you'd want to do this in separate steps and OR them
together:

// require the - or . separator with area code in ()
/^(\(\d{3}\)\s?)?\d{3}[\-\.]\d{4}$/.test(...)

// if using - or . as area code separator, require matching for local
/^\d{3}([\-\.])\d{3}\1\d{4}$/.test(...)

// if using / as area code separator, don't require exchange separator
/^\d{3}\/\d{3}[\-\.]?\d{4}$/.test(...)

// allow omitting the separators entirely
/^(\d{3})?\d{7}$/.test(...)
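
One way the separate steps might be OR'ed together, as a sketch only. The helper name isPhoneNumber is hypothetical, and the last pattern is written with an optional area code, which is an assumption about the intent of the separator-free case.

```javascript
// Patterns from the post; a number is accepted if any one of them matches.
const phonePatterns = [
  /^(\(\d{3}\)\s?)?\d{3}[\-\.]\d{4}$/, // optional (area code), - or . required
  /^\d{3}([\-\.])\d{3}\1\d{4}$/,       // - or . after area code, same one repeated
  /^\d{3}\/\d{3}[\-\.]?\d{4}$/,        // / after area code, exchange separator optional
  /^(?:\d{3})?\d{7}$/,                 // no separators; optional area code is an assumption
];

// Hypothetical helper name, for illustration only.
function isPhoneNumber(str) {
  return phonePatterns.some((re) => re.test(str));
}

console.log(isPhoneNumber("(202) 555-1212")); // true
console.log(isPhoneNumber("202-555-1212"));   // true
console.log(isPhoneNumber("202.555-1212"));   // false (separators don't match)
```

Adding or dropping a format then means editing one array entry instead of reworking one large expression.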

This took hardly any time to write, yet I think an appropriate equally-
featured assembly version would take longer (and be even harder to
follow when you have to come back and modify it months/years later).

I hope you and annabee see my point: sometimes, to be more productive,
you need to use a tool designed for the task. Assembly can be a Swiss
Army knife, but if I'm opening a can, I'd rather use my electric can
opener than do it manually with the tool on my Swiss Army knife. :)

David