Re: C# programmer wants to learn assembly?? plz help



On Sat, 30 Dec 2006 04:58:54 -0600, David Jones wrote:
¬a\/b <al@xxx> wrote...
On Fri, 29 Dec 2006 20:56:29 -0600, David Jones wrote:
Here's a counter-example: let's say you need to verify that an email
address in a string has the format something@xxxxxxxxxxx? (string +
'@' + string + '.' + string, with only one @ in the string) See how
long it takes you to write it in assembly.


[snip long code]

This was exactly my point. There's *no* way you wrote that faster than
the regular expression I wrote.


Then compare to ECMAScript: /^[^\s\@]+\@[^\s\@]+\.[^\s\@]+$/.test(str)

Does "ecmascript" catch disallowed characters in the string, like "é"?

Why is this not allowed? AFAIK, this is a valid character in an email
address, and besides, that wasn't part of the input validation
specification I listed anyway.

On my system "é"==232, so it is > 127 and should not be an
allowed char.
It seems that to read a valid email address we have to write a compiler.
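
A quick way to see how the pattern from earlier in the thread treats a character above 127. This is only a sketch: `isAsciiOnly` is a hypothetical helper name and the sample address is made up. The class [^\s\@] excludes only whitespace and "@", so "é" passes; limiting input to 7-bit ASCII would take one additional test.

```javascript
// Pattern quoted earlier in the thread.
const emailPattern = /^[^\s\@]+\@[^\s\@]+\.[^\s\@]+$/;

// Hypothetical helper: true when every character code is in 0..127.
const isAsciiOnly = (str) => /^[\x00-\x7F]*$/.test(str);

// Made-up sample address containing U+00E9 (e with acute accent).
const sample = "andr\u00e9@example.com";

console.log(emailPattern.test(sample));                        // true
console.log(emailPattern.test(sample) && isAsciiOnly(sample)); // false
```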

(from RFC 821 on email; I indented where I found the email definition)

<reverse-path> ::= <path>
<forward-path> ::= <path>
<path> ::= "<" [ <a-d-l> ":" ] <mailbox> ">"
<a-d-l> ::= <at-domain> | <at-domain> "," <a-d-l>
<at-domain> ::= "@" <domain>
<domain> ::= <element> | <element> "." <domain>
<element> ::= <name> | "#" <number> | "[" <dotnum> "]"

<mailbox> ::= <local-part> "@" <domain>
<local-part> ::= <dot-string> | <quoted-string>
<name> ::= <a> <ldh-str> <let-dig>
<ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>
<let-dig> ::= <a> | <d>
<let-dig-hyp> ::= <a> | <d> | "-"
<dot-string> ::= <string> | <string> "." <dot-string>
<string> ::= <char> | <char> <string>
<quoted-string> ::= """ <qtext> """
<qtext> ::= "\" <x> | "\" <x> <qtext> | <q> | <q> <qtext>
<char> ::= <c> | "\" <x>
<dotnum> ::= <snum> "." <snum> "." <snum> "." <snum>
<number> ::= <d> | <d> <number>
<CRLF> ::= <CR> <LF>
(RFC 821, Simple Mail Transfer Protocol, August 1982)
<CR> ::= the carriage return character (ASCII code 13)
<LF> ::= the line feed character (ASCII code 10)
<SP> ::= the space character (ASCII code 32)
<snum> ::= one, two, or three digits representing a decimal
integer value in the range 0 through 255
<a> ::= any one of the 52 alphabetic characters A through Z
in upper case and a through z in lower case
<c> ::= any one of the 128 ASCII characters, but not any
<special> or <SP>
<d> ::= any one of the ten digits 0 through 9
<q> ::= any one of the 128 ASCII characters except <CR>,
<LF>, quote ("), or backslash (\)
<x> ::= any one of the 128 ASCII characters (no exceptions)
<special> ::= "<" | ">" | "(" | ")" | "[" | "]" | "\" | "."
| "," | ";" | ":" | "@" | """ | the control
characters (ASCII codes 0 through 31 inclusive and 127)
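
A subset of the grammar above can still be written as a regular expression. The sketch below is an assumption-laden illustration, not a full RFC 821 validator: it covers only the <dot-string> form of <local-part> and <name> elements in <domain>, and it leaves out backslash escapes, <quoted-string>, "#" <number>, and "[" <dotnum> "]". The variable names and sample addresses are made up.

```javascript
// <c>: any ASCII character except <special>, <SP>, and the control characters.
const c = String.raw`[^\x00-\x20\x7F-\uFFFF<>()\[\]\\.,;:@"]`;

// <dot-string> ::= <string> | <string> "." <dot-string>   (escapes omitted)
const dotString = String.raw`${c}+(?:\.${c}+)*`;

// <name> ::= <a> <ldh-str> <let-dig>   (so a name has at least three characters)
const name = String.raw`[A-Za-z][A-Za-z0-9-]+[A-Za-z0-9]`;

// <domain> ::= <element> | <element> "." <domain>   (only <name> elements here)
const domain = String.raw`${name}(?:\.${name})*`;

// <mailbox> ::= <local-part> "@" <domain>
const mailbox821 = new RegExp(`^${dotString}@${domain}$`);

console.log(mailbox821.test("smith@example.com"));      // true
console.log(mailbox821.test("a@b.c"));                  // false: "b" and "c" are too short for <name>
console.log(mailbox821.test("john..doe@example.com"));  // false: empty <string> between the dots
console.log(mailbox821.test("andr\u00e9@example.com")); // false: character above 127
```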

If I read the above correctly,

a@xxxxxxxxxxxxx

should be a valid email string. But it seems to me your code does not
accept it (like mine).

Does "ecmascript" catch disallowed sequences in the string, like ".." or
"@@"?

It does not catch ".." since that was not part of the specification. It
*will* catch "@@" since it *IS* part of the specification -- it's in the
[^\s\@] blocks.

"a@xxxx" should not be a valid address either.
If you want "..", it should be "a@b.\.c".

If you wanted to catch "..", all you'd have to do is add another test:

!/\.\./.test(...)

... and combine the results with the other expression. Wow, that took
me so long to add. HLLs are *SO* unproductive! :)
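
Combining the two tests might look like the following sketch; `isAcceptable` is a hypothetical name, and the sample inputs are made up.

```javascript
// Original pattern plus the extra ".." check described above.
const emailPattern = /^[^\s\@]+\@[^\s\@]+\.[^\s\@]+$/;
const doubleDot = /\.\./;

// Hypothetical helper: must match the pattern and must not contain "..".
function isAcceptable(str) {
  return emailPattern.test(str) && !doubleDot.test(str);
}

console.log(isAcceptable("a@b.c"));   // true
console.log(isAcceptable("a@b..c"));  // false: passes the pattern, caught by the ".." test
console.log(isAcceptable("a@@b.c"));  // false: rejected by the pattern itself
```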

Moreover, say you wanted to restrict the trailing domain part (after the
final '.') to be 2-4 characters only (to prevent the user entering
something@xxxxxxxxxxxxxxx) -- all you'd have to do is change the final +
to {2,4} and you're done.
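
As a sketch of that change (the name `strictTld` and the sample addresses are made up for illustration):

```javascript
// Same pattern, with the final "+" replaced by "{2,4}".
const strictTld = /^[^\s\@]+\@[^\s\@]+\.[^\s\@]{2,4}$/;

console.log(strictTld.test("user@example.com"));      // true
console.log(strictTld.test("user@example.toolong"));  // false: 7 characters after the final "."
console.log(strictTld.test("user@example.c"));        // false: only 1 character after the final "."
```

One thing to keep in mind: the class [^\s\@] still allows ".", so an input such as "a@b.c.d" matches with "c.d" as the last group. Excluding "." from the final class would tighten that if it matters.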


Does "ecmascript" reject "1@c." or "1@.qw" or "@a.b" as invalid?

Yes, all of these will fail the test (meaning they won't be accepted),
which you can easily verify by writing some quick JavaScript:

(1) "1@c." will fail due to the trailing [^\s\@]+$ part of the
expression.
(2) "1@.qw" will fail due to the middle [^\s\@]+
(3) "@a.b" will fail due to the leading ^[^\s\@]+

The + modifier requires at least one character before matching.
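
The quick JavaScript check mentioned above could be as small as this sketch (the `inputs` and `results` names are illustrative):

```javascript
const emailPattern = /^[^\s\@]+\@[^\s\@]+\.[^\s\@]+$/;

// The three inputs from the question, plus one that should pass.
const inputs = ["1@c.", "1@.qw", "@a.b", "a@b.c"];
const results = inputs.map((input) => emailPattern.test(input));

inputs.forEach((input, i) => {
  console.log(JSON.stringify(input), results[i]);
});
// "1@c."  false
// "1@.qw" false
// "@a.b"  false
// "a@b.c" true
```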


Guess which one would take longer for me to write.... (hint: if I wrote

You have to count the time for the bug-free version.

This code works exactly to the specification I gave, which is exactly
the kind of specification you would get from a client. My point wasn't
that regular expressions were a proper way to perform actual
RFC-compliant email validation -- they aren't. My point was that a language
with regular expressions is much better at pattern matching than raw
assembly, which demonstrates that raw assembly is not always the best
tool for the job.

You are only calling functions.
If an assembler had the same functions, it could do the same, just
with more text in the program.

As another example, check phone number fields:

/^((\(\d{3}\))\s?|\d{3}\/\s?|\d{3}\.?)?\d{3}[\-\.]?\d{4}$/.test(str)

This accepts phone numbers in any of the following formats:

(202) 555-1212
(202)555-1212
202/555-1212
202.555.1212
2025551212
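
A short sketch that runs the pattern against the five formats listed above (the `phonePattern`, `samples`, and `allAccepted` names are illustrative):

```javascript
const phonePattern = /^((\(\d{3}\))\s?|\d{3}\/\s?|\d{3}\.?)?\d{3}[\-\.]?\d{4}$/;

const samples = [
  "(202) 555-1212",
  "(202)555-1212",
  "202/555-1212",
  "202.555.1212",
  "2025551212",
];

// Every listed format should be accepted.
const allAccepted = samples.every((s) => phonePattern.test(s));
console.log(allAccepted);                        // true

// The area code is optional, so a bare 7-digit number also passes.
console.log(phonePattern.test("555-1212"));      // true

// A dash after the area code is not one of the accepted separators.
console.log(phonePattern.test("202-555-1212"));  // false
```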


