Re: mail address validation



[I'm unsure whether the quoting levels are correct. When I read ~greg's
posting I assumed that he was the OP, clarifying his problem. But
reading again, it looks like Petr is the OP? So I may be responding to
the wrong person]

On 2009-01-04 23:14, ~greg <g_m@xxxxxxxxxxxxxxxxxx> wrote:
"Petr Vileta "fidokomik"" <stoupa@xxxxxxxxxxxxx> wrote in message news:ggu7di$1qdp$1@xxxxxxxxxxxxxxxxxx
smallpond wrote:
On Nov 29, 11:38 am, "Petr Vileta \"fidokomik\""
My questions are:

1) what characters are forbidden in the "user" part?
2) what characters are forbidden in the "domain" part?
3) are single-character top-level domains allowed, say "example.o"?

Many thanks for any explanation.


You want to reinvent the wheel and not just use the CPAN module? ok

http://ex-parrot.com/~pdw/Mail-RFC822-Address.html

I have read this, but the regexp on this page returns true for user@domain. In my opinion this is an invalid mail address because the TLD
part is missing.

~~~~~~~~~~~~~~~~
If "TLD" means "top level domain", then I'm guessing that you really are
just talking about syntax (as in whether user@domain is invalid because
it's not more like user@xxxxxxxxxx) and not semantics (as in:
"Please note that there is no way to determine whether an address
is deliverable without attempting delivery" - Email::Valid doc )

~~
Maybe I can say a few things about this.

First of all, RFC2822 "obsoletes 822".

RFC 5322 obsoletes RFC 2822 and allows email addresses of the form
<localpart@tld>. So without "attempting delivery" (i.e. at least doing a
DNS lookup on "domain" and determining that there are no A or MX records
for "domain.") you cannot know that <user@domain> is invalid.


Second, I don't know what's what with these things,
but it seemed to me that RFC1036
(here: http://www.cs.tut.fi/~jkorpela/rfc/1036.html )
and "son of RFC1036"
(here: http://www.chemie.fu-berlin.de/outerspace/netnews/son-of-1036.html )
are clearer and maybe more authoritative than some others.

RFC 1036 is stricter than RFC 822 (and its successors). I.e., some
forms which would be legal in email messages are not allowed in Usenet
articles.

In particular, for what I needed, this
http://www.cs.tut.fi/~jkorpela/rfc/1036.html#2.1.1
quote:
Thus, the three permissible forms are:
From: mark@xxxxxxxxxxxxxx
From: mark@xxxxxxxxxxxxxx (Mark Horton)
From: Mark Horton <mark@xxxxxxxxxxxxxx>

together with the note that the characters < > @ ( )
aren't syntactic if they're escaped (with \ ) or occur
in quotes ("..."), was good enough for what I needed.
Or at least I think it is.
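
As a rough illustration of the three permissible forms quoted above, here is a
minimal sketch in Python using the standard library's email.utils.parseaddr.
This is not the Perl module discussed in this thread. The function name
split_from and the example.com addresses are placeholders made up for the
example, and the code only splits a header value into a name and an address,
it does not validate anything.

```python
from email.utils import parseaddr

# Placeholder addresses (hypothetical); the three shapes follow the
# forms quoted from RFC 1036, section 2.1.1.
SAMPLES = [
    "mark@example.com",
    "mark@example.com (Mark Horton)",
    "Mark Horton <mark@example.com>",
]


def split_from(value):
    """Return (friendly_name, address) for one From header value.

    The name is an empty string when the header carries only an address.
    """
    return parseaddr(value)


if __name__ == "__main__":
    for sample in SAMPLES:
        name, address = split_from(sample)
        print(f"{sample!r}: name={name!r} address={address!r}")
```

The parenthesised comment in the second form and the phrase in the third both
come back as the name, so the caller does not need to tell the forms apart.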

~~~
So here's what I needed ....

I'm re-doing a newsgroup archive.
And I needed to parse the From header lines in order
1) to create a "safe" From line, with the domain part of the addresses
replaced by '...', in order to protect the authors from bot harvesting, and
2) to extract the friendly name, such as is normally displayed in e-mail readers.
(using the 'local' from local@domain as the friendly name if there isn't a proper one.)

I would suggest using Mail::Address for this task. You don't have to
"validate" email addresses for this (and you probably shouldn't - some
posters use invalid addresses on purpose, and an address which was valid
a few years ago might be invalid now), and Mail::Address should handle
the parsing, and return the "friendly name", local part and domain name.
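
To make the idea concrete, here is a minimal sketch of the two archive steps
(a "safe" From line and a friendly name). It is written in Python with the
standard library's email.utils.parseaddr rather than with Mail::Address, so
take it as an analogue, not as that module's API. The function name mask_from,
the '...' replacement and the example.com addresses are assumptions made for
the example.

```python
from email.utils import parseaddr


def mask_from(value):
    """Return (safe_from, friendly_name) for one From header value.

    The domain is replaced by '...'; the friendly name falls back to the
    local part when the header has no display name. No validation is done.
    """
    name, address = parseaddr(value)
    if not address:
        # Nothing parseable: return empty strings rather than echo the input.
        return "", ""
    local, at, _domain = address.rpartition("@")
    if not at:
        # No "@" at all: treat the whole token as the local part.
        local = address
    masked = f"{local}@..." if at else local
    friendly = name or local
    safe = f"{name} <{masked}>" if name else masked
    return safe, friendly


if __name__ == "__main__":
    # Placeholder addresses (hypothetical).
    for header in ("Mark Horton <mark@example.com>",
                   "mark@example.com (Mark Horton)",
                   "mark@example.com"):
        print(mask_from(header))
```

The sketch rebuilds the safe line without re-quoting the name, which is fine
for display in an archive but would not be a well-formed header if the name
contains special characters.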

hp