Re: mail address validation
- From: "Peter J. Holzer" <hjp-usenet2@xxxxxx>
- Date: Mon, 5 Jan 2009 01:03:29 +0100
[I'm unsure whether the quoting levels are correct. When I read ~greg's
posting I assumed that he was the OP, clarifying his problem. But
reading again, it looks like Petr is the OP? So I may be responding to
the wrong person]
On 2009-01-04 23:14, ~greg <g_m@xxxxxxxxxxxxxxxxxx> wrote:
"Petr Vileta "fidokomik"" <stoupa@xxxxxxxxxxxxx> wrote in message news:ggu7di$1qdp$1@xxxxxxxxxxxxxxxxxx
smallpond wrote:
On Nov 29, 11:38 am, "Petr Vileta \"fidokomik\""
My questions are:
1) what characters are forbidden in "user" part
2) what characters are forbidden in "domain" part
3) are single-character top-level domains allowed, say "example.o"?
Many thanks for any explanation.
You want to reinvent the wheel and not just use the CPAN module? ok
http://ex-parrot.com/~pdw/Mail-RFC822-Address.html
I have read this, but the regexp on this page returns true for user@domain. In my opinion this is an invalid mail address because the TLD
part is missing.
~~~~~~~~~~~~~~~~
If "TLD" means "top level domain", then I'm guessing that you really are
just talking about syntax (as in whether user@domain is invalid because
it's not more like user@xxxxxxxxxx) and not semantics (as in:
"Please note that there is no way to determine whether an address
is deliverable without attempting delivery" - Email::Valid doc )
~~
Maybe I can say a few things about this.
First of all, RFC2822 "obsoletes 822".
RFC 5322 obsoletes RFC 2822 and allows email addresses of the form
<localpart@tld>. So without "attempting delivery" (i.e. at least doing a
DNS lookup on "domain" and determining that there are no A or MX records
for "domain.") you cannot know that <user@domain> is invalid.
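To illustrate what such a lookup might look like, here is a rough sketch in
Python (not Perl, and only the standard library). The function name
resolves() and its lookup parameter are made up for this example. Note that
it only checks for address records: the Python standard library has no MX
query, so a negative result here still does not prove that the address is
undeliverable.

```python
import socket

def resolves(domain, lookup=socket.getaddrinfo):
    """Return True if `domain` has at least one address record (A/AAAA).

    MX records are NOT checked here (the standard library cannot query
    them), so False does not prove that mail to the domain would bounce.
    `lookup` is a hypothetical hook so the resolver can be replaced.
    """
    # The trailing dot asks for "domain." as a fully qualified name,
    # so the resolver's search list is not applied.
    fqdn = domain.rstrip(".") + "."
    try:
        return bool(lookup(fqdn, None))
    except (socket.gaierror, UnicodeError):
        # gaierror: the name did not resolve; UnicodeError: the name
        # could not be encoded as a host name at all.
        return False
```

A real check would also ask for MX records, which needs a DNS library.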
Second, I don't know what's what with these things,
but it seemed to me that RFC1036
(here: http://www.cs.tut.fi/~jkorpela/rfc/1036.html )
and "son of RFC1036"
(here: http://www.chemie.fu-berlin.de/outerspace/netnews/son-of-1036.html )
are clearer and maybe more authoritative than some others.
RFC 1036 is more strict than RFC 822 (and its successors). I.e., some
forms which would be legal in Email messages are not allowed in usenet
articles.
In particular, for what I needed, this
http://www.cs.tut.fi/~jkorpela/rfc/1036.html#2.1.1
quote:
Thus, the three permissible forms are:
From: mark@xxxxxxxxxxxxxx
From: mark@xxxxxxxxxxxxxx (Mark Horton)
From: Mark Horton <mark@xxxxxxxxxxxxxx>
together with the note that the characters < > @ ( )
aren't syntactic if they're escaped (with \ ) or occur
in quotes ("..."), was good enough for what I needed.
Or at least I think it is.
~~~
So here's what I needed ....
I'm re-doing a newsgroup archive.
And I needed to parse the From header lines in order
1) to create a "safe" From line, with the domain part of the addresses
replaced by '...', in order to protect the authors from bot harvesting, and
2) to extract the friendly name, such as is normally displayed in e-mail readers.
(using the 'local' from local@domain as the friendly name if there isn't a proper one.)
I would suggest using Mail::Address for this task. You don't have to
"validate" email addresses for this (and you probably shouldn't - some
posters use invalid addresses on purpose, and an address which was valid
a few years ago might be invalid now), and Mail::Address should handle
the parsing, and return the "friendly name", local part and domain name.
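As a rough sketch of the idea (in Python, using the standard library's
email.utils.parseaddr as a stand-in for Mail::Address, so this is not the
Perl module's API): parse the header, take the phrase or comment as the
friendly name, fall back to the local part, and replace the domain by
'...'. The function name safe_from() and the example.com addresses are
placeholders invented for this example.

```python
from email.utils import parseaddr

def safe_from(header_value):
    """Return (friendly_name, masked_from) for one From header value.

    No validation is attempted: the value is only split into a
    display name, a local part and a domain.
    """
    name, addr = parseaddr(header_value)
    # Split at the last "@" so the domain can be replaced.
    local, _, domain = addr.rpartition("@")
    if not local:
        # No usable "@": treat the whole string as the local part.
        local, domain = addr, ""
    # Use the local part as the friendly name if there is no proper one.
    friendly = name or local
    masked_addr = f"{local}@..." if domain else local
    masked = f"{name} <{masked_addr}>" if name else masked_addr
    return friendly, masked

# The three forms permitted by RFC 1036, with a placeholder domain.
for value in ("mark@example.com",
              "mark@example.com (Mark Horton)",
              "Mark Horton <mark@example.com>"):
    print(safe_from(value))
```

Both forms that carry a name give "Mark Horton" as the friendly name, and
the bare address falls back to "mark".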
hp