Re: DocumentHTML ?
- From: "A. Sinan Unur" <1usa@xxxxxxxxxxxxxxxxxxx>
- Date: Tue, 27 Feb 2007 01:07:45 GMT
"~greg" <g_m@xxxxxxxxxxxxxxxxxx> wrote in news:pZWdnZ_VB-MT1n7YnZ2dnUVZ_oGlnZ2d@xxxxxxxxxxx:
> I am trying to get an InternetExplorer.Application to print out
> the whole HTML document as text,
> from the <HTML> (or before) to the </HTML>.
> (-so as to feed it to a TreeBuilder parse).
>
> print $Document->Body->innerHTML works,
> but returns only the body's innerHTML.
>
> print $Document->Body->outterHTML,
> and print $Document->DocumentHTML,
> don't work.
>
> The error is:
> Win32::OLE(0.1707) error 0x80020003: "Member not found"
> in METHOD/PROPERTYGET "" at ...
>
> Any hints, please?
Well, the first one would be to use
http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/
I have successfully used that module to do some really complicated
automated downloading of about 10 GB of HTML from various web sites
(sorry can't be more specific).
Note the comment at
http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/lib/Win32/IE/Mechanize.pm#%24ie-%3Econtent
use strict;
use warnings; # do not leave it out.
#!/usr/bin/perl
use strict;
use warnings;

use Win32::OLE qw(EVENTS in);

$| = 1;

my $Looping = 1;

my $IE = Win32::OLE->new("InternetExplorer.Application")
    or die "Could not start Internet Explorer.Application\n";

Win32::OLE->WithEvents($IE, \&MyIEHandler, "DWebBrowserEvents2");

sub MyIEHandler {
    my ($obj, $event, @args) = @_;
    if ($event eq "DocumentComplete") {
        # The first event argument is the browser object that finished loading.
        my $IEWindow = shift @args;
        # outerHTML of the root element covers <HTML> through </HTML>.
        print $IEWindow->Document->documentElement->{outerHTML};
    }
    elsif ($event eq 'OnQuit') {
        # Stop receiving events and leave the message loop.
        Win32::OLE->WithEvents($IE);
        $Looping = 0;
    }
}

$IE->{visible} = 1;
$IE->Navigate("http://www.google.com");

while ($Looping) {
    Win32::Sleep(40);
    Win32::OLE->SpinMessageLoop();
}

__END__
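
Since the goal was to feed the result to a TreeBuilder parse, here is a minimal sketch of that step. It assumes HTML::TreeBuilder from CPAN is installed; title_from_html is a hypothetical helper name, and $sample merely stands in for the string that the DocumentComplete handler would print.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: takes an HTML string (for example, the outerHTML
# of documentElement) and returns the text of its <title>, or undef.
sub title_from_html {
    my ($html) = @_;

    # Parse the whole string into a tree in one step.
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Find the first <title> element anywhere in the tree.
    my $title_el = $tree->look_down(_tag => 'title');
    my $title    = $title_el ? $title_el->as_text : undef;

    # Free the tree explicitly when done with it.
    $tree->delete;

    return $title;
}

# Stand-in for the HTML captured from IE (an assumption for illustration).
my $sample = '<html><head><title>Example</title></head>'
           . '<body><p>Hi</p></body></html>';

my $title = title_from_html($sample);
print defined $title ? "$title\n" : "(no title)\n";
```

Keep in mind that outerHTML is IE's own serialization of the document, so what you parse may differ from the bytes the server actually sent.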
Sinan