Re: DocumentHTML ?



"~greg" <g_m@xxxxxxxxxxxxxxxxxx> wrote in news:pZWdnZ_VB-MT1n7YnZ2dnUVZ_oGlnZ2d@xxxxxxxxxxx:

I am trying to get an InternetExplorer.Application to print out
the whole HTML document as text,
from the <HTML> (or before) to the </HTML>.
(-so as to feed it to a TreeBuilder parse).


print $Document->Body->innerHTML works,
but returns only the body's innerHTML.

print $Document->Body->outterHTML,
and print $Document->DocumentHTML,
don't work.

The error is:
Win32::OLE(0.1707) error 0x80020003: "Member not found"
in METHOD/PROPERTYGET "" at ...


Any hints, please?

Well, the first hint would be to use

http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/

I have successfully used that module to do some really complicated
automated downloading of about 10 GB of HTML from various web sites
(sorry can't be more specific).

Note the comment at

http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/lib/Win32/IE/Mechanize.pm#%24ie-%3Econtent

Also, next to use strict; your script should always have:

use warnings; # do not leave it out.

#!/usr/bin/perl

use strict;
use warnings;


$|=1;
my $IEWindow;
my $Document;
my $Looping = 1;

use Win32::OLE qw(EVENTS in);

my $IE = Win32::OLE->new("InternetExplorer.Application")
    or die "Could not start Internet Explorer.Application\n";

Win32::OLE->WithEvents($IE, \&MyIEHandler, "DWebBrowserEvents2");

sub MyIEHandler {
    my ($obj, $event, @args) = @_;

    if ($event eq "DocumentComplete") {
        my $IEWindow = shift @args;
        print $IEWindow->Document->documentElement->{outerHTML};
    }
    elsif ($event eq 'OnQuit') {
        Win32::OLE->WithEvents($IE);
        $Looping = 0;
    }
}

$IE->{visible} = 1;
$IE->Navigate("http://www.google.com");

while ($Looping) {
    Win32::Sleep(40);
    Win32::OLE->SpinMessageLoop();
}

__END__
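
Since the goal was to feed the markup to a TreeBuilder parse, the handler
could pass the string on instead of printing it. Here is a minimal sketch
of that step, assuming HTML::TreeBuilder is installed. The helper name
link_hrefs and the literal $html (standing in for whatever outerHTML
returns) are made up for illustration:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: return the href of every <a> element in an HTML string.
sub link_hrefs {
    my ($html) = @_;
    my $tree  = HTML::TreeBuilder->new_from_content($html);
    my @hrefs = grep { defined }
                map  { $_->attr('href') }
                $tree->look_down(_tag => 'a');
    $tree->delete;    # free the tree explicitly
    return @hrefs;
}

# Stand-in for the string returned by documentElement->{outerHTML}.
my $html = '<html><head><title>Test</title></head>'
         . '<body><a href="http://www.google.com/">Google</a></body></html>';

print "$_\n" for link_hrefs($html);
```

In the DocumentComplete branch you would call something like this with
the outerHTML string rather than a literal.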

Sinan