Re: How to write something to a html textfield and send it?

From: Michael Wojcik (mwojcik_at_newsguy.com)
Date: 03/15/04


Date: 15 Mar 2004 17:08:46 GMT


In article <f53a6ce0.0403131411.503038d3@posting.google.com>, alexhanh@ranssi.paivola.net (Alexander) writes:
>
> > > I have a standard html web-page with a textfield and a send button
> > > added to it (look below for html). I need a way to open the page, find
> > > the specific textfield, write something to it and "click" send. I need
> > > to do this through a program.
> >
> > Do you actually need to control a particular browser and make it
> > perform these actions, or do you need to write an HTTP user agent
> > which can submit the form with the text field populated?
>
> No need for controlling any particular browser. I just need to log on
> first and then start submitting data. I'm not familiar with HTTP user
> agents, but this sounds like something correct.

A "user agent" is just the HTTP term for a client. It sounds like your
best bet is to write a simple HTTP client that does what you want,
since you don't need all the functions of a browser.

You may need more than something that just posts a form, though - see
below.

> I don't know if I need the browser for the solution. If I can "write"
> into the textfield without clicking on it and typing to it through a
> browser, then I don't need the browser at all.

Note that the solution I'm proposing doesn't involve "writ[ing] into
the textfield" at all. When you enter text in a textfield displayed
by a browser, you're just giving the browser the data to put in its
request to the server. I'm suggesting skipping that step entirely
and building the request in your program.

The server doesn't know anything about a textfield; what it knows is
that it served one page (with an HTML form - but it doesn't care
about that), and then it got a request that contained some parameter
data. That request could have been generated by all sorts of things;
an HTML form is just the most common.

So what you want to send to the server is a request that says "this
parameter has this value, and this other parameter has *this* value"
and so on.
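
To make that concrete, here is a minimal Python sketch of how such
parameter data is usually encoded for a form submission
(application/x-www-form-urlencoded). The field name "test" is taken
from the form quoted in this thread; the value is made up for
illustration.

```python
from urllib.parse import urlencode

# Hypothetical form data: "test" is the text field's name in the HTML
# quoted in this thread; the value is only an example.
fields = {"test": "somerandomstring anotherone"}

# urlencode() builds the body a browser would send for a form:
# spaces become "+", and name=value pairs are joined with "&".
body = urlencode(fields)
print(body)  # test=somerandomstring+anotherone
```

Nothing about the textfield itself appears here; only the name and
the value go to the server.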

> Google newsgroup is a
> very close to my problem. First, I need to log on.

This probably complicates things. There's no standard mechanism for
"logging on" to an HTTP server; there are a variety of them, some
more common than others. Probably the most commonly used one is
a combination of an HTML form to pass some user credentials to the
server, and a cookie set by the server to maintain the session
information. There are a host of other possibilities, such as HTTP
authentication. And if the "log on" process is supposed to be
secure, SSL is probably being used, which will make writing a user
agent significantly more complicated.
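
For the form-plus-cookie case, a client might look roughly like the
following Python sketch. It is only a sketch under assumptions: the
URLs and the field names "user" and "password" are invented, and the
real ones have to be read from the site's own forms. The opener keeps
whatever cookies the server sets and sends them back on later
requests.

```python
import http.cookiejar
import urllib.request
from urllib.parse import urlencode

# Hypothetical URLs; the real ones come from the site's login form
# and message form.
LOGIN_URL = "http://www.example.com/login"
POST_URL = "http://www.example.com/post"

def make_opener():
    """Return an opener that stores and resends the server's cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar

def form_request(url, fields):
    """Build (but do not send) a POST request carrying form fields."""
    data = urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=data, method="POST")

def log_in_and_post(opener, user, password, text):
    """Log on, then submit one message; returns the HTTP status code."""
    # Step 1: submit the credentials. If the server sets a session
    # cookie, it ends up in the opener's cookie jar.
    # The field names "user" and "password" are assumptions.
    login = form_request(LOGIN_URL, {"user": user, "password": password})
    opener.open(login).close()
    # Step 2: submit the message. The opener sends the cookie back.
    with opener.open(form_request(POST_URL, {"test": text})) as resp:
        return resp.status
```

If the site uses SSL, the same code can be pointed at https URLs,
provided the Python installation was built with SSL support; the
library then does the hard part.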

However, I don't recall any login process for using Google Groups.
Is your application different from Google Groups in this respect,
or do you mean something different by "log on"?

> Then I need to go
> to comp.programming and click submit a new post. And then write
> something to the message field, and click post message - no preview.

If you're writing your own user agent, typically you won't be doing
anything like this. You'll create your request and post it to the
server using the proper Request-URI. Navigation typically isn't
required because the Request-URI for posting a message doesn't
change; you can figure out what it is before writing your program and
then hard-code it (or make it some kind of external configuration
item, or whatever).
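
As an illustration of a hard-coded Request-URI, this Python sketch
builds the request message that would go on the wire. The host name
and the path "/post" are placeholders, not the real values for any
site; you would substitute what you find in the form's ACTION
attribute or in a traced browser request.

```python
from urllib.parse import urlencode

# Hypothetical values, worked out once before writing the program.
HOST = "www.example.com"
REQUEST_URI = "/post"

def build_post(fields):
    """Return the bytes of an HTTP/1.1 POST request for the fields."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {REQUEST_URI} HTTP/1.1\r\n"
        f"Host: {HOST}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"  # blank line separates the headers from the body
    )
    return head.encode("ascii") + body

print(build_post({"test": "hello world"}).decode("ascii"))
```

The same message is sent every time; only the body (and so the
Content-Length) changes from one submission to the next.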

> I
> need to run this process unlimited times. And the text written to the
> message field is gotten from outside or generated randomly.

Purely application considerations; they don't affect the basic design
of this sort of client.

> > [snip example of POST request]
>
> This sounds good. If I just could fill in the text field
> <input type="text" name="test" size="60" maxlength="20" class="text"/>
> <input type="submit" value="test!" class="submit" />
> it would be something like
> test=somerandomstring+anotherone&action=submit ?
> the code just differs from yours. Here class-attributes are used.

To be honest, I don't know offhand if or how the "class" attribute
affects the request-message. It may only be used by the browser
for rendering the input. You'd have to check the relevant standards,
or - and this is a good idea in any case - trace a few requests you
make using your browser, so you can see what your application
actually looks like on the wire.

There are several TCP tunnelling applications freely available
which will show you HTTP request and response messages. Typically
they work like this:

1. Tell the TCP tunnel application where the server is - hostname
(or IP address) and port (usually 80 for HTTP). Tell it what
local port you want it to listen on. Start the tunnel.

2. Make a local copy of the HTML page with the form. Edit it,
adding a BASE element (or changing the form's ACTION attribute) so
that the form is submitted to your TCP tunnel's port.

3. Fill in the form data and submit it. You should see in the
tunnel program's output your HTTP request and the server's response.
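
If you would rather not hunt for one, a tunnel of this kind is small
enough to sketch yourself. The following Python is a minimal,
unpolished version, not any particular existing tool: it listens on a
local port, relays each connection to the server, and prints whatever
passes through in either direction. All function names are made up
for this example.

```python
import socket
import threading

def pump(src, dst, label):
    """Copy bytes from src to dst, printing each chunk as it passes."""
    try:
        while True:
            chunk = src.recv(4096)
            if not chunk:
                break
            print(f"--- {label} ---")
            print(chunk.decode("latin-1"))
            dst.sendall(chunk)
    except OSError:
        pass
    finally:
        # Pass the end-of-data signal on to the other side.
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass

def handle(client, server_host, server_port):
    """Relay one client connection to the server and back."""
    with client, socket.create_connection(
            (server_host, server_port)) as server:
        t = threading.Thread(target=pump,
                             args=(client, server, "request"))
        t.start()
        pump(server, client, "response")
        t.join()

def run_tunnel(listen_port, server_host, server_port=80):
    """Listen on 127.0.0.1:listen_port and relay to the server."""
    with socket.create_server(("127.0.0.1", listen_port)) as listener:
        while True:
            client, _ = listener.accept()
            threading.Thread(target=handle,
                             args=(client, server_host, server_port),
                             daemon=True).start()
```

Calling run_tunnel(8080, "www.example.com") would listen on local
port 8080 until interrupted. Note that the browser will then send a
Host header naming the local port, which some servers may not accept.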

> I also found some footage from the site's html.
> <!DOCTYPE html
> PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
> "DTD/xhtml1-transitional.dtd">
> So, I guess it is standard 1.0 HTML.

Nope - that's XHTML 1.0, which is a reformulation of HTML 4 in XML.
Chances are that HTML 4.0 is close enough, though; the server probably
won't care.

> I'll start studying HTTP and HTML form encoding. Does usually HTTP
> APIs contain these functionalities?

I don't know; I haven't looked at them in detail. It seems likely,
though, since simulating requests produced by HTML forms ought to
be a pretty common need.

-- 
Michael Wojcik                  michael.wojcik@microfocus.com
The lark is exclusively a Soviet bird.  The lark does not like the
other countries, and lets its harmonious song be heard only over the
fields made fertile by the collective labor of the citizens of the
happy land of the Soviets.  -- D. Bleiman