Re: cut strings and parse for images
From: Paul McGuire (ptmcg_at_austin.rr._bogus_.com)
Date: 12/06/04
- Next message: Scott Frankel: "Re: file descriptors & fdopen"
- Previous message: Fredrik Lundh: "Re: byte code generated under linux ==> bad magic number under windows"
- In reply to: Andreas Volz: "cut strings and parse for images"
- Next in thread: Andreas Volz: "Re: cut strings and parse for images"
- Reply: Andreas Volz: "Re: cut strings and parse for images"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Mon, 06 Dec 2004 20:36:36 GMT
"Andreas Volz" <usenet-spam-trap@brachttal.net> wrote in message
news:20041206203456.251c6d85@frodo.mittelerde...
> Hi,
>
> I used SGMLParser to parse all href's in a html file. Now I need to cut
> some strings. For example:
>
> http://www.example.com/dir/example.html
>
> Now I'd like to cut the string so that only the domain and directory
> are left over. Expected result:
>
> http://www.example.com/dir/
>
> I know how to do this in bash programming, but not in python. How could
> this be done?
>
> The next problem is not only to extract href's, but also images. A href
> is easy:
>
> Install
>
> But an image is a little harder:
>
> <img class="bild" src="images/marine.jpg">
>
Check out the urlparse module (in the standard distribution). For images, you
can supply the page's URL as the base address, so that "images/marine.jpg" is
expanded relative to the current location.
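A rough sketch of how that might look, not a definitive implementation. It
assumes urljoin from urlparse (urllib.parse on Python 3) and uses HTMLParser
as a stand-in for SGMLParser. The names directory_of and LinkCollector and
the href "install.html" are made up for illustration; the original href is
not shown in the quoted message.

```python
try:  # Python 3 locations
    from urllib.parse import urljoin
    from html.parser import HTMLParser
except ImportError:  # Python 2 locations (the urlparse module named above)
    from urlparse import urljoin
    from HTMLParser import HTMLParser

PAGE_URL = "http://www.example.com/dir/example.html"


def directory_of(url):
    """Return the URL cut down to scheme, domain and directory (hypothetical helper)."""
    # Joining with "." resolves to the directory that contains the page.
    return urljoin(url, ".")


class LinkCollector(HTMLParser):
    """Collect href and img src values, resolved against the page URL (hypothetical class)."""

    def __init__(self, base_url):
        HTMLParser.__init__(self)
        self.base_url = base_url
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))


collector = LinkCollector(PAGE_URL)
# "install.html" is an assumed href; the img tag is the one from the question.
collector.feed('<a href="install.html">Install</a>'
               '<img class="bild" src="images/marine.jpg">')

print(directory_of(PAGE_URL))
print(collector.links)
print(collector.images)
```

Resolving every src and href against the page URL as it is collected means the
relative and absolute cases go through the same call, so no manual string
cutting is needed.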
-- Paul