Re: clueless student trying to parse XML
From: Emanuel Bulic (emanuelbulic_at_yahoo.com)
Date: 10/17/03
- Next message: Steve Horsley: "Re: fast communication with c++"
- Previous message: Steve Horsley: "Re: String consisting of spaces"
- In reply to: sal achhala: "clueless student trying to parse XML"
- Next in thread: Miguel De Anda: "Re: clueless student trying to parse XML"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: 17 Oct 2003 12:38:28 -0700
To begin: HTML is not always parseable by an XML parser. The rules
for HTML are less strict than those for XML, so valid HTML is not
necessarily well-formed XML. On top of that, many web pages contain
invalid HTML (missing closing tags, etc.) that will fail an XML
well-formedness check.
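To see the difference for yourself, here is a minimal sketch using the standard JAXP DOM parser. The class name WellFormedCheck and the two sample strings are made up for illustration; the only assumption is a JDK with JAXP on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
public class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    public static boolean isWellFormed(String markup)
            throws ParserConfigurationException, IOException {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors instead of printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser rejects anything that is not well-formed.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // XML-style markup: every tag is closed.
        System.out.println(isWellFormed("<p>one<br/>two</p>")); // prints true
        // Common HTML style: <br> is never closed.
        System.out.println(isWellFormed("<p>one<br>two</p>"));  // prints false
    }
}
```

The second string is perfectly ordinary HTML, but an XML parser stops at the unclosed <br>. Real pages usually need to be cleaned up or run through an HTML-aware parser before XML tools will accept them.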
Next: become familiar with XML processing in Java. Buy an XML for
Java book, and use online resources. Apache is your best friend.
XML technologies (Java):
JAXP - Java API for XML Processing, the standard API for XML processing.
Xerces - open source XML parser by Apache (xml.apache.org).
Xalan - open source XML transformer (XSLT processor) by Apache, same place.
That should keep you busy for a week...
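As a starting point for "how to actually get coding", here is a rough sketch of pulling data out of a document with the JAXP DOM API. The class name JaxpExtract, the method textOf, and the sample <books> document are all invented for this example; a real page would have its own element names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of JAXP.
public class JaxpExtract {

    // Parses the XML string and returns the text of every element named tagName.
    public static List<String> textOf(String xml, String tagName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName(tagName);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data for illustration only.
        String xml = "<books>"
                + "<book><title>XML and Java</title></book>"
                + "<book><title>Java and XSLT</title></book>"
                + "</books>";
        System.out.println(textOf(xml, "title")); // prints [XML and Java, Java and XSLT]
    }
}
```

DOM loads the whole document into memory, which is the easiest model to start with. For large inputs, the SAX side of JAXP streams events instead of building a tree.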
"sal achhala" <none@none.com> wrote in message news:<bmosr2$jve$1@south.jnrs.ja.net>...
> I need pointing in the right direction regarding writing a parser to parse
> HTML/XML in order to extract the data from it.
>
> I'm writing a prototype for the final application, but being fairly new to
> Java I'm totally at a loss where to start.
>
> I'm getting quite frustrated as I haven't got a clue where to start (I've read
> some of the javadoc & have a pile of Java reference books)
>
> I've read up on the DOM/SAX standards and Java's support for XML parsing but
> still have no idea how to actually get coding.
>
> The final application is aimed at extracting data which meets user criteria
> from a given website.
>
> thanks
>
> sal
>
> ps this is a final year University Computer Science Project
>
> more details at http://www.mellowmoose.org/project.html