Re: parsing contents of variable for a specific value...
From: Scaramouche (spamReallySucks_at_forgetit.com)
Date: 11/21/03
- Next message: Mark Smart: "Java MIDI with good timing"
- Previous message: Vlado: "Pasting image in clipboard on Mac OS"
- In reply to: hiwa: "Re: parsing contents of variable for a specific value..."
Date: Fri, 21 Nov 2003 13:57:41 GMT
hiwa,
i'm not familiar with that package (java.util.regex), thus i'll do some
research on it.
thank you.
"hiwa" <HGA03630@nifty.ne.jp> wrote in message
news:6869384d.0311210127.2a2e61ee@posting.google.com...
> "nos" <nos@nospam.com> wrote in message
> news:<%ievb.200465$ao4.710467@attbi_s51>...
> > ok, you have to check the particular html page,
> > but it is perfectly fine to have "<title>" on one line and "</title>" on
> > another line.
> >
> > I did not mean to imply that it will usually be lower case, but your program
> > should be able to handle both
> >
> > "Scaramouche" <spamSucks@forgetIt.com> wrote in message
> > news:JYdvb.686$MF.280@nwrddc03.gnilink.net...
> > > my variable (htmlContent) contains the source of an entire web page. as i
> > > go through it i get the output below, if i try and assign the value i'm
> > > after to a string variable i would get a null pointer exception.
> > >
> > > ======
> > > while ((htmlContent=input.readLine())!= null)
> > > {
> > >     System.out.println(htmlContent);
> > >     startIdx = htmlContent.indexOf("<title>");
> > >     startIdx += 7;
> > >     endIdx = htmlContent.indexOf("</title>");
> > > }
> > > // String myTitle = htmlContent.substring(startIdx, endIdx);
> > > System.out.println(startIdx + " " + endIdx);
> > > ------output------
> > > 6 -1
> > > ======
> > >
> > > not sure what i'm doing wrong.
> > >
> > >
> > >
> > > "nos" <nos@nospam.com> wrote in message
> > > news:Yndvb.195123$mZ5.1451628@attbi_s54...
> > > > i wonder if it might be something about an embedded <cr><lf>
> > > >
> > > > "Greg" <spamMeNot@noThanks.com> wrote in message
> > > > news:M3dvb.3$Yh4.4383808@news.nnrp.ca...
> > > > > Hmm, strange. I compiled and ran your code and it worked just fine.
> > > > >
> > > > > Here's the exact program I used:
> > > > >
> > > > > public class Tester {
> > > > >
> > > > >     public static void main (String[] args) {
> > > > >         String urlContent = "<TITLE>this_value_is_what_i_want</TITLE>";
> > > > >         String title=null;
> > > > >         int startidx=0, endidx=0;
> > > > >
> > > > >         startidx = urlContent.indexOf("<TITLE>");
> > > > >         startidx += 7;
> > > > >         endidx = urlContent.indexOf("</TITLE>");
> > > > >         System.out.println("startidx: " + startidx + " endidx: " + endidx);
> > > > >         title = urlContent.substring(startidx, endidx);
> > > > >         System.out.println(title);
> > > > >     }
> > > > > }
> > > > >
> > > > >
> > > > > And here's the output:
> > > > >
> > > > > startidx: 7 endidx: 32
> > > > >
> > > > > this_value_is_what_i_want
> > > > >
> > > > >
> > > > > Seems OK to me.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Scaramouche wrote:
> > > > >
> > > > > > thank you for taking the time to try and help out.
> > > > > > i thought the same thing and when i checked, endidx did contain a -1. this
> > > > > > is somewhat confusing since the spelling and case of the closing (</TITLE>)
> > > > > > tag is correct, i thought the slash might be throwing it off but since it's
> > > > > > a string i don't think that's it.
> > > > > > thanks again!
> > > > > >
> > > > > > "nos" <nos@nospam.com> wrote in message
> > > > > > news:98avb.258761$Tr4.806047@attbi_s03...
> > > > > >
> > > > > >>i would suggest you first check the result of the urlContent.indexOf()
> > > > > >>method invocations to see if you are getting -1 or null or whatever
> > > > > >>(some html pages use lower case)
> > > > > >>
> > > > > >>"Scaramouche" <spamReallySucks@forgetit.com> wrote in message
> > > > > >>news:o68vb.11938$M31.256257@twister.tampabay.rr.com...
> > > > > >>
> > > > > >>>i have the contents of an html page stored within a variable. i would like
> > > > > >>>to parse out the value of the TITLE tag,
> > > > > >>>ie..<TITLE>this_value_is_what_i_want</TITLE>
> > > > > >>>
> > > > > >>> String title=null;
> > > > > >>> int startidx=0, endidx=0;
> > > > > >>>
> > > > > >>> startidx = urlContent.indexOf("<TITLE>");
> > > > > >>> startidx += 7;
> > > > > >>> endidx = urlContent.indexOf("</TITLE>");
> > > > > >>> title = urlContent.substring(startidx, endidx);
> > > > > >>> System.out.println(title); //this doesn't work for me. generates an out of bounds err msg.
> > > > > >>>
> > > > > >>>is there a way of doing this with java2...trying to stay away from xml since
> > > > > >>>it's new to me.
> > > > > >>>
> > > > > >>>thanks
>
> Use Java regular expressions (the java.util.regex package) with the MULTILINE
> option. Don't parse a specific string. Parse the whole document in a
> single breath.
>
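
A minimal sketch of the approach hiwa describes, with the assumptions flagged. In the loop quoted earlier, startIdx and endIdx are overwritten on every line, so the printed "6 -1" reflects only the last line read (indexOf returned -1, and -1 + 7 = 6). htmlContent is also null once the loop ends, which would explain the null pointer exception on substring. The sketch below instead collects the whole page into one String and runs a single regular expression over it. The class name TitleExtractor and the methods readAll and extractTitle are hypothetical names for illustration, not from the thread. One caveat: in java.util.regex, MULTILINE only changes how ^ and $ behave. The flag that lets '.' match across line breaks is DOTALL, so that is what is used here, together with CASE_INSENSITIVE to cover both <title> and <TITLE>.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names are illustrative, not from the thread.
class TitleExtractor {

    // CASE_INSENSITIVE accepts <title> and <TITLE>; DOTALL lets '.' match
    // line terminators, so the two tags may sit on different lines.
    private static final Pattern TITLE = Pattern.compile(
            "<title\\b[^>]*>(.*?)</title\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Reads every line into one String instead of inspecting lines one by one.
    static String readAll(BufferedReader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    // Returns the trimmed title text, or null when no title element is found.
    static String extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the real page source; the tags are split across lines.
        BufferedReader input = new BufferedReader(new StringReader(
                "<html><head><Title>\nthis_value_is_what_i_want\n</TITLE></head></html>"));
        String htmlContent = readAll(input);
        System.out.println(extractTitle(htmlContent)); // this_value_is_what_i_want
    }
}
```

Checking the return value for null before using it avoids the out of bounds error that substring throws when indexOf has returned -1. A regular expression like this is only a rough fit for well-behaved pages; it is not a full HTML parser.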