Re: Regular expression to read non-commented lines from a file



Mario Winterer wrote:

> Here's the code snippet that prints all lines that are not comments and
> do not start with whitespace.
> The input is a String (more general: CharSequence) containing the entire
> file content. DO NOT USE FOR LARGE FILES!
>
> /* BEGIN */
> String testLine = "This\n is\na\n#test";
>
> Pattern pattern = Pattern.compile("^[^\\s#].*$", Pattern.MULTILINE);
> Matcher matcher = pattern.matcher(testLine);
> while (matcher.find()) {
>     String l = matcher.group();
>     System.out.println(l);
> }
> /* END */
>
> In your case, it might be better to read lines using the
> BufferedReader's "readLine" method and just test if it starts with
> whitespace or "#":
>
> BufferedReader reader = new BufferedReader(new FileReader(yourFile));
> try {
>     String line = null;
>     while ((line = reader.readLine()) != null) {
>         /* skip line in case it is empty (is this correct?) */
>         if (line.length() == 0) continue;
>
>         char c = line.charAt(0);
>         if ((c == '#') || Character.isWhitespace(c)) continue;
>         System.out.println(line);
>     }
> } finally {
>     reader.close();
> }
>
> Best regards,
> Tex
>
> "Jonny" <www.mail@xxxxxxxxxxxx> wrote in message
> news:XKZKe.3247$2C5.797@xxxxxxxxxxxxxxxxxxxxxxx
> > Hi,
> >
> > I would like to use a regular expression in Java to read those lines
> > from a file which are not comments and do not start with whitespace.
> > Commented lines start with #
> >
> > Currently with grep, I am using the command:
> >
> > grep -E "^[^#\ \t]" myfile
> >
> > to get the lines I want, but I am having problems converting this
> > regular expression for use in Java. I don't get any lines returned.
> >
> > If I replace the above regular expression with ".*" in my Java code,
> > then all lines of myfile are returned, as you might expect, so it would
> > appear that the problem is only with the regular expression shown in the
> > above grep example.
> >
> > Please can you help.

Thanks for the comprehensive response, Mario. It is much appreciated.

I can see that I needed to use ^ and $, and also \\s for whitespace.
These were the two problems I was having.
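
For anyone reading this later, here is a minimal sketch of how the pattern
from Mario's reply can be wrapped in a helper. The class name
NonCommentLines and the method nonCommentLines are made up for
illustration. Note that Pattern.MULTILINE is what lets ^ and $ match at
each line boundary instead of only at the start and end of the whole
input, and that find() is used to step through the matches one by one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonCommentLines {
    // Hypothetical helper: collects every line that starts with neither
    // '#' nor a whitespace character. Empty lines are skipped as well,
    // because the character class must consume one character.
    static List<String> nonCommentLines(CharSequence text) {
        Pattern pattern = Pattern.compile("^[^\\s#].*$", Pattern.MULTILINE);
        Matcher matcher = pattern.matcher(text);
        List<String> result = new ArrayList<>();
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Same test input as in the quoted snippet.
        for (String line : nonCommentLines("This\n is\na\n#test")) {
            System.out.println(line);
        }
    }
}
```

With the test input from the quoted snippet, only "This" and "a" should
be collected: " is" starts with a space and "#test" is a comment.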

Incidentally, the file I am reading is very small, so I used the
following code to read the file:

String fileAsString =
    new Scanner(new File(myFile)).useDelimiter("\\A").next();

where myFile is the path to the file to be read.
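
A slightly more defensive variant of the same idea, as a sketch only: the
delimiter "\\A" matches only at the beginning of the input, so the first
token is the whole file. The class name ReadWholeFile and the method
readAll are made up for illustration. The sketch assumes the file is in
the platform default encoding, closes the Scanner when done, and returns
an empty string for an empty file (where next() on its own would throw
NoSuchElementException).

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ReadWholeFile {
    // Hypothetical helper: reads a small file into one String.
    static String readAll(File file) throws FileNotFoundException {
        // try-with-resources closes the Scanner (and the file) on exit.
        try (Scanner scanner = new Scanner(file)) {
            scanner.useDelimiter("\\A");
            // An empty file has no token at all, so check before next().
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) throws FileNotFoundException {
        // Usage: java ReadWholeFile path/to/file
        System.out.print(readAll(new File(args[0])));
    }
}
```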

Regards,
Jonny