Re: Regular expression to read non-commented lines from a file



Hi!

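Here's a code snippet that prints all lines that are not comments and do not start with whitespace. Empty lines are skipped as well, because the pattern needs at least one character.
The input is a String (more generally, a CharSequence) containing the entire file content, so everything is held in memory at once. DO NOT USE FOR LARGE FILES!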

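/* BEGIN */
// needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
String testLine = "This\n is\na\n#test";

// MULTILINE makes ^ and $ match at every line boundary,
// not only at the start and end of the whole input.
Pattern pattern = Pattern.compile("^[^\\s#].*$", Pattern.MULTILINE);
Matcher matcher = pattern.matcher(testLine);
while (matcher.find()) {
  String l = matcher.group();
  System.out.println(l); // prints "This", then "a"
}
/* END */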

In your case, it might be better to read the file line by line with BufferedReader's "readLine" method and simply test whether each line starts with whitespace or "#":

// needs: import java.io.BufferedReader; import java.io.FileReader;
BufferedReader reader = new BufferedReader(new FileReader(yourFile));
try {
 String line = null;
 while ((line = reader.readLine()) != null) {
   /* skip empty lines: grep's "^[^#\ \t]" needs one character, so it drops them too */
   if (line.length() == 0) continue;

   /* skip comment lines and lines that start with whitespace */
   char c = line.charAt(0);
   if ((c == '#') || Character.isWhitespace(c)) continue;
   System.out.println(line);
 }
} finally {
 reader.close();
}
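
If you would rather keep a regular expression, the two approaches can be combined: read line by line and test each line against a precompiled pattern. The following is only a sketch. The names LineFilter and filter are made up for illustration, and since I have not seen your Java code I can only guess at the cause: if it used matches(), that method requires the pattern to cover the entire line, so a pattern describing just the first character would reject almost everything. find() does not have that restriction. Note also that \s covers a few more characters than grep's space and tab.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class LineFilter {
    // Close to grep's "^[^# \t]": the first character is neither '#' nor whitespace.
    private static final Pattern WANTED = Pattern.compile("^[^#\\s]");

    /** Collects the lines whose first character is neither '#' nor whitespace. */
    static List<String> filter(BufferedReader reader) throws IOException {
        List<String> kept = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            // find() only needs the pattern to occur in the line; '^' pins it
            // to the start. Empty lines fail because one character is required.
            if (WANTED.matcher(line).find()) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        // Same test input as above, fed through a StringReader instead of a file.
        BufferedReader reader = new BufferedReader(new StringReader("This\n is\na\n#test"));
        for (String line : filter(reader)) {
            System.out.println(line);
        }
    }
}
```

For a real file, pass in new BufferedReader(new FileReader(yourFile)) and close the reader in a finally block as shown above.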

Best regards,
 Tex

"Jonny" <www.mail@xxxxxxxxxxxx> wrote in message news:XKZKe.3247$2C5.797@xxxxxxxxxxxxxxxxxxxxxxx
Hi,

I would like to use a regular expression in Java to read those lines
from a file which are not comments and do not start with whitespace.
Commented lines start with #

Currently with grep, I am using the command:

grep -E "^[^#\ \t]" myfile

to get the lines I want, but I am having problems converting this
regular expression for use in Java.  I don't get any lines returned.

If I replace the above regular expression with ".*" in my Java code,
then all lines of myfile are returned, as you might expect, so it would
appear that the problem is only with the regular expression shown in the
above grep example.

Please can you help.

Thanks,
Jonny
