Re: Why doesn't this work: matching capturing



In article
<B68EB32ADEE6D74594B63A7D2931F0BA075214@xxxxxxxxxxxxxxxxxxxxxxxxx>,
Kevin Zembower <kzembowe@xxxxxxxxxx> wrote:

I have a data file that looks like this:
uSF1 MD15000000 009214935522451020 9 0101001 ...
uSF1 MD15000000 009215035522451020 9 0101002 ...
uSF1 MD15000000 009215135522451020 9 0101003 ...

This is actually three lines that all start with 'uSF1'. This is the
Summary File from the US 2000 Census. I want to print all the census
tracts and blockgroup numbers for FIPS state code = "24" (Maryland) and
FIPS county code "510" (Baltimore City) for summary level '150'. These
are all fixed-length records. I tried:
[kevinz@www UScensus]$ perl -ne '($tract, $bg) = /^.{8}150.{18}24510.{21}(.{6})(.)/; print "Tract $tract BLKGRP $bg\n";' mdgeo.uf1 |head
Tract BLKGRP
Tract BLKGRP
Tract BLKGRP
<snip>

I thought that this would:
skip 8 characters and match '150'
skip 18 more characters and match '24' and '510'
skip 21 more characters and capture the next 6 in $tract
capture the next character in $bg
and print them.

The first two matches work, but nothing is captured. Any ideas what I'm
doing wrong?

It works for me:

% perl -ne '($t,$b)=m/^.{8}150.{18}24510.{21}(.{6})(.)/;print"Tract $t\tBLKGRP $b\n";' mdgeo.uf1
Tract 010100 BLKGRP 1
Tract 010100 BLKGRP 2
Tract 010100 BLKGRP 3

Perhaps your files do not contain what you think they do.
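
Note that the one-liner prints for every input line, whether or not the pattern matched, so any record that is not summary level 150 in 24510 shows up as an empty "Tract  BLKGRP" line. One way to check is to print only on a successful match and report the rest. This is a rough sketch, not a tested fix for your file: the helper name tract_and_bg and the sample record are made up for illustration, and the sample is just filler padded so the fields sit where the pattern expects them, not real census data.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: in list context a match returns its captures,
# and a failed match returns the empty list.
sub tract_and_bg {
    my ($line) = @_;
    return $line =~ /^.{8}150.{18}24510.{21}(.{6})(.)/;
}

# Synthetic record (filler 'x' characters, not census data).
my $sample = 'x' x 8 . '150' . 'x' x 18 . '24510' . 'x' x 21 . '0101001';

for my $line ($sample, "too short\n") {
    # The list assignment yields the number of captures, so the
    # condition is false when the pattern did not match.
    if (my ($tract, $bg) = tract_and_bg($line)) {
        print "Tract $tract BLKGRP $bg\n";
    }
    else {
        print "no match: $line";
    }
}
```

On the command line the same idea would be to put the print behind an "if" on the match and use $1 and $2.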

I would use the unpack function for this task (severe line wrap ahead):

#!/usr/local/bin/perl
use strict;
use warnings;

while(my $line = <DATA>) {
    my( $tract, $bg ) = unpack('x55 A6 A', $line);
    print "Tract $tract, BLKGRP $bg\n";
}
__DATA__
uSF1 MD15000000 009214935522451020 9
010100188722397N07209999900116759 0Block Group 1S 1158
662+39283007-076574503
uSF1 MD15000000 009215035522451020 9
010100288722397N07209999900109338 0Block Group 2S 842
547+39280857-076573636
uSF1 MD15000000 009215135522451020 9
010100388722397N07209999900182248 135142Block Group 3S 920
442+39279557-076574311


Output:

Tract 010100, BLKGRP 1
Tract 010100, BLKGRP 2
Tract 010100, BLKGRP 3
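
The script above prints every record it reads. If the real file mixes summary levels and counties, the same unpack call can also pull out the fields the pattern tests, so the filter becomes plain string comparisons. A sketch under assumptions: the offsets are the ones implied by the pattern in the original post (8, 3, 18, 2+3, 21, 6, 1), the name parse_record is hypothetical, and the sample records are filler, not census data.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns (tract, blockgroup) for summary level 150
# in state 24, county 510, and the empty list for anything else.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    # 'x' in unpack dies on a string that is too short, so guard first:
    # 8 + 3 + 18 + 2 + 3 + 21 + 6 + 1 = 62 characters are needed.
    return if length($line) < 62;
    my ($sumlev, $state, $county, $tract, $bg) =
        unpack('x8 A3 x18 A2 A3 x21 A6 A', $line);
    return unless $sumlev eq '150' && $state eq '24' && $county eq '510';
    return ($tract, $bg);
}

# Synthetic records: the first should pass, the second has summary level 140.
my @samples = (
    'x' x 8 . '150' . 'x' x 18 . '24510' . 'x' x 21 . '0101002',
    'x' x 8 . '140' . 'x' x 18 . '24510' . 'x' x 21 . '0101002',
);

for my $line (@samples) {
    my ($tract, $bg) = parse_record($line) or next;
    print "Tract $tract, BLKGRP $bg\n";
}
```

To run it against the real file, the for loop would be replaced with a while loop over <> and the file named on the command line.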

--
Jim Gibson
