Re: Why doesn't this work: matching capturing
- From: Jim Gibson <jimsgibson@xxxxxxxxx>
- Date: Tue, 26 Feb 2008 13:05:27 -0800
In article
<B68EB32ADEE6D74594B63A7D2931F0BA075214@xxxxxxxxxxxxxxxxxxxxxxxxx>,
Kevin Zembower <kzembowe@xxxxxxxxxx> wrote:
I have a data file that looks like this:
uSF1 MD15000000 009214935522451020 9 0101001 ...
uSF1 MD15000000 009215035522451020 9 0101002 ...
uSF1 MD15000000 009215135522451020 9 0101003 ...
This is actually three lines that all start with 'uSF1'. This is the
Summary File from the US 2000 Census. I want to print all the census
tracts and blockgroup numbers for FIPS state code = "24" (Maryland) and
FIPS county code "510" (Baltimore City) for summary level '150'. These
are all fixed-length records. I tried:
[kevinz@www UScensus]$ perl -ne '($tract, $bg) = /^.{8}150.{18}24510.{21}(.{6})(.)/; print "Tract $tract BLKGRP $bg\n";' mdgeo.uf1 | head
Tract BLKGRP
Tract BLKGRP
Tract BLKGRP
<snip>
I thought that this would:
skip 8 characters and match '150'
skip 18 more characters and match '24' and '510'
skip 21 more characters and capture the next 6 in $tract
capture the next character in $bg
and print them.
The first two matches work, but nothing is captured. Any ideas what I'm
doing wrong?
It works for me:
% perl -ne '($t,$b)=m/^.{8}150.{18}24510.{21}(.{6})(.)/;print"Tract $t\tBLKGRP $b\n";' mdgeo.uf1
Tract 010100 BLKGRP 1
Tract 010100 BLKGRP 2
Tract 010100 BLKGRP 3
Perhaps your files do not contain what you think they do.
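One way to check is to dump what actually sits at the columns the pattern expects. Here is a minimal sketch, assuming the records are plain fixed-width text. The helper name describe_record and both sample records are made up for illustration; they are not real census data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: report a record's length and the text found at the
# 0-based offsets the regex relies on (8, 29, 55 and 61).
sub describe_record {
    my ($line) = @_;
    ( my $rec = $line ) =~ s/\r?\n\z//;    # drop an LF or CRLF line ending
    # Pad short records with spaces so substr never reads past the end.
    my $padded = sprintf '%-62s', $rec;
    return {
        length       => length($rec),
        sumlev       => substr( $padded, 8,  3 ),
        state_county => substr( $padded, 29, 5 ),
        tract        => substr( $padded, 55, 6 ),
        bg           => substr( $padded, 61, 1 ),
    };
}

# Invented record laid out the way the regex assumes: 8 filler bytes, '150',
# 18 filler bytes, '24510', 21 filler bytes, a 6-byte tract, a 1-byte group.
my $sample = ( 'x' x 8 ) . '150' . ( 'x' x 18 ) . '24510'
           . ( 'x' x 21 ) . '010100' . '1' . "\n";

# The second record is deliberately too short, to show how the columns shift.
for my $rec ( $sample, "uSF1 MD15000000\n" ) {
    my $info = describe_record($rec);
    printf "len=%d sumlev=[%s] state+county=[%s] tract=[%s] bg=[%s]\n",
        @{$info}{qw(length sumlev state_county tract bg)};
}
```

If the lengths reported for the real file vary, or '150' does not show up in the sumlev column, the records are not laid out the way the pattern assumes.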
I would use the unpack function for this task (severe line wrap ahead):
#!/usr/local/bin/perl
use strict;
use warnings;
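while ( my $line = <DATA> ) {
    # x55 skips the first 55 bytes (8 + 3 + 18 + 5 + 21 in the regex);
    # A6 is the tract, A the block group.
    my ( $tract, $bg ) = unpack( 'x55 A6 A', $line );
    print "Tract $tract, BLKGRP $bg\n";
}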
__DATA__
uSF1 MD15000000 009214935522451020 9
010100188722397N07209999900116759 0Block Group 1S 1158
662+39283007-076574503
uSF1 MD15000000 009215035522451020 9
010100288722397N07209999900109338 0Block Group 2S 842
547+39280857-076573636
uSF1 MD15000000 009215135522451020 9
010100388722397N07209999900182248 135142Block Group 3S 920
442+39279557-076574311
Output:
Tract 010100, BLKGRP 1
Tract 010100, BLKGRP 2
Tract 010100, BLKGRP 3
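The script above prints every record, whereas the original one-liner only wanted summary level 150 in state 24, county 510. Below is a sketch of how one unpack call could carry that filter as well, assuming the offsets implied by the regex (8 + 3 + 18 + 5 + 21 = 55). The names tract_and_bg and fake_record and the sample records are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Template mirroring /^.{8}150.{18}24510.{21}(.{6})(.)/ :
#   x8  skip 8 bytes     A3  summary level
#   x18 skip 18 bytes    A5  state + county FIPS codes
#   x21 skip 21 bytes    A6  tract, A1 block group
my $TEMPLATE = 'x8 A3 x18 A5 x21 A6 A1';

# Returns (tract, block group) for matching records, an empty list otherwise.
sub tract_and_bg {
    my ($line) = @_;
    return if length($line) < 62;    # too short for the template
    my ( $sumlev, $fips, $tract, $bg ) = unpack $TEMPLATE, $line;
    return unless $sumlev eq '150' && $fips eq '24510';
    return ( $tract, $bg );
}

# Hypothetical record builder: dots stand in for the fields we skip.
sub fake_record {
    my ( $sumlev, $fips, $tract, $bg ) = @_;
    return ( '.' x 8 ) . $sumlev . ( '.' x 18 ) . $fips
         . ( '.' x 21 ) . $tract . $bg . "\n";
}

my @records = (
    fake_record( '150', '24510', '010100', '1' ),    # wanted
    fake_record( '140', '24510', '010100', '0' ),    # wrong summary level
    fake_record( '150', '24005', '400100', '2' ),    # different county
);

for my $line (@records) {
    my ( $tract, $bg ) = tract_and_bg($line) or next;
    print "Tract $tract, BLKGRP $bg\n";
}
```

Only the first invented record passes the filter. Note that the A format strips trailing spaces, so a blank block-group field would come back as an empty string.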
--
Jim Gibson