stripping out ASCII chars using regexp?
From: Greg (djbitchpimp_at_snowboard.com)
Date: 03/18/04
Date: 17 Mar 2004 15:23:48 -0800
I am trying to get rid of the ASCII chars from the end of a string
that I download from a webpage using LWP::Simple. The script downloads
the HTML from a webpage and then uses HTML::TableExtract to extract
the information from specific tables on the page.
This basically gives me a string like this:
���,Temper Tantrum�,Take Care Comb Your
Hair�,,,CD�,23.56��,�,0 Days
Ago�,Scotla�
which I then split into an array using:
my @line = split (',', $line) ;
I then do a comparison on $line [6]:
if ($line [6] >= 75) {
    # ... do something
}
When I run this using -w, I get the following warning:
Argument "15.99M- M- " isn't numeric in numeric ge (>=) at
./parse_wants.pl line 49.
This is because some extended ASCII chars somehow ended up at the end of
the string. If I do:
my @chars = split ('', $line [6]) ;
foreach my $char (@chars) {
print "$char ";
print ord ($char) ;
print "\n" ;
}
It gives me
1 49
5 53
. 46
9 57
9 57
� 160
� 160
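The same check can be narrowed to report only the characters outside 7-bit ASCII. This is a minimal sketch, not part of the original script: the sample value in $field is hypothetical and is built to match the dump above.

```perl
use strict;
use warnings;

# Hypothetical field value: a price followed by two chr(160) characters.
my $field = "15.99" . chr(160) x 2;

# Collect "position:code" for every character outside 7-bit ASCII.
my @suspects;
while ($field =~ /([^\x00-\x7F])/g) {
    push @suspects, sprintf("%d:%d", pos($field) - 1, ord($1));
}

print "@suspects\n";   # prints "5:160 6:160"
```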
I have tried stripping off these trailing chars (code 160, which is
outside 7-bit ASCII) in a number of ways:
s/\240//g
s/[\200-\377]//g
tr/\177-\377//d
s/\�//g
but the only way I could get rid of them was using:
chop $line [6] ;
chop $line [6] ;
Can anyone figure out a way to get rid of these trailing ASCII
characters using a regular expression?
Thanks
Greg
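For reference, here is a minimal sketch of a regular-expression version of the two chop calls. It rests on assumptions: the post does not show how the substitutions were bound, and a bare s/// operates on $_ rather than on $line [6], which may be why the attempts appeared to do nothing. The names $field and $clean are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical field value: a price followed by two chr(160) characters,
# as in the ord() dump in the post.
my $field = "15.99" . chr(160) x 2;

# Copy the field, then bind the substitution to the copy with =~ .
# The pattern removes any run of non-ASCII characters at the very end.
(my $clean = $field) =~ s/[^\x00-\x7F]+\z//;

# The cleaned value can now be compared numerically without a warning.
print "numeric value: ", $clean + 0, "\n";
```

To drop such characters anywhere in the string rather than only at the end, tr/\x80-\xFF//d bound to the same variable is an alternative, assuming the data holds no characters above code 255.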