Re: Extracting patterned filenames from [glob] without a loop - possible?
From: Bryan Oakley (bryan_at_bitmover.com)
Date: 12/31/03
- Next message: Pat Thoyts: "Re: Philosophy of design question [Was: Re: [Patch] Patch for windows XP themed buttons/checkbuttons/radiobuttons"
- Previous message: Don Porter: "Re: Extracting patterned filenames from [glob] without a loop - possible?"
- In reply to: Phil Powell: "Extracting patterned filenames from [glob] without a loop - possible?"
- Next in thread: Phil Powell: "Re: Extracting patterned filenames from [glob] without a loop - possible?"
- Reply: Phil Powell: "Re: Extracting patterned filenames from [glob] without a loop - possible?"
Date: Tue, 30 Dec 2003 23:53:43 GMT
Phil Powell wrote:
> Consider this code:
>
> set fileList [glob -nocomplain -- /my/directory/*.*]
>
> if {$willArchiveTransferredFiles && [string length $fileList] > 0} {
> # FILTER OUT TO ONLY HAVE FILES ENDING IN _[timestamp] AS THESE ARE
> GENERATED BACKUP FILES BY transfer.tcl TO BE ARCHIVED
> set blah [regsub -all
> {((/[a-zA-Z0-9\-_\.]+)*[a-zA-Z0-9\-\._%~\+]+_[0-9]+\.[a-zA-Z0-9]+)}
> $fileList "\\1 " fileList2]
> puts "blah = $blah"
> set fileList2 [string trim $fileList2]
> puts "fileList2: $fileList2"
> }
>
> The [regsub] does not work and I am absolutely unclear as to how to
> obtain the correct pattern to make it work.
A couple of obvious problems stem from the fact that you are using
string operations (string length, regsub) on what is clearly a list
($fileList). This often appears to work, since Tcl converts lists to
strings automatically, but it's generally a bad idea to mix strings and
lists this way. For one thing, you may get strings with extra quoting
characters (braces or backslashes) that you probably don't expect.
You'll almost certainly have a problem if any of your filenames have
spaces in them.
When you then convert the string back into a list, or simply use it as a
list and trust the implicit conversion, as I suspect you would, you're
asking for trouble.
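To make the quoting issue concrete, here is a minimal sketch. The file
names are made up for illustration; the point is only what happens when
a list containing a name with a space is viewed as a string.

```tcl
# Hypothetical file names; the second one contains a space.
set fileList [list /my/directory/a_123.txt "/my/directory/my file_456.txt"]

# List view: two elements, as expected.
puts [llength $fileList]

# String view: Tcl adds braces so the value can be parsed back as a list.
# Those braces are what a regsub over $fileList would see.
puts $fileList

# To test for "no files", count elements rather than characters.
if {[llength $fileList] > 0} {
    puts "have files"
}
```

Running this prints 2, then the two names with the second one wrapped
in braces, then "have files".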
>
> Bottom line is that I want to get every file name from $fileList that
> has this kind of pattern:
>
> /dir1/dir2/.../dirN/filename_1918272728722.[some extension]
>
> The filename will always contain the following:
> 1) directory
> 2) alphanumeric filename
> 3) an underscore separator
> 4) timestamp
> 5) .[ext]
>
> Thus far I am unsuccessful at extracting this exact pattern from
> $fileList. I am trying very hard to avoid using a loop since the
> number of files extracted from [glob] into $fileList could be in the
> hundreds and that will horrifically clog processing time on the
> production box by doing this.
This doesn't make sense. Tcl could easily loop through thousands of
filenames without 'horrifically clogging' a system. In fact, it should
be able to process tens of thousands of filenames in under a second. Why
do you think a loop will clog the system when processing only a few
hundred files? Are you working on a tiny system (say, a PDA or embedded
processor) or do you have some other constraint you haven't told us
about?
>
> I want to ensure that this code is as streamlined and clean as well as
> fully functional as it should be, but for now it fails and I am unsure
> as to how to fix it. Suggestions welcome.
Use a loop. A naive implementation I just wrote sped through a list of
50,000 filenames in about 800 milliseconds. Admittedly I have a fast
machine, but your parameters seem to imply you need to process only "in
the hundreds". Even on a slow box I'd guess processing several hundred
files would take just a fraction of a second.
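For illustration, a loop along these lines might look like the sketch
below. This is not the implementation timed above. The proc name
filterBackups, the sample paths, and the exact regular expression are
assumptions based on the pattern described in the question (alphanumeric
name, underscore, numeric timestamp, extension).

```tcl
# Keep only names of the form <name>_<digits>.<ext>.
# The pattern is an assumption taken from the description in the question.
proc filterBackups {fileList} {
    set result {}
    foreach f $fileList {
        # Match against the last path component only.
        if {[regexp {^[A-Za-z0-9]+_[0-9]+\.[A-Za-z0-9]+$} [file tail $f]]} {
            lappend result $f
        }
    }
    return $result
}

# Hypothetical sample: the first and third names should be kept.
set sample [list /dir1/dir2/report_1918272728722.txt \
                 /dir1/dir2/report.txt \
                 "/dir1/my dir/data_42.csv"]
puts [filterBackups $sample]

# Rough timing over 50,000 synthetic names; results vary by machine.
set big {}
for {set i 0} {$i < 50000} {incr i} {
    lappend big /tmp/file_$i.txt
}
puts [time {filterBackups $big}]
```

Because the names are handled with foreach and lappend, they stay list
elements throughout, so a name containing spaces survives intact.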