Re: Joining Big Files
- From: Paul McGuire <ptmcg@xxxxxxxxxxxxx>
- Date: Sat, 25 Aug 2007 18:48:33 -0700
On Aug 25, 8:15 pm, Paul McGuire <pt...@xxxxxxxxxxxxx> wrote:
On Aug 25, 4:57 am, mosscliffe <mcl.off...@xxxxxxxxxxxxxx> wrote:
I have 4 text files each approx 50mb.
<yawn> 50mb? Really? Did you actually try this and find out it was a
problem?
Try this:
import time
start = time.clock()
outname = "temp.dat"
outfile = file(outname,"w")
for inname in ['file1.dat', 'file2.dat', 'file3.dat', 'file4.dat']:
    infile = file(inname)
    outfile.write( infile.read() )
    infile.close()
outfile.close()
end = time.clock()
print end-start,"seconds"
For four 30 MB files, this takes just over 1.3 seconds on my system. (You
may need to open files in binary mode, depending on the contents, but
I was in a hurry.)
-- Paul
My bad, my test file was not a text file, but a binary file.
Retesting with a 50 MB text file took 24.6 seconds on my machine.
Still in your working range? If not, then you will need to pursue
more exotic approaches. But 25 seconds on an infrequent basis does
not sound too bad, especially since I don't think you will really get
any substantial boost from them. (As a benchmark, I timed a raw
"copy" command at the OS level on the resulting 200 MB file, and this
took about 20 seconds.)
Keep it simple.
-- Paul
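
For readers on a current Python, where the file() builtin and time.clock() used above are no longer available, the same concatenation might look roughly like the sketch below. It is not the code that was timed in this post. The names join_files, timed_join and chunk_size are made up for illustration, and opening the files in binary mode and copying in fixed-size chunks are assumptions (the post only mentions binary mode as something you may need).

```python
import shutil
import time


def join_files(in_names, out_name, chunk_size=1024 * 1024):
    """Append the raw bytes of each input file, in order, to out_name."""
    # Binary mode ("rb"/"wb") copies bytes unchanged, so no newline
    # translation or text decoding is applied to the contents.
    with open(out_name, "wb") as outfile:
        for in_name in in_names:
            with open(in_name, "rb") as infile:
                # copyfileobj reads and writes chunk_size bytes at a time,
                # so a whole input file is never held in memory at once.
                shutil.copyfileobj(infile, outfile, chunk_size)


def timed_join(in_names, out_name):
    """Run join_files and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    join_files(in_names, out_name)
    return time.perf_counter() - start
```

Calling timed_join(['file1.dat', 'file2.dat', 'file3.dat', 'file4.dat'], 'temp.dat') would give a figure comparable in spirit to the timings quoted above, though the numbers will depend on the machine and disk.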