Re: Creating huge data in very less time.



On Mar 31, 1:15 pm, Steven D'Aprano
<ste...@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 30 Mar 2009 22:44:41 -0700, venutaurus...@xxxxxxxxx wrote:
>> Hello all,
>>             I've a requirement where I need to create around 1000
>> files under a given folder with each file size of around 1GB. The
>> constraints here are each file should have random data and no two files
>> should be unique even if I run the same script multiple times.

> I don't understand what you mean. "No two files should be unique" means
> literally that only *one* file is unique, the others are copies of each
> other.
>
> Do you mean that no two files should be the same?

>> Moreover
>> the filenames should also be unique every time I run the script. One
>> possibility is that we can use Unix time format for the file names
>> with some extensions.

> That's easy. Start a counter at 0, and every time you create a new file,
> name the file by that counter, then increase the counter by one.
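
Something along those lines should work. A rough sketch (the timestamp
prefix and the ".dat" extension are only placeholders), combining a
per-run timestamp with the counter so names stay unique across runs as
well as within one run:

import time

def make_names(count):
    # one timestamp per run keeps different runs apart,
    # the counter keeps files within a run apart
    run_id = int(time.time())
    return ["%d_%04d.dat" % (run_id, i) for i in range(count)]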

>> Can this be done within a few minutes of time? Is it
>> possible only using threads, or can it be done in any other way? This
>> has to be done in Windows.

> Is it possible? Sure. In a couple of minutes? I doubt it. 1000 files of
> 1GB each means you are writing 1TB of data to a HDD. The fastest HDDs can
> reach about 125 MB per second under ideal circumstances, so that will
> take at least 8 seconds per 1GB file, or 8000 seconds in total. If you try
> to write them all in parallel, you'll probably just make the HDD waste
> time seeking backwards and forwards from one place to another.

> --
> Steven
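
Running Steven's numbers (assuming the same ~125 MB/s sustained write
rate he quotes):

# back-of-envelope: 1000 files of 1 GB at ~125 MB/s, ignoring seek overhead
total_mb = 1000 * 1024
seconds = total_mb / 125.0                  # ~8192 s
print("%.1f hours" % (seconds / 3600.0))    # roughly 2.3 hours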

That time is reasonable. The randomness should be such that no two files
have the same MD5 checksum. The main reason for generating such a huge
amount of data is stress testing of our product.
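
Something like the sketch below should satisfy both constraints. It is
only a rough outline: the folder path, file count and sizes are
placeholders, and it assumes the target folder already exists. Writing a
unique header before the random data ensures every file's content is
distinct, so in practice no two MD5 checksums can collide.

import os
import time

FOLDER = r"D:\testdata"                 # placeholder path
NUM_FILES = 1000
FILE_SIZE = 1024 * 1024 * 1024          # 1 GB per file
CHUNK = 1024 * 1024                     # write 1 MB at a time

def create_files():
    run_id = int(time.time())           # keeps names unique across runs
    for i in range(NUM_FILES):
        name = os.path.join(FOLDER, "%d_%04d.dat" % (run_id, i))
        f = open(name, "wb")
        try:
            # a unique first line makes every file's content distinct,
            # so their MD5 checksums will differ
            header = ("%d_%04d\n" % (run_id, i)).encode("ascii")
            f.write(header)
            written = len(header)
            while written < FILE_SIZE:
                block = os.urandom(min(CHUNK, FILE_SIZE - written))
                f.write(block)
                written += len(block)
        finally:
            f.close()

if __name__ == "__main__":
    create_files()

Note that os.urandom() itself may turn out to be the bottleneck rather
than the disk; a cheaper pseudo-random generator seeded per file would be
faster, and it is really the unique header that guarantees the checksums
differ.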