Populating a dictionary, fast
- From: Michael Bacarella <mbac@xxxxxxxxxxxxx>
- Date: Sat, 10 Nov 2007 13:56:35 -0800 (PST)
The id2name.txt file is an index of primary keys to strings. Each line looks like this:
11293102971459182412:Descriptive unique name for this record\n
950918240981208142:Another name for another record\n
The file's properties are:
# wc -l id2name.txt
8191180 id2name.txt
# du -h id2name.txt
517M id2name.txt
I'm loading the file into memory with code like this:
id2name = {}
for line in iter(open('id2name.txt').readline,''):
    id,name = line.strip().split(':')
    id = long(id)
    id2name[id] = name
This takes about 45 *minutes*.
If I comment out the last line in the loop body it takes only about 30 _seconds_ to run.
This would seem to implicate the line id2name[id] = name as being excruciatingly slow.
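For anyone who wants to reproduce the comparison without the real file, here is a minimal sketch of how the two runs (parse only versus parse plus insert) could be timed on synthetic data. The helper names make_lines, load and timed are hypothetical and exist only for this illustration, the generated names are invented, and the sketch assumes a current Python where int takes the place of long. It only measures the difference; it does not explain it.

```python
import time

def make_lines(n):
    # Hypothetical stand-in for id2name.txt: one "<integer id>:<name>\n" per line.
    return ["%d:Descriptive name for record %d\n" % (i * 7919 + 1, i)
            for i in range(n)]

def load(lines, store=True):
    """Parse every line; insert into the dict only when store is True."""
    id2name = {}
    for line in lines:
        # Split on the first colon only, so a colon inside a name is kept.
        key, name = line.rstrip("\n").split(":", 1)
        key = int(key)  # the original post uses long(id) on Python 2
        if store:
            id2name[key] = name
    return id2name

def timed(fn, *args, **kwargs):
    # Return the function's result together with elapsed wall-clock seconds.
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

if __name__ == "__main__":
    lines = make_lines(100000)
    _, parse_only = timed(load, lines, store=False)
    table, with_insert = timed(load, lines, store=True)
    print("parse only:  %.3f s" % parse_only)
    print("with insert: %.3f s" % with_insert)
    print("entries: %d" % len(table))
```

Raising the count passed to make_lines toward the real file's 8191180 lines would show whether the gap between the two timings grows out of proportion, assuming the machine has enough memory to hold the synthetic list and the dictionary at once.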
Is there a fast, functionally equivalent way of doing this?
(Yes, I really do need this cached. No, an RDBMS or disk-based hash is not fast enough.)