help in algorithm
- From: Paolino <paolo_veronelli@xxxxxxxxxx>
- Date: Wed, 10 Aug 2005 16:51:55 +0200
I have a self-organizing net whose aim is to cluster words. Suppose the clustering is based on each word's set of 2-grams. Words are then instances of this class:

class clusterable(str):
    def __abs__(self):  # the set of q-grams (to be calculated only once)
        return set([(self + self[0])[n:n + 2] for n in range(len(self))])

    def __sub__(self, other):  # the q-gram distance between 2 words
        set1 = abs(self)
        set2 = abs(other)
        return len(set1 | set2) - len(set1 & set2)

I'm looking for the medium of a set of words, that is, the word which minimizes the sum of the distances from those words.
In other words, the medium is the word that minimizes:

    sum([medium - word for word in words])

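To make the target concrete, here is a minimal brute-force sketch. It assumes the medium must be one of the given words (a medoid), which is a simplification: the true minimizer might be a word outside the set. The helper name `medoid` and the sample words are hypothetical, and the class is the one above, restated so the block runs on its own.

```python
class clusterable(str):
    def __abs__(self):
        # Cyclic 2-grams: the last character is paired with the first one.
        return set([(self + self[0])[n:n + 2] for n in range(len(self))])

    def __sub__(self, other):
        # Number of 2-grams that belong to exactly one of the two words.
        set1 = abs(self)
        set2 = abs(other)
        return len(set1 | set2) - len(set1 & set2)


def medoid(words):
    # Hypothetical helper: try every input word as the candidate medium
    # and keep the one with the smallest summed distance (O(n^2) distances).
    return min(words, key=lambda cand: sum([cand - word for word in words]))


# Illustrative data only, not taken from the original post.
words = [clusterable(w) for w in ["hello", "hallo", "hullo", "help"]]
best = medoid(words)
total = sum([best - word for word in words])
print(best, total)
```

This only ranks the words already in the set. Caching the result of `abs()` per word would avoid rebuilding the 2-gram sets on every comparison.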
Thanks for ideas, Paolino