Re: Case tagging and python




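I second the idea of just using the islower(), isupper(), and
istitle() string methods.
So, you could have a function - let's call it checkCase() - that
returns a string with the tag you want. Note that some words (mixed
case like 'iPhone', or tokens with no letters like '42') match none
of the three tests, so the function needs a fallback tag...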

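def checkCase(word):
    # Fallback for words that match none of the tests below
    # ('other' is just a placeholder name, pick whatever suits you)
    tag = 'other'
    if word.islower():
        tag = 'nocap'
    elif word.isupper():
        tag = 'allcaps'
    elif word.istitle():
        tag = 'cap'
    return tag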
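To see how the three string methods actually sort words, here is a small sketch that runs a few sample tokens through the same kind of function. The name check_case, the 'other' fallback tag, and the sample words are my own assumptions for illustration, not part of the original suggestion.

```python
def check_case(word, default='other'):
    """Return a case tag for word; `default` is an assumed fallback tag."""
    if word.islower():
        return 'nocap'
    if word.isupper():
        return 'allcaps'
    if word.istitle():
        return 'cap'
    # Mixed case ('iPhone'), no letters ('42'), or an apostrophe
    # followed by a lowercase letter ("Don't") all end up here.
    return default

# Hypothetical sample tokens chosen to hit each branch
samples = ['hello', 'NASA', 'Python', 'iPhone', '42', 'end.', "Don't", 'I']
for w in samples:
    print('%s/%s' % (w, check_case(w)))
```

Two things to watch: a one-letter capital such as 'I' comes out as 'allcaps' because isupper() is tested before istitle(), and trailing punctuation stays attached to the word since nothing here strips it.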

Then let's take an input file and pass every word through the
function...

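f = open('path:to:file', 'r')   # placeholder: put your input path here
corpus_text = f.read()
f.close()

tagged_words = []
all_words = corpus_text.split()

for w in all_words:
    tagtext = checkCase(w)
    tagged_words.append(w + '/' + tagtext)

# Join once at the end instead of growing a string inside the loop
tagged_corpus = ' '.join(tagged_words)

# placeholder: use a different path here, or you'll overwrite the input
output_file = open('path:to:file', 'w')
output_file.write(tagged_corpus)
output_file.close()
print('All Done!')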
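If you want something reusable, the read/tag/write steps above can be wrapped up roughly like this. It is only a sketch: tag_text, tag_file, the 'other' fallback tag, and the temporary file names are all hypothetical, and the with statement is used so the files get closed even if something goes wrong.

```python
import os
import tempfile

def check_case(word):
    """Same idea as checkCase(); 'other' is an assumed fallback tag."""
    if word.islower():
        return 'nocap'
    if word.isupper():
        return 'allcaps'
    if word.istitle():
        return 'cap'
    return 'other'

def tag_text(text):
    """Tag every whitespace-separated word as word/tag."""
    return ' '.join('%s/%s' % (w, check_case(w)) for w in text.split())

def tag_file(in_path, out_path):
    """Read in_path, tag its words, write the result to out_path."""
    with open(in_path, 'r') as src:
        tagged = tag_text(src.read())
    with open(out_path, 'w') as dst:
        dst.write(tagged)
    return tagged

# Small demo using a temporary directory instead of real corpus files
tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, 'corpus.txt')
out_path = os.path.join(tmp_dir, 'corpus.tagged.txt')
with open(in_path, 'w') as f:
    f.write('The NASA probe\nlanded safely')
print(tag_file(in_path, out_path))
```

Keeping tag_text() separate from the file handling means you can try the tagging on a plain string without touching the disk.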



Also, if you're doing natural language processing in Python, you
should get NLTK.
