Natural Language Toolkit – Tutorial 8

Lemmatizing

Lemmatizing is very similar to stemming. The key difference is that lemmatizing ends up at a real word (the lemma), whereas a stemmer can leave you with a chopped-off form that is not a word at all.

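from nltk.stem import WordNetLemmatizer

# Needs the WordNet corpus: run nltk.download("wordnet") once if it is missing.
lemmatizer = WordNetLemmatizer()

# No pos argument: each word is treated as a noun.
print(lemmatizer.lemmatize("cats"))
print(lemmatizer.lemmatize("cacti"))
print(lemmatizer.lemmatize("geese"))
print(lemmatizer.lemmatize("rocks"))
print(lemmatizer.lemmatize("python"))

# pos="a": treat the word as an adjective.
print(lemmatizer.lemmatize("better", pos="a"))
print(lemmatizer.lemmatize("best", pos="a"))

# "run" as a noun (default), then as a verb.
print(lemmatizer.lemmatize("run"))
print(lemmatizer.lemmatize("run", pos="v"))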

This gives the output:

galiquis@raspberrypi:$ python3 ./nltk_tutorial8.py
cat
cactus
goose
rock
python
good
best
run
run

Some points to note:

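  • lemmatize() takes an optional part-of-speech (POS) argument, pos:
    • pos="a" (or 'a') treats the word as an adjective
    • pos="v" (or 'v') treats the word as a verb
    • the default (no pos given) treats the word as a noun
  • This is why "better" with pos="a" comes back as "good", while "run" comes back as "run" both times: it is already the base form as a noun and as a verb.
  • Generally more accurate than stemming, because the result is a real word looked up in WordNet rather than a truncated string.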
