Lemmatizing
Lemmatizing is very similar to stemming. The key difference is that lemmatizing ends up at a real word (the lemma), whereas stemming can leave you with something that isn't a word at all.
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()

print(lemmatizer.lemmatize("cats"))
print(lemmatizer.lemmatize("cacti"))
print(lemmatizer.lemmatize("geese"))
print(lemmatizer.lemmatize("rocks"))
print(lemmatizer.lemmatize("python"))
print(lemmatizer.lemmatize("better", pos="a"))
print(lemmatizer.lemmatize("best", pos="a"))
print(lemmatizer.lemmatize("run"))
print(lemmatizer.lemmatize("run",'v'))
This gives the output:
galiquis@raspberrypi:$ python3 ./nltk_tutorial8.py
cat
cactus
goose
rock
python
good
best
run
run
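Some points to note:
- lemmatize() takes an optional part-of-speech (POS) parameter, pos, so:
  - pos="a" (or just 'a' as the second argument) treats the word as an adjective
  - pos="v" (or 'v') treats the word as a verb
  - the default (no option) treats the word as a noun
- Lemmatizing is generally more powerful than stemming, because it looks words up in a dictionary (WordNet here) rather than just trimming suffixes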
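To make the points above a bit more concrete, here is a rough toy sketch using only plain Python. It is not how NLTK works internally: `crude_stem`, `toy_lemmatize` and the `TOY_LEMMAS` table are made-up names for illustration, and the table only knows the handful of words typed into it. It just shows the two ideas: trimming suffixes can give a non-word, and a lookup keyed on word plus POS gives a different answer depending on the tag.

```python
# Toy illustration only. These helpers are hypothetical and are NOT
# part of NLTK; they just mimic the general idea on a few words.

# Hand-written lookup table keyed by (word, pos), using the same
# single-letter POS tags as above: "n" noun, "v" verb, "a" adjective.
TOY_LEMMAS = {
    ("geese", "n"): "goose",
    ("cacti", "n"): "cactus",
    ("better", "a"): "good",
    ("running", "v"): "run",
}


def crude_stem(word):
    """Very crude suffix trimming (not the Porter algorithm)."""
    for suffix in ("ing", "es", "s"):
        # Only trim if a reasonable amount of the word is left over.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word, pos="n"):
    """Look up (word, pos); return the word unchanged if it is not found."""
    return TOY_LEMMAS.get((word, pos), word)


print(crude_stem("running"))             # runn  (not a real word)
print(toy_lemmatize("running", "v"))     # run
print(toy_lemmatize("geese"))            # goose
print(toy_lemmatize("better"))           # better (looked up as a noun, no match)
print(toy_lemmatize("better", pos="a"))  # good
```

The default of `pos="n"` in the toy mirrors the point above: if the word isn't really a noun, you only get its base form by passing the right tag.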