Language | spaCy model in AVOBMAT | Lemmatization (spaCy) | Lemmatization (Lemmagen) | Named entity recognition | Named entity linking & disambiguation | Part-of-speech tagging | ||||
Small | Medium | Large | Transformer | |||||||
Languages currently supported by spaCy models | Catalan | ✅ | ✅ | ✅ | ✅ | ✅ | ||||
Chinese | ✅ | ✅ | ✅ | Coming soon | Coming soon | ✅ | ||||
Croatian | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Danish | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Dutch | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
English | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |||
Finnish | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
French | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
German | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ||||
Greek | ✅ | ✅ | ✅ | ✅ | ||||||
Italian | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ||||
Japanese | ✅ | ✅ | ✅ | Coming soon | Coming soon | |||||
Korean | ✅ | ✅ | ✅ | Coming soon | ✅ | |||||
Lithuanian | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Macedonian | ✅ | ✅ | ✅ | ✅ | ||||||
Multilanguage | ✅ | ✅ | ||||||||
Norwegian | ✅ | ✅ | ✅ | Coming soon | ||||||
Polish | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Portuguese | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Romanian | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Russian | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Slovenian | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ||||
Spanish | ✅ | ✅ | ✅ | ✅ | ✅ | |||||
Swedish | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ||||
Ukrainian | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ||||
Hungarian | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ||||
Current Lemmagen support | Slovenian | ✅ | ||||||||
Serbian | ✅ | |||||||||
Italian | Coming soon | |||||||||
Romanian | ✅ | |||||||||
Czech | ✅ | |||||||||
Bulgarian | ✅ | |||||||||
Estonian | ✅ |
*AVOBMAT can identify the language of a text, out of 52 supported languages, before further processing.