Features - AVOBMAT

The Features of AVOBMAT

Text and data mining large library & research databases

Content analysis

Interactive metadata analysis & visualizations

19 preprocessing options

Topic modeling, correlations, visualizations

Network analysis

24 available languages

N-gram viewer

Over 200 metadata fields

Individual configuration of all analytical tools

TagSpheres (context of a word)

Metadata enrichment

TEI XML import

Significant text analysis (comparing corpora)

Automatic language detection

Fast advanced, fuzzy, proximity and command-line searches

Keyword in Context / Concordance

Gender analysis of authors

Named Entity Recognition

Named Entity Linking (to Wikidata) and Disambiguation

Reproducibility (import/export)

Part-of-speech tagging

8 lexical diversity metrics

Zotero import capability

	Language	spaCy model in AVOBMAT				Lemmatization (spaCy)	Lemmatization (Lemmagen)	Named entity recognition	Named entity linking & disambiguation	Part-of-speech tagging
	Language	Small	Medium	Large	Transformer	Lemmatization (spaCy)	Lemmatization (Lemmagen)	Named entity recognition	Named entity linking & disambiguation	Part-of-speech tagging
Currently supported spaCy model languages	Catalan	✅		✅	✅	✅		✅
	Chinese (Mandarin)	✅		✅	✅			Coming soon	Coming soon	✅
	Croatian	✅		✅		✅		✅		✅
	Danish	✅		✅	✅	✅		✅
	Dutch	✅		✅		✅		✅		✅
	English	✅		✅	✅	✅		✅	✅	✅
	Finnish	✅		✅		✅		✅		✅
	French	✅		✅		✅		✅	✅
	German	✅		✅		✅		✅	✅	✅
	Greek	✅		✅		✅		✅
	Italian	✅		✅		✅		✅	✅	✅
	Japanese	✅		✅	✅			Coming soon	Coming soon
	Korean	✅		✅		✅		Coming soon		✅
	Lithuanian	✅		✅		✅		✅		✅
	Macedonian	✅		✅		✅		✅
	Multilanguage	✅						✅
	Norwegian	✅		✅		✅		Coming soon
	Polish	✅		✅		✅		✅		✅
	Portuguese	✅		✅		✅		✅	✅
	Romanian	✅		✅		✅		✅		✅
	Russian	✅		✅		✅		✅	✅
	Slovenian	✅		✅	✅	✅		✅		✅
	Spanish	✅		✅		✅		✅	✅
	Swedish	✅		✅		✅		✅	✅	✅
	Ukrainian	✅		✅	✅	✅		✅	✅
	Hungarian		✅	✅	✅	✅		✅		✅
Current Lemmagen support	Slovenian						✅
	Serbian						✅
	Romanian						✅
	Czech						✅
	Bulgarian						✅
	Estonian						✅
Detected languages	Afrikaans
	Albanian
	Amharic
	Ancient Greek
	Arabic
	Armenian
	Azerbaijani
	Basque
	Bengali
	Bulgarian
	Czech
	Estonian
	Faroese
	Gujarati
	Hebrew
	Hindi
	Icelandic
	Indonesian
	Irish
	Kannada
	Kyrgyz
	Latin
	Latvian
	Ligurian
	Lower Sorbian
	Luganda
	Luxembourgish
	Malay
	Malayalam
	Marathi
	Nepali
	Norwegian Nynorsk
	Persian
	Sanskrit
	Serbian
	Setswana
	Sinhala
	Slovak
	Tagalog
	Tamil
	Tatar
	Telugu
	Thai
	Tigrinya
	Turkish
	Upper Sorbian
	Urdu
	Vietnamese
	Yoruba

*AVOBMAT can detect the language of a text, from among 52 languages, before further processing.
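
As a rough illustration of how a detected language might be used to pick a language-specific pipeline, here is a minimal Python sketch. It is not AVOBMAT's actual implementation: the `choose_pipeline` function, the `SMALL_MODELS` mapping, and the fallback to the multilanguage model are all assumptions made for this example. The pipeline names are spaCy's published small-model package names for a handful of the languages in the table above.

```python
from typing import Optional

# Hypothetical lookup from ISO 639-1 language codes to spaCy small pipelines.
# Only a few of the supported languages are listed to keep the sketch short.
SMALL_MODELS = {
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
    "fr": "fr_core_news_sm",
    "it": "it_core_news_sm",
    "es": "es_core_news_sm",
}

# spaCy's multi-language pipeline (named entity recognition only).
MULTILANGUAGE_MODEL = "xx_ent_wiki_sm"


def choose_pipeline(detected_language: Optional[str]) -> str:
    """Return a pipeline name for a detected language code.

    Assumption for this sketch: languages without a dedicated pipeline,
    or texts whose language could not be detected, fall back to the
    multi-language model.
    """
    if detected_language is None:
        return MULTILANGUAGE_MODEL
    return SMALL_MODELS.get(detected_language.lower(), MULTILANGUAGE_MODEL)


if __name__ == "__main__":
    for code in ("en", "DE", "tr", None):
        print(code, "->", choose_pipeline(code))
```

In practice the returned name would be passed to `spacy.load`, provided the corresponding model package is installed.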

AVOBMAT 2024©