Language Resources Available Online

The Index Thomisticus Treebank contains Latin texts by Thomas Aquinas (Medieval Latin) enriched with complex, interlinked morphological, syntactic (around 450,000 nodes and more than 26,000 sentences), and semantic/pragmatic annotation (around 28,000 nodes and 2,000 sentences). The texts are taken from the Index Thomisticus corpus. Built by Father Roberto Busa SJ, the Index Thomisticus is considered a pioneering resource in humanities computing and computational linguistics.


Developed at the Perseus Digital Library (Tufts University, Boston, MA, USA), the Latin Dependency Treebank includes around 53,000 syntactically annotated nodes excerpted from texts by Classical Latin authors. Its syntactic annotation guidelines are the same as those for the corresponding annotation layer of the Index Thomisticus Treebank.


The Latin valency lexicon (called VALLEX) has been built in close connection with the semantic/pragmatic annotation of the Index Thomisticus Treebank and the Latin Dependency Treebank. VALLEX contains around 2,500 valency frames for more than 1,000 lexical entries.


IT-VaLex is a collection of verbal lexical entries enriched with valency and subcategorization frames at the syntactic level. It is closely related to the Index Thomisticus Treebank project: it is a corpus-driven valency lexicon automatically induced from the treebank's syntactic annotation layer.