This page refers to the dataset presented in the paper:
It is MarkIT That is New: An Italian Treebank of Marked Constructions. Teresa Paccosi, Alessio Palmero Aprosio and Sara Tonelli. To appear in Proceedings of the Eighth Italian Conference on Computational Linguistics 2022 (CLiC-it 2021).
The MarkIT resource contains around 800 sentences extracted from students' essays and manually annotated with syntactic dependencies. The treebank covers seven types of marked constructions, plus some ambiguous sentences whose syntax can be wrongly classified as marked.
MarkIT is a treebank of marked constructions in Italian, containing around 800 sentences with dependency annotation. The sentences were first annotated automatically with Tint; the errors were then corrected manually across the whole dataset. The resource covers seven types of marked constructions, plus some ambiguous sentences whose syntax can be wrongly classified as marked.
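As a rough illustration of how the dependency annotation can be inspected, the sketch below reads sentences in CoNLL-U, the 10-column tab-separated format used by Universal Dependencies, and counts the dependency relations. It assumes the treebank is distributed in that format. The toy sentence and its annotation are hand-made for this example and are not taken from MarkIT, and the file name `markit.conllu` in the final comment is hypothetical.

```python
from collections import Counter


def read_conllu(lines):
    """Yield each sentence as a list of token rows (lists of column values)."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif not line.startswith("#"):
            # Lines starting with '#' are sentence-level comments.
            sentence.append(line.split("\t"))
    if sentence:
        yield sentence


def deprel_counts(sentences):
    """Count the DEPREL column (index 7) over ordinary tokens."""
    counts = Counter()
    for sentence in sentences:
        for cols in sentence:
            # Skip multiword ranges such as "1-2" and empty nodes such as "1.1".
            if cols[0].isdigit():
                counts[cols[7]] += 1
    return counts


# Toy left-dislocation sentence, invented for illustration only.
ROWS = [
    ("1", "Il", "il", "DET", "_", "_", "2", "det", "_", "_"),
    ("2", "libro", "libro", "NOUN", "_", "_", "5", "dislocated", "_", "_"),
    ("3", "l'", "lo", "PRON", "_", "_", "5", "obj", "_", "_"),
    ("4", "ho", "avere", "AUX", "_", "_", "5", "aux", "_", "_"),
    ("5", "letto", "leggere", "VERB", "_", "_", "0", "root", "_", "_"),
    ("6", ".", ".", "PUNCT", "_", "_", "5", "punct", "_", "_"),
]
SAMPLE = "# text = Il libro l'ho letto.\n" + "\n".join("\t".join(r) for r in ROWS) + "\n\n"

sentences = list(read_conllu(SAMPLE.splitlines()))
counts = deprel_counts(sentences)
print(len(sentences), dict(counts))

# With a real file (hypothetical name):
# with open("markit.conllu", encoding="utf-8") as f:
#     counts = deprel_counts(read_conllu(f))
```

Counting relations such as `dislocated` in this way gives a quick, approximate view of which constructions occur in a file; it is not the procedure the authors used to select or classify sentences.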
The selection, extraction, and annotation of the dataset have been performed by Teresa Paccosi, Alessio Palmero Aprosio, and Sara Tonelli.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD vX.X
License: CC BY 4.0
Includes text: yes
Genre: learner-essays
Lemmas: automatic with corrections
UPOS: automatic with corrections
XPOS: automatic with corrections
Features: automatic with corrections
Relations: manual native
Contributors: Paccosi, Teresa; Palmero Aprosio, Alessio; Tonelli, Sara
Contributing: elsewhere
Contact: [email protected]
===============================================================================