Alksnis
File name | Comment | Version | Upload date | Size | Download |
---|---|---|---|---|---|
Alksnio_3.0_pastoviuju_junginiu_anotavimo_gaires.pdf | Not specified | Not specified | 2022-11-04 09:36:52 | 1.396 MB | Download |
Alksnio_3.0_sandara.docx | Not specified | Not specified | 2022-11-04 09:36:48 | 0.054 MB | Download |
Alksnis-3.0.zip | The Lithuanian dependency treebank ALKSNIS v3.0 | V.3.0 | 2022-11-04 09:36:50 | 2.17 MB | Download |
Jablonskis-LT.pdf | Not specified | Not specified | 2022-11-04 09:36:51 | 0.195 MB | Download |
Alksnis (GitHub version) | Not specified | V.3.0 | 2019-10-07 | Not specified | View |
Summary
The Lithuanian dependency treebank ALKSNIS v3.0 (Vytautas Magnus University). The update from v2.1 to v3.0 was developed during the project “Semantika2” (No. 02.3.1-CPVA-V-527-01-0002).
Introduction
This is a new, corrected and enhanced version of the ALKSNIS Lithuanian treebank. It is annotated in a style derived from the Prague Dependency Treebank of Czech. The previous version, ALKSNIS v2.1, consists of 2,355 syntactically annotated sentences. Each node of a tree corresponds to a word, a punctuation mark or another text element (symbol, digit, etc.) within a sentence. ALKSNIS v2.1 is published in the CLARIN-LT repository at http://hdl.handle.net/20.500.11821/10. (Some users experience DNS errors when trying to access the repository; configuring the client machine to use 8.8.8.8 as the DNS server may help. See also http://clarin-lt.lt/?page_id=86.) ALKSNIS v2.1 uses a version of the MULTEXT-East tag set (http://nl.ijs.si/ME/V4/msd/html/index.html). The following information is given for each node: 1) the word form; 2) the lemma; 3) a morphology tag; and 4) a syntactic function (subject, object, etc.). Dependencies are shown by links between words. ALKSNIS v3.0 was developed from v2.1 during the Vytautas Magnus University project “Semantika2” (No. 02.3.1-CPVA-V-527-01-0002). It consists of 3,643 syntactically annotated sentences.
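The per-node information listed above can be pictured as a small record with a link to its governing node. The following is only a minimal sketch in Python: the class name `Node`, the helper `dependents`, the example sentence, and all tag and function values are hypothetical placeholders, not actual ALKSNIS annotations or tagset values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One tree node: a word, punctuation mark, or other text element."""
    index: int            # 1-based position in the sentence
    form: str             # word form as it appears in the text
    lemma: str            # lemma
    tag: str              # morphology tag (placeholder values below)
    function: str         # syntactic function (placeholder values below)
    head: Optional[int]   # index of the governing node, None for the root


def dependents(sentence: List[Node], head_index: int) -> List[Node]:
    """Return the nodes directly governed by the node at head_index."""
    return [node for node in sentence if node.head == head_index]


# Hypothetical sentence "Jonas skaito knygą." ("Jonas reads a book.")
sentence = [
    Node(1, "Jonas", "Jonas", "noun", "subject", 2),
    Node(2, "skaito", "skaityti", "verb", "predicate", None),
    Node(3, "knygą", "knyga", "noun", "object", 2),
    Node(4, ".", ".", "punct", "punctuation", 2),
]

# List everything that depends directly on the verb.
for node in dependents(sentence, 2):
    print(node.form, node.function)
```

Storing only the index of the head keeps each node flat and lets the whole tree be recovered from a plain list, which is one common way to hold dependency annotations in memory.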
Modifications from v2.1 to v3.0 (2019-07-08)
The older version underwent a full review of its syntactic information, based on improved guidelines, to enhance annotation quality.
New layer added: non-compositional multiword expressions (light verbs and idioms).
Added new data: scientific abstracts and reviews, additional administrative texts.
Schema version changed to 3.0.
The Jablonskis tagset, which is human-friendly, is used instead of the MULTEXT-East tagset.
Some syntactic relations were corrected or modified (details to be published in the improved guidelines).
CoNLL-U files are provided together with the PML files (RMQ: the CoNLL-U files do not keep the mwe field).
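For readers who want to work with the CoNLL-U files, a minimal reader can be sketched with the Python standard library alone. It relies only on the general CoNLL-U layout (ten tab-separated columns per token, `#` comment lines, blank lines between sentences). The function name `read_conllu`, the sample sentence, and its label values are hypothetical, and how the ALKSNIS files fill each column is not assumed here. Since the CoNLL-U files do not keep the mwe field, the sketch does not look for it.

```python
from typing import Dict, Iterable, Iterator, List

# Column names of the general CoNLL-U format, in order.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def read_conllu(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield each sentence as a list of token dictionaries."""
    sentence: List[Dict[str, str]] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):
            # Sentence-level comment lines are skipped in this sketch.
            continue
        else:
            sentence.append(dict(zip(CONLLU_FIELDS, line.split("\t"))))
    if sentence:
        yield sentence


# Hypothetical sample; "_" marks an unspecified value in CoNLL-U.
rows = [
    ["1", "Jonas", "Jonas", "_", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "skaito", "skaityti", "_", "_", "_", "0", "root", "_", "_"],
    ["3", "knygą", "knyga", "_", "_", "_", "2", "obj", "_", "_"],
    ["4", ".", ".", "_", "_", "_", "2", "punct", "_", "_"],
]
sample = "# text = Jonas skaito knygą.\n"
sample += "\n".join("\t".join(row) for row in rows) + "\n\n"

sentences = list(read_conllu(sample.splitlines()))
for token in sentences[0]:
    print(token["form"], token["lemma"], token["head"], token["deprel"])
```

To read a real file, an open file object (opened with `encoding="utf-8"`) can be passed to `read_conllu` in place of the sample lines; counting the yielded sentences is a quick sanity check against the size stated for the treebank.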
Content:
Alksnis-3.0.zip – the Lithuanian dependency treebank files.
Jablonskis-LT.pdf – the morphological annotation standard used in ALKSNIS.
Alksnio_3.0_sandara.docx – the structure of the ALKSNIS v3.0 files.
Acknowledgments
The update from v2.1 to v3.0 was developed during the project “Semantika2” (No. 02.3.1-CPVA-V-527-01-0002). The project was funded by the European Structural Funds.
References
For ALKSNIS v2.1: Agnė Bielinskienė, Loïc Boizou, Jolanta Kovalevskaitė, Erika Rimkutė (2016): Lithuanian Dependency Treebank ALKSNIS. In: I. Skadiņa and R. Rozis (Eds.): Human Language Technologies – The Baltic Perspective, pp. 107–114. Amsterdam: IOS Press. doi:10.3233/978-1-61499-701-6-107. Available at http://fcim.vdu.lt/~erika_rimkute/straipsniai/Alksnis_HLT.pdf and http://ebooks.iospress.nl/volumearticle/45523
This item is for Academic Use and licensed under: ACA_CLARIN-LT_End-User-Licence-Agreement_EN-LT