Alksnis

File name | Comment | Version | Upload date | Size
Alksnio_3.0_pastoviuju_junginiu_anotavimo_gaires.pdf | Not specified | Not specified | 2022-11-04 09:36:52 | 1.396MB
Alksnio_3.0_sandara.docx | Not specified | Not specified | 2022-11-04 09:36:48 | 0.054MB
Alksnis-3.0.zip | The Lithuanian dependency treebank ALKSNIS v3.0 | V.3.0 | 2022-11-04 09:36:50 | 2.17MB
Jablonskis-LT.pdf | Not specified | Not specified | 2022-11-04 09:36:51 | 0.195MB
Alksnis (GitHub version) | Not specified | V.3.0 | 2019-10-07 | Not specified
Semantika.lt project results / 2019-10-07
Description

Summary

The Lithuanian dependency treebank ALKSNIS v3.0 (Vytautas Magnus University). The update from v2.1 to v3.0 was developed during the project “Semantika2” (Nr. 02.3.1-CPVA-V-527-01-0002).

Introduction

This is a new, corrected and enhanced version of the ALKSNIS Lithuanian treebank. It is annotated in a style derived from the Prague Dependency Treebank of Czech. The previous version, ALKSNIS v2.1, consists of 2,355 syntactically annotated sentences. Each node of a tree corresponds to a word, a punctuation mark or another text element (symbol, digit, etc.) within a sentence. ALKSNIS v2.1 is published in the CLARIN-LT repository at http://hdl.handle.net/20.500.11821/10. (Some users experience DNS errors when trying to access the repository; configuring the client machine to use 8.8.8.8 as the DNS server may help. See also http://clarin-lt.lt/?page_id=86.) ALKSNIS v2.1 uses a version of the MULTEXT-East tag set (http://nl.ijs.si/ME/V4/msd/html/index.html). The following information is given for each node: 1) the word form as used in the text; 2) a lemma; 3) a morphological tag; and 4) a syntactic function (subject, object, etc.). Dependencies are shown as links between words. ALKSNIS v3.0 was developed from v2.1 during the Vytautas Magnus University project “Semantika2” (Nr. 02.3.1-CPVA-V-527-01-0002). It consists of 3,643 syntactically annotated sentences.
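The per-node information described above can be pictured as a small record type. The following is only a minimal sketch, not the treebank's actual schema: the class name `Node`, its field names, the example sentence, the placeholder tags (`TAG1` etc.) and the attachment of the final punctuation mark are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One tree node: a word, punctuation mark or other text element."""
    index: int            # 1-based position in the sentence (illustrative)
    form: str             # word form as used in the text
    lemma: str            # dictionary form
    tag: str              # morphological tag (placeholder values below)
    function: str         # syntactic function, e.g. subject or object
    head: Optional[int]   # index of the governing node; None for the root


def children(nodes: List[Node], head_index: int) -> List[Node]:
    """Return the nodes that depend directly on the node at head_index."""
    return [n for n in nodes if n.head == head_index]


# Invented example sentence; tags and attachments are placeholders,
# not values taken from ALKSNIS.
sentence = [
    Node(1, "Vaikas", "vaikas", "TAG1", "subject", 2),
    Node(2, "skaito", "skaityti", "TAG2", "predicate", None),
    Node(3, "knygą", "knyga", "TAG3", "object", 2),
    Node(4, ".", ".", "TAG4", "punctuation", 2),
]

root = next(n for n in sentence if n.head is None)
dependents = [n.form for n in children(sentence, root.index)]
```

Storing the head as an index keeps each node flat and lets the dependency links be recovered with a simple filter, at the cost of a linear scan per lookup.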
Modifications from v2.1 to v3.0 (2019-07-08)

The older version underwent a full review of its syntactic information, based on improved guidelines, to enhance annotation quality.
A new layer was added: non-compositional multiword expressions (light verbs and idioms).
New data were added: scientific abstracts and reviews, and additional administrative texts.
The schema version was changed to 3.0.
The Jablonskis tagset, which is human-friendly, is used instead of the MULTEXT-East tagset.
Some syntactic relations were corrected or modified (details to be published in the improved guidelines).
CoNLL-U files were added alongside the PML files (RMQ: the CoNLL-U files do not keep the mwe field).
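For readers who want to work with the CoNLL-U files, the sketch below shows one way to read such data with the Python standard library only. It assumes the files follow the standard ten-column, tab-separated CoNLL-U layout with `#` comment lines and blank lines between sentences; the function name `read_conllu`, the field names and the sample sentence are illustrative and are not taken from the archive.

```python
from typing import Dict, Iterator, List

# Column names of the standard CoNLL-U layout (lowercased for use as dict keys).
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]


def read_conllu(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield each sentence as a list of token dictionaries."""
    sentence: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comment or metadata line; skipped in this sketch.
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 tab-separated columns: {line!r}")
        sentence.append(dict(zip(CONLLU_FIELDS, columns)))
    if sentence:
        # Input that does not end with a blank line still yields its last sentence.
        yield sentence


# Invented sample; "_" marks fields left unspecified on purpose.
rows = [
    ["1", "Vaikas", "vaikas", "_", "_", "_", "2", "_", "_", "_"],
    ["2", "skaito", "skaityti", "_", "_", "_", "0", "_", "_", "_"],
    ["3", "knygą", "knyga", "_", "_", "_", "2", "_", "_", "_"],
    ["4", ".", ".", "_", "_", "_", "2", "_", "_", "_"],
]
SAMPLE = "# text = Vaikas skaito knygą.\n" + "\n".join("\t".join(r) for r in rows) + "\n\n"

sentences = list(read_conllu(SAMPLE))
forms = [token["form"] for token in sentences[0]]
```

To process a real file, the same function could be called on the decoded contents of a file from the archive, for example `read_conllu(open(path, encoding="utf-8").read())`, where `path` is whatever location the archive was extracted to.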

Content:

Alksnis-3.0.zip – the Lithuanian dependency treebank files.
Jablonskis-LT.pdf – the morphological annotation standard used in ALKSNIS.
Alksnio_3.0_sandara.docx – the structure of the ALKSNIS v3.0 files.

Acknowledgments

The update from v2.1 to v3.0 was developed during the project “Semantika2” (Nr. 02.3.1-CPVA-V-527-01-0002). The project was funded by the European Structural Funds.
References

For ALKSNIS v.2.1: Agnė Bielinskienė, Loïc Boizou, Jolanta Kovalevskaitė, Erika Rimkutė (2016). Lithuanian Dependency Treebank ALKSNIS. In: I. Skadiņa and R. Rozis (Eds.), Human Language Technologies – The Baltic Perspective, pp. 107–114. Amsterdam: IOS Press. doi:10.3233/978-1-61499-701-6-107. http://fcim.vdu.lt/~erika_rimkute/straipsniai/Alksnis_HLT.pdf, http://ebooks.iospress.nl/volumearticle/45523

Contact person
A. Utka
Licence description

This item is for Academic Use and licensed under: ACA_CLARIN-LT_End-User-Licence-Agreement_EN-LT