rogierkraf 2013-11-30+02:00 clarin.eu:cr1:p_1342181139640 CLARIN Netherlands
SearchPage http://opensonar.inl.nl OpenSONAR OpenSONAR: a 500 MW reference corpus of Contemporary Written Dutch 2013 http://opensonar.inl.nl Dutch Language Institute http://portal.clarin.nl/node/4195 published 2014-10-06 CLARIN-NL CLARIN in the Netherlands 184.021.003 NWO http://www.clarin.nl Jan Odijk National Coordinator
Utrecht, the Netherlands
j.odijk@uu.nl UiL-OTS Utrecht University
2009 2015
CLARIAH-CORE Common Lab Research Infrastructure for the Arts and the Humanities 184.033.101 NWO http://www.clariah.nl Jan Odijk National Coordinator
Utrecht, the Netherlands
j.odijk@uu.nl UiL-OTS Utrecht University
2015 2018
Netherlands NL SoNaR is a 500-million-word reference corpus of contemporary written Dutch for use in different types of linguistic (including lexicographic) and HLT research and in the development of applications. The STEVIN-funded SoNaR project (2008-2011) built on the results of the D-Coi and Corea projects, which were awarded funding in the first call for proposals of the STEVIN programme. SoNaR contains over 500 million words (i.e. word tokens) of full texts from a wide variety of text types, including texts from both conventional media and new media. All texts except those from social media (Twitter, chat, SMS) have been tokenized, tagged for part of speech and lemmatized, and in the same set the named entities have been labelled. All annotations were produced automatically; no manual verification took place. The texts are enriched with several annotations (part-of-speech and lemma information) and are available as FoLiA XML files (folia.xml). OpenSONAR is an online application for exploring and searching the SoNaR corpus. The system relies on BlackLab Server as its back-end and WhiteLab as its user interface.
mono-lingual tool written language tool language resource corpus browsing corpus searching corpus exploration Browsing and Searching Data analysis Linguistics computational linguistics general linguistics lexicology morphology syntax text and corpus linguistics yes Dutch nld no Online available graphical user interface web application other web service text text corpus statistics corpus fragments https://portal.clarin.inl.nl other academic Free, accessible through CLARIN Institutional login https://portal.clarin.inl.nl/opensonar_whitelab/page/search 0 EUR servicedesk@ivdnt.org Instituut voor de Nederlandse Taal Institute for the Dutch Language http://www.ivdnt.org/ OpenSONAR Manual - First Use user http://black.uvt.nl/opensonar/OpenSoNaR%20Manual.pdf eng SoNaR User Manual 1.0.4 user https://ticclops.uvt.nl/SoNaR_end-user_documentation_v.1.0.4.pdf eng in book scientific background yes van de Camp, M, Reynaert, M and Oostdijk, N. 2017. WhiteLab 2.0: A Web Interface for Corpus Exploitation. In: Odijk, J and van Hessen, A. (eds.) CLARIN in the Low Countries, Pp. 231–243. London: Ubiquity Press. DOI: https://doi.org/10.5334/bbi.19. License: CC-BY 4.0 in book scientific background yes de Does, J, Niestadt, J and Depuydt, K. 2017. Creating Research Environments with BlackLab. In: Odijk, J and van Hessen, A. (eds.) CLARIN in the Low Countries, Pp. 245–257. London: Ubiquity Press. DOI: https://doi.org/10.5334/bbi.20. License: CC-BY 4.0 in book scientific background yes Oostdijk, N., Reynaert, M., Hoste, V., Schuurman, I. (2013) The Construction of a 500 Million Word Reference Corpus of Contemporary Written Dutch in: Essential Speech and Language Technology for Dutch: Results by the STEVIN-programme (eds. P. Spyns, J. Odijk), Springer Verlag. http://dev.clarin.nl/sites/default/files/opensonar.jpg OpenSONAR OpenSONAR NTU/STEVIN http://www.taalunieversum.org/stevin - creator Dr. Nelleke Oostdijk
Erasmusplein 1, 6525 HT Nijmegen. Postbus 9103, 6500 HD Nijmegen
n.oostdijk@let.ru.nl Radboud University Nijmegen http://www.ru.nl/clst/staff/virtuele-map/nelleke-oostdijk/
Coordinator Radboud University Katholieke Universiteit Leuven (CCL) Hogeschool Gent (Dept. Vertaalkunde, LT3) reynaert@uvt.nl Tilburg University (TiCC/ILK) Twente University (HMI) Utrecht University (UiL-OTS)
unknown