Repository CLARIN-D Centre Leipzig
Introduction to the Repository
The CLARIN-D repository at the University of Leipzig offers long-term preservation of digital resources, along with their descriptive metadata. The mission of the repository is to ensure the availability and long-term preservation of resources, to preserve knowledge gained in research, to aid the transfer of knowledge into new contexts, and to integrate new methods and resources into university curricula.
CLARIN-D is developing a digital infrastructure for language-centred research in the social sciences and humanities. The main function of the CLARIN-D service centres is to provide relevant, useful data and tools in an integrated, interoperable and scalable way. CLARIN-D will roll the infrastructure out in close collaboration with expert scholars in the humanities and social sciences, to ensure that it meets the needs of users in a systematic and easily accessible way. Integration of the repository into the national CLARIN-D and international CLARIN infrastructures gives it wide exposure, increasing the likelihood that the resources will be used and further developed beyond the lifetime of the projects in which they were developed.
Among the resources currently available in the Leipzig repository is a set of corpora from the Leipzig Corpora Collection (LCC), based on newspaper, Wikipedia and Web text. In addition, several REST-based web services are provided for a variety of NLP-relevant tasks.
Depositing Data into the Archive
Anyone who follows the rules below can act as a depositor.
The depositor can
- provide data to be archived and distributed by the repository and
- determine to whom the data may be distributed by specifying an access level.
Archiving follows a defined deposit workflow, and the repository accepts digital resources (including data and tools) for storage on its servers.
The repository CLARIN-D Centre Leipzig will only accept a resource that
- is the result of a research project,
- comes with exhaustive metadata,
- has its data structure described in detailed documentation (PDF/A),
- comes with information on how the data was originally created,
- was reviewed by a third party.
The following requirements also apply:
- Metadata has to be provided in CMDI.
- It is recommended to use formats listed in the CLARIN standard recommendations.
- If no recommended format is used, exhaustive documentation of the data has to be provided.
- Only data that is freely available to everyone, or that comes with a limited license that permits access by people working in resource institutions, will be added to the repository. Access to metadata must not be limited in any way.
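The acceptance criteria above can be read as a checklist. The following is a minimal Python sketch of such a checklist, intended only as an illustration for depositors preparing a submission. The `Submission` class, its field names and the `unmet_requirements` function are hypothetical and are not part of any repository software.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Submission:
    """Hypothetical summary of what a depositor has supplied."""
    from_research_project: bool
    metadata_format: str              # for example "CMDI"
    has_structure_documentation: bool  # documentation of the data structure (PDF/A)
    has_creation_information: bool     # how the data was originally created
    reviewed_by_third_party: bool
    uses_recommended_format: bool      # format listed in the CLARIN standard recommendations
    has_exhaustive_documentation: bool
    metadata_access_unrestricted: bool


def unmet_requirements(s: Submission) -> List[str]:
    """Return a human-readable list of criteria the submission does not meet."""
    problems = []
    if not s.from_research_project:
        problems.append("resource is not the result of a research project")
    if s.metadata_format.strip().upper() != "CMDI":
        problems.append("metadata is not provided in CMDI")
    if not s.has_structure_documentation:
        problems.append("data structure documentation (PDF/A) is missing")
    if not s.has_creation_information:
        problems.append("information on how the data was created is missing")
    if not s.reviewed_by_third_party:
        problems.append("resource was not reviewed by a third party")
    # A recommended format is preferred; otherwise exhaustive documentation is required.
    if not (s.uses_recommended_format or s.has_exhaustive_documentation):
        problems.append("no recommended format and no exhaustive documentation")
    if not s.metadata_access_unrestricted:
        problems.append("access to metadata is restricted")
    return problems
```

An empty result means that none of the listed criteria is violated; it does not replace the repository's own review of a deposition request.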
The procedure for depositing resources at the CLARIN-D Centre Leipzig consists of the following steps:
- signing the depositor's agreement (or, at a first stage, stating the intention to do so if the request is accepted by the repository)
- filling out the resource deposition request form
- mailing these documents to email@example.com
The following documents contain all relevant information in more detail.
Name: CLARIN-D Resource Center Leipzig
Repository: Fedora Repository
Search in repository: Fedora Repository Search
Virtual Language Observatory: Search in the VLO
OAI-PMH: OAI-PMH Identify
Corpus portal: Main page
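The OAI-PMH entry above refers to the protocol's `Identify` request, which returns basic information about a repository. As a rough sketch, the request URL can be built and a response parsed as shown below. The base URL `https://repository.example.org/oai` and the sample response are placeholders for illustration, not the actual endpoint or output of this repository.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def identify_url(base_url: str) -> str:
    """Build the URL of an OAI-PMH Identify request for a given base URL."""
    return base_url + "?" + urlencode({"verb": "Identify"})


def repository_name(identify_xml: str) -> str:
    """Extract the repositoryName element from an Identify response."""
    root = ET.fromstring(identify_xml)
    node = root.find("oai:Identify/oai:repositoryName", OAI_NS)
    return node.text if node is not None and node.text else ""


# Placeholder response, shortened to the single element used here.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
  </Identify>
</OAI-PMH>"""

print(identify_url("https://repository.example.org/oai"))
print(repository_name(SAMPLE_RESPONSE))
```

The sketch deliberately avoids network access; fetching the URL with an HTTP client and passing the body to `repository_name` would be the next step against a real endpoint.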
Name: NLP Group, Department of Computer Science, University of Leipzig
Phone number: +49-(0)341-9732230
Postal address: Universität Leipzig; Institut für Informatik; PF 100920; 04009 Leipzig; Germany
This repository has been awarded the Data Seal of Approval.
This repository has been certified as CLARIN Centre Type B.