Metadata can be added to a resource, to a file, or to a folder containing more than one file. The metadata allow resource records to be generated accurately. They are exposed in several formats so that they can be queried by external OAI-PMH harvesters.
The metadata are saved on the ORTOLANG server in the JSON format and are ingested into an ElasticSearch indexing engine so that searches can be run. They are served to the HTML client via the REST API to display the resource’s information page and to edit the metadata form in the workspace. They are also used to provide OAI-PMH output of the OAI_DC, OLAC and CMDI types.
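As a rough illustration of how a harvester might ask for each of these output types, the sketch below builds standard OAI-PMH `ListRecords` request URLs. The base URL is a placeholder, not the real ORTOLANG endpoint, and the `cmdi` prefix is an assumption (only `oai_dc` and `olac` appear as format names elsewhere on this page).

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the repository's real OAI-PMH base URL.
BASE_URL = "https://repository.example.org/oai"

# Prefixes assumed from the three output types named above.
METADATA_PREFIXES = ("oai_dc", "olac", "cmdi")


def list_records_url(metadata_prefix: str, base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH ListRecords request URL for one metadata format."""
    if metadata_prefix not in METADATA_PREFIXES:
        raise ValueError(f"unsupported metadata prefix: {metadata_prefix}")
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    for prefix in METADATA_PREFIXES:
        print(list_records_url(prefix))
```

The `verb` and `metadataPrefix` parameters come from the OAI-PMH protocol itself; everything else here is illustrative.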
Filling in as many metadata fields as possible maximizes the resource’s visibility, but only three fields are mandatory.
- Title : this is a mandatory field. It is displayed in the search results and the resource record.
- Description : this is a mandatory field. It is displayed in the search results in a detailed list and the resource record.
- Resource type : this is a mandatory field. It is used to classify the resource in one of the categories (Corpus, Lexicon, Terminology, Tool).
- Documentation : You can reference one or more files that describe the resource. These documents must be deposited in the workspace.
- Producers : A resource is usually created by several collaborating research institutions. This information is displayed in the resource’s record which the platform’s users will consult. We also commit to providing a list of the resources produced by institutions on a page accessible to all online (see the list of producers).
- Sponsors : Participants who contributed financial support to the creation of the resource.
- Contributors : The other participants.
- Reference publications : Publications which describe the resource
- Preview : A sample, image or video which previews the content of the resource
- Keywords : Keywords help users find the resource more efficiently from the search bar
- Website : A link to the website describing the resource or to an index HTML file in the workspace; a static website can be hosted in the workspace
- License : The license agreement which accompanies the resource
- Usage rights : Specific usage rights for the resource
- Copyright : Copyright notice
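A minimal sketch of a pre-submission check for the three mandatory fields follows. The JSON key names (`title`, `description`, `type`) are assumptions made for illustration; the keys actually used in ORTOLANG metadata files may differ.

```python
import json

# Assumed key names; the real JSON keys used by ORTOLANG may differ.
MANDATORY_FIELDS = ("title", "description", "type")
RESOURCE_TYPES = ("Corpus", "Lexicon", "Terminology", "Tool")


def missing_mandatory_fields(metadata: dict) -> list:
    """Return the mandatory keys that are absent or empty."""
    return [key for key in MANDATORY_FIELDS if not metadata.get(key)]


def check_metadata(metadata: dict) -> list:
    """Collect readable problems instead of stopping at the first one."""
    problems = [f"missing field: {key}" for key in missing_mandatory_fields(metadata)]
    resource_type = metadata.get("type")
    if resource_type and resource_type not in RESOURCE_TYPES:
        problems.append(f"unknown resource type: {resource_type}")
    return problems


if __name__ == "__main__":
    draft = json.loads('{"title": "My corpus", "type": "Corpus"}')
    print(check_metadata(draft))
```

Returning a list of problems rather than raising on the first one lets a depositor fix everything in a single pass.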
If your resource is split into several parts, each part can be downloaded and browsed separately. For each part, you can fill in these fields :
- Path : Path to the folder representing the sub part
Once you have selected a type of resource (from the General Informations), you can fill in these fields :
Corpus
- Type of corpora
- Corpora languages
- Study languages
- Annotation levels
- File encoding
- Data type
- Word count
Lexicon
- Type of corpora
- Type of entries
- Type of languages
- Languages of entries
- Number of lexicon entities described
- Type of description
- Description languages
Tool
- Operating Systems
- Programming languages
- Input data
- Output data
- File encoding
- Tool language
- Navigation language
- Tool Support
Terminology
- Terminology type
- Structure type
- Description fields
- Type of linguistic coverage
- Input languages
- Formats and models
- Input count
We use JSON Schema files to check that the metadata are valid. There is a schema for each type of metadata handled by ORTOLANG. The most important schemas are:
To answer queries from OAI-PMH harvesters for metadata formats, we have set up an algorithm to convert the JSON format to XML. This way, resources which only contain JSON metadata can still be served to OAI-PMH harvesters as XML Dublin Core, OLAC and CMDI documents according to the following mapping table:
Note : the mapping table used by the VLO is available at https://github.com/clarin-eric/VLO-mapping/blob/master/mapping/facetConcepts.xml
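To make the JSON-to-XML conversion more concrete, here is a small sketch that turns a JSON metadata object into an `oai_dc` document. It is not the ORTOLANG algorithm: the JSON key names and the key-to-element mapping are assumptions chosen for illustration, and only the two XML namespaces are taken from the OAI-PMH and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Illustrative mapping only: JSON key (assumed) -> Dublin Core element.
JSON_TO_DC = {
    "title": "title",
    "description": "description",
    "type": "type",
    "keywords": "subject",
}


def to_oai_dc(metadata: dict) -> str:
    """Serialize the mapped JSON fields as an oai_dc XML string."""
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for json_key, dc_name in JSON_TO_DC.items():
        value = metadata.get(json_key)
        if value is None:
            continue
        # A list value (e.g. several keywords) yields one element per item.
        values = value if isinstance(value, list) else [value]
        for item in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{dc_name}")
            element.text = str(item)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_oai_dc({"title": "My corpus", "keywords": ["speech", "French"]}))
```

Keeping the mapping in a plain dictionary means the same function could serve another target format by swapping the table and namespaces.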
On a resource
A resource’s metadata are edited in the workspace using an HTML form. However, they can also be imported from a JSON file in the “Import” section of the “Metadata” tab (the file extension must be .json).
On a file/folder
To import a set of metadata on files/folders, create a zip file whose structure mirrors the paths of the target files/folders so that they can be identified. Each metadata file must be named after its metadata format. Here is a list of possible formats:
And here is an example of a structure:
In this example, the folders are folder1, sub-folder1 and file1. The metadata files are oai_dc (which is associated with the folder folder1/sub-folder1) and olac (which is associated with the file file1).
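Such an archive could be assembled with a short script like the one below. This is a sketch under two assumptions: that each metadata file sits in a folder named after its target (which is how the description above reads), and that file1 lives under folder1, which the description does not state. The XML contents are placeholders.

```python
import io
import zipfile

# Target path -> (metadata format, file content).
# Placing file1 under folder1 is an assumption made for this sketch.
EXAMPLE_ENTRIES = {
    "folder1/sub-folder1": ("oai_dc", "<!-- oai_dc metadata goes here -->"),
    "folder1/file1": ("olac", "<!-- olac metadata goes here -->"),
}


def build_metadata_zip(entries: dict) -> bytes:
    """Build a zip where each metadata file sits at <target path>/<format>."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for target_path, (metadata_format, content) in entries.items():
            # The metadata file is named exactly after its format.
            archive.writestr(f"{target_path}/{metadata_format}", content)
    return buffer.getvalue()


if __name__ == "__main__":
    with open("metadata.zip", "wb") as handle:
        handle.write(build_metadata_zip(EXAMPLE_ENTRIES))
```

Building the archive in memory first makes it easy to inspect the entry names before writing anything to disk.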
Log in, then open a workspace and go to the Contents section. Click on the + (in the toolbar) and then on Import a zip. To import a set of metadata, select the zip you created and then tick the Upload metadata files checkbox.
Note : The Folder field and the Replace checkbox are not used when metadata are imported.