Ecological Metadata Language
The metadata entered on the OBIS Node IPT are the metadata that will eventually be shown on the OBIS metadata pages. Only a few sections are required. For the other sections, add as much information as you can. The amount of metadata you can complete will largely depend on what you receive from the data provider. The better and more complete the metadata, the more visible the dataset will be on the OBIS pages. The bare minimum of metadata are:
- Resource Contact (contact person)
Below we give an overview of the different IPT metadata sections and, for the purposes of OBIS, explain a number of fields in more detail. The order in this manual follows the order of the sections in the IPT.
IPT Section “Basic Metadata”
When you create a new resource (=dataset) in your IPT you need to provide a Shortname. These short name identifiers are used to create folders on the IPT and cannot easily be changed.
A good descriptive title is indispensable. The title can provide the OBIS user with valuable information, making selection of data easier. If the title is cryptic or hard to understand, contact the data provider with a suggestion or ask for a more descriptive title.
If the dataset has already been published (made publicly available) elsewhere, then the same title should be kept (even if it does not meet the proposed standards)! The title of an already published dataset should not be changed, as this would cause confusion and possible duplicates in databases at a later stage.
Titles are often very cryptic and only understandable by the data provider. In most cases, the originally provided title is an acronym or working title of the full title. However, to be useful in a large data system, with large numbers of datasets, the given title should be as descriptive and complete as possible, to make quick screening an easy task.
The acronym or working title could still be documented in the metadata, so there is no confusion on how the full title is linked to the originally provided acronym or working title.
Always consult the data provider when changing the given title to a more workable and descriptive version.
Originally received title   Title Recommended by Node Manager
-------------------------   ---------------------------------
BIOCEAN                     BIOCEAN database on deep sea benthic fauna
Biomôr                      Benthic data from the Southern Irish Sea from 1989-1991
Kyklades                    Zoobenthos of the Kyklades (Aegean Sea)
REPHY                       Réseau de Surveillance phytoplanctonique
The description or abstract of a dataset provides basic information on the content of the dataset. This information will help in better understanding and interpreting the data. It is recommended that the description indicates whether the dataset is a subset of a larger dataset and, if so, provides a link to the parent metadata and/or dataset.
All metadata for OBIS should be in English. If nodes prefer bi- or multilingual entries for the description (e.g. due to national obligations), then the following procedure can be followed:
- Indicate English as metadata language
- Enter the English description first
- Type a slash (/)
- Enter the description in the second language
The Louis-Marie herbarium grants a priority to the Arctic-alpine, subarctic and boreal species from the province of Quebec and the northern hemisphere. This dataset is mainly populated with specimens from the province of Quebec. / L’Herbier Louis-Marie accorde une priorité aux espèces arctiques-alpines, subarctiques et boréales du Québec, du Canada et de l’hémisphère nord. Ce jeu présente principalement des spécimens provenant du Québec.
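The procedure above amounts to joining the two texts with a slash, English first. A minimal Python sketch of that convention is given below; the function name `bilingual_description` is hypothetical and not part of the IPT, which only expects the resulting text to be entered in the description field.

```python
def bilingual_description(english: str, second_language: str) -> str:
    """Combine an English description and its translation as 'English / Other'.

    Hypothetical helper: it only illustrates the convention described in
    this manual (English first, then a slash, then the second language).
    """
    english = english.strip()
    second_language = second_language.strip()
    if not english:
        # OBIS metadata should be in English, so the English text is mandatory.
        raise ValueError("The English description is required and must come first.")
    if not second_language:
        # No translation supplied: the English description is used on its own.
        return english
    return f"{english} / {second_language}"
```

The result can then be pasted into the description field, with English indicated as the metadata language.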
Data type & subtype
OBIS deals with occurrence data. The subtype will in most cases be ‘observation’. If the data concern a museum collection, then the subtype will be ‘specimen’.
Resource contact (required by OBIS)
The resource contact is the person or organization that should be contacted to get more information about the resource, that curates the resource or to whom putative problems with the resource or this data should be addressed.
It is recommended to give the ‘point of contact’ in this section. Although a number of fields are not required, we strongly recommend filling in all of them, and in particular the email address of the contact person. This will also be the contact information that appears on the OBIS metadata pages.
Example roles that distinguish the resource contact:
- Custodian steward (person/organization responsible for/takes care of the dataset paper)
- Owner (person/organization that owns the data – may or may not be the custodian)
- Point of contact (person/organization to contact for further information on the dataset)
- Principal investigator (primary scientific contact associated with the dataset)
The resource creator is the person or organization responsible for the original creation of the resource content. When there are multiple creators, the one that bears the greatest responsibility is the resource creator, and the others should be added as associated parties with the role ‘originator’ or ‘content provider’ (see the section “Associated Parties” below). Although a number of fields are not required, we strongly recommend filling in all of them, and in particular the email address of the contact person. This will also be the contact information that appears on the OBIS metadata pages.
- Originator (person/organization that originally gathered/prepared the dataset)
- Content provider (principal person/organization that contributed content to the dataset)
If the resource contact and the resource creator are identical, the information can easily be copied.
The metadata provider is the person or organization responsible for producing the resource metadata. If the metadata are provided by the original data provider, then his/her contact details should be filled in here. If no metadata are available (e.g. for historical datasets, with no contact person), then the metadata can be completed by the node manager. In this case, the node manager becomes the metadata provider.
IPT Section “Associated Parties”
This metadata page contains information about one or more people or organizations associated with the resource in addition to those already covered on the Basic Metadata page. If there are multiple contact persons or metadata creators, they can be added in this section. The principal contact/creator should however be added in the Basic Metadata section (see earlier). It is recommended to complete this section together with the Basic Metadata page, to avoid confusion or overlap in the added information.
The owner of a dataset will, in most cases, be an institute, and not an individual person. Although the fields ‘last name’, ‘first name’ and ‘position’ are indicated as mandatory fields, it is possible to just add the institute name for the role ‘owner’.
IPT Section “Geographic Coverage”
The geographic coverage can either be set by dragging the markers on the given map or by filling in the coordinates of the bounding box. In the description field, a more elaborate text can be provided to describe the spatial coverage indicating the larger geographical area where the samples were collected. For the latter, the sampling locations can be plotted on a map and – by making use of a Gazetteer – the wider geographical area can be derived: e.g. the relevant Exclusive Economic Zone (EEZ), IHO, FAO fishing area, Large Marine Ecosystem (LME), Marine Ecoregions of the World (MEOW), etc. The VLIZ Marine Gazetteer might prove to be a useful online tool to define the most relevant sea area(s). There are also LifeWatch Geographical Services that translate geographical positions to these wider geographical areas.
The information given in this section can also help the node manager with the geographic quality control. If the geographic coverage is, for example, “North Sea” but a number of data points fall outside this area, then the data provider needs to be contacted to check for possible errors.
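The check described above can be sketched in a few lines of Python. This is a minimal illustration, not an OBIS or IPT tool: the function name `points_outside_bbox` is hypothetical, coordinates are assumed to be in decimal degrees, and the bounding box is assumed not to cross the antimeridian (180° longitude).

```python
from typing import Iterable, List, Tuple


def points_outside_bbox(
    points: Iterable[Tuple[float, float]],
    west: float,
    south: float,
    east: float,
    north: float,
) -> List[Tuple[float, float]]:
    """Return the (latitude, longitude) points that fall outside the bounding box.

    Assumes decimal degrees and a box that does not cross the antimeridian.
    """
    outside = []
    for lat, lon in points:
        in_latitude = south <= lat <= north
        in_longitude = west <= lon <= east
        if not (in_latitude and in_longitude):
            outside.append((lat, lon))
    return outside


# Illustrative box only, not an authoritative definition of any sea area.
suspect = points_outside_bbox(
    [(54.0, 3.0), (43.0, 5.0)], west=-4.0, south=51.0, east=9.0, north=61.0
)
```

Any points returned by such a check are candidates to discuss with the data provider; they are not necessarily errors.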
If the data originate from two areas, this cannot be indicated separately at the coordinate level. In that case (e.g. samples from the North Sea and the Mediterranean Sea in one dataset), this should be clearly mentioned in the description field of this section.
IPT Section “Taxonomic Coverage”
This section can capture 2 things:
- A description of the range of taxa that are addressed in the data set or collection. The description can contain non-taxonomic terms, such as e.g. benthic foraminifera or freshwater mussels.
- An overview of all the involved taxa.
Information on the involved taxonomic groups should already be mentioned in the (descriptive) title, the general dataset description and/or the taxonomic coverage description. As all the taxa are listed in the dataset itself, it can be redundant to enter all the taxon names at this level too. If people want the full taxon list, they should go to the data itself. Exceptions could be made e.g. for deep-sea datasets dealing with specific taxa.
For OBIS datasets, we recommend adding only the higher classification (Kingdom, Class or Order) of the involved groups (e.g. Bivalvia, Cetacea, Aves, Ophiuroidea…). Your data provider can easily come up with such a list. The taxonomic coverage is not a mandatory field, but the information stored here can be very useful as background information.
IPT Section “Temporal Coverage”
The temporal coverage will be a date range, which can easily be documented. If it is a single date, the start and end date will be the same. The information added here can be used as a quality check for the actual dates in the datasets.
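The quality check mentioned above can be illustrated with a short Python sketch. The function name `dates_outside_coverage` is hypothetical, and the sketch assumes the event dates have already been parsed into `datetime.date` objects.

```python
from datetime import date
from typing import Iterable, List


def dates_outside_coverage(
    event_dates: Iterable[date], start: date, end: date
) -> List[date]:
    """Return the dates that fall outside the documented temporal coverage.

    The start and end dates are treated as inclusive, so a single-day
    coverage (start == end) accepts exactly that day.
    """
    if start > end:
        raise ValueError("The start date must not be later than the end date.")
    return [d for d in event_dates if not (start <= d <= end)]


# Hypothetical example: coverage 1989-1991, one record dated 1992.
flagged = dates_outside_coverage(
    [date(1990, 5, 1), date(1992, 1, 15)], date(1989, 1, 1), date(1991, 12, 31)
)
```

Dates flagged this way point either to an error in the data or to a temporal coverage that needs updating.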
IPT Section “Keywords”
Relevant keywords facilitate the discovery of a dataset. An indication of the represented functional groups can help in a general search (e.g. plankton, benthos, zooplankton, phytoplankton, macrobenthos, meiobenthos …). Assigned keywords can be related to taxonomy, geography or relevant keywords extracted from thesauri such as the ASFA thesaurus, the CAB thesaurus or GCMD keywords. As taxonomy and geography are already covered in previous sections, there is no need to repeat related keywords here. Please consult your data provider about which (relevant) keywords can be assigned.
IPT Section “Project Data”
If the dataset in this resource was produced under a certain project, the metadata on this project can be documented here. Some of the information entered here can overlap with information given in other sections of the IPT metadata (e.g. the study area description can have a lot in common with the geographic coverage section). This is not a problem.
IPT Section “Sampling Methods”
still needs to be written
IPT Section “Citations”
If the dataset is downloaded from IPT/OBIS and used, the user has to be able to properly cite the dataset. Citing a dataset can be compared to citing a publication. Ideally, a citation should include the title of the dataset, the authors (data managers, custodians, collectors…) and the name of the data holding institute.
The node managers should try to implement a certain degree of format standardization for their citations. Some general guidelines could be formulated, e.g. the second author should have the initials consistently before or after the surname. First names should be replaced with initials and there should be a common use of commas and full stops. If a version number is included in the citation, then the term ‘version’ should be written in full, to avoid confusion.
- Gambi, C. & Danovaro, R. (1992). Meiofauna of the Ligurian Sea. Polytechnic University of Marche, Faculty of Sciences, Department of Marine Sciences, Italy.
- MarBEF (2005). Macroben, an integrated database on benthic invertebrates of European continental shelves. http://www.vliz.be/imis/imis.php?module=dataset&dasid=631
- Fish trawl survey: Irish ground fish survey. ICES Database of trawl surveys (DATRAS). The International Council for the Exploration of the Sea, Copenhagen. 2010. Online source: http://ecosystemdata.ices.dk.
- Campana, S., Joyce, W. OTN/DFO Maritimes Spiny Dogfish Tagging. In: OTN digital collections.
- Pezzack, L. 2004. Nova Scotia Museum of Natural History - Marine Birds, Mammals and Fishes. Nova Scotia Museum of Natural History Digital Collections. OBIS Canada, Bedford Institute of Oceanography, Dartmouth, Nova Scotia, Canada, Version 1.
- Nancy Jacobsen Stout, Linda Kuhnz, Lonny Lundsten, Kyra Schlining, Susan von Thun, (2002) Video Annotation and Reference System (VARS) database, Year 2000, Monterey Bay Aquarium Research Institute, Moss Landing, California USA, Database, www.mbari.org/vars
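One way to apply the formatting guidelines above consistently (initials instead of first names, initials always after the surname, common punctuation, ‘Version’ written in full) is to generate citations from structured fields. The Python sketch below shows one possible layout, not a format prescribed by OBIS; the function names and the example authors and titles are hypothetical.

```python
from typing import List, Optional, Tuple


def initials(first_names: str) -> str:
    """Turn 'Jane Mary' into 'J.M.' (hyphenated names are split as well)."""
    parts = first_names.replace("-", " ").split()
    return "".join(f"{part[0].upper()}." for part in parts)


def format_citation(
    authors: List[Tuple[str, str]],  # (surname, first names) pairs
    year: int,
    title: str,
    institute: str,
    version: Optional[str] = None,
) -> str:
    """Build 'Surname, I. & Surname, I. (year). Title. Institute.'"""
    if not authors:
        raise ValueError("At least one author is required.")
    names = [f"{surname}, {initials(first)}" for surname, first in authors]
    if len(names) > 1:
        author_part = ", ".join(names[:-1]) + " & " + names[-1]
    else:
        author_part = names[0]
    citation = f"{author_part} ({year}). {title}. {institute}."
    if version is not None:
        # The term 'Version' is written in full to avoid confusion.
        citation += f" Version {version}."
    return citation


# Hypothetical authors and dataset, for illustration only.
example = format_citation(
    [("Doe", "Jane Mary"), ("Roe", "Richard")],
    2020,
    "Example benthos dataset",
    "Example Institute",
)
```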
This section can include the references of the publications that are related to the described dataset. They can describe the dataset, be based on the dataset or be used in this dataset. Publications can be scientific papers, reports, PhD or master theses.
This overview will contribute to a better understanding of the data as these publications can hold important additional information on the data and how they were acquired.
Two things can be added:
- Citation Identifier: This can be a DOI (Digital Object Identifier), which provides the possibility to register and use persistent interoperable identifiers for electronic objects, such as a publication. A DOI for a document is permanent, whereas its location and other available metadata may change.
- Citation: The actual citation of the publication (you should also add the DOI in this field).
IPT Section “Collection Data”
This section should only be filled out if there are specimens held in a museum. If relevant, it is strongly recommended that this information is supplied by the data provider or left blank.
IPT Section “External Links”
This section can include the link to the resource homepage. Additional links can be added (e.g. where and in what format the data can be downloaded).
Links to the online dataset on the OBIS website can be added once the data is available there. For these OBIS links, the required fields should be completed as follows:
- Name: online dataset
- Character set: UTF-8
- Data format: html
If other links are added, then the data format for web-based data is ‘html’. If the link refers to a file, the data format of the file will need to be added (e.g. Excel, Access …). The character set for all Darwin Core files is UTF-8, whereas for other web pages this can vary.
IPT Section “Additional Metadata”
Not applicable for OBIS datasets.
License and IP Rights
The licenses offered are the Creative Commons licenses (CC-0, CC-BY, CC-BY-NC). These licenses can be summarized as follows:
- You are free:
  - to share => to copy, distribute and use the database
  - to create => to produce works from the database
  - to adapt => to modify, transform and build upon the database
- In case of CC-0: public domain. CC-0 is the preferred option identified by the OBIS steering group. You waive any copyright you might have over the data(set) and dedicate it to the public domain. You cannot be held liable for any (mis)use of the data either. Although CC-0 doesn’t legally require users of the data to cite the source, it does not take away the moral responsibility to give attribution, as is common in scientific research. A good blog post on why to use CC-0 can be found here.
- In case of CC-BY: attribution. You must attribute any public use of the database, or works produced from the database, in the manner specified in the license. For any use or redistribution of the database, or works produced from it, you must make clear to others the license of the database and keep intact any notices on the original database.
- In case of CC-BY-NC: non-commercial. Like CC-BY, but commercial use is not allowed.
Any remaining information that could not be catalogued under any of the other sections can be mentioned here.