Data published through OBIS must come from credible, authoritative sources, and the scientists and institutions responsible for collecting and managing the data are clearly named. Before publication, the data must pass a series of technical controls, described below, which are repeated each time the data are crawled again from their source. Any errors, such as misspelled species names, names not recognised in OBIS, and possible mapping errors, are reported to the data provider to review and, if necessary, correct. As a result, the data are more accurate the next time they are published, and the quality of the source database also improves. Data use is also an important way of finding actual and possible errors: users may report such issues directly to the data source or to OBIS.
The OBIS Quality Control protocol is as follows:
- If the required data fields are not properly filled, a notification will be sent to the Data Provider. No further action will be taken until the required fields are filled.
- If fields have questionable values, a notification will be sent to the Data Provider. These questionable values will be left empty in the published data.
- Data located on land will be reported to the Data Provider but will not be deleted unless instructed by the Data Provider, because they may represent a species in an estuary or the centre point of a location. If a Data Provider changes the values, new values will show up after the next round of crawling.
- If species names cannot be verified against (a) known valid names in OBIS, (b) the OBIS taxonomic hierarchy, or (c) the World Register of Marine Species (WoRMS), the Data Provider will be notified so they can check that the names are current and correct. Names that cannot be placed after checking with WoRMS and OBIS are, where possible, placed on the basis of other authoritative sources, such as the Catalogue of Life or ITIS. Some non-verified names may be assigned a position in the taxonomic hierarchy by virtue of their genus.
- The portal staff will communicate with data providers to inform them of any problems and to improve data quality. They will check that the data conform to the metadata description of the dataset; that is, the dataset should have the correct number of records and species in the right geographic locations. After the data are transferred to the server from which they will be published online, a form email will be sent to the designated technical contact and manager, detailing the number of records obtained, any missing records, the time of the next crawl, and any errors identified.
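
The record-level rules in the protocol above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the OBIS implementation: the function `check_record`, the `QCResult` type, the `REQUIRED_FIELDS` list (field names borrowed from Darwin Core for illustration), and the `is_on_land` and `is_questionable` callbacks are all hypothetical, and the fallback to other authoritative name sources is not modelled.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set

# Hypothetical list of required fields; the text does not enumerate them.
REQUIRED_FIELDS = ("scientificName", "decimalLatitude", "decimalLongitude")


@dataclass
class QCResult:
    publish: bool                  # False while required fields are missing
    record: Dict[str, Any]         # the record as it would be published
    notifications: List[str] = field(default_factory=list)  # for the Data Provider


def is_blank(value: Any) -> bool:
    """Treat None and empty or whitespace-only strings as unfilled."""
    return value is None or (isinstance(value, str) and not value.strip())


def check_record(
    record: Dict[str, Any],
    known_names: Set[str],
    is_on_land: Callable[[float, float], bool],
    is_questionable: Callable[[str, Any], bool],
) -> QCResult:
    published = dict(record)  # never modify the provider's original record
    notes: List[str] = []

    # Rule 1: missing required fields stop processing until they are filled.
    missing = [name for name in REQUIRED_FIELDS if is_blank(published.get(name))]
    if missing:
        notes.append("Required fields not filled: " + ", ".join(missing))
        return QCResult(False, published, notes)

    # Rule 2: questionable values are reported and left empty when published.
    for name, value in record.items():
        if is_questionable(name, value):
            notes.append(f"Questionable value in {name}: {value!r}")
            published[name] = None

    # Rule 3: positions on land are reported but the record is not deleted.
    lat = published.get("decimalLatitude")
    lon = published.get("decimalLongitude")
    if lat is not None and lon is not None and is_on_land(lat, lon):
        notes.append("Position is on land; record kept pending provider review")

    # Rule 4: names that cannot be verified are reported to the provider.
    name = published.get("scientificName")
    if name is not None and name not in known_names:
        notes.append(f"Name not verified: {name!r}")

    return QCResult(True, published, notes)
```

In this sketch the caller supplies the land test and the plausibility test, so the same function could be rerun unchanged on every crawl; the collected `notifications` correspond to the messages sent back to the Data Provider.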