DCAT Harvester

By DIFI

The Data Catalog Vocabulary, or DCAT for short, is an RDF vocabulary for representing data catalogs. Many organizations publish their datasets online for free, giving researchers and others a wealth of information. These datasets range from groundbreaking medical research to national company registers.

Not all of these datasets are easy for the average user or researcher to locate, which is the problem DCAT aims to solve. The DCAT vocabulary describes datasets, together with their metadata and distributions, and organizes them into catalogs. Because DCAT descriptions are RDF, they are machine-readable and can be queried using the SPARQL query language.
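To make the catalog, dataset, and distribution relationship concrete, here is a minimal sketch in plain Python. It is not part of the harvester and does not use a real RDF library or SPARQL engine: triples are held as tuples and matched with a small hypothetical helper (`match`). The `example.org` resources and the dataset title are invented for illustration; the `dcat:` and `dct:` property names come from the DCAT and Dublin Core Terms vocabularies. For simplicity the download URL is stored as a plain string, whereas in real RDF it would be a resource.

```python
# Namespaces of the DCAT and Dublin Core Terms vocabularies.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical example resources; example.org is a placeholder domain.
CATALOG = "http://example.org/catalog"
DATASET = "http://example.org/dataset/company-register"
DIST = "http://example.org/dataset/company-register/csv"

# A tiny catalog expressed as (subject, predicate, object) triples.
TRIPLES = {
    (CATALOG, RDF_TYPE, DCAT + "Catalog"),
    (CATALOG, DCAT + "dataset", DATASET),
    (DATASET, RDF_TYPE, DCAT + "Dataset"),
    (DATASET, DCT + "title", "National company register"),
    (DATASET, DCAT + "distribution", DIST),
    (DIST, RDF_TYPE, DCAT + "Distribution"),
    (DIST, DCAT + "downloadURL", "http://example.org/files/companies.csv"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def dataset_downloads(triples):
    """Map each catalogued dataset title to its sorted download URLs."""
    result = {}
    for _, _, dataset in match(triples, p=DCAT + "dataset"):
        titles = [o for _, _, o in match(triples, s=dataset, p=DCT + "title")]
        urls = []
        for _, _, dist in match(triples, s=dataset, p=DCAT + "distribution"):
            urls += [o for _, _, o in match(triples, s=dist, p=DCAT + "downloadURL")]
        for title in titles:
            result[title] = sorted(urls)
    return result


# The same question written as SPARQL, for use against a real RDF store.
SPARQL_QUERY = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct:  <http://purl.org/dc/terms/>
SELECT ?title ?url WHERE {
  ?catalog dcat:dataset ?dataset .
  ?dataset dct:title ?title ;
           dcat:distribution ?dist .
  ?dist dcat:downloadURL ?url .
}
"""

if __name__ == "__main__":
    for title, urls in dataset_downloads(TRIPLES).items():
        print(title, urls)
```

The `dataset_downloads` function walks the same path as the SPARQL query: from a catalog to its datasets, then to each dataset's distributions and their download URLs. Against a real triple store, the query string would be sent to the store's SPARQL endpoint instead.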

User guide

Architecture

Dev guide