DCAT Harvester

By DIFI

The Data Catalog Vocabulary, or DCAT for short, is an RDF vocabulary for representing data catalogs. Many organizations publish their datasets online for free, giving researchers and others a wealth of information. These datasets range from groundbreaking medical research to national company registers.

Not all of these datasets are easy for the average user or researcher to locate, which is the problem DCAT aims to solve. The DCAT vocabulary can represent datasets, along with their metadata and distributions, and organize them into catalogs. Because DCAT descriptions are RDF, they are machine-readable and can be queried using the SPARQL query language.
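As a rough illustration of how a catalog, its datasets, and their distributions relate, the sketch below builds a small in-memory model and writes it out as Turtle using terms from the DCAT and Dublin Core vocabularies (`dcat:Catalog`, `dcat:dataset`, `dcat:Dataset`, `dcat:distribution`, `dcat:Distribution`, `dcat:accessURL`, `dct:title`). The Python class names, the `to_turtle` helper, and the `example.org` URIs are hypothetical and chosen only for this example; they are not part of the DCAT harvester. The SPARQL query is included as a string to show the shape of a typical lookup and is not executed here.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

# Namespace prefixes for the real DCAT and Dublin Core Terms vocabularies.
PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dct: <http://purl.org/dc/terms/> .\n"
)


@dataclass
class Distribution:
    """One concrete way to obtain a dataset, e.g. a CSV download."""
    uri: str
    access_url: str


@dataclass
class Dataset:
    uri: str
    title: str
    distributions: List[Distribution] = field(default_factory=list)


@dataclass
class Catalog:
    uri: str
    title: str
    datasets: List[Dataset] = field(default_factory=list)


def _literal(text: str) -> str:
    """Quote a plain string as a Turtle literal, escaping backslash and quote."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def catalog_triples(catalog: Catalog) -> Iterator[Tuple[str, str, str]]:
    """Yield (subject, predicate, object) terms describing the catalog."""
    cat = f"<{catalog.uri}>"
    yield (cat, "a", "dcat:Catalog")
    yield (cat, "dct:title", _literal(catalog.title))
    for ds in catalog.datasets:
        dataset = f"<{ds.uri}>"
        yield (cat, "dcat:dataset", dataset)
        yield (dataset, "a", "dcat:Dataset")
        yield (dataset, "dct:title", _literal(ds.title))
        for dist in ds.distributions:
            distribution = f"<{dist.uri}>"
            yield (dataset, "dcat:distribution", distribution)
            yield (distribution, "a", "dcat:Distribution")
            yield (distribution, "dcat:accessURL", f"<{dist.access_url}>")


def to_turtle(catalog: Catalog) -> str:
    """Serialize the catalog as Turtle, one statement per line."""
    body = "\n".join(f"{s} {p} {o} ." for s, p, o in catalog_triples(catalog))
    return PREFIXES + "\n" + body + "\n"


# A SPARQL query that lists every dataset in any catalog with its title.
# Shown for illustration only; running it requires a SPARQL engine.
LIST_DATASETS_QUERY = """\
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?dataset ?title WHERE {
  ?catalog dcat:dataset ?dataset .
  ?dataset dct:title ?title .
}
"""

if __name__ == "__main__":
    # Hypothetical example data.
    example = Catalog(
        uri="http://example.org/catalog",
        title="Example catalog",
        datasets=[
            Dataset(
                uri="http://example.org/dataset/companies",
                title="Company register",
                distributions=[
                    Distribution(
                        uri="http://example.org/dataset/companies/csv",
                        access_url="http://example.org/files/companies.csv",
                    )
                ],
            )
        ],
    )
    print(to_turtle(example))
```

Loading the resulting Turtle into any RDF store with a SPARQL endpoint would let a query such as `LIST_DATASETS_QUERY` return each dataset URI alongside its title.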

User guide

For users and administrators of the system. Covers everything from adding source DCAT catalogs and interpreting error messages to managing users.

View details »

Architecture

Architecture diagrams for developers, detailing the system components and how they interact.

View details »

Dev guide

Detailed information about how to build and deploy the DCAT harvester, and about how the code is structured.

View details »