A model for the integration of the multiple models within the Australian national Foundation Spatial Data Framework scenario.
1. Metadata
Property | Value |
---|---|
IRI | |
Title | FSDF Supermodel Specification |
Description | This Model - the FSDF Supermodel - is the overarching data model that provides integration logic for all FSDF elements. It is based on the general-purpose Supermodel Model. |
Created | 2022-02-24 |
Modified | 2022-08-05 |
Issued | 2022-08-05 |
Creator | |
Publisher | |
License | |
2. Preamble
2.1. Abstract
This Model - the FSDF Supermodel - is the overarching data model that provides integration logic for all FSDF datasets. It is based on the general-purpose Supermodel Model.
This model is effectively an update to the methodology of the Location Index (Loc-I) Project, which introduced a generic spatial dataset model, the Loc-I Ontology, and a series of models for the specific FSDF datasets considered by the Loc-I Project. This Supermodel is a more formalised implementation of the Loc-I Project's vision and also one that relies on updated background models, particularly GeoSPARQL, and updated or new dataset models.
2.2. Namespaces
This model is built on a "baseline" of Semantic Web models which use a variety of namespaces. Prefixes for these namespaces, used throughout this document, are listed below. Additional namespaces and prefixes are listed in later sections of this document where they only apply to that section.
Prefix | Namespace | Description |
---|---|---|
dcterms: | http://purl.org/dc/terms/ | Dublin Core Terms vocabulary namespace |
| | Generic examples namespace |
geo: | http://www.opengis.net/ont/geosparql# | GeoSPARQL ontology namespace |
owl: | http://www.w3.org/2002/07/owl# | Web Ontology Language ontology namespace |
rdfs: | http://www.w3.org/2000/01/rdf-schema# | RDF Schema ontology namespace |
sosa: | http://www.w3.org/ns/sosa/ | Sensor, Observation, Sample, and Actuator ontology namespace |
skos: | http://www.w3.org/2004/02/skos/core# | Simple Knowledge Organization System (SKOS) ontology namespace |
| | Supermodel Terms & Definitions Vocabulary |
2.3. Terms & Definitions
The following terms appear in this document and, when they do, the definitions in this section apply to them.
These terms are presented as a formal Semantic Web vocabulary at
- Backbone Model
-
An integrative, summary model that allows for crosswalking of elements within the Component Models of a Supermodel.
- Background Model
-
A standard and common Semantic Web model used as an "upper" or higher-order/abstract model to which all other Supermodel models conform when modelling something within the Background Model's purview.
- Central Class
-
Central Classes are the generic data classes at the centre of Data Domains with high-level relationships between them defined in this Supermodel.
These classes are taken from general standards - usually well-known international standards - and specialised and extended within implementation scenarios to cater for specific needs.
- Component Model
-
An individual model of something of importance within a Supermodel scenario.
- Data Domain
-
High-level conceptual areas within which Geoscience Australia has data.
These Data Domains are not themed scientifically - 'geology', 'hydrogeology', etc. - but instead based on parts of the Observations and Measurements [ISO19156] standard, realised in Semantic Web form in the SOSA Ontology, part of the Semantic Sensor Network Ontology [SSN].
Current Data Domains are shown in Figure 1.
- FSDF
-
The Foundation Spatial Data Framework (FSDF): a project to deliver national coverages of the best available, most current, authoritative foundation data which is standardised and quality controlled. See https://link.fsdf.org.au.
- Knowledge Graph
-
A Knowledge Graph is a dataset that uses a graph data structure - nodes and edges - with strongly-defined elements.
- Linked Data
-
A set of technologies and conventions defined by the World Wide Web Consortium that aim to present data in both human- and machine-readable form over the Internet.
Linked Data is strongly-defined with each element having either a local definition or a link to an available definition on the Internet.
Linked Data is graph-based in nature, that is, it consists of nodes and edges that can forever be linked to further concepts with defined relationships.
- Location Index
-
A project aiming to provide a consistent way to seamlessly integrate spatial data from distributed sources.
- Null Profile
-
A Null Profile is a Profile of a Standard that implements no additional constraints on the profile. A Null Profile’s purpose is to act as a conformance target for the Standard by supplying itemised requirements and machine-executable validators when the Standard itself cannot have these elements added to it.
- Ontology
-
In computer science and information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains of discourse.
The word ontology was originally defined as "the branch of philosophy that studies concepts such as existence, being, becoming, and reality", and the computer science term is derived from that definition.
- Profile
-
https://www.w3.org/TR/dx-prof/#dfn-profile
A data standard that constrains, extends, combines, or provides guidance or explanation about the usage of other standards.
This definition includes what are sometimes called "data profiles", "application profiles", "metadata application profiles", or "metadata profiles". In this document, "profile" and these other variants are all referred to as just "profiles".
Note: This definition has been taken from [PROF] and altered slightly for clarity here.
- Semantic Web
-
The World Wide Web Consortium's vision of an Internet-based web of Linked Data.
Semantic Web is used to refer to something more than just the technologies and conventions of Linked Data; the term also encompasses a specific set of interoperable data models - often called ontologies - published by the W3C, other standards bodies and some well-known companies.
The 'semantic' refers to the strongly-defined nature of the elements in the Semantic Web: the meaning of Semantic Web data is as precisely defined as any data can be.
- Vocabulary
-
A managed codelist or taxonomy of concepts.
3. Introduction
This Supermodel is based on previous work from the Location Index (Loc-I) Project. For clarity, the Loc-I Project is described first, followed by how this Supermodel inherits from it.
3.1. Loc-I Project
The Location Index (Loc-I) project, established in 2018, created a methodology, data models and an informal framework to allow for a consistent way to seamlessly integrate spatial data from distributed sources. The target was Australian spatial data "of national significance", meaning most - initially all - of the data considered was Australian Federal government data.
See the project website, http://www.ga.gov.au/locationindex, for more project information.
3.2. Loc-I Technical Implementation
The technical implementation of Loc-I was based on Semantic Web principles allowing datasets to be published as Linked Data independently, by data holders - different government departments, companies etc. - and consumed with minimal effort required for integration.
The technical implementation relied on data from the various datasets sharing common patterns, principally how the datasets packaged their content and how real-world objects, their spatiality and non-spatial properties were modelled. To this end, a number of Background Models were used, to which all Loc-I dataset models - here called Component Models - conformed. Additionally, a Loc-I Ontology was also created which both included modelling elements needed for the Loc-I Project that were not present in Background Models and which was to act as a conceptual, if not technical, conformance target for all Component Models.
Figure 2 below shows the original detailed architecture diagram used to explain Loc-I’s parts from 2018 - 2021.
3.3. Loc-I to Supermodel
The FSDF Project has adopted a more rigorously defined Supermodel concept to formalise things of importance to Loc-I-like work but which the Loc-I Project didn’t define. For example, the categorisation of relevant models as Background Models, Component Models and so on. The major differences/additions are:
- Formalised terminology
  - Of general relevance within the implementation scenario
  - See the Terms & Definitions section
- Model categorisation
  - For the different types of models within the scenario, for example, Background Models and Component Models
- Explicit integration
  - By explicitly defining a Backbone Model for each scenario deployment, the Supermodel conventions indicate precisely the minimum requirements Component Models must meet to be integrated
- Validatable profiling instead of model specialisation
  - Loc-I relied on defining an ontology to which datasets were expected to conform
  - The Supermodel implements a Profile of the Background Models to which Component Datasets must conform
  - The Profile provides executable data validators
In addition to these model/methodological changes, this particular Supermodel deployment has updated scenario-specific things, in particular:
- Use of GeoSPARQL 1.1
  - The Loc-I Project motivated extensions to the GeoSPARQL 1.0 ontology which were captured in the GeoSPARQL Extensions Ontology (GeoX). That ontology was then used by many Loc-I Project dataset models (Component Models)
  - The update to GeoSPARQL, GeoSPARQL 1.1, absorbed many of these extensions and so there is no longer a need to use GeoX
- Collection-based Feature organisation (see the Turtle sketch after this list)
  - Several Loc-I dataset models (Component Models) used specialised properties to indicate aggregations of Features, e.g. the ASGS Ontology's aggregatesTo & isAggregationOf
  - The latest issue of the ASGS dataset within this Supermodel, online at https://linked.data.gov.au/dataset/asgsed3, uses only Collection membership (all Meshblock Features are members of the Meshblocks Feature Collection) and standard topological relations, e.g. each SA2 is geo:sfWithin an SA3
  - This takes advantage of GeoSPARQL 1.1's collections, which match OGC API structures, and removes non-standard spatial object relations
  - Specialised Feature aggregations may be re-added to objects within this Supermodel, if required
- Topological querying instead of Linksets
  - Loc-I Linksets are datasets that declare topological relations between Features
  - GeoSPARQL allows topological relationships to be calculated using topological functions
  - The Loc-I Project established only a limited set of Linksets, e.g. the Current Addresses to 2016 Mesh Blocks Linkset
  - This Supermodel deployment includes a cache of all datasets, https://cache.linked.fsdf.org.au, on which topological queries across all datasets can be performed
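To make the Collection-based pattern concrete, here is a minimal Turtle sketch. The identifiers in the ex: namespace are hypothetical, not real ASGS IRIs, and the use of rdfs:member for collection membership follows GeoSPARQL 1.1's collection modelling:

@prefix ex: <http://example.com/> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Collection membership: the Feature Collection lists its member Features
ex:sa2-collection
    a geo:FeatureCollection ;
    rdfs:member ex:sa2-123 .

# a standard topological relation in place of a custom aggregation property
ex:sa2-123
    a geo:Feature ;
    geo:sfWithin ex:sa3-45 .

ex:sa3-45
    a geo:Feature .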
To conclude this Loc-I/Supermodel comparison, here is a table mapping elements and terminology.
Loc-I | FSDF Supermodel | Notes |
---|---|---|
Upper ontologies | | The Supermodel precisely lists Background Models in the Background Models section, where Loc-I left their discovery to general documentation or dataset model imports |
| | The Backbone Model profiles GeoSPARQL 1.1 and thus incorporates equivalent (updated) GeoX modelling. Loc-I Ontology Linksets are not used. Loc-I Ontology Datasets are replaced with Backbone Model profiling of DCAT |
| | The FSDF Supermodel formally lists the Component Models defined to be within it in this Supermodel document |
| | The new model is a more standards-based form of the previous one and all aspects of the original model are covered by the new |
| | The current delivery of the ASGS at https://asgs.linked.fsdf.org.au uses no properties other than those already in the Backbone Model, therefore no custom Component Model is needed |
| unchanged | The Geofabric data online at https://geofabric.linked.fsdf.org.au currently uses only one element of the Geofabric Ontology, the property |
| unchanged | While not an official original Loc-I dataset, Placenames was implemented in Loc-I style at https://fsdf.org.au/dataset/placenames/. It has been brought in to the Supermodel as a standard Component Model using the same ontology, albeit with updates |
Geometry Data Service | | Loc-I built a non-Semantic geometry data service for some cross-dataset spatial queries. The Supermodel implements a total cache of all Component Models' content in semantic form for GeoSPARQL spatial querying and other semantic querying |
The Loc-I project also implemented a series of custom clients for Loc-I systems. These clients demonstrated:

- downloading lists of Loc-I identifiers (IRIs) for classes of object for offline use
- reapportioning numerical observations data formed according to one set of geometries to another
- finding Features that intersect with a given point or Feature
Some of this functionality is now available through the SPARQL Endpoints available for all FSDF APIs and also the API accessing a cached copy of all data.
Functionality not covered by SPARQL Endpoints is being implemented in new FSDF clients which Geoscience Australia will release when ready.
4. Supermodel
4.1. Overview
This Section describes the structure of this Supermodel, aspects of the modelling involved and how to use this Supermodel. Following Sections describe the elements of the Supermodel in detail.
4.2. Structure
The high-level structure of this Supermodel consists of:
- Background Models
  - Standard and common Semantic Web models used as "upper" or higher-order/abstract models to which all other Supermodel models conform when modelling something within the Background Model's purview
  - Models such as the Provenance Ontology [PROV] model provenance and all Supermodel models follow it when doing provenance work
  - GeoSPARQL [GEO] serves as the background model for spatial objects - features and their geometries
- Backbone Model
  - This is a profile of the Background Models and includes validators
  - Data must conform to this model in order to be considered within this Supermodel
  - This model is a bare minimum: Component Models can, and already do, extend beyond this model to cater for their specific needs
- Component Models
  - These are individual models for datasets within this Supermodel
  - Not all datasets require a Component Model, for example, the ASGS is currently modelled using Backbone Model elements only
- Supporting Vocabularies
  - Vocabularies that support the Backbone and Component Models
  - They must conform to the VocPub Profile of SKOS
  - They may contain specialised elements beyond VocPub/SKOS too
Further details of and definitions for these elements are provided in the Terms & Definitions section, above, and in the Supermodel Model.
The next section deals with some aspects of how the models are created.
4.3. Modelling Methods
The modelling language/system used for all Supermodel elements expressed formally is the Web Ontology Language [OWL]. OWL diagramming is used for formal model images and, when it is, this is noted in the figure description. The figure below is a key for all OWL diagramming elements.
4.3.1. Object Modelling
The elements from the above subsection are shown in relation to one another in the figure below.
The elements shown above are identified with prefixed IRIs that correspond to entries in the Namespace Table. A short explanation of the diagram key elements is:

- owl:Class - represents any conceptual class of objects. Classes are expected to contain individuals - instances of the class - and the class, as a whole, may have relations to other classes
- owl:NamedIndividual - an individual of an owl:Class. For example, for the class ships, an individual might be Titanic
- rdf:Property - a relationship between classes, individuals, or any objects and Literals
- rdfs:subClassOf - an rdf:Property indicating that the domain (from object) is a subclass of the range (to object). An example is the class student which is a subclass of person: all students are clearly persons but not vice versa
- rdf:type - the property that relates an owl:NamedIndividual to the owl:Class that it's a member of
- Literal - a simple literal data property, e.g. the string "Nicholas", or the number 42. Specific literal types are usually indicated when used

The remaining diagrams in this document use extensions to this basic model, for example Figure 3 uses colour-coded specialised forms of owl:Class (subclasses of it).
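These diagram key elements can also be written down directly. A minimal Turtle sketch, using hypothetical ex: identifiers and, as an assumption, the schema.org name property for the Literal example:

@prefix ex: <http://example.com/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sdo: <https://schema.org/> .

ex:Ship
    a owl:Class .                              # a class of objects

ex:Steamship
    rdfs:subClassOf ex:Ship .                  # all Steamships are Ships, not vice versa

ex:Titanic
    a owl:NamedIndividual , ex:Steamship ;     # 'a' is Turtle shorthand for rdf:type
    sdo:name "Titanic" .                       # a Literal-valued property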
4.3.2. Provenance
General provenance/lineage information about anything - a rock sample, a dataset, a term in a vocabulary etc. - is described using the Provenance Ontology [PROV] which views everything in the world as being of one or more types in Figure 3.
According to PROV, all things are either a:
- prov:Entity - a physical, digital, conceptual, or other kind of thing with some fixed aspects
- prov:Agent - something that bears some form of responsibility for an activity taking place, for the existence of an entity, or for another agent's activity
- prov:Activity - something that occurs over a period of time and acts upon or with entities

While not often front of mind for objects in any Data Domain, provenance relations always apply. For example: a sosa:Sample within the Sampling domain is a prov:Entity and will necessarily have been created via a sosa:Sampling which is a prov:Activity. Another example: an sdo:Person related to a dcat:Dataset via the property dcterms:creator in the DataCataloging domain is a specialised form of a prov:Agent related to a prov:Entity via prov:wasAttributedTo.
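Both examples can be sketched in Turtle. This is indicative only, with hypothetical ex: identifiers; the PROV and SOSA terms used are standard:

@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex: <http://example.com/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix sdo: <https://schema.org/> .
@prefix sosa: <http://www.w3.org/ns/sosa/> .

# a Sample (an Entity) was produced by a Sampling (an Activity)
ex:sample-1
    a sosa:Sample , prov:Entity ;
    prov:wasGeneratedBy ex:sampling-1 .

ex:sampling-1
    a sosa:Sampling , prov:Activity .

# dcterms:creator is a specialised form of prov:wasAttributedTo
ex:dataset-x
    a dcat:Dataset , prov:Entity ;
    dcterms:creator ex:nicholas .

ex:nicholas
    a sdo:Person , prov:Agent .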
4.4. Ensuring Data Conformance
First the requirements for data to conform to are described, then how to test conformance is described in the Validation section.
4.4.1. Conformance Requirements
Data wishing to be used within this Supermodel must conform to:
-
Relevant Background Models
-
The Backbone Model
-
Perhaps a Component Model
Relevant Background Models
All data within the Supermodel will need to conform to at least some Background Models. Working out which ones are relevant is done by looking at the conceptual scope of the various listed Background Models and comparing the conceptual scope of the data to them.
For example, if the data is spatial - and most of it will be - it will need to conform to GeoSPARQL [GEO]. If it's observations information, the Data Cube Vocabulary [QB].
Many Background Models are generic and have a wide scope and thus most data will need to conform to most Background Models.
For example, if the data contains provenance information, regardless of whether it’s a spatial or observations dataset, it will need to conform to the Provenance Ontology [PROV].
Backbone Model
All data will need to conform to the Backbone Model as this model is used to ensure all data can work together.
This model is only concerned with minimum requirements for data, so data conforming to this model may have any other things in it - details specific to that dataset’s concern - that are un-handled/unknown in the Backbone Model. That’s fine, as long as the minimal requirements are met.
Component Model
Many datasets will have a Component Model implemented for them. If they do, data within that dataset must obviously conform to it. If no Component Model has been implemented, the dataset is a direct implementation of the Backbone Model and need only conform to that.
4.4.2. Validation
To ensure that data within a dataset conforms to the models it needs to, automated validation of data must occur. This Supermodel implements validators for the Backbone Model that must be used to test data with.
This Supermodel is also either obtaining or implementing validators for all Background Models over time. Validators implemented for Background Models in this Supermodel are implemented within Null Profiles of them since most of the Background Models are previously defined controlled standards that cannot have all the profiling elements relevant to Supermodels just added to them.
4.5. How to use this Supermodel
This Supermodel provides a general structure for datasets that want to integrate within the FSDF Data Platform. The common tasks you might perform with the Supermodel are:
-
Model a new dataset as an FSDF Supermodel generic dataset
-
Validate new dataset data according to the FSDF Supermodel
-
Create an extended/specialised Component Model for a dataset
-
Validate extended/specialised new dataset data according to the extended/specialised FSDF Supermodel Component Model
-
Create a dataset of observations - population/statistical or natural world - linked to Component Models
Detailed suggestions as to how to achieve these tasks are given below.
4.5.1. 1. Model a new dataset
Individual datasets are modelled as Component Models. The most basic of Component Models contain Dataset, FeatureCollection & Feature classes modelled using the DCAT & GeoSPARQL Background Models with certain relations. The details of this modelling are given in the first part of the Component Models section.
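For orientation, here is a hedged Turtle sketch of that basic shape, with hypothetical ex: identifiers. The rdfs:member link between the Dataset and its Feature Collection is an assumption made for this sketch; the authoritative relations and mandatory properties are those given in the Component Models section and the Requirements table:

@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex: <http://example.com/> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sdo: <https://schema.org/> .

ex:roads
    a dcat:Dataset ;
    sdo:name "Roads" ;
    rdfs:member ex:roads-fc .       # assumed Dataset-to-FeatureCollection relation

ex:roads-fc
    a geo:FeatureCollection ;
    rdfs:member ex:road-1 .

ex:road-1
    a geo:Feature .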
To model a highly specialised dataset, you will need to be able to implement both the most basic Component Model elements but also model the specialised elements relevant to your dataset. No specific guidance about your dataset can be given here however the Component Models section does indicate existing datasets that contain a large amount of specialisation that you may draw inspiration from.
In all cases, you can use the tools listed in the Validators section to test any data you’ve created to see if it really is valid according to this Supermodel.
4.5.2. 2. Validate new dataset
Data validators are available, for all elements of this Supermodel, so you can use them to validate your data. See the Validators section.
4.5.3. 3. Create an extended/specialised 'Component Model'
As per subsection 1. above, we can't give specific details about specialised modelling here since we don't know about your particular dataset; however, we can indicate existing specialised datasets (see the start of the Component Models section) and make a few general points:

- this Supermodel is concerned with the modelling of spatial datasets as Component Models with Dataset, FeatureCollection & Feature classes with certain relations between them
- most specialisation is likely to occur by adding special properties to Feature instances (see the sketch after this list)
  - for example, the Feature instances within the FSDF's Power Stations FeatureCollection contain properties relevant to power generation, such as primaryfuelType indicating coal, biogas etc., and these are important for knowledge of Power Stations but don't affect the general spatial feature modelling of this Supermodel in any way
- spatial relations - between Feature instances within one FeatureCollection or even across Dataset instances - are expected and can be modelled using GeoSPARQL's Simple Features Topological Relations Family
  - no custom modelling is likely required for standard spatial relations
- the geometries of Feature instances can be represented in several ways and Feature instances can have multiple geometries
  - boundaries at different levels of resolution may be given, or geometries with different roles, e.g. high and low tide boundaries
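Putting those points together, a hedged Turtle sketch of a specialised Feature follows. All ex: identifiers, including the primaryfuelType property, are hypothetical stand-ins for real Component Model elements:

@prefix ex: <http://example.com/> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .

# ordinary spatial modelling plus a specialised, domain-specific property
ex:power-station-1
    a geo:Feature ;
    ex:primaryfuelType ex:coal ;                    # specialised property; does not affect the general spatial modelling
    geo:sfWithin ex:sa2-123 ;                       # standard topological relation, no custom modelling needed
    geo:hasGeometry ex:geom-point , ex:geom-site .  # multiple geometries with different roles

ex:geom-point a geo:Geometry .
ex:geom-site a geo:Geometry .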
4.5.4. 4. Validate extended/specialised new dataset data
As per section 2. above, see the Validators section. Of course, your specialised modelling won’t have a validator for it, however you can certainly ensure that your new data is valid according to this Supermodel.
4.5.5. 5. Create a dataset of observations
The spatial datasets within this Supermodel are intended to present spatial objects that observations' data can be referenced against. For example, Australian census data is keyed to the Mesh Blocks and other spatial areas of the ASGS dataset, water data in the Bureau of Meteorology’s AWRIS system are keyed to catchments within the Geofabric dataset.
You can create your own observations data and key them to any datasets that exist within this Supermodel or to datasets that you make that are compatible with this Supermodel’s elements.
5. Background Models
Background Models are:
Standard and common Semantic Web models used as "upper" or higher-order/abstract models to which all other Supermodel models conform when modelling something within the Background Model's purview.
5.1. List
The particular Background Models in this FSDF Supermodel are given in the table below, with a description of the conceptual area they cover indicated as 'Domain', to assist in assessment of their relevance to Supermodel data.
Background Model | Reference | Domain |
---|---|---|
Web Ontology Language | [OWL] | General Modelling: all the other Background Models are OWL models |
schema.org | [SDO] | General Modelling: Agents (People & Organisations), licensing etc. |
Data Catalog Vocabulary | [DCAT] | Dataset Metadata |
The Provenance Ontology | [PROV] | Data Metadata: attribution of data to owners/publishers etc. Data lineage: what things datasets derive from |
GeoSPARQL 1.1 | [GEO] | Spatiality: Feature/Geometry links, topological relations, spatial scalar values (e.g. area) |
Data Cube Vocabulary | [QB] | Data Dimensions: for observations data, e.g. population census |
Sensor, Observation, Sample, and Actuator (SOSA) | [SSN] | Data Dimensions: for spatial and natural-world data, e.g. from satellites |
Simple Knowledge Organization System | [SKOS] | Vocabularies |
Vocabulary Publications Profile of SKOS | [VOCPUB] | A profile of SKOS requiring certain properties for vocabularies and their elements |
5.2. Domain Details
The domains for each of the Background Models are noted in the table above; indicative models for each of them are now given, indicating the main classes and properties of concern within the domain.
5.2.1. General Modelling
All of the Background Models, the Backbone Model and all other Supermodel models use the Web Ontology Language, OWL, for their modelling structures. While a description of OWL is out-of-scope for this document, below is given a key for the main OWL elements seen in subsequent figures.
5.2.2. Dataset Metadata
Metadata for datasets within this Supermodel is based on the Data Catalog Vocabulary [DCAT] with elements of The Provenance Ontology [PROV] for a few purposes, such as Data/Agent relations. Note that DCAT recommends this use of PROV. The figure below gives an informal overview of the concerns in this domain.
Specific properties for dataset-level metadata, such as license, copyright notices, who the publisher is etc., are mostly taken from schema.org, which is a general-purpose OWL (or at least OWL-compatible!) vocabulary of classes and properties.
schema.org is also used for Agent/Agent relations, as per the figure below.
5.2.3. Spatiality
This Supermodel's core concern of modelling spatiality is based on use of the GeoSPARQL 1.1 Standard [GEO] which concerns itself with the elements in the figure below. The figure is a part reproduction of GeoSPARQL's overview diagram.
Essentially all spatial relations between objects and the associations of objects with spatiality (Features with Geometries) and the details of Geometry data are defined by GeoSPARQL.
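A minimal Turtle sketch of those core associations, with hypothetical ex: identifiers:

@prefix ex: <http://example.com/> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .

ex:feature-1
    a geo:Feature ;
    geo:hasGeometry ex:geom-1 ;
    geo:sfContains ex:feature-2 .    # a Simple Features topological relation

ex:geom-1
    a geo:Geometry ;
    geo:asWKT "POINT (150.31 -33.95)"^^geo:wktLiteral .    # Geometry data as a WKT literal

ex:feature-2
    a geo:Feature .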
5.2.4. Data Dimensions
The dimensions of data are, in general, modelled in relation to observations according to the Data Cube Vocabulary [QB]; however, the dimensions (observable properties) of spatial and real-world objects are modelled using the Sensor, Observation, Sample, and Actuator (SOSA) ontology within the Semantic Sensor Network [SSN] standard. SOSA is, within this Supermodel at least, a domain-specialised version of QB.
The separate modelling for spatial/real-world features' properties is due to widespread use of SOSA for observations in that domain, for example the Geoscience Australia Samples catalogue (http://sss.pid.geoscience.gov.au/sample/).
The net effect of both QB and SOSA is to define data types for and observable properties/dimensions that observations are of.
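As an indication of the SOSA side of this, here is a hedged Turtle sketch of a single observation, with hypothetical ex: identifiers for the feature and the observable property:

@prefix ex: <http://example.com/> .
@prefix sosa: <http://www.w3.org/ns/sosa/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:obs-1
    a sosa:Observation ;
    sosa:hasFeatureOfInterest ex:catchment-1 ;    # a spatial Feature the observation is about
    sosa:observedProperty ex:water-level ;        # the dimension/observable property
    sosa:hasSimpleResult "2.4"^^xsd:double ;
    sosa:resultTime "2022-08-05T00:00:00+10:00"^^xsd:dateTime .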
5.2.5. Vocabularies
Many of the models in this Supermodel rely on vocabularies of individual items, for example Data/Agent relations rely on vocabularies of Agent roles. When vocabularies are modelled, this Supermodel uses a profile of the Simple Knowledge Organization System [SKOS] called VocPub [VOCPUB]. VocPub just requires certain metadata, allowed but not mandated by SKOS, to be present within vocabularies for data management purposes. At a whole-of-vocabulary level, VocPub requires very similar metadata to DCAT, thus a Vocabulary appears as a form of Dataset.
All the current FSDF vocabularies at https://linked.fsdf.org.au/vocab conform to VocPub.
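As an indication only, a VocPub-style vocabulary might look like the following Turtle sketch. The ex: identifiers are hypothetical and the authoritative list of mandatory properties is that of [VOCPUB]:

@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex: <http://example.com/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:fuel-types
    a skos:ConceptScheme ;
    skos:prefLabel "Fuel Types"@en ;
    skos:definition "A hypothetical vocabulary of power generation fuel types."@en ;
    dcterms:created "2022-02-24"^^xsd:date ;    # management metadata of the kind VocPub mandates
    dcterms:creator ex:someone ;
    skos:hasTopConcept ex:coal .

ex:coal
    a skos:Concept ;
    skos:prefLabel "coal"@en ;
    skos:definition "Coal used as a power generation fuel."@en ;
    skos:inScheme ex:fuel-types .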
6. Backbone Model
The Backbone Model of this Supermodel is the model to which all data MUST conform. The model is a profile of the Background Models, which means it implements no new modelling elements of its own but just constrains existing elements in the Background Models. Thus, anything that conforms to the Backbone Model will conform to the Background Models also. This form of profiling is formally defined in The Profiles Vocabulary [PROF].
This Backbone Model is quite "lite" in that it has comparatively few requirements. This is due to its role: to ensure that all data in the Supermodel exhibits a minimum set of properties and patterns for interoperability. The Backbone Model doesn't try to model all things within all sub-domains of this Supermodel: that is the job of the various Component Models.
This Backbone Model is defined at:
The profile's main elements are articulated here for completeness of documentation.
6.1. Definition
This is part of the formal definition of this profile:
<https://linked.data.gov.au/def/fsdf-backbone>
a prof:Profile ;
sdo:name "FSDF Backbone Model Profile" ;
sdo:description "This is a profile of DCAT, GeoSPARQL & VocPub to be used as a conformance target for data within the FSDF Supermodel"@en ;
prof:profileOf
<https://www.w3.org/TR/vocab-dcat/> ,
<http://www.opengis.net/doc/IS/geosparql/1.1> ,
<https://www.w3.org/TR/vocab-data-cube/> ,
<https://www.w3.org/TR/vocab-ssn/> ,
<https://w3id.org/profile/vocpub> ;
...
.
This code identifies the profile, https://linked.data.gov.au/def/fsdf-backbone, and, apart from basic human-readable annotations, states that it is a profile of (prof:profileOf) the main Background Models.
The full profile definition, online at https://linked.data.gov.au/def/fsdf-backbone, gives further details such as the listing of resources within the profile, which includes its specification & validators - also described below.
6.2. Requirements
Here the requirements for data to conform to this profile are listed. The Requirements are identified (GM1 etc.) and are referenced by validation rules in the Validation section below. Note that some of these Requirements reference whole other Standards and Profiles, so all the Requirements from those are relevant.
The capitalised, italicised words such as MUST, MAY etc., have meanings as per [RFC2119].
The rdf namespace referred to is http://www.w3.org/1999/02/22-rdf-syntax-ns#.
Domain | ID | Name | Definition |
---|---|---|---|
General Modelling | GM1 | OWL Conformance | Data MUST conform to OWL |
| GM2 | Class Modelling | Classes of object MUST be modelled as |
| GM3 | Property Modelling | Properties and predicates MUST be modelled as either |
Dataset Metadata | DM1 | DCAT Conformance | Data MUST conform to DCAT |
| DM2 | Dataset Mandatory Properties | |
| DM3 | Dataset Agents | |
| DM4 | Dataset Provenance | |
| DM5 | Dataset Spatiality | |
| DM6 | Dataset Feature Collections | |
Spatiality | S1 | GeoSPARQL Conformance | Spatial data MUST conform to the GeoSPARQL 1.1 Standard |
| S2 | Feature Collection Mandatory Properties | |
| S3 | Dataset Spatiality | |
| S4 | Feature Collection Features | |
| S5 | Feature Mandatory Properties | |
Data Dimensions | DD1 | Data Cube Vocabulary Conformance | Non-physical sciences observations data MUST conform to the Data Cube Vocabulary |
| DD2 | SOSA Conformance | Physical sciences observations data MUST conform to the SOSA ontology |
Vocabularies | V1 | VocPub Conformance | Vocabularies MUST conform to the VocPub Profile of SKOS |
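To indicate how such Requirements become testable, here is a hedged SHACL sketch (SHACL shapes are themselves written in Turtle) of the kind of rule a Requirement like DM2 might compile to. The shape name and the choice of sdo:name as the mandatory property are illustrative assumptions; the authoritative rules are in the validators listed in the Validation section:

@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex: <http://example.com/> .
@prefix sdo: <https://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

ex:DatasetMandatoryPropertiesShape
    a sh:NodeShape ;
    sh:targetClass dcat:Dataset ;    # test every dcat:Dataset instance in the data
    sh:property [
        sh:path sdo:name ;           # assumed mandatory property, for illustration
        sh:minCount 1 ;
        sh:message "Requirement DM2: a Dataset must have at least one name" ;
    ] .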
6.3. Validation
To prove that data does conform to this Backbone Model, it must be validated. Since all the expected data for this Supermodel is RDF data, SHACL [SHACL] validation may be used.
This Profile presents its own validator which only includes tests for the rules specific to this profile and not those of the things this Profile profiles. However, a compounded validator is also given below which includes this Profile's validator and the validators from all the Standards and Profiles that this Profile profiles, where they have them. The Standards' and Profiles' validators are also listed individually.
Note: Since some of the Standards that this Profile profiles do not present SHACL validators, we use Null Profiles for them, where a Null Profile is a Profile that implements no constraints on the Standard profiled and exists only to provide a validator for it.
For total validation, the compounded validator should be used. For partial validation, use each of the individual ones.
6.3.1. Process
To validate RDF data, a SHACL validation tool, such as pySHACL (online tools for validation exist too; see the Tooling section below), is used with the data to be validated and the validator as inputs. The data to be validated must include all the elements necessary for validation. For example, if a valid Dataset/Agent relation includes the requirement for the Agent to be classed as an sdo:Person or an sdo:Organization, then the data to be validated must declare this classification, rather than leaving it up to external resources.
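For example, a self-contained chunk of data to be validated might look like this hedged Turtle sketch, with hypothetical ex: identifiers:

@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex: <http://example.com/> .
@prefix sdo: <https://schema.org/> .

ex:dataset-x
    a dcat:Dataset ;
    dcterms:creator ex:nicholas .

# the Agent's classification is asserted locally, not left to external resources
ex:nicholas
    a sdo:Person .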
Note: Validators that find nothing to validate will return true, so if the data to be validated contains no instances of classes known to the validator, no sensible result will be obtained.
Regarding scale: validation is a resource-intensive task, so large datasets should not be validated without dedicated systems. It is probably appropriate to validate only a sample of a Dataset's contents, especially if the content is produced by a script or some other method that makes similar Feature Collections & Features.
6.3.2. Validators
Standard / Profile | Validator | IRI |
---|---|---|
Backbone Model | Backbone Model Validator | |
Backbone Model | Backbone Model Compounded Validator | https://linked.data.gov.au/def/fsdf-backbone/validator-compounded |
DCAT | DCAT Null Profile Validator | |
GeoSPARQL 1.1 | GeoSPARQL Validator | |
Data Cube Vocabulary | Data Cube Vocabulary Null Profile Validator | |
SOSA | SOSA Null Profile Validator | |
VocPub | VocPub Validator | |
6.3.3. Tooling
Several online SHACL validation tools exist that may be used with the validators above:
- SHACL Playground
- EU SHACL Validator

We recommend either the public RDFTools Online tool, which is actively maintained and includes some of the validators listed above, or the Geoscience Australia copy of RDFTools with all the validators above preloaded:

- Public RDFTools Online
- GA RDFTools Online (link needed from GA)
7. Component Models
This section contains the details of the various Component Models used for the individual datasets within the FSDF Data Platform.
Several datasets in the Platform use only the Backbone Model for their model and thus require no specialised Component Model. It is expected that many straightforward spatial datasets may be created as plain Backbone Model-only datasets in this way if they only contain collections of named Features without special relationships or properties.
The table below lists the current major Datasets in the FSDF Data Platform and their Component Models or use of the Backbone Model.
Dataset | Persistent Identifier | Model | Notes |
---|---|---|---|
Australian Statistical Geographies Standard, Edition 3 | | | Previous editions of the ASGS dataset used specialised Component Models, in particular the ASGS Ontology, but the current Edition 3 uses only the Backbone Model |
Australian Hydrological Geospatial Fabric (Geofabric), v3 | | | The Geofabric Ontology is a very small specialisation of the Backbone Model that only declares specialised classes of |
(Australian) Geocoded National Address File (G-NAF) | | | The G-NAF uses a highly specialised Component Model that models many aspects of Addresses |
Placenames | | | The Place Names ontology originally made for the Loc-I Project is currently used, however a future version might use parts of the ANZ National Address Model which covers Place Names also |
The table below lists the current smaller "FSDF" Datasets in the FSDF Data Platform.
Dataset | Persistent Identifier | Model | Notes |
---|---|---|---|
Electrical Infrastructure | | | |
Facilities | | | |
Sandgate example | | | This dataset does not have a Persistent Identifier as it's an example dataset only |
8. Supporting Vocabularies
The vocabularies supporting the Component Models of the Datasets listed in the Section above are all from one of the following types:
- Well-known, public vocabularies
  - For example, the vocabulary of Role types for Agents with respect to Datasets is the ISO's Role Code vocabulary
- Vocabularies within Component Models
  - For example, the various vocabularies within the ANZ National Address Model, such as Address Component Types
- Vocabularies created for Datasets within this Supermodel but not within Component Models
  - For example, https://linked.data.gov.au/def/fsdf/ground-relations [FSDF Ground Relations]
All vocabularies are discoverable by inspecting the data that refers to them. Those within Component Models are also discoverable via their models, and those in Category 3 are additionally discoverable via the FSDF Vocabulary Server:
References
-
[ABIS] Department of Agriculture, Water and the Environment, Australian Biodiversity Information Standard (ABIS), Australian government Semantic Web Standard (2022-01-14). https://linked.data.gov.au/def/abis
-
[DCTERMS] DCMI Usage Board, DCMI Metadata Terms, A DCMI Recommendation (2020-01-20). https://www.dublincore.org/specifications/dublin-core/dcmi-terms/
-
[DCAT] World Wide Web Consortium, Data Catalog Vocabulary (DCAT) - Version 2, W3C Working Group Note (04 February 2020). https://www.w3.org/TR/vocab-dcat/
-
[GEO] Open Geospatial Consortium, OGC GeoSPARQL - A Geographic Query Language for RDF Data, Version 1.1 (2021). OGC Implementation Specification. http://www.opengis.net/doc/IS/geosparql/1.1
-
[ISO19156] International Organization for Standardization, ISO 19156: Geographic information — Observations and measurements (2011)
-
[OGCAPI] Open Geospatial Consortium, OGC API - Features, overview website (2022). OGC Implementation Specification. https://ogcapi.ogc.org/features/. Accessed 2022-03-03
-
[OGCLDAPI] SURROUND Australia Pty Ltd, OGC LDAPI Profile, Profiles Vocabulary Profile (2021). https://w3id.org/profile/ogcldapi
-
[OWL] World Wide Web Consortium, OWL 2 Web Ontology Language Document Overview (Second Edition), W3C Recommendation (11 December 2012). https://www.w3.org/TR/owl2-overview/
-
[PROF] World Wide Web Consortium, The Profiles Vocabulary, W3C Working Group Note (18 December 2019). https://www.w3.org/TR/dx-prof/
-
[PROV] World Wide Web Consortium, PROV-O: The PROV Ontology, W3C Working Group Note (18 December 2019). https://www.w3.org/TR/prov-o/
-
[RDF] World Wide Web Consortium, RDF 1.1 Concepts and Abstract Syntax, W3C Recommendation (25 February 2014). http://www.w3.org/TR/rdf11-concepts/
-
[RDFS] World Wide Web Consortium, RDF Schema 1.1, W3C Recommendation (25 February 2014). https://www.w3.org/TR/rdf-schema/
-
[RFC2119] Internet Engineering Task Force, Key words for use in RFCs to Indicate Requirement Levels, Best Current Practice (March 1997). https://tools.ietf.org/html/rfc2119
-
[QB] World Wide Web Consortium, The RDF Data Cube Vocabulary, W3C Recommendation (16 January 2014). https://www.w3.org/TR/vocab-data-cube/
-
[SDO] W3C Schema.org Community Group, schema.org, Community ontology (2015). https://schema.org
-
[SHACL] World Wide Web Consortium, Shapes Constraint Language (SHACL), W3C Recommendation (20 July 2017). https://www.w3.org/TR/shacl/
-
[SSN] World Wide Web Consortium, Semantic Sensor Network Ontology, W3C Recommendation (19 October 2017). https://www.w3.org/TR/vocab-ssn/
-
[SKOS] World Wide Web Consortium, SKOS Simple Knowledge Organization System Reference, W3C Recommendation (18 August 2009). https://www.w3.org/TR/skos-reference/
-
[TTL] World Wide Web Consortium, RDF 1.1 Turtle Terse RDF Triple Language, W3C Recommendation (25 February 2014). https://www.w3.org/TR/turtle/
-
[VOCPUB] SURROUND Australia Pty Ltd, VocPub, Profile of SKOS (14 June 2020). https://w3id.org/profile/vocpub