FINNISH LINKED DATA PILOTS
Kai Koistinen
Data Linking by Indirect Reference Systems workshop, 5.9.2018
NLS FINLAND
- National Land Survey of Finland
- National mapping and cadastral agency
- Geodetic research institute FGI integrated into NLS a few years ago
- SDI services department
- Department of FGI
- SDI development
- INSPIRE secretariat
CONTENT
- Linked Data pilots on themes
  - Geographical Names
  - Buildings
  - Administrative units and Statistical units
- Conclusions
THEME: GEOGRAPHICAL NAMES
LD PILOT ON GN THEME
- Public Administration Recommendation "Unique identifiers of the geographic information" (2015-2016)
  - National recommendation for spatial data identifiers
  - URI structure
  - Examples of how to publish spatial data also as RDF
- National spatial data URI redirection service paikkatiedot.fi
  - Spatial object URI pattern: http://paikkatiedot.fi/so/[datasetid]/[localid]
- PoC on identifier and URI service implementation
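The spatial object URI pattern above can be illustrated with a minimal Python sketch. The function name and the validation rule (rejecting empty segments, slashes and whitespace) are assumptions made for illustration; the recommendation itself only gives the pattern.

```python
BASE = "http://paikkatiedot.fi/so"


def spatial_object_uri(dataset_id: str, local_id: str) -> str:
    """Fill the [datasetid] and [localid] slots of the URI pattern."""
    for segment in (dataset_id, local_id):
        # Assumption: a segment must be non-empty and must not contain
        # characters that would change the path structure.
        if not segment or "/" in segment or any(ch.isspace() for ch in segment):
            raise ValueError(f"invalid URI segment: {segment!r}")
    return f"{BASE}/{dataset_id}/{local_id}"


print(spatial_object_uri("1000772", "10342733"))
```

With dataset ID 1000772 and local ID 10342733 the function returns http://paikkatiedot.fi/so/1000772/10342733.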
LD PILOT ON GN THEME
- Data from NLS's Geographical Names Register
- Feature types
  - Named places (point geometry, classification)
  - Geographical names (name and its translations)
  - Map names (name positioning on cartographic products, not used in pilot)
- Every feature has a locally persistent ID
- Pilot: local IDs -> global paikkatiedot.fi URIs
- URI service on top of WFS
  - HTML data card for human viewers
  - Machine-readable formats for applications (RDF/XML, Turtle, JSON-LD, Schema.org, GML)
- Links
  - Named Place <-> Geographical Name
  - Named Place <-> Classification-based subdatasets (e.g. named buildings of Helsinki)
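One common way for a URI service to serve both human viewers and applications from the same URI is HTTP content negotiation. The slides do not say how the pilot chose the representation, so the sketch below is an assumption: `pick_format` and the `SUPPORTED` table are hypothetical names, and only the registered media types for HTML, RDF/XML, Turtle, JSON-LD and GML are listed.

```python
# Media types for the representations named on the slide (Schema.org markup
# has no media type of its own, so it is left out of this sketch).
SUPPORTED = {
    "text/html": "HTML data card",
    "application/rdf+xml": "RDF/XML",
    "text/turtle": "Turtle",
    "application/ld+json": "JSON-LD",
    "application/gml+xml": "GML",
}


def pick_format(accept_header: str, default: str = "text/html") -> str:
    """Return the supported media type with the highest q-value.

    Falls back to `default` when nothing in the header is supported.
    Wildcards such as */* are not expanded in this simplified sketch.
    """
    best_type, best_q = default, 0.0
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # q defaults to 1.0 when the client gives no weight
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if media_type in SUPPORTED and q > best_q:
            best_type, best_q = media_type, q
    return best_type


print(pick_format("text/turtle;q=0.8, application/ld+json;q=0.9"))
```

A browser sending `text/html` gets the data card, while an application asking for `text/turtle` gets RDF from the same URI.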
GN PILOT DEMO
- Named Place data card for city of Helsinki
  http://paikkatiedot.fi/so/1000772/10342733
- Geographical name Helsinki
  http://paikkatiedot.fi/so/1000773/40342733
- Named Place types of Helsinki
  http://paikkatiedot.fi/so/1000772:091/
- Named Buildings of Helsinki
  http://paikkatiedot.fi/so/1000772:091500/
GN PILOT CONCLUSIONS
- HTTP URIs and HTML data cards improve data linkability and accessibility
- On-the-fly transformation from WFS -> no SPARQL API for utilizing the RDF data
- Created subdatasets work as stored queries (e.g. Buildings of Helsinki) -> user requirements need to be clearly defined
  - Custom queries can only be made in WFS
- Google has indexed the data cards
THEME: BUILDINGS
LD PILOT ON BU THEME
- Geospatial Platform project (->2019)
  - Platform collects spatial data from various public administration providers and makes them available to users
  - Harmonized data models (national and INSPIRE)
  - Standard services (WMS, WFS, ...)
- National Topographic Database is one of the subprojects
  - Renewal of the NLS's national topographic database: data models, production processes
  - Piloting and implementation of Buildings theme during 2016-2017
  - Best available public administration buildings data in 2D and 3D
  - Linked Data product pilot
LD PILOT ON BU THEME
- Data from the renewed topographic database
  - 2D and 3D geometries
  - Every building has a persistent HTTP URI
- Automatic scripts for generating RDF data from relational spatial database (PostGIS)
- Data was stored as triples in RDF database (Apache Jena Fuseki)
- Data maintenance process was also successfully automated in pilot -> up-to-date Buildings as Linked Data product
- Buildings data was linked with the named buildings (from GN pilot)
  - Also links to Wikipedia and Wikidata
- HTML data cards were built on top of the Linked Data service (not on top of WFS like in GN pilot)
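The step from relational rows to triples can be sketched as follows. This is not the pilot's script: the row layout (`id`, `name`, `wkt`), the `/geom` suffix for geometry URIs and the sample values are hypothetical, and the RDFS and GeoSPARQL terms are one plausible vocabulary choice. Output is N-Triples, one triple per line, which an RDF database can bulk-load.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
GEO = "http://www.opengis.net/ont/geosparql#"
BASE = "http://paikkatiedot.fi/so/kmtk_building/"


def escape_literal(text: str) -> str:
    """Escape the characters that N-Triples string literals require."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def row_to_ntriples(row: dict) -> list:
    """Turn one database row (hypothetical columns) into N-Triples lines."""
    subject = f"<{BASE}{row['id']}>"
    geom = f"<{BASE}{row['id']}/geom>"  # hypothetical geometry URI scheme
    wkt = escape_literal(row["wkt"])
    lines = [
        f"{subject} <{RDF_TYPE}> <{GEO}Feature> .",
        f"{subject} <{GEO}hasGeometry> {geom} .",
        f'{geom} <{GEO}asWKT> "{wkt}"^^<{GEO}wktLiteral> .',
    ]
    if row.get("name"):
        name = escape_literal(row["name"])
        lines.append(f'{subject} <{RDFS_LABEL}> "{name}" .')
    return lines


# Illustrative row; in practice rows would come from a PostGIS query.
sample = {"id": "715591e5-9a50-4f43-8574-143765c7273d",
          "name": "Example building", "wkt": "POINT(24.94 60.17)"}
for line in row_to_ntriples(sample):
    print(line)
```

Re-running such a script on changed rows is one simple way to keep the RDF copy in step with the source database.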
BU PILOT DEMO
- Basic HTML Buildings data card (not publicly available)
  http://paikkatiedot.fi/so/kmtk_building/715591e5-9a50-4f43-8574-143765c7273d
- HTML data card with links to GN and other datasets (not publicly available)
  http://paikkatiedot.fi/id/kmtk_rakennukset/kmtk_612d59d8-2668-4d13-ad86-4867f5afcfb0
BU PILOT CONCLUSIONS
- A real Linked Data RDF database with SPARQL API
- Data had to be duplicated to generate the LD product
  - Successful implementation of automatic duplication process
- Data was enriched in the process with GN, Wikipedia and Wikidata links by using spatial analysis
  - Similar methods could be used to link other spatial datasets
  - In the renewed topographic database the BU --> GN link will be maintained in the source database
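The slides do not describe which spatial analysis produced the links. As one simple stand-in, the sketch below links each building to the nearest named place within a distance threshold. The function name, the 50 m threshold, the placeholder URIs and the use of planar coordinates in metres are all assumptions.

```python
import math


def link_nearest(buildings: dict, named_places: dict,
                 max_distance_m: float = 50.0) -> dict:
    """Map each building URI to the closest named-place URI within range.

    Both inputs map a URI to an (x, y) point in a projected CRS (metres).
    Buildings with no named place within max_distance_m get no link.
    """
    links = {}
    for b_uri, (bx, by) in buildings.items():
        best_uri, best_dist = None, max_distance_m
        for p_uri, (px, py) in named_places.items():
            dist = math.hypot(px - bx, py - by)
            if dist <= best_dist:
                best_uri, best_dist = p_uri, dist
        if best_uri is not None:
            links[b_uri] = best_uri
    return links


# Illustrative coordinates and placeholder URIs only.
buildings = {"urn:example:bu/1": (0.0, 0.0), "urn:example:bu/2": (1000.0, 1000.0)}
places = {"urn:example:gn/1": (3.0, 4.0), "urn:example:gn/2": (30.0, 40.0)}
print(link_nearest(buildings, places))
```

The brute-force loop is fine for a sketch; a production run over a national dataset would normally use a spatial index or do the matching in the spatial database itself.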
THEMES: ADMINISTRATIVE UNITS AND STATISTICAL UNITS
LD PILOT ON AU AND SU THEMES
- Integration of spatial and statistical data (->2019)
  - Spatial Statistics on the Web (SSW and SSW2 projects)
  - Table Joining Service standard (OGC TJS) development
- Integration of Geographic data and Areal classifications as Linked Open Data (IGALOD)
  - The aim is to plan and pilot how Statistics Finland's areal classifications can be joined with NLS's geometry data using Linked Data techniques
LD PILOT ON AU AND SU THEMES
- Pilot phase: fast track
  - Municipality-based areal classifications from Statistics Finland
  - Administrative units spatial data from NLS
- Both organizations transform their data into RDF and provide SPARQL APIs
  - AU INSPIRE and GeoSPARQL ontologies used in AU theme
  - XKOS used in SU theme
- Both RDF datasets contain municipality IDs, which can be used to create URI links between the datasets
- Result: any municipality-based areal classification can be presented on a map
- To make the data more interesting, some statistical data will be transformed to RDF and linked with the areal classification
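Linking on a shared municipality ID amounts to a key join. The sketch below is illustrative only: the function names and the example.org URIs are hypothetical, and zero-padding codes to three digits is an assumption that guards against one dataset storing the code as a number (91) and the other as text ("091").

```python
def normalize(code) -> str:
    """Bring a municipality code to a three-character, zero-padded string."""
    return str(code).strip().zfill(3)


def link_by_municipality_code(au_units: dict, su_items: dict):
    """Pair SU and AU resource URIs that share a municipality code.

    Both inputs map a municipality code to a resource URI.
    Returns (links, unmatched): a list of (su_uri, au_uri) pairs and the
    SU codes for which no administrative unit was found.
    """
    au_by_code = {normalize(c): uri for c, uri in au_units.items()}
    links, unmatched = [], []
    for code, su_uri in sorted((normalize(c), u) for c, u in su_items.items()):
        au_uri = au_by_code.get(code)
        if au_uri is None:
            unmatched.append(code)
        else:
            links.append((su_uri, au_uri))
    return links, unmatched


# Placeholder URIs; the real datasets would supply their own.
au = {"091": "http://example.org/au/091", "049": "http://example.org/au/049"}
su = {91: "http://example.org/su/091", "999": "http://example.org/su/999"}
links, unmatched = link_by_municipality_code(au, su)
print(links, unmatched)
```

Reporting the unmatched codes makes it visible when the two organizations' municipality lists differ, for example after municipal mergers.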
MAPPING ARCHITECTURE (DRAFT)
AU AND SU PILOT DEMO
AU AND SU PILOT CONCLUSIONS
- SPARQL is very useful for combining two or more datasets from different organizations
  - Easy to combine data from different sources on the fly
- Importance of data modelling (ontologies) and ID persistence when linking across organization boundaries
CONCLUSIONS
CONCLUSIONS
[Diagram: links between the themes GN, BU, AD, AU and SU and external data: Statistics, General Finnish Ontology YSO, and archives, libraries, museums]
- There are already a lot of relationships between datasets
  - GN theme is the glue between many themes
- Persistent HTTP URIs and common ontologies are vital for linking and accessing datasets
- RDF/SPARQL is better than GML/WFS for combining data from different sources
- LD approach can bring most benefit for users who need to combine data from many sources
THANK YOU! QUESTIONS?