United Nations
Economic and Social Council
Distr.: General
6 April 2016
ECE/CES/2016/25
English only

Economic Commission for Europe
Conference of European Statisticians
Sixty-fourth plenary session
Paris, 27-29 April 2016
Item 6 of the provisional agenda
Geospatial information services based on official statistics

Integration of spatial information in statistics production and establishment of spatial statistical services

Note prepared by Statistics Norway

Summary
This document shares information on the measures taken by Statistics Norway to integrate geospatial information in the Generic Statistical Business Process Model (GSBPM) applied in statistical production. The document describes the progress achieved and points to possible developments and recommendations for future work. The document is presented to the Conference of European Statisticians seminar on Geospatial information services based on official statistics for discussion.

GE.16-05567(E)
I. Summary

1. Since the late 1990s, Statistics Norway has been gradually integrating spatial information into statistics, while organizing statistical production along the Generic Statistical Business Process Model (GSBPM). Collaboration with the National Mapping Authority (NMA), other mapping agencies and entities responsible for national registers is crucial for access to national map data, cadastral information and geocoded national registers (location information).

2. Within the domain of land use statistics, a number of regular official statistics have been developed. However, several statistical domains in Statistics Norway have yet to explore geographical information systems (GIS) as an analytical tool and as a tool for compiling statistics from various geo-referenced datasets.

3. Recently, to boost the integration of statistics and geospatial information in Statistics Norway, two initiatives have been launched: a designated geodatabase for all geo-referenced datasets in Statistics Norway, and a Resource Centre for GIS and Geospatial Information. A web map application with Web Map Service (WMS) and Web Feature Service (WFS) services for data sharing has also been established. The number of statistics available for map display is limited, but gradually increasing.

4. A common understanding and shared goals for the integration of geospatial information and statistics are essential in the collaboration between national statistical offices (NSOs), mapping agencies and national registers. NSOs may be totally dependent on mapping agencies for access to reliable basic geospatial information (topographic and thematic). Similarly, in a geospatial statistical setting, NSOs are totally dependent on access to national registers with location information attached to the statistical units.

5. It is clearly an advantage if an NSO has the legal basis to put forward proposals concerning the manner in which data-processing systems and registers should be designed in order to safeguard inter-linkage with statistics. However, coordination at the national level can be demanding.

II. Description of current work

6. Statistics Norway aims to have geospatial information fully integrated in the GSBPM (figure 1). Over time, the application of such data in statistics production should become as important and as ordinary as the use of any other type of data.

7. NMA established the Norway Digital Consortium in 2005. The Norway Digital Consortium is a national spatial data infrastructure in which more than 600 partners are active. This initiative, together with the need to implement the Infrastructure for Spatial Information in Europe (INSPIRE) directive, resulted in the Geodata Act in 2010.

8. The Geodata Act states that NMA is the National Geodata Coordinator in Norway. Since the 1980s, Statistics Norway has had a tradition of including a geographical component in its statistical registers (usually enhanced copies of national registers). As stated in the Statistics Act,¹ implemented by the Ministry of Finance in 1989, Statistics Norway has privileged access to all data necessary for the production of official statistics. By taking part in the Norway Digital Consortium, Statistics Norway gets access to several spatial datasets that can be used in the production of official statistics, but it also implies that Statistics Norway is obliged to share data with the other parties in the consortium. As a result of close cooperation with NMA and other agencies, Statistics Norway receives several spatial datasets and cadastral information on a regular basis.

¹ See the Statistics Act: https://www.ssb.no/en/omssb/styringsdokumenter/lover-og-prinsipper/thestatistics-act-of-1989

9. In an attempt to ensure the best possible data quality and timely delivery of the datasets, Statistics Norway has established a regime of agreements with national registers. The agreements and annexes cover quality enhancement measures and arrangements to ensure safe, efficient and timely dataset delivery. Most of the registers have statistical units with location information (addresses, property identifiers etc.).

10. Since the late 1990s, Statistics Norway has gradually integrated the use of geospatial information in the GSBPM. A number of regular official statistics have been developed using geospatial data. Still, in several statistical domains there have been no attempts to explore geographical information systems (GIS) as an analytical tool or as a tool for compiling statistics from the various geo-referenced datasets that are available.

11. Incorporating spatial data in the GSBPM is currently strongly encouraged by the management of Statistics Norway. Until now the work has encompassed:
• Identifying steps in the GSBPM where spatial data lacks attention;
• Identifying steps in the model where spatial information has been integrated;
• Communicating strengths and weaknesses to the management (Board of Directors).

Figure 1. The Generic Statistical Business Process Model

12. In 2013, the Directors Forum of Statistics Norway discussed the use and integration of geospatial data and GIS in statistical production. A preparatory group for the Directors Forum suggested that Statistics Norway should put more effort into integrating the spatial dimension in statistics more widely, as well as strengthening the collection of geospatial data for Statistics Norway's common geodatabase. Statistics Norway should also strengthen the dissemination part of the GSBPM concerning statistics in maps and data sharing services. One of the important findings in a quality assessment of some statistics was that focus groups of selected users asked for more visualization of statistics (graphs and maps) to enhance the quality of statistics as a communication product.

13. The current statistics production (see figure 2) that involves spatial information and GIS tools usually follows the broad and curved arrows all the way to "5. Dissemination". Occasionally, for various reasons, the chosen paths are the shortcuts along the thin line or the dashed line. The data collection step is omitted from the sketch.
Figure 2. Main elements in the current production line of spatial statistics in Statistics Norway

14. In order to raise awareness of spatial data and the wide range of potential implications for the whole of Statistics Norway, a Resource Centre for GIS and Geospatial Information was recently established. The centre has the mandate to:
• Actively introduce statisticians to GIS tools and the concept of geospatial data, as well as the untapped potential of the basic geo-referenced datasets and the thematic datasets with location information available to them;
• Provide statisticians with adequate training in using GIS tools and geospatial analysis;
• Support the units responsible for the Collect step in the GSBPM, in order to get complete coverage and all the right variables into Statistics Norway's geodatabase;
• Support the units responsible for the Process, Analyse and Disseminate steps in the GSBPM in matters regarding quality control, correct storage in Statistics Norway's geodatabase and metadata specifications for data to be shared in accordance with the INSPIRE directive;
• Represent Statistics Norway nationally and internationally in geospatial matters.

15. As of April 2016, the resource centre will be staffed with the equivalent of 3 full-time man-years. The long-term goal is that the resource centre could reduce its activity, since each unit producing statistics using the GSBPM should eventually gain the skills for using GIS and geospatial data. The various units in Statistics Norway should handle geospatial data and GIS tools just as naturally as any other datasets and tools necessary to work in line with the GSBPM.

A. User needs and new statistics with geospatial data

16. It is important to start a development process from user needs in accordance with the GSBPM, and each statistician should be familiar with the user needs concerning their respective statistics. After the establishment of the resource centre, the statistical divisions of Statistics Norway identified statistics where GIS and geospatial data could be utilized. This information is the starting point for the resource centre in guiding statisticians towards developing new or improved statistics or new analyses.

17. As the knowledge of GIS tools and geospatial data increases in the organization, more statistics that could be integrated with geospatial data are likely to be found. This will be based on an assessment of user needs and the possibilities offered by the GIS software and spatial data in combination with the statistics in question.

18. Linking statistics and geospatial data is a continuous effort, but the resource centre will contribute to a stronger emphasis on this during the early years. Hopefully, it will help statisticians to find new insights and develop new or improved statistics and analyses that will benefit society.

B. Data collection, data management and preparation

19. The resource centre will support the process of data collection and management and see to it that it is carried out with sufficient quality measures at all stages. Crucial in this respect are:
• Good cooperation with the national geospatial data coordinator (in Norway, the National Mapping Authority);
• An annual cycle with dates for when data are expected to be ready, and a list of who does what;
• Implementation of a number of quality checks;
• Sufficient resources for updates and maintenance of the geodatabase;
• Contribution to increased effort in geo-referencing relevant registers.

20. Statistics Norway's geodatabase is established, and a number of datasets will be updated annually. The datasets are nationwide, but include the commonly used regional statistical units. This makes it easy to retrieve data of a manageable size for processing for specific purposes.

21. Some of the most important spatial datasets in the database so far include addresses with residents (number, age group and gender), buildings (building type, areas, dates etc.) and land use (figure 3). These make up vital populations for extraction and further processing in connection with different statistics. When it comes to land use and land cover statistics, a substantial effort has been made to integrate a number of geo-referenced datasets. The land use/land cover data are then made available in Statistics Norway's geodatabase as a basis for different statistics (urban areas, coastal zone and so on).

22. Statistics Norway is considering preparing datasets for geographic analysis for some basic and regularly used registers. This would make geographic analysis available to statisticians without the need for advanced GIS knowledge. Examples could be adding distance to coastline, distance to centre zones and distance to hospital to the addresses in the residents dataset.
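As a minimal sketch of the kind of pre-computed variable described in paragraph 22, the following Python example adds a distance-to-nearest-hospital field to address records. It assumes that all coordinates are in a projected coordinate system with metre units (for example a UTM zone), so that straight-line distance is meaningful. The record layout, field names and coordinate values are hypothetical and are not taken from Statistics Norway's geodatabase. Distance to a coastline would require point-to-line distance and is more naturally computed in GIS software.

```python
import math


def nearest_distance_m(point, facilities):
    """Return the straight-line distance in metres from point to the closest facility."""
    if not facilities:
        raise ValueError("at least one facility is required")
    return min(math.dist(point, facility) for facility in facilities)


def add_distance_field(addresses, facilities, field="dist_hospital_m"):
    """Return copies of the address records with a rounded distance field added."""
    enriched = []
    for record in addresses:
        row = dict(record)  # copy, so the input records are left untouched
        row[field] = round(nearest_distance_m((row["x"], row["y"]), facilities), 1)
        enriched.append(row)
    return enriched


# Hypothetical sample data: projected coordinates in metres.
hospitals = [(1000.0, 1000.0), (5000.0, 2000.0)]
addresses = [
    {"address_id": "A1", "x": 1300.0, "y": 1400.0},
    {"address_id": "A2", "x": 5000.0, "y": 2600.0},
]

for row in add_distance_field(addresses, hospitals):
    print(row["address_id"], row["dist_hospital_m"])
```

Storing the result as an ordinary column means a statistician can tabulate residents by distance band with standard statistical tools, without running any spatial operation.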
Figure 3. Schematic data flow and illustration of reuse of statistical spatial datasets

C. Dissemination

23. After the Directors Forum meeting in 2013, one of the Director General's first actions was to strengthen the dissemination part in the GSBPM, and this work is still ongoing. Statistics Norway already had a web page for viewing and web services (http://kart.ssb.no) and a separate web page for downloading datasets. The kart.ssb.no page is the first tool for distributing the geospatial data produced by Statistics Norway to the public and for fulfilling the national and international dissemination obligations (INSPIRE). In addition, Statistics Norway has published some maps using ArcGIS Online. One example is a map on gender equality in society (gender equality map,² in Norwegian only). This tool has a lower technical entry level than kart.ssb.no.

² See the gender equality map: ssb1.maps.arcgis.com/apps/mapseries/index.html?appid=b75bd3293e2d4b62af945dd8cb7f2e9e

24. The latest efforts related to dissemination of spatial statistics have clarified the areas of responsibility in the GSBPM, and the update routines have been improved. Currently, work is under way to establish better dissemination of articles on statistics. One of the most-wanted features is embedded interactive maps in the articles (figure 4). For this it will be important to choose the technology and find efficient routines for updating. More efficient routines will probably include the use of our own application programming interfaces (APIs) as data sources in some dissemination products. This will be explored further and tested during 2016.
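To illustrate how a WFS service could act as a data source for an embedded map, the sketch below builds a GetFeature request URL from the standard key-value parameters of the OGC Web Feature Service 2.0 interface. The base URL, the layer name and the bounding box are placeholders; the actual endpoints and layer names offered by kart.ssb.no are not given in this document. Whether a server can return JSON output depends on its configuration, so the output format is an assumption as well.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(base_url, type_name, bbox=None, count=None,
                       output_format="application/json"):
    """Build a WFS 2.0.0 GetFeature URL from standard key-value parameters."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,          # layer (feature type) to fetch
        "outputFormat": output_format,   # assumed to be supported by the server
    }
    if bbox is not None:
        # Comma-separated corner coordinates; axis order depends on the CRS.
        params["bbox"] = ",".join(str(value) for value in bbox)
    if count is not None:
        params["count"] = str(count)     # maximum number of features returned
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
url = wfs_getfeature_url(
    "https://example.org/wfs",
    "stats:population_grid",
    bbox=(10, 59, 11, 60),
    count=100,
)
print(url)
```

A web map embedded in an article could request such a URL each time the page loads, so that the map follows the published data without manual updates.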
Figure 4. Example of dissemination with embedded map (http://www.ssb.no/en/befolkning/statistikker/beftett/aar/2015-12-11)

IV. Issues and challenges

25. This part of the document is organized according to the steps in the GSBPM.

A. Specify needs

26. In order to give users new or improved analyses or statistics, it is important to identify existing statistics with a potential for further development with regard to a spatial dimension. Users of modern communications technology expect to use map applications in their search for information and statistics, and they expect more statistics to be produced based on all the spatial information that is available to national statistical offices and other statistics providers. Some success stories are much needed in order to inspire and to raise awareness and knowledge about spatial information as a basis for statistics. For some statistics GIS has been used for several years, but the value added to statistics in general by using location information in register data gathered by Statistics Norway is expected to be high. As a starting point for the work ahead, in planning activities for the Resource Centre for GIS and Geospatial Information, a questionnaire was distributed to all statistics departments in Statistics Norway. Several departments have reported concrete projects where the spatial dimension and application of GIS will play an important part.

B. Develop and design

27. Quality-assured geospatial data available on agreed dates presupposes clear responsibilities and efficient handling of data, in close cooperation with the data providers. Capacity building throughout Statistics Norway is important in order to reach the goal of enabling the various units in Statistics Norway to handle geospatial data and GIS tools just as well as any other datasets and tools in line with the GSBPM.

28. Statistics Norway uses standard commercial GIS software. The commercial software has been used in the organization for years (although to a quite limited extent, considering the ratio of active licences to employees), and it is reliable and well proven for producing statistics. It is also relatively easy to use for setting up automatic routines using graphic modelling tools, without the need for programming. This should help to convince the ordinary statistician to use GIS in building new or improved statistics, which is one of the goals for the Resource Centre for GIS and Geospatial Information. Relying on pure scripting software would make this competence building much more challenging.

29. In the medium term, Statistics Norway will consider switching to open-source software. One advantage of switching to open source is, of course, the saving in licence costs. But in the initial phase of the broad-scale capacity building on GIS and spatial information, Statistics Norway will make use of commercial software, as this is expected to give a higher probability of success, given the in-house software and expert capacity available.

C. Build

30. A central Geodatabase has been established for all location information and geo-referenced data in Statistics Norway. Statistics Norway collects data as of January each year and stores the data as basic data, building up a time series in the Geodatabase. This shared database is to ensure that the same data are used for different statistics, which makes the data and definitions coherent across statistical domains. A system for long-term storage of the statistical data that accumulate as a result of production runs remains to be developed, but having the central Geodatabase is a good starting point for this work.

D. Collect

31. As with any kind of data for statistical purposes, spatial data collection is a continuous process where new data sources must be considered and existing data sources must be assessed from year to year.

32. The bulk of geospatial data collected annually is an update of the national datasets comprising the most detailed topographic and thematic map data from the National Mapping Authority, as well as other national mapping entities and data providers.

33. Most of the data are established as seamless national datasets by the National Mapping Authority in the context of the Norway Digital Consortium. Statistics Norway collects the data and puts them into the Geodatabase. Close collaboration with the National Mapping Authority is important for data delivery on agreed dates, but also for information regarding data quality.
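A simple plausibility check on an annual delivery, run before the data are loaded into the Geodatabase, could compare record counts per regional unit with the previous year's delivery and flag large changes for follow-up with the data provider. The sketch below assumes that counts have already been tallied per region. The region codes, the counts and the 10 per cent threshold are illustrative assumptions and do not describe Statistics Norway's actual routines.

```python
def flag_changes(previous, current, threshold=0.10):
    """Return (region, previous_count, current_count, reason) for regions needing review."""
    flags = []
    for region in sorted(set(previous) | set(current)):
        old = previous.get(region)
        new = current.get(region)
        if old is None:
            flags.append((region, old, new, "new region"))
        elif new is None:
            flags.append((region, old, new, "missing region"))
        elif old == 0:
            # A relative change cannot be computed from zero; flag any increase.
            if new != 0:
                flags.append((region, old, new, "change from zero"))
        elif abs(new - old) / old > threshold:
            flags.append((region, old, new, "large change"))
    return flags


# Hypothetical record counts per region for two consecutive annual deliveries.
previous = {"R01": 1000, "R02": 500, "R04": 800}
current = {"R01": 1020, "R02": 620, "R03": 900}

for flag in flag_changes(previous, current):
    print(flag)
```

The flagged list can be sent back to the data provider together with the delivery date, which keeps the feedback tied to a specific dataset version.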
34. The massive amounts of new geo-referenced data generated by handheld devices, devices installed in vehicles, bar-code scanners, stationary logging instruments and cameras, together with satellite data and other remote sensing data, are of course very interesting to collect. The data owners should be made aware that they are contributing to a pool of spatial information that is of great interest to the statistical community, and that the data should be managed according to common international guidelines. A challenge is to get an overview and be able to assess what can be of value for national statistical offices and other providers of official statistics.

E. Process

35. Detecting errors in datasets at an early stage of the production process, and finding solutions for obtaining corrected data, is crucial for progress through the steps of the GSBPM. National statistical offices are also expected to keep costs down and be efficient, follow the Code of Practice and fulfil dissemination targets (on timeliness and current interest) set by the central authorities.

36. As regards spatial data from NMA and other major contributors of spatial data from registers, Statistics Norway is implementing routines for automatic feedback on noticeable changes and obvious errors in the data material before the data are stored in the Geodatabase. It is also essential to follow up on these quality issues in the dialogue with national registers. The agreements on quality enhancement measures and arrangements to ensure safe, efficient and timely dataset delivery are updated accordingly.

F. Analysis

37. With a common Geodatabase and access to user-friendly software in place, the principal challenge is building competence and knowledge in the organization in order to use the right datasets for the matter at hand, make the analyses (e.g. map overlays, network analysis and statistical analysis) and compile the statistics the users are asking for. In order to reach this goal, Statistics Norway will organize courses and training workshops.

38. The human resources available are crucial for the capacity to provide statistics and analyses of sufficient quality based on spatial information and GIS technology. Sufficient computer capacity is of course a prerequisite, and as the number of users of GIS licences rises, the capacity has to be monitored closely and measures taken to meet the demand.

G. Disseminate

39. Challenges identified so far are:
• More systematic analysis of user needs. What kind of topics should be covered?
• Which system/tool to choose:
  ◦ One-stop shopping, one tool for all? (kart.ssb.no)
  ◦ Thematic applications like ArcGIS Online
  ◦ Other tools like Highmaps, CartoDB, etc.
• Where to publish?
  ◦ In the context of other dissemination products?
  ◦ Stand-alone websites?
• Organization of the collaboration between the Department of Communications and the statistics departments;
• Creating good routines for automatic updates of tables and graphics on the website (ssb.no/en), using StatBank Norway (www.ssb.no/en/statistikkbanken).

V. Conclusions and recommendations

40. Integration of geospatial information and statistics can to some extent be accomplished without using GIS software. However, the great advantages of using GIS and having skilled statisticians with GIS competence in a national statistical office should not be disregarded.

41. Close collaboration with the National Mapping Authority is crucial, as is collaboration with other data providers. Statistics Norway recommends that national statistical offices have a legal basis for putting forward proposals concerning the manner in which data-processing systems and registers should be designed in order to safeguard consideration for statistics.

42. Agreements with entities administering registers on quality enhancement measures and arrangements to ensure safe, efficient and timely dataset delivery are helpful instruments in the collaboration on data collection for statistical purposes.

43. The generic geospatial statistical framework that is being developed by the Expert Group under the United Nations initiative on Global Geospatial Information Management (UN-GGIM) will provide the statistical and mapping communities, as well as national registers and other collectors of data, with a common approach to the challenge of integrating geospatial information in a statistical context. The statistical community should support the framework proposal when it is presented for global consultation before being submitted to the United Nations Statistical Commission for adoption.