What are we like? Population characteristics from UK censuses Justin Hayes & Richard Wiseman UK Data Service Census Support
Who are we? Richard Wiseman UK Data Service / Jisc Justin Hayes UK Data Service / Jisc
The UK Data Service
- Funded by the ESRC, integrating several previous resources
- A single, comprehensive and integrated point of access to a wide range of social science data
- Support, training and guidance
- ukdataservice.ac.uk
UK Data Service Census Support
- A specialist unit of the UK Data Service
- Access to, and support for use of, data from the last five UK censuses (1971 to 2011)
- Bespoke interfaces to make data easy to find, understand and use
- census.ukdataservice.ac.uk
Who are you? Where are you from? What do you do with data? Why are you here?
Workshop structure
UK Censuses
- Decennial questionnaire surveys
- Entire UK population every ten years* since 1801
- Questions mainly about people and households
- 2011 Census cost ~£500m
- Primary evidence for government policy and spending
- Wide range of high quality demographic and socio-economic characteristics
- Detailed combinations of characteristics - What?
- Small areas - Where?
- Long history - When?
- Rich secondary source of information
- Open Government Licence!
UK 2011 Census
- 27 March 2011
- Three UK census agencies (ONS, NRS, NISRA)
- New questions and variables
- Online and postal completion
- Targeted enumeration
- Sophisticated quality assurance
New variables
- National identity
- Passports held
- Ability in spoken English
- Languages other than English used at home
- Long-term health conditions
- Month/year of arrival into the UK (for people not born in the UK)
- Intention to stay
- Second homes
Main language other than English by number of speakers in England and Wales
Census output data types
- Aggregate data: population characteristics and locations
- Flow data: migration and travel to work
- Anonymised microdata: personal and household level
- Digital boundary data: mapping and spatial analysis
Aggregate data
- Counts of people and households in specific locations with particular combinations of characteristics
- Derived from questionnaire responses
- Characteristics (What?): complex combinations of variables and categories
- Location (Where?): very large to very small areas (UK to postcodes)
- For example: counts of female usual residents aged 16-74 in employment in associate professional and technical occupations for all wards in the County of Devon
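As a rough illustration of what "counts of people with a particular combination of characteristics in each area" means, here is a minimal Python sketch. It is not how the census agencies produce their outputs; the records, ward names and the `count_by_area` helper are all invented for this example.

```python
from collections import Counter

# Hypothetical person-level records:
# (ward, sex, age, in_employment, occupation_group).
# Ward names and values are invented for illustration only.
records = [
    ("Ward A", "Female", 34, True, 3),
    ("Ward A", "Female", 51, True, 3),
    ("Ward A", "Male", 40, True, 3),
    ("Ward A", "Female", 29, True, 2),
    ("Ward B", "Female", 45, True, 3),
    ("Ward B", "Female", 80, False, None),
]


def count_by_area(records, sex, min_age, max_age, occupation_group):
    """Count people per area who match one combination of characteristics."""
    counts = Counter()
    for ward, rec_sex, age, employed, occupation in records:
        if (
            rec_sex == sex
            and min_age <= age <= max_age
            and employed
            and occupation == occupation_group
        ):
            counts[ward] += 1
    return counts


# Females aged 16-74, in employment, occupation group 3, per ward.
ward_counts = count_by_area(records, "Female", 16, 74, 3)
print(dict(ward_counts))
```

Each published aggregate value is one such count: a "what" (the combination of categories) at a "where" (the area).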
Aggregate data
- Age: Age 16 to 74 - Economic activity: in employment the week before the census - Occupation: 3. Associate professional and technical occupations - Sex: Female - Unit: Persons
- Age: Age 16 to 74 - Economic activity: in employment the week before the census - Occupation: All categories: Occupation - Sex: Female - Unit: Persons
Aggregate data
2011 Census geographies
- Subdivisions of the UK into smaller areas
- Sets of similar areas called geographies
- Functional and statistical geographies
  - Local government districts
  - Wards and electoral divisions
- Expecting around 100 different geographies
- Hierarchies of geographies with nesting areas
  - Administrative
  - Statistical
  - Health, Electoral, Postcode, etc.
UK administrative geographies
UK statistical geographies
Data scale and complexity
- Second instalment of 2011 data from InFuse
  - Local and Detailed Characteristics for England and Wales
  - Harmonised UK outputs to local authority level
- 422 supply characteristics table specifications
- 11,000+ supply data files
- 97 variables
- 2,501 categories
- 281 variable combinations
- 139,554 category combinations
- 4,595,560,475 values
- Inconsistent descriptions inhibit global operations
- Inconsistent releases from three UK agencies
Characteristics tables
Geographies scale and complexity
- Second instalment of 2011 data from InFuse
  - Local and Detailed Characteristics for England and Wales
  - Harmonised UK outputs to local authority level
- 31 geography types
- 241,334 areas
- Expecting 100 geography types, 2 million areas
Raw geography entities and relationships
Incomplete geographical availability
- Not all characteristics for all areas
- Principle of confidentiality
  - Information detail vs geographical detail
  - Random noise added to data in outputs
- Lower threshold data
  - Produced for all areas
  - Key Statistics, Quick Statistics, Local Characteristics
- Upper threshold data
  - Produced for areas down to wards* and MSOAs
  - Detailed Characteristics
- Other outputs
  - UK data to district level
Working with census aggregate information
- All values are estimates!
  - Imputation and record swapping
- Making comparisons
  - Use denominators to compare rates
- Use identifiers
  - Areas
  - Cells and intersections
  - Variations within and between censuses
  - Match to non-census data
- Expect gaps in data from InFuse
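To show why denominators matter when making comparisons, the sketch below turns raw counts into rates. The ward names and figures are hypothetical. The numerator and denominator correspond to the kind of cell pair described earlier: one occupation category against all occupation categories, for the same age, sex and economic activity.

```python
# Hypothetical counts per ward: (numerator, denominator).
# Numerator: females aged 16-74 in employment in occupation group 3.
# Denominator: females aged 16-74 in employment in all occupations.
ward_counts = {
    "Ward A": (40, 400),
    "Ward B": (60, 1200),
}


def rate_percent(numerator, denominator):
    """Return numerator as a percentage of denominator, or None if undefined."""
    if denominator == 0:
        return None
    return 100.0 * numerator / denominator


rates = {ward: rate_percent(n, d) for ward, (n, d) in ward_counts.items()}
for ward, rate in rates.items():
    print(f"{ward}: {rate:.1f}%")
```

In this invented data, Ward B has the larger count but the lower rate, which is the kind of difference a comparison of raw counts would hide.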
Traditional table selection
Traditional table cell selection
InFuse
- Open
- Interface to UK census data
- Currently 2011 and 2001 censuses
- New approach to dissemination for the 2011 Census
- Simple and comprehensive global search across UK data
- Selection of multiple areas at different levels across UK
- Based on integrated data and geography models
- API with lightweight dynamic application
InFuse data model
- Single multidimensional dataset
- Deconstruction, rationalisation and re-integration of variables and categories
- All UK table specifications processed
- Integration of table universes as variables
- Enforce consistency across dataset
- Library of variables and categories to describe all counts
- Re-insertion of counts into model
- Retain original cell identifiers
- Attachment of metadata
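One way to picture a single dataset in which every count is described by variables and categories, not by its position in a table, is sketched below. This is only an illustration of the idea and not the actual InFuse implementation. The `cell_key` and `lookup` helpers, the cell identifiers and the counts are all hypothetical.

```python
# Each count is described by a set of (variable, category) pairs,
# rather than by a position in a particular table.
def cell_key(**categories):
    """Build an order-independent key from variable=category pairs."""
    return frozenset(categories.items())


# Hypothetical single dataset: key -> (original cell identifier, count).
# Identifiers and counts are invented for illustration.
dataset = {
    cell_key(sex="Female", age="16 to 74", occupation="3"): ("CELL0001", 10),
    cell_key(sex="Female", age="16 to 74", occupation="All"): ("CELL0002", 100),
}


def lookup(**categories):
    """Find a count by its combination of categories, in any order."""
    return dataset.get(cell_key(**categories))


result = lookup(occupation="3", age="16 to 74", sex="Female")
print(result)
```

Because the key does not depend on the order of the variables, the same count is found however the combination is specified, and the original cell identifier travels with it.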
InFuse variable combination selection
InFuse geography model(s)
- Raw geography model
  - All original geographies and their areas
  - Direct and indirect hierarchical relationships
- Simplified geography model
  - Combinations of equivalent geographies into geography sets with UK coverage where possible
  - Condensed standard/merged geographies in England
- Selections of areas across the UK
  - Multiple geographies in one operation
  - Geography jumps in interface
- Currently administrative and statistical geographies
  - More to follow
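The direct and indirect hierarchical relationships in a geography model can be sketched with a simple parent lookup. This is an assumption-laden toy, not the InFuse geography model: the area names are placeholders, and `ancestors` and `areas_within` are hypothetical helpers.

```python
# Hypothetical direct parent links: area -> (geography type, parent area).
# Names are placeholders, not real census areas or codes.
areas = {
    "Country X": ("country", None),
    "Region 1": ("region", "Country X"),
    "District 1a": ("district", "Region 1"),
    "District 1b": ("district", "Region 1"),
    "Ward 1a-i": ("ward", "District 1a"),
    "Ward 1b-i": ("ward", "District 1b"),
}


def ancestors(area):
    """Return all direct and indirect parents of an area, nearest first."""
    chain = []
    parent = areas[area][1]
    while parent is not None:
        chain.append(parent)
        parent = areas[parent][1]
    return chain


def areas_within(container, geography_type):
    """Select every area of one geography type nested inside a container."""
    return sorted(
        name
        for name, (kind, _) in areas.items()
        if kind == geography_type and container in ancestors(name)
    )


wards = areas_within("Region 1", "ward")
print(wards)
```

Selecting all wards within a region skips the district level, which is roughly what a "geography jump" in an interface would do.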
Raw geography entities and relationships
Admin and statistical geography layers (maps; data under Open Government Licence (OGL) version 1.0)
- United Kingdom
- Nations
- Regions
- Counties
- District Layer
- Ward Layer
- Output Area Layer
InFuse geographic area selection
Key features
- Open access!
- Fast and easy global search
  - Variable and category combinations
  - No tables!
  - Simplified geographies
- Guide users to find data
  - Populated variable combinations
  - Available geographies
  - No data fast!
- More data for more geographies
  - 2.5% of values unique to InFuse
- Improved contextual information
InFuse demonstration http://infuse.mimas.ac.uk/
Mapping aggregate data
Digital boundary data
Geographical Information Systems (GIS)
- Sophisticated database applications
- Store spatial entities and relationships
- Perform common spatial operations
  - Visualisations
  - Spatial analysis
- Join aggregate values to spatial entities (What to Where)
  - Use area identifiers
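Joining aggregate values to spatial entities through a shared area identifier can be sketched without any GIS software. The identifiers, names and values below are hypothetical, and a text label stands in for real polygon geometry. A GIS package would perform the equivalent attribute join on its own tables.

```python
# Hypothetical boundary records keyed by area identifier.
# Real boundary data would hold polygon geometry; a label stands in here.
boundaries = {
    "W01": {"name": "Ward A", "geometry": "<polygon A>"},
    "W02": {"name": "Ward B", "geometry": "<polygon B>"},
    "W03": {"name": "Ward C", "geometry": "<polygon C>"},
}

# Hypothetical aggregate values keyed by the same identifiers.
values = {"W01": 12, "W02": 30}


def join_values(boundaries, values):
    """Attach a value to each boundary record; None marks a gap in the data."""
    joined = {}
    for area_id, record in boundaries.items():
        joined[area_id] = {**record, "value": values.get(area_id)}
    return joined


joined = join_values(boundaries, values)
missing = [area_id for area_id, rec in joined.items() if rec["value"] is None]
print(missing)
```

Matching on identifiers instead of area names avoids problems with spelling variants, and listing the unmatched areas makes gaps in the data visible before mapping.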
Aggregate data
Ward Layer (data under Open Government Licence (OGL) version 1.0)
Aggregate data
What's next?
- Big data release imminent!
- Remaining UK 2011 outputs
- Scotland and Northern Ireland
- Previous censuses (currently in Casweb)
- Integrated boundary data in GIS formats
- Interface design and features
- More contextual information
- Access to API for application development
- Continue engagement with census agencies
- Better data from producers
- Less processing!
And after that?
- Integration of multiple censuses
- Resampling to common geographies
- Output Area Classification
- Non-census data
- External APIs
- Development of data and geography models
- Postcodes?
- Flow (origin/destination) data
Support
- Web forms: ukdataservice.ac.uk/help/
- Follow us: UKDATASERVICE@JISCMAIL.AC.UK, twitter.com/ukdataservice, www.facebook.com/ukdataservice
Questions?