A clustering view on ESS measures of political interest:


A clustering view on ESS measures of political interest: An EM-MML approach

Cláudia Silvestre, Margarida Cardoso, Mário Figueiredo

Escola Superior de Comunicação Social - IPL; BRU_UNIDE, ISCTE-IUL; Instituto de Telecomunicações, Inst. Sup. Técnico; Portugal

Outline
- Objective
- Model: Finite Mixture Models
- Selection Criterion: Minimum Message Length
- Algorithm: EM-MML
- Results
- Conclusions

Objective

Clustering the regions in the European Social Survey based on attitudes towards politics:
- Voted in last national election (Yes; No; Not eligible)
- Contacted politician or government official in last 12 months
- Worked in political party or action group in last 12 months
- Worked in another organisation or association in last 12 months
- Worn or displayed campaign badge/sticker in last 12 months
- Signed petition in last 12 months
- Taken part in lawful public demonstration in last 12 months
- Boycotted certain products in last 12 months
- Feel closer to a particular party than all other parties (Y/N)

Model: Finite Mixture Models

$$f(y_i \mid \theta) = \sum_{k=1}^{K} \alpha_k \, f(y_i \mid \theta_k)$$

- $K$ is the number of segments.
- $y_i$ is regarded as incomplete data, the allocation to segments ($z_i$) being missing.
- Complete data: $(y_i, z_i)$.

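Model: Finite Mixture Models

The log of the complete likelihood:

$$\log f(y_i, z_i \mid \theta) = \sum_{k=1}^{K} z_{ik} \log\left[\alpha_k \, f(y_i \mid \theta_k)\right]$$

where the allocations $z_i$ are unknown.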

Selection Criterion

How to select the number of segments? Information criteria such as BIC, AIC, CAIC, AIC3 or ICL can be used.

We adopt the Minimum Message Length criterion embedded in the model estimation (Figueiredo and Jain, 2002), which:
- Provides estimates of all the model parameters, including the number of segments
- Is less sensitive to initialization than EM
- Avoids the boundary of the parameter space

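Selection Criterion: MML

Shannon's information theory: optimally transmitting a random variable $Y$ with probability $f(y)$ requires about $-\log_2 f(y)$ bits of information.

To encode $y$:

$$\text{Length}(y \mid \theta) = -\log_2 f(y \mid \theta)$$

To encode $y$ and $\theta$, the total message length is:

$$\text{Length}(y, \theta) = \text{Length}(y \mid \theta) + \text{Length}(\theta)$$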
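A small numeric illustration of the two-part code may help here. This is only a sketch: the function names `code_length_bits` and `two_part_length` are hypothetical, and the cost of encoding the parameters is passed in as a given number of bits rather than derived from a prior.

```python
import math


def code_length_bits(prob: float) -> float:
    """Shannon code length, in bits, of an outcome with probability `prob`."""
    return -math.log2(prob)


def two_part_length(data_probs, param_bits: float) -> float:
    """Length(y, theta) = Length(y | theta) + Length(theta).

    `data_probs` holds f(y_i | theta) for independent observations;
    `param_bits` is an assumed, externally supplied cost of encoding theta.
    """
    return sum(code_length_bits(p) for p in data_probs) + param_bits


# An outcome with probability 1/8 costs 3 bits.
print(code_length_bits(0.125))
```

A model with more parameters usually shortens the data part but lengthens the parameter part, which is the trade-off the criterion exploits.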

Algorithm: EM-MML

EM is a popular algorithm for finding ML parameter estimates when unobserved (missing) data are considered in the model.

The EM-MML: a mixture of multinomials is adopted and the MML estimates are obtained via an EM-type algorithm.

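Algorithm: EM-MML

Categorical variables: $Y = (Y_1, \ldots, Y_i, \ldots, Y_n)$, with $Y_i = (Y_{i1}, \ldots, Y_{iD})$, where variable $d$ ($d = 1, \ldots, D$) has $C_d$ categories.

Parameters: $\theta = (\theta_1, \ldots, \theta_K, \alpha_1, \ldots, \alpha_K)$, where
- $\alpha_k$ are the cluster weights or mixing probabilities
- $\theta_k$ are the multinomial parameters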

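Algorithm: EM-MML

$$\log f(y \mid \theta) = \sum_{i} \log f(y_i \mid \theta)$$

Mixture of multinomials:

$$f(y_i \mid \theta) = \sum_{k=1}^{K} \alpha_k \prod_{d=1}^{D} n! \prod_{c=1}^{C_d} \frac{\theta_{kdc}^{\,y_{idc}}}{y_{idc}!}$$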
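The mixture-of-multinomials density can be evaluated in log space to avoid overflow in the factorials. The sketch below makes several assumptions that are not stated on the slide: counts for one region are stored as a list of per-variable count lists, the multinomial total for each variable is taken as the sum of its counts, all category probabilities are strictly positive, and the helper names are hypothetical.

```python
import math


def log_multinomial(counts, probs):
    """log of n! / prod(y_c!) * prod(theta_c ** y_c) for a single variable."""
    n = sum(counts)  # assumed multinomial total for this variable
    out = math.lgamma(n + 1)
    for y, p in zip(counts, probs):
        out -= math.lgamma(y + 1)
        if y > 0:
            out += y * math.log(p)
    return out


def log_component(y_i, theta_k):
    """Sum of the per-variable log densities over the D variables."""
    return sum(log_multinomial(c, p) for c, p in zip(y_i, theta_k))


def log_mixture(y_i, alphas, thetas):
    """log f(y_i | theta) via the log-sum-exp trick over active components."""
    terms = [math.log(a) + log_component(y_i, t)
             for a, t in zip(alphas, thetas) if a > 0]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

Summing `log_mixture` over all regions gives the log-likelihood $\log f(y \mid \theta)$ used in the criterion.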

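Algorithm: EM-MML

Assuming that:
- The segments have independent priors, independent from the mixing probabilities
- A noninformative Jeffreys prior is used for $\theta$

$$\text{Length}(y, \theta) = \text{Length}(y \mid \theta) + \text{Length}(\theta) = \frac{M}{2} \sum_{k:\,\alpha_k > 0} \log\left(\frac{n \alpha_k}{12}\right) + \frac{k_{nz}}{2} \log\left(\frac{n}{12}\right) + \frac{k_{nz}(M + 1)}{2} - \log f(y \mid \theta)$$

- $M$ is the number of parameters specifying each segment
- $k_{nz}$ is the number of segments with non-zero probability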
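The criterion on this slide, read in the form given by Figueiredo and Jain (2002), can be evaluated directly once the log-likelihood is known. The following is a sketch under that reading: `message_length` is a hypothetical name, `n` is taken to be the number of observations, and natural logarithms are assumed throughout.

```python
import math


def message_length(alphas, n, M, log_lik):
    """Message length with M parameters per segment.

    (M/2) * sum over active k of log(n * alpha_k / 12)
      + (k_nz / 2) * log(n / 12)
      + k_nz * (M + 1) / 2
      - log f(y | theta)
    Only components with alpha_k > 0 contribute.
    """
    active = [a for a in alphas if a > 0]
    k_nz = len(active)
    return ((M / 2) * sum(math.log(n * a / 12) for a in active)
            + (k_nz / 2) * math.log(n / 12)
            + k_nz * (M + 1) / 2
            - log_lik)
```

Because components with zero weight drop out of every term, removing a segment can shorten the message even when the log-likelihood decreases slightly.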

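Algorithm: EM-MML

E-step:

$$E\left[Z_{ik} \mid y_i; \theta^{(t)}\right] = P\left[Z_{ik} = 1 \mid y_i; \theta^{(t)}\right] = \frac{\alpha_k^{(t)} f\left(y_i; \theta_k^{(t)}\right)}{\sum_{j=1}^{K} \alpha_j^{(t)} f\left(y_i; \theta_j^{(t)}\right)}$$

where

$$f\left(y_i; \theta_k^{(t)}\right) = \prod_{d=1}^{D} n! \prod_{c=1}^{C_d} \frac{\left(\theta_{kdc}^{(t)}\right)^{y_{idc}}}{y_{idc}!}$$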
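The E-step amounts to normalizing the weighted component densities for each observation. A minimal sketch, assuming the component log densities $\log f(y_i; \theta_k)$ have already been computed and are passed in as a nested list `log_dens[i][k]` (a hypothetical layout), and that at least one component has positive weight:

```python
import math


def e_step(log_dens, alphas):
    """Return resp[i][k] = P[Z_ik = 1 | y_i; theta] for every observation i."""
    resp = []
    for row in log_dens:
        # Components with zero weight get log-weight -inf and responsibility 0.
        terms = [math.log(a) + ld if a > 0 else -math.inf
                 for a, ld in zip(alphas, row)]
        m = max(terms)
        weights = [math.exp(t - m) for t in terms]
        total = sum(weights)
        resp.append([w / total for w in weights])
    return resp
```

Working with log densities and subtracting the row maximum keeps the computation stable when the multinomial probabilities are very small.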

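Algorithm: EM-MML

M-step:

Update the estimates of the mixing probabilities:

$$\alpha_k^{(t+1)} = \frac{\max\left\{0,\ \sum_i P\left[Z_{ik} = 1 \mid y_i; \theta^{(t)}\right] - \frac{M}{2}\right\}}{\sum_{j=1}^{K} \max\left\{0,\ \sum_i P\left[Z_{ij} = 1 \mid y_i; \theta^{(t)}\right] - \frac{M}{2}\right\}}$$

Update the estimates of the multinomial parameters:

$$\theta_{kdc}^{(t+1)} = \frac{\sum_i P\left[Z_{ik} = 1 \mid y_i; \theta^{(t)}\right] y_{idc}}{n \sum_i P\left[Z_{ik} = 1 \mid y_i; \theta^{(t)}\right]}$$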
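The two M-step updates can be sketched as follows. The data layout (`resp[i][k]` for responsibilities, `counts[i][d][c]` for aggregated counts) and the function name `m_step` are assumptions for illustration. The multinomial update normalizes the weighted counts over categories, which is the standard weighted estimate and does not require equal totals across regions. The sketch also assumes at least one component keeps positive weight.

```python
def m_step(resp, counts, M):
    """One M-step: returns (alphas, thetas) with M parameters per segment."""
    n_obs, K = len(resp), len(resp[0])
    mass = [sum(resp[i][k] for i in range(n_obs)) for k in range(K)]
    # Components whose mass does not exceed M/2 are driven to weight zero.
    raw = [max(0.0, m - M / 2) for m in mass]
    total = sum(raw)
    alphas = [r / total for r in raw]

    thetas = []
    for k in range(K):
        if alphas[k] == 0:
            thetas.append(None)  # annihilated component, no parameters kept
            continue
        theta_k = []
        for d in range(len(counts[0])):
            num = [sum(resp[i][k] * counts[i][d][c] for i in range(n_obs))
                   for c in range(len(counts[0][d]))]
            s = sum(num)
            theta_k.append([x / s for x in num])
        thetas.append(theta_k)
    return alphas, thetas
```

The subtraction of `M / 2` is what lets the algorithm discard weakly supported segments during estimation instead of comparing separately fitted models.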

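Algorithm: EM-MML (flowchart)

1. Compute $P\left[Z_{ik} = 1 \mid Y_i, \theta^{(t)}\right]$.
2. Update $\alpha_k$.
3. If $\alpha_k = 0$: set $K := K - 1$. If $\alpha_k > 0$: compute $\theta_k$.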
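The branch that removes a segment when its weight reaches zero can be sketched as a small pruning helper. The name `prune` and the renormalization of the surviving weights are assumptions made for illustration, not details given on the slide.

```python
def prune(alphas, thetas):
    """Drop components with zero weight, so K decreases by one per removal."""
    kept = [(a, t) for a, t in zip(alphas, thetas) if a > 0]
    total = sum(a for a, _ in kept)
    new_alphas = [a / total for a, _ in kept]
    new_thetas = [t for _, t in kept]
    return new_alphas, new_thetas


# Three segments, the middle one annihilated: two remain.
alphas, thetas = prune([0.6, 0.0, 0.4], ["theta_1", "theta_2", "theta_3"])
print(len(alphas))
```

Calling such a helper after each weight update keeps the remaining computations restricted to the active segments.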

Results The clustering of Regions in the European Social Survey based on attitudes towards politics, using EM-MML, yields 2 clusters

Results: cohesion-separation, stability, computation time

                                   BIC; CAIC; ICL   AIC; AIC3   EM-MML
2012
  Number of clusters                      7              7          2
  Silhouette index                      0.213          0.191      0.361
  Calinski-Harabasz                    83.327         74.977    190.825
  Computation time (seconds)             109            109          2
2014
  Number of clusters                      7              8          2
  Silhouette index                      0.152          0.164      0.367
  Calinski-Harabasz                    80.766         78.477    189.552
  Computation time (seconds)              91             91          2
2012 vs 2014
  Adjusted Rand                         0.377          0.499      0.707
  Normalized mutual information         0.523          0.591      0.598
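The stability rows compare the partition of the same regions obtained in the two rounds. As an illustration of how such an agreement score is computed, here is a sketch of the adjusted Rand index from two label lists. It uses the standard pair-counting formula and toy labels, not the ESS data, and the function name `adjusted_rand` is hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items."""
    n = len(labels_a)
    sum_cells = sum(comb(v, 2) for v in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(v, 2) for v in Counter(labels_a).values())
    sum_b = sum(comb(v, 2) for v in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Identical partitions up to relabelling agree perfectly.
print(adjusted_rand([0, 0, 1, 1], [1, 1, 0, 0]))
```

The index is 1 for identical partitions and close to 0 for agreement at chance level, so higher values across rounds indicate a more stable solution.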

Results: round 6 vs round 7

[Figure: two bar charts of the number of regions per cluster, one for ESS6 (CLU 1, CLU 2) and one for ESS7 (CLU 1, CLU 2). Data labels shown: 147, 126, 114, 93.]

Regions in cluster 2 share a more active role in politics (Yes %)

[Figure: bar chart comparing ESS6 CLU 1 and ESS6 CLU 2 on: Contacted politician or government official last 12 months; Worked in political party or action group last 12 months; Worked in another organisation or association last 12 months; Worn or displayed campaign badge/sticker last 12 months; Signed petition last 12 months; Taken part in lawful public demonstration last 12 months; Boycotted certain products last 12 months; Feel closer to a particular party than all other parties; Voted last national election; Not eligible to vote.]

Regions in cluster 1 share a more passive role in politics (Yes %)

[Figure: bar chart comparing ESS7 CLU 1 and ESS7 CLU 2 on: Contacted politician or government official last 12 months; Worked in political party or action group last 12 months; Worked in another organisation or association last 12 months; Worn or displayed campaign badge/sticker last 12 months; Signed petition last 12 months; Taken part in lawful public demonstration last 12 months; Boycotted certain products last 12 months; Feel closer to a particular party than all other parties; Voted last national election; Not eligible to vote.]

Results

Regions in cluster 2 are clearly more interested in politics (as expected)

[Figure: two bar charts, one for ESS6 (CLU 1, CLU 2) and one for ESS7 (CLU 1, CLU 2), showing "How interested in politics" from "very interested" to "not at all interested".]

Results

Most respondents in cluster 1 do not trust politicians

[Figure: bar chart of "Trust in politicians", from "Not at all" to "Completely", for ESS6 CLU 1 and ESS6 CLU 2.]

Results

... or political parties

[Figure: bar chart of "Trust in political parties", from "Not at all" to "Completely", for ESS6 CLU 1 and ESS6 CLU 2.]

Regions in cluster 2 share a more positive view of other people, ESS6 and ESS7 results being very similar

[Figure: bar chart from "Mostly looking out for themselves" to "Most of the time people helpful", for ESS6 CLU 1 and ESS6 CLU 2.]

All regions in Sweden, Norway, Finland, Denmark and Germany belong to cluster 2

[Figure: bar chart by country, ESS6 CLU 2 vs ESS6 CLU 1, percentage axis from 0% to 18%. Countries: Slovenia, Sweden, Portugal, Poland, Norway, Netherlands, Lithuania, Israel, Ireland, Hungary, United Kingdom, France, Finland, Spain, Estonia, Denmark, Germany, Czech Republic, Switzerland, Belgium.]

All regions in Sweden, Norway, Finland, Denmark and Germany belong to cluster 2

- 25 regions change to cluster 2, e.g. Lisbon (in Portugal) and Jihočeský kraj (in Czech Republic)
- 4 regions change to cluster 1: Prov. West-Vlaanderen (in Belgium), Principado de Asturias and La Rioja (in Spain), and Drenthe (in Netherlands)

[Figure: bar chart by country, ESS7 CLU 2 vs ESS7 CLU 1, percentage axis from 0% to 18%. Countries: Slovenia, Sweden, Portugal, Poland, Norway, Netherlands, Lithuania, Israel, Ireland, Hungary, United Kingdom, France, Finland, Spain, Estonia, Denmark, Germany, Czech Republic, Switzerland, Belgium.]

Conclusions

- A new EM variant, the EM-MML, was used to cluster categorical aggregated data and estimate the number of clusters simultaneously. It estimates the parameters of a finite mixture of multinomials using a Minimum Message Length criterion.
- EM-MML shows better performance than traditional EM-ML combined with BIC, AIC and ICL: more parsimonious and robust solutions, and better cohesion-separation and stability.
- A brief profiling of the segments shows the main changes that occurred between rounds 6 and 7.

References

Biernacki, C., Celeux, G. and Govaert, G., 2000. Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22: 719-725.

Figueiredo, M. A. T. and Jain, A. K., 2002. Unsupervised learning of finite mixture models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24: 381-396.

Fonseca, J. R. and Cardoso, M. G., 2007. Mixture-model cluster analysis using information theoretical criteria. Intelligent Data Analysis, 11(2): 155-173.

Silvestre, C., Cardoso, M. and Figueiredo, M., 2008. Clustering with finite mixture models and categorical variables. Contributed Papers to the International Conference on Computational Statistics, Porto, Portugal, pp. 109-116.

Silvestre, C., Cardoso, M. and Figueiredo, M., 2012. Categorical data clustering using a Minimum Message Length criterion. IDA 2012 - The Eleventh International Symposium on Intelligent Data Analysis, Helsinki, Finland, 25-27 October 2012.

Silvestre, C., Cardoso, M. and Figueiredo, M., 2013. Determining the number of groups while clustering categorical data. IFCS 2013 - The International Federation of Classification Societies, Tilburg, the Netherlands, 14-17 July (Book of Abstracts, p. 158).