A Borda Count for Collective Sentiment Analysis

Similar documents
From Sentiment Analysis to Preference Aggregation

Voting with CP-nets using a Probabilistic Preference Structure

Social Algorithms. Umberto Grandi University of Toulouse. IRIT Seminar - 5 June 2015

Algorithmic Game Theory Introduction to Mechanism Design

Survey of Voting Procedures and Paradoxes

Possible and necessary winners in voting trees: majority graphs vs. profiles

Efficient Algorithms for Hard Problems on Structured Electorates

Recap Social Choice Fun Game Voting Paradoxes Properties. Social Choice. Lecture 11. Social Choice Lecture 11, Slide 1

Social Dichotomy Functions (extended abstract)

6.207/14.15: Networks Lecture 24: Decisions in Groups

APPLIED MECHANISM DESIGN FOR SOCIAL GOOD

The Axiomatic Method in Social Choice Theory:

Complexity of Shift Bribery in Hare, Coombs, Baldwin, and Nanson Elections

Possible Winners When New Alternatives Join: New Results Coming Up!

3.1 Arrow s Theorem. We study now the general case in which the society has to choose among a number of alternatives

CPS 173 Mechanism design. Vincent Conitzer

Voting. José M Vidal. September 29, Abstract. The problems with voting.

APPLIED MECHANISM DESIGN FOR SOCIAL GOOD

GROUP DECISION MAKING

Approval Voting for Committees: Threshold Approaches

Social Choice and Networks

A BORDA COUNT FOR PARTIALLY ORDERED BALLOTS JOHN CULLINAN, SAMUEL K. HSIAO, AND DAVID POLETT

Introduction to Formal Epistemology Lecture 5

13 Social choice B = 2 X X. is the collection of all binary relations on X. R = { X X : is complete and transitive}

Game Theory: Spring 2017

Logic and Social Choice Theory. A Survey. ILLC, University of Amsterdam. November 16, staff.science.uva.

CMU Social choice 2: Manipulation. Teacher: Ariel Procaccia

Arrow s Impossibility Theorem and Experimental Tests of Preference Aggregation

Participation Incentives in Randomized Social Choice

Partial lecture notes THE PROBABILITY OF VIOLATING ARROW S CONDITIONS

Gibbard Satterthwaite Games

SYSU Lectures on the Theory of Aggregation Lecture 2: Binary Aggregation with Integrity Constraints

MATH 19-02: HW 5 TUFTS UNIVERSITY DEPARTMENT OF MATHEMATICS SPRING 2018

Lecture 4. 1 Examples of Mechanism Design Problems

Lifting Integrity Constraints in Binary Aggregation

Information Retrieval

Game Theory: Spring 2017

Computational Aspects of Strategic Behaviour in Elections with Top-Truncated Ballots

Redistribution Mechanisms for Assignment of Heterogeneous Objects

On the Complexity of Borda Control in Single-Peaked Elections

How Credible is the Prediction of a Party-Based Election?

CMU Social choice: Advanced manipulation. Teachers: Avrim Blum Ariel Procaccia (this time)

Social choice theory, Arrow s impossibility theorem and majority judgment

An NTU Cooperative Game Theoretic View of Manipulating Elections

Voting on Multiattribute Domains with Cyclic Preferential Dependencies

Combining Voting Rules Together

Voting and Mechanism Design

A Framework for Aggregating Influenced CP-Nets and Its Resistance to Bribery

A Maximum Likelihood Approach towards Aggregating Partial Orders

The Pairwise-Comparison Method

Dominating Manipulations in Voting with Partial Information

Propositionwise Opinion Diffusion with Constraints

arxiv: v1 [cs.gt] 1 May 2017

Wreath Products in Algebraic Voting Theory

Social Choice. Jan-Michael van Linthoudt

APPROVAL VOTING, REPRESENTATION, & LIQUID DEMOCRACY

Judgment Aggregation under Issue Dependencies

Follow links for Class Use and other Permissions. For more information send to:

Geometry of distance-rationalization

Working Together: Committee Selection and the Supermodular Degree

Approximation algorithms and mechanism design for minimax approval voting

Economics Bulletin, 2012, Vol. 32 No. 1 pp Introduction. 2. The preliminaries

Mechanism Design for Multiagent Systems

Decision Making and Social Networks

Multiple Aspect Ranking Using the Good Grief Algorithm. Benjamin Snyder and Regina Barzilay MIT

Lecture: Aggregation of General Biased Signals

Lecture 10: Mechanism Design

Complexity of Shift Bribery in Iterative Elections

Improved Heuristic for Manipulation of Second-order Copeland Elections

Aggregating Incomplete Judgments: Axiomatisations for Scoring Rules

Binary Aggregation with Integrity Constraints

Multivariate Complexity of Swap Bribery

Spring 2016 Indiana Collegiate Mathematics Competition (ICMC) Exam

Mechanism Design for Resource Bounded Agents

Pairwise Liquid Democracy

Volume 31, Issue 1. Manipulation of the Borda rule by introduction of a similar candidate. Jérôme Serais CREM UMR 6211 University of Caen

Stable and Pareto optimal group activity selection from ordinal preferences

Exchange of Indivisible Objects with Asymmetry

DM-Group Meeting. Subhodip Biswas 10/16/2014

Multi-Issue Opinion Diffusion under Constraints

Justified Representation in Approval-Based Committee Voting

Game Theory. Lecture Notes By Y. Narahari. Department of Computer Science and Automation Indian Institute of Science Bangalore, India July 2012

Voting with Partial Information: What Questions to Ask?

Game Theory: Spring 2017

arxiv: v2 [math.co] 29 Mar 2012

Categorization ANLP Lecture 10 Text Categorization with Naive Bayes

ANLP Lecture 10 Text Categorization with Naive Bayes

An equity-efficiency trade-off in a geometric approach to committee selection Daniel Eckert and Christian Klamler

Graph Aggregation. Ulle Endriss and Umberto Grandi

Virtual Robust Implementation and Strategic Revealed Preference

Approximation Algorithms and Mechanism Design for Minimax Approval Voting 1

Roberts Theorem with Neutrality. A Social Welfare Ordering Approach

Mechanism Design for Bounded Agents

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 8. Chapter 8. Classification: Basic Concepts

The Computational Complexity of Bribery in a Network-Based Rating System

Political Economy of Institutions and Development. Lectures 2 and 3: Static Voting Models

Weighting Expert Opinions in Group Decision Making for the Influential Effects between Variables in a Bayesian Network Model

Algebraic Voting Theory

Motivation. Game Theory 24. Mechanism Design. Setting. Preference relations contain no information about by how much one candidate is preferred.

The problem of judgment aggregation in the framework of boolean-valued models

Transcription:

A Borda Count for Collective Sentiment Analysis Umberto Grandi Department of Mathematics University of Padova 10 February 2014 Joint work with Andrea Loreggia (Univ. Padova), Francesca Rossi (Univ. Padova) and Vijay Saraswat (IBM Yorktown)

What is the collective sentiment about...? wegov Closing the loop between governments & citizens www.wegov-project.eu

Aggregation of individual polarities (like/dislike) Collective sentiment 40% 60%

Multiple alternatives: SA is not enough! Sentiment analysis (polarity aggregation) works well on a single issue But what if we want to compare multiple alternatives? positive negative a > c > b > d >e > f! 20 people a b 10 people b a 5 people a b Sentiment analysis Majority aggregation b a Sentiment analysis predicts a different winner than majority aggregation!

Towards human-level decision making What does sentiment analysis have to do with MAS? A wide testing ground for many MAS techniques: same methods used for agents coordination and collective decision making among humans Computational Social Choice, Algorithmic Decision Theory... Our goal: Design agents that are able to interpret correctly the collective sentiment of a society or a multiagent system Example: automated trading agents Brandt, Conitzer, and Endriss. Computational social choice. In Multiagent Systems, 2013. Wellman, Greenwald, and Stone. Autonomous Bidding Agents: Strategies and Lessons from the Trading Agent Competition, 2007.

Outline 1. Basic definitions: Sentiment analysis and preference aggregation 2. Challenge I: Data extraction and representation pure sentiments (polarity) pure preference (preorder) sentiment and preferences (SP-structures) 3. Challenge II: Aggregation of individual opinions the Borda rule 4. Challenge III: Methods for validation Predictive accuracy: Truth tracking power Individuals satisfaction: Axiomatic properties 5. Future work: Influence detection and strategic behavior

Sentiment Analysis Ingredients: An entity x (no assumption about its structure) A corpus of individual expressions T by a set of individuals I A notion of polarity: {+,, N}, 5-star scale or graded sentiment Several NLP techniques used to extract the collective sentiment: entity extraction to find expressions mentioning x in T word-count, Naive Bayes, and other machine learning techniques to extract the polarity of a single expression in T The collective sentiment about an issue x is positive if the percentage of positive expressions about x outnumbers that of negative ones.

Voting Theory Ingredients: A set of alternatives X A set of agents I expressing preferences on X Voting rules are used to identify a set of most preferred candidates. Several rules are possible! We focus on two definitions of voting rules: Borda rule - linear orders: if a voter ranks candidate c at j-th position this gives j points to c. The alternatives with highest score are the winners. Approval voting - sets of candidates: the winners of approval voting are the candidates which receive the highest number of approvals.

Challenge I: What preferences/opinions can be extracted from the individuals text?

What kind of data structure can be extracted from text? Pang and Lee. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2008. Liu. Sentiment Analysis and Opinion Mining. Morgan & Claypool Publishers, 2012.

Individual Data Two forms of opinions can be extracted with existing NLP techniques: Objective opinions: Nikon is a good camera score of a single entity Comparative opinions: I prefer Canon to Nikon binary comparisons Let T i be the set of all expressions of individual s i about objects in X : Definition The individual data extracted from T i is a tuple (σ i, P i, N i): σ i : D i R to represent objective opinions on D i X subsets P i and N i of X preordered by P i and N i, representing positive and negative comparative opinions Example: a b 3-3 > c

b The sentiment analysis approach Definition The pure sentiment data associated with raw data (σ i, P i, N i) is a function S i : {D i P i N i} {+,, 0} defined as: sgn(σ i(c)) if c D i \ (P i N i) 0 if σ i(c) = 0 S i(c) = b 3 + if c P i Example: if c N i a -3 > c Example: a b c

Definition The preference aggregation approach b 3 Example: The pure preference data associated with raw data (σ i, P i, N i) is a preordered set (D i, D i ) where D i = D i P i N i and x D i a -3 > c x P i y and x, y P i or x N i y and x, yb N i or y x N i and y P i or σ i(x) σ i(y) and x, y D i Example: a c Example: a > b > > c

Our approach: SP-structures Definition An SP-structure over X is a tuple (P, N, Z) such that: P, N and Z form a partition of X P and N are ordered respectively by preorders P and N An SP-structure (P i, N i, Z i) can be extracted from raw data (σ i, P i, N i): P i is the union of P i and the set of entities with positive score Analogously for N i. Z i is the set of entities with zero or no score Preordered relations extracted from σ i and copied from P i and N i SP-structures combine (interpersonally non-comparable) scores with (incomplete) pairwise comparisons between entities

Example Agent 1 Agent 2 Agent 3 a a b c b b a c c P Z N Table: SP-structures extracted from the previous example.

Challenge II: How to aggregate individual information into a collective opinion?

Aggregating SP-structures Definition The Borda score of entity c X in SP-structure (P, N, Z) is defined as: 2 down P (c) + inc P (c) + Z + 2 if c P i s (c) = 2 up N (c) inc N (c) Z 2 if c N i 0 if c P i N i Given a profile S of SP-structures, the most popular candidates are the ones maximising the sum of the individual Borda score: B (S) = argmax s i (c) c X i I

Example of using Borda Agent 1 Agent 2 Agent 3 a b c b a a c P b, c Z Table: SP-structures extracted from the previous example. N The score S (a) = 6 2 + 4, S (b) = 4 + 2 + 0, S (c) = 2 4 + 0 The most preferred candidate under the Borda rule is a.

Challenge III: How should similar methods be validated?

Two avenues for validation

Desirable properties of collective sentiment definitions An important property when dealing with big data: Divide and conquer approaches are possible: Consistency: For all profiles S 1 and S 2, if F (S 1) F (S 2) then F (S 1 + S 2) = F (S 1) F (S 2). If we rename the items the result should be the renaming of the initial result: Neutrality: For any profile S and permutation ρ of entities in X, we have that F (ρ(s)) = ρ(f (S)). Considering one more individual in the computation of the collective sentiment results in a candidate that is higher in her rank (non-classical interpretation): Voters Participation: For all profiles S = (S 1,..., S n) and SP-structures S n+1, we have that F (S + S n+1) is ranked above (or equal) F (S) by voter n + 1. Rank-participation, cancellation, weak Pareto, anonymity...

Desirable properties II More information on an individual s opinion will lead to results that the individual ranks higher: Rank Participation: For all profiles S = (S 1,..., S n), individual i and SP-structure S S i we have that F (S) is ranked above (or equal) F (S i + S ) by agent i. In symmetric cases, the disagreement is so extreme that it makes sense to declare all items as the most preferred ones in the collective opinion. Cancellation. If a profile S is symmetric then all entities are in the winning set, i.e., F (S) = X. Agreement among all individuals should be reflected in the collective opinion. Weak-Pareto: If S is a profile in which all individuals rank a above b then b F (S).

Theoretical results The rule we propose satisfies a number of axiomatic properties: Theorem The Borda rule satisfies consistency, anonymity, neutrality, voters and rank participation as well as cancellation. It is also consistent with sentiment analysis and preference aggregation: Theorem If a profile S is purely preferential, then B (S) = Borda(S). If a profile S is purely sentimental, then B (S) = Approval(S).

% of profiles Experimental results I Our analysis is based on the assumption that sentiment analysis and preference aggregation yields different results. How often does this happen? 34% 32% 30% 28% 26% 24% 22% 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 Number of voters

% of error Experimental results II Incompleteness is perhaps the most important problem of text-extracted information. How does Borda behave in incomplete domains? 60% 50% 40% 30% 20% 10% 0% 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% % of completeness Mean error Borda* Random error

Future work: detecting influence and strategic behavior Classical studied in Computational Social Choice: Strategy-proofness (aka resistance to individual manipulation): not interesting, absence of incentives for individual deviations... Cloning and bribery are more likely to occur: an applicative setting!

From Sentiment Analysis to Preference Aggregation 1. What preferences/opinions can be extracted from the individuals text? Our proposal: sentiment score and pairwise comparison (raw data) 2. How to best represent (compactly) individual preferences and sentiments? Our proposal: SP-structures based on preorders 3. How to aggregate the individual information into a collective opinion? Our proposal: generalise Borda and Approval with the Borda rule 4. Is it possible to identify influencers and prevent strategic behaviour? Example: creation of fake accounts (cloning)... 5. How should preference aggregation methods be validated? Predictive accuracy and Social Choice Theory 6. How to deal with big data in sentiment and preference analysis? Umberto Grandi, Andrea Loreggia, Francesca Rossi, Vijay Saraswat. From Sentiment Analysis to Preference Aggregation. Proceedings of ISAIM-2014, 2014.