Modeling Controversy Within Populations


1 Modeling Controversy Within Populations Myungha Jang, Shiri Dori-Hacohen and James Allan Center for Intelligent Information Retrieval (CIIR) University of Massachusetts Amherst {mhjang, shiri, ICTIR 2017, Amsterdam, Netherlands

2 Social Media and Controversy Social media are increasingly the place where controversial arguments are being held. Technological tools have become critical in shaping these discussions. However, serious gaps remain in our theoretical and practical understanding of how to define controversy, and how it manifests and evolves. 2 / 20 mhjang@cs.umass.edu

3 Definition of Controversy What makes controversy? Is the topic important? Is there enough contention?

4 Related Work
- Quantifying controversy in social media using a community graph structure [Garimella et al. WSDM 16]
- Detecting controversy in Web pages using the level of edit wars in related Wikipedia articles [Dori-Hacohen et al. ECIR 13]
- Detecting controversy in Web pages using language models [Jang et al. CIKM 16]
- Detecting controversies in online news media using a debate-based approach [Beelen et al. SIGIR 17]

5 Controversy as Single Absolute Value? While exploiting different features, previous works share the same underlying assumption that controversy is a single absolute value for the topic.

6 Definition of Controversy
Is controversy really an absolute, single value for an amorphous global population?
- There is a disparity between scientific understanding and public opinion on certain controversial issues (a general population vs. scientists).
- "Should we reduce the page limit of full papers to 8 pages at SIGIR?" (a general population vs. the IR community)

9 Our Contributions
- We propose a theoretical model that defines controversy in terms of population. The right question to ask is not "Is climate change controversial?" but "Is climate change controversial to {a particular group}?"
- We model controversy as multidimensional, with at least two dimensions: contention and importance. Some topics may be less controversial because they are contentious but not important (or vice versa).
- Our model has explanatory power that can be used to understand a variety of observed phenomena.
- To ground our theoretical model, we examine diverse datasets from polling data, Twitter, and Wikipedia.

10 Modeling Population-based Controversy
We model $P(\text{Controversy} \mid T, \Omega)$, where $T$ is a topic and $\Omega$ is a population.
$$P(\text{Controversy} \mid T, \Omega) = P(\text{Contention}, \text{Importance} \mid T, \Omega)$$
Assuming that contention and importance are independent of each other:
$$P(\text{Controversy} \mid T, \Omega) = P(\text{Contention} \mid T, \Omega) \cdot P(\text{Importance} \mid T, \Omega)$$

12 Modeling Contention (1/4)
Contention is a measure that quantifies the proportion of people in disagreement within a population.
Let $\Omega = \{p_1, p_2, \ldots, p_N\}$ be a population consisting of $N$ people. $P(c \mid \Omega, T)$ is the probability that $T$ is contentious within $\Omega$, and is modeled by:
$$P(c \mid \Omega, T) = P\bigl(p_1, p_2 \text{ selected randomly from } \Omega,\ \exists\, s_i, s_j \in S \text{ s.t. } \text{holds}(p_1, s_i, T) \wedge \text{holds}(p_2, s_j, T) \wedge \text{conflicts}(s_i, s_j)\bigr)$$
where
- $S = \{s_1, s_2, \ldots, s_k\}$ is a set of stances with regard to $T$;
- $\text{holds}(p, s, T)$ means that person $p$ holds stance $s$ with regard to $T$;
- $s_0$ is a lack of stance: "I don't have a stance."
Let $P(\text{conflict} \mid s_i, s_j)$ be the probability that the two groups holding $s_i$ and $s_j$ are in complete conflict.

13 Modeling Contention (2/4)
Mutually Exclusive Stances
Most controversial topics have at least two mutually exclusive stances that bisect the community.
$$\Omega = \bigcup_{i \in \{0, \ldots, k\}} G_i$$
where $G_i$ is the group that consists of the people who hold stance $s_i$.
For simplicity, we estimate the probability of selecting $p_1$ and $p_2$ as selection with replacement. The probability of choosing any particular pair is $1 / |\Omega|^2$.

15 Modeling Contention (3/4)
Mutually Exclusive Stances
- One person doesn't have a stance and the other has a stance
- Two people have the same stance
- Two people have conflicting stances
[Equations: the per-case probabilities and the final expression for contention did not survive transcription.]
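The three cases above can be turned into a small calculation. The following is a minimal sketch, not the authors' implementation: it assumes that two people conflict with probability 1 when they hold different stances and with probability 0 when they share a stance or when either has no stance, and that the pair is drawn with replacement. The function name `contention` and its parameters are hypothetical.

```python
from itertools import combinations


def contention(stance_group_sizes, no_stance_size=0):
    """Probability that two people drawn with replacement hold conflicting stances.

    Assumption (not stated on the slide): distinct stances always conflict,
    and pairs involving the no-stance group never do.
    """
    population = sum(stance_group_sizes) + no_stance_size
    if population == 0:
        raise ValueError("population must not be empty")
    # Each unordered pair of distinct stance groups accounts for two ordered draws.
    conflicting_draws = sum(2 * a * b for a, b in combinations(stance_group_sizes, 2))
    return conflicting_draws / population ** 2


if __name__ == "__main__":
    print(contention([50, 50]))                      # evenly split population
    print(contention([90, 10]))                      # lopsided population
    print(contention([50, 50], no_stance_size=100))  # half the population has no stance
```

Under these assumptions an even two-way split gives the highest value among the three calls, and adding people without a stance lowers it.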

16 Modeling Contention (4/4)
Mutually Exclusive Stances
[Figure: contention plotted against the size of $G_1$.]
Contention is maximized when all groups are the same size. The bigger $G_0$ is, the smaller the maximal contention becomes in the given population.
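Both statements on this slide can be checked with a short derivation. This is a sketch under assumptions that the slide does not spell out: there are $k$ stance groups $G_1, \ldots, G_k$ plus a no-stance group $G_0$, pairs are drawn with replacement, only pairs holding different stances conflict (with probability 1), and group sizes are treated as real numbers.

```latex
\begin{align*}
P(c \mid \Omega, T)
  &= \frac{1}{|\Omega|^2} \sum_{\substack{i, j \ge 1 \\ i \ne j}} |G_i|\,|G_j|
   = \frac{1}{|\Omega|^2} \Bigl[ \bigl(|\Omega| - |G_0|\bigr)^2 - \sum_{i=1}^{k} |G_i|^2 \Bigr] \\
\max_{|G_1|, \ldots, |G_k|} P(c \mid \Omega, T)
  &= \Bigl(1 - \frac{|G_0|}{|\Omega|}\Bigr)^2 \Bigl(1 - \frac{1}{k}\Bigr)
  \quad \text{attained at } |G_i| = \frac{|\Omega| - |G_0|}{k}
\end{align*}
```

For a fixed $|G_0|$, the sum of squares is smallest when the stance groups are equal in size, which gives the maximum. The maximum shrinks as $|G_0|$ grows. With $k = 2$ and $|G_0| = 0$ it equals $1/2$.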

18 Modeling Importance
Importance is the level of impact that the issue has within the population: how many people are affected by the topic?
Let $\text{affected}(p, T)$ be a binary function that denotes whether person $p$ is affected by $T$.
$$P(I \mid \Omega, T) = P(p \text{ selected randomly from } \Omega \text{ is affected by } T) = \frac{|\Omega_T|}{|\Omega|}$$
where $\Omega_T$ is the subpopulation that is affected by $T$.
The function is purposely defined vaguely so that it can be instantiated differently depending on the dataset:
- In Twitter, an affected user posts a tweet about the event.
- In Wikipedia, an affected editor makes changes to the article.
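A minimal sketch of how importance could be instantiated and combined with contention under the independence assumption. The names `importance` and `controversy`, the toy user IDs, and the "posted with a topic hashtag" predicate are hypothetical illustrations, not taken from the paper.

```python
def importance(population, affected):
    """Fraction of the population for which the predicate `affected` holds."""
    members = list(population)
    if not members:
        raise ValueError("population must not be empty")
    return sum(1 for person in members if affected(person)) / len(members)


def controversy(contention_value, importance_value):
    """Product of contention and importance, assuming the two are independent."""
    return contention_value * importance_value


if __name__ == "__main__":
    users = range(10)       # toy population of ten user IDs
    posted = {0, 1, 2}      # hypothetical users who posted with a topic hashtag
    imp = importance(users, lambda user: user in posted)
    print(imp)                    # 3 of 10 users are affected
    print(controversy(0.5, imp))  # contention value of 0.5 chosen for illustration
```

Passing the predicate as a function mirrors the slide's point that `affected` is left open so that each dataset can define it differently.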

19 Model Validation: Contention in Polling, US Adults vs Scientists
Data: Pew Research Center, Gallup (2015).
[Figure: contention in the scientific community vs. the general population for several controversial topics.]

20 Model Validation: Contention in Polling, US Adults vs Scientists
Issues whose contention levels align well between the scientists and the U.S. adults group:
- Opinion on the increased use of fracking {favor / against}
- The space station has been a {good / bad} investment for the country
- Do you think it is generally {safe / unsafe} to eat foods grown with pesticides?

21 Model Validation: Contention in Polling, US Adults vs Scientists
Issues whose contention differs between the scientists and the U.S. adults group:
- Do you support using animals in research? {support, against}
- What is the reason for climate change? {Earth is warming mostly due to human activity vs. Earth is warming mostly due to natural patterns}
- Has evolution really existed? {Yes, No}

22 Model Validation: Contention in Polling, US Adults vs Scientists
Issues whose contention differs between the scientists and the U.S. adults group:
- Most scientists believe that using animals in research is necessary, whereas it is more controversial among U.S. adults.
- Most scientists believe that climate change is caused by human activity, whereas it is controversial among U.S. adults, with more people believing that it is a natural pattern.
- Most scientists believe that evolution is real, whereas some people don't believe it.

23 Model Validation: Contention in Polling, Population by State
Answers were collected by isidewith.com, an online opinion-gathering site. The population of each state provides a different $\Omega$. The green states (Wyoming, Alabama, Mississippi, etc.) are strongly opposed to gun control.
[Figure: per-state contention for "Do you support increased gun control?"]
More demo:

25 Model Validation: Controversy in Twitter
Given a topic T, we measure controversy in the Twitter population. We assume that a user u is affected by T if u posts anything with a topic hashtag.
- G1: supports Hillary
- G2: supports Trump
- G0: a lack of stance (or neutral)
We first identify the subpopulation for which T is important, and classify its members into stance groups (G1, G2) and G0. Note that here, G0 is not simply the group lacking a stance on T, but the group that is affected by T yet does not hold a particular stance. The details, such as (1) finding relevant hashtags for T, (2) identifying the subpopulation of interest, and (3) finding stance groups, are in the paper.

26 Model Validation: Controversy in Twitter [Figures: plots annotated with the events "Buzzfeed article" and "Voting day".]

30 Model Validation: Controversy in Twitter According to our model, the U.S. election was far more controversial than Brexit and The Dress on Twitter.

31 Model Validation: Controversy in Twitter According to our model, The Dress was more contentious for one day, but Brexit was more important.

32 Conclusions and Future Work
In contrast to past work, we propose:
- a theoretical model for controversy with respect to population;
- a model that treats controversy as multidimensional, with two minimal dimensions: contention and importance.
We validate our model using poll and Twitter data by showing that it can explain the observed phenomena.
Future work includes:
- using the model to detect newly emerged controversy, not just validating it;
- adding more dimensions of controversy to the model.

33 Thank you! And thanks to ACM SIGIR for the travel grant! Our demo and dataset for download:



More information

Geological Foundations of Environmental Sciences

Geological Foundations of Environmental Sciences Geological Foundations of Environmental Sciences David C. Elbert Office: Olin Hall 228 Department of Earth and Planetary Sciences Johns Hopkins University 3400 N. Charles St. Baltimore, MD 21218 Phone:

More information

The David P. Weikart Center for Youth Program Quality, Bringing together over fifty years of experience and the latest research,

The David P. Weikart Center for Youth Program Quality, Bringing together over fifty years of experience and the latest research, The David P. Weikart Center for Youth Program Quality,! " " " # $ $ " $ " % " & & " & " ' ( ) * +!!,! % " & ' )! " " "! -!. " & &! % " & &! ' Bringing together over fifty years of experience and the latest

More information

Statistical Data Analysis

Statistical Data Analysis DS-GA 0 Lecture notes 8 Fall 016 1 Descriptive statistics Statistical Data Analysis In this section we consider the problem of analyzing a set of data. We describe several techniques for visualizing the

More information

Fall CS646: Information Retrieval. Lecture 6 Boolean Search and Vector Space Model. Jiepu Jiang University of Massachusetts Amherst 2016/09/26

Fall CS646: Information Retrieval. Lecture 6 Boolean Search and Vector Space Model. Jiepu Jiang University of Massachusetts Amherst 2016/09/26 Fall 2016 CS646: Information Retrieval Lecture 6 Boolean Search and Vector Space Model Jiepu Jiang University of Massachusetts Amherst 2016/09/26 Outline Today Boolean Retrieval Vector Space Model Latent

More information

Indicative conditionals

Indicative conditionals Indicative conditionals PHIL 43916 November 14, 2012 1. Three types of conditionals... 1 2. Material conditionals... 1 3. Indicatives and possible worlds... 4 4. Conditionals and adverbs of quantification...

More information

Bay Area Scientists in Schools Presentation Plan

Bay Area Scientists in Schools Presentation Plan Bay Area Scientists in Schools Presentation Plan Lesson Name Presenter(s) The Water Cycle UC Berkeley PhD students Grade Level 1 Standards Connection(s) Earth Sciences, physics sciences CA Science Content

More information

Institute for Teaching through Technology and Innovative Practices Grade Five. 1 hour and 30 minutes

Institute for Teaching through Technology and Innovative Practices Grade Five. 1 hour and 30 minutes Tale of a Tadpole Lesson Summary Students will use the book Tale of a Tadpole by Karen Wallace to learn about fractions and decimals. Major Topic and SOL Math SOL (2009) 5.2 Science SOL (2009) 4.5.e Reading

More information

Modeling Orbital Debris Problems

Modeling Orbital Debris Problems Modeling Orbital Debris Problems NAME Space Debris: Is It Really That Bad? One problem with which NASA and space scientists from other countries must deal is the accumulation of space debris in orbit around

More information

Test 3 SOLUTIONS. x P(x) xp(x)

Test 3 SOLUTIONS. x P(x) xp(x) 16 1. A couple of weeks ago in class, each of you took three quizzes where you randomly guessed the answers to each question. There were eight questions on each quiz, and four possible answers to each

More information

AP Human Geography Syllabus

AP Human Geography Syllabus AP Human Geography Syllabus Textbook The Cultural Landscape: An Introduction to Human Geography. Rubenstein, James M. 10 th Edition. Upper Saddle River, N.J.: Prentice Hall 2010 Course Objectives This

More information

CORRELATION & REGRESSION

CORRELATION & REGRESSION CORRELATION & REGRESSION Correlation The relationship between variables E.g., achievement in college is related to? Motivation Openness to new experience Conscientiousness IQ Etc Relationships can be causal

More information

Polarization and Bipolar Probabilistic Argumentation Frameworks

Polarization and Bipolar Probabilistic Argumentation Frameworks Polarization and Bipolar Probabilistic Argumentation Frameworks Carlo Proietti Lund University Abstract. Discussion among individuals about a given issue often induces polarization and bipolarization effects,

More information

14.75: Leaders and Democratic Institutions

14.75: Leaders and Democratic Institutions 14.75: Leaders and Democratic Institutions Ben Olken Olken () Leaders 1 / 23 Do Leaders Matter? One view about leaders: The historians, from an old habit of acknowledging divine intervention in human affairs,

More information

Crowdsourcing Semantics for Big Data in Geoscience Applications

Crowdsourcing Semantics for Big Data in Geoscience Applications Wright State University CORE Scholar Computer Science and Engineering Faculty Publications Computer Science and Engineering 2013 Crowdsourcing Semantics for Big Data in Geoscience Applications Thomas Narock

More information

Sampling Distribution Models. Central Limit Theorem

Sampling Distribution Models. Central Limit Theorem Sampling Distribution Models Central Limit Theorem Thought Questions 1. 40% of large population disagree with new law. In parts a and b, think about role of sample size. a. If randomly sample 10 people,

More information

Classroom Activities/Lesson Plan

Classroom Activities/Lesson Plan Grade Band: Middle School Unit 18 Unit Target: Earth and Space Science Unit Topic: This Is the Solar System Lesson 3 Instructional Targets Reading Standards for Informational Text Range and Level of Text

More information

course overview 18.06: Linear Algebra

course overview 18.06: Linear Algebra course overview 18.06: Linear Algebra Prof. Steven G. Johnson, MIT Applied Math Fall 2017 http://web.mit.edu/18.06 Textbook: Strang, Introduction to Linear Algebra, 5 th edition + supplementary notes Help

More information

18.600: Lecture 4 Axioms of probability and inclusion-exclusion

18.600: Lecture 4 Axioms of probability and inclusion-exclusion 18.600: Lecture 4 Axioms of probability and inclusion-exclusion Scott Sheffield MIT Outline Axioms of probability Consequences of axioms Inclusion exclusion Outline Axioms of probability Consequences of

More information

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc. Notes on regression analysis 1. Basics in regression analysis key concepts (actual implementation is more complicated) A. Collect data B. Plot data on graph, draw a line through the middle of the scatter

More information

Quantifiers For and There exists D

Quantifiers For and There exists D Quantifiers For all @ and There exists D A quantifier is a phrase that tells you how many objects you re talking about. Good for pinning down conditional statements. For every real number x P R, wehavex

More information

Your web browser (Safari 7) is out of date. For more security, comfort and. the best experience on this site: Update your browser Ignore

Your web browser (Safari 7) is out of date. For more security, comfort and. the best experience on this site: Update your browser Ignore Your web browser (Safari 7) is out of date. For more security, comfort and Activitydevelop the best experience on this site: Update your browser Ignore Extracting Gas from Shale How is natural gas extracted

More information

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation AP Statistics Chapter 6 Scatterplots, Association, and Correlation Objectives: Scatterplots Association Outliers Response Variable Explanatory Variable Correlation Correlation Coefficient Lurking Variables

More information

Gov 2000: 6. Hypothesis Testing

Gov 2000: 6. Hypothesis Testing Gov 2000: 6. Hypothesis Testing Matthew Blackwell October 11, 2016 1 / 55 1. Hypothesis Testing Examples 2. Hypothesis Test Nomenclature 3. Conducting Hypothesis Tests 4. p-values 5. Power Analyses 6.

More information

Colorado Academic Standards for High School Science Earth Systems Science

Colorado Academic Standards for High School Science Earth Systems Science A Correlation of Pearson 12 th Edition 2015 Colorado Academic Standards Introduction This document demonstrates the alignment between, 12 th Edition, 2015, and the, Earth Systems Science. Correlation page

More information

*Karle Laska s Sections: There is no class tomorrow and Friday! Have a good weekend! Scores will be posted in Compass early Friday morning

*Karle Laska s Sections: There is no class tomorrow and Friday! Have a good weekend! Scores will be posted in Compass early Friday morning STATISTICS 100 EXAM 3 Spring 2016 PRINT NAME (Last name) (First name) *NETID CIRCLE SECTION: Laska MWF L1 Laska Tues/Thurs L2 Robin Tu Write answers in appropriate blanks. When no blanks are provided CIRCLE

More information

GRADE 8 LEAP SOCIAL STUDIES ASSESSMENT STRUCTURE. Grade 8 Social Studies Assessment Structure

GRADE 8 LEAP SOCIAL STUDIES ASSESSMENT STRUCTURE. Grade 8 Social Studies Assessment Structure Grade 8 Social Studies Assessment Structure 1 In 2013-2014, the grade 8 LEAP test continues to assess Louisiana s social studies benchmarks. The design of the multiple-choice sessions of the test remains

More information

Approximate Inference

Approximate Inference Approximate Inference Simulation has a name: sampling Sampling is a hot topic in machine learning, and it s really simple Basic idea: Draw N samples from a sampling distribution S Compute an approximate

More information

Chapter IR:VIII. VIII. Evaluation. Laboratory Experiments Performance Measures Training and Testing Logging

Chapter IR:VIII. VIII. Evaluation. Laboratory Experiments Performance Measures Training and Testing Logging Chapter IR:VIII VIII. Evaluation Laboratory Experiments Performance Measures Logging IR:VIII-62 Evaluation HAGEN/POTTHAST/STEIN 2018 Statistical Hypothesis Testing Claim: System 1 is better than System

More information

CS 350 Algorithms and Complexity

CS 350 Algorithms and Complexity CS 350 Algorithms and Complexity Winter 2019 Lecture 15: Limitations of Algorithmic Power Introduction to complexity theory Andrew P. Black Department of Computer Science Portland State University Lower

More information

Lecture 3: Miscellaneous Techniques

Lecture 3: Miscellaneous Techniques Lecture 3: Miscellaneous Techniques Rajat Mittal IIT Kanpur In this document, we will take a look at few diverse techniques used in combinatorics, exemplifying the fact that combinatorics is a collection

More information

CS 350 Algorithms and Complexity

CS 350 Algorithms and Complexity 1 CS 350 Algorithms and Complexity Fall 2015 Lecture 15: Limitations of Algorithmic Power Introduction to complexity theory Andrew P. Black Department of Computer Science Portland State University Lower

More information

Bay Area Scientists in Schools Presentation Plan

Bay Area Scientists in Schools Presentation Plan Bay Area Scientists in Schools Presentation Plan Lesson Name: We Love Gravity! Presenter(s) Virginia Lehr, Laura Hidrobo Grade Level 5 Standards Connection(s) Solar System and Gravity Teaser: Gravity is

More information

3.2 Probability Rules

3.2 Probability Rules 3.2 Probability Rules The idea of probability rests on the fact that chance behavior is predictable in the long run. In the last section, we used simulation to imitate chance behavior. Do we always need

More information

Web Structure Mining Nodes, Links and Influence

Web Structure Mining Nodes, Links and Influence Web Structure Mining Nodes, Links and Influence 1 Outline 1. Importance of nodes 1. Centrality 2. Prestige 3. Page Rank 4. Hubs and Authority 5. Metrics comparison 2. Link analysis 3. Influence model 1.

More information

fakultät für informatik informatik 12 technische universität dortmund Petri nets Peter Marwedel Informatik 12 TU Dortmund Germany

fakultät für informatik informatik 12 technische universität dortmund Petri nets Peter Marwedel Informatik 12 TU Dortmund Germany 12 Petri nets Peter Marwedel Informatik 12 TU Dortmund Germany Introduction Introduced in 1962 by Carl Adam Petri in his PhD thesis. Focus on modeling causal dependencies; no global synchronization assumed

More information

MAT2345 Discrete Math

MAT2345 Discrete Math Fall 2013 General Syllabus Schedule (note exam dates) Homework, Worksheets, Quizzes, and possibly Programs & Reports Academic Integrity Do Your Own Work Course Web Site: www.eiu.edu/~mathcs Course Overview

More information

Scoring (Vector Space Model) CE-324: Modern Information Retrieval Sharif University of Technology

Scoring (Vector Space Model) CE-324: Modern Information Retrieval Sharif University of Technology Scoring (Vector Space Model) CE-324: Modern Information Retrieval Sharif University of Technology M. Soleymani Fall 2017 Most slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276, Stanford)

More information

CS 188: Artificial Intelligence Fall Recap: Inference Example

CS 188: Artificial Intelligence Fall Recap: Inference Example CS 188: Artificial Intelligence Fall 2007 Lecture 19: Decision Diagrams 11/01/2007 Dan Klein UC Berkeley Recap: Inference Example Find P( F=bad) Restrict all factors P() P(F=bad ) P() 0.7 0.3 eather 0.7

More information

Modeling Social Media Memes as a Contagious Process

Modeling Social Media Memes as a Contagious Process Modeling Social Media Memes as a Contagious Process S.Towers 1,, A.Person 2, C.Castillo-Chavez 1 1 Arizona State University, Tempe, AZ, USA 2 Some University, Nowhereville, NE, USA E-mail: smtowers@asu.edu

More information