A Bivariate Point Process Model with Application to Social Media User Content Generation
Emma Jingfei Zhang (ezhang@bus.miami.edu)
Yongtao Guan (yguan@bus.miami.edu)
Department of Management Science, The Miami Business School, University of Miami
Data Description: Sina Weibo Data
- Source: Sina Weibo, the largest Twitter-type online social media platform in China.
- The dataset contains posts from 5,913 followers of the official Beijing University Guanghua MBA Weibo account.
- For each user, all posts published from Jan 1st to Jan 30th, 2014 were collected, including the time stamp of each post.
- Each post is either a post with original content or a repost.
Data Description: Trump's Twitter Data
- Source: Twitter data collected from Donald Trump (@realdonaldtrump) from Jan 2013 to Apr […].
- The Twitter archive of Donald Trump can be downloaded from […].
- Twitter shows the device used for each tweet; devices include Android, Web Client, iPhone, and others.
- We consider the tweets posted from an Android device before the election and from an iPhone after the election.
- This results in a total of 17,518 tweets; the average number of tweets per month is 278.
- Each tweet is either an original tweet or a retweet.
Data Description: Sina Weibo Data
Figure: The posting times of three users (Users 1, 2 and 3; horizontal axis: date, 01/01 to 01/30).
Data Description: Sina Weibo Data
Figure: Average empirical pair correlation function (horizontal axis: hour).
Observations from Data
- A user's posting activity may alternate between active and inactive states.
- During an active state, the user may publish one or more posts, often separated by short inter-post times.
- During an inactive state, no post is produced until the next active state starts.
- There may be daily patterns in posting times.
- The data form a bivariate point process (i.e., posts and reposts).
Graphical Illustration: Univariate Process
- Episodes: clusters of posting times.
- Adjacent episodes do not overlap and are separated by the inactive period between them.
Graphical Illustration: Bivariate Process
Figure (diagram labels): post segment, repost segment, post segment; episode, inactive, episode.
- Each episode contains subepisodes of posts and reposts.
- Posts (reposts) tend to be followed by posts (reposts).
- Reposts may be more clustered than posts.
- The number of reposts may be related to the number of followees.
Clustered Point Process
Goal: Model the clustered posting times in social media data (without distinguishing between posts and reposts for now).
Existing methods:
- Hawkes process
- Neyman-Scott process
- Bartlett-Lewis process
- Interrupted Poisson process
We propose a new class of clustered temporal point processes that is easy to interpret and generalizes easily to the bivariate case.
Model Formulation
- For each episode, the parent event generates a Poisson number of offspring events with mean $\mu$.
- The gap between each offspring event and the previous event in the same cluster follows an exponential distribution with rate $\rho$.
- Once all the events in an episode have been observed, the parent event of the following episode is generated according to a hazard function $\lambda(t; \beta)$.
Model Formulation
Motivated by the daily cyclic pattern in the average pair correlation function, we may assume that
$$\lambda(t; \beta) = \exp\left\{ \beta_0 + \sum_{j=1}^{p} \left[ \beta_{j1} \cos(\omega_j t) + \beta_{j2} \sin(\omega_j t) \right] \right\},$$
where $\omega_j = 2j\pi$ and $\beta = \{\beta_0, \beta_{j1}, \beta_{j2} : j = 1, \ldots, p\}$. Nonparametric models can also be used.
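The univariate model with this cyclic hazard can be simulated with a short script. The following is a minimal sketch, not the authors' code: it assumes time is measured in days (so each harmonic has a daily period), it draws the next parent time by thinning (a standard sampler chosen here for illustration, not named in the slides), and it simply discards events beyond $T$. The function names `hazard`, `hazard_bound`, `next_parent` and `simulate` are hypothetical.

```python
import numpy as np


def hazard(t, beta0, beta_cs):
    """Cyclic hazard exp(beta0 + sum_j [b_j1 cos(2 j pi t) + b_j2 sin(2 j pi t)])."""
    t = np.asarray(t, dtype=float)
    log_lam = np.full(t.shape, float(beta0))
    for j, (b_cos, b_sin) in enumerate(beta_cs, start=1):
        omega = 2.0 * j * np.pi
        log_lam = log_lam + b_cos * np.cos(omega * t) + b_sin * np.sin(omega * t)
    return np.exp(log_lam)


def hazard_bound(beta0, beta_cs):
    """Upper bound on the hazard: harmonic j is at most sqrt(b_j1^2 + b_j2^2)."""
    return float(np.exp(beta0 + sum(np.hypot(b1, b2) for b1, b2 in beta_cs)))


def next_parent(start, beta0, beta_cs, rng):
    """First event after `start` with hazard lambda(t), drawn by thinning."""
    lam_max = hazard_bound(beta0, beta_cs)
    t = start
    while True:
        t += rng.exponential(1.0 / lam_max)          # candidate at rate lam_max
        if rng.uniform() * lam_max <= hazard(t, beta0, beta_cs):
            return t                                 # accept w.p. lambda(t) / lam_max


def simulate(T, mu, rho, beta0, beta_cs, seed=0):
    """Return event times on [0, T] and labels (1 = parent, 0 = offspring)."""
    rng = np.random.default_rng(seed)
    times, labels = [], []
    t = next_parent(0.0, beta0, beta_cs, rng)
    while t <= T:
        times.append(t)
        labels.append(1)
        for _ in range(rng.poisson(mu)):             # Poisson(mu) offspring
            t += rng.exponential(1.0 / rho)          # exponential gap with rate rho
            times.append(t)
            labels.append(0)
        t = next_parent(t, beta0, beta_cs, rng)      # next parent after the episode ends
    times, labels = np.array(times), np.array(labels)
    keep = times <= T                                # assumption: drop events beyond T
    return times[keep], labels[keep]
```

For example, `simulate(T=30.0, mu=0.5, rho=10.0, beta0=0.0, beta_cs=[(1.0, -1.0)])` produces one month of event times with a single daily harmonic ($p = 1$).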
Model Formulation
- Define the event times $\{T_l : l = 1, \ldots, N\}$ and indicator variables $\{Y_l : l = 1, \ldots, N\}$, where $Y_l = 1$ denotes a parent event and $Y_l = 0$ an offspring event.
- Let $T_0 = 0$ and define the gap times $D_l = T_l - T_{l-1}$, $l = 1, \ldots, N$.
- Let $f_{l0}(x)$ and $f_{l1}(x)$ be the probability density functions of $D_l$ given $Y_l = 0$ and $Y_l = 1$, respectively.
- Assume
$$f_{l0}(x) = \rho \exp(-\rho x), \qquad f_{l1}(x) = \lambda(t_{l-1} + x; \beta) \exp\left[ -\int_{t_{l-1}}^{t_{l-1}+x} \lambda(t; \beta)\, dt \right].$$
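As a quick sanity check on the parent gap-time density (a worked special case added for illustration, not taken from the slides), suppose the hazard is constant, $\lambda(t; \beta) = \lambda_0$. The density is the hazard at the event time multiplied by the probability of no parent event since $t_{l-1}$:

```latex
\[
f_{l1}(x)
  = \lambda_0 \exp\left[ -\int_{t_{l-1}}^{t_{l-1}+x} \lambda_0 \, dt \right]
  = \lambda_0 \exp(-\lambda_0 x),
\]
```

so with a constant hazard the parent gaps are exponential with rate $\lambda_0$, the same form as the offspring density $f_{l0}$ with $\rho$ replaced by $\lambda_0$.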
Model Formulation
Assume the first event is a parent event and all events in the last episode are contained in $[0, T]$. The complete-data likelihood can then be written as
$$L(\theta; t, y) = \prod_{l=1}^{n} \prod_{m=0}^{1} \left[ f_{lm}(d_l; \theta) \right]^{I(y_l = m)} \prod_{i=1}^{k} P(N_i = n_i) \; P(D_{n+1} > T - t_n),$$
where $D_{n+1}$ is the gap time between $t_n$ and the next parent event,
$$P(N_i = n_i) = \frac{\exp(-\mu)\, \mu^{n_i}}{n_i!}, \qquad P(D_{n+1} > T - t_n) = \exp\left[ -\int_{t_n}^{T} \lambda(t; \beta)\, dt \right].$$
Here $k$ is the number of episodes and $n_i$ is the number of offspring events in episode $i$.
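The complete-data log-likelihood can be evaluated term by term. The sketch below is illustrative only: it assumes a single harmonic ($p = 1$) with `beta = (b0, b1, b2)`, approximates the integrated hazard with a trapezoidal rule on a fixed grid, and uses hypothetical function names.

```python
import math
import numpy as np


def hazard(t, beta):
    """Cyclic hazard with one harmonic (assumption: p = 1)."""
    b0, b1, b2 = beta
    return np.exp(b0 + b1 * np.cos(2 * np.pi * t) + b2 * np.sin(2 * np.pi * t))


def cum_hazard(a, b, beta, n_grid=2001):
    """Trapezoidal approximation of the integral of the hazard over [a, b]."""
    grid = np.linspace(a, b, n_grid)
    vals = hazard(grid, beta)
    return float(np.sum((vals[1:] + vals[:-1]) * np.diff(grid)) / 2.0)


def complete_loglik(times, y, T, mu, rho, beta):
    """Log of L(theta; t, y) for sorted times and labels y (1 parent, 0 offspring)."""
    times = np.asarray(times, dtype=float)
    y = np.asarray(y, dtype=int)
    if y[0] != 1:
        raise ValueError("the first event is assumed to be a parent")
    loglik, prev = 0.0, 0.0                      # T_0 = 0
    sizes = []                                   # offspring count per episode
    for t, is_parent in zip(times, y):
        if is_parent:                            # log f_l1(d_l)
            loglik += math.log(hazard(t, beta)) - cum_hazard(prev, t, beta)
            sizes.append(0)
        else:                                    # log f_l0(d_l)
            loglik += math.log(rho) - rho * (t - prev)
            sizes[-1] += 1
        prev = t
    for n_i in sizes:                            # Poisson(mu) episode sizes
        loglik += -mu + n_i * math.log(mu) - math.lgamma(n_i + 1)
    return loglik - cum_hazard(prev, T, beta)    # no parent event in (t_n, T]
```

With `beta = (0.0, 0.0, 0.0)` the hazard is constant at 1, which makes the result easy to check by hand.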
Composite Likelihood Estimation
- The observed-data likelihood is $\sum_{y} L(\theta; t, y)$, where the summation is over all $2^n$ possible values of $y$, which quickly becomes computationally infeasible.
- Divide $W = [0, T]$ into $J$ non-overlapping unit windows of length $s$, i.e., $W = \bigcup_{j=1}^{J} W_j$ where $W_j = [(j-1)s, js)$.
- As before, we assume that
  - the first event in $W_j$ is a parent event, and
  - all events in the last episode of $W_j$ are contained in $W_j$.
- Define $t_j = \{t_i : t_i \in W_j\}$ and $y_j = \{y_i : t_i \in W_j\}$. Then the observed-data likelihood on $W_j$ is $\sum_{y_j} L(\theta; t_j, y_j)$.
- We estimate $\theta$ by maximizing the composite likelihood
$$L(\theta; t) = \prod_{j=1}^{J} \sum_{y_j} L(\theta; t_j, y_j).$$
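For small windows, the composite likelihood can be computed by brute-force enumeration of the labels. The sketch below makes several simplifying assumptions that are not in the slides: the hazard is a constant `lam0` (so all integrals are closed-form), the gap of the first event in a window is measured from the window start, and an empty window contributes only the no-event probability. All function names are hypothetical.

```python
import itertools
import math


def complete_loglik(times, y, start, end, mu, rho, lam0):
    """Complete-data log-likelihood on [start, end) with constant hazard lam0."""
    loglik, prev, sizes = 0.0, start, []
    for t, is_parent in zip(times, y):
        d = t - prev
        if is_parent:
            loglik += math.log(lam0) - lam0 * d
            sizes.append(0)
        else:
            loglik += math.log(rho) - rho * d
            sizes[-1] += 1
        prev = t
    for n_i in sizes:
        loglik += -mu + n_i * math.log(mu) - math.lgamma(n_i + 1)
    return loglik - lam0 * (end - prev)


def window_loglik(times, start, end, mu, rho, lam0):
    """Log of the sum over labelings, with the first event fixed as a parent."""
    if not times:
        return -lam0 * (end - start)             # assumption for empty windows
    logs = [complete_loglik(times, (1,) + rest, start, end, mu, rho, lam0)
            for rest in itertools.product((0, 1), repeat=len(times) - 1)]
    top = max(logs)                              # log-sum-exp for stability
    return top + math.log(sum(math.exp(v - top) for v in logs))


def composite_loglik(times, T, s, mu, rho, lam0):
    """Sum of window log-likelihoods over W_j = [(j-1)s, js), j = 1..J."""
    total = 0.0
    for j in range(int(round(T / s))):
        start, end = j * s, (j + 1) * s
        in_window = [t for t in times if start <= t < end]
        total += window_loglik(in_window, start, end, mu, rho, lam0)
    return total
```

Because the first label in each window is fixed to "parent", only $2^{n_j - 1}$ of the $2^{n_j}$ labelings have nonzero likelihood, and the enumeration visits just those.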
Composite Likelihood Estimation
- Each summation in the composite likelihood is over $2^{n_j}$ terms, where $n_j$ is the number of events in $W_j$.
- Note that $\sum_{j=1}^{J} 2^{n_j} \ll 2^n$, so significant computational gains can be achieved.
- There is a potential bias problem, since
  - the first event in $W_j$ may not be a parent event, and
  - not all events in the last episode of $W_j$ may be contained in $W_j$.
- The bias can be mitigated by choosing the windows carefully.
- Convergence can be a problem, since multiple parameters must be estimated simultaneously and the likelihood surface is often quite flat.
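To make the size of the computational gain concrete, here is a small arithmetic illustration with made-up window counts (the counts are hypothetical, not taken from the data):

```python
def labeling_counts(window_counts):
    """Return (sum_j 2^{n_j}, 2^n) for a list of per-window event counts n_j."""
    windowed = sum(2 ** n_j for n_j in window_counts)
    full = 2 ** sum(window_counts)
    return windowed, full


# Four windows holding 4, 6, 3 and 7 events (n = 20 events in total).
windowed, full = labeling_counts([4, 6, 3, 7])
print(windowed, full)  # 216 1048576
```

Even for 20 events, the windowed sums involve 216 terms instead of more than a million.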
A Composite Likelihood EM Algorithm
- Let $T_j$ and $Y_j$ be the random versions of $t_j$ and $y_j$.
- In the E-step, we take the expectation of the log-likelihood $l(\theta; t_j, Y_j)$ with respect to the conditional distribution of $Y_j \mid T_j = t_j, \hat\theta_{\text{prev}}$, i.e.,
$$Q_j(\theta \mid \hat\theta_{\text{prev}}) = E_{Y_j \mid T_j = t_j, \hat\theta_{\text{prev}}}\, l(\theta; t_j, Y_j).$$
- Define
$$Q(\theta \mid \hat\theta_{\text{prev}}) = \sum_{j=1}^{J} Q_j(\theta \mid \hat\theta_{\text{prev}}).$$
- In the M-step, $Q(\theta \mid \hat\theta_{\text{prev}})$ is maximized with respect to $\theta$.
A Composite Likelihood EM Algorithm
- For the expectation, we need to calculate, for $t_l \in W_j$, the probability $P_\theta(Y_l = m \mid T_j = t_j)$, which is
$$P_\theta(Y_l = m \mid T_j = t_j) = \frac{\sum_{y_j : y_l = m} L(\theta; t_j, y_j)}{\sum_{y_j} L(\theta; t_j, y_j)}.$$
- If there are a large number of events in $W_j$, we employ a standard Metropolis-Hastings algorithm to sample from the conditional distribution of $Y_j \mid T_j = t_j, \theta$ for the E-step.
- Closed-form expressions can be obtained for $\hat\theta$ (except for $\hat\beta$) in the M-step.
- Convergence is not an issue.
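For a window with few events, the E-step probabilities can be computed exactly by enumeration. The sketch below is an illustration under assumptions, not the authors' implementation: the hazard is a constant `lam0`, the first event of a window is fixed as a parent, and the updates for $\mu$ and $\rho$ are derived here from the univariate complete-data likelihood (expected offspring count divided by expected parent count, and expected offspring count divided by expected total offspring gap). The slides state only that closed forms exist, so treat these formulas and the function names as hypothetical.

```python
import itertools
import math


def complete_loglik(times, y, start, end, mu, rho, lam0):
    """Complete-data log-likelihood on [start, end) with constant hazard lam0."""
    loglik, prev, sizes = 0.0, start, []
    for t, is_parent in zip(times, y):
        d = t - prev
        if is_parent:
            loglik += math.log(lam0) - lam0 * d
            sizes.append(0)
        else:
            loglik += math.log(rho) - rho * d
            sizes[-1] += 1
        prev = t
    for n_i in sizes:
        loglik += -mu + n_i * math.log(mu) - math.lgamma(n_i + 1)
    return loglik - lam0 * (end - prev)


def parent_probs(times, start, end, mu, rho, lam0):
    """P(Y_l = 1 | window event times) for each event, by full enumeration."""
    if not times:
        return []
    labelings = [(1,) + rest
                 for rest in itertools.product((0, 1), repeat=len(times) - 1)]
    logs = [complete_loglik(times, y, start, end, mu, rho, lam0) for y in labelings]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]   # proportional to L(theta; t_j, y_j)
    total = sum(weights)
    return [sum(w for w, y in zip(weights, labelings) if y[l] == 1) / total
            for l in range(len(times))]


def update_mu_rho(windows, probs):
    """M-step sketch. windows: list of (times, start); probs: parent probabilities."""
    exp_parents = exp_offspring = exp_gap = 0.0
    for (times, start), w in zip(windows, probs):
        prev = start
        for t, w_l in zip(times, w):
            exp_parents += w_l
            exp_offspring += 1.0 - w_l
            exp_gap += (1.0 - w_l) * (t - prev)
            prev = t
    # Assumes at least one event carries positive offspring probability.
    return exp_offspring / exp_parents, exp_offspring / exp_gap
```

With many events per window the enumeration becomes too expensive, which is where a sampler such as Metropolis-Hastings would replace `parent_probs`.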
A Composite Likelihood EM Algorithm
Theorem. The log-composite likelihood $l(\theta; t) = \log L(\theta; t)$ satisfies
$$l(\theta_p; t) \ge l(\theta_{p-1}; t), \quad p = 1, 2, \ldots,$$
where $\theta_p$ is the $p$th update from the EM algorithm.
- The theorem guarantees that the log-composite likelihood is nondecreasing at each EM iteration.
- The convergence of $\hat\theta_p$ to a stationary point as $p \to \infty$ is guaranteed by Theorem 2 in Wu (1983).
- Standard techniques, such as running the EM algorithm from multiple starting points, can help locate the global maximum.
- Consistency and asymptotic normality can be established for the global maximum (assuming the model is correctly specified).
Extension to Bivariate Case
- For each episode, there is a Poisson number of subepisodes with mean $\gamma$.
- Post and repost subepisodes alternate. The first subepisode is a post subepisode with probability $\alpha$.
- There is a Poisson number of offspring in each post (repost) subepisode with mean $\mu_1$ ($\mu_0$).
- For each offspring in a post (repost) subepisode, its location relative to that of the previous event in the same episode follows an exponential distribution with parameter $\rho_1$ ($\rho_0$).
- Once all the events in an episode have been observed, the parent event of the following episode is generated according to a hazard function $\lambda(t; \beta)$.
- The composite likelihood EM algorithm can be modified to fit the model.
Application to Trump's Twitter Data
Figure: Parameters estimated from Donald Trump's monthly Twitter data (panels: $\alpha$, $\gamma$, $\mu_1$, $\mu_0$, $\rho_1$, $\rho_0$, number of tweets per episode, and episode length in hours). The two red dashed lines mark June 2015 (candidacy announcement) and Jan 2017 (assumes office), respectively.
Figure: Estimated parent event hazard functions from Donald Trump's monthly Twitter data. The two red dashed lines mark June 2015 (candidacy announcement) and Jan 2017 (assumes office), respectively.
Figure: Goodness-of-fit plots of the model fitted for Jan […]. From left to right: the envelope plot (first plot), with the upper and lower envelopes marked by red dashed lines, followed by goodness-of-fit plots for the offspring original post (second plot), offspring repost (third plot) and parent (last plot) inter-event distances. The red solid lines are calculated from the cdf of exponential distributions. The grey bands are the 95% confidence intervals.
Application to Sina Weibo Data
Figure: The posting times of three users (Users 1, 2 and 3; horizontal axis: date, 01/01 to 01/30).
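Table: Estimated $\alpha$, $\gamma$, $\mu_1$, $\mu_0$, $\rho_1$, $\rho_0$ of Users 1, 2 and 3. Only the values in parentheses survive in this transcript; lost entries are marked […].

| User | $\alpha$ | $\gamma$ | $\mu_1$ | $\mu_0$ | $\rho_1$ | $\rho_0$ |
|---|---|---|---|---|---|---|
| User […] | […] (0.008) | […] (0.004) | […] (0.010) | […] (0.014) | […] (7.166) | […] (6.124) |
| User […] | […] (0.009) | […] (0.006) | […] (0.010) | […] (0.010) | […] (13.013) | […] (21.749) |
| User […] | […] (0.006) | […] (0.008) | […] (0.013) | […] (0.012) | […] (5.882) | […] (7.477) |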
Application to Sina Weibo Data
Figure: Parent hazard functions of Users 1, 2 and 3 (vertical axis: intensity; horizontal axis: time of day, 12 am to 12 am).
Application to Sina Weibo Data
Figure: Plots of the mean function and the first three eigenfunctions of the estimated daily parent hazard functions (horizontal axes: time of day, 12 am to 12 am).
Characterize Sina Weibo User Behavior
Figure: Groups in the average daily parent hazard (left plot), average number of posts per episode (middle plot) and average length (in hours) of an episode (right plot). The percentages at the bottom of the boxplots show the percentage of users in each group: […]%, 26.05%, 66.6%; 4.2%, 20.4%, 75.4%; 3.2%, 15.6%, 81.2%.
Social Effect on Users of Sina Weibo
- For each Sina Weibo user, we also collected the number of accounts the user was following (followees) and the number of accounts that were following the user (followers).
- We find a stronger correlation between the number of followees and $\mu_0$ ($r = 0.205$).
- These observations indicate that users who follow more accounts are more likely to have more reposts.
- One explanation could be that the more accounts a user follows, the more content they can repost from.
- Another plausible explanation is that followers on social media tend to repost more.
Social Effect on Users of Sina Weibo
- Popular users, i.e., those whose accounts have many followers, tend to post more original content. They are also more likely to begin a Weibo engagement by posting original content.
- Users who have strong social ties, i.e., who have many followers or follow many others, are more likely to use Weibo more often.
- Users with many followers are more likely to spend more time on Weibo once they start an episode of engagement.
Simulation Study
- We set the observation window length $T = 100$ and $\alpha = 0.6$.
- For each parameter configuration, we simulate 100 event trajectories.
- We set the parent event hazard function as $\lambda(t; \beta) = \exp[\beta_{01} + \beta_{11} \cos(2\pi t) + \beta_{12} \sin(2\pi t)]$.
- For estimation, we use unit window length $s = 1$ or $s = 5$.
- To model $\lambda(t; \beta)$, we consider both the true model and the nonparametric cyclic B-spline model. For the latter, we use the knot vector $(0, 0.2, 0.4, 0.6, 0.8, 1)$.
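Simulation Study
Table: Simulation results by parameter configuration. Only the values in parentheses survive in this transcript; lost entries are marked […].

| $(\gamma, \mu_1, \mu_0, \rho_1, \rho_0)$ | $(\beta_{01}, \beta_{11}, \beta_{12}; s)$ | $\alpha$ | $\gamma$ | $\mu_1$ | $\mu_0$ | $\rho_1$ | $\rho_0$ |
|---|---|---|---|---|---|---|---|
| (0.5, 0.5, 0.5, 10, 15) | (-2, -2, 2; 5) | […] (0.010) | […] (0.013) | […] (0.014) | […] (0.014) | […] (0.261) | […] (0.365) |
| (0.5, 0.5, 0.5, 10, 15) | (-3, -3, 3; 5) | […] (0.007) | […] (0.011) | […] (0.012) | […] (0.014) | […] (0.188) | […] (0.284) |
| (1.0, 0.5, 0.5, 10, 15) | (-2, -2, 2; 5) | […] (0.009) | […] (0.017) | […] (0.011) | […] (0.012) | […] (0.176) | […] (0.257) |
| (0.5, 1.0, 1.0, 10, 15) | (-2, -2, 2; 5) | […] (0.008) | […] (0.010) | […] (0.016) | […] (0.017) | […] (0.171) | […] (0.309) |
| (0.5, 0.5, 0.5, 20, 30) | (-2, -2, 2; 5) | […] (0.008) | […] (0.012) | […] (0.012) | […] (0.013) | […] (0.460) | […] (0.717) |
| (0.5, 0.5, 0.5, 10, 15) | (-2, -2, 2; 1) | […] (0.008) | […] (0.010) | […] (0.014) | […] (0.014) | […] (0.271) | […] (0.309) |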
Summary
- We propose a new clustered temporal point process model for user-generated posts on social media.
- The proposed model captures both the inhomogeneity in the initial posting time and the clustering pattern of the posts that follow the initial post.
- The proposed goodness-of-fit procedure shows that the model fits the data reasonably well.
- The fitted models provide valuable insights into a user's content-generating behavior.
More informationMathematical statistics
October 1 st, 2018 Lecture 11: Sufficient statistic Where are we? Week 1 Week 2 Week 4 Week 7 Week 10 Week 14 Probability reviews Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation
More informationChapter 2 Inference on Mean Residual Life-Overview
Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate
More informationBAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA
BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA Intro: Course Outline and Brief Intro to Marina Vannucci Rice University, USA PASI-CIMAT 04/28-30/2010 Marina Vannucci
More informationan introduction to bayesian inference
with an application to network analysis http://jakehofman.com january 13, 2010 motivation would like models that: provide predictive and explanatory power are complex enough to describe observed phenomena
More informationInteractive GIS in Veterinary Epidemiology Technology & Application in a Veterinary Diagnostic Lab
Interactive GIS in Veterinary Epidemiology Technology & Application in a Veterinary Diagnostic Lab Basics GIS = Geographic Information System A GIS integrates hardware, software and data for capturing,
More informationChapter 4. Theory of Tests. 4.1 Introduction
Chapter 4 Theory of Tests 4.1 Introduction Parametric model: (X, B X, P θ ), P θ P = {P θ θ Θ} where Θ = H 0 +H 1 X = K +A : K: critical region = rejection region / A: acceptance region A decision rule
More informationWeb-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros
Web-based Supplementary Material for A Two-Part Joint Model for the Analysis of Survival and Longitudinal Binary Data with excess Zeros Dimitris Rizopoulos, 1 Geert Verbeke, 1 Emmanuel Lesaffre 1 and Yves
More informationInformation geometry for bivariate distribution control
Information geometry for bivariate distribution control C.T.J.Dodson + Hong Wang Mathematics + Control Systems Centre, University of Manchester Institute of Science and Technology Optimal control of stochastic
More informationDiscovering Topical Interactions in Text-based Cascades using Hidden Markov Hawkes Processes
Discovering Topical Interactions in Text-based Cascades using Hidden Markov Hawkes Processes Jayesh Choudhari, Anirban Dasgupta IIT Gandhinagar, India Email: {choudhari.jayesh, anirbandg}@iitgn.ac.in Indrajit
More informationThreshold estimation in marginal modelling of spatially-dependent non-stationary extremes
Threshold estimation in marginal modelling of spatially-dependent non-stationary extremes Philip Jonathan Shell Technology Centre Thornton, Chester philip.jonathan@shell.com Paul Northrop University College
More informationExploring spatial decay effect in mass media and social media: a case study of China
Exploring spatial decay effect in mass media and social media: a case study of China 1. Introduction Yihong Yuan Department of Geography, Texas State University, San Marcos, TX, USA, 78666. Tel: +1(512)-245-3208
More informationNonparametric Bayesian Methods - Lecture I
Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics
More informationApproximation of Survival Function by Taylor Series for General Partly Interval Censored Data
Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor
More informationSemi-parametric estimation of non-stationary Pickands functions
Semi-parametric estimation of non-stationary Pickands functions Linda Mhalla 1 Joint work with: Valérie Chavez-Demoulin 2 and Philippe Naveau 3 1 Geneva School of Economics and Management, University of
More informationA Framework of Detecting Burst Events from Micro-blogging Streams
, pp.379-384 http://dx.doi.org/10.14257/astl.2013.29.78 A Framework of Detecting Burst Events from Micro-blogging Streams Kaifang Yang, Yongbo Yu, Lizhou Zheng, Peiquan Jin School of Computer Science and
More informationIntroduction to Maximum Likelihood Estimation
Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:
More informationEmpirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications
Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications Fumiya Akashi Research Associate Department of Applied Mathematics Waseda University
More informationSparse Graph Learning via Markov Random Fields
Sparse Graph Learning via Markov Random Fields Xin Sui, Shao Tang Sep 23, 2016 Xin Sui, Shao Tang Sparse Graph Learning via Markov Random Fields Sep 23, 2016 1 / 36 Outline 1 Introduction to graph learning
More informationStatistical Models for Defective Count Data
Statistical Models for Defective Count Data Gerhard Neubauer a, Gordana -Duraš a, and Herwig Friedl b a Statistical Applications, Joanneum Research, Graz, Austria b Institute of Statistics, University
More informationMathematical statistics
October 18 th, 2018 Lecture 16: Midterm review Countdown to mid-term exam: 7 days Week 1 Chapter 1: Probability review Week 2 Week 4 Week 7 Chapter 6: Statistics Chapter 7: Point Estimation Chapter 8:
More informationImproving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates
Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/
More informationStatistics. Lecture 2 August 7, 2000 Frank Porter Caltech. The Fundamentals; Point Estimation. Maximum Likelihood, Least Squares and All That
Statistics Lecture 2 August 7, 2000 Frank Porter Caltech The plan for these lectures: The Fundamentals; Point Estimation Maximum Likelihood, Least Squares and All That What is a Confidence Interval? Interval
More informationExpectation Maximization
Expectation Maximization Aaron C. Courville Université de Montréal Note: Material for the slides is taken directly from a presentation prepared by Christopher M. Bishop Learning in DAGs Two things could
More informationMixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate
Mixture Models & EM icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Previously We looed at -means and hierarchical clustering as mechanisms for unsupervised learning -means
More informationModeling population growth in online social networks
Zhu et al. Complex Adaptive Systems Modeling 3, :4 RESEARCH Open Access Modeling population growth in online social networks Konglin Zhu *,WenzhongLi, and Xiaoming Fu *Correspondence: zhu@cs.uni-goettingen.de
More information