ntopic Organic Traffic Study

Similar documents
(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis

Bayesian Methods for Machine Learning

Bayesian non-parametric model to longitudinally predict churn

Computational statistics

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

FAV i R This paper is produced mechanically as part of FAViR. See for more information.

Learning Objectives for Stat 225

Study Notes on the Latent Dirichlet Allocation

Binomial random variable

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

Text Mining for Economics and Finance Latent Dirichlet Allocation

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units

STAT 200 Chapter 1 Looking at Data - Distributions

Bayesian linear regression

Bayesian Nonparametrics

Spatial Normalized Gamma Process

Practical Statistics for the Analytical Scientist Table of Contents

Penalized Loss functions for Bayesian Model Choice

36-463/663Multilevel and Hierarchical Models

eqr094: Hierarchical MCMC for Bayesian System Reliability

Bayesian PalaeoClimate Reconstruction from proxies:

Control Variates for Markov Chain Monte Carlo

arxiv: v1 [stat.me] 6 Nov 2013

Petr Volf. Model for Difference of Two Series of Poisson-like Count Data

Bagging During Markov Chain Monte Carlo for Smoother Predictions

STATISTICAL ANALYSIS OF LAW ENFORCEMENT SURVEILLANCE IMPACT ON SAMPLE CONSTRUCTION ZONES IN MISSISSIPPI (Part 1: DESCRIPTIVE)

STAT 425: Introduction to Bayesian Analysis

Topic Models. Brandon Malone. February 20, Latent Dirichlet Allocation Success Stories Wrap-up

A Search and Jump Algorithm for Markov Chain Monte Carlo Sampling. Christopher Jennison. Adriana Ibrahim. Seminar at University of Kuwait

Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Making rating curves - the Bayesian approach

Outline. Introduction to SpaceStat and ESTDA. ESTDA & SpaceStat. Learning Objectives. Space-Time Intelligence System. Space-Time Intelligence System

Describing distributions with numbers

Bayesian nonparametric models for bipartite graphs

DRAFT: A2.1 Activity rate Loppersum

Curve Fitting Re-visited, Bishop1.2.5

CS 361: Probability & Statistics

Bayesian Estimation of log N log S

A Semi-parametric Bayesian Framework for Performance Analysis of Call Centers

3 Joint Distributions 71

CPSC 540: Machine Learning

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Bayesian Mixture Modeling of Significant P Values: A Meta-Analytic Method to Estimate the Degree of Contamination from H 0 : Supplemental Material

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Chapter 23: Inferences About Means

Bayesian Nonparametric Regression for Diabetes Deaths

Marginal Specifications and a Gaussian Copula Estimation

The Model Building Process Part I: Checking Model Assumptions Best Practice

Infinite-State Markov-switching for Dynamic. Volatility Models : Web Appendix

Network Simulation Chapter 5: Traffic Modeling. Chapter Overview

Gentle Introduction to Infinite Gaussian Mixture Modeling

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1)

Evaluation Methods for Topic Models

A Bayesian Mixture Model with Application to Typhoon Rainfall Predictions in Taipei, Taiwan 1

Downloaded from:

Lecture 5. G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1

Stat 535 C - Statistical Computing & Monte Carlo Methods. Lecture 18-16th March Arnaud Doucet

Stat 535 C - Statistical Computing & Monte Carlo Methods. Arnaud Doucet.

Semiparametric Generalized Linear Models

A graph contains a set of nodes (vertices) connected by links (edges or arcs)

Ages of stellar populations from color-magnitude diagrams. Paul Baines. September 30, 2008

Classical and Bayesian inference

Overview. and data transformations of gene expression data. Toy 2-d Clustering Example. K-Means. Motivation. Model-based clustering

19 : Bayesian Nonparametrics: The Indian Buffet Process. 1 Latent Variable Models and the Indian Buffet Process

Stat 516, Homework 1

Plausible Values for Latent Variables Using Mplus

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions

Outline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution

Time-Sensitive Dirichlet Process Mixture Models

Nonparametric Bayes Density Estimation and Regression with High Dimensional Data

Lecture 16: Mixtures of Generalized Linear Models

Bayesian Statistics. Debdeep Pati Florida State University. April 3, 2017

13: Variational inference II

Markov Chain Monte Carlo methods

Chapter 6. The Standard Deviation as a Ruler and the Normal Model 1 /67

Contents. Part I: Fundamentals of Bayesian Inference 1

Bayesian GLMs and Metropolis-Hastings Algorithm

More on Unsupervised Learning

Gibbs Sampling in Latent Variable Models #1

Using Model Selection and Prior Specification to Improve Regime-switching Asset Simulations

A Fully Nonparametric Modeling Approach to. BNP Binary Regression

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016

Grade 8 Math Curriculum Map Erin Murphy

Estimating the marginal likelihood with Integrated nested Laplace approximation (INLA)

Dover- Sherborn High School Mathematics Curriculum Probability and Statistics

Inferential Statistics

Lecture 14: Introduction to Poisson Regression

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Subject CS1 Actuarial Statistics 1 Core Principles

Fitting Narrow Emission Lines in X-ray Spectra

The STS Surgeon Composite Technical Appendix

Lecture 13 Fundamentals of Bayesian Inference

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES

Bayesian Inference for Discretely Sampled Diffusion Processes: A New MCMC Based Approach to Inference

STA 4273H: Statistical Machine Learning

Master of Science in Statistics A Proposal

HITTING TIME IN AN ERLANG LOSS SYSTEM

Summarizing Measured Data

Transcription:

ntopic Organic Traffic Study 1 Abstract The objective of this study is to determine whether content optimization solely driven by ntopic recommendations impacts organic search traffic from Google. The study used 3 randomly applied treatments of ntopic modified content, random keyword modified content, and unmodified content. ntopic modified content saw an organic traffic lift of 17.%, random keyword modified content saw a 1% drop and unmodified content saw drop in traffic of 1%. This indicates that ntopic optimized content can create increases in organic search traffic that cannot be explained solely by content freshness measurements or random fluctuations. 2 Experiment Methodology We first describe the process in which we collected the data for our study. 1. Randomly select 2 pages from a single website: we use a single website to prevent domain authority from influencing the effectiveness of content modifications. 2. Identify top keyword referring traffic to each of 2 selected pages 3. Randomly apply treatments to each of the 2 pages 1: Insert recommended words from ntopic Keyword Recommendations for Keyword and Page Content via Paid API 2: Insert random words from dictionary to page 3: Make no modification to the content 4. Record 3 weeks worth of organic google traffic for each keyword prior to transformation. Record 3 weeks worth of organic google traffic beginning 1 month after the launch of transformation 6. Record shift in organic Google traffic for each page 7. Record average shift for each transformation method Figure 1 shows some summary plots of the experiment data. While the empirical distributions of the treatment traffic shifts seem to indicate ntopic performs better on average, there is much higher variability in the less visited sites which is clear in the scatter plot. Therefore, simply calculating the percent shift for each page and analyzing the empirical distributions is not sufficient. 1

Properly assessing the impact of the treatments requires a more principled analysis of the data. We give the details of this analysis in the next section followed by a discussion of the results and model validation considerations. It should be noted that even comparing these noisy percent lift histograms using the non-parametric Mann and Whitney (1947) U test shows the ntopic to be significantly more effective than the other two approaches. 8 2 log(after) 6 4 2 Random Static ntopic count 2 1 1 Random Static ntopic 2 4 6 8 log(before) 1 2 3 4 Figure 1: Left: Scatter plot of the amount of organic search traffic before and after on the log scale the treatments. Right: Histograms of observed percent increase (or decrease) in organic search traffic for each treatment. 3 Poisson Random Effects Model for Search Traffic First, we assume that visits to a particular page follow a Poisson process with some rate λ. This can be characterized by the time between visits to the website being independent and exponentially distributed with rate λ where one then counts the cumulative events. This is a standard assumption in server traffic modeling and many other applications where we count the number of events up to some time (Arlitt and Williamson, 1997). A consequence of this assumption is that the number of events (visits to the website) over any period of time will be a Poisson distribution. Furthermore, the number of events over any non-overlapping periods of the same length will have independent Poisson distributions with the same mean. Let each page be indexed by i and the number of visits to that page in the first 3 week period be denoted x i. We denote the number of visits to the same page during the second 3 week period as y i. Since these pages are all the same website, it is reasonable to assume that the λ i are each drawn from some population of visit rates. We assume based upon the discussion above that for each i x i Poisson(λ i ) where log(λ i ) N(µ, σ 2 ). This is known as a generalized random effects model for Poisson data where there distribution of means is a log normal distribution. We recommend the work by Chib and Carlin (1999) for more information on these types of models. 2

Since our inferential goal is to understand the effect of each treatment indexed by k on the page arrival rate, we assume that the arrival rate for each page is multiplied by a common factor β k. Specifically, we assume that if page i had treatment k, y i Poisson(β k λ i ). We now have a complete model specification for our data with the goal of understanding the difference between β 1, β 2, and β 3 which correspond to the multiplicative effects of the random keywords, static, and ntopic treatments respectively. 4 Results We used the JAGS software to perform model analysis in a Bayesian setting on the parameters β, µ, and σ. This approach is extremely effective for fitting custom, flexible models and for performing model validation as demonstrated in the next section (Plummer, 23). Since this is a Bayesian approach, we need to specify our prior understanding of the variables of interest with probability distributions. Our goal is objective inferences, so we chose priors that represent having no prior information about the parameters. JAGS allows us to take draws from the posterior distribution which represents our understanding of the parameters in our model after we observe the data. We ran the JAGS Gibbs sampler for 1, adaptive iterations, then for an additional 1, burn-in iterations; we then stored the final 1, draws from the posterior distribution to do inference on β. Figure 2 shows the posterior summaries. The first clear insight is that the posterior distribution for the ntopic shift is clearly and significantly larger than the other two effects. Furthermore, the random keywords and static treatments are quite similar to each other and both are below 1 indicating those treatments actually reduce the rate in site visits. Model Validation There are two major assumptions that must be validated in this analysis. First, the choice of random effects distribution could play a major role. However, the JAGS software allowed us to quickly check many different random effects distributions. We then considered a semi-parametric finite mixture of distributions which can approximate a wide class of distributions (Fraley and Raftery, 22). In all of these cases, the results were the same. Furthermore, the log normal assumption is quite reasonable because its heavy tailed features accurately depict the fact that most visiting rates will be around some common region with a few extreme outliers. The second is that within a given treatment, we assume the new visiting rate is some multiplicative shift of the old. One way to test this assumption is to perform a second analysis where we first remove random data from the second period in each treatment and treat it as missing. In the Bayesian framework, those missing observations are treated as parameters to estimate and their estimates are called posterior predictive distributions. Figure 3 shows boxplots for the posterior predictive distributions for observations removed randomly from each treatment. We see that the held out data is within the estimated range, and hence our model does not appear to have any systematic bias as a predictive tool. While this is not a conclusive test of model correctness, it is strong evidence that the results are valid. 3

1.2 2 1.1 1. density 1 1 Random Static ntopic.9.8 Random Static ntopic.8.9 1. 1.1 1.2 Figure 2: Posterior summaries of the shifts β 1, β 2, and β 3 corresponding to random, static, and ntopic treatments respectively. On the left, boxplots of the posterior parameter distributions are presented. On the right, kernel smoothed density estimates of the posterior are shown. 6 Conclusions We have shown strong evidence that insertion of ntopic recommended keywords can increase organic search traffic. This increase is significantly better than a static webpage or random fresh content. It should be noted that We cannot conclude that topic modeling or content relevancy is used in Google s algorithm. We cannot determine the exact mechanism by which ntopic recommended keywords increase Google traffic. We can, however, provide evidence that insertion of ntopic recommended keywords increase Google traffic. References Arlitt, M. F. and Williamson, C. L. (1997). Internet web servers: workload characterization and performance implications. IEEE/ACM Trans. Netw., ():631 64. 4

1 1 sqrt(y) 1 2 3 4 6 7 8 9 1 11 12 13 14 1 Index Figure 3: Posterior predictive estimates for the excluded after treatment data in the validation phase with the actual observed rates shown in red. Chib, S. and Carlin, B. P. (1999). On mcmc sampling in hierarchical longitudinal models. Statistics and Computing, 9:17 26. Fraley, C. and Raftery, A. E. (22). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association, 97(48):611 631. Mann, H. and Whitney, D. (1947). On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat., 18: 6. Plummer, M. (23). Jags: A program for analysis of bayesian graphical models using gibbs sampling.