Nonlinear Characterization of Activity Dynamics in Online Collaboration Websites


Tiago Santos (1), Simon Walk (2), Denis Helic (3)
(1) Know-Center, Graz, Austria; (2) Stanford University; (3) Graz University of Technology
3. April 2017

Introduction: Motivation
- The success of online collaboration websites depends critically on content contributed by users. For example, StackExchange vs. Google Knol.
- Problem: What are the key factors that decide the success or failure of online collaboration websites?
- Goal: Uncover hidden nonlinear behavior in activity dynamics.

Related Work: Nonlinear Time Series Analysis and its Applications
Nonlinear time series analysis studies reconstructions of high-dimensional dynamical systems from low-dimensional observations.
Example applications:
- Small and Tse [1] predicted the outcome of a roulette wheel.
- Hsieh [2] found nonlinear behavior in stock returns.
- Strozzi et al. [3] detected events in the stock market.
- More examples in Marwan et al. [4] and Bradley and Kantz [5].
New application: activity dynamics in online collaboration websites.

Related Work: Dynamical Systems for Networks
Dynamical systems provide mathematical formalizations for the evolution of numerical quantities over time [6, 7].
Applications of dynamical systems theory to study activity in networks:
- Ribeiro [8] models daily active users in online communities with the behavior of active and inactive users.
- Walk et al. [9] model activity in collaboration networks with an activity decay rate and peer influence growth.
Our approach: characterize activity by its propensity to have originated in a dynamical system.

Methodology: Nonlinearity Tests
We assess nonlinearity with 9 statistical tests.
AR process-based tests:
- Broock, Dechert and Scheinkman (BDS) test [10] on ARIMA residuals
- Keenan's one-degree test for nonlinearity [11]
- McLeod-Li test [12]
- Tsay's test for nonlinearity [13]
- Likelihood ratio test for threshold nonlinearity [14]
Neural network-based tests:
- Teräsvirta's neural network test [15]
- White neural network test [16]
Other tests:
- Wald-Wolfowitz runs test [17] on the number of times the time series grows
- Surrogate test for time asymmetry [18]
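To make one of the listed tests concrete, the following is a minimal sketch of a Wald-Wolfowitz runs test applied to the growth pattern of a series. It is an illustration under assumptions, not the setup used in the talk: the function name `runs_test_on_growth` is hypothetical, steps with no change are dropped, and the p-value uses the usual normal approximation for the number of runs.

```python
import math


def runs_test_on_growth(series):
    """Wald-Wolfowitz runs test on the up/down pattern of a series.

    Returns (number of runs, z statistic, two-sided p-value).
    Assumption: steps where the series does not change are ignored.
    """
    # Mark each step as growth (True) or decline (False).
    steps = [b > a for a, b in zip(series, series[1:]) if b != a]
    n_up = sum(steps)
    n_down = len(steps) - n_up
    n = n_up + n_down
    if n_up == 0 or n_down == 0 or n < 3:
        raise ValueError("need at least 3 steps with both growth and decline")

    # A run is a maximal block of identical symbols.
    runs = 1 + sum(1 for a, b in zip(steps, steps[1:]) if a != b)

    # Mean and variance of the run count under the randomness hypothesis.
    mean = 2.0 * n_up * n_down / n + 1.0
    var = 2.0 * n_up * n_down * (2.0 * n_up * n_down - n) / (n * n * (n - 1))

    z = (runs - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation
    return runs, z, p_value


# A strictly alternating series has far more runs than chance would suggest.
runs, z, p = runs_test_on_growth([0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0])
print(runs, round(z, 3), p < 0.05)
```

A small p-value indicates that growth and decline steps are not arranged randomly, which is the kind of evidence this test contributes to the overall count of tests indicating nonlinearity.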

Methodology: Reconstructing nonlinear dynamical system from univariate time series
We reconstruct the state space with Takens' embedding theorem [19] to get:
\[ R_t = (x_t, x_{t-\tau}, x_{t-2\tau}, \ldots, x_{t-(m-1)\tau}) \in \mathbb{R}^m \tag{1} \]
Example: Lorenz dynamical system
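Equation (1) can be sketched in a few lines of Python. The helper name `delay_embed` is an assumption made for illustration; `m` is the embedding dimension and `tau` the time delay, as in the equation. How `m` and `tau` were chosen in the talk is not stated, so they are plain arguments here.

```python
def delay_embed(x, m, tau):
    """Build delay vectors R_t = (x_t, x_{t-tau}, ..., x_{t-(m-1)tau}).

    Returns one tuple per time index t that has enough history.
    """
    start = (m - 1) * tau  # first index with m delayed values available
    if m < 1 or tau < 1 or start >= len(x):
        raise ValueError("series too short for the requested m and tau")
    return [tuple(x[t - j * tau] for j in range(m))
            for t in range(start, len(x))]


# With m = 3 and tau = 2, each state combines x_t, x_{t-2} and x_{t-4}.
print(delay_embed([0, 1, 2, 3, 4, 5], m=3, tau=2))
# → [(4, 2, 0), (5, 3, 1)]
```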

Methodology: Forecasting univariate time series
We employ linear and nonlinear models to forecast time series.
Linear models:
- Linear regression: linear combination of Fourier terms and trend
- ARIMA models: differenced, linear combination of auto-regressors and lagged moving average error terms
- Exponential Smoothing (ETS) models: linear combination of lagged terms, such as level, trend, seasonality and error
Nonlinear models:
- Reconstruct dynamical system properties with Takens embedding
- Forecast univariate time series by following nearby trajectories in reconstructed state space
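The nonlinear forecasting idea above can be illustrated with a simple nearest-neighbor scheme. This is a sketch under assumptions, not necessarily the model used in the talk: the function name `nn_forecast`, the Euclidean distance, and averaging the successors of the `k` closest past states are all illustrative choices.

```python
import math


def nn_forecast(x, m, tau, k):
    """One-step forecast by following the k nearest past trajectories.

    States are delay vectors (x_t, x_{t-tau}, ..., x_{t-(m-1)tau}).
    """
    start = (m - 1) * tau
    last = len(x) - 1
    if k < 1 or last - start < k:
        raise ValueError("not enough past states for the requested k")

    def state(t):
        return [x[t - j * tau] for j in range(m)]

    query = state(last)  # the current state, whose successor is unknown

    # Every earlier state has a known successor x[t + 1].
    candidates = [(math.dist(state(t), query), x[t + 1])
                  for t in range(start, last)]
    candidates.sort(key=lambda c: c[0])

    # Average where the k closest past states went next.
    return sum(successor for _, successor in candidates[:k]) / k


# On a period-3 series the two closest past states are both followed by 0.
print(nn_forecast([0, 1, 2, 0, 1, 2, 0, 1, 2], m=2, tau=1, k=2))
# → 0.0
```

Longer horizons can be obtained by appending each forecast to the series and repeating the step, although errors then accumulate.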

Methodology: Recurrence Analysis
Analyze reconstructed state spaces with Recurrence Plots:
\[ R_{i,j}(\varepsilon) = \Theta(\varepsilon - \lVert x_i - x_j \rVert) \tag{2} \]
Example: Lorenz dynamical system
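Equation (2) marks a pair of states as recurrent when they lie within a distance threshold of each other, with Θ the Heaviside step function. A minimal sketch follows; the name `recurrence_matrix` is hypothetical, and the Euclidean norm and the convention that a distance exactly equal to the threshold counts as recurrent are assumptions.

```python
import math


def recurrence_matrix(states, eps):
    """R[i][j] = 1 if states i and j are within eps of each other, else 0."""
    return [[1 if math.dist(a, b) <= eps else 0 for b in states]
            for a in states]


# The first two states are close to each other, the third is far from both.
states = [(0.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
for row in recurrence_matrix(states, eps=1.5):
    print(row)
# → [1, 1, 0]
# → [1, 1, 0]
# → [0, 0, 1]
```

Plotting this matrix as a black-and-white image gives the recurrence plot; the states would typically be the delay vectors of the reconstructed state space.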

Experimental Setup: Datasets, pre-processing & experiment configuration
Datasets and pre-processing:
- Datasets: 16 randomly picked StackExchange Q&A portals
- Activity-based time series: number of questions, answers and comments per user per day
- Pre-processing: weekly aggregation, burn-in, detrend
Test and forecast setup:
- Significance level of the 9 tests for nonlinearity: 5% (95% confidence)
- Categorize datasets by the number of tests indicating nonlinearity
- Forecast 1 year of activity for all datasets
- Compare forecast root mean squared error (RMSE) of the 4 models
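The pre-processing steps named above could look roughly like the sketch below. The details are assumptions, since the slides do not specify them: weekly aggregation is done by summing 7 daily values, the burn-in is a number of initial weeks to drop, and the trend is a least-squares straight line. All function names are hypothetical.

```python
def weekly_totals(daily):
    """Sum consecutive blocks of 7 daily values; an incomplete last week is dropped."""
    weeks = len(daily) // 7
    return [sum(daily[7 * w:7 * w + 7]) for w in range(weeks)]


def linear_detrend(y):
    """Subtract the least-squares straight line from y."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least 2 points to fit a trend")
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    return [v - (y_mean + slope * (t - t_mean)) for t, v in enumerate(y)]


def preprocess(daily, burn_in_weeks):
    """Weekly aggregation, then burn-in removal, then detrending."""
    weekly = weekly_totals(daily)[burn_in_weeks:]
    return linear_detrend(weekly)


# Six weeks of made-up daily activity that grows linearly from week to week.
daily = [week for week in range(6) for _ in range(7)]
print(weekly_totals(daily))
# → [0, 7, 14, 21, 28, 35]
```

For a purely linear series such as this one, the detrended output is zero up to rounding error, which is a quick sanity check for the detrending step.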

Results & Discussion: Nonlinearity test results and forecast performance comparison
Group datasets on nonlinearity test results and rank forecast RMSE per group with the Friedman test [20]:
- 10 out of 16 datasets with at most 4 of 9 tests indicating nonlinearity. Friedman test rank: 1. ETS, 2. ARIMA, 3. Nonlinear, 4. Linear
- 6 out of 16 datasets with at least 5 of 9 tests indicating nonlinearity. Friedman test rank: 1. Nonlinear, 2. ARIMA, 2. ETS, 4. Linear
Observations:
- Neural network-based tests are more sensitive to nonlinear dynamics
- The presence of nonlinear dynamics impacts forecast and modeling efforts
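The ranking step can be sketched as follows: rank the models by RMSE within each dataset (ties share the average rank), average the ranks over datasets, and compute the Friedman statistic from those averages. The function names are hypothetical and the RMSE values in the example are made up purely for illustration; they are not results from the talk. The statistic shown omits a tie correction.

```python
def average_ranks(scores):
    """scores maps model name -> list of RMSE values, one per dataset.

    Lower RMSE gets the better (smaller) rank; ties share the mean rank.
    """
    models = list(scores)
    n_datasets = len(scores[models[0]])
    totals = {m: 0.0 for m in models}
    for d in range(n_datasets):
        ordered = sorted(models, key=lambda m: scores[m][d])
        i = 0
        while i < len(ordered):
            # Extend j over all models tied with the one at position i.
            j = i
            while (j + 1 < len(ordered)
                   and scores[ordered[j + 1]][d] == scores[ordered[i]][d]):
                j += 1
            tied_rank = (i + j) / 2.0 + 1.0
            for m in ordered[i:j + 1]:
                totals[m] += tied_rank
            i = j + 1
    return {m: totals[m] / n_datasets for m in models}


def friedman_statistic(avg_ranks, n_datasets):
    """Friedman chi-square statistic from average ranks (no tie correction)."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks.values())
    return 12.0 * n_datasets / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)


# Hypothetical RMSE values for 4 models on 3 datasets (illustration only).
toy_rmse = {
    "ETS": [1.0, 2.0, 1.5],
    "ARIMA": [1.2, 2.5, 1.5],
    "Nonlinear": [1.1, 1.8, 2.0],
    "Linear": [3.0, 3.0, 3.0],
}
ranks = average_ranks(toy_rmse)
print(ranks)
print(friedman_statistic(ranks, n_datasets=3))
```

Sharing the average rank among tied models is what allows two models to end up with the same position in a ranking.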

Results & Discussion: Recurrence Plot Analysis
We study the reconstructed state spaces of 2 datasets deemed nonlinear. The recurrence plot (RP) supports activity dynamics modeling efforts:
- Math: drift, chaotic dynamics and slowly changing states
- Bitcoin: periodic dynamics and non-stationary transitions

Conclusions & Future Work
Conclusions:
- Group activity-based time series by their propensity to originate from dynamical systems
- Increase accuracy in activity forecast experiments
- Customize activity models with Recurrence Plots
- More and longer time series would give more conclusive results
Future work:
- Understand the reason for differences in nonlinear behavior
- Study the underlying collaboration networks
- Recurrence Quantification Analysis

Thank you very much for your time! Questions?

Backup and References: Results table

Backup and References: References
[1] Small, M and Tse, CK. Predicting the outcome of roulette. Chaos: An Interdisciplinary Journal of Nonlinear Science 2012;22:033150.
[2] Hsieh, DA. Chaos and nonlinear dynamics: application to financial markets. The Journal of Finance 1991;46:1839-1877.
[3] Strozzi, F, Zaldívar, JM, and Zbilut, JP. Application of nonlinear time series analysis techniques to high-frequency currency exchange data. Physica A: Statistical Mechanics and its Applications 2002;312:520-538.
[4] Marwan, N, Romano, MC, Thiel, M, and Kurths, J. Recurrence plots for the analysis of complex systems. Physics Reports 2007;438:237-329.
[5] Bradley, E and Kantz, H. Nonlinear time-series analysis revisited. Chaos: An Interdisciplinary Journal of Nonlinear Science 2015;25:097610.
[6] Luenberger, DG. Introduction to dynamic systems; theory, models, and applications. Tech. rep. 1979.
[7] Guckenheimer, J and Holmes, PJ. Nonlinear oscillations, dynamical systems, and bifurcations of vector fields. Vol. 42. Springer Science & Business Media, 2013.
[8] Ribeiro, B. Modeling and predicting the growth and death of membership-based websites. In: Proceedings of the 23rd International Conference on World Wide Web. ACM. 2014:653-664.
[9] Walk, S, Helic, D, Geigl, F, and Strohmaier, M. Activity dynamics in collaboration networks. ACM Transactions on the Web (TWEB) 2016;10:11.
[10] Broock, W, Scheinkman, JA, Dechert, WD, and LeBaron, B. A test for independence based on the correlation dimension. Econometric Reviews 1996;15:197-235.
[11] Keenan, DM. A Tukey nonadditivity-type test for time series nonlinearity. Biometrika 1985;72:39-44.
[12] McLeod, AI and Li, WK. Diagnostic checking ARMA time series models using squared-residual autocorrelations. Journal of Time Series Analysis 1983;4:269-273.
[13] Tsay, RS. Nonlinearity tests for time series. Biometrika 1986;73:461-466.
[14] Chan, KS. Percentage points of likelihood ratio tests for threshold autoregression. Journal of the Royal Statistical Society. Series B (Methodological) 1991:691-696.
[15] Teräsvirta, T, Lin, CF, and Granger, CW. Power of the neural network linearity test. Journal of Time Series Analysis 1993;14:209-220.
[16] Lee, TH, White, H, and Granger, CW. Testing for neglected nonlinearity in time series models: A comparison of neural network methods and alternative tests. Journal of Econometrics 1993;56:269-290.
[17] Siegel, S. Nonparametric statistics for the behavioral sciences. 1956.
[18] Schreiber, T and Schmitz, A. Surrogate time series. Physica D: Nonlinear Phenomena 2000;142:346-382.
[19] Takens, F. Detecting strange attractors in turbulence. In: Dynamical Systems and Turbulence, Warwick 1980. Springer, 1981:366-381.
[20] Demšar, J. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research 2006;7:1-30.