Clustering algorithms distributed over a Cloud Computing Platform.


Clustering algorithms distributed over a Cloud Computing Platform. September 28th, 2012. Ph.D. thesis supervised by Pr. Fabrice Rossi. Matthieu Durut (Telecom/Lokad).

Outline.
1 Introduction to Cloud Computing
2 Context
3 Distributed Batch K-Means
4 Distributed Vector Quantization algorithms

Section 1: Introduction to Cloud Computing

Introduction to Cloud Computing - What is Cloud Computing?
Some features:
1 Abstraction of commodity hardware that can be rented on demand on an hourly basis.
2 Quasi-infinite hardware scale-up.
3 Virtualization, which makes web-application maintenance easier.
Grid vs Cloud: ownership, intensive use of Virtual Machines (VMs), elasticity, hardware administration and maintenance.

Introduction to Cloud Computing - Everything as a Service
1 Software as a Service (SaaS): Gmail, Salesforce, Lokad API, etc.
2 Platform as a Service (PaaS): Azure, Amazon S3, etc.
3 Infrastructure as a Service (IaaS): Amazon EC2, etc.
Stack of Azure:
Storage level: BlobStorage, TableStorage, QueueStorage, SQLAzure.
Execution level: Dryad.
Domain-specific language level: DryadLinq, Scope.

Figure: Illustration of the Google, Hadoop and Microsoft technology stacks for building cloud applications.

Introduction to Cloud Computing - MapReduce

Introduction to Cloud Computing - The Windows Azure Storage (WAS)
BlobStorage: key-value pair (blob name/blob) storage. No full ACID guarantees, but atomicity, strong persistence and strong consistency per blob. Optimistic Read-Modify-Write (RMW) primitive.
QueueStorage: set of scalable queues. Asynchronous message-delivery mechanism. Approximately FIFO. Messages are returned at least once => idempotency is required.
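The optimistic Read-Modify-Write primitive above boils down to a retry loop over etag-conditioned writes. A minimal sketch against a hypothetical in-memory stand-in for the BlobStorage (the class, method names and etag handling are simplified assumptions, not the Azure API):

```python
# Hypothetical in-memory stand-in for a blob store with etag-based
# optimistic concurrency control.
class FakeBlobStore:
    def __init__(self):
        self._data = {}  # blob name -> (value, etag)

    def get(self, name):
        return self._data.get(name, (None, None))

    def put_if_match(self, name, value, etag):
        current_etag = self._data.get(name, (None, None))[1]
        if current_etag != etag:
            return False  # a concurrent writer won the race
        self._data[name] = (value, (etag or 0) + 1)
        return True

def read_modify_write(store, name, update):
    # Optimistic RMW: read the blob and its etag, apply the update,
    # and retry whenever the conditional write fails.
    while True:
        value, etag = store.get(name)
        if store.put_if_match(name, update(value), etag):
            return

store = FakeBlobStore()
read_modify_write(store, "counter", lambda v: (v or 0) + 1)
read_modify_write(store, "counter", lambda v: (v or 0) + 1)
print(store.get("counter")[0])  # 2
```

Because the write is conditional on the etag, a lost race simply triggers another read-update-write round, which is why idempotent updates matter.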

Introduction to Cloud Computing - Elements of Azure application architecture
No communication framework such as MPI. WAS used as a shared-memory abstraction. No affinity between storage and processing units. Task agnosticism of workers (at least in the beginning). Idempotence.

Section 2: Context

Context - Why clustering?
One of Lokad's abilities is dealing with large-scale data. We need to group client data (clustering) to extract information from complex objects (e.g. time-series seasonality).
Problem set-up: the data set is composed of N points \{z_t\}_{t=1}^{N} in R^d. Clustering point of view: find a simplified representation with \kappa vectors of R^d. These vectors are called prototypes/centroids and gathered in a quantization scheme w = (w_1, \dots, w_\kappa) \in (R^d)^\kappa.

Context - Objective
The clustering challenge can be expressed as the minimization of the empirical distortion C_N, where
C_N(w) = \sum_{t=1}^{N} \min_{l=1,\dots,\kappa} \|z_t - w_l\|^2, \quad w \in (R^d)^\kappa.
Exact minimization is computationally intractable. Some approximate algorithms: Batch K-Means, Vector Quantization (Online K-Means), Neural Gas, Kohonen Maps.
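The empirical distortion C_N is easy to evaluate directly; a small NumPy sketch (illustrative, not the thesis code), with a tiny hand-made data set:

```python
import numpy as np

def empirical_distortion(points, prototypes):
    # points: (N, d), prototypes: (kappa, d).
    # For each z_t, squared distance to its closest prototype, summed over t.
    d2 = ((points[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

points = np.array([[0.0], [1.0], [10.0]])
protos = np.array([[0.0], [10.0]])
print(empirical_distortion(points, protos))  # 1.0
```

Every clustering algorithm listed above can be seen as a heuristic to drive this quantity down without enumerating all assignments.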

Context - Why distributed?
A suitable way to obtain more computing resources: faster serial computers are increasingly expensive and face physical limits. Cloud computing was adopted by Lokad (MS Azure); by early 2012, all apps ran on the Cloud, scaling up to 300 VMs. Consequences: communication delays and the lack of efficient shared-memory asynchronous schemes.

Section 3: Distributed Batch K-Means

Distributed Batch K-Means - Sequential Batch K-Means
Algorithm 1 Sequential Batch K-Means
  Select \kappa initial prototypes (w_k)_{k=1}^{\kappa}
  repeat
    for t = 1 to N do
      for k = 1 to \kappa do
        compute \|z_t - w_k\|^2
      end for
      find the closest centroid w_{k^*(t)} to z_t
    end for
    for k = 1 to \kappa do
      w_k = \frac{1}{\#\{t : k^*(t) = k\}} \sum_{\{t : k^*(t) = k\}} z_t
    end for
  until the stopping criterion is met
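Algorithm 1 can be sketched in a few lines of NumPy; initialization and the stopping rule are simplified assumptions (a fixed iteration count), so this is an illustrative sketch rather than the thesis implementation:

```python
import numpy as np

def batch_kmeans(points, prototypes, iterations=10):
    w = prototypes.astype(float).copy()
    for _ in range(iterations):
        # Assignment phase: closest prototype for every point.
        d2 = ((points[:, None, :] - w[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recalculation phase: each prototype becomes the mean of its cluster.
        for k in range(w.shape[0]):
            members = points[labels == k]
            if len(members):
                w[k] = members.mean(axis=0)
    return w, labels

points = np.array([[0.0], [1.0], [9.0], [10.0]])
w, labels = batch_kmeans(points, np.array([[0.0], [10.0]]))
print(w.ravel())  # [0.5 9.5]
```

The two phases (assignment, recalculation) are exactly the ones split across machines in the distributed version.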

Distributed Batch K-Means - Characteristics
Relatively fast: Batch WallTime_{seq} = (3N\kappa d + N\kappa + Nd + \kappa d) I T_{flop}, where I refers to the number of iterations and T_{flop} to the time needed to evaluate one floating-point operation. Deterministic. Easy to set up. Results become stationary after a certain iteration.
Suited for parallelization? Obvious data-level parallelism. Same result as the sequential version. Excellent speed-up efficiency already achieved.

Distributed Batch K-Means - Distribution scheme
Data-level parallelism suggests an iterated Map-Reduce distribution. The data set \{z_t\}_{t=1}^{N} is homogeneously split into M chunks (one per processing unit): S_i, i \in \{1..M\}. Processing unit i computes the distances \|z_t^i - w_k\|^2 for z_t^i \in S_i and k \in \{1..\kappa\} (Map phase). Then the new prototype version is recomputed by one or several machines (Reduce phase).
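One Map-Reduce iteration of this scheme can be simulated as follows: each hypothetical worker returns per-cluster partial sums and counts for its chunk, and the Reduce phase merges them, so the result matches the sequential recalculation phase exactly (illustrative sketch, not the cloud implementation):

```python
import numpy as np

def map_phase(chunk, prototypes):
    # Assignment on one worker's chunk, returning partial sums and counts.
    d2 = ((chunk[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    kappa, d = prototypes.shape
    sums, counts = np.zeros((kappa, d)), np.zeros(kappa)
    for k in range(kappa):
        members = chunk[labels == k]
        sums[k] = members.sum(axis=0)
        counts[k] = len(members)
    return sums, counts

def reduce_phase(partials, prototypes):
    # Merge partial statistics; empty clusters keep their old prototype.
    sums = sum(p[0] for p in partials)
    counts = sum(p[1] for p in partials)
    new = prototypes.astype(float).copy()
    nonempty = counts > 0
    new[nonempty] = sums[nonempty] / counts[nonempty][:, None]
    return new

protos = np.array([[0.0], [10.0]])
chunks = [np.array([[0.0], [1.0]]), np.array([[9.0], [10.0]])]
partials = [map_phase(c, protos) for c in chunks]
new_protos = reduce_phase(partials, protos)
print(new_protos.ravel())  # [0.5 9.5]
```

Because sums and counts are additive, splitting the data set changes nothing in the result, which is why the distributed Batch K-Means returns exactly the sequential output.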

Distributed Batch K-Means - Batch K-Means distributed over a DMM architecture

Distributed Batch K-Means - Wall time
Batch WallTime_M^{DMM} = T_M^{comp} + T_M^{comm}, where T_M^{comp} refers to the wall time of the assignment phase and T_M^{comm} to the wall time of the recalculation phase (mostly spent in communications).
Assignment phase: T_M^{comp} = \frac{3 I N \kappa d T_{flop}}{M}.

Distributed Batch K-Means - Recalculation phase (DMM architecture with MPI)
T_M^{comm} = \log_2(M) \frac{I \kappa d S}{B}, where S refers to the size of a double in memory (8 bytes in the following) and B to the communication bandwidth per machine.
Wall time (DMM architecture with MPI): Batch WallTime_M^{DMM} = \frac{3 I N \kappa d T_{flop}}{M} + \log_2(M) \frac{I \kappa d S}{B}.

Distributed Batch K-Means - Speed-up (DMM architecture with MPI)
SpeedUp^{DMM}(M, N) = \frac{3 N T_{flop}}{\frac{3 N T_{flop}}{M} + \frac{S}{B} \log_2(M)}.
Optimal number of processing units: M^*_{DMM} = \frac{3 N T_{flop} B}{S}.
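The speed-up model can be explored numerically. Below, the constants for T_flop, S and B are illustrative made-up values (not the thesis measurements), and the best M is found by brute force over the model:

```python
import math

def speedup_dmm(M, N, T_flop=1e-9, S=8, B=1e9):
    # DMM speed-up model: compute term shrinks with M,
    # communication term grows like log2(M).
    return (3 * N * T_flop) / (3 * N * T_flop / M + (S / B) * math.log2(M))

N = 10**4
best_M = max(range(1, 20000), key=lambda M: speedup_dmm(M, N))
print(best_M)
```

With these constants the brute-force optimum lands close to the closed-form order of magnitude 3 N T_flop B / S, illustrating that past some M the logarithmic communication cost dominates and extra machines stop paying off.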

Distributed Batch K-Means - Batch K-Means distributed over Azure

Figure: Distribution scheme of our cloud Batch K-Means. Each mapper pushes its Map result (prototypes) as a blob into the BlobStorage; each partial reducer pings the storage until it finds the expected blobs, downloads them, and pushes a partial reduce result (prototypes); a final reducer produces the final reduce result (prototypes).

Distributed Batch K-Means - Communication modeling
T_M^{comm} = I \sqrt{M} \kappa d S (2 T_{Blob}^{read} + T_{Blob}^{write}), where T_{Blob}^{read} (resp. T_{Blob}^{write}) refers to the time needed by a given processing unit to download (resp. upload) a blob from (resp. to) the storage, per memory unit.
Speed-up (cloud architecture): SpeedUp(M, N) = \frac{3 N T_{flop} M}{3 N T_{flop} + M^{3/2} S (2 T_{Blob}^{read} + T_{Blob}^{write})}.
Optimal number of workers: M^*(N) = \left( \frac{6 N T_{flop}}{S (2 T_{Blob}^{read} + T_{Blob}^{write})} \right)^{2/3}.
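With made-up constants for T_flop and the per-byte blob read/write times (assumptions, not the thesis measurements), one can check numerically that the argmax of the cloud speed-up model agrees with the closed-form M^*(N):

```python
# Illustrative constants: T_flop in seconds, S in bytes,
# T_rw = 2*T_read + T_write per byte.
T_flop = 1e-9
S = 8
T_rw = 2 * 5e-9 + 5e-9
N = 10**6

def speedup_cloud(M):
    # Cloud model: communication cost grows like M**1.5 in the denominator.
    return (3 * N * T_flop * M) / (3 * N * T_flop + M ** 1.5 * S * T_rw)

best_M = max(range(1, 5000), key=speedup_cloud)
closed_form = (6 * N * T_flop / (S * T_rw)) ** (2 / 3)
print(best_M, round(closed_form))
```

The brute-force optimum and the closed form coincide up to rounding, which confirms that the exponent 2/3 follows from the M^{3/2} communication term.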

Figure: Time to execute the Reduce phase per unit of memory (2 T_{Blob}^{read} + T_{Blob}^{write}), in 10^{-7} sec/byte, as a function of the number of communicating units.

Figure: Speedup performance curves (observed vs. theoretical) as a function of the number of mappers, for several data set sizes N.

Table: Comparison between the effective optimal number of processing units M_eff and the theoretical optimal number of processing units M^* for different data set sizes (columns: N, M_eff, M^*, wall time, sequential time, effective speedup, theoretical speedup).

Figure: Speedup performance curves for different numbers of processing units. For each value of M, the value of N is set so that the processing units are heavily loaded with data and computations.

Figure: Distribution of the processing time (in seconds) for multiple runs of the same computation task on multiple VMs.

Section 4: Distributed Vector Quantization algorithms

Distributed Vector Quantization algorithms - Asynchronous clustering: motivation
Joint work with Benoît Patra.
Every action should be accounted for exactly once: no calculation should be discarded; no calculation should be used more than once; every write should result in a prototype update everywhere; every read should be used locally. (On War, Clausewitz.)
Saturate bandwidth, memory, CPU, etc. => asynchronism => online, or at least mini-batch (no more batch).

Distributed Vector Quantization algorithms - Sequential VQ algorithm
Consists of incremental updates of the (R^d)^\kappa-valued prototypes \{w(t)\}_{t \ge 0}, initialized from a random w(0) \in (R^d)^\kappa. Given a sequence of positive steps (\varepsilon_t)_{t>0}, it produces a sequence w(t) by updating w at each step with a descent term:
H(z, w) = \left( (w_l - z) \, 1_{\{l = \arg\min_{i=1,\dots,\kappa} \|z - w_i\|^2\}} \right)_{1 \le l \le \kappa},
w(t+1) = w(t) - \varepsilon_{t+1} H(z_{\{t+1 \bmod n\}}, w(t)), \quad t \ge 0.

Algorithm 2 Sequential VQ algorithm
  Select \kappa initial prototypes (w_k)_{k=1}^{\kappa}
  Set t = 0
  repeat
    for k = 1 to \kappa do
      compute \|z_{\{t+1 \bmod n\}} - w_k\|^2
    end for
    Deduce H(z_{\{t+1 \bmod n\}}, w)
    Set w(t+1) = w(t) - \varepsilon_{t+1} H(z_{\{t+1 \bmod n\}}, w(t))
    Increment t
  until the stopping criterion is met
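Algorithm 2 in NumPy form, with the illustrative step choice eps_t = a/(t+1) (an assumption; the slides do not fix the step sequence here):

```python
import numpy as np

def vq(points, prototypes, iterations, a=0.5):
    n = len(points)
    w = prototypes.astype(float).copy()
    for t in range(iterations):
        z = points[t % n]                        # cycle through the data set
        k = ((w - z) ** 2).sum(axis=1).argmin()  # closest prototype
        w[k] -= (a / (t + 1)) * (w[k] - z)       # descent term H(z, w)
    return w

points = np.array([[0.0], [1.0], [9.0], [10.0]])
w = vq(points, np.array([[2.0], [8.0]]), iterations=400)
print(w.ravel())
```

Only the winning prototype moves at each step, which is what makes the per-point cost \kappa distance evaluations rather than a full pass over the data.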

Distributed Vector Quantization algorithms - Our context
We assume that a satisfactory VQ implementation has been found, but that it is too slow. We are not concerned with optimizing its parameters (initialization, sequence of steps, etc.). We have access to a finite data set \{z_t^i\}_{t=0}^{n}, i \in \{1,\dots,M\}, distributed over M processing units. When does a distributed VQ implementation perform better than the corresponding sequential one?

Distributed Vector Quantization algorithms - Definition of speed-up for VQ algorithms
A reference prototype version is made available in the shared memory (BlobStorage), referred to as the prototypes shared version: w_{srd}. Performance is measured with the corresponding empirical distortion: for all w \in (R^d)^\kappa,
L_N(w) = \frac{1}{nM} \sum_{i=1}^{M} \sum_{t=1}^{n} \min_{l=1,\dots,\kappa} \|z_t^i - w_l\|^2.
After any t wall-time seconds, the empirical distortion of the shared version should be lower than that of the prototype version produced by the sequential algorithm.

Distributed Vector Quantization algorithms - Previous work
VQ seen as a stochastic gradient descent method. With shared memory: interleaving the prototype version updates. Without shared memory but with loss convexity: averaging the prototype versions.
In our case: no efficient shared memory, and no convexity of the loss function.
Organization of our work: simulated distributed architecture on a single machine, then cloud implementation.

Distributed Vector Quantization algorithms - First distributed scheme
All the versions are set equal at time t = 0: w^1(0) = \dots = w^M(0). For all i \in \{1,\dots,M\} and all t \ge 0, we have the following iterations:
w_{temp}^i = w^i(t) - \varepsilon_{t+1} H(z^i_{\{t+1 \bmod n\}}, w^i(t)),
w^i(t+1) = w_{temp}^i  if t mod \tau \ne 0 or t = 0,
w^i(t+1) = w_{srd} = \frac{1}{M} \sum_{j=1}^{M} w_{temp}^j  if t mod \tau = 0 and t \ge \tau.
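This first (averaging) scheme can be simulated on a single machine, in the spirit of the simulated experiments; the chunks, step sequence a/(t+1) and constants below are illustrative assumptions:

```python
import numpy as np

def distributed_vq_avg(chunks, prototypes, iterations, tau=10, a=0.5):
    # One entry per simulated worker, all starting from the same version.
    w = [prototypes.astype(float).copy() for _ in chunks]
    for t in range(iterations):
        for i, chunk in enumerate(chunks):
            z = chunk[t % len(chunk)]
            k = ((w[i] - z) ** 2).sum(axis=1).argmin()
            w[i][k] -= (a / (t + 1)) * (w[i][k] - z)  # local VQ step
        if t % tau == 0 and t > 0:
            shared = sum(w) / len(w)                  # averaging phase
            w = [shared.copy() for _ in w]
    return sum(w) / len(w)

chunks = [np.array([[0.0], [1.0]]), np.array([[9.0], [10.0]])]
w_srd = distributed_vq_avg(chunks, np.array([[2.0], [8.0]]), iterations=200)
print(w_srd.ravel())
```

Averaging full versions dilutes each worker's progress by a factor M when the other workers did not move the same prototype, which motivates the displacement-term scheme that follows.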

Figure: A simple (and synchronous) scheme: whenever \tau points have been processed (in a global time reference), an averaging phase occurs.

Figure: Performance (empirical distortion vs. iterations t) for different numbers of computing entities: M = 1, 2, 10 and \tau = 10.

Distributed Vector Quantization algorithms - A comparison between the previous parallel scheme and the sequential VQ
For t mod \tau = 0 and t > 0, and for all i \in \{1,\dots,M\}:
w^i(t+1) = w^i(t-\tau+1) - \sum_{t'=t-\tau+1}^{t} \varepsilon_{t'+1} \left( \frac{1}{M} \sum_{j=1}^{M} H(z^j_{t'+1}, w^j(t')) \right)  (parallel)
w(t+1) = w(t-\tau+1) - \sum_{t'=t-\tau+1}^{t} \varepsilon_{t'+1} H(z_{\{t'+1 \bmod n\}}, w(t'))  (sequential)
In both cases, the H terms are estimators of the gradient.

Distributed Vector Quantization algorithms - Steps and displacement terms
Two SGD algorithms with the same sequence of steps have similar convergence speeds. The sequence of steps (learning rate) sets a trade-off between exploration and convergence.
Introducing displacement/descent terms: for all j \in \{1,\dots,M\} and t_2 \ge t_1 \ge 0, set
\Delta^j_{t_1 \to t_2} = - \sum_{t'=t_1+1}^{t_2} \varepsilon_{t'+1} H(z^j_{\{t'+1 \bmod n\}}, w^j(t')).
\Delta^j_{t_1 \to t_2} corresponds to the displacement of the prototypes computed by unit j during (t_1, t_2).

Distributed Vector Quantization algorithms - Second distributed scheme
w_{temp}^i = w^i(t) - \varepsilon_{t+1} H(z^i_{\{t+1 \bmod n\}}, w^i(t)),
w^i(t+1) = w_{temp}^i  if t mod \tau \ne 0 or t = 0,
w^i(t+1) = w_{srd}, with w_{srd} \leftarrow w_{srd} + \sum_{j=1}^{M} \Delta^j_{t-\tau \to t},  if t mod \tau = 0 and t \ge \tau.
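The second scheme replaces averaging by adding each worker's displacement term to the shared version. A simulation sketch under the same illustrative setup as before (chunks, step sequence and constants are assumptions):

```python
import numpy as np

def distributed_vq_sum(chunks, prototypes, iterations, tau=10, a=0.5):
    w_srd = prototypes.astype(float).copy()
    w = [w_srd.copy() for _ in chunks]
    for t in range(iterations):
        for i, chunk in enumerate(chunks):
            z = chunk[t % len(chunk)]
            k = ((w[i] - z) ** 2).sum(axis=1).argmin()
            w[i][k] -= (a / (t + 1)) * (w[i][k] - z)  # local VQ step
        if t % tau == 0 and t > 0:
            # Sum each worker's displacement since the last merge into the
            # shared version, then broadcast it back.
            w_srd = w_srd + sum(wi - w_srd for wi in w)
            w = [w_srd.copy() for _ in chunks]
    return w_srd

chunks = [np.array([[0.0], [1.0]]), np.array([[9.0], [10.0]])]
w_srd = distributed_vq_sum(chunks, np.array([[2.0], [8.0]]), iterations=200)
print(w_srd.ravel())
```

Since each worker restarted from the shared version at the last merge, `wi - w_srd` is exactly its displacement term, and no worker's progress is diluted by the others.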

Figure: Illustration of the parallelization scheme of VQ procedures with displacement terms (second distributed scheme).

Figure: Performance curves for the revised scheme: M = 1, 2, 10 and \tau = 10.

Distributed Vector Quantization algorithms - Delayed distributed scheme
w_{temp}^i = w^i(t) - \varepsilon_{t+1} H(z^i_{\{t+1 \bmod n\}}, w^i(t)),
w^i(t+1) = w_{temp}^i  if t mod \tau \ne 0 or t = 0,
w_{srd} \leftarrow w_{srd} + \sum_{j=1}^{M} \Delta^j_{t-2\tau \to t-\tau}  if t mod \tau = 0 and t \ge 2\tau,
w^i(t+1) = w_{srd} + \Delta^i_{t-\tau \to t}  if t mod \tau = 0 and t \ge \tau.

Figure: Illustration of the delayed parallelization scheme. The reducing phase is only drawn for processor 1, where t = 2\tau, and processor 4, where t = 4\tau.

Figure: Performance curves for the delayed scheme with different numbers of computing entities: M = 1, 2, 10 and \tau = 10.

Distributed Vector Quantization algorithms - Simulated parallelization schemes: first conclusions
Motto: sum displacement terms rather than averaging versions.
Experimental results: satisfactory speed-ups are recovered for the latter simulated parallel schemes. Delays (deterministic + random) are also studied: reasonable random delays do not have a severe impact on convergence. Good prospects for a true implementation on a cloud computing platform.

Distributed Vector Quantization algorithms - The CloudDALVQ project
A scientific project for testing new large-scale clustering/quantization algorithms distributed on a cloud platform (MS Azure). Open source, written in C#/.NET, released under the new BSD licence.

Figure: Distribution scheme of our cloud VQ implementation. Each mapper pushes its Map result (displacement term) as a blob into the BlobStorage; each partial reducer pings the storage until it finds the expected blobs, downloads them, and pushes a partial reduce result (displacement term); a final reducer produces the final reduce result (prototypes).

Figure: Internal architecture of a worker (ProcessService): a pull thread reads the shared version from the BlobStorage into a read buffer; process threads update the local version from the local data, producing a displacement term; a push thread writes the displacement term back to the BlobStorage through a write buffer.

Figure: Normalized quantization curves (empirical distortion vs. seconds) with M = 1, 2, 4, 8, 16. Trouble appears with M = 16 because the ReduceService is overloaded.

Figure: Normalized quantization curves (empirical distortion vs. seconds) with M = 8, 16, 32, 64, with an extra layer for the reducing task.

Figure: Competition between our cloud DAVQ algorithm and the cloud Batch K-Means: empirical distortion of both algorithms over time.


Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

I N T R O D U C T I O N : G R O W I N G I T C O M P L E X I T Y

I N T R O D U C T I O N : G R O W I N G I T C O M P L E X I T Y Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200 F.508.935.4015 www.idc.com W H I T E P A P E R I n v a r i a n t A n a l y z e r : A n A u t o m a t e d A p p r o a c h t o

More information

Stochastic Gradient Descent. Ryan Tibshirani Convex Optimization

Stochastic Gradient Descent. Ryan Tibshirani Convex Optimization Stochastic Gradient Descent Ryan Tibshirani Convex Optimization 10-725 Last time: proximal gradient descent Consider the problem min x g(x) + h(x) with g, h convex, g differentiable, and h simple in so

More information

CPSC 340: Machine Learning and Data Mining. Stochastic Gradient Fall 2017

CPSC 340: Machine Learning and Data Mining. Stochastic Gradient Fall 2017 CPSC 340: Machine Learning and Data Mining Stochastic Gradient Fall 2017 Assignment 3: Admin Check update thread on Piazza for correct definition of trainndx. This could make your cross-validation code

More information

ArcGIS GeoAnalytics Server: An Introduction. Sarah Ambrose and Ravi Narayanan

ArcGIS GeoAnalytics Server: An Introduction. Sarah Ambrose and Ravi Narayanan ArcGIS GeoAnalytics Server: An Introduction Sarah Ambrose and Ravi Narayanan Overview Introduction Demos Analysis Concepts using GeoAnalytics Server GeoAnalytics Data Sources GeoAnalytics Server Administration

More information

EP2200 Course Project 2017 Project II - Mobile Computation Offloading

EP2200 Course Project 2017 Project II - Mobile Computation Offloading EP2200 Course Project 2017 Project II - Mobile Computation Offloading 1 Introduction Queuing theory provides us a very useful mathematic tool that can be used to analytically evaluate the performance of

More information

Coordinate Descent and Ascent Methods

Coordinate Descent and Ascent Methods Coordinate Descent and Ascent Methods Julie Nutini Machine Learning Reading Group November 3 rd, 2015 1 / 22 Projected-Gradient Methods Motivation Rewrite non-smooth problem as smooth constrained problem:

More information

Communication-efficient and Differentially-private Distributed SGD

Communication-efficient and Differentially-private Distributed SGD 1/36 Communication-efficient and Differentially-private Distributed SGD Ananda Theertha Suresh with Naman Agarwal, Felix X. Yu Sanjiv Kumar, H. Brendan McMahan Google Research November 16, 2018 2/36 Outline

More information

Fast Asynchronous Parallel Stochastic Gradient Descent: A Lock-Free Approach with Convergence Guarantee

Fast Asynchronous Parallel Stochastic Gradient Descent: A Lock-Free Approach with Convergence Guarantee Fast Asynchronous Parallel Stochastic Gradient Descent: A Lock-Free Approach with Convergence Guarantee Shen-Yi Zhao and Wu-Jun Li National Key Laboratory for Novel Software Technology Department of Computer

More information

Sum-Product Networks. STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017

Sum-Product Networks. STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017 Sum-Product Networks STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017 Introduction Outline What is a Sum-Product Network? Inference Applications In more depth

More information

ArcGIS Enterprise: What s New. Philip Heede Shannon Kalisky Melanie Summers Shreyas Shinde

ArcGIS Enterprise: What s New. Philip Heede Shannon Kalisky Melanie Summers Shreyas Shinde ArcGIS Enterprise: What s New Philip Heede Shannon Kalisky Melanie Summers Shreyas Shinde ArcGIS Enterprise is the new name for ArcGIS for Server ArcGIS Enterprise Software Components ArcGIS Server Portal

More information

Nonlinear Optimization Methods for Machine Learning

Nonlinear Optimization Methods for Machine Learning Nonlinear Optimization Methods for Machine Learning Jorge Nocedal Northwestern University University of California, Davis, Sept 2018 1 Introduction We don t really know, do we? a) Deep neural networks

More information

Introduction to Optimization

Introduction to Optimization Introduction to Optimization Konstantin Tretyakov (kt@ut.ee) MTAT.03.227 Machine Learning So far Machine learning is important and interesting The general concept: Fitting models to data So far Machine

More information

Progressive & Algorithms & Systems

Progressive & Algorithms & Systems University of California Merced Lawrence Berkeley National Laboratory Progressive Computation for Data Exploration Progressive Computation Online Aggregation (OLA) in DB Query Result Estimate Result ε

More information

Leveraging Web GIS: An Introduction to the ArcGIS portal

Leveraging Web GIS: An Introduction to the ArcGIS portal Leveraging Web GIS: An Introduction to the ArcGIS portal Derek Law Product Management DLaw@esri.com Agenda Web GIS pattern Product overview Installation and deployment Configuration options Security options

More information

ECS171: Machine Learning

ECS171: Machine Learning ECS171: Machine Learning Lecture 4: Optimization (LFD 3.3, SGD) Cho-Jui Hsieh UC Davis Jan 22, 2018 Gradient descent Optimization Goal: find the minimizer of a function min f (w) w For now we assume f

More information

Stochastic Optimization Algorithms Beyond SG

Stochastic Optimization Algorithms Beyond SG Stochastic Optimization Algorithms Beyond SG Frank E. Curtis 1, Lehigh University involving joint work with Léon Bottou, Facebook AI Research Jorge Nocedal, Northwestern University Optimization Methods

More information

A Reconfigurable Quantum Computer

A Reconfigurable Quantum Computer A Reconfigurable Quantum Computer David Moehring CEO, IonQ, Inc. College Park, MD Quantum Computing for Business 4-6 December 2017, Mountain View, CA IonQ Highlights Full Stack Quantum Computing Company

More information

High-Performance Scientific Computing

High-Performance Scientific Computing High-Performance Scientific Computing Instructor: Randy LeVeque TA: Grady Lemoine Applied Mathematics 483/583, Spring 2011 http://www.amath.washington.edu/~rjl/am583 World s fastest computers http://top500.org

More information

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Presenter: Tong Shu Authors: Tong Shu and Prof. Chase Q. Wu Big Data Center Department of Computer Science New Jersey Institute

More information

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2) INF2270 Spring 2010 Philipp Häfliger Summary/Repetition (1/2) content From Scalar to Superscalar Lecture Summary and Brief Repetition Binary numbers Boolean Algebra Combinational Logic Circuits Encoder/Decoder

More information

Presentation in Convex Optimization

Presentation in Convex Optimization Dec 22, 2014 Introduction Sample size selection in optimization methods for machine learning Introduction Sample size selection in optimization methods for machine learning Main results: presents a methodology

More information

Optimization for neural networks

Optimization for neural networks 0 - : Optimization for neural networks Prof. J.C. Kao, UCLA Optimization for neural networks We previously introduced the principle of gradient descent. Now we will discuss specific modifications we make

More information

Visualizing Big Data on Maps: Emerging Tools and Techniques. Ilir Bejleri, Sanjay Ranka

Visualizing Big Data on Maps: Emerging Tools and Techniques. Ilir Bejleri, Sanjay Ranka Visualizing Big Data on Maps: Emerging Tools and Techniques Ilir Bejleri, Sanjay Ranka Topics Web GIS Visualization Big Data GIS Performance Maps in Data Visualization Platforms Next: Web GIS Visualization

More information

Electrical and Computer Engineering Department University of Waterloo Canada

Electrical and Computer Engineering Department University of Waterloo Canada Predicting a Biological Response of Molecules from Their Chemical Properties Using Diverse and Optimized Ensembles of Stochastic Gradient Boosting Machine By Tarek Abdunabi and Otman Basir Electrical and

More information

PI SERVER 2012 Do. More. Faster. Now! Copyr i g h t 2012 O S Is o f t, L L C. 1

PI SERVER 2012 Do. More. Faster. Now! Copyr i g h t 2012 O S Is o f t, L L C. 1 PI SERVER 2012 Do. More. Faster. Now! Copyr i g h t 2012 O S Is o f t, L L C. 1 AUGUST 7, 2007 APRIL 14, 2010 APRIL 24, 2012 Copyr i g h t 2012 O S Is o f t, L L C. 2 PI Data Archive Security PI Asset

More information

Predictive analysis on Multivariate, Time Series datasets using Shapelets

Predictive analysis on Multivariate, Time Series datasets using Shapelets 1 Predictive analysis on Multivariate, Time Series datasets using Shapelets Hemal Thakkar Department of Computer Science, Stanford University hemal@stanford.edu hemal.tt@gmail.com Abstract Multivariate,

More information

Parallel Performance Theory

Parallel Performance Theory AMS 250: An Introduction to High Performance Computing Parallel Performance Theory Shawfeng Dong shaw@ucsc.edu (831) 502-7743 Applied Mathematics & Statistics University of California, Santa Cruz Outline

More information

Knowledge Discovery and Data Mining 1 (VO) ( )

Knowledge Discovery and Data Mining 1 (VO) ( ) Knowledge Discovery and Data Mining 1 (VO) (707.003) Map-Reduce Denis Helic KTI, TU Graz Oct 24, 2013 Denis Helic (KTI, TU Graz) KDDM1 Oct 24, 2013 1 / 82 Big picture: KDDM Probability Theory Linear Algebra

More information

Telecommunication Services Engineering (TSE) Lab. Chapter IX Presence Applications and Services.

Telecommunication Services Engineering (TSE) Lab. Chapter IX Presence Applications and Services. Chapter IX Presence Applications and Services http://users.encs.concordia.ca/~glitho/ Outline 1. Basics 2. Interoperability 3. Presence service in clouds Basics 1 - IETF abstract model 2 - An example of

More information

REINFORCEMENT LEARNING

REINFORCEMENT LEARNING REINFORCEMENT LEARNING Larry Page: Where s Google going next? DeepMind's DQN playing Breakout Contents Introduction to Reinforcement Learning Deep Q-Learning INTRODUCTION TO REINFORCEMENT LEARNING Contents

More information

Logic Design II (17.342) Spring Lecture Outline

Logic Design II (17.342) Spring Lecture Outline Logic Design II (17.342) Spring 2012 Lecture Outline Class # 10 April 12, 2012 Dohn Bowden 1 Today s Lecture First half of the class Circuits for Arithmetic Operations Chapter 18 Should finish at least

More information

Importance Sampling for Minibatches

Importance Sampling for Minibatches Importance Sampling for Minibatches Dominik Csiba School of Mathematics University of Edinburgh 07.09.2016, Birmingham Dominik Csiba (University of Edinburgh) Importance Sampling for Minibatches 07.09.2016,

More information

NICTA Short Course. Network Analysis. Vijay Sivaraman. Day 1 Queueing Systems and Markov Chains. Network Analysis, 2008s2 1-1

NICTA Short Course. Network Analysis. Vijay Sivaraman. Day 1 Queueing Systems and Markov Chains. Network Analysis, 2008s2 1-1 NICTA Short Course Network Analysis Vijay Sivaraman Day 1 Queueing Systems and Markov Chains Network Analysis, 2008s2 1-1 Outline Why a short course on mathematical analysis? Limited current course offering

More information

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling 10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel

More information

ArcGIS Enterprise: What s New. Philip Heede Shannon Kalisky Melanie Summers Sam Williamson

ArcGIS Enterprise: What s New. Philip Heede Shannon Kalisky Melanie Summers Sam Williamson ArcGIS Enterprise: What s New Philip Heede Shannon Kalisky Melanie Summers Sam Williamson ArcGIS Enterprise is the new name for ArcGIS for Server What is ArcGIS Enterprise ArcGIS Enterprise is powerful

More information

ArcGIS is Advancing. Both Contributing and Integrating many new Innovations. IoT. Smart Mapping. Smart Devices Advanced Analytics

ArcGIS is Advancing. Both Contributing and Integrating many new Innovations. IoT. Smart Mapping. Smart Devices Advanced Analytics ArcGIS is Advancing IoT Smart Devices Advanced Analytics Smart Mapping Real-Time Faster Computing Web Services Crowdsourcing Sensor Networks Both Contributing and Integrating many new Innovations ArcGIS

More information

Scikit-learn. scikit. Machine learning for the small and the many Gaël Varoquaux. machine learning in Python

Scikit-learn. scikit. Machine learning for the small and the many Gaël Varoquaux. machine learning in Python Scikit-learn Machine learning for the small and the many Gaël Varoquaux scikit machine learning in Python In this meeting, I represent low performance computing Scikit-learn Machine learning for the small

More information

Spatial Analytics Workshop

Spatial Analytics Workshop Spatial Analytics Workshop Pete Skomoroch, LinkedIn (@peteskomoroch) Kevin Weil, Twitter (@kevinweil) Sean Gorman, FortiusOne (@seangorman) #spatialanalytics Introduction The Rise of Spatial Analytics

More information

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016)

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016) Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016) Week 12: Real-Time Data Analytics (2/2) March 31, 2016 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo

More information

QR Decomposition in a Multicore Environment

QR Decomposition in a Multicore Environment QR Decomposition in a Multicore Environment Omar Ahsan University of Maryland-College Park Advised by Professor Howard Elman College Park, MD oha@cs.umd.edu ABSTRACT In this study we examine performance

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Computational and Statistical Learning Theory TTIC 31120 Prof. Nati Srebro Lecture 17: Stochastic Optimization Part II: Realizable vs Agnostic Rates Part III: Nearest Neighbor Classification Stochastic

More information

CSCI 1951-G Optimization Methods in Finance Part 12: Variants of Gradient Descent

CSCI 1951-G Optimization Methods in Finance Part 12: Variants of Gradient Descent CSCI 1951-G Optimization Methods in Finance Part 12: Variants of Gradient Descent April 27, 2018 1 / 32 Outline 1) Moment and Nesterov s accelerated gradient descent 2) AdaGrad and RMSProp 4) Adam 5) Stochastic

More information

Overview: Synchronous Computations

Overview: Synchronous Computations Overview: Synchronous Computations barriers: linear, tree-based and butterfly degrees of synchronization synchronous example 1: Jacobi Iterations serial and parallel code, performance analysis synchronous

More information

Selected Topics in Optimization. Some slides borrowed from

Selected Topics in Optimization. Some slides borrowed from Selected Topics in Optimization Some slides borrowed from http://www.stat.cmu.edu/~ryantibs/convexopt/ Overview Optimization problems are almost everywhere in statistics and machine learning. Input Model

More information

Lecture 1: Supervised Learning

Lecture 1: Supervised Learning Lecture 1: Supervised Learning Tuo Zhao Schools of ISYE and CSE, Georgia Tech ISYE6740/CSE6740/CS7641: Computational Data Analysis/Machine from Portland, Learning Oregon: pervised learning (Supervised)

More information

Master thesis. Multi-class Fork-Join queues & The stochastic knapsack problem

Master thesis. Multi-class Fork-Join queues & The stochastic knapsack problem Master thesis Multi-class Fork-Join queues & The stochastic knapsack problem Sihan Ding August 26th, 2011 Supervisor UL: Dr. Floske Spieksma Supervisors CWI: Drs. Chrétien Verhoef Prof.dr. Rob van der

More information

Introduction to Machine Learning (67577)

Introduction to Machine Learning (67577) Introduction to Machine Learning (67577) Shai Shalev-Shwartz School of CS and Engineering, The Hebrew University of Jerusalem Deep Learning Shai Shalev-Shwartz (Hebrew U) IML Deep Learning Neural Networks

More information

Stochastic Analogues to Deterministic Optimizers

Stochastic Analogues to Deterministic Optimizers Stochastic Analogues to Deterministic Optimizers ISMP 2018 Bordeaux, France Vivak Patel Presented by: Mihai Anitescu July 6, 2018 1 Apology I apologize for not being here to give this talk myself. I injured

More information

Machine Learning CS 4900/5900. Lecture 03. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Machine Learning CS 4900/5900. Lecture 03. Razvan C. Bunescu School of Electrical Engineering and Computer Science Machine Learning CS 4900/5900 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Machine Learning is Optimization Parametric ML involves minimizing an objective function

More information

Deep Learning & Neural Networks Lecture 4

Deep Learning & Neural Networks Lecture 4 Deep Learning & Neural Networks Lecture 4 Kevin Duh Graduate School of Information Science Nara Institute of Science and Technology Jan 23, 2014 2/20 3/20 Advanced Topics in Optimization Today we ll briefly

More information

Optimization for Machine Learning

Optimization for Machine Learning Optimization for Machine Learning Elman Mansimov 1 September 24, 2015 1 Modified based on Shenlong Wang s and Jake Snell s tutorials, with additional contents borrowed from Kevin Swersky and Jasper Snoek

More information

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

How to deal with uncertainties and dynamicity?

How to deal with uncertainties and dynamicity? How to deal with uncertainties and dynamicity? http://graal.ens-lyon.fr/ lmarchal/scheduling/ 19 novembre 2012 1/ 37 Outline 1 Sensitivity and Robustness 2 Analyzing the sensitivity : the case of Backfilling

More information

Motivation Subgradient Method Stochastic Subgradient Method. Convex Optimization. Lecture 15 - Gradient Descent in Machine Learning

Motivation Subgradient Method Stochastic Subgradient Method. Convex Optimization. Lecture 15 - Gradient Descent in Machine Learning Convex Optimization Lecture 15 - Gradient Descent in Machine Learning Instructor: Yuanzhang Xiao University of Hawaii at Manoa Fall 2017 1 / 21 Today s Lecture 1 Motivation 2 Subgradient Method 3 Stochastic

More information

Our Problem. Model. Clock Synchronization. Global Predicate Detection and Event Ordering

Our Problem. Model. Clock Synchronization. Global Predicate Detection and Event Ordering Our Problem Global Predicate Detection and Event Ordering To compute predicates over the state of a distributed application Model Clock Synchronization Message passing No failures Two possible timing assumptions:

More information

Portal for ArcGIS: An Introduction. Catherine Hynes and Derek Law

Portal for ArcGIS: An Introduction. Catherine Hynes and Derek Law Portal for ArcGIS: An Introduction Catherine Hynes and Derek Law Agenda Web GIS pattern Product overview Installation and deployment Configuration options Security options and groups Portal for ArcGIS

More information

Web GIS & ArcGIS Pro. Zena Pelletier Nick Popovich

Web GIS & ArcGIS Pro. Zena Pelletier Nick Popovich Web GIS & ArcGIS Pro Zena Pelletier Nick Popovich Web GIS Transformation of the ArcGIS Platform Desktop Apps GIS Web Maps Web Scenes Layers Evolution of the modern GIS Desktop GIS (standalone GIS) GIS

More information