Continuous Machine Learning

Similar documents
The Open Sourcing of Infrastructure

ArcGIS Deployment Pattern. Azlina Mahad

Introduction to ArcGIS GeoAnalytics Server. Sarah Ambrose & Noah Slocum

A Service Architecture for Processing Big Earth Data in the Cloud with Geospatial Analytics and Machine Learning

What Would John Snow Do (Today)? Part 1

Incorporating ArcGIS Pro in your Curriculum

Quantum Artificial Intelligence and Machine Learning: The Path to Enterprise Deployments. Randall Correll. +1 (703) Palo Alto, CA

DANIEL WILSON AND BEN CONKLIN. Integrating AI with Foundation Intelligence for Actionable Intelligence

I N T R O D U C T I O N : G R O W I N G I T C O M P L E X I T Y

ArcGIS. for Server. Understanding our World

One Optimized I/O Configuration per HPC Application

Esri Overview for Mentor Protégé Program:

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.

Data science and engineering for local weather forecasts. Nikhil R Podduturi Data {Scientist, Engineer} November, 2016

Acceleration of WRF on the GPU

Portal for ArcGIS: An Introduction

CONTEMPORARY ANALYTICAL ECOSYSTEM PATRICK HALL, SAS INSTITUTE

ArcGIS Enterprise: Administration Workflows STUDENT EDITION

Introduction to Portal for ArcGIS. Hao LEE November 12, 2015

Portal for ArcGIS: An Introduction. Catherine Hynes and Derek Law

Collaborative WRF-based research and education enabled by software containers

Introduction to Portal for ArcGIS

Leveraging Web GIS: An Introduction to the ArcGIS portal

Spatial Analysis with Web GIS. Rachel Weeden

Stochastic Modelling of Electron Transport on different HPC architectures

Web GIS Deployment for Administrators. Vanessa Ramirez Solution Engineer, Natural Resources, Esri

Advancing Machine Learning and AI with Geography and GIS. Robert Kircher

Introduction to ArcGIS Maps for Office. Greg Ponto Scott Ball

Web GIS Administration: Tips and Tricks

WE ARE. A boutique software development house specializing in innovative solutions

High-performance processing and development with Madagascar. July 24, 2010 Madagascar development team

Visualizing Big Data on Maps: Emerging Tools and Techniques. Ilir Bejleri, Sanjay Ranka

TRAITS to put you on the map

Harvard Center for Geographic Analysis Geospatial on the MOC

Scikit-learn. scikit. Machine learning for the small and the many Gaël Varoquaux. machine learning in Python

Enabling Web GIS. Dal Hunter Jeff Shaner

DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning. Min Du, Feifei Li, Guineng Zheng, Vivek Srikumar University of Utah

ArcGIS Enterprise: What s New. Philip Heede Shannon Kalisky Melanie Summers Sam Williamson

Overview of ArcGIS Enterprise August 24, Dan Haag ESRI

ArcGIS GeoAnalytics Server: An Introduction. Sarah Ambrose and Ravi Narayanan

The Panel: What does the future look like for NPW application development? 17 th ECMWF Workshop on High Performance Computing in Meteorology

Using Artificial Intelligence to Train Your Chatbot.

Random Number Generation Is Getting Harder It s Time to Pay Attention

Multiphase Flow Simulations in Inclined Tubes with Lattice Boltzmann Method on GPU

Clustering algorithms distributed over a Cloud Computing Platform.

The Open Sourcing of Infrastructure SLC DevOpsDays 2017

OFWIM 2017 Annual Conference What Does Web GIS Really Mean for Fish and Wildlife Agencies?

ECE521 W17 Tutorial 1. Renjie Liao & Min Bai

Engineering of Automated Systems with Mechatronic Objects

Integrated Electricity Demand and Price Forecasting

The Changing Landscape of Land Administration

Geog 469 GIS Workshop. Managing Enterprise GIS Geodatabases

Supercomputer Programme

Machine Learning for Gravitational Wave signals classification in LIGO and Virgo

Cloud-based WRF Downscaling Simulations at Scale using Community Reanalysis and Climate Datasets

Integrating Globus into a Science Gateway for Cryo-EM

Changes in Esri GIS, practical ways to be ready for the future

Models to carry out inference vs. Models to mimic (spatio-temporal) systems 5/5/15

Esri Defense Mapping: Cartographic Production. Bo King

ArcGIS Enterprise: What s New. Philip Heede Shannon Kalisky Melanie Summers Shreyas Shinde

Ivana Zinno, Francesco Casu, Claudio De Luca, Riccardo Lanari, Michele Manunta. CNR IREA, Napoli, Italy

LIGO Grid Applications Development Within GriPhyN

Geologi for samfunnet

Information Retrieval and Organisation

The Emerging Role of Enterprise GIS in State Forest Agencies

Reimaging GIS: Geographic Information Society. Clint Brown Linda Beale Mark Harrower Esri

DIGITAL TWINS W A Z U G D e c e m b e r

@SoyGema GEMA PARREÑO PIQUERAS

Enabling ENVI. ArcGIS for Server

Big-Geo-Data EHR Infrastructure Development for On-Demand Analytics

ArcGIS 10.1 An Overview of the System

GPU Computing Activities in KISTI

Geo-Enabling Mountain Bike Trail Maintenance:

Scalable and Power-Efficient Data Mining Kernels

Gradient Coding: Mitigating Stragglers in Distributed Gradient Descent.

Cost optimized Timber Machine Strength Grading

PREDICTION OF HETERODIMERIC PROTEIN COMPLEXES FROM PROTEIN-PROTEIN INTERACTION NETWORKS USING DEEP LEARNING

Learning in a Distributed and Heterogeneous Environment

Esri User Conference 2018 Video Topics

Workforce Development Enables Your Adoption Strategy

Robert D. Borchert GIS Technician

A Practitioner s Guide to MXNet

Demystifying ArcGIS Online. Karen Lizcano Esri

Web GIS: Architectural Patterns and Practices. Shannon Kalisky Philip Heede

Terje Pedersen Product Manager Software / Hydrography Hydroacoustics Division

ArcGIS is Advancing. Both Contributing and Integrating many new Innovations. IoT. Smart Mapping. Smart Devices Advanced Analytics

Esri Training by Microcenter Prepare to Innovate. Microcenter Course Catalog

From statistics to data science. BAE 815 (Fall 2017) Dr. Zifei Liu

Today s Agenda: 1) Why Do We Need To Measure The Memory Component? 2) Machine Pool Memory / Best Practice Guidelines

An introduction to ArcGIS Maps for Office. Scott Ball & Mike Flanagan

Edge Computing and the Next Generation Central Office

Convolutional Neural Networks

Deutscher Wetterdienst

Mitosis Detection in Breast Cancer Histology Images with Multi Column Deep Neural Networks

Crime Analyst Extension. Christine Charles

NWGIS 2018 Bremerton Where Next?

Sensor Networks: Data Processing for Improved Spatial and Temporal Resolution

Rapid Application Development using InforSense Open Workflow and Daylight Technologies Deliver Discovery Value

Proposed Procedures for Determining the Method Detection Limit and Minimum Level

WDCloud: An End to End System for Large- Scale Watershed Delineation on Cloud

Transcription:

Continuous Machine Learning Kostiantyn Bokhan, PhD Project Lead at Samsung R&D Ukraine Kharkiv, October 2016

Agenda ML dev. workflows ML dev. issues ML dev. solutions Continuous machine learning (CML) Aspects of CML CML infrastructure CML delivery 2

ML dev. workflows Gathering Gathering Datasets Datasets Feature Feature design design Model Model Training Training Model Model Validation Validation Model Model Testing Testing Train framework (Python/R/Matlab) Requirements Requirements QA QA Planning Planning Tests Tests FE FE Model Model tools tools Classifier Classifier Application (C++/Java) 3 UI UI Market Market

ML dev. workflows Data selection Feature design Data Data transformation transformation Clean Datasets Hypothesize, Hypothesize, Model Model Visualize, Explore Define the Problem Gather Datasets Measure, Evaluate 4 Deploy

ML dev. workflows Training Test data Datasets Train data Feature Extraction Training the model features Model Eval the model labels Input Data Feature Extractio n features Predict Predicting Model 5 Labels

ML dev. issues Bigger Data Size Big Small 1996 Model complexity 2006 2016 Sim ple Complex Complex 6

ML dev. issues Performance Machine Learning on PC 7

Performance ML dev. issues Machine Learning on PC Machine Learning My other computer on AWS is Amazon EC2 8

Performance ML dev. issues Machine Learning on PC Machine Learning My other computer on AWS is Amazon EC2 Machine Learning on dedicated cluster 9

ML dev. issues Uncertainty 10

ML dev. issues Variety 11

ML dev. issues Reliabilit y 12

ML dev. issues Resource management 13

ML dev. solutions Performance + 17

ML dev. solutions Performance + Reliability + Mesos 18

ML dev. solutions Variety 19

ML dev. solutions Resource management Theano dcos CLI MPI Singularity Aurora Init.rd Marathon Kernel Mesos CUDA Memory CPU Storage 20 GPU

ML dev. solutions Mesosphere Enterprise DC/OS is an enterprise grade datacenter-scale operating system, providing a single platform for running containers, big data, and distributed apps in production. Services & Applications Easily deploy and run datacenter-wide app services such Docker, Cassandra, Spark pooled on a single platform DC/OS Powered by Apache Mesos Runtime, tools and best practices built-in to simplify operations and deliver a production self-healing infrastructure Run Anywhere Bare-metal, virtual, cloud or hybrid DC/OS runs on it all only requirement is a modern Linux distro, Windows support coming soon :) Source: https://dcos.io/docs/1.8/overview/architecture/ 22

ML dev. solutions Source: https://dcos.io/docs/1.8/overview/architecture/ 23

ML dev. solutions Apache Mesos is the open-source distributed systems kernel at the heart of the Mesosphere DC/OS. It abstracts the entire datacenter into a single pool of computing resources, simplifying running distributed systems at scale. Sources: http://nvidia.com, http://blog.arungupta.me/docker-apache-mesos-marathon/ 24

CML Resources Hybrid Virtual nodes Test devices 25 Bare-metal nodes with GPU

ML dev. solutions Uncertainty Continuous Machine Learning We can t remove uncertainty but we can automate routines especially delivery and integration 26

Aspects of CML Continuous Continuous Development Continuous Integration Continuous Deployment Continuous Delivery Continuous Everything 28

CML Infrastructure Data scientists Train Validation Test Pool of Devices Singularity Mesos Master Mesos Agent Marathon Standby Master Standby Master Mesos Agent Mesos Agent Spark job GIT Jenkins Docker registry Developers 31 Batch docker job Spark CUDA job Mesos Agent Spark job Spark Docker job CUDA job Singularity Marathon Build Test Verification

CML deploy 32

CML deploy 33

CML deploy 34

CML deploy 35

Questions? 36 Samsung R&D Institute Ukraine