Infrastructure Automation with Salt


Infrastructure Automation with Salt Sean McGrath 10th November 2016

About Research IT Where I work as a systems administrator. http://www.tchpc.tcd.ie/ Ireland's premier High Performance Computing Centre, with large-scale supercomputing and visualisation facilities, assisting researchers with computationally complex problems. Previously the Trinity Centre for High Performance Computing (TCHPC). We manage in the region of 500 physical and virtual Linux machines.

About Configuration Management "Configuration management (CM) is a systems engineering process for establishing and maintaining consistency of a product's performance, functional, and physical attributes with its requirements, design, and operational information throughout its life." [1]

Dangers of configuration management:
- You can't just do something on the fly; everything must be centrally managed
- There is a risk that local changes will get overwritten
- You have to learn another tool

Examples: Salt, Puppet, Ansible, lots more.

[1] https://en.wikipedia.org/wiki/Configuration_management

Why use a Configuration Manager? It allows you to be:
- Scalable: 400 HPC nodes with identical configuration
- Granular: specific configurations can be applied to specific node(s)
- Automated: you write your Salt statements once but call them repeatedly
- Repeatable: it's the same Salt statement that is applied each time
- Efficient: once configured, you don't have to intervene on specific machines

About Salt Salt (also called SaltStack): https://saltstack.com/
- Client-server design: a service runs on the managed nodes (minions); minions are controlled by the master.
- The master can push changes, or minions can pull from the master.
- The trust relationship is managed via asymmetric keys.

Some of what can be managed:
- package installation
- configuration files: sourced from the master, templates, find-and-replace in a file...
- enabled and running services
- specific settings or facts, e.g. a DB password, for individual minions
- much more
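For a flavour of what managing packages, files and services looks like in practice, here is a minimal illustrative SLS file. This is a generic sketch, not code from the talk; the state IDs, package name and file paths are hypothetical.

```yaml
# ntp.sls - hypothetical minimal Salt state (illustrative only)
ntp:
  pkg.installed: []            # install the ntp package

/etc/ntp.conf:
  file.managed:                # manage the config file, sourced from the master
    - source: salt://ntp/ntp.conf
    - user: root
    - group: root
    - mode: 644

ntpd:
  service.running:             # keep the service enabled and running
    - enable: True
    - watch:
      - file: /etc/ntp.conf    # restart the service if the config file changes
```

Applying it from the master would then be a single command such as `salt '*' state.sls ntp`.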

Why Salt over Ansible or Puppet? Firstly, this is nothing against either of those tools, both of which we have used. But in our experience Salt has advantages:
- Usable error messages: Ansible's error messages lack detail
- Granularity: with Ansible it is hard to do a single thing to one node
- Scale: Ansible is slow in comparison to Salt
- YAML: Puppet requires some Ruby skills, which I at least don't have
- Flexibility: Puppet always seemed very controlling and designed for tight control

Let's see some use-case examples.

Use Case 1 - re-install a 100 node cluster
Step 1: re-install the OS on the nodes with PXE.
Step 2: install and configure Salt on the nodes:

    salt-ssh --ignore-host-keys --passwd xxxxxxxx 'kelvin-n*' state.sls installsalt

Step 3: reboot the nodes:

    salt-ssh --ignore-host-keys --passwd xxxxxxxx 'kelvin-n*' cmd.run reboot

Comment: Salt scales very well in this instance, much better than Ansible. It offers great efficiencies: once developed, your Salt states automate all your work for you, and the process can easily be repeated in future. Let's look at what exactly the installsalt state does.

Use Case 1 - installsalt state

    install-epel:
      pkg:
        - installed
        - pkgs:
          - epel-release
          - yum-conf-epel

    salt-minion:
      pkg:
        - installed

    create-folders-for-keys:
      file:
        - name: /etc/salt/pki/minion/minion_master.pub
        - managed
        - makedirs: True

    {% for file in ['minion_master.pub', 'minion.pem', 'minion.pub'] %}
    /etc/salt/pki/minion/{{ file }}:
      file:
        - managed
        - source: salt://installsalt/{{ file }}.{{ clustername }}
        - watch_in:
          - service: restart-salt-minion
    {% endfor %}

    restart-salt-minion:
      service:
        - name: salt-minion
        - running

    /etc/rc.local:
      file:
        - append
        - text: 'sleep 10; salt-call state.highstate pillar="{\"reboot\": \"yes\"}" # wait 10'

Use Case 2 - Bootstrap installation
Scenario: you have a new VM and want to put your common configuration on it.
Assumptions: the OS and Salt are installed on the VM, and the minion's key is signed by the master.
From the salt master:

    salt 'newserver.fqdn' state.sls bootstrap

Which does:

    include:
      - general.init
      # in case a highstate isn't called, ensure the states that should apply do apply
      - tchpc-general.epel
      - tchpc-general.repositories.local
      - tchpc-general.nrpe
      - tchpc-general.shorewall
      - tchpc-general.postfix
      - tchpc-general.snmp
      - tchpc-general.services
      - tchpc-general.rsyslog
      - tchpc-general.check_updates
      - tchpc-general.bacula

Why: your standard setup is repeatably and automatically applied.

Use Case 3 - Install a web server
Installation state:

    install-common-packages:
      pkg:
        - installed
        - pkgs:
          - httpd
          - php
          - php-devel
          - mysql
          - php-ldap
          - mod_ssl
          - openssl
          - php-mysql
          - php-pear-MDB2-Driver-mysql
          - mod_authz_ldap

    httpd-service:
      service:
        - name: httpd
        - running
        - enable: True

From the salt master:

    salt 'webserver.fqdn' state.sls installwebserver

That is a very efficient way to get the usual web server settings applied to a host without having to reference standard installation documentation.

Use Case 4 - software upgrade testing
Scenario: the kernel needs to be updated. Software (GPFS, a parallel file system) depends on the kernel version, so both need to be updated simultaneously, and we want to test on a subset of nodes first.
Tell the minion which version to use - pillar: "Pillar is an interface for Salt designed to offer global values that can be distributed to minions." [2]
Identify the minion to apply the pillar variable to - grains: Salt's "interface to derive information about the underlying system... is called the grains interface". [3]

[2] https://docs.saltstack.com/en/carbon/topics/pillar/index.html
[3] https://docs.saltstack.com/en/latest/topics/grains/
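As a sketch of how pillar data reaches minions, the pillar top file maps targets to pillar SLS files. The file layout and target glob below are illustrative assumptions, not the talk's actual configuration.

```yaml
# pillar/top.sls - hypothetical pillar top file (illustrative only)
base:
  'kelvin-n*':          # target minions whose ID matches this glob
    - versions          # serve pillar/versions.sls to those minions
```

A minion's resulting pillar data can then be inspected from the master with `salt 'kelvin-n038*' pillar.items`.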

Use Case 4 - install specific versions on specific nodes
Set the pillar to the updated versions for your test node:

    # testing upgraded versions on specific nodes:
    {% if salt['grains.get']('id')[0:11] == 'kelvin-n038' %}
    gpfs_version: 3.5.0-32
    {% else %}
    gpfs_version: 3.5.0-29
    {% endif %}

    {% if salt['grains.get']('id')[0:11] == 'kelvin-n038' %}
    kernel_version: 2.6.32-642.3.1.el6.x86_64
    {% else %}
    kernel_version: 2.6.32-573.12.1.el6.x86_64
    {% endif %}

Install the relevant kernel version (pillar variable) for the node (identified by grain):

    {% if grains['kernelrelease'] != salt['pillar.get']('kernel_version') %}
    install-kernel-packages (cmd):
      cmd:
        - run
        - name: yum -y install kernel-headers-{{ salt['pillar.get']('kernel_version') }} ker

This provides excellent granularity without having to provision a test environment.

Use Case 5 - GPU card installation
Process:
1. Remove unsupported kernel modules and reboot to load the correct kernel modules.
2. Generate a new ramdisk without the unsupported kernel modules.
3. Boot from that new ramdisk.
4. Install the GPU drivers and reboot.
Limitation: we only want to run this state on a node with GPU hardware.
Gotcha: there is a possible infinite loop of reboots unless the minion knows each step has completed successfully.
Solution: set a grain after each step.
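The grain-guard idea can be sketched generically: a step runs only while its grain is unset, and the grain is set once the step succeeds, so a re-run after reboot skips straight past completed steps. The state IDs, grain name and script path below are illustrative assumptions, not the talk's actual code.

```yaml
# hypothetical reboot-safe step, guarded by a grain (illustrative only)
{% if grains.get('step_done') != 'yes' %}
run-step:
  cmd.run:
    - name: /usr/local/bin/do-step.sh   # hypothetical step script

mark-step-done:
  module.run:
    - name: grains.setval               # record progress on the minion itself
    - key: step_done
    - val: 'yes'
    - require:
      - cmd: run-step                   # only mark done if the step succeeded
{% endif %}
```

Because the marker lives in the minion's grains, it survives the reboot, which is exactly what breaks the reboot loop.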

Use Case 5 - Continued
Ensure this state only runs on a node with the GPU installed in it:

    {% if salt['grains.get']('gpus:model') == 'GK110BGL [Tesla K40m]' %}

Unload kernel modules:

    # the nouveau modules need to be removed from the kernel
    /etc/modprobe.d/blacklist-nouveau.conf:
      file.managed:
        - source: salt://clusters/nodes/gpu-boole/blacklist-nouveau.conf
        - mode: 644
        - user: root
        - group: root

Reboot; this requires Salt being called with a reboot pillar set:

    {% if pillar['reboot'] != 'yes' %}
    always-fails-gpu:
      test.fail_without_changes:
        - name: MESSAGE the minion should reboot
        - failhard: True
    {% endif %} # end if pillar['reboot'] != 'yes'

Use Case 5 - Continued
Generate the new ramdisk without those modules:

    {% if grains.get('regenerate_ramdisk') != 'regenerated' %}
    # ramdisk needs to be re-generated without the nouveau modules, and the node booted from it
    create-ramdisk-without-the-nouveau-modules:
      cmd.run:
        - name: dracut --force

Set a grain value on the minion to say that the ramdisk has been re-generated:

    regenerate_ramdisk:
      module.run:
        - name: grains.setval
        - key: regenerate_ramdisk
        - val: regenerated

Boot from the new ramdisk:

    system.reboot-ramdisk:
      module:
        - name: system.reboot
        - run
        - require:
          - module: regenerate_ramdisk

    stops-after-ramdisk-reboot:
      test.fail_without_changes:
        - name: MESSAGE system rebooting
        - failhard: True
        - require:
          - module: system.reboot-ramdisk
    {% endif %}

Research IT, Trinity College Dublin, sean.mcgrath@tcd.ie

Use Case 5 - Continued
Install the GPU drivers only if they haven't already been installed:

    {% if grains.get('nvidia_drivers') != 'installed' %}
    install-nvidia-drivers:
      cmd.run:
        - name: /home/support/root/gpu/cuda_7.5.18_linux.run --silent

Set the grain to say they've been installed, to prevent a reboot loop, and reboot:

    nvidia_drivers:
      module.run:
        - name: grains.setval
        - key: nvidia_drivers
        - val: installed

    system.reboot-nvidia-drivers:
      module:
        - name: system.reboot
        - run
        - require:
          - module: nvidia_drivers

    stops-after-nvidia-drivers-reboot:
      test.fail_without_changes: # this is really supported only from Salt 2014.7
        - name: MESSAGE system rebooting
        - failhard: True
        - require:
          - module: system.reboot-nvidia-drivers
    {% endif %}

Use Case 5 - benefits
Salt provides a simple and easy way to automatically provision a complex installation. It is easily repeated if a minion has to be re-installed or new machines are added. You have a centralised documentation store of exactly what needs to be done to set your installation up.

Salt takeaways
- Salt does the work of configuring your machines for you
- Salt provides system documentation
- Salt provides a knowledge base of "how to do X"

Thank You! Slide source available at: https://github.com/smcgrat/presentations/blob/master/ Infrastructure_Automation_with_SaltStack.tex Questions?