Distributed Architectures

Software Architecture VO/KU (707.023/707.024)
Roman Kern, KTI, TU Graz, 2015-01-21

Outline
1. Introduction
2. Independent operations
3. Distributed operations
4. Summary

Introduction
Why distributed architecture?

Introduction - Distributed Architectures
- Goal is to achieve a scalable infrastructure: scale horizontally (scale out)
- Different levels of complexity, depending on the system and the required attributes
- Certain approaches have evolved, and frameworks have been developed

Introduction - Distributed Architectures
Parallel computing vs. distributed computing:
- In parallel computing, all components share a common memory, typically threads within a single program
- In distributed computing, each component has its own memory
- Typically, in distributed computing the individual components are connected over a network
- There are dedicated programming languages (or extensions) for parallel computing

Introduction - Distributed Architectures
The fallacies of distributed computing: http://nighthacks.com/roller/jag/resource/fallacies.html

Introduction - Distributed Architectures
Different levels of complexity:
- Lowest complexity for operations that can easily be distributed, i.e. if they are independent and short enough to be executed independently from each other
- Higher degree of complexity for operations that compute a single result on multiple nodes

Independent operations
In the best case

Independent operations
- In a simple scenario, the system just contains separate, independent operations
- No operation requires complex interactions
- Input data are typically small chunks
- Shared repository: all the data is available on all nodes

Independent operations - Distributed Architectures (figure-only slide)

Independent operations
Still a number of issues to address:
1. Group membership
2. Leader election
3. Queues - distribution of workload
4. Distributed locks
5. Barriers
6. Shared resources
7. Configuration

Independent operations - Group membership
- When a single node comes online, how does it know where to connect to?
- How do the other members know of an added node?

Independent operations - Group membership
Peer-to-peer architectural style: each node is a client as well as a server.
Part of the bootstrapping mechanism, dynamic vs. static:
- Fully dynamic via broadcast/multicast within local area networks (UDP); a sketch follows below
- Centralised P2P, e.g. central login components/servers
- Static lists of group members (needs to be configurable)
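The fully dynamic variant can be illustrated with a small sketch (not from the slides) in which a node announces itself and listens for peers via UDP multicast; the group address 230.0.0.1, the port 4446 and the node name are arbitrary example values.

import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;

public class MulticastDiscovery {
    public static void main(String[] args) throws Exception {
        InetAddress group = InetAddress.getByName("230.0.0.1"); // example group address
        MulticastSocket socket = new MulticastSocket(4446);     // example port
        socket.joinGroup(group);

        // Announce this node to the group ...
        byte[] hello = "HELLO from node-1".getBytes();
        socket.send(new DatagramPacket(hello, hello.length, group, 4446));

        // ... and listen for announcements from other members.
        byte[] buf = new byte[256];
        DatagramPacket packet = new DatagramPacket(buf, buf.length);
        socket.receive(packet);
        System.out.println("Member seen: " + new String(packet.getData(), 0, packet.getLength()));

        socket.leaveGroup(group);
        socket.close();
    }
}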

Independent operations - Leader Election
- Not all nodes are equal, e.g. centralised components in P2P networks
- A single node acts as master, the others are workers
- Some nodes have additional responsibilities (supernodes)
- Having centralised components makes some functionality easier to implement, e.g. assigning workload
- Disadvantage: might lead to a single point of failure
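As a sketch only (not on the slides): with a coordination service such as Apache Zookeeper, which appears later in this lecture, leader election can be reduced to creating ephemeral sequential znodes; zk is an assumed, already connected ZooKeeper handle and /election an example path that is assumed to exist.

// Every node creates an ephemeral, sequential znode under /election.
String myZnode = zk.create("/election/node-", new byte[0],
        Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

// The node whose znode carries the smallest sequence number acts as the leader.
List<String> children = zk.getChildren("/election", false);
Collections.sort(children);
boolean isLeader = myZnode.endsWith(children.get(0));
// If the leader crashes, its ephemeral znode disappears and the election is repeated.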

Independent operations - Leader Election
Client-server architectural style: once the leader has been elected, it takes over the role of the server; all other group members then act as clients.

Independent operations - Leader Election (figure-only slides)

Independent operations - Queues
- Queues are an important component in many distributed systems
- Two types of nodes: the manager of the queue and the workers
- Incoming requests are collected at a single point and are stored as items in a queue
- Many client nodes consume items from the queue

Independent operations - Queues
- Queues are often FIFO (first-in, first-out); sometimes specific items are of higher priority
- The crucial aspect is the coordinated access to the queue: each item is processed by only a single client
- What if the client crashes while processing an item from the queue?

Independent operations - Queues
Publish-subscribe architectural style: basically a producer-consumer pattern.
- Each worker client registers itself
- The queue manager notifies the workers of new items
- How to schedule the workers, i.e. which one should be picked next?
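The producer-consumer pattern itself can be shown in isolation with a minimal single-process sketch (not from the slides) based on java.util.concurrent (BlockingQueue, LinkedBlockingQueue); in a distributed queue the manager and the workers would of course run on different nodes, and process() is a hypothetical worker method.

BlockingQueue<String> queue = new LinkedBlockingQueue<String>();

// Manager side: incoming requests are collected as items in the queue.
queue.put("request-1");

// Worker side: take() blocks until an item is available,
// and each item is handed to exactly one worker.
String item = queue.take();
process(item);   // hypothetical processing of the item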

Independent operations - Locks
Distributed locks:
- Restrict access to shared resources to only a single node at a time, e.g. allow only a single node to write to a file
- May yield many non-trivial problems, for example deadlocks or race conditions
- Distributed locks without a central component are very complex to realise

Independent operations - Locks
Blackboard architectural style: the shared repository is responsible for orchestrating the access to the locks and notifies waiting nodes once a lock has been lifted. This functionality is often coupled with the elected leader.
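As a sketch of the idea (not from the slides): with Zookeeper, a simple, non-fair lock can be mapped onto the existence of an ephemeral znode; zk is an assumed, connected handle and /locks/resource an arbitrary example path whose parent is assumed to exist.

boolean tryLock() throws KeeperException, InterruptedException {
    try {
        // The ephemeral znode lives only as long as this node's session,
        // so a crashed lock holder releases the lock automatically.
        zk.create("/locks/resource", new byte[0],
                Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        return true;                      // lock acquired
    } catch (KeeperException.NodeExistsException e) {
        return false;                     // another node holds the lock
    }
}

void unlock() throws KeeperException, InterruptedException {
    zk.delete("/locks/resource", -1);     // -1 ignores the version check
}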

Independent operations - Barriers
- A barrier is a specific type of distributed lock used to synchronise multiple nodes, e.g. multiple nodes should wait until a certain state has been reached
- Used when a part of the processing can be done in parallel and some parts cannot be distributed

Independent operations - Shared Resources
- All nodes need to be able to access a common data structure
- Read-only vs. read-write: if read-write, the complexity rises due to synchronisation issues

Independent operations - Apache Zookeeper
- Apache Zookeeper is a framework/library for coordinating distributed applications
- Used by Yahoo!, LinkedIn, Facebook
- Initially developed by Yahoo!, now managed by Apache
- Alternative approaches: Google Chubby, Microsoft Centrifuge

Independent operations - Apache Zookeeper
Components of Zookeeper:
- Coordination kernel
- File-system-like API
- Synchronisation, watches, locks
- Configuration, shared data
The following examples are taken from: http://zookeeper.apache.org/doc/r3.4.2/zookeeperTutorial.html
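A minimal sketch (not from the slides) of the file-system-like API: connect, create a znode and read it back. The connection string localhost:2181, the 3000 ms session timeout and the path /app1/config are example values, and the parent znode /app1 is assumed to exist.

ZooKeeper zk = new ZooKeeper("localhost:2181", 3000, new Watcher() {
    public void process(WatchedEvent event) {
        System.out.println("Event: " + event);   // watches fire asynchronously
    }
});

zk.create("/app1/config", "value".getBytes(),
        Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);    // shared configuration data
byte[] data = zk.getData("/app1/config", true, null);   // true = set a watch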

Independent operations - Example of a Barrier with Zookeeper

// Constructor: create the barrier znode if it does not exist yet
// (exception handling omitted for brevity).
Barrier(String address, String name, int size) {
    super(address);
    this.root = name;
    this.size = size;

    Stat s = zk.exists(root, false);
    if (s == null)
        zk.create(root, new byte[0], Ids.OPEN_ACL_UNSAFE, 0);

    // My node name
    this.name = new String(InetAddress.getLocalHost().getCanonicalHostName().toString());
}

Independent operations - Example of a Barrier with Zookeeper

// Enter the barrier: register an ephemeral child znode and wait
// until all expected nodes have arrived.
boolean enter() throws KeeperException, InterruptedException {
    zk.create(root + "/" + name, new byte[0], Ids.OPEN_ACL_UNSAFE,
            CreateFlags.EPHEMERAL);
    while (true) {
        synchronized (mutex) {
            List<String> list = zk.getChildren(root, true);
            if (list.size() < size)
                mutex.wait();
            else
                return true;
        }
    }
}
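The slide shows only enter(); a matching leave() step, reconstructed along the same lines and therefore only a sketch, would delete the node's own znode and wait until all children are gone.

boolean leave() throws KeeperException, InterruptedException {
    zk.delete(root + "/" + name, 0);
    while (true) {
        synchronized (mutex) {
            List<String> list = zk.getChildren(root, true);
            if (list.size() > 0)
                mutex.wait();        // wait until every node has left the barrier
            else
                return true;
        }
    }
}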

Independent operations - Example of a Queue with Zookeeper

int consume() throws KeeperException, InterruptedException {
    int result = -1;
    Stat stat = null;
    while (true) {
        // Get the first element available
        synchronized (mutex) {
            List<String> list = zk.getChildren(root, true);
            if (!list.isEmpty()) {
                // Find the child znode with the smallest sequence number
                Integer min = new Integer(list.get(0).substring(7));
                for (String s : list) {
                    Integer tempValue = new Integer(s.substring(7));
                    if (tempValue < min)
                        min = tempValue;
                }
                byte[] b = zk.getData(root + "/element" + min, false, stat);
                zk.delete(root + "/element" + min, 0);
                ByteBuffer buffer = ByteBuffer.wrap(b);
                result = buffer.getInt();
                return result;
            }
            mutex.wait();   // Going to wait
        }
    }
}
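The producer side is not shown on the slides; a minimal sketch, assuming the same zk and root fields, could look as follows. A sequential create names the children element0000000000, element0000000001, ..., which is what the substring(7) calls in consume() strip back off.

boolean produce(int i) throws KeeperException, InterruptedException {
    ByteBuffer b = ByteBuffer.allocate(4);
    b.putInt(i);                               // the payload is a single int
    zk.create(root + "/element", b.array(),
            Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
    return true;
}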

Distributed operations
Split up the work into separate tasks

Distributed Operations
- If the processing cannot be split into separate, independent operations
- If the data is too big to fit on a single machine
- Then there is a need for distributed processing of a single operation

Contemporary Computing Environment - Hardware basics
- Access to data in memory is much faster than access to data on disk (or online)
- Disk seeks: no data is transferred from disk while the disk head is being positioned
- Therefore: transferring one large chunk of data from disk to memory is faster than transferring many small chunks
- Disk I/O is block-based: reading and writing of entire blocks (as opposed to smaller chunks)
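As a small illustration of this point (not from the slides), reading a file in large, buffered blocks instead of many tiny reads follows exactly this advice; the file name and block size are arbitrary example values (classes from java.io).

byte[] block = new byte[64 * 1024];   // read in large chunks
try (InputStream in = new BufferedInputStream(new FileInputStream("data.bin"))) {
    int n;
    while ((n = in.read(block)) != -1) {
        // process n bytes of the block ...
    }
}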

Map/Reduce - Distributed indexing at Google
- For web-scale indexing, one must use a distributed computing cluster
- Individual machines are fault-prone: they can unpredictably slow down or fail
- Based on a distributed file system: files are stored among different machines, with redundant storage
- Information about storage is available to other components

Map/Reduce - MapReduce
- MapReduce (Dean and Ghemawat 2004) is a robust and conceptually simple framework for distributed computing
- Motivated by the indexing system at Google, which consists of a number of phases, each implemented in MapReduce
- Approach: bring the code to the data, i.e. distributed computing without having to write code for the distribution part

Google Infrastructure
- Google data centres mainly contain commodity machines; data centres are distributed around the world
- Estimate: a total of 1 million servers, 3 million processors/cores (Gartner 2007)
- Estimate: Google installs 100,000 servers each quarter, based on expenditures of 200-250 million dollars per year
- This would be 10% of the computing capacity of the world

Map/Reduce
Figure: Map/Reduce data flow. The input data is divided into splits (Split 1-5), which are processed by map workers (Map 1-3). The resulting intermediate data is consumed by reduce workers (Reduce 1-2), which produce the output (Output 1, Output 2).

Map/Reduce
- Task of the mapper: read a chunk of the input data and generate an intermediate key plus values
- Task of the reducer: process a tuple of an intermediate key plus values and write the output
- Note: often a number of additional functions need to be provided as well

            Input             Output
Mapper      k1, v1            list(k2, v2)
Reducer     k2, list(v2)      list(k3, v3)

Example of a Mapper

// Word counting without MapReduce ("old school"): a single process
// iterates over all files and keeps the counts in memory.
void countWordsOldSchool() {
    Map<String, Integer> wordToCountMap = new HashMap<String, Integer>();
    List<File> fileList = dir.listFiles();
    for (File file : fileList) {
        String content = IOUtils.readFileToString(file);
        List<String> wordList = tokenizeIntoWords(content);
        for (String word : wordList) {
            increment(word, 1);
        }
    }
    writeToFile(wordToCountMap);
}

Example of a Mapper

// The same word counting as a MapReduce mapper: emit (word, 1) for every token.
void map(int documentId, String content) {
    List<String> wordList = tokenizeIntoWords(content);
    for (String word : wordList) {
        yield(word, 1);
    }
}

Example of a Reducer

// The reducer sums up the counts emitted for one word and writes the total.
void reduce(String word, List<Integer> countList) {
    int counter = 0;
    for (Integer count : countList) {
        counter += count;
    }
    write(word, counter);
}
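For illustration (not on the slides), a tiny trace of the two functions on two made-up documents:

map(1, "to be or not to be")  yields  (to,1) (be,1) (or,1) (not,1) (to,1) (be,1)
map(2, "to do")               yields  (to,1) (do,1)

The framework groups the intermediate pairs by key:
be -> [1,1]   do -> [1]   not -> [1]   or -> [1]   to -> [1,1,1]

reduce then writes: (be,2) (do,1) (not,1) (or,1) (to,3)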

Overview - Inverted Index
- Input: documents to be indexed; the input documents are parsed and the text is extracted ("Friends, Romans, countrymen")
- Tokenizer: produces a token stream from the text (Friends | Romans | countrymen)
- Linguistic modules: analyse and modify the tokens (friends | romans | countrymen)
- Indexer: collects the tokens and inverts the data structure into posting lists (countrymen: 2, 3; friends: 1, 3, 7; romans: 3, 9)

Detail - Inverted Index
Step 1: Build the term-document table

Document 1: I did enact Julius Caesar. I was killed in the Capitol; Brutus killed me.
Document 2: So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious.

(term, doc #): (i, 1), (did, 1), (enact, 1), (julius, 1), (caesar, 1), (i, 1), (was, 1), (killed, 1), (in, 1), (the, 1), (capitol, 1), (brutus, 1), (killed, 1), (me, 1), (so, 2), (let, 2), (it, 2), (be, 2), (with, 2), (caesar, 2), ...

Detail - Inverted Index
Step 2: Sort by terms

Sorted (term, doc #): (ambitious, 2), (be, 2), (brutus, 1), (brutus, 2), (capitol, 1), (caesar, 1), (caesar, 2), (caesar, 2), (did, 1), (enact, 1), (hath, 2), (i, 1), (i, 1), (in, 1), (it, 2), (julius, 1), (killed, 1), (killed, 1), (let, 2), (me, 1), ...

Detail - Inverted Index
Step 3: Add the term frequency; multiple entries from a single document get merged

(term, doc #, TF): (ambitious, 2, 1), (be, 2, 1), (brutus, 1, 1), (brutus, 2, 1), (capitol, 1, 1), (caesar, 1, 1), (caesar, 2, 2), (did, 1, 1), (enact, 1, 1), (hath, 2, 1), (i, 1, 2), (in, 1, 1), (it, 2, 1), (julius, 1, 1), (killed, 1, 2), (let, 2, 1), (me, 1, 1), (noble, 2, 1), (so, 2, 1), (the, 1, 1), ...

Detail - Inverted Index
Step 4: The result is split into a dictionary file and a postings file

Dictionary (#, term, DF, CF):
0 ambitious 1 1
1 be 1 1
2 brutus 2 2
3 capitol 1 1
4 caesar 2 3
5 did 1 1
6 enact 1 1
7 hath 1 1
8 i 1 2
...

Postings (term # -> {doc #, TF}):
0 -> {2, 1}
1 -> {2, 1}
2 -> {1, 1} {2, 1}
3 -> {1, 1}
4 -> {1, 1} {2, 2}
5 -> {1, 1}
6 -> {1, 1}
7 -> {2, 1}
8 -> {1, 2}
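As an in-memory sketch (not from the slides) of what these two files represent, using plain Java collections; the entries shown correspond to the term caesar in the tables above.

// Dictionary: term -> {document frequency DF, collection frequency CF}
Map<String, int[]> dictionary = new HashMap<String, int[]>();
// Postings: term -> list of {doc #, term frequency TF} pairs
Map<String, List<int[]>> postings = new HashMap<String, List<int[]>>();

dictionary.put("caesar", new int[] { 2, 3 });                // DF = 2, CF = 3
postings.put("caesar", Arrays.asList(new int[] { 1, 1 },     // doc 1, TF 1
                                     new int[] { 2, 2 }));   // doc 2, TF 2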

Index Construction
What is the role of the Map/Reduce framework when building such an index?

Index Construction
Recall step 1 of inverted index creation: each document is parsed into a list of (term, doc #) pairs (the term-document table shown above).

Index Creation
After all documents have been parsed, the inverted file is sorted by terms (as in step 2 above). There might be many items to sort.

Index Construction
- Map step: parse the documents and yield terms as keys
- Framework: sort the keys from the mappers
- Reduce: collect all keys and write out the inverted index
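In the pseudo-API of the word-count example above, a sketch of these roles could look as follows; sortAndDeduplicate is a hypothetical helper, and a real indexer would also carry term frequencies and positions.

// Map: emit (term, doc id) pairs.
void map(int documentId, String content) {
    for (String term : tokenizeIntoWords(content)) {
        yield(term, documentId);
    }
}

// Reduce: the framework has already grouped and sorted by term,
// so the reducer writes one posting list per term.
void reduce(String term, List<Integer> documentIds) {
    write(term, sortAndDeduplicate(documentIds));   // hypothetical helper
}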

Map/Reduce Framework
- Existing open-source framework: Apache Hadoop
- Implemented in Java
- Initially developed by Yahoo!, now used by many companies and organisations

Big Data Framework
- Map/Reduce is well suited for batch processing, less so for online processing, e.g. an incoming stream of Twitter messages
- There is a need for a distributed real-time computation system

Big Data Framework
Storm framework: http://storm-project.net/
- Scalable, fault-tolerant, guaranteed message processing
- Multi-language support: Thrift definitions, JSON-based protocol (for non-JVM languages)
- Uses ZeroMQ for message passing, Zookeeper for cluster setup
- No storage capabilities

Storm
- Topologies: the "job"; defines how spouts and bolts are connected
- Spouts: sources of streams; deliver data to bolts
- Bolts: processing units (can produce input for other bolts)

Storm (figure-only slide)

Storm Topology

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("sentences", new SentenceSpout(), 5);
builder.setBolt("split", new SplitSentence(), 8)
       .shuffleGrouping("sentences");
builder.setBolt("count", new WordCount(), 12)
       .fieldsGrouping("split", new Fields("word"));
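For completeness, a sketch (not on the slide) of how such a topology could be run locally with the Storm API of that time; the topology name word-count and the sleep duration are example values.

Config conf = new Config();
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("word-count", conf, builder.createTopology());
Utils.sleep(10000);        // let the topology run for a while
cluster.shutdown();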

Storm Spout

public class SentenceSpout extends BaseRichSpout {
    @Override
    public void nextTuple() {
        Sentence s = queue.poll();
        if (s == null) {
            Utils.sleep(50);                  // no data available, back off briefly
        } else {
            _collector.emit(new Values(s));
        }
    }
}

Storm Bolt #1

public class SplitSentence extends BaseRichBolt {
    @Override
    public void execute(Tuple tuple) {
        String row = tuple.getString(0);
        String[] words = row.split(" ");
        for (String word : words) {
            collector.emit(tuple, new Values(word));
        }
        collector.ack(tuple);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}

Storm Bolt #2

public class WordCount implements IBasicBolt {
    private Map<String, Integer> _counts = new HashMap<String, Integer>();

    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String word = tuple.getString(0);
        int count;
        if (_counts.containsKey(word)) {
            count = _counts.get(word);
        } else {
            count = 0;
        }
        count++;
        _counts.put(word, count);
        collector.emit(new Values(word, count));
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word", "count"));
    }
}

Summary
Main things to watch out for

Summary
- If the system needs to be scalable, it needs to be appropriately designed
- In a simple scenario, the load is distributed via individual operations
- For more demanding operations, specific approaches are necessary

Summary
The simple scenario:
- Scalability is often limited by dedicated central components, e.g. the master node
- Performance bottlenecks for shared resources
- No guarantee on execution order
- Limited suitability for interactive applications

Summary
The scenario with a complex operation:
- Scalability is very good
- High complexity when implementing
- Not suited for interactive applications

Summary
The End. Next: examination.