Thread Specific Storage
The behavioural pattern explained

Huib van den Brink
Utrecht University
hjbrink@cs.uu.nl

Abstract

While multi-threading can improve the performance of an application, this is not always the case. Besides the additional complexity involved in using multiple threads, synchronization can slow applications down. Moreover, correctness is hard to test when threads are involved. In those cases where attributes can reside in the context of a single thread only, abstraction and transparency can be designed in. This paper describes the Thread-Specific Storage pattern with all its facets. Thread-Specific Storage improves performance and simplifies multi-threaded applications by defining the scope of attributes. The paper explains not only the basic principles, but also the complications and consequences involved.

1. Introduction

Many programs nowadays use multi-threaded environments to support concurrent behaviour. While widely used, many implementations struggle to get the behaviour of their concurrent threads correct. In general, concurrent programming is hard because of its pitfalls. There is no practical way to test the correctness of an implementation exhaustively, and complications such as synchronization, the Java memory model, race conditions, starvation and deadlocks make solutions error prone.

When locking mechanisms are used to access attributes shared among threads, a bottleneck is introduced because threads have to wait for each other, while this sometimes is not even necessary. Especially when many program points are synchronized, the performance penalty can become significant. In some cases the shared objects are not used to share state among threads at all, but rather have the semantics of binding values to one specific thread. One could think of a session when a one-thread-per-session model is used. The desire, then, is to have the value of an attribute reside only within the context of a single thread. This implies that every thread accessing this attribute only retrieves the value it put there itself, without having to take notice of the actions performed by the other application threads.

2. Problem

When a single point of entrance delegates the requests of threads to values that only concern the thread handling and requesting that value, it is typically a bad idea to let the application threads themselves access a common container. When each thread has to find out for itself which values apply, many drawbacks are introduced. Not only does this require well thought-out synchronization, but errors can also be introduced at several program points within the application.

The last thing one wants is to clutter up code with specific references to a particular implementation of a thread-local attribute. To abstract from this, the thread-local attribute should be visible to the application as if it were an ordinary attribute. This implies that the mechanisms realizing this behaviour should be generic enough to be hidden from the application. The consequence is that application programmers are less likely to make mistakes, because they only use the predefined services that realize the required behaviour. A side effect is reusability, because concerns are well separated in this manner. For adaptability this means that, because the semantics of the thread-local storage are defined at only one place, i.e. in its implementation, the behaviour can be replaced over time, since the application accesses the attributes in a transparent way. This is important for applications evolving from a single-threaded model to a multi-threaded model. It is also meaningful when migrating from a thread-local to a thread-shared model: the only thing that has to be done is to introduce some synchronization to maintain a valid state. So means should be provided, and work should be done, to design a program to be flexible even where threading is concerned.
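To make the problem concrete, the following minimal sketch (hypothetical classes, not taken from the paper, using java.lang.ThreadLocal as a stand-in for the mechanism developed later) contrasts the undesirable common-container approach with the transparent access the pattern aims for:

import java.util.HashMap;
import java.util.Map;

// Undesirable: every thread digs into a shared container itself and must synchronize.
class CommonContainer {
    static final Map<Thread, String> values = new HashMap<Thread, String>();
}

class ApplicationCode {
    void cluttered() {
        synchronized (CommonContainer.values) {   // synchronization repeated at every use site
            CommonContainer.values.put(Thread.currentThread(), "state");
        }
    }

    // Desired: the attribute is read and written as if it were an ordinary attribute;
    // the per-thread bookkeeping is hidden behind a predefined service.
    private static final ThreadLocal<String> state = new ThreadLocal<String>();

    void transparent() {
        state.set("state");                       // no visible locking or thread lookups
        String current = state.get();
        System.out.println(current);
    }
}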
3. Context and scoping of a state

The state is often referred to as the state of an object: a class defines methods and attributes together with the types needed, and when several instances of that class are created, each instance has its own state, i.e. the attributes can contain different values for each instance. This, however, is not the only way to define the scope of the values of attributes. For maintaining context information several possibilities arise:

per system - one could think of the use of settings files
per version - attribute names or settings files could include a versioning convention
per application - singleton objects or static attributes are an example of this
per session - most often a container is maintained by the underlying system handling the sessions
per method - attributes defined at the top level of a method are visible within that whole method
per block - attributes that are defined within a block are not visible outside that block
per class - static attributes provide this functionality
per object - a very common way to use attributes
per group - singleton objects within a package could provide such means
per thread - this requires the Thread-Specific Storage pattern to be applied

A few of these scopes are illustrated in the sketch below.
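As a small illustration (hypothetical field names, not from the paper), the same logical attribute can be given a per-class, per-object or per-thread scope in Java:

public class ScopeExamples {
    // per class / per application: one value shared by everything that can see the class
    private static String applicationWideValue;

    // per object: each instance carries its own value
    private String perInstanceValue;

    // per thread: each thread accessing this field sees only its own value
    private static final ThreadLocal<String> perThreadValue = new ThreadLocal<String>();

    void demo() {
        perThreadValue.set("only visible to " + Thread.currentThread().getName());
        System.out.println(perThreadValue.get());
    }
}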

When thread-local storage is taken into account, another kind of state is induced: within one instance of a class, an instance attribute can contain several values, depending on the thread accessing the attribute. So by using thread-local storage a new dimension is created.

4. Thread specific storage

The Thread-Specific Storage pattern[1], also known as thread-local storage, defines an abstraction[3] and a generic way of implementing the functionality for attributes residing in a thread's own space. For every thread accessing such an attribute, only the values set by that same thread are visible. This makes the attribute a single access point while several values are contained behind it.

4.1 Architecture

For the most general setup a few participants are defined, as visualized in figure 1. First of all, an ObjectProxy is defined in order to separate application concerns from thread-access handling. The proxy object accepts and handles requests from threads, delegates these requests to the correct Thread-Specific objects stored, and hides this implementation from the rest of the application. The application sees the proxy as an object exporting a number of methods, as defined by its interface. In order to let threads access their data, the proxy maintains a mapping between thread objects and object sets. An object set can be seen as a container holding all attributes and values related to one specific thread; the storage itself is situated in the Thread-Specific objects. (A compact structural sketch of these participants is given at the end of this section.)

So when the application requests a value to be read, it calls the corresponding method on the proxy, as defined by its interface. The proxy locates the object set that matches the invoking thread. If a thread invokes the proxy for the first time, a new object set is created and registered. Once the object set has been identified, it is up to the method to decide which of the Thread-Specific objects to access, because a thread can be related to several Thread-Specific objects. If a Thread-Specific object is requested for the first time, a new instance is created and registered. Each Thread-Specific object within its set can be identified using a Key object, so the key determines the attribute involved. If the same key were used on another object set, the same attribute would be returned, but containing the value for a different thread. In order to ensure that a Key is unique within any given object set, a KeyFactory is defined to provide this functionality.

Figure 1. Thread-Specific Storage pattern

4.2 Variants

Instead of defining several Thread-Specific objects per thread, one could also choose to have only one Thread-Specific object per thread, creating only a mapping between thread and object and eliminating the need for Keys. The Thread-Specific object then contains all attributes needed by the thread. Often this approach exposes the Thread-Specific storage to the application, because the application first retrieves the Thread-Specific object and then uses its getters and setters. Retrieving the Thread-Specific object often, and at many places within the application logic, only clutters up the code.
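To summarize the participants of figure 1, the following structural sketch may help; the names follow the descriptions above, but the method bodies are illustrative assumptions, not the paper's code, which follows in section 9:

import java.util.Hashtable;
import java.util.Map;

final class Key { }                       // identifies one Thread-Specific object within a set

final class KeyFactory {                  // the only place that needs synchronization
    static synchronized Key createKey() { return new Key(); }
}

class ThreadSpecificObject { }            // holds the actual per-thread value(s)

final class ObjectSet {                   // all Thread-Specific objects of one thread
    private final Map<Key, ThreadSpecificObject> objects =
            new Hashtable<Key, ThreadSpecificObject>();
    ThreadSpecificObject get(Key key)              { return objects.get(key); }
    void set(Key key, ThreadSpecificObject object) { objects.put(key, object); }
}

class ObjectProxy {                       // the only object the application talks to
    private static final Map<Thread, ObjectSet> sets = new Hashtable<Thread, ObjectSet>();

    protected ObjectSet locateObjectSet() {
        Thread current = Thread.currentThread();
        ObjectSet set = sets.get(current);
        if (set == null) {                // first access by this thread: create and register
            set = new ObjectSet();
            sets.put(current, set);
        }
        return set;
    }
}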
5. Synchronization benefits explained

One of the purposes of this pattern is to improve performance[5]. To explain how this is realized, we have to focus on synchronization. When multiple threads access the same instance of an object, a locking mechanism is needed to alter its state. Because the storing functionality is not isolated per thread, mutating the values stored in an attribute implies locking one or more objects. This locking, however, is inefficient when there is no real need for threads to wait for each other, which is the case when the concerns of the threads are well separated: most of the work could then be done without any locking.

When the Thread-Specific Storage pattern is used, the only point that needs synchronization is the key factory. Depending on the implementation of the proxy object, the mapping between threads and object sets sometimes also needs synchronization in order to maintain a valid state. Mutating the values stored within Thread-Specific objects never requires locking, because they are in a different scope than those of the other threads. A small sketch of the difference follows.
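A minimal sketch of where locking is and is not needed (a hypothetical counter example, not from the paper):

// Shared state: every mutation must be synchronized, so threads may block each other.
class SharedCounter {
    private long count = 0;
    synchronized void increment() { count++; }
    synchronized long value()     { return count; }
}

// Thread-specific state: each thread mutates only its own copy, so no locking is needed.
class PerThreadCounter {
    private final ThreadLocal<long[]> count =
            new ThreadLocal<long[]>() {
                @Override protected long[] initialValue() { return new long[] { 0 }; }
            };
    void increment() { count.get()[0]++; }     // touches the calling thread's value only
    long value()     { return count.get()[0]; }
}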

6. Garbage collection

Garbage collectors[6] look for unreachable objects. They do this by checking, for each instance, whether some other live object still references it, starting from a root set of references and following every reference from there, creating a graph. Unused instances may therefore still reference each other and nevertheless be garbage collected. The problem with Thread-Specific objects is that they are referenced from the proxy object. The proxy may keep these references even after the thread has ceased to exist, which prevents the garbage collector from cleaning up the stored data, and there is no easy way to detect this unnecessary use of resources.

One way of solving this is to poll for a full thread-space cleanup: for threads that are registered in the proxy but no longer present in the application, the Thread-Specific objects can be dereferenced. Another, non-opaque, approach is to let the application call the set method of the proxy and explicitly set the value to null once it has become superfluous. This kind of destructor, however, relies heavily on the implementation of the application; it is highly error prone and should be avoided at all times. Object lifetime management strongly depends on the features of the programming language, so elegant solutions should be sought within the means provided by the language used.

7. Inter-storage communication

When values stored in attributes are accessed, the Thread-Specific Storage pattern prescribes an architecture in which the mechanism that retrieves the value corresponding to the invoking thread is made transparent. As a consequence, a thread can only access the values stored for itself; access to the storages of other threads is disallowed. However, if for instance two sessions should be able to set or get values stored in each other, due to some dependency between the sessions, the implementation of the pattern could provide means to enable this. Providing retrieval of other threads' values extends the proxy object so that it services requests for foreign storages coming from the application infrastructure, overriding the default behaviour.

Inter-storage communication is considered a bad practice, and several reasons can be given for disallowing it. First of all, misuse and hacking within the application to locate values of other threads make the application logic complex and tricky. By allowing access to the storages of other threads, the internal mechanism used is exposed, and overriding the transparency tends to invite programmers to use a bad design within the application structure, not to mention the impact on garbage collection. It also does not help reusability. But perhaps the biggest drawback of all is the reintroduction of concurrency control: when several threads can access a value again, the need for synchronization returns, leading to a performance penalty. The only mitigation is that the lock can almost always be acquired instantly, because only a few threads interfere with each other at certain points.

8. Thread pools

To increase performance in a multi-threaded design, thread pools are often used in order to reuse the threads created and to limit the number of context switches. A thread is then used for several purposes during its lifetime. This mixture of concerns makes the Thread-Specific Storage pattern, in its default form, unsuitable for this type of usage. When threads just do one task at a time and complete that task before continuing with another one, a reasonably simple solution can be provided: to allow threads to be reused, the pattern should be adapted to define a global reset method, which clears the object set of the thread invoking it, as in the sketch below.
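A minimal sketch of such a reset method (the class and field names are illustrative; they mirror, but are not identical to, the proxy developed in section 9):

import java.util.Hashtable;

// Placeholder for the per-thread container used later in section 9.
class PerThreadObjectSet { }

class ResettableProxy {
    private static final Hashtable<Thread, PerThreadObjectSet> store =
            new Hashtable<Thread, PerThreadObjectSet>();

    // Clears all thread-specific state of the calling thread, so a pooled
    // thread can be handed to the next task with a clean slate.
    public void reset() {
        store.remove(Thread.currentThread());
    }
}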
It is then the responsibility of the application to invoke this method at the appropriate positions and at the right time. It gets even more complex if threads switch between several tasks, i.e. one thread dividing its processing time among several assigned tasks. That rules out simply invoking the reset method whenever a task is left alone for the time being, because the thread may return to that task some time later, requiring the stored values to be preserved. Another dimension would then have to be introduced within the storage: each thread having its own storage, each storage containing a space per task currently assigned to the thread, and each task space containing all the attributes with their values. Garbage collection in this situation cannot rely on threads becoming inactive, which complicates the timing and semantics of the garbage collection process. It gets even worse when caching at several levels is taken into consideration. Implementing a Servlet container that uses thread pooling, caching and session context information kept in thread-local variables is therefore a complex task.

9. Primitive implementation

Several design decisions apply: one object containing all attributes, or a proxy object delegating calls to many thread-local objects related to the calling thread. The latter enables the pattern to return a variety of different objects depending on the method invoked on the proxy object. In order to illustrate how valuable and powerful the pattern is, I will begin by showing how the requested functionality could be achieved using undesirable solutions. As we go, we will gradually improve the solution, structure and implementation.

9.1 Minimal solution

A minimal solution to achieve the required functionality is to extend the Thread class. By creating a wrapper around the Thread class, new functionality per thread object is established. So we just create a class that extends Thread:

public class SpecializedThread extends Thread {
    private String message = null;

    public SpecializedThread(Runnable target) {
        super(target);
    }

    public void setMessage(String message) {
        this.message = message;
    }

    public String getMessage() {
        return message;
    }
}

By adding methods and attributes, a seeming relation between the thread and additional data is established. The application then uses it as in:

SpecializedThread thread = new SpecializedThread(this);
thread.start();
thread.getMessage();

So now the thread, of type SpecializedThread, must be passed around to wherever it is used. This cannot prevent threads from accessing each other's data: any thread that accesses the created thread object can read the value stored in it. Furthermore, threads need to access the wrapper class instead of the primitive Thread object, so a Thread.currentThread() just will not do here when the intention is to invoke something like getMessage(), because this method is unknown to the Thread class.

9.2 Flat approach

A flat approach is achieved by maintaining a mapping between threads and their Thread-Specific object. In this example each thread has exactly one Thread-Specific object containing all attributes required by any thread.

public class ThreadSpecificObjectProxy implements ThreadSpecificMethods {
    private static Hashtable<Thread, ThreadSpecificObject> objectSet =
        new Hashtable<Thread, ThreadSpecificObject>();
    ...
}

First of all, a proxy object is created to abstract the application from the implementation. This proxy maintains one hashtable relating threads to Thread-Specific objects. Every proxy object created references the same static hashtable, making it completely legal to create new instances of the proxy. In order to let the proxy object export methods, an interface is defined:

public interface ThreadSpecificMethods {
    public void setMessage(String message);
    public String getMessage();
}

This example contains one attribute that can be set as well as read. While the proxy may export many methods, they all have in common that they should access the right Thread-Specific object. So the proxy defines a method for retrieving the corresponding Thread-Specific object:

private ThreadSpecificObject locateThreadSpecificObject() {
    Thread currentThread = Thread.currentThread();
    ThreadSpecificObject tso = objectSet.get(currentThread);
    if (tso == null) {
        tso = new ThreadSpecificObject();
        objectSet.put(currentThread, tso);
    }
    return tso;
}

If a new thread causes this invocation, the Thread-Specific object will not be found, and it is therefore created and registered on the spot. Finally, the proxy object implements the methods defined by its interface by delegating the requests to the correct Thread-Specific object:

public void setMessage(String message) {
    locateThreadSpecificObject().setMessage(message);
}

public String getMessage() {
    return locateThreadSpecificObject().getMessage();
}

The Thread-Specific object itself, unsurprisingly, closely resembles the implementation of the wrapper thread discussed in the previous section:

public class ThreadSpecificObject implements ThreadSpecificMethods {
    private String message = null;

    public void setMessage(String message) {
        this.message = message;
    }

    public String getMessage() {
        return message;
    }
}

Notice that both the Thread-Specific object and the proxy implement the interface defined.
This is necessary since the proxy just forwards every request to the correct Thread-Specific object. From the application side this looks as follows:

Thread thread = new Thread(this);
thread.start();

ThreadSpecificObjectProxy proxy = new ThreadSpecificObjectProxy();
proxy.getMessage();

A proxy can now be instantiated anywhere, totally independent of the Thread object itself. Abstraction and transparency are achieved in the sense that the proxy takes care of the value retrieval. Now, however, every attribute needed is stored in just one object, creating the need for many methods, which violates a separation-of-concerns design.

9.3 Naive use of multiple Thread-Specific objects per thread

As visualized in the section describing the architecture of the pattern, the pattern is able to store several Thread-Specific objects per thread.

In order to do so, Keys are introduced. To show all the possibilities provided by the pattern, we extend the interface exporting the methods the application can call:

public interface ThreadSpecificMethods {
    public void setTitle(String title);
    public Title getTitle();
    public void setMessage(String message);
    public Message getMessage();
}

A separation between message and title is specified by defining the objects Message and Title. The implementation is straightforward:

public class Title implements ThreadSpecificObject {
    private String title = null;

    public void setTitle(String title) {
        this.title = title;
    }

    public String getTitle() {
        return title;
    }
}

A similar class is created for the Message object. In order for the proxy to treat every type of Thread-Specific object in the same way, an empty interface is provided. Of course, it would also be possible to demand that any object acting as a Thread-Specific object implement some methods defined here.

public interface ThreadSpecificObject {
}

Now we have separated the concerns by defining different classes, while each of them is still a Thread-Specific object. The proxy can now elegantly delegate the invocations, but in order to do so, some additional registration takes place:

public class ThreadSpecificObjectProxy implements ThreadSpecificMethods {
    private static Hashtable<Thread, ThreadSpecificObjectSet> store =
        new Hashtable<Thread, ThreadSpecificObjectSet>();
    private static final Key titleKey = KeyFactory.createKey();
    private static final Key messageKey = KeyFactory.createKey();
    ...
}

First of all, the proxy maintains a mapping between Threads and object sets. Besides this, a number of Keys are specified. A Key identifies the Thread-Specific object requested. The key is unique within any Thread-Specific object set, enabling methods within the proxy to refer to a certain Thread-Specific object. This means that a key is valid for every thread present: for each thread the key will return the same Thread-Specific object (type), but with the values stored for the requesting thread. The implementation of the KeyFactory is, unsurprisingly, straightforward:

public class KeyFactory {
    private static int currentIdentifier = 1;

    public static synchronized Key createKey() {
        currentIdentifier++;
        return new Key(currentIdentifier);
    }
}

Notice, however, that at this point a synchronized method is introduced. At no other location is synchronization needed (strictly speaking this is not entirely true, since the Hashtable class internally provides some concurrency control of its own). The class Key has a private int attribute that is set via the constructor, with no getter or setter methods provided. Equality is checked through the equals method as defined in java.lang.Object.
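The paper does not show the Key class itself; a minimal sketch matching that description could look like this:

public class Key {
    // Identifier assigned by the KeyFactory; no getters or setters are provided.
    private final int identifier;

    public Key(int identifier) {
        this.identifier = identifier;
    }

    // Equality and hashing are inherited from java.lang.Object, i.e. based on object
    // identity, which suffices because each key instance is created only once and reused.
}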
As discussed before, the methods supported by the proxy, in order to delegate the requests, have in common that they retrieve a Thread-Specific object. In the current model, however, an additional step has to be performed: first the corresponding Thread-Specific object set is located. This can easily be done thanks to the mapping between threads and their corresponding object sets:

private ThreadSpecificObjectSet locateThreadSpecificObjectSet() {
    Thread currentThread = Thread.currentThread();
    ThreadSpecificObjectSet objectSet = store.get(currentThread);
    if (objectSet == null) {
        objectSet = new ThreadSpecificObjectSet();
        store.put(currentThread, objectSet);
    }
    return objectSet;
}

Once again, when a thread accesses the store for the first time, a new set is instantiated and registered. A Thread-Specific object set is nothing more than a mapping between a given Key and a Thread-Specific object. The right Key is supplied by the proxy, so this is transparent to the application; the proxy has enough knowledge to select and pass the right Key when requesting or storing a Thread-Specific object.

public class ThreadSpecificObjectSet {
    private Hashtable<Key, ThreadSpecificObject> objectSet =
        new Hashtable<Key, ThreadSpecificObject>();

    public void set(Key key, ThreadSpecificObject tso) {
        objectSet.put(key, tso);
    }

    public ThreadSpecificObject get(Key key) {
        return objectSet.get(key);
    }
}

In order to retrieve the requested Thread-Specific object itself, the proxy internally does the following, which is really nothing more than glue, delegating requests:

private ThreadSpecificObject locateThreadSpecificObject(Key key) {
    ThreadSpecificObjectSet objectSet = locateThreadSpecificObjectSet();
    ThreadSpecificObject tso = objectSet.get(key);
    return tso;
}

As shown, a variety of usages and considerable flexibility can be achieved. Only the methods forwarding the requests carry the semantics of which Thread-Specific object to access; therefore a Key is given as parameter.

Retrieving a title within the proxy object is then easy. The getTitle method knows that it needs to access a Title object, and is thus able to pass the correct Key to the method locating the actual Thread-Specific object. The Keys involved are stored in the proxy with class visibility.

public Title getTitle() {
    return (Title) locateThreadSpecificObject(titleKey);
}

Note that several Title instances could be distinguished and identified; a getTitle2() method could, for instance, refer to another instance of the Title class. The cast is necessary because the proxy works with ThreadSpecificObjects and not with implementations of this interface. Setting a title involves some more registration actions:

public void setTitle(String title) {
    ThreadSpecificObject tso = locateThreadSpecificObject(titleKey);
    if (tso == null) {
        tso = new Title();
        ThreadSpecificObjectSet objectSet = locateThreadSpecificObjectSet();
        objectSet.set(titleKey, tso);
    }
    ((Title) tso).setTitle(title);
}

So the proxy delegates the invocation to the Title object. Again, the setTitle method knows which Key to use in order to retrieve the Thread-Specific object. When the thread has never accessed that object before, the corresponding object set is retrieved and a new Title object is constructed and registered. This is allowed because the setTitle method carries the semantics of which implementation of ThreadSpecificObject to instantiate. A similar implementation exists for the Message object, so we do not show it here. Finally, the usage from the application side is simple:

Thread thread = new Thread(this);
thread.start();

ThreadSpecificObjectProxy proxy = new ThreadSpecificObjectProxy();
proxy.setTitle("title test");
Title t = proxy.getTitle();

Message m = proxy.getMessage();
m.setMessage("message test");

9.4 Taking garbage collection into regard

When we expand the implementation described in the previous section, we can achieve garbage collection of thread-specific data ourselves. The reason why normal garbage collection does not suffice when Thread-Specific storages are used is described in section 6.
A basic, simple, but not optimised version of our own garbage collector might look like this:

class GarbageCollector implements Runnable {
    private ThreadSpecificObjectProxy proxy = new ThreadSpecificObjectProxy();

    public void run() {
        while (true) {
            Thread current = Thread.currentThread();
            try {
                Thread.sleep(100);
            } catch (Exception e) {
                ...
            }

            ThreadGroup group = current.getThreadGroup();
            while (group.getParent() != null) {
                group = group.getParent();
            }

            int count = group.activeCount();
            Thread[] threads = new Thread[count];
            int amount = group.enumerate(threads);
            if (amount != count) {
                System.out.println(...);
            }

            Thread[] registered = proxy.getKnownThreads();
            for (int i = 0; i < registered.length; i++) {
                boolean found = false;
                for (int j = 0; j < threads.length; j++) {
                    if (threads[j] == registered[i]) {
                        found = true;
                    }
                }
                if (!found) {
                    proxy.remove(registered[i]);
                }
            }
        }
    }
}

We poll in a loop, with a certain interval, to check whether some garbage can be collected. To this end the GarbageCollector class implements the Runnable interface; the thread is then created within the constructor of the GarbageCollector. As this runs an endless loop, thread.setDaemon(true) should be set to allow the application to terminate. When we do a run checking for unused data, we first search for the topmost ThreadGroup. This group contains all the threads present in the JVM and enables us to get a reference to every Thread present. Once these are put into an array, we ask the proxy for every thread registered with it; remember that there is a mapping between Threads and object sets, so the proxy simply returns all of the hashtable's keys to the GarbageCollector. Finally, when there exists a registered Thread that is no longer active, the data stored for that thread can be set to null, letting the garbage collector within the JVM do its job.

The methods now required within the proxy object are:

Thread[] getKnownThreads() {
    Thread[] result = new Thread[store.size()];
    Iterator iter = store.keySet().iterator();
    for (int i = 0; i < result.length; i++) {
        if (iter.hasNext()) {
            result[i] = (Thread) iter.next();
        } else {
            break;
        }
    }
    return result;
}

So all threads registered and known are simply gathered here. And of course, the simple removal from the Hashtable should be provided:

void remove(Thread thread) {
    store.remove(thread);
}

The consequences for the application are rather simple:

new GarbageCollector();

But this could also be situated at the proxy object, making it totally transparent to the application.

10. Java specific implementation

While most programming languages could implement something like what was discussed in the previous section, many languages have built-in support for thread-local storage. Java introduced this ability[4] in Java 1.2. This time the Thread-Specific object is kept as a plain old Java object:

public class Message {
    private String message = null;

    public void setMessage(String message) {
        this.message = message;
    }

    public String getMessage() {
        return message;
    }
}

The ThreadLocal class is extended and defined as follows:

public class ThreadLocalMessage extends ThreadLocal<Message> {
    protected Message initialValue() {
        return new Message();
    }
}

Only initialValue is overridden; the setter and getter of the object (Message) to store are left as default. The application can now just instantiate the ThreadLocalMessage and perform get operations on it:

private static ThreadLocalMessage messageStore = new ThreadLocalMessage();

Thread thread = new Thread(this);
thread.start();

messageStore.get().setMessage("test msg");
messageStore.get().getMessage();
Message m = messageStore.get();

As this example makes clear, many get() invocations may arise within your application code, even when binding the result to an attribute with a sufficient scope (for instance local to the run method).
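To make the per-thread isolation visible, a small demo (illustrative, not from the paper; it reuses the Message and ThreadLocalMessage classes defined above) could look as follows:

public class ThreadLocalDemo implements Runnable {
    private static ThreadLocalMessage messageStore = new ThreadLocalMessage();

    public void run() {
        // Each thread stores and reads its own Message instance.
        messageStore.get().setMessage("set by " + Thread.currentThread().getName());
        System.out.println(Thread.currentThread().getName()
                + " sees: " + messageStore.get().getMessage());
    }

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(new ThreadLocalDemo(), "thread-A");
        Thread b = new Thread(new ThreadLocalDemo(), "thread-B");
        a.start();
        b.start();
        a.join();
        b.join();
        // Each thread prints only the message it stored itself.
    }
}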
Garbage collection is handled using weak references[7]. Every thread accessing the Thread-Specific storage holds an implicit reference to its corresponding Thread-Specific object, as long as the thread is alive and the application has a strong reference to the ThreadLocal instance. When a thread dies, all of its copies of the values held in ThreadLocal instances become subject to garbage collection, together with the Thread-Specific objects they contain, unless of course the contained Thread-Specific objects are still strongly referenced from some other object.
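As a side note not discussed in the paper: java.lang.ThreadLocal also offers an explicit remove() method, which is the usual way to address the thread-pool cleanup concern of section 8, and Java 8 and later provide ThreadLocal.withInitial as a shorthand for overriding initialValue. A minimal sketch, reusing the Message class above:

public class CleanupSketch {
    // Java 8+ shorthand for the initialValue() override used above.
    private static final ThreadLocal<Message> messageStore =
            ThreadLocal.withInitial(Message::new);

    static void handleTask() {
        try {
            messageStore.get().setMessage("per-task state");
            // ... task work ...
        } finally {
            // Clear this thread's value so a pooled thread does not leak it into the next task.
            messageStore.remove();
        }
    }
}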

11. Specifying a transformation

If one extends a language by adding a syntax construct for using thread-local storage, some work has to be done to implement this feature. This can be done by specifying a transformation. The syntax extension could look like:

threadlocal <type> <variable-name>

Reading or writing such an attribute means getting and setting values only for the thread performing the request: a single point of entrance containing multiple values for the different threads present.

11.1 Naive and inefficient approach

First, one HashMap is specified to contain one object per thread:

HashMap threadStorages = new HashMap();

This HashMap should then be globally accessible. The object stored per thread is itself also a HashMap, creating a relation from attribute name to the thread-local value of that attribute.

public void run() {
    Thread current = Thread.currentThread();
    if (threadStorages.get(current) == null) {
        threadStorages.put(current, new HashMap());
    }

    ... // impl code

    threadStorages.remove(current);
}

Now every thread registers its own HashMap containing all the thread-local attribute values used by that thread. Note that the lines at the beginning of the run method could also be situated in the start() method of java.lang.Thread. Garbage collection consists of removing the HashMap entry containing all the variables for that specific thread. Setting a threadlocal variable to null would not mean much more than removing that one value from the HashMap that contains all the values for the invoking thread.

Garbage collection is now performed at the end of the run method, which is not always valid: several execution paths could end in a premature exit. So the transformation should inspect all possible execution paths in the body and insert the remove invocation at every exit point. Another interesting scenario is a thread that simply invokes the run method from somewhere else, thus without the construction of a new thread. Finally, exceptions should also be taken into account, making it a non-trivial transformation to establish garbage collection support.

For every threadlocal attribute specified, getter and setter methods should be generated and inserted (where T stands for the declared type of the attribute):

public T uniqueGetObj_x0() {
    Thread invoking = Thread.currentThread();
    return (T) ((HashMap) threadStorages.get(invoking)).get("x0");
}

public void uniqueSetObj_x0(T x0) {
    Thread invoking = Thread.currentThread();
    ((HashMap) threadStorages.get(invoking)).put("x0", x0);
}

Creating an instance of the threadlocal attribute has the semantics of adding a newly created object to the HashMap of the invoking thread. So

Message m = new Message();

should be replaced by

uniqueSetObj_x0(new Message());

This holds for every creation of a new Message object instance. At every point where the attribute is used, the variable should be obtained from the HashMap related to the invoking thread:

m.getMessage()

should become:

uniqueGetObj_x0().getMessage()

So this solution implies a lot of overhead and many complications.
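One additional complication not spelled out above: the global threadStorages map is itself touched by every thread, so registration and removal would in practice need a thread-safe map or external synchronization. A minimal sketch of the registration step with a concurrent map (a hypothetical helper, not the paper's code):

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ThreadStorages {
    // One nested map of attribute-name -> value per thread; the outer map must
    // tolerate concurrent registration and cleanup by many threads.
    private static final Map<Thread, Map<String, Object>> threadStorages =
            new ConcurrentHashMap<Thread, Map<String, Object>>();

    static Map<String, Object> storageForCurrentThread() {
        Thread current = Thread.currentThread();
        Map<String, Object> storage = threadStorages.get(current);
        if (storage == null) {
            // Only the owning thread registers its own entry, so there is no race on this key.
            storage = new HashMap<String, Object>();
            threadStorages.put(current, storage);
        }
        return storage;
    }

    static void cleanup() {
        threadStorages.remove(Thread.currentThread());
    }
}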
11.2 A transformation using ThreadLocal objects

From Java 1.2 onwards we can simply transform the threadlocal attributes defined into an implementation that uses the ThreadLocal objects provided. For every threadlocal attribute defined, a ThreadLocal object should be generated:

private static ThreadLocal<Message> unique_x0 = new ThreadLocal<Message>();

Creating an instance of the threadlocal attribute has the semantics of adding a newly created object to the ThreadLocal object instance. So

Message m = new Message();

should be replaced by

unique_x0.set(new Message());

This holds for every creation of a new Message object instance. At every point where the attribute is used, the variable should be replaced by a reference to the object contained in the ThreadLocal object:

m.getMessage()

should become:

unique_x0.get().getMessage()

So by replacing every occurrence of m with unique_x0.get(), the transformation redirects the usage to the correct object value. Complications with the transformation are the passing of threadlocal attributes as parameters, keeping all kinds of object-oriented aspects in mind: inheritance, overriding methods and passing threadlocal objects around lead to many situations that have to be taken into account.

12. Known uses

Double-checked locking. Thread-Specific Storage can act as a solution to the problems[11] involved in double-checked locking. Consider the creation of a singleton object. Double-checked locking improves performance: the only time the lock is acquired is when an instance has not yet been created.

private static SingletonObject singletonObj;

public static SingletonObject getInstance() {
    if (singletonObj == null) {
        synchronized (SingletonObject.class) {   // lock on the class, since getInstance is static
            if (singletonObj == null) {
                singletonObj = new SingletonObject();
            }
        }
    }
    return singletonObj;
}

So this SingletonObject has a private constructor, and the only instance made is created by the getInstance method. The getInstance method checks whether an instance of the SingletonObject already exists; if not, the lock is acquired and a new instance is created and bound to the static variable. This way of implementing a singleton in Java, however, is unsafe. The solution is to use a thread-local variable that tells whether the current thread has already entered the synchronized body once.

private static ThreadLocal<Boolean> didPerform = new ThreadLocal<Boolean>();
private static SingletonObject singletonObj;

public static SingletonObject getInstance() {
    if (didPerform.get() == null) {
        synchronized (SingletonObject.class) {
            if (singletonObj == null) {
                singletonObj = new SingletonObject();
            }
            didPerform.set(true);
        }
    }
    return singletonObj;
}

So the idea here is that every thread enters the synchronized body the first time it calls getInstance, while that section is visited only once per thread. We do not check whether singletonObj is still null, but whether the invoking thread has already entered the synchronized section before. If a thread invokes the getInstance method for the first time, the didPerform variable simply returns null, so the synchronized section is always entered on a thread's first invocation of getInstance. Within the synchronized body it may turn out that the singletonObj was already created by another thread, but the next time the didPerform flag is checked, the current thread can be sure that the instance was already created, by itself or by someone else. In this way each and every thread is forced to enter the synchronized body exactly once.

Java Authentication and Authorization Service. Thread-specific storage is of good use in the JAAS[12][13] design. Some articles argue that it is good design to associate the Principal returned in the SecurityContext object with the current thread, and some EJB servers nowadays actually do associate the JNDI principal with the current thread when a client creates a remote interface, and use that principal to invoke calls on enterprise beans.

Performance tools and profilers. Performance tools[9] and profilers[8] use the pattern to gather information with respect to one single thread. A thread-local storage data structure can be used to record per-thread profiling data. The Java Virtual Machine Profiler Interface[8] even specifies that "the JVMPI supplies to the agent a pointer-size thread-local storage that can be used to record per-thread profiling information", showing its use clearly.

JThreads/C++. JThreads/C++ [10] implemented its own version of the Thread-Specific Storage pattern to provide thread-local functionality and behaviour within C++. JThreads/C++ is designed to offer the same appearance and behaviour of threads as people are used to in Java, but in C++, so that Java programmers can easily use threads in C++. For this purpose the authors made their own implementation of the Thread-Specific Storage pattern. The class JTCTSS implements the proxy object of the pattern, whereas JTCThreadKey represents a key used by the proxy object.
Other supporting classes are JTCThreadId and JTCThreadDeath. So, as one can see, the whole pattern is present here.

13. Concluding Remarks

The Thread-Specific Storage pattern adequately provides the means to define the scope of attributes in terms of the threads accessing them. Summarizing, we can say that the use of thread-local storage abstracts from the mechanism realizing that storage, making it generic and reusable. In situations where a binding between thread and attribute is desirable, this pattern is certainly applicable, regardless of the structure of the rest of the application using it.

References

[1] Douglas Schmidt, Michael Stal, Hans Rohnert and Frank Buschmann. Pattern-Oriented Software Architecture: Patterns for Concurrent and Networked Objects.
[2] Douglas C. Schmidt, Timothy H. Harrison and Nat Pryce. Thread-Specific Storage for C/C++.
[3] Brian Goetz. Exploiting ThreadLocal to Enhance Scalability.
[4] Doug Lea. Concurrent Programming in Java.
[5] Yair Sade. Optimizing C Multithreaded Memory Management Using Thread-Local Storage.
[6] Hans J. Boehm. Fast Multiprocessor Memory Allocation and Garbage Collection.
[7] Monica Pawlan. Reference Objects and Garbage Collection.
[8] Sun Microsystems, Inc. Java Virtual Machine Profiler Interface.
[9] Sameer Shende and Allen D. Malony. Performance Tools for Parallel Java Environments.
[10] IONA Technologies, Inc. JThreads/C++.
[11] Bill Pugh. The "Double-Checked Locking is Broken" Declaration.
[12] Richard Monson-Haefel and David Chappell. Java Message Service.
[13] Chuck Cavaness and Brian Keeton. Special Edition Using Enterprise JavaBeans 2.0.