Scheduling divisible loads with return messages on heterogeneous master-worker platforms


Olivier Beaumont, Loris Marchal, Yves Robert
Laboratoire de l'Informatique du Parallélisme, École Normale Supérieure de Lyon, France
Graal Working Group, June 2005

Outline
1. Introduction
2. Framework
3. Case study of some scenarios
4. Simulations
5. Conclusion

Introduction

Introduction
- One master, holding a large number of identical tasks
- Some workers
- Heterogeneity in computing speed and bandwidth
- Distribute the work to the workers
- Gather the results

Assumption: divisible load
- An important relaxation of the problem
- The master holds $N$ tasks
- Worker $P_i$ will get a fraction $\alpha_i N$ of these tasks
- $\alpha_i$ is rational: tasks are divisible
- Makes it possible to derive analytical solutions (tractability)
- In practice, a reasonable assumption when the number of tasks is large

Background on Divisible Load Scheduling (without return messages)
Linear cost model, for $X$ units of work:
- sent to $P_i$ in $X c_i$ time units
- computed by $P_i$ in $X w_i$ time units
Results:
- Bus network: all processors work and finish at the same time; the order does not matter; closed-form formula for the makespan (Bataineh, Hsiung & Robertazzi, 1994)
- This result extends to homogeneous trees: a subtree reduces to a single worker
- Heterogeneous star network: all processors work and finish at the same time; the order matters: largest bandwidth first (whatever the computing power)

Background on Divisible Load Scheduling (affine cost model)
With an affine cost model, for $X$ units of work:
- sent to $P_i$ in $C_i + X c_i$ time units
- computed by $P_i$ in $W_i + X w_i$ time units
- not all processors participate in the computation
- selecting the resources is hard: computing the optimal schedule on a star with the affine cost model is NP-hard (Casanova, Drozdowski, Legrand & Yang, 2005)
Affine cost model + multi-round algorithms:
- UMR, quadratic progression of the chunk size (Casanova & Yang, 2003)
- asymptotically optimal algorithm based on steady-state (Beaumont, Legrand & Robert, 2003)

Adding return messages
What we need to decide:
- the ordering of the initial data sent to the workers
- the ordering of the return messages
- the quantity of work for each worker
Related work:
- Barlas: fixed communication time, or bus network + affine computation cost; optimal ordering and closed-form formulas
- Drozdowski and Wolniewicz: consider LIFO and FIFO distributions
- Rosenberg et al.: complex (affine) communication model + possibility to slow down the computing speed; all FIFO orderings are equivalent and better than any other ordering

Framework
(Model; Linear Program for a given scenario; Counter Examples)

Model
[Figure: star network; the master $P_0$ is linked to workers $P_1, \dots, P_p$ with link costs $c_i, d_i$ and computing costs $w_i$.]
- One master $P_0$
- $p$ workers $P_1, \dots, P_p$
- Linear model: $X$ units of work are
  - sent to $P_i$ in $X c_i$ time units,
  - computed by $P_i$ in $X w_i$ time units,
  - and their result is sent back from $P_i$ to the master in $X d_i$ time units
- We sometimes make the assumption $d_i = z c_i$ (e.g. $z = 0.8$: result files are 80% of the original files)

Model (communications)
Standard model in DLS for communications:
- the master can send data to at most one worker at time $t$
- the master can receive data from at most one worker at time $t$
- a worker can start computing only once the reception from the master has terminated
No idle time in the operation of each worker?
- reasonable without return messages
Send the results to the master just after the computation?
- but maybe the communication medium is not free at that time
- general scheme: allow idle time

Linear Program for a given scenario
A scenario describes:
- which workers participate
- in which order the communications are performed (sending data and receiving results)
Then we can assume that:
- the master sends the data as soon as possible
- the workers start computing as soon as possible
- the return communications are performed as late as possible

Initial messages are sent to $P_1, P_2, \dots, P_p$ in that order; a permutation $\sigma$ gives the order of the return messages.
[Figure: timeline of worker $P_i$: receive for $\alpha_i c_i$, compute for $\alpha_i w_i$, stay idle for $x_i$, send back for $\alpha_i d_i$.]
Consider worker $P_i$; it
- starts receiving its data at $t^{\mathrm{recv}}_i = \sum_{j < i} \alpha_j c_j$
- starts its execution at $t^{\mathrm{recv}}_i + \alpha_i c_i$
- terminates its execution at $t^{\mathrm{term}}_i = t^{\mathrm{recv}}_i + \alpha_i c_i + \alpha_i w_i$
- starts sending its results at $t^{\mathrm{back}}_i = T - \sum_{j,\ \sigma(j) \ge \sigma(i)} \alpha_j d_j$
- idle time: $x_i = t^{\mathrm{back}}_i - t^{\mathrm{term}}_i \ge 0$
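As a concrete illustration, here is a minimal sketch (not from the talk) that traces these dates for a given allocation, assuming workers are indexed in the order the master sends their data and that `sigma[i]` is the rank of $P_i$'s return message:

```python
import numpy as np

def worker_dates(alpha, c, w, d, sigma, T):
    """Dates of the slides, under the 'as soon as possible /
    as late as possible' normalization of the scenario."""
    alpha, c, w, d = map(np.asarray, (alpha, c, w, d))
    p = len(c)
    # t_recv_i = sum_{j<i} alpha_j c_j
    t_recv = np.concatenate(([0.0], np.cumsum(alpha * c)[:-1]))
    t_term = t_recv + alpha * c + alpha * w
    # return message of P_i starts so that all returns ranked at or
    # after sigma(i) still fit before the makespan T
    t_back = np.array([T - sum(alpha[j] * d[j] for j in range(p)
                               if sigma[j] >= sigma[i])
                       for i in range(p)])
    idle = t_back - t_term          # the x_i of the slides
    return t_recv, t_term, t_back, idle
```

A schedule is feasible exactly when every entry of `idle` is non-negative; for instance `worker_dates([0.25, 0.25], [1, 1], [1, 1], [1, 1], sigma=[0, 1], T=1.0)` returns zero idle times.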

With a fixed makespan $T$, we get the following linear program:

Maximize $\sum_i \alpha_i$, under the constraints $\alpha_i \ge 0$ and $t^{\mathrm{back}}_i - t^{\mathrm{term}}_i \ge 0$ for all $i$. (1)

It gives the optimal throughput for a given set of resources and a given ordering of the communications. It is not possible to test all configurations: even if we force the order of the return messages to be the same as for the initial messages (FIFO), there is still an exponential number of scenarios.
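For a single scenario, program (1) is small enough to solve directly. The sketch below is an illustration of that idea, not the authors' code; it assumes the same conventions as above (workers listed in send order, `sigma` ranking the return messages, identity = FIFO):

```python
import numpy as np
from scipy.optimize import linprog

def scenario_throughput(c, w, d, sigma, T=1.0):
    """Optimal throughput of one scenario via linear program (1).
    Constraint i:  sum_{j<=i} a_j c_j + a_i w_i
                   + sum_{sigma(j)>=sigma(i)} a_j d_j  <=  T."""
    p = len(c)
    A_ub = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            if j <= i:
                A_ub[i, j] += c[j]   # initial messages up to and including P_i
            if sigma[j] >= sigma[i]:
                A_ub[i, j] += d[j]   # return messages ranked at or after P_i's
        A_ub[i, i] += w[i]           # P_i's own computation
    obj = -np.ones(p)                # maximize sum(alpha) = minimize -sum(alpha)
    res = linprog(obj, A_ub=A_ub, b_ub=np.full(p, T), bounds=[(0, None)] * p)
    return -res.fun / T

# FIFO with two identical workers (c_i = w_i = d_i = 1):
print(scenario_throughput([1, 1], [1, 1], [1, 1], sigma=[0, 1]))  # 0.5
```

The printed value $\rho = 1/2$ matches the optimal schedule of the first counter-example below.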

Counter example: not all processors always participate in the computation
Platform: $c_1 = d_1 = w_1 = 1$; $c_2 = d_2 = w_2 = 1$; $c_3 = d_3 = w_3 = 5$.
- LIFO with all three processors (best among the schedules using 3 processors): throughput $\rho = 61/135 \approx 0.45$
- FIFO with 2 processors: optimal throughput $\rho = 1/2$

Counter example: the optimal schedule may be neither LIFO nor FIFO
Platform: $c_1 = d_1 = 7$, $w_1 = 6$; $c_2 = d_2 = 8$, $w_2 = 5$; $c_3 = d_3 = 12$, $w_3 = 5$.
- optimal schedule: $\rho = 38/499 \approx 0.076$
- best FIFO schedule: $\rho = 47/632 \approx 0.074$
- best LIFO schedule: $\rho = 43/580 \approx 0.074$

Case study of some scenarios
(LIFO strategies; FIFO strategies)

LIFO strategies
LIFO = Last In First Out: the processor that receives its data first is the last one to send back its results. The ordering of the return messages is the reverse of the ordering of the initial messages.
[Figure: LIFO timelines of $P_1, \dots, P_p$.]

Theorem. In the optimal LIFO solution:
- all processors participate in the execution;
- initial messages must be sent by non-decreasing values of $c_i + d_i$;
- there is no idle time, i.e. $x_i = 0$ for all $i$.
Proof idea: modify the platform with $c_i \leftarrow c_i + d_i$ and $d_i \leftarrow 0$. [Figure: each worker's receive and send-back phases merge into a single block of length $\alpha_i (c_i + d_i)$.] This reduces the problem to a classic DLS problem without return messages.
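A minimal sketch of this reduction, assuming the characterization of the theorem (all workers participate, no idle time); it is an illustration, not the authors' code:

```python
import numpy as np

def lifo_throughput(c, d, w, T=1.0):
    """Optimal LIFO throughput on a star: set c'_i = c_i + d_i and d_i = 0,
    order workers by non-decreasing c'_i, then solve the equal-finish-time
    system  sum_{j<=i} a_j c'_j + a_i w_i = T  (classic DLS, no returns)."""
    c, d, w = map(np.asarray, (c, d, w))
    cp = c + d                               # platform transformation
    order = np.argsort(cp, kind="stable")
    cp, w = cp[order], w[order]
    p = len(cp)
    A = np.tril(np.tile(cp, (p, 1)))         # A[i, j] = c'_j for j <= i
    A = A + np.diag(w)                       # plus a_i w_i on the diagonal
    alpha = np.linalg.solve(A, np.full(p, T))
    return alpha.sum() / T, alpha, order

rho, alpha, order = lifo_throughput([1, 1, 5], [1, 1, 5], [1, 1, 5])
print(rho)   # ~0.4519 = 61/135
```

On the platform of the first counter-example this returns $\rho = 61/135$, the LIFO throughput quoted above.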

FIFO strategies
FIFO = First In First Out: the ordering of the return messages is the same as the ordering of the initial messages.
[Figure: FIFO timelines of $P_1, \dots, P_p$.]
We restrict ourselves to the case $d_i = z c_i$ with $z < 1$.

Theorem. In the optimal FIFO solution:
- initial messages must be sent by non-decreasing values of $c_i + d_i$;
- the set of participating processors consists of the first $q$ processors in this ordering, where $q$ can be determined in linear time;
- there is no idle time, i.e. $x_i = 0$ for all $i$.

Consider processor $P_i$ in the schedule: the previous initial messages take $\sum_{j < i} \alpha_j c_j$, then come $\alpha_i c_i$, $\alpha_i w_i$, the idle time $x_i$, and finally the return messages $\sum_{j \ge i} \alpha_j d_j$ (in FIFO, $\sigma$ is the identity). So for every $i$:

$$\sum_{j \le i} \alpha_j c_j + \alpha_i w_i + x_i + \sum_{j \ge i} \alpha_j d_j = T$$

In matrix form, $A\alpha + x = T\mathbf{1}$, where

$$A = \begin{pmatrix}
c_1 + w_1 + d_1 & d_2 & d_3 & \cdots & d_k \\
c_1 & c_2 + w_2 + d_2 & d_3 & \cdots & d_k \\
c_1 & c_2 & c_3 + w_3 + d_3 & \cdots & d_k \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
c_1 & c_2 & c_3 & \cdots & c_k + w_k + d_k
\end{pmatrix}$$

We can write $A = L + \mathbf{1} d^T$, with

$$L = \begin{pmatrix}
c_1 + w_1 & 0 & 0 & \cdots & 0 \\
c_1 - d_1 & c_2 + w_2 & 0 & \cdots & 0 \\
c_1 - d_1 & c_2 - d_2 & c_3 + w_3 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
c_1 - d_1 & c_2 - d_2 & c_3 - d_3 & \cdots & c_k + w_k
\end{pmatrix}
\qquad \text{and} \qquad
d = \begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_k \end{pmatrix}$$

The matrix $\mathbf{1} d^T$ has rank one, so we can use the Sherman-Morrison formula to compute the inverse of $A$:

$$A^{-1} = (L + \mathbf{1} d^T)^{-1} = L^{-1} - \frac{L^{-1} \mathbf{1}\, d^T L^{-1}}{1 + d^T L^{-1} \mathbf{1}}$$
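To see the rank-one trick in action, here is a small numeric sanity check (the $c_i$, $w_i$, $d_i$ values are random and purely illustrative, not taken from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
k = 5
c, w, d = rng.uniform(1, 10, (3, k))

ones = np.ones((k, 1))
# L: strictly lower part c_j - d_j, diagonal c_i + w_i
L = np.tril(np.tile(c - d, (k, 1)), -1) + np.diag(c + w)
# A = L + 1 d^T: the rank-one update adds d_j to every entry of column j
A = L + ones @ d.reshape(1, k)

Linv = np.linalg.inv(L)
u = Linv @ ones                 # L^{-1} 1
v = d.reshape(1, k) @ Linv      # d^T L^{-1}
Ainv_sm = Linv - (u @ v) / (1.0 + (v @ ones).item())

print(np.allclose(Ainv_sm, np.linalg.inv(A)))   # True
```

In practice the point is that $L$ is lower triangular, so $L^{-1}\mathbf{1}$ and $d^T L^{-1}$ can be obtained by simple substitution and $A^{-1}$ never has to be formed explicitly.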

Using this formula for $A^{-1}$, we are able to:
- prove that for every processor $P_i$, either $\alpha_i = 0$ (the processor does not compute at all) or $x_i = 0$ (no idle time);
- derive an analytical formula for the throughput $\rho(T) = \sum_i \alpha_i$;
- prove that the throughput is better when $c_1 \le c_2 \le \cdots \le c_n$;
- show that the throughput is best when only the processors with $d_i \le 1/\rho_{\mathrm{opt}}$ participate.
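The paper determines the prefix size $q$ in linear time; the sketch below simply tries every prefix, assuming the no-idle structure $A\alpha = T\mathbf{1}$ of the previous slides (an illustration, not the authors' algorithm):

```python
import numpy as np

def fifo_throughput(c, w, d, T=1.0):
    """Best FIFO throughput: sort by non-decreasing c_i + d_i, then for
    each prefix of q workers solve A alpha = T 1 with the FIFO matrix A
    and keep the feasible prefix with the largest sum(alpha)."""
    c, w, d = map(np.asarray, (c, w, d))
    order = np.argsort(c + d, kind="stable")
    c, w, d = c[order], w[order], d[order]
    best = 0.0
    for q in range(1, len(c) + 1):
        A = np.tril(np.tile(c[:q], (q, 1)), -1)   # c_j below the diagonal
        A = A + np.triu(np.tile(d[:q], (q, 1)), 1)  # d_j above the diagonal
        A = A + np.diag(c[:q] + w[:q] + d[:q])      # c_i + w_i + d_i on it
        alpha = np.linalg.solve(A, np.full(q, T))
        if (alpha >= 0).all():
            best = max(best, alpha.sum() / T)
    return best

# First counter-example platform: the best FIFO uses only P_1 and P_2.
print(fifo_throughput([1, 1, 5], [1, 1, 5], [1, 1, 5]))  # 0.5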

FIFO strategies: special cases
Until now, we have supposed that $d_i = z c_i$ with $z < 1$.
- If $z > 1$: symmetric solution (send the initial messages by decreasing values of $d_i + c_i$ and select the first $q$ processors in this order)
- If $z = 1$: the ordering has no importance in this case

Simulations

Simulations
- It is not possible to compute the optimal schedule in the general case: for 100 processors, $(100!)^2$ linear programs with 100 unknowns each would have to be solved.
- We use the optimal FIFO ordering as a comparison basis (OPT-FIFO).
- We also compute the optimal LIFO solution (OPT-LIFO).
- Two FIFO heuristics use all processors: workers ordered by increasing $c_i$, i.e. fastest-communicating worker first (FIFO-INC-C), and workers ordered by increasing $w_i$, i.e. fastest-computing worker first (FIFO-INC-W).

Simulation results: 100 processors, $z = 80\%$
[Plot: relative throughput of OPT-FIFO, OPT-LIFO, FIFO-INC-C and FIFO-INC-W; x-axis: ratio between the average communication and computation costs ($w/c$), from 0.1 to 100; y-axis: throughput relative to the optimal FIFO throughput.]

Number of processors used by the optimal FIFO schedule:
[Plot: number of processors actually used (0 to 100); x-axis: ratio between the average communication and computation costs ($w/c$), from 0.1 to 100.]

Simulation results: $z = 1$ (same size for initial messages and return messages)
[Plot: same relative-throughput comparison of OPT-FIFO, OPT-LIFO, FIFO-INC-C and FIFO-INC-W; x-axis: ratio between the average communication and computation costs ($w/c$), from 0.1 to 100.]

Conclusion

Conclusion
- Divisible Load Scheduling with return messages on star platforms: a natural extension of classical studies
- It leads to considerable difficulties, quite unexpected in the linear model
- The complexity of the problem is still open
- We characterized the optimal FIFO and LIFO solutions
Future work:
- investigate the general case
- extend the results to the unidirectional one-port model (the master cannot send and receive at the same time)