===========================================================================
CSC 373H / L0501        Lecture Summary for Week 13              Fall 2006
===========================================================================

------------------------
Approximation algorithms
------------------------

Load Balancing: See section 11.1 in text.

  - Given jobs with durations t_1, ..., t_n and machines M_1, ..., M_m,
    assign jobs to machines so that overall completion time is minimized.

  - More precisely, if A(i) = set of jobs assigned to machine M_i, then let
        T_i = SUM_{j in A(i)} t_j    ("load" of machine M_i).
    We want to minimize
        MAX_{i=1..m} T_i             ("makespan" = maximum load).

  - Greedy algorithm:

        A(1) := A(2) := ... := A(m) := {}
        T_1 := T_2 := ... := T_m := 0
        for j = 1,2,...,n:
            find i s.t. T_i is minimum
            A(i) := A(i) U {j}
            T_i := T_i + t_j

  - Let T = MAX_{i=1..m} T_i be the makespan of the greedy solution.
    Let T* be the minimum makespan. We prove lower bounds on T*.

    (11.1) T* >= 1/m SUM_{j=1..n} t_j
        because otherwise, in an optimal solution, T_i < 1/m SUM t_j for
        each i would mean SUM T_i < m/m SUM t_j, i.e., total work scheduled
        is less than total work, a contradiction.

    (11.2) T* >= MAX_{j=1..n} t_j
        because some machine must be assigned the longest job.

  - Now, consider the greedy solution and let M_i be a machine with max load
    (i.e., T = T_i). Let j be the last job scheduled on M_i. By the greedy
    property, T_i-t_j (load of M_i just before job j) was the smallest of
    all loads when job j was scheduled, i.e., for every machine M_k,
        T_k (final value) >= T_k (up to job j-1) >= T_i-t_j
    so T_1+...+T_m >= m(T_i-t_j) or equivalently,
        T_i-t_j <= 1/m SUM_{k=1..m} T_k = 1/m SUM_{k=1..n} t_k
    (since every job is scheduled on exactly one machine).
    Also, t_j <= MAX_{k=1..n} t_k (by definition of max).
    Hence,
        T = T_i = (T_i-t_j) + t_j
               <= 1/m SUM_{k=1..n} t_k + MAX_{k=1..n} t_k
               <= T* + T*        (by (11.1) and (11.2))
                = 2 T*.

  - Improved algorithm:
    First, sort jobs by duration, so t_1 >= t_2 >= ... >= t_n, then run the
    greedy algorithm on the jobs in this order.
    Now, consider the first m+1 jobs (assuming n > m; if n <= m, greedy
    puts each job on its own machine and is optimal): for any assignment,
    some machine must get at least two of them, and each one takes time
    >= t_{m+1}, so that machine's load must be at least 2 t_{m+1}.
    Hence, T* >= 2 t_{m+1}.

    In the greedy solution, let M_i have max load.
    If M_i contains only one job j, then T = t_j <= t_1 <= T* by (11.2),
    so greedy is optimal.
    If M_i contains at least two jobs, then let j = index of the last job
    on M_i. j >= m+1 (because jobs 1..m each get assigned to a different
    machine), so t_j <= t_{m+1} <= T*/2.
    As before,
        T = T_i = (T_i-t_j) + t_j
               <= 1/m SUM_{k=1..n} t_k + t_{m+1}
               <= T* + T*/2
                = 3/2 T*.
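
The two algorithms above can be sketched in Python as follows. This is only
an illustrative sketch, not code from the course or the text: the function
names (greedy_schedule, sorted_greedy_schedule, lower_bound), the use of a
heap to find a machine with minimum load, the tie-breaking rule (lowest
machine index), and the sample instance are all assumptions made for the
example. Jobs and machines are indexed from 0 here rather than from 1.

```python
import heapq


def greedy_schedule(durations, m):
    """Assign jobs in the given order, each to a currently least-loaded machine.

    Returns (assignment, loads): assignment[i] lists the job indices placed
    on machine i, and loads[i] is that machine's total load T_i.
    """
    assignment = [[] for _ in range(m)]
    loads = [0] * m
    # Heap of (load, machine index); ties go to the lowest machine index
    # (an arbitrary choice made for this sketch).
    heap = [(0, i) for i in range(m)]
    for j, t in enumerate(durations):
        load, i = heapq.heappop(heap)      # machine with minimum load
        assignment[i].append(j)
        loads[i] = load + t
        heapq.heappush(heap, (loads[i], i))
    return assignment, loads


def sorted_greedy_schedule(durations, m):
    """Improved variant: run greedy on jobs in non-increasing duration order."""
    order = sorted(range(len(durations)), key=lambda j: -durations[j])
    assignment, loads = greedy_schedule([durations[j] for j in order], m)
    # Map positions in the sorted order back to the original job indices.
    return [[order[p] for p in jobs] for jobs in assignment], loads


def lower_bound(durations, m):
    """Larger of the bounds (11.1) and (11.2) on the optimal makespan T*."""
    return max(sum(durations) / m, max(durations))


if __name__ == "__main__":
    durations = [2, 3, 4, 6, 2, 2]   # hypothetical instance
    m = 3
    _, plain = greedy_schedule(durations, m)
    _, improved = sorted_greedy_schedule(durations, m)
    print("greedy makespan:       ", max(plain))
    print("sorted greedy makespan:", max(improved))
    print("lower bound on T*:     ", lower_bound(durations, m))
```

On this instance, plain greedy ends with loads 8, 5, 6 (makespan 8), while
the sorted variant ends with loads 6, 6, 7 (makespan 7). Both are within the
proven factors of the lower bound 19/3 from (11.1), as the analysis predicts.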