Stanford CS149 I 2023 I Lecture 5 - Performance Optimization I: Work Distribution and Scheduling

16 Sep 2024

Program Optimization

Work Balancing in Parallel Programming

Static Assignment

Semi-Static Assignment

Dynamic Assignment

Cost of Dynamic Allocation

Optimizing Dynamic Allocation

Task Dependencies

Parallel Programming Approaches

Quick Sort and Parallelism

Cilk Programming System

Threadpool Implementation

Scheduling Tasks in a Multi-threaded Environment

Run Child First Scheme

  • In the run child first scheme, thread zero places the remaining iterations of a loop into its work queue. Other threads can then steal these iterations, so the remaining work bounces between threads. (01:01:10)
  • When the run child first scheme is applied to a recursive algorithm such as quicksort, the tasks in the queue vary in size: smaller tasks sit at the bottom of the queue and larger tasks at the top. (01:03:06)
  • Idle threads benefit from stealing from the top of the queue for two reasons: a larger stolen task keeps the thief busy longer, so it needs to steal (and synchronize) less often, and thread zero is left to work mainly at the bottom of the queue, which promotes data locality. (01:05:15)
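
The queue behavior described in these points can be illustrated with a small simulation. This is a minimal sketch, not the Cilk runtime: it runs on a single thread with no locking, assumes a perfectly balanced pivot, and represents each task as a half-open index range. The names `WorkDeque` and `descend_child_first` are hypothetical and chosen only for illustration.

```python
from collections import deque


class WorkDeque:
    """Per-thread work queue (hypothetical): owner uses the bottom, thieves use the top."""

    def __init__(self):
        self._tasks = deque()  # left end = top, right end = bottom

    def push_bottom(self, task):
        # The owning thread adds newly created work at the bottom.
        self._tasks.append(task)

    def pop_bottom(self):
        # The owning thread takes its next task from the bottom (most recent, smallest).
        return self._tasks.pop() if self._tasks else None

    def steal_top(self):
        # An idle thread steals from the top (oldest, largest).
        return self._tasks.popleft() if self._tasks else None

    def sizes(self):
        # Task sizes listed from top to bottom.
        return [hi - lo for lo, hi in self._tasks]


def descend_child_first(queue, lo, hi, cutoff=2):
    """Simulate one thread descending a quicksort-like recursion, child first.

    At each level the thread keeps the left half (the child) and pushes the
    right half (the continuation) onto the bottom of its own queue.
    Assumption: the pivot always splits the range exactly in half.
    """
    while hi - lo > cutoff:
        mid = (lo + hi) // 2
        queue.push_bottom((mid, hi))  # continuation left for later, or for a thief
        hi = mid                      # keep running the child
    return (lo, hi)                   # the small range the thread works on now


if __name__ == "__main__":
    q = WorkDeque()
    current = descend_child_first(q, 0, 64)
    print(current)          # (0, 2)
    print(q.sizes())        # [32, 16, 8, 4, 2]
    print(q.steal_top())    # (32, 64): the thief gets the largest task
    print(q.pop_bottom())   # (2, 4): the owner continues with nearby data
```

In this sketch a thief that takes the top entry receives half of the original range in a single steal, while the owner's next task is the range adjacent to the one it just finished. A real scheduler would also need synchronization between the owner and thieves, which is omitted here.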

Work-Stealing Scheduler

Greedy Join Scheduling

Assignment and Next Meeting
