Non-sequential Computing

Part 1 - parallel programming techniques

Syllabus
  1. Review of formula manipulation: sequences, sums, logarithms
  2. Elements of the complexity theory, the O notation. 
  3. Paradigm of parallel computing. The model PRAM. Performance of parallel algorithms. 
  4. Brent's Lemma, the WT Scheduling Principle. The Parallel Computation Thesis. 
  5. Parallelization: balanced trees. Algorithm analysis - the master method.
  6. Parallelization: divide and conquer. 
  7. Parallelization: partitioning. 
  8. Parallelization: accelerated cascading.
  9. Parallelization: pipelining.



Assignment: logarithms and asymptotics
Exercises will be distributed during our lessons or by email.
Assignment: Vector-matrix product
Ex. 2 at p. 11 of the coursebook


Assignment: Master method
Exercise 4.5-1 in Cormen
Solutions to Exercise 4.5-1
Assignment: Prefix Sums in C#
Implement the parallel recursive algorithm 3.1 for prefix sum in our textbook. Submit a zipped source code + executable code for .net 7.0 or higher. The application should allow to enter input data and then print out the result.
Algorithm structure for the Array packing Problem


Assignment: Prefix sum by divide-and-conquer
Exercise 3 in Section 4.3 of our coursebook
Assignment: Sample Sort
Implement the sorting method SampleSort described in Exercise 3 in Section 5.2 of our coursebook. Elements of the input array must be separated into the buckets in parallel. In C#, use the Parallel.For method, and the recommended class to implement buckets is the ConcurrentBag, https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentbag-1?view=net-7.0, which allows concurrent addition of elements from multiple threads. Then you can sort all buckets in parallel by the method ToImmutableSortedSet, Finally, use the method Concat to catenate the sorted buckets.
Assignment: Accelerated Cascading
Ex.1 in Section 7.4 of our coursebook. Start with an input array of size n=1024, fill it with random integers in range 0-1000 and print it. Then run the binary tree algorithm for ⌈log log log n⌉=2 steps (implement this formula to calculate the number of steps). Obtaining an array reduced to 256 elements, construct the doubly-logarithmic tree and apply the algorithm finding maximum in it. Note that you need also to implement the algorithm 7.1 as a method which computes constant-time maximum in each node of the tree with more than two children.
Assignment: Graphical example of simple parallel insertion into 2-3 tree
Insert, in parallel, sequence (5, 15, 25, 35, 45, 55, 65, 75, 85) into the tree at Fig. 6.4 (d) in our coursebook following the algorithm in Sec. 6.2. Draw on paper parallel balancing steps of the tree similarly as Fig. 6.4 (a) - (d). Upload the scan or photo.
Assignment: Simplified array insertion into 2-3 tree
Naprogramujte paralelní vkládání sekvence klíčů B do 2-3 stromu za zjednodušeného předpokladu, že mezi dva stávající listy stromu nikdy nepadne více než jeden z nich. K dispozici máte vzorový kód už s hotovými některými metodami. nestačil jsem ho odladit, tak se nedivte, kdyby něco nefungovalo :-). Odevzdává se (v zazipovaném archivu) zdrojový kód + výpis ukázky funkce: * výpis stromu po ukončení fáze 1 (v textové podobě - pomocí ToString) * vkládané pole B * výpis stromu po ukončení fáze 2 a celkovém vyvážení
Assignment: General array insertion into 2-3 tree
Odevzdává se zdrojový kód + výpis ukázky funkce: výpis původního stromu (v textové podobě), vkládané pole B, a výpis jednotlivých fází přidávání uzlů a postupného vyvažování. Pro vytváření/rušení vláken doporučuji použít Parallel.ForEach anebo thread pool.