Layers of Parallelism in Visual Studio .NET

Layers of Parallelism
Not all programs can be highly parallel, nor should this be a goal of most software developers. At least over the next half decade, much of multicore's success will undoubtedly be in the realm of embarrassingly parallel problems, where real parallel hardware is used to attain impressive speedups. These are the kinds of problems where parallelism is inherent and easily exploitable, such as compute-intensive image manipulation, financial analysis, and AI algorithms. Because parallelism is more natural in these domains, there is often less friction in getting code correct and performing well. Race conditions and other concurrency hazards are simply easier to avoid with these kinds of programs, and, when it comes to observing a parallel speedup, the ratio of success to failure is far higher.

Other compute-intensive kernels of computation will use parallelism but will require more effort. For example, math libraries, sort routines, report generation, XML manipulation, and stream processing algorithms may all use parallelism to speed up result generation. In addition, domain-specific languages (DSLs) may arise that are inherently parallel. C#'s Language Integrated Query (LINQ) is one example of an embedded DSL within an otherwise imperative language, and MATLAB is yet another. Both are amenable to parallel execution. As libraries adopt parallelism, those programs that use them will receive some amount of scalability for free, particularly if a large portion of time is spent executing that library code. This is attractive because the parallelism is reusable in a variety of contexts.

The resulting landscape of parallelism is visualized in Figure 1.2. If you stop to think about it, this picture is not very different from what we are accustomed to seeing for sequential software. Software developers creating libraries focus on ensuring that their performance meets customer expectations, and they spend a fair bit of time on optimization and enabling future scalability. Parallelism is similar; the techniques used are different, but the primary motivating factor, that of improving performance, is shared among them.

[Figure 1.2: A taxonomy of concurrent program structure. Layers of parallelism: Parallel Applications; Domain Parallelism (libraries, DSLs, etc.); Parallel Infrastructure.]

Aside from embarrassingly parallel algorithms and libraries, some applications will still use concurrency specifically. Many of these use cases will be in representing coarse-grained independent operations as agents. In fact, many programs already are structured this way; utilizing the benefits of multicore in these cases often requires minimal restructuring, although the scalability tends to be fixed to a small number of agents and, hence, cores. Most developers of mostly sequential applications also can use
profilers (such as the one in Visual Studio) to identify CPU-bound hotspots in programs and thereby find opportunities for fine-grained parallelism.
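
To make the idea of an embarrassingly parallel problem concrete, here is a minimal sketch in Python (chosen only for illustration; the text itself is about .NET). It brightens an image one row at a time. The function names `brighten` and `brighten_image` and the tiny test image are hypothetical, not taken from any library. The point is that each row can be processed without touching any other row, so there is no shared mutable state and nothing to lock.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def brighten(row, amount=40):
    """Brighten one row of 8-bit grayscale pixels, clamping at 255."""
    return [min(255, pixel + amount) for pixel in row]


def brighten_image(image, executor: Executor):
    """Process every row independently; rows share no mutable state."""
    # Executor.map yields results in input order, so the rows can be
    # reassembled without any locking or reordering.
    return list(executor.map(brighten, image))


# Hypothetical 4x4 image; a real one would come from an image library.
image = [[0, 100, 200, 250] for _ in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    result = brighten_image(image, pool)
```

One caveat about this sketch: in CPython, threads do not speed up CPU-bound pure-Python code because of the global interpreter lock. Passing a `ProcessPoolExecutor` instead (created under an `if __name__ == "__main__":` guard) is the usual way to spread such work across cores. The structure of the computation stays the same either way, which is what makes this class of problem comparatively easy to parallelize.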
Why Not Concurrency?
Concurrency is not for everyone. The fact that a whole book has been written about concurrency alone should tell you that it's a somewhat dense topic. It is relatively easy to get started with concurrency, because creating threads, queuing work to thread pools, and the like are all very simple (and indeed automated by some commonly used programming models such as ASP.NET), but there are many subtle consequences. Concurrency is a fundamental cross-cutting property of software. Once you've got many threads actively calling into a shared data structure that you've written, for example, the number of concerns you must have considered and proactively safeguarded yourself against when writing that data structure is often daunting. Indeed, this will often only be evident after you've been programming with concurrency for a while or have read a book about it.

Here is a quick list of some examples of such problems. Chapter 2, Synchronization and Time, and later, Chapter 11, Concurrency Hazards, will provide more detail on each.
State management decisions, as noted above, often lead to synchronization. Most often this means some form of locking. Locking is difficult to get right and can have a negative impact on performance. Verifying that you've implemented some locking policy correctly tends to be vastly more difficult than typical unit-test-style verification. And getting it wrong will lead to race conditions, which are bugs that depend on intricate timing and machine architecture and are very difficult to reproduce.
Deadlock can arise when synchronization is used, leading to a program that suddenly stops making progress indefinitely. The result of this can range anywhere from annoying (e.g., a hung user interface) to disastrous (e.g., a semi-real-time system fails to respond to a
