There aren’t many times when I can compare bits and pieces of my own code with those of other experts. One such occasion involves active loops (a.k.a. parallel apply).
Shortly after posting on concurrency last December, I began implementing my own version of the active loops mentioned in those posts. An active loop, a newly proposed extension for a future version of Visual C++, is a loop in which some iterations may run in parallel, typically over a collection. Visual C++ already has limited support for this in the form of OpenMP.
Since that post, I have noticed several entries on blogs and in articles:
- Larry O’Brien – ParallelApply: Distribute Calculations Over Multicore / Processors
- Eric Sink – C# implementation of Map with multicore support
- Joe Duffy, Microsoft concurrency expert – MSDN article “Using Concurrency for Scalability,” with sample source code for a “Simplistic Parallel Implementation”
I recall Jeffrey Richter producing an implementation in one of his MSDN articles or in his PowerThreading library, but I cannot seem to locate the code snippet.
My own code is similar to Larry’s, except that I used BeginInvoke instead of DynamicInvoke (reflecting my preference for early binding) and an AutoResetEvent. BeginInvoke requires a matching call to EndInvoke. I’ll post my code snippet shortly, after I have tested it sufficiently.
I picked up from Joe Duffy the use of Environment.ProcessorCount to choose the number of work items to queue. Since he’s using the thread pool, it may also be worthwhile to look at ThreadPool.GetAvailableThreads().
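A rough sketch of what I mean might look like the following. The names (`ParallelLoops`, `ParallelApply`) are illustrative, not my final code; it assumes C# 2.0, where calling BeginInvoke on a delegate dispatches the invocation to the thread pool, and it partitions the input into one chunk per processor as Joe Duffy does.

```csharp
using System;
using System.Threading;

static class ParallelLoops
{
    // Hypothetical parallel apply: partition 'items' into roughly
    // Environment.ProcessorCount chunks and process each chunk via an
    // asynchronous delegate invocation on the thread pool.
    public static void ParallelApply<T>(T[] items, Action<T> action)
    {
        int chunks = Environment.ProcessorCount;
        int chunkSize = (items.Length + chunks - 1) / chunks;

        // Each invocation handles items[start .. start + chunkSize).
        Action<int> processChunk = delegate(int start)
        {
            int end = Math.Min(start + chunkSize, items.Length);
            for (int i = start; i < end; i++)
                action(items[i]);
        };

        IAsyncResult[] pending = new IAsyncResult[chunks];
        for (int c = 0; c < chunks; c++)
            pending[c] = processChunk.BeginInvoke(c * chunkSize, null, null);

        // Every BeginInvoke needs a matching EndInvoke; each EndInvoke
        // blocks until its invocation completes and rethrows any
        // exception the chunk raised.
        for (int c = 0; c < chunks; c++)
            processChunk.EndInvoke(pending[c]);
    }
}
```

Waiting via EndInvoke is what lets this version drop the explicit AutoResetEvent: the completion bookkeeping rides along with the async delegate calls.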
Although my product is multithreaded, the parallelism is limited: my code analysis runs in one thread, while the user interface runs in another. I am planning to introduce additional parallelism into my tool to take advantage of the new multicore systems available today; because I rely heavily on immutable data structures, the transition should be painless. I just purchased a Core 2 Duo E6600 Dell system with a 4MB cache to replace my aging three-year-old Pentium 4 desktop and to experiment and test with true parallelism.
My preference in multithreaded programming is to rely on immutable state, actor objects, and message passing. I try to avoid shared state and low-level threading constructs; when I can’t, I rely on CLR basics such as locks and interlocked operations rather than inventing my own.
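To illustrate what I mean by an actor object, here is a minimal, hypothetical sketch (not code from my product): a dedicated thread owns its state outright, and other threads interact with it only by posting messages into a locked mailbox.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Toy actor: _total is touched only by the actor's own thread, so it
// needs no lock; the only shared structure is the mailbox queue.
class CounterActor
{
    private readonly Queue<object> _mailbox = new Queue<object>();
    private readonly Thread _thread;
    private int _total;

    public CounterActor()
    {
        _thread = new Thread(Run);
        _thread.IsBackground = true;
        _thread.Start();
    }

    // Callers never read or write _total; they send a message instead.
    public void Post(int amount)
    {
        lock (_mailbox)
        {
            _mailbox.Enqueue(amount);
            Monitor.Pulse(_mailbox);
        }
    }

    // A null message asks the actor to stop; Join makes the final value
    // of _total safely visible to the calling thread.
    public int Stop()
    {
        lock (_mailbox)
        {
            _mailbox.Enqueue(null);
            Monitor.Pulse(_mailbox);
        }
        _thread.Join();
        return _total;
    }

    private void Run()
    {
        while (true)
        {
            object message;
            lock (_mailbox)
            {
                while (_mailbox.Count == 0)
                    Monitor.Wait(_mailbox);
                message = _mailbox.Dequeue();
            }
            if (message == null) return;  // stop request
            _total += (int)message;       // single writer, no lock needed
        }
    }
}
```

The appeal of this style is that all the locking is confined to one tiny, well-understood spot (the mailbox), while the interesting state never crosses a thread boundary.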
I have also been looking at BackgroundWorker, trying to uncover any differences between it and Control.Invoke. Unlike the Control.Invoke mechanism of .NET 1.1, the BackgroundWorker class is a new component usable outside Windows Forms applications; it relies on a new abstraction called a SynchronizationContext. One difference appears to be that a call to ReportProgress in BackgroundWorker usually, but not always, waits until the ProgressChanged event handler finishes before proceeding. Profiling my app indicated a substantial amount of time spent in this call.
A reader wrote me around February mentioning that the MSDN article on BackgroundWorker contained a mistake. The author indicated that BackgroundWorker will raise events like ProgressChanged and RunWorkerCompleted on the original calling thread. That’s probably only true for WinForms apps, which have a message pump, although one message-board topic suggests that in some cases RunWorkerCompleted will be invoked on a third thread if the SynchronizationContext is not properly initialized by WinForms. In console apps with no message pump, injecting a function call into a busy thread would be a bugfest and a security hole, which is why Thread.Suspend and Thread.Resume have been deprecated in Whidbey.
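This is easy to check with a small console experiment. The sketch below (my own test harness, not from the MSDN article) records which thread RunWorkerCompleted fires on; with no WinForms SynchronizationContext installed, I would expect it to land on a thread-pool thread rather than back on the caller.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

class BackgroundWorkerConsoleDemo
{
    // Returns true when RunWorkerCompleted fired on a different thread
    // than the caller — the expected outcome in a console app, where
    // no message pump exists to marshal the event back.
    public static bool CompletedOnDifferentThread()
    {
        int callerThreadId = Thread.CurrentThread.ManagedThreadId;
        int completedThreadId = callerThreadId;
        using (ManualResetEvent finished = new ManualResetEvent(false))
        {
            BackgroundWorker worker = new BackgroundWorker();
            worker.DoWork += delegate
            {
                // DoWork always runs on a thread-pool thread.
            };
            worker.RunWorkerCompleted += delegate
            {
                completedThreadId = Thread.CurrentThread.ManagedThreadId;
                finished.Set();
            };
            worker.RunWorkerAsync();
            finished.WaitOne();
        }
        return completedThreadId != callerThreadId;
    }

    static void Main()
    {
        Console.WriteLine("Completed on a different thread: {0}",
            CompletedOnDifferentThread());
    }
}
```

In a WinForms app the WindowsFormsSynchronizationContext would post the completion back through the message pump instead, which is exactly the distinction the reader was pointing at.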