The .NET Framework 4.0 release introduced PLINQ, an engine for parallel execution of LINQ queries. On multi-core machines, this technology can considerably speed up LINQ to Objects queries, because it implicitly partitions the query data source into segments, each of which is processed by a separate worker thread (and thus, where possible, by a separate processor core). To prepare a data source for parallel querying, you only need to call the AsParallel extension method. For example, in the following code we get discontinued products (see the CRM Demo sample database) from the database, and process the resulting list in parallel:
Notes on this sample:
- PLINQ is enabled for a local collection here, though the collection itself is retrieved via LinqConnect. We mention this to stress that the PLINQ-related part of the sample is not LinqConnect-specific, and you could use any other data source as well.
- XMasDiscountedPrice is a lightweight method, whereas the advantages of PLINQ show best with more expensive functions or with large data sources (and the data source here is not large either). Distributing work between threads and processors is not free, and the overhead it causes can negate the gain from parallelizing. We nevertheless use a simple method here to keep the sample clear.
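The pattern these notes describe can be sketched without a database. The code below is a minimal illustration only: the Product class, the sample data, and the body of XMasDiscountedPrice (a flat 10% discount) are assumptions made for the example, and an in-memory list stands in for the collection that would be retrieved via LinqConnect.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Product entity of the CRM Demo model.
public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public decimal Price { get; set; }
    public bool Discontinued { get; set; }
}

public static class PlinqLocalSample
{
    // Assumed implementation: a flat 10% discount.
    // A function this cheap will rarely benefit from parallel execution.
    public static decimal XMasDiscountedPrice(Product product)
    {
        return product.Price * 0.9m;
    }

    // Processes an already materialized list in parallel.
    public static List<decimal> DiscountedPrices(List<Product> discontinued)
    {
        return discontinued
            .AsParallel()   // switch from LINQ to Objects to PLINQ
            .AsOrdered()    // keep the source order in the output
            .Select(p => XMasDiscountedPrice(p))
            .ToList();
    }

    public static void Main()
    {
        // In a real application this list would come from a LinqConnect query.
        var discontinued = new List<Product>
        {
            new Product { ProductId = 1, ProductName = "Sample A", Price = 100m, Discontinued = true },
            new Product { ProductId = 2, ProductName = "Sample B", Price = 250m, Discontinued = true }
        };

        foreach (decimal price in DiscountedPrices(discontinued))
        {
            Console.WriteLine(price);
        }
    }
}
```

AsOrdered is used here only so that the output order matches the input; if the order does not matter, it can be dropped to avoid the extra cost of order preservation.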
LinqConnect supports the PLINQ engine, meaning that you can call the AsParallel method on LinqConnect queries so that the resulting collections are processed in parallel. For example, the above sample can be rewritten as follows:
Keep the following in mind when using PLINQ with LinqConnect queries:
- Parallel execution is done on the client side only; it does not affect the SQL commands executed on the server in any way.
- As a corollary of the first point, it only makes sense to use PLINQ if you actually process the query result in your application. If you just need to materialize a query result set, PLINQ can hardly give you any performance gain. PLINQ does make sense, for example, when the processing function is expensive (see the above note on the XMasDiscountedPrice method) or when you join results from two heterogeneous data sources (such as queries to different DataContext instances).
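The client-side join mentioned in the last point could look roughly like the sketch below. It is an illustration under assumptions: the two key/value collections are hypothetical stand-ins for result sets already fetched through two different DataContext instances, and the names JoinOnClient, companies, and orders are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqJoinSample
{
    // Joins two already materialized collections on the client using PLINQ.
    // companies: (CompanyId, CompanyName) pairs; orders: (CompanyId, Quantity) pairs.
    // Both shapes are hypothetical.
    public static List<string> JoinOnClient(
        IEnumerable<KeyValuePair<int, string>> companies,
        IEnumerable<KeyValuePair<int, int>> orders)
    {
        return companies.AsParallel()
            .Join(
                orders.AsParallel(),   // the inner sequence must be a ParallelQuery as well
                c => c.Key,            // outer key: CompanyId
                o => o.Key,            // inner key: CompanyId
                (c, o) => string.Format("{0}: {1}", c.Value, o.Value))
            .OrderBy(s => s, StringComparer.Ordinal)   // make the output order deterministic
            .ToList();
    }

    public static void Main()
    {
        var companies = new List<KeyValuePair<int, string>>
        {
            new KeyValuePair<int, string>(1, "Acme"),
            new KeyValuePair<int, string>(2, "Globex")
        };
        var orders = new List<KeyValuePair<int, int>>
        {
            new KeyValuePair<int, int>(1, 5),
            new KeyValuePair<int, int>(2, 3),
            new KeyValuePair<int, int>(1, 7)
        };

        foreach (string line in JoinOnClient(companies, orders))
        {
            Console.WriteLine(line);
        }
    }
}
```

Note that both sides of the join are passed through AsParallel: the PLINQ Join overload that takes a plain IEnumerable as the inner sequence is marked obsolete and throws NotSupportedException.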
The AsParallel method should only be invoked on 'final' queries: after you call it, all subsequent extension methods and LINQ expressions are treated as LINQ to Objects (PLINQ) operations executed on the client, and LinqConnect no longer translates them to SQL. To show what this means, let us apply AsParallel to the Products table in the previous sample:
As you can see, the generated SQL statement has no WHERE clause, i.e., all products are fetched to the client, and only then is the condition 'Discontinued == 1' checked. The reason is that parallel execution is enabled directly on the Products table, which makes context.Products.AsParallel() a local collection subject to LINQ to Objects processing.
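The effect of where AsParallel is placed can also be seen at the type level, without any database. In the sketch below, an in-memory list wrapped with AsQueryable stands in for context.Products, and the Product class and method names are assumptions made for the example. No SQL is generated here; the code only shows which operators the compiler binds to before and after AsParallel.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Product entity.
public class Product
{
    public string ProductName { get; set; }
    public bool Discontinued { get; set; }
}

public static class AsParallelPlacementSample
{
    // Where binds to Queryable.Where: the predicate is an expression tree
    // that a LINQ provider can translate to SQL. AsParallel comes last.
    public static ParallelQuery<Product> FilterThenParallelize(IQueryable<Product> products)
    {
        return products.Where(p => p.Discontinued).AsParallel();
    }

    // Where binds to ParallelEnumerable.Where: the predicate is a compiled
    // delegate, so the whole source is enumerated and filtered on the client.
    public static ParallelQuery<Product> ParallelizeThenFilter(IQueryable<Product> products)
    {
        return products.AsParallel().Where(p => p.Discontinued);
    }

    public static void Main()
    {
        IQueryable<Product> products = new List<Product>
        {
            new Product { ProductName = "Sample A", Discontinued = true },
            new Product { ProductName = "Sample B", Discontinued = false }
        }.AsQueryable();

        // Before AsParallel the filtered query is still an IQueryable.
        object filtered = products.Where(p => p.Discontinued);
        Console.WriteLine(filtered is IQueryable<Product>);

        // After AsParallel it is a ParallelQuery, which is not an IQueryable.
        object parallelFiltered = ParallelizeThenFilter(products);
        Console.WriteLine(parallelFiltered is IQueryable<Product>);

        // Both variants return the same rows; they differ in where the filter runs.
        Console.WriteLine(FilterThenParallelize(products).Count());
        Console.WriteLine(ParallelizeThenFilter(products).Count());
    }
}
```

With a real provider behind the IQueryable, the first variant lets the filter reach the server, while the second fetches every row before filtering.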