Batch vs Bulk (Loader) Operations in ADO.NET

Understanding Batch vs Bulk (Loader) operations in ADO.NET is key to efficient .NET scalability. The numbers explain why: today, 64% of organizations manage over a petabyte of data, and 41% exceed 500 petabytes. At this scale, performance is driven more by how data is moved than by raw processing power.

To keep pace, especially in PostgreSQL-based systems, developers need to move beyond traditional ADO.NET, which executes one command per operation. That design multiplies round-trips, parsing, and transaction overhead, turning the network into the main performance bottleneck.

The best path forward is to pair ADO.NET's familiar command structure with dotConnect’s optimized batching and bulk loading tools. This article explores how these techniques work, compares their performance, and outlines when to use each for consistent speed and scalability in PostgreSQL environments.

How data modification works in ADO.NET

Every database-driven .NET application relies on ADO.NET to send commands between the client and the server. By default, each SQL statement executed through PgSqlCommand is sent as an individual request.

While this approach ensures reliability and clear control flow, it also introduces significant latency when processing large datasets. Each ExecuteNonQuery or ExecuteReader call involves its own network trip and server-side parsing, making per-row operations expensive at scale.
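
For context, the per-row pattern that creates this overhead looks roughly like the sketch below. It is illustrative only; the Product class and the products table mirror the examples used later in this article.

public static void InsertProductsOneByOne(PgSqlConnection connection, List<Product> products)
{
    // Baseline anti-pattern: every iteration is a separate network round-trip
    // and a separate server-side parse/execute cycle.
    foreach (var product in products)
    {
        using (var command = new PgSqlCommand(
            "INSERT INTO products (product_id, name, price) VALUES (:product_id, :name, :price)",
            connection))
        {
            command.Parameters.AddWithValue("product_id", product.ProductId);
            command.Parameters.AddWithValue("name", product.Name);
            command.Parameters.AddWithValue("price", product.Price);
            command.ExecuteNonQuery(); // one round-trip per row
        }
    }
}

With 1,000 rows, this issues 1,000 separate requests; the techniques below collapse them into far fewer.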

To mitigate this, developers often use transactions and batching:

  • Transactions group multiple commands into a single unit of work (PgSqlTransaction), reducing commit overhead by performing a single COMMIT and ensuring atomicity. However, unless you batch commands together, each statement still incurs its own round-trip.
  • Command batching (array binding) sends multiple parameter sets in one request, reducing round-trips and latency. All dotConnect providers support this pattern. In PostgreSQL, ExecuteArray() handles the batch in a single call and delivers high efficiency for workloads with many similar operations.
  • Bulk loading goes a step further. For example, using PgSqlLoader, dotConnect streams tabular data directly into PostgreSQL through the COPY protocol. This approach minimizes client-server communication and achieves the highest possible throughput.

Now that we've covered the basics, let's dive deeper, starting with batching.

Batch operations (array binding via ExecuteArray)

Batching is the first major step toward high-performance data modification in ADO.NET. Instead of executing one command per record, you can bind arrays of parameters to a single SQL statement and send them in one request. This minimizes round-trips and reduces parsing overhead, while still running fully parameterized SQL.

The advantage of this approach is that it scales without breaking transactional or business logic boundaries. You can still rely on triggers, constraints, and RETURNING clauses, but with far fewer client-server exchanges.

Batch operations are most effective in scenarios such as:

  • Medium to large batches (hundreds to hundreds of thousands of rows).
  • Per-row logic or triggers that require fine-grained control.
  • Workloads that require a balance between performance, consistency, and error handling.

Examples of batch operations in C#

The following code snippets show how to apply ExecuteArray() for batch inserts, updates, and deletes in PostgreSQL using dotConnect for ADO.NET.
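
The snippets assume a simple Product class along the following lines. It is not part of the original listings, so treat it as an illustrative assumption:

// Minimal data class assumed by the batch and bulk examples below.
public class Product
{
    public int ProductId { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}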

Batch insert

public static int BatchInsertProducts(PgSqlConnection connection, List<Product> products) 
{ 
    using (var command = new PgSqlCommand( 
        "INSERT INTO products (product_id, name, price) VALUES (:product_id, :name, :price)", 
        connection)) 
    { 
        command.Parameters.Add("product_id", PgSqlType.Int, 4, "product_id"); 
        command.Parameters.Add("name", PgSqlType.VarChar, 255, "name"); 
        command.Parameters.Add("price", PgSqlType.Money, 0, "price"); 
 
        // Fill parameter values with arrays 
        command.Parameters["product_id"].Value = products.Select(p => p.ProductId).ToArray(); 
        command.Parameters["name"].Value = products.Select(p => p.Name).ToArray(); 
        command.Parameters["price"].Value = products.Select(p => p.Price).ToArray(); 
 
        return command.ExecuteArray(products.Count); 
    } 
} 

Explanation: This code inserts multiple product records in a single round trip. The key point is the array binding: instead of sending one statement per row, all rows share one command definition.

Batch update

public static int BatchUpdateProducts(PgSqlConnection connection, List<Product> products) 
{ 
    using (var command = new PgSqlCommand( 
        "UPDATE products SET name = :name, price = :price WHERE product_id = :product_id", 
        connection)) 
    { 
        command.Parameters.Add("name", PgSqlType.VarChar, 255, "name"); 
        command.Parameters.Add("price", PgSqlType.Money, 0, "price"); 
        command.Parameters.Add("product_id", PgSqlType.Int, 4, "product_id"); 
 
        // Fill parameter values with arrays (with updated values) 
        command.Parameters["name"].Value = products.Select(p => p.Name + " Updated").ToArray(); 
        command.Parameters["price"].Value = products.Select(p => p.Price * 1.1m).ToArray(); 
        command.Parameters["product_id"].Value = products.Select(p => p.ProductId).ToArray(); 
 
        return command.ExecuteArray(products.Count); 
    } 
} 

Explanation: Here, each update statement is parameterized but executed as part of the same batch, minimizing command overhead and network latency.

Batch delete

public static int BatchDeleteProducts(PgSqlConnection connection, int count) 
{ 
    using (var adapter = new PgSqlDataAdapter()) 
    { 
        adapter.DeleteCommand = new PgSqlCommand("DELETE FROM products " + 
            "WHERE :product_id = product_id", 
            connection); 
        adapter.DeleteCommand.Parameters.Add("product_id", PgSqlType.Int, 4, "product_id"); 
        adapter.DeleteCommand.UpdatedRowSource = UpdateRowSource.None; 
 
        // Set parameter array values 
        adapter.DeleteCommand.Parameters["product_id"].Value = Enumerable.Range(1, count).ToArray(); 
 
        return adapter.DeleteCommand.ExecuteArray(count); 
    } 
} 

Explanation: This uses PgSqlDataAdapter.DeleteCommand with ExecuteArray() to delete many rows in one call; an array of IDs is bound to the product_id parameter to minimize client-server chatter.

Implementation notes:

  • Use transactions for atomicity and fewer commits (see the sketch after this list).
  • Tune batch size (1k–10k rows) for best performance vs. memory use.
  • Ensure all parameter arrays match in length for ExecuteArray().
  • Monitor execution plans for unexpected rewrites or locks.
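
To make the first note concrete, here is a minimal sketch that wraps the insert batch in a PgSqlTransaction so that all rows commit or roll back as one unit. It reuses the parameter-binding pattern from BatchInsertProducts; assigning command.Transaction explicitly is shown as a precaution rather than a documented requirement, so check the provider behavior for your version.

public static int BatchInsertProductsAtomic(PgSqlConnection connection, List<Product> products)
{
    using (PgSqlTransaction transaction = connection.BeginTransaction())
    using (var command = new PgSqlCommand(
        "INSERT INTO products (product_id, name, price) VALUES (:product_id, :name, :price)",
        connection))
    {
        command.Transaction = transaction;

        command.Parameters.Add("product_id", PgSqlType.Int, 4, "product_id");
        command.Parameters.Add("name", PgSqlType.VarChar, 255, "name");
        command.Parameters.Add("price", PgSqlType.Money, 0, "price");

        command.Parameters["product_id"].Value = products.Select(p => p.ProductId).ToArray();
        command.Parameters["name"].Value = products.Select(p => p.Name).ToArray();
        command.Parameters["price"].Value = products.Select(p => p.Price).ToArray();

        try
        {
            int affected = command.ExecuteArray(products.Count); // one request for all rows
            transaction.Commit();    // single COMMIT for the whole batch
            return affected;
        }
        catch
        {
            transaction.Rollback();  // the batch fails or succeeds as one unit
            throw;
        }
    }
}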

Once batching takes you part of the way, bulk loading takes performance even further.

Bulk (Loader) operations with PgSqlLoader

Batching improves efficiency by grouping parameterized commands. But when datasets reach millions of rows, even batching becomes limited by protocol overhead. That's where Bulk operations come in.

PgSqlLoader in dotConnect for PostgreSQL streams data directly to the server using PostgreSQL's high-speed COPY mechanism. This approach bypasses per-command execution, eliminates most round-trips, and delivers the best possible throughput for heavy ingestion, ETL, or analytics workloads.

Bulk operations are most effective when:

  • Handling very large datasets or ingestion-heavy jobs.
  • Running scheduled imports, analytics pipelines, or data migrations.
  • Applying set-based DML after staging data in temporary tables.

Examples of bulk operations in C#

The following examples demonstrate how to use PgSqlLoader for high-volume inserts, updates, and deletes while keeping operations fully set-based.

Helper: Convert to DataTable

private static DataTable ToDataTable(List<Product> products, bool forUpdate = false) 
{ 
    var dataTable = new DataTable(); 
    dataTable.Columns.Add("product_id", typeof(int)); 
    dataTable.Columns.Add("name", typeof(string)); 
    dataTable.Columns.Add("price", typeof(decimal)); 
 
    foreach (var product in products) 
    { 
        if (forUpdate) 
            dataTable.Rows.Add(product.ProductId, product.Name + " Updated", product.Price * 1.1m); 
        else 
            dataTable.Rows.Add(product.ProductId, product.Name, product.Price); 
    } 
 
    return dataTable; 
}  

Explanation: This creates a tabular in-memory buffer that aligns exactly with the target schema and allows PgSqlLoader to stream rows efficiently.

Bulk insert

public static int BulkInsertProducts(PgSqlConnection connection, List<Product> products) 
{ 
    using (var loader = new PgSqlLoader("products", connection)) 
    { 
        var dataTable = ToDataTable(products); 
        loader.LoadTable(dataTable); 
    } 
    return products.Count; 
} 

Explanation: The code streams all product rows in a single operation using PostgreSQL's COPY protocol and removes per-row command overhead.

Bulk update (using a temporary table)

public static int BulkUpdateProducts(PgSqlConnection connection, List<Product> products) 
{ 
    using (var command = new PgSqlCommand("CREATE TEMP TABLE temp_products (product_id INT, name VARCHAR(255), price DECIMAL(10,2))", connection)) 
    { 
        command.ExecuteNonQuery(); 
    } 
 
    using (var loader = new PgSqlLoader("temp_products", connection)) 
    { 
        var dataTable = ToDataTable(products, forUpdate: true); 
        loader.LoadTable(dataTable); 
    } 
 
    using (var command = new PgSqlCommand("UPDATE products p SET name = t.name, price = t.price FROM temp_products t WHERE p.product_id = t.product_id", connection)) 
    { 
        var updatedRows = command.ExecuteNonQuery(); 
        using (var dropCmd = new PgSqlCommand("DROP TABLE temp_products", connection)) 
        { 
            dropCmd.ExecuteNonQuery(); 
        } 
        return updatedRows; 
    } 
}  

Explanation: This example loads new data into a temporary table, then performs a set-based UPDATE on the main table. This pattern is efficient for modifying large volumes while preserving consistency.

Bulk delete

public static int BulkDeleteProducts(PgSqlConnection connection, int count) 
{ 
    using (var command = new PgSqlCommand("DELETE FROM products WHERE product_id <= :count", connection)) 
    { 
        command.Parameters.AddWithValue("count", count); 
        return command.ExecuteNonQuery(); 
    } 
}  

Explanation: This code snippet performs a large delete operation with a single SQL predicate, which keeps the command count low and requires only one round trip.
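
When the rows to delete cannot be expressed as a single range predicate, the staging pattern from the bulk update applies equally well: stream an arbitrary list of IDs into a temporary table and issue one set-based DELETE. The following is a sketch under the same assumptions as the examples above; the temp_delete_ids table name is illustrative.

public static int BulkDeleteProductsById(PgSqlConnection connection, List<int> productIds)
{
    // Stage the IDs to delete in a temporary table.
    using (var command = new PgSqlCommand("CREATE TEMP TABLE temp_delete_ids (product_id INT)", connection))
    {
        command.ExecuteNonQuery();
    }

    var dataTable = new DataTable();
    dataTable.Columns.Add("product_id", typeof(int));
    foreach (var id in productIds)
        dataTable.Rows.Add(id);

    using (var loader = new PgSqlLoader("temp_delete_ids", connection))
    {
        loader.LoadTable(dataTable); // COPY-based streaming, as in the bulk insert example
    }

    // One set-based DELETE against the staged IDs.
    using (var command = new PgSqlCommand(
        "DELETE FROM products p USING temp_delete_ids t WHERE p.product_id = t.product_id", connection))
    {
        var deletedRows = command.ExecuteNonQuery();
        using (var dropCmd = new PgSqlCommand("DROP TABLE temp_delete_ids", connection))
        {
            dropCmd.ExecuteNonQuery();
        }
        return deletedRows;
    }
}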

Implementation notes:

  • Use temporary or staging tables for updates or merges to simplify set-based logic.
  • Keep the loader and target table schemas aligned; mismatches will fail fast.
  • For large imports, disable or defer non-essential indexes and triggers, then rebuild them after loading.

With both methods in focus, the next question is: how do they actually compare in practice?

Batch vs bulk: performance comparison

To see how batch and bulk operations perform in real scenarios, we benchmarked both approaches on 1,000 product records using dotConnect for PostgreSQL. The tests compared ExecuteArray (batch) and PgSqlLoader (bulk) across inserts, updates, and deletes.
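
The harness itself is not reproduced here, but the measurement boils down to a Stopwatch around each method, roughly as in the sketch below (using the methods defined earlier; absolute numbers will differ with hardware, network, and data).

public static void MeasureInserts(PgSqlConnection connection, List<Product> products)
{
    // Illustrative timing only; reset the table (e.g., TRUNCATE) between runs
    // so both approaches start from the same state.
    var stopwatch = System.Diagnostics.Stopwatch.StartNew();
    BatchInsertProducts(connection, products);
    stopwatch.Stop();
    Console.WriteLine($"Batch INSERT: {stopwatch.ElapsedMilliseconds} ms total, " +
        $"{(double)stopwatch.ElapsedMilliseconds / products.Count:F2} ms/record");

    stopwatch.Restart();
    BulkInsertProducts(connection, products);
    stopwatch.Stop();
    Console.WriteLine($"Bulk INSERT: {stopwatch.ElapsedMilliseconds} ms total, " +
        $"{(double)stopwatch.ElapsedMilliseconds / products.Count:F2} ms/record");
}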

Results of batch and bulk operations

Screenshot 1. Console output that displays benchmark results for batch and bulk operations (1,000 products).

Operation type | Mode  | Total time (ms) | Avg/record (ms) | Performance gain (vs batch)
INSERT         | Batch | 420             | 0.42            | baseline
INSERT         | Bulk  | 52              | 0.05            | ≈ 8× faster
UPDATE         | Batch | 254             | 0.25            | baseline
UPDATE         | Bulk  | 43              | 0.05            | ≈ 6× faster
DELETE         | Batch | 218             | 0.22            | baseline
DELETE         | Bulk  | 24              | 0.02            | ≈ 9× faster

Summary insights:

  • Bulk operations outperform batch in every CRUD category, achieving 6×–9× faster execution.
  • The performance gain comes from fewer round-trips and PostgreSQL's COPY-based streaming in PgSqlLoader.
  • Batching still provides strong improvements over one-row-at-a-time execution and offers a practical middle ground when control and transactional granularity matter more than absolute throughput.

After benchmarking both approaches, the next step is to decide which one fits your workload.

Choose between batch and bulk

The choice depends less on raw speed and more on how your data operations behave at scale.

Batch operations work best for:

  • Moderate data volumes, typically tens of thousands of rows.
  • Row-level logic that relies on triggers, predicates, or RETURNING clauses.
  • Controlled transactions where error isolation and consistency matter most.

Bulk operations become a better option for:

  • Large-scale loads that reach hundreds of thousands or millions of rows.
  • ETL pipelines, imports, or analytics tasks that can run set-based.
  • High-throughput scenarios where minimizing client-server interaction is critical.

Key trade-offs to keep in mind:

  • Error handling: Batch isolates issues per parameter set, while bulk may fail an entire stream; staging tables help recover safely.
  • Transaction scope: Both support transactions, though bulk loads often hold locks longer due to data volume.
  • Memory usage: Batch stores parameter arrays; bulk buffers tables in memory. Adjust batch sizes and buffers to prevent pressure on resources.

Ultimately, batch favors precision and transactional control, while bulk favors scale and throughput. The most efficient solution is the one that aligns with your workload pattern, not just the fastest benchmark result.

Best practices for high-performance data operations

High-throughput database work depends on more than batching or bulk loading. Connection handling, transaction strategy, and tuning choices can all make or break performance. The following practices help ensure consistent, scalable results in production environments:

  1. Pool and reuse connections: Keep PgSqlConnection lifetimes scoped but reused through connection pooling to avoid the overhead of frequent opens and closes.
  2. Wrap operations in transactions: Use one transaction per batch or bulk unit to reduce commit costs and guarantee atomicity.
  3. Eliminate unnecessary round trips: Prefer ExecuteArray() or PgSqlLoader over row-by-row execution to minimize latency.
  4. Right-size batches and buffers: Start with 1,000–5,000 rows per batch and adjust based on schema complexity, triggers, and hardware performance.
  5. Utilize PostgreSQL COPY optimizations: Align column order and data types, and avoid conversions inside tight loops for maximum throughput.
  6. Apply a smart index strategy: For large loads, drop secondary indexes before loading and recreate them afterward, or stage data into an unindexed staging table and merge it later (see the sketch after this list).
  7. Set appropriate timeouts and retries: Use reasonable CommandTimeout values, monitor for lock waits and deadlocks, and design bulk jobs to be idempotent for safe retries.
  8. Measure end-to-end: Track CPU, I/O, WAL generation, and client throughput, and always validate with production-like data to ensure real-world performance.
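
For note 6, the index strategy reduces to a drop-load-recreate sequence. A minimal sketch follows; the index name and definition are purely illustrative.

public static void LoadWithIndexRebuild(PgSqlConnection connection, List<Product> products)
{
    // Drop a secondary index so the COPY-based load avoids index maintenance.
    using (var drop = new PgSqlCommand("DROP INDEX IF EXISTS idx_products_name", connection))
        drop.ExecuteNonQuery();

    BulkInsertProducts(connection, products);

    // Recreate the index once the data is in place.
    using (var create = new PgSqlCommand("CREATE INDEX idx_products_name ON products (name)", connection))
        create.ExecuteNonQuery();
}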

Conclusion

Batch and bulk operations serve different points in the performance journey. Batch execution with ExecuteArray() offers clarity and steady throughput for transactional workloads, while PgSqlLoader shines when large imports or analytical pipelines need raw speed.

In many environments, teams naturally lean on batch operations first because they keep transactional logic clean and predictable. As data volume increases, bulk loading tends to take the lead where scale and fast ingestion matter most.

Both methods benefit from real-world testing. Observing how they behave under your workload, not assumptions, usually reveals the setup that fits best.
