
Data insertion strategies

Introduction

When synchronizing data from your source platforms to your data warehouse, a fundamental question arises: how should new data interact with existing data?

The answer lies in choosing the appropriate insertion strategy. This choice determines whether incoming data is added to, replaces, or updates existing data, and it directly affects data quality, storage costs, and analytical capabilities.

The three insertion strategies Quanti uses:

  • INSERT: append new data without deleting anything.

Use case: when you want to keep a full audit trail of every record as ingested, and duplicates are acceptable or deduplicated later.

Advantages: simple, append-only; minimal risk of accidental data loss.

Limitations: can lead to duplicate rows, larger storage needs, and more complex downstream deduplication logic.
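The append behavior and the downstream deduplication it may require can be sketched as follows. This is a minimal illustration, not Quanti's implementation: the in-memory list standing in for a warehouse table, the helper names `append_rows` and `deduplicate_latest`, and the `order_id` key are all hypothetical.

```python
# Hypothetical in-memory "warehouse table": a plain list of row dicts.

def append_rows(table, new_rows):
    """Append every incoming row; rows already in the table are never touched."""
    table.extend(dict(row) for row in new_rows)
    return table


def deduplicate_latest(table, key):
    """Downstream deduplication: keep only the last-ingested row per key."""
    latest = {}
    for row in table:
        latest[row[key]] = row  # later rows overwrite earlier ones
    return list(latest.values())


table = []
append_rows(table, [{"order_id": 1, "amount": 100}, {"order_id": 2, "amount": 50}])
# A later sync re-sends order 1 with a corrected amount.
append_rows(table, [{"order_id": 1, "amount": 120}])

# The table now holds both versions of order 1 (the audit trail).
deduped = deduplicate_latest(table, "order_id")
```

The raw table keeps every ingested version, which is what makes the audit trail possible, while analytical queries would typically read from the deduplicated view.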


Quanti automatically selects the most appropriate insertion method for each table in its connectors. The preferred method is UPSERT mode, which offers the best balance between preserving existing data and applying updates. However, depending on the nature of the data and business requirements, the INSERT or REPLACE method may be more suitable.
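To make the difference between the three methods concrete, here is a rough sketch that applies the same incoming batch to the same existing table under each one. It rests on assumptions: the function names and the `id` key are hypothetical, and REPLACE is modeled as overwriting the whole table, whereas the actual scope in a given connector may be narrower.

```python
# Illustrative only: tables are lists of row dicts, not real warehouse tables.

def insert(table, batch):
    """INSERT: add the batch to the existing rows, keeping everything."""
    return table + batch


def replace(table, batch):
    """REPLACE (assumed full-table scope): discard existing rows, keep the batch."""
    return list(batch)


def upsert(table, batch, key):
    """UPSERT: update rows whose key already exists, insert the others."""
    merged = {row[key]: row for row in table}
    for row in batch:
        merged[row[key]] = row
    return list(merged.values())


existing = [{"id": 1, "clicks": 10}, {"id": 2, "clicks": 5}]
batch = [{"id": 2, "clicks": 7}, {"id": 3, "clicks": 1}]

after_insert = insert(existing, batch)        # id 2 now appears twice
after_replace = replace(existing, batch)      # id 1 is gone
after_upsert = upsert(existing, batch, "id")  # id 2 updated, id 3 added
```

In this sketch only the upsert result both keeps the untouched row (`id` 1) and reflects the correction to `id` 2, which matches the balance described above.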


⚠️ Critical consideration: Performance impact on dimension tables

Why this matters

Understanding insertion strategies is crucial to:

  • Comprehend the behavior of your data over time

  • Anticipate how updates and corrections are handled

  • Identify potential risks (duplicates, data loss, performance issues)

  • Make informed decisions when configuring custom connectors

  • Troubleshoot data inconsistencies effectively

What you'll learn

This section provides a detailed explanation of each insertion method, including:

  • How each method works technically

  • When to use each method

  • Advantages and limitations

  • Concrete examples with fact and dimension tables

  • Impact on data warehouse performance and costs
