Replicate Assets in Microsoft Fabric

In a Microsoft Fabric context, the concept of "replicating assets" differs from classic IaaS-based lift-and-shift migrations. Instead of virtual machines, replication in Fabric focuses on data, metadata, and pipelines across the Fabric platform and into OneLake.

Fabric-specific replication steps

Microsoft Fabric workloads depend on replicated data sources, semantic models, pipelines, and optionally lakehouse tables or data warehouses. The replication process consists of:

  1. Source System Capture:

    • Use Change Data Capture (CDC) from SQL Server or Azure SQL via Azure Data Factory or Fabric Dataflows Gen2.
    • Use Eventstream to ingest streaming data.
    • Use Azure Data Explorer (ADX) if time-series ingestion is required.
    • SQL Server Replication: in hybrid environments, transactional or snapshot replication can feed Fabric from SQL Server, using Azure SQL as an intermediate hop.
  2. Seeding:

    • Historical data is seeded using Copy activities in Azure Data Factory or Fabric pipelines, Dataflows Gen2, or direct copy to a Lakehouse via Spark notebooks.
    • For large-volume ingestion, consider PolyBase, Bulk Insert, or Azure Data Box for offline loads.
  3. Synchronization:

    • Keep datasets updated via:
      • Dataflows with scheduled refresh
      • Pipelines with triggers
      • SQL CDC for low-latency sync
    • Ensure schema drift handling is defined in pipelines or staging Lakehouses.
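In Fabric, the synchronization step above is handled by Dataflows, pipeline triggers, or CDC. The underlying pattern is watermark-based incremental sync: fetch only rows modified since the last run, then advance the watermark. A minimal Python sketch of that pattern, using an in-memory SQLite table with illustrative table and column names (not a Fabric API):

```python
import sqlite3

# Illustrative source table; in practice this would be SQL Server / Azure SQL.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER, amount REAL, modified_at TEXT)")
src.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, 10.0, "2024-01-01"),
    (2, 25.0, "2024-01-02"),
    (3, 40.0, "2024-01-03"),
])

def sync_incremental(conn, watermark):
    """Fetch rows modified after the last watermark, then advance it."""
    rows = conn.execute(
        "SELECT id, amount, modified_at FROM orders "
        "WHERE modified_at > ? ORDER BY modified_at",
        (watermark,),
    ).fetchall()
    new_watermark = rows[-1][2] if rows else watermark
    return rows, new_watermark

# Initial seeding: everything after the empty (epoch) watermark.
rows, wm = sync_incremental(src, "")
print(len(rows), wm)  # prints: 3 2024-01-03
```

Subsequent runs pass the stored watermark back in, so only new or changed rows are moved, which is exactly what CDC-based pipelines optimize.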

Typical replication targets in Fabric

| Source | Method | Target |
| --- | --- | --- |
| Azure SQL DB | CDC + Dataflow Gen2 | Lakehouse or Warehouse |
| SQL Server | Self-hosted IR + ADF | Lakehouse (Bronze) |
| Blob Storage | Dataflow Gen2 / Eventstream | Lakehouse (Bronze) |
| On-prem SQL | Azure Data Factory / Data Box | Lakehouse or Warehouse |
| SAP / Oracle | Azure Data Factory connectors | Lakehouse |
| REST API / SaaS | Dataflow Gen2 (API support) | Lakehouse or Warehouse |
| SQL Server (Transactional Replication) | Native SQL Replication to Azure SQL + Dataflow Gen2 | Lakehouse |

Risks and constraints in Fabric replication

  • Schema drift (the Fabric analogue of disk drift in VM replication): ongoing schema evolution in source systems can desynchronize pipelines and must be monitored continuously.
  • Replication latency and snapshot intervals: When using SQL Replication, understand the impact of snapshot agent scheduling or transactional log shipping lag, especially in systems with tight SLA constraints.
  • WAN/Networking: Consider the bandwidth to OneLake, especially with hybrid or federated environments.
  • Concurrency limits: Lakehouse write performance is influenced by concurrent ingestion and Spark pool limits.
  • Semantic replication: Reports and Datasets (Power BI) can reference replicated models, but they must be revalidated after promotion.
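The schema-drift risk above is usually mitigated with a pre-load check before writing to a staging Lakehouse. A minimal sketch of such a check, with schemas represented as plain column-to-type dicts (in Fabric the same comparison would run against pipeline or dataset metadata):

```python
def detect_schema_drift(expected: dict, observed: dict) -> dict:
    """Compare the expected target schema against the schema observed at the source."""
    added = {c: t for c, t in observed.items() if c not in expected}
    removed = {c: t for c, t in expected.items() if c not in observed}
    changed = {
        c: (expected[c], observed[c])
        for c in expected.keys() & observed.keys()
        if expected[c] != observed[c]
    }
    return {"added": added, "removed": removed, "changed": changed}

# Illustrative schemas, not tied to any real source system.
expected = {"id": "INT", "amount": "DECIMAL(10,2)", "modified_at": "DATETIME"}
observed = {"id": "INT", "amount": "FLOAT", "region": "VARCHAR(10)"}
drift = detect_schema_drift(expected, observed)
print(drift["added"])  # prints: {'region': 'VARCHAR(10)'}
```

A pipeline can fail fast on `removed` or `changed` columns and route `added` columns into a staging table for review, rather than silently breaking downstream refreshes.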

Tools and services involved

  • Microsoft Fabric Dataflows Gen2
  • OneLake Shortcuts
  • Eventstream + KQL DB
  • Azure Data Factory / Synapse Pipelines
  • Azure Data Box (for large data volume initial load)
  • Power BI REST APIs (for semantic asset rehydration)
  • SQL Server Replication (Transactional Replication or Snapshot Replication to Azure SQL or Fabric Lakehouse)

Example Promotion Model
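Fabric content is typically promoted Development → Test → Production through deployment pipelines, and the Power BI REST API exposes a `deployAll` endpoint for automating that promotion. As one possible shape, the sketch below only builds the request URL and payload (the pipeline ID is a placeholder, and authentication and the actual HTTP call are out of scope):

```python
import json

PIPELINE_ID = "00000000-0000-0000-0000-000000000000"  # placeholder GUID

def build_deploy_request(pipeline_id: str, source_stage_order: int):
    """Return (url, body) to promote content from the given stage to the next.

    Stage order 0 = Development (promotes into Test),
    stage order 1 = Test (promotes into Production)."""
    url = f"https://api.powerbi.com/v1.0/myorg/pipelines/{pipeline_id}/deployAll"
    payload = {
        "sourceStageOrder": source_stage_order,
        "options": {
            "allowCreateArtifact": True,
            "allowOverwriteArtifact": True,
        },
    }
    return url, json.dumps(payload)

url, body = build_deploy_request(PIPELINE_ID, 0)  # Dev -> Test
```

Sending this request requires an Azure AD bearer token with the appropriate Power BI scopes; validate reports and semantic models at each stage before promoting further.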


✅ Replication in Microsoft Fabric focuses on data pipelines, CDC, lakehouses, and semantic assets rather than on VMs. Promote early, handle schema drift proactively, and design pipelines for robustness.
