Last updated: May 2026

Azure DatabasesIntermediateAZ-104⏱ 11 min read

Azure Synapse Analytics

Azure Synapse Analytics is a limitless analytics service that brings together data integration, enterprise data warehousing, and big data analytics. It's the evolution of Azure SQL Data Warehouse — now a unified workspace for data engineers, data scientists, and business analysts to work with petabyte-scale data using both SQL and Spark engines.

What you'll learn Synapse architecture overview · Dedicated SQL Pool (data warehouse) · Serverless SQL Pool (query-on-demand) · Apache Spark Pool · Synapse Link (HTAP) · Synapse Pipelines (data integration) · When to use Synapse vs SQL Database · Pricing model

Table of Contents

Synapse Architecture
Dedicated SQL Pool
Serverless SQL Pool
Apache Spark Pool
Synapse Link
Synapse Pipelines
Synapse vs SQL Database
Practice Questions

Synapse Architecture

Azure Synapse is a unified workspace with multiple compute engines, all connected to a common data lake (Azure Data Lake Storage Gen2):

Component	Purpose
Dedicated SQL Pool	Traditional data warehouse — provisioned MPP compute
Serverless SQL Pool	Query files in data lake on-demand with SQL
Apache Spark Pool	Big data processing, ML, data transformation with Spark
Synapse Pipelines	Data integration and orchestration (like Azure Data Factory)
Synapse Link	HTAP — query operational data in Cosmos DB/SQL without ETL
Synapse Studio	Unified web IDE for all Synapse capabilities

Dedicated SQL Pool

The traditional data warehouse component — formerly called Azure SQL Data Warehouse. A massively parallel processing (MPP) SQL engine optimised for petabyte-scale analytics.

Key Concepts

Data Warehouse Units (DWU) — The billing unit. More DWUs = more compute and memory. Scale up/down as needed.
Distributions — Data is distributed across 60 nodes. Distribution key choice is critical for performance.
Columnstore indexes — Data stored column-by-column — excellent for analytical queries scanning many rows
Pause/Resume — Pause when not needed, stop compute billing. Storage billing continues.

💡

Pause to Save Money Dedicated SQL Pools are expensive (charged per DWU per hour). Always pause them outside business hours or when not running queries. Storage continues to bill but compute (the expensive part) stops.

Serverless SQL Pool

Serverless SQL Pool allows you to query files directly in Azure Data Lake Storage (Parquet, CSV, JSON, Delta) using standard SQL — without provisioning any infrastructure. You pay per TB of data scanned.

No infrastructure to manage
Query data in place — no need to load into a database first
Create external tables as views over data lake files
Pay per TB scanned — ₹5/TB (approximate)

SQLQuery Parquet file in data lake with Serverless SQL

SELECT TOP 100 *
FROM OPENROWSET(
  BULK 'https://mystorageaccount.dfs.core.windows.net/data/sales/*.parquet',
  FORMAT = 'PARQUET'
) AS sales
WHERE sales.year = 2026;

Apache Spark Pool

Serverless Spark clusters for big data processing, machine learning, and data transformation using Python, Scala, R, or SQL. Auto-scales — only active (and billed) when running jobs.

Integrated with Azure ML for model training
Share data with Dedicated SQL Pool (via shared data lake)
Delta Lake support for ACID transactions on data lake files

Synapse Link

Synapse Link enables Hybrid Transactional Analytical Processing (HTAP) — run analytics directly on operational data without copying it or impacting the transactional database:

Synapse Link for Cosmos DB — Analytical store mirrors your Cosmos DB data; query it with SQL or Spark
Synapse Link for SQL — Replicate SQL Database/MI data to Synapse for analytics

Synapse Pipelines

Built-in data integration capability (same visual experience as Azure Data Factory). Orchestrate data movement and transformation — copy data from 90+ sources, transform with data flows or Spark, load into the data warehouse.

Synapse vs SQL Database

Factor	Synapse Dedicated SQL Pool	Azure SQL Database
Workload type	Analytics — complex queries on billions of rows	Transactional (OLTP) — many small reads/writes
Optimised for	Aggregations, scans, reporting	Point lookups, inserts, updates
Storage format	Columnar	Row-based
Scaling	DWU-based MPP	vCores/DTUs
Max data size	Petabytes	4 TB

💡

AZ-104 Exam Tip Know Synapse = data warehousing and analytics (not OLTP). Know the four main components — Dedicated SQL Pool, Serverless SQL Pool, Spark Pool, Synapse Pipelines. Know that Dedicated SQL Pool can be paused to stop compute billing. Know Synapse Link enables analytics on operational data without ETL.

📝 Practice Questions

Click an option to check your answer.

Q1. Which Synapse component allows you to query files directly in Azure Data Lake using SQL without provisioning any infrastructure?

A — Dedicated SQL Pool

B — Serverless SQL Pool

C — Apache Spark Pool

D — Synapse Pipelines

Q2. What is the billing unit for Dedicated SQL Pool (data warehouse)?

A — vCores

B — Data Warehouse Units (DWU)

C — Request Units (RU/s)

D — DTUs

Q3. A company runs analytics reports on their Cosmos DB operational data. They want to avoid impacting the production database performance. What should they use?

A — Run analytics directly on the primary Cosmos DB

B — Enable Synapse Link for Cosmos DB — analytical queries run against the analytical store

C — Export data to CSV daily and import into Synapse

D — Create a Cosmos DB read replica for analytics

Q4. How can you reduce costs for Dedicated SQL Pool when no queries are running overnight?

A — Delete and recreate the pool each morning

B — Pause the pool — compute billing stops while paused

C — Reduce DWUs to minimum during off-hours

D — Move data to cold storage tier

Q5. When should you use Azure Synapse instead of Azure SQL Database?

A — For a transactional e-commerce application processing thousands of orders per minute

B — For a data warehouse running complex aggregation queries across billions of rows

C — For a standard CRUD web application

D — For storing user session data

← PreviousAzure Cache for Redis Next →Azure Data Factory

Comments

Disclaimer: RedKite Cloud is an independent educational resource and is not affiliated with Microsoft Corporation.