
John Savill vs Scott Duffy -- Azure Storage Value-Add Analysis

Purpose

Identify what unique value John Savill's Azure Master Class v3 Part 5 (Storage) adds on top of the Scott Duffy notes already written in 02-storage/scott-duffy/.


Summary Verdict

John Savill goes deeper on architecture and internals while Scott Duffy is more exam-focused and procedural. There is significant overlap on fundamentals (redundancy, tiers, access control, encryption), but Savill covers several topics Scott Duffy never touches and explains the "why" behind Azure Storage design decisions at an engineering level.


Unique Topics John Savill Covers (Not in Scott Duffy)

1. Azure Storage Internal Architecture (Three-Tier Model)

  • Stream Layer -- actual bits on disk, distribution, replication across servers in a cluster
  • Partition Layer -- understands blobs, tables, queues; provides scalable namespace
  • Front End Layer -- stateless, receives API requests, authenticates, routes to partition layer
  • Scott Duffy never explains how Azure Storage works internally. This is valuable context for understanding why redundancy/availability work the way they do.

2. Storage Stamps (Clusters)

  • Azure uses software-defined storage clusters (not traditional SANs/NAS)
  • Premium offerings use different storage stamps with different disk types
  • Multiple racks, fault domains, redundant networking/power per stamp
  • The only exception is Azure NetApp Files (actual NetApp filers in Azure DCs)

3. Ephemeral vs Durable Storage

  • Explicit distinction between temporary (cache, page files, temp disk) and persistent storage
  • Scott Duffy jumps straight into storage accounts without this foundational framing

4. Data Types (Unstructured / Structured / Semi-Structured)

  • Detailed breakdown with examples: when to use each, how they map to Azure services
  • Schema concepts, normalization, foreign keys for structured
  • Self-describing nature of JSON/XML for semi-structured
  • Scott Duffy assumes you already know this

5. Hierarchical Namespace (HNS) / Data Lake Gen2 -- Deep Dive

  • Flat vs hierarchical namespace explained at a technical level -- virtual directories vs real directory objects
  • Why rename/move is copy+delete in flat namespace but instant metadata change in HNS
  • POSIX ACLs, DFS API, ABFS driver for Hadoop/Spark/Databricks
  • Data Lake pattern: "storage is cheap, store raw data first, transform later"
  • ETL vs ELT paradigm shift
  • Feature compatibility matrix -- what you lose when enabling HNS (versioning, blob index tags, point-in-time restore, object replication)
  • Scott Duffy mentions Data Lake briefly but never explains the namespace difference or its implications
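
The rename difference above can be illustrated with a toy model. This is only a sketch: the `FlatNamespace` and `HierarchicalNamespace` classes are hypothetical and do not correspond to any Azure SDK type. It assumes a flat namespace behaves like a dictionary keyed by full blob name, and an HNS account like a tree of real directory objects.

```python
# Toy model only: these classes are illustrative, not part of any Azure SDK.

class FlatNamespace:
    """Blobs keyed by full name; 'directories' are just name prefixes."""

    def __init__(self):
        self.blobs = {}  # full blob name -> content

    def rename_directory(self, old_prefix, new_prefix):
        """Rename by copy + delete; returns the number of blobs touched."""
        matching = [name for name in self.blobs if name.startswith(old_prefix)]
        for name in matching:
            new_name = new_prefix + name[len(old_prefix):]
            self.blobs[new_name] = self.blobs[name]  # copy under the new name
            del self.blobs[name]                     # delete the old name
        return len(matching)


class HierarchicalNamespace:
    """Directories are real objects that hold their children."""

    def __init__(self):
        self.root = {}  # directory name -> {file name -> content}

    def rename_directory(self, old_name, new_name):
        """Rename by re-pointing one directory entry; returns operations performed."""
        self.root[new_name] = self.root.pop(old_name)
        return 1


flat = FlatNamespace()
hns = HierarchicalNamespace()
hns.root["raw"] = {}
for i in range(1000):
    flat.blobs[f"raw/file{i}.json"] = b"{}"
    hns.root["raw"][f"file{i}.json"] = b"{}"

flat_ops = flat.rename_directory("raw/", "staged/")
hns_ops = hns.rename_directory("raw", "staged")
print(flat_ops, hns_ops)
```

In the flat model the cost of a rename grows with the number of blobs under the prefix, while in the hierarchical model it stays a single metadata update regardless of how many files the directory holds.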

6. Azure Elastic SAN

  • Block storage solution using iSCSI protocol
  • Base units (capacity + IOPS + throughput) vs capacity-only units
  • Volume groups as network/security boundaries
  • Use cases: Azure VMware Solution, Azure Container Storage
  • Traffic counts against the VM's network bandwidth limit, not its disk (storage) throughput limit
  • Not covered at all by Scott Duffy

7. Azure NetApp Files

  • Actual NetApp filers in Azure data centers (the one traditional storage appliance offering in Azure)
  • Account > Capacity Pool > Volume hierarchy
  • Service levels: Standard (16 MB/s/TiB), Premium (64), Ultra (128)
  • Cool access tiering for cold data (coolness threshold configurable from 2 to 183 days)
  • SMB, NFS, or dual protocol per volume
  • Not covered at all by Scott Duffy
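
As a quick worked example of the service-level figures above, the sketch below computes a throughput ceiling from provisioned size. It assumes throughput scales linearly with provisioned TiB, which is a simplification; the function name `throughput_limit_mbps` is made up for illustration.

```python
# Throughput per provisioned TiB for each service level, as listed above.
MBPS_PER_TIB = {"Standard": 16, "Premium": 64, "Ultra": 128}


def throughput_limit_mbps(service_level, provisioned_tib):
    """Return the throughput ceiling in MB/s, assuming linear scaling with size."""
    return MBPS_PER_TIB[service_level] * provisioned_tib


# Compare the three levels for the same 4 TiB of provisioned capacity.
for level in MBPS_PER_TIB:
    print(level, throughput_limit_mbps(level, 4), "MB/s")
```

Under that assumption, the same capacity at Ultra gets eight times the throughput of Standard, which is why the service level matters more than raw size for performance-sensitive workloads.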

8. Azure File Sync

  • Detailed architecture: cloud endpoint (Azure file share) + up to 100 server endpoints
  • Agent installation, sync groups, registration flow
  • Cloud tiering policies (percentage-based, date-based)
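  • Change detection: changes made on the Windows server are picked up through USN journaling, while changes made directly in the Azure file share rely on periodic change detection jobs (which introduces a delay)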
  • Scott Duffy doesn't cover Azure File Sync at all -- this is exam-relevant for AZ-104
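
To make the two cloud tiering policies concrete, here is a deliberately simplified sketch of how a date-based and a percentage-based policy could combine. Everything in it is an assumption for illustration: `LocalFile`, `files_to_tier`, and the parameters are hypothetical names, the volume is assumed to hold only the listed files, and the real agent's selection logic is more involved.

```python
from dataclasses import dataclass


@dataclass
class LocalFile:
    name: str
    size_gib: float
    days_since_access: int


def files_to_tier(files, volume_gib, free_target_pct, max_idle_days):
    """Pick files to tier to the cloud under a simplified two-policy model."""
    # Date-based policy: anything idle longer than the threshold is tiered.
    tiered = [f for f in files if f.days_since_access > max_idle_days]
    kept = [f for f in files if f.days_since_access <= max_idle_days]

    # Percentage-based policy: keep tiering the coldest remaining files until
    # the volume has at least the target percentage of free space.
    kept.sort(key=lambda f: f.days_since_access, reverse=True)
    used = sum(f.size_gib for f in kept)
    while kept and (volume_gib - used) / volume_gib * 100 < free_target_pct:
        coldest = kept.pop(0)
        used -= coldest.size_gib
        tiered.append(coldest)
    return [f.name for f in tiered]


files = [
    LocalFile("archive.vhdx", 50, 90),
    LocalFile("report.docx", 30, 10),
    LocalFile("notes.txt", 15, 2),
]
relaxed = files_to_tier(files, volume_gib=100, free_target_pct=20, max_idle_days=60)
strict = files_to_tier(files, volume_gib=100, free_target_pct=60, max_idle_days=60)
print(relaxed, strict)
```

With the relaxed free-space target only the file caught by the date policy is tiered; with the stricter target the next-coldest file is tiered as well, even though it is inside the date threshold.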

9. Azure Storage Actions (Beyond Lifecycle Management)

  • Centralized task management across multiple storage accounts (same tenant)
  • Supports block, page, append blobs + both flat and HNS
  • More complex conditions: wildcards, group clauses
  • Actions beyond tiering: set immutability, set blob tags, undelete, legal holds
  • Task assignments run under a managed identity that is granted data plane RBAC roles
  • Not covered by Scott Duffy

10. Static Website Hosting

  • Enable at account level, creates $web container
  • Comparison with Azure Static Web Apps (CDN, managed Functions integration)
  • Scott Duffy doesn't cover static website hosting

11. Provisioned V2 Billing for Files

  • Three independent dials: capacity, IOPS, throughput
  • Can dynamically change IOPS/throughput (24hr cooldown on decreases)
  • Contrasted with Provisioned V1 where performance scales with capacity
  • Scott Duffy covers file shares but not this billing model detail

12. Blob Index Tags

  • Key-value pairs on the data plane (distinct from ARM resource tags)
  • Up to 10 per blob
  • Searchable/filterable in portal and API
  • Usable with ABAC (attribute-based access control)
  • Scott Duffy mentions tags briefly but Savill demonstrates portal usage

13. Blob Inventory

  • Daily/weekly reports on blob state
  • Configurable: block/page/append, include versions, output format
  • Creates inventory in a dedicated container
  • Use lifecycle management to clean up old inventory data

14. Encryption Scopes

  • Different encryption keys per container or per blob
  • Can mix Microsoft-managed and customer-managed keys within one account
  • Cross-tenant customer-managed keys (SaaS scenario)
  • Scott Duffy mentions encryption scopes briefly; Savill demonstrates creating them

15. Service Endpoint Policies

  • Prevent data exfiltration by restricting which storage accounts a subnet can talk to
  • Complements service endpoints, which control which subnets can reach a storage account but not which accounts a subnet can reach
  • Scott Duffy covers service endpoints but not endpoint policies

16. Resource Instance Rules (Networking)

  • Allow specific Azure resource instances (e.g., a specific SQL Database) through the storage firewall
  • More granular than subnet-based rules

17. Durability vs Availability Distinction

  • Savill explicitly separates durability (data safety, 11-16 nines) from availability (ability to interact)
  • Front-end layer problems affect availability but not durability
  • Scott Duffy mentions durability numbers but doesn't draw this distinction clearly
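
The "nines" figures are easier to reason about as numbers. The sketch below treats N nines of durability as a per-object annual loss probability of 10^-N, which is the usual reading of the figure; the function names are made up for illustration. Availability is a separate measure (whether requests succeed) and is not modeled here.

```python
def annual_loss_probability(nines):
    """Chance of losing a given object in a year at N nines of durability."""
    return 10.0 ** -nines


def expected_objects_lost(object_count, nines):
    """Expected number of objects lost per year across object_count objects."""
    return object_count * annual_loss_probability(nines)


# Compare the low and high ends of the 11-16 nines range quoted above.
for nines in (11, 16):
    lost = expected_objects_lost(1_000_000_000, nines)
    print(f"{nines} nines: about {lost:.0e} of 1 billion objects lost per year")
```

Even at the low end, the expected loss across a billion objects is a small fraction of one object per year, which is why a front-end outage is an availability problem rather than a durability one.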

18. TLS/HTTPS Security Explanation

  • Detailed walkthrough of how SAS tokens are transmitted securely: DNS resolution > TCP session > TLS session > then URL with signature sent over encrypted channel
  • Addresses common misconception that SAS tokens are sent in plaintext
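
The walkthrough above can be sketched by splitting a SAS URL into the part needed before the TLS session exists and the part sent only inside it. The URL below is entirely hypothetical (made-up account, container, blob, and signature); `sv`, `sp`, `se`, and `sig` are standard SAS query parameters.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical SAS URL; account, container, blob, and signature are made up.
sas_url = (
    "https://exampleaccount.blob.core.windows.net/reports/q1.pdf"
    "?sv=2022-11-02&sp=r&se=2030-01-01T00%3A00%3A00Z&sig=FAKE_SIGNATURE"
)

parts = urlsplit(sas_url)
query = parse_qs(parts.query)

# Needed before TLS is established: only the host name, used for DNS
# resolution and for opening the TCP connection.
visible_before_tls = {"host": parts.hostname}

# Sent only as part of the HTTP request, after the TLS session exists.
sent_inside_tls = {
    "path": parts.path,
    "permissions": query["sp"][0],
    "expiry": query["se"][0],
    "signature": query["sig"][0],
}

print(visible_before_tls)
print(sent_inside_tls)
```

The signature lives in the query string, and the query string travels only in the encrypted HTTP request, so an observer on the network sees the host being contacted but not the token.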

Topics Where Savill Adds Depth to Scott Duffy's Coverage

| Topic | Scott Duffy Coverage | John Savill Adds |
| --- | --- | --- |
| Redundancy (LRS/ZRS/GRS/GZRS) | Good coverage with pricing | Explains sync vs async replication at stream layer level, durability vs availability distinction |
| Customer-managed failover | Not covered | Unplanned vs planned (preview), last sync time, LRS demotion on unplanned |
| Access tiers | Excellent with exact pricing | Adds file share tier details (transaction optimized/hot/cool), provisioned billing models |
| Lifecycle management | Good coverage | Adds Azure Storage Actions as next-gen alternative, access-time-based rules |
| Encryption | Good coverage | Adds encryption scopes (per-container/per-blob), cross-tenant CMK for SaaS |
| Networking | Good coverage | Adds resource instance rules, service endpoint policies |
| SAS tokens | Excellent coverage | Adds valet key pattern architecture, TLS transport security explanation |
| Data protection | Excellent (soft delete, versioning, PITR) | Explains how versioning + change feed + soft delete combine to enable PITR (the dependency chain) |
| Object replication | Good coverage | Adds that it works for premium block blob accounts |
| Page blobs | Brief mention | Explains legacy status clearly, 512-byte page structure |
| Append blobs | Brief mention | Log scenario use case, block-only-at-end constraint |

Topics Scott Duffy Covers Better

| Topic | Why Scott Duffy is Better |
| --- | --- |
| Exact pricing with screenshots | Real Azure portal pricing page screenshots, exact dollar amounts for LRS East US 2 |
| SAS token URL anatomy | Detailed breakdown of every URL parameter |
| Stored access policies | Dedicated page with portal walkthrough, 5-policy limit, revocation workflow |
| AzCopy commands | Detailed command reference with examples |
| Azure File Share snapshots + Azure Backup | Dedicated coverage with backup policies and Recovery Services Vault |
| Step-by-step portal labs (Scott Duffy style) | "Do this, then this" procedural labs tied to each lecture |
| Smart Tier (Preview) | Covered in pricing section |
| Reserved Capacity pricing | 100TB/1PB, 1yr/3yr commitment tables |

Recommendations for Integration

High-Priority Additions (exam-relevant, missing from Scott Duffy notes)

  1. Azure File Sync -- add a new page or section in 11-azure-file-shares.md
  2. HNS / Data Lake Gen2 deep dive -- the namespace difference, feature compatibility matrix, and move/rename implications should be added somewhere (possibly 06-storage-services.md)
  3. Static website hosting -- brief section in 06-storage-services.md
  4. Durability vs availability -- add explicit callout in 01-storage-account-fundamentals.md
  5. Customer-managed failover -- add to 01-storage-account-fundamentals.md redundancy section

Medium-Priority Additions (good context, less exam-critical)

  1. Encryption scopes with per-container/per-blob keys -- expand 05-encryption.md
  2. Service endpoint policies (data exfiltration prevention) -- expand 03-networking.md
  3. Resource instance rules -- expand 03-networking.md
  4. Azure Storage Actions -- mention in lifecycle/access tiers page
  5. Blob inventory -- mention in storage services page

Low-Priority (deep knowledge, unlikely on AZ-104)

  1. Three-tier architecture (stream/partition/front-end)
  2. Azure Elastic SAN
  3. Azure NetApp Files
  4. Provisioned V2 billing model for files
  5. Valet key pattern for SAS

John Savill's Lab Value-Add

Savill's labs (labs.md) are discovery-based rather than procedural:

  • Lab 1: Flat vs hierarchical namespace hands-on comparison
  • Lab 2: Access tier behavior including archive rehydration attempt
  • Lab 3: Container access levels + anonymous access testing
  • Lab 4: Move/copy blob limitations (flat vs HNS)
  • Lab 5: Lifecycle management rule creation
  • Lab 6: Versioning vs soft delete vs point-in-time restore comparison
  • Lab 7: Static website hosting
  • Lab 8: Azure Files mount
  • Lab 9: SAS token generation and testing
  • Lab 10: HNS feature conflict verification

These labs are more exploratory ("what happens when you try X?") than Scott Duffy's, which are more prescriptive ("do X, then Y"). Both styles are valuable -- Savill's labs build understanding, while Duffy's build confidence for the exam.

Released under the MIT License.