John Savill vs Scott Duffy -- Azure Storage Value-Add Analysis
Purpose
Identify what unique value John Savill's Azure Master Class v3 Part 5 (Storage) adds on top of the Scott Duffy notes already written in 02-storage/scott-duffy/.
Summary Verdict
John Savill goes deeper on architecture and internals while Scott Duffy is more exam-focused and procedural. There is significant overlap on fundamentals (redundancy, tiers, access control, encryption), but Savill covers several topics Scott Duffy never touches and explains the "why" behind Azure Storage design decisions at an engineering level.
Unique Topics John Savill Covers (Not in Scott Duffy)
1. Azure Storage Internal Architecture (Three-Tier Model)
- Stream Layer -- actual bits on disk, distribution, replication across servers in a cluster
- Partition Layer -- understands blobs, tables, queues; provides scalable namespace
- Front End Layer -- stateless, receives API requests, authenticates, routes to partition layer
- Scott Duffy never explains how Azure Storage works internally. This is valuable context for understanding why redundancy/availability work the way they do.
2. Storage Stamps (Clusters)
- Azure uses software-defined storage clusters (not traditional SANs/NAS)
- Premium offerings use different storage stamps with different disk types
- Multiple racks, fault domains, redundant networking/power per stamp
- The only exception is Azure NetApp Files (actual NetApp filers in Azure DCs)
3. Ephemeral vs Durable Storage
- Explicit distinction between temporary (cache, page files, temp disk) and persistent storage
- Scott Duffy jumps straight into storage accounts without this foundational framing
4. Data Types (Unstructured / Structured / Semi-Structured)
- Detailed breakdown with examples: when to use each, how they map to Azure services
- Schema concepts, normalization, foreign keys for structured
- Self-describing nature of JSON/XML for semi-structured
- Scott Duffy assumes you already know this
5. Hierarchical Namespace (HNS) / Data Lake Gen2 -- Deep Dive
- Flat vs hierarchical namespace explained at a technical level -- virtual directories vs real directory objects
- Why rename/move is copy+delete in flat namespace but instant metadata change in HNS
- POSIX ACLs, DFS API, ABFS driver for Hadoop/Spark/Databricks
- Data Lake pattern: "storage is cheap, store raw data first, transform later"
- ETL vs ELT paradigm shift
- Feature compatibility matrix -- what you lose when enabling HNS (versioning, blob index tags, point-in-time restore, object replication)
- Scott Duffy mentions Data Lake briefly but never explains the namespace difference or its implications
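
The rename difference can be illustrated with a toy model. This is only a sketch under stated assumptions: `FlatNamespace` and `HierarchicalNamespace` are hypothetical in-memory classes, not Azure APIs, and the real service also handles nested directories, ACLs, and metadata. The flat version keys every blob by its full name, so renaming a "directory" touches every blob under the prefix; the hierarchical version stores real directory objects, so the rename changes one entry.

```python
from dataclasses import dataclass, field


class FlatNamespace:
    """Blobs keyed by full name; 'directories' are only shared name prefixes."""

    def __init__(self):
        self.blobs = {}  # full blob name -> content

    def put(self, name, data):
        self.blobs[name] = data

    def rename_dir(self, old, new):
        """Rename a virtual directory by copying then deleting every blob under it."""
        touched = 0
        for name in [n for n in self.blobs if n.startswith(old + "/")]:
            self.blobs[new + name[len(old):]] = self.blobs[name]  # copy
            del self.blobs[name]  # delete
            touched += 1
        return touched


@dataclass
class Directory:
    """A real directory object that holds its children by short name."""
    children: dict = field(default_factory=dict)


class HierarchicalNamespace:
    def __init__(self):
        self.root = Directory()

    def put(self, path, data):
        *dirs, leaf = path.split("/")
        node = self.root
        for d in dirs:
            node = node.children.setdefault(d, Directory())
        node.children[leaf] = data

    def rename_dir(self, old, new):
        """Rename a top-level directory: only one parent entry changes."""
        self.root.children[new] = self.root.children.pop(old)
        return 1


flat, hns = FlatNamespace(), HierarchicalNamespace()
for i in range(1000):
    flat.put(f"raw/file{i}.csv", b"")
    hns.put(f"raw/file{i}.csv", b"")

print(flat.rename_dir("raw", "staged"))  # 1000 blobs copied and deleted
print(hns.rename_dir("raw", "staged"))   # 1 directory entry updated
```

The return value counts the objects each rename had to touch, which is the cost difference the notes describe.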
6. Azure Elastic SAN
- Block storage solution using iSCSI protocol
- Base units (capacity + IOPS + throughput) vs capacity-only units
- Volume groups as network/security boundaries
- Use cases: Azure VMware Solution, Azure Container Storage
- Counts against VM network performance, not storage performance
- Not covered at all by Scott Duffy
7. Azure NetApp Files
- Actual NetApp filers in Azure data centers (only SAN-like thing in Azure)
- Account > Capacity Pool > Volume hierarchy
- Service levels: Standard (16 MiB/s per TiB), Premium (64), Ultra (128)
- Cool access tiering for cold data (2-183 days threshold)
- SMB, NFS, or dual protocol per volume
- Not covered at all by Scott Duffy
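
As a worked example of the service-level figures, the sketch below multiplies the per-TiB rate by the provisioned quota. It assumes throughput scales linearly with quota and ignores any per-volume caps or manual QoS settings; `throughput_limit` is a hypothetical helper, and the 4 TiB quota is just an illustration.

```python
# Throughput per provisioned TiB for each service level (figures from the list above).
THROUGHPUT_PER_TIB = {"Standard": 16, "Premium": 64, "Ultra": 128}


def throughput_limit(service_level: str, quota_tib: float) -> float:
    """Return the throughput ceiling for a quota, assuming linear scaling per TiB."""
    return THROUGHPUT_PER_TIB[service_level] * quota_tib


for level in THROUGHPUT_PER_TIB:
    print(f"{level}: 4 TiB -> {throughput_limit(level, 4):.0f} MiB/s")
```

Under these assumptions the same 4 TiB of capacity yields 64, 256, or 512 MiB/s depending on the service level chosen.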
8. Azure File Sync
- Detailed architecture: cloud endpoint (Azure file share) + up to 100 server endpoints
- Agent installation, sync groups, registration flow
- Cloud tiering policies (percentage-based, date-based)
- Change detection: the USN journal on Windows picks up local changes right away, while changes made directly in the Azure file share wait for a scheduled change detection job (introduces delay)
- Scott Duffy doesn't cover Azure File Sync at all -- this is exam-relevant for AZ-104
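
The two cloud tiering policies can be pictured with a toy simulation. This is a simplified sketch, not the sync agent's actual algorithm: `LocalFile`, `apply_date_policy`, and `apply_free_space_policy` are hypothetical names, and tiered files are assumed to occupy no local space. The date policy tiers anything not accessed within a threshold; the free-space policy then tiers the coldest remaining files until the target percentage of the volume is free.

```python
from dataclasses import dataclass


@dataclass
class LocalFile:
    name: str
    size_gib: float
    days_since_access: int
    tiered: bool = False  # True once only a pointer remains on the server


def apply_date_policy(files, max_days):
    """Tier every file that has not been accessed within max_days."""
    for f in files:
        if f.days_since_access > max_days:
            f.tiered = True


def apply_free_space_policy(files, volume_gib, target_free_pct):
    """Tier the coldest local files first until the free-space target is met."""
    def free_pct():
        used = sum(f.size_gib for f in files if not f.tiered)
        return 100 * (volume_gib - used) / volume_gib

    local = [f for f in files if not f.tiered]
    for f in sorted(local, key=lambda f: f.days_since_access, reverse=True):
        if free_pct() >= target_free_pct:
            break
        f.tiered = True


files = [
    LocalFile("budget.xlsx", 10, 2),
    LocalFile("archive.zip", 40, 200),
    LocalFile("video.mp4", 30, 45),
    LocalFile("notes.txt", 10, 10),
]
apply_date_policy(files, max_days=90)
apply_free_space_policy(files, volume_gib=100, target_free_pct=60)
print([f.name for f in files if f.tiered])  # ['archive.zip', 'video.mp4']
```

Here the date policy tiers `archive.zip`, which leaves 50% free; the free-space policy then tiers `video.mp4` to pass the 60% target and stops.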
9. Azure Storage Actions (Beyond Lifecycle Management)
- Centralized task management across multiple storage accounts (same tenant)
- Supports block, page, append blobs + both flat and HNS
- More complex conditions: wildcards, group clauses
- Actions beyond tiering: set immutability, set blob tags, undelete, legal holds
- Assigned via managed identity with data plane RBAC
- Not covered by Scott Duffy
10. Static Website Hosting
- Enable at account level, creates $web container
- Comparison with Azure Static Web Apps (CDN, managed Functions integration)
- Scott Duffy doesn't cover static website hosting
11. Provisioned V2 Billing for Files
- Three independent dials: capacity, IOPS, throughput
- Can dynamically change IOPS/throughput (24hr cooldown on decreases)
- Contrasted with Provisioned V1 where performance scales with capacity
- Scott Duffy covers file shares but not this billing model detail
12. Blob Index Tags
- Key-value pairs on the data plane (distinct from ARM resource tags)
- Up to 10 per blob
- Searchable/filterable in portal and API
- Usable with ABAC (attribute-based access control)
- Scott Duffy mentions tags briefly but Savill demonstrates portal usage
13. Blob Inventory
- Daily/weekly reports on blob state
- Configurable: block/page/append, include versions, output format
- Creates inventory in a dedicated container
- Use lifecycle management to clean up old inventory data
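
A lifecycle rule for cleaning up old inventory output might look like the sketch below, built as a Python dict and printed as the JSON policy document. The container prefix `inventory-reports/`, the rule name, and the 30-day threshold are assumptions for illustration; adjust them to wherever the inventory rule actually writes.

```python
import json

# Hypothetical prefix: the container (and optional path) that receives inventory reports.
INVENTORY_PREFIX = "inventory-reports/"

policy = {
    "rules": [
        {
            "enabled": True,
            "name": "delete-old-inventory",
            "type": "Lifecycle",
            "definition": {
                # Only block blobs whose name starts with the prefix are affected.
                "filters": {
                    "blobTypes": ["blockBlob"],
                    "prefixMatch": [INVENTORY_PREFIX],
                },
                # Delete the base blob once it is more than 30 days old (assumed value).
                "actions": {
                    "baseBlob": {
                        "delete": {"daysAfterModificationGreaterThan": 30}
                    }
                },
            },
        }
    ]
}

print(json.dumps(policy, indent=2))
```

The printed JSON follows the general shape of a lifecycle management policy (rules with filters and actions) and could be adapted for the portal's code view or a deployment template.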
14. Encryption Scopes
- Different encryption keys per container or per blob
- Can mix Microsoft-managed and customer-managed keys within one account
- Cross-tenant customer-managed keys (SaaS scenario)
- Scott Duffy mentions encryption scopes briefly; Savill demonstrates creating them
15. Service Endpoint Policies
- Prevent data exfiltration by restricting which storage accounts a subnet can talk to
- Complements service endpoints, which control which subnets can reach a storage account but do not limit which accounts a subnet can reach
- Scott Duffy covers service endpoints but not endpoint policies
16. Resource Instance Rules (Networking)
- Allow specific Azure resource instances (e.g., a specific SQL Database) through the storage firewall
- More granular than subnet-based rules
17. Durability vs Availability Distinction
- Savill explicitly separates durability (data safety, 11-16 nines) from availability (ability to interact)
- Front-end layer problems affect availability but not durability
- Scott Duffy mentions durability numbers but doesn't draw this distinction clearly
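
To give the "nines" some intuition, the sketch below converts a durability figure into an expected number of objects lost per year. It is a deliberate simplification: it treats the durability figure as an independent per-object annual survival probability, which is an assumption for illustration rather than how the service is engineered. The mapping of 11, 12, and 16 nines to LRS, ZRS, and GRS/GZRS follows the published durability targets; `expected_annual_losses` is a hypothetical helper.

```python
def expected_annual_losses(nines: int, object_count: int) -> float:
    """Expected objects lost per year if each object is lost with probability 10**-nines."""
    loss_probability = 10.0 ** -nines
    return object_count * loss_probability


for label, nines in [("LRS", 11), ("ZRS", 12), ("GRS/GZRS", 16)]:
    losses = expected_annual_losses(nines, 1_000_000_000)
    print(f"{label}: {nines} nines -> {losses:.0e} objects/year per billion stored")
```

None of this says anything about availability: an account can be unreachable for a while without a single object being lost.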
18. TLS/HTTPS Security Explanation
- Detailed walkthrough of how SAS tokens are transmitted securely: DNS resolution > TCP session > TLS session > then URL with signature sent over encrypted channel
- Addresses common misconception that SAS tokens are sent in plaintext
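
The split between what travels in the clear and what travels inside TLS can be sketched by parsing a SAS URL. The URL below is entirely made up (account, container, blob, and signature are hypothetical), and the visible/encrypted grouping assumes a standard HTTPS request where the hostname is exposed through DNS and, typically, the TLS handshake.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical SAS URL; the account, container, blob, and signature are made up.
sas_url = (
    "https://exampleaccount.blob.core.windows.net/docs/report.pdf"
    "?sv=2022-11-02&sp=r&se=2030-01-01T00:00:00Z&sr=b&sig=FAKE_SIGNATURE"
)

parts = urlsplit(sas_url)

# Observable outside the encrypted channel: hostname (DNS lookup, TLS handshake) and port.
visible_on_network = {"host": parts.hostname, "port": parts.port or 443}

# Sent only after the TLS session exists, as part of the HTTP request.
encrypted_in_tls = {"path": parts.path, "query": parse_qs(parts.query)}

print("visible:", visible_on_network)
print("encrypted:", encrypted_in_tls)
```

The `sig` parameter appears only in the second group, which is the point of the walkthrough: the signature is never sent before the encrypted session is established.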
Topics Where Savill Adds Depth to Scott Duffy's Coverage
| Topic | Scott Duffy Coverage | John Savill Adds |
|---|---|---|
| Redundancy (LRS/ZRS/GRS/GZRS) | Good coverage with pricing | Explains sync vs async replication at stream layer level, durability vs availability distinction |
| Customer-managed failover | Not covered | Unplanned vs planned (preview), last sync time, LRS demotion on unplanned |
| Access tiers | Excellent with exact pricing | Adds file share tier details (transaction optimized/hot/cool), provisioned billing models |
| Lifecycle management | Good coverage | Adds Azure Storage Actions as next-gen alternative, access-time-based rules |
| Encryption | Good coverage | Adds encryption scopes (per-container/per-blob), cross-tenant CMK for SaaS |
| Networking | Good coverage | Adds resource instance rules, service endpoint policies |
| SAS tokens | Excellent coverage | Adds valet key pattern architecture, TLS transport security explanation |
| Data protection | Excellent (soft delete, versioning, PITR) | Explains how versioning + change feed + soft delete combine to enable PITR (the dependency chain) |
| Object replication | Good coverage | Adds that it works for premium block blob accounts |
| Page blobs | Brief mention | Explains legacy status clearly, 512-byte page structure |
| Append blobs | Brief mention | Log scenario use case; blocks can only be added at the end |
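
The 512-byte page structure mentioned in the page blobs row means reads and writes are addressed in whole pages. A minimal sketch of the alignment arithmetic, where `aligned_range` is a hypothetical helper rather than an SDK function:

```python
PAGE_SIZE = 512  # bytes per page


def aligned_range(offset: int, length: int) -> tuple[int, int]:
    """Expand a byte range to the smallest enclosing range aligned to 512-byte pages."""
    start = (offset // PAGE_SIZE) * PAGE_SIZE
    end = -(-(offset + length) // PAGE_SIZE) * PAGE_SIZE  # round up to the next page boundary
    return start, end - start


print(aligned_range(700, 100))  # (512, 512): the range fits in the second page
print(aligned_range(500, 100))  # (0, 1024): the range straddles two pages
```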
Topics Scott Duffy Covers Better
| Topic | Why Scott Duffy is Better |
|---|---|
| Exact pricing with screenshots | Real Azure portal pricing page screenshots, exact dollar amounts for LRS East US 2 |
| SAS token URL anatomy | Detailed breakdown of every URL parameter |
| Stored access policies | Dedicated page with portal walkthrough, 5-policy limit, revocation workflow |
| AzCopy commands | Detailed command reference with examples |
| Azure File Share snapshots + Azure Backup | Dedicated coverage with backup policies and Recovery Services Vault |
| Step-by-step portal labs | "Do this, then this" procedural labs tied to each lecture |
| Smart Tier (Preview) | Covered in pricing section |
| Reserved Capacity pricing | 100TB/1PB, 1yr/3yr commitment tables |
Recommendations for Integration
High-Priority Additions (exam-relevant, missing from Scott Duffy notes)
- Azure File Sync -- add a new page or section in 11-azure-file-shares.md
- HNS / Data Lake Gen2 deep dive -- the namespace difference, feature compatibility matrix, and move/rename implications should be added somewhere (possibly 06-storage-services.md)
- Static website hosting -- brief section in 06-storage-services.md
- Durability vs availability -- add explicit callout in 01-storage-account-fundamentals.md
- Customer-managed failover -- add to 01-storage-account-fundamentals.md redundancy section
Medium-Priority Additions (good context, less exam-critical)
- Encryption scopes with per-container/per-blob keys -- expand 05-encryption.md
- Service endpoint policies (data exfiltration prevention) -- expand 03-networking.md
- Resource instance rules -- expand 03-networking.md
03-networking.md - Azure Storage Actions -- mention in lifecycle/access tiers page
- Blob inventory -- mention in storage services page
Low-Priority (deep knowledge, unlikely on AZ-104)
- Three-tier architecture (stream/partition/front-end)
- Azure Elastic SAN
- Azure NetApp Files
- Provisioned V2 billing model for files
- Valet key pattern for SAS
John Savill's Lab Value-Add
Savill's labs (labs.md) are discovery-based rather than procedural:
- Lab 1: Flat vs hierarchical namespace hands-on comparison
- Lab 2: Access tier behavior including archive rehydration attempt
- Lab 3: Container access levels + anonymous access testing
- Lab 4: Move/copy blob limitations (flat vs HNS)
- Lab 5: Lifecycle management rule creation
- Lab 6: Versioning vs soft delete vs point-in-time restore comparison
- Lab 7: Static website hosting
- Lab 8: Azure Files mount
- Lab 9: SAS token generation and testing
- Lab 10: HNS feature conflict verification
These labs are more exploratory ("what happens when you try X?") than Scott Duffy's, which are more prescriptive ("do X, then Y"). Both styles are valuable -- Savill's build understanding, Duffy's build confidence for the exam.