Apple Sparse Image Format (ASIF) Technical Analysis

Overview of Apple Sparse Image Format (ASIF)

Apple's Sparse Image Format (ASIF) is a virtual disk image format introduced with macOS 26 Tahoe, specifically designed for use within the Virtualization framework. ASIF functions as a sparse virtual disk, allowing large disks to be represented by smaller files on physical storage by only allocating space for data that has been written, similar to VMDK, VHDX, or QCOW2 formats.

ASIF Header Structure

The ASIF file begins with a structured header that defines the disk's geometry, versioning, and offsets to critical allocation data. The header uses big-endian byte ordering.

Header Field Definitions

Field                     Type        Description
header_signature          uint32      File magic bytes ('shdw')
header_version            uint32      Format version
header_size               uint32      Size of the header section
header_flags              uint32      Header-specific flags
directory_offsets         uint64[2]   Offsets to the allocation directories
guid                      char[16]    Unique identifier for the image
sector_count              uint64      Current number of sectors in the virtual disk
max_sector_count          uint64      Maximum allowable sectors the disk can grow to
chunk_size                uint32      Size of data chunks (typically 1 MiB)
block_size                uint16      Sector/block size (typically 512 bytes)
total_segments            uint16      Total number of segments
metadata_chunk            uint64      Offset to the metadata block
read_only_flags           uint32      Flags indicating read-only status
metadata_flags            uint32      General metadata flags
metadata_read_only_flags  uint32      Read-only flags specifically for metadata
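
As an illustration, the field list above can be unpacked with Python's struct module. This is a minimal sketch, not a reference parser: it assumes the fields are stored back to back in exactly the order listed, big-endian, with no padding or undocumented fields between them, which may not hold for real images. The names AsifHeader and parse_header are hypothetical.

```python
import struct
from typing import NamedTuple

# Assumption: fields are packed contiguously in the listed order, big-endian,
# with no padding or undocumented fields in between.
HEADER_STRUCT = struct.Struct(">4I2Q16s2QI2HQ3I")
ASIF_MAGIC = int.from_bytes(b"shdw", "big")


class AsifHeader(NamedTuple):
    header_signature: int
    header_version: int
    header_size: int
    header_flags: int
    directory_offsets: tuple
    guid: bytes
    sector_count: int
    max_sector_count: int
    chunk_size: int
    block_size: int
    total_segments: int
    metadata_chunk: int
    read_only_flags: int
    metadata_flags: int
    metadata_read_only_flags: int


def parse_header(data: bytes) -> AsifHeader:
    """Unpack the header fields from the start of an image (hypothetical helper)."""
    f = HEADER_STRUCT.unpack_from(data)
    if f[0] != ASIF_MAGIC:
        raise ValueError("not an ASIF image: bad magic")
    # f[4] and f[5] are the two directory offsets; group them into one tuple.
    return AsifHeader(f[0], f[1], f[2], f[3], (f[4], f[5]), *f[6:])
```

Calling parse_header on the first bytes of an image would then give named access to values such as chunk_size and block_size, provided the layout assumption holds.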

Allocation and Data Mapping

ASIF maps virtual offsets to physical chunks using a hierarchical directory and table system. This architecture supports atomic updates by maintaining more than one directory (the header's directory_offsets field holds two offsets); the directory with the highest version number is considered the active one.
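
The selection rule can be sketched in a few lines. The Directory class and its version attribute are hypothetical stand-ins, since where a directory stores its version number is not described here.

```python
from dataclasses import dataclass


@dataclass
class Directory:
    """Hypothetical in-memory view of one allocation directory."""
    offset: int   # file offset, taken from the header's directory_offsets
    version: int  # assumed to have been read from the directory itself


def active_directory(directories):
    """Return the directory with the highest version number."""
    return max(directories, key=lambda d: d.version)
```

A writer can update the inactive directory and give it a higher version number, so a reader sees either the old state or the new one, never a half-written directory.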

The Table and Entry System

  • Directories and Tables: A directory contains a list of tables, and each table contains a list of uint64 data entries.
  • Data Entries: Each entry points to a physical chunk in the ASIF file. The entries use 55 bits for the chunk number and 9 bits for flags.
  • Entry Flags:
    • 0b00: Uninitialized
    • 0b01: Fully initialized
    • 0b10: Unmapped
    • 0b11: Has bitmap
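
A minimal sketch of splitting a data entry into its two fields follows. It assumes the chunk number occupies the low 55 bits and the flags the high 9 bits; the text above gives only the bit counts, so that ordering is an assumption. The position of the two-bit state inside the flag field is not specified either, so the flags are returned raw. The name split_entry is hypothetical.

```python
CHUNK_BITS = 55
CHUNK_MASK = (1 << CHUNK_BITS) - 1


def split_entry(entry: int) -> tuple[int, int]:
    """Split a uint64 data entry into (chunk_number, raw_flags).

    Assumption: chunk number in the low 55 bits, flags in the high 9 bits.
    """
    if not 0 <= entry < (1 << 64):
        raise ValueError("entry must fit in an unsigned 64-bit integer")
    return entry & CHUNK_MASK, entry >> CHUNK_BITS
```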

Chunk Groups and Bitmaps

Unlike some sparse formats that store bitmaps separately, ASIF embeds allocation bitmaps directly within the data tables.

  • Chunk Groups: Data entries are organized into "chunk groups." Each group consists of a set of chunk entries followed by a single bitmap entry.
  • Bitmap Specifications: The bitmap uses 2 bits per block. A single byte in the bitmap covers 4 blocks. Because the bitmap itself is one chunk in size, it covers 4 * chunk_size blocks per group.
  • Mapping Logic: To resolve a virtual offset to a physical chunk, the system calculates the table index, the relative chunk index within that table, and then accounts for the "skipped" entries created by the embedded bitmaps to find the final entry_index.
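
The bitmap arithmetic can be worked through for the typical values. This sketch assumes that each chunk entry maps exactly chunk_size bytes of virtual data, which is how the number of chunks per group is derived here; group_geometry is a hypothetical helper name.

```python
def group_geometry(chunk_size: int, block_size: int) -> tuple[int, int, int]:
    """Return (blocks_per_group, bytes_per_group, chunks_per_group)."""
    # One bitmap chunk at 2 bits per block: each byte covers 4 blocks.
    blocks_per_group = 4 * chunk_size
    bytes_per_group = blocks_per_group * block_size
    # Assumption: every chunk entry maps chunk_size bytes of virtual data.
    chunks_per_group = bytes_per_group // chunk_size
    return blocks_per_group, bytes_per_group, chunks_per_group


# Typical values: 1 MiB chunks and 512-byte blocks.
print(group_geometry(1 << 20, 512))  # (4194304, 2147483648, 2048)
```

With these values one bitmap chunk describes 4,194,304 blocks, or 2 GiB of virtual disk, which under the stated assumption corresponds to 2048 chunk entries per group.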

Metadata and Constraints

ASIF reserves a small portion of the disk's end for metadata. The metadata_chunk offset in the header typically points to an area near the end of the maximum theoretical disk size (approximately 4 PiB minus one chunk).

This metadata block contains a small header followed by an XML plist. This plist stores two primary dictionaries:

  1. Internal Metadata: System-level configuration and state.
  2. User Metadata: User-defined attributes for the disk image.
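
Because the layout of the small header in front of the plist is not described here, one cautious way to read the metadata is to locate the XML document by its markers and hand it to Python's plistlib. This is a sketch under that assumption; extract_metadata_plist is a hypothetical helper, and the dictionary key names inside the plist are not assumed.

```python
import plistlib


def extract_metadata_plist(block: bytes) -> dict:
    """Find and parse the XML plist embedded in a metadata block.

    Assumption: the plist is stored as plain XML text somewhere after the
    block's small header, so it can be found by scanning for its markers.
    """
    start = block.find(b"<?xml")
    if start < 0:
        raise ValueError("no XML plist found in metadata block")
    end_tag = b"</plist>"
    end = block.find(end_tag, start)
    if end < 0:
        raise ValueError("XML plist is not terminated")
    return plistlib.loads(block[start:end + len(end_tag)])
```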

Technical Summary of Disk Resolution

To read a specific offset from an ASIF image, the following logic is applied:

  1. Table Selection: table = directory.table(offset // size_per_table)
  2. Relative Indexing: Calculate the relative_chunk_index based on the block_size and the table's virtual offset.
  3. Group Adjustment: Determine the chunk_group by dividing the relative index by the number of chunks per group.
  4. Final Entry Lookup: entry_index = relative_chunk_index + chunk_group. This adjustment accounts for the bitmap entries interspersed within the table.
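
The four steps above can be sketched as a single function. Several things here are assumptions rather than facts about the format: that a table's virtual offset equals its index times size_per_table, that size_per_table and chunks_per_group are already known to the caller, and the function name resolve_entry_index itself.

```python
def resolve_entry_index(offset: int, size_per_table: int, chunk_size: int,
                        block_size: int, chunks_per_group: int) -> tuple[int, int]:
    """Return (table_index, entry_index) for a virtual byte offset."""
    # Step 1: pick the table that covers this offset.
    table_index = offset // size_per_table
    # Assumption: tables cover consecutive, equally sized virtual ranges.
    table_virtual_offset = table_index * size_per_table

    # Step 2: chunk index relative to the start of the table, via block numbers.
    blocks_per_chunk = chunk_size // block_size
    relative_block = offset // block_size - table_virtual_offset // block_size
    relative_chunk_index = relative_block // blocks_per_chunk

    # Step 3: which chunk group the chunk falls in.
    chunk_group = relative_chunk_index // chunks_per_group

    # Step 4: skip one bitmap entry for every complete group before this one.
    return table_index, relative_chunk_index + chunk_group


MIB = 1 << 20
# Hypothetical geometry: 2048 chunks per group, 4 groups per table.
print(resolve_entry_index(2048 * MIB, 8192 * MIB, MIB, 512, 2048))  # (0, 2049)
```

In the usage line, virtual chunk 2048 is the first chunk of the second group, so its entry sits one slot later (index 2049) because the first group's bitmap entry precedes it.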
