Architecture

Spegel can run as a stateless application by exploiting the fact that an image pulled by a node is not immediately garbage collected. Spegel is deployed as a DaemonSet, so one instance runs on each node and acts as both the registry and the mirror. Each instance is reachable both locally through a host port and through a Service. This allows Containerd to be configured to use the localhost interface as a registry mirror, and allows Spegel instances to forward requests to each other.

Images are composed of multiple layers, which are stored as individual files on the node's disk. Each layer is identified by its digest. Every node advertises the digests it stores locally, and Kademlia is used for distributed advertisement and lookup of digests.

An image pull consists of multiple HTTP requests, one per digest. When Spegel is configured as the mirror for a registry, each request is sent to Spegel first. Spegel looks up the digest within the cluster to see whether any node has advertised it. If a node is found, the request is forwarded to that Spegel instance, which serves the file with the specified digest. If no node is found, a 404 response is returned and Containerd falls back to the actual remote registry.
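
The advertise and lookup steps above can be modeled in a few lines of Python. This is a minimal in-memory sketch, not Spegel's implementation: `ToyDHT`, `advertise`, `lookup`, and `handle_request` are hypothetical names, and every node is assumed to know every other node. A real Kademlia network routes iteratively through partial routing tables; the sketch keeps only the XOR distance metric that decides which node stores the record for a digest.

```python
import hashlib


def key_id(value: str) -> int:
    """Map a node name or a digest to a 256-bit integer ID."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest(), "big")


class ToyDHT:
    """Toy model: provider records live on the node XOR-closest to the digest."""

    def __init__(self, node_names):
        # node name -> {digest: set of nodes that advertised the digest}
        self.records = {name: {} for name in node_names}

    def closest_node(self, digest: str) -> str:
        target = key_id(digest)
        # Kademlia measures distance as the XOR of two IDs.
        return min(self.records, key=lambda name: key_id(name) ^ target)

    def advertise(self, provider: str, digest: str) -> None:
        holder = self.closest_node(digest)
        self.records[holder].setdefault(digest, set()).add(provider)

    def lookup(self, digest: str):
        holder = self.closest_node(digest)
        return sorted(self.records[holder].get(digest, set()))


def handle_request(dht: ToyDHT, digest: str):
    """Return (status, peer): forward to a peer if one advertised the digest, else 404."""
    providers = dht.lookup(digest)
    if providers:
        return 200, providers[0]
    return 404, None
```

With three nodes and one advertised digest, `handle_request` returns a peer for that digest and a 404 for any other, which is the point where Containerd would fall back to the remote registry.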

At its core, Spegel is a pull-only OCI registry that runs locally on every node in the Kubernetes cluster. Containerd is configured to use the local registry as a mirror, so an image is served from within the cluster when a peer has it and from the source registry otherwise.

Flow Diagrams

The following diagrams explain Spegel’s architecture, flows, and operations.

High-Level Cluster Architecture

Shows how Spegel pods form a P2P network within the cluster. Containerd interacts with Spegel for image pulls and falls back to the external registry when needed.

graph TB
    subgraph "External"
        ER["External Registry"]
    end

    subgraph "Kubernetes Cluster"
        subgraph "Node 1"
            SP1["Spegel Pod"]
            CD1["Containerd"]
            SP1 <-->|interacts| CD1
            CD1 -->|fallback| ER
        end
        
        subgraph "Node 2"
            SP2["Spegel Pod"]
            CD2["Containerd"]
            SP2 <-->|interacts| CD2
            CD2 -->|fallback| ER
        end
        
        subgraph "Node 3"
            SP3["Spegel Pod"]
            CD3["Containerd"]
            SP3 <-->|interacts| CD3
            CD3 -->|fallback| ER
        end

        SP1 <-->|P2P Network| SP2
        SP2 <-->|P2P Network| SP3
        SP3 <-->|P2P Network| SP1
    end

Pod Component Architecture

Details the internal components of a Spegel pod and their relationships, showing how the registry service, P2P components, and state management interact with each other and with containerd.

graph TB
    subgraph "Spegel Pod"
        subgraph "Registry Service"
            RS[HTTP Server /v2/]
            RH[Request Handler]
            RS --> RH
        end

        subgraph "P2P Components"
            P2P[P2P Router]
            DHT[DHT Provider]
            BS[Bootstrapper]
            P2P --> DHT
            BS --> P2P
        end

        subgraph "State Management"
            ST[State Tracker]
            MT[Metrics]
            ST --> MT
        end

        CD[Containerd Client]
        
        RH --> P2P
        ST --> P2P
        CD --> ST
    end

    subgraph "Node Components"
        CDD[Containerd Daemon]
        CS[Content Store]
        CDD --> CS
    end

    CD --> CDD

Image Pull Flow

Shows the sequence of operations during an image pull request, covering both a successful pull from a peer and the fallback to the external registry.

sequenceDiagram
    participant CD as Containerd
    participant SR as Spegel Registry
    participant P2P as P2P Router
    participant PR as Peer Registry
    participant ER as External Registry
    participant CS as Content Store

    Note over SR,P2P: 20ms default resolve timeout
    Note over SR,P2P: 3 default resolve retries

    CD->>SR: GET /v2/{name}/manifests/{ref}
    SR->>P2P: Resolve(key, allowSelf, retries)
    
    alt Peer Found
        P2P-->>SR: Return Peer Address
        SR->>PR: Request Content
        PR-->>SR: Stream Content
        SR-->>CD: Return Content
        CD->>CS: Store Content
    else No Peers Available (within 20ms)
        SR-->>CD: 404 Not Found
        CD->>ER: Request from External
        ER-->>CD: Return Content
        CD->>CS: Store Content
    end
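
The effect of the resolve timeout in this flow can be illustrated with a small deterministic model. It is a sketch under stated assumptions: `resolve` and `mirror_response` are hypothetical names, peer answers are simulated as `(delay_seconds, address)` pairs rather than real network calls, and the retry count is interpreted as a cap on how many peers are kept as candidates.

```python
RESOLVE_TIMEOUT_S = 0.020  # 20ms default resolve timeout, taken from the diagram
RESOLVE_RETRIES = 3        # 3 default resolve retries, taken from the diagram


def resolve(arrivals, timeout_s=RESOLVE_TIMEOUT_S, retries=RESOLVE_RETRIES):
    """Return the peers that answered before the deadline, earliest first.

    `arrivals` is a list of (delay_seconds, peer_address) pairs that
    simulates when each peer's answer would reach the router.
    """
    in_time = sorted((delay, peer) for delay, peer in arrivals if delay <= timeout_s)
    return [peer for _, peer in in_time[:retries]]


def mirror_response(arrivals):
    """Return (status, peer): 200 with the first peer, or 404 when none answered in time."""
    peers = resolve(arrivals)
    if not peers:
        return 404, None
    return 200, peers[0]
```

A peer that answers after 50ms is ignored in this model, so the request ends in a 404 and the pull continues against the external registry instead of waiting for a slow peer.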

P2P Network Formation

Shows how nodes discover each other and form the P2P network through leader election and peer sharing.

sequenceDiagram
    participant N1 as Node 1
    participant N2 as Node 2
    participant N3 as Node 3
    participant LE as Leader Election
    
    Note over N1,LE: 10s lease duration
    Note over N1,LE: 5s renew deadline
    Note over N1,LE: 2s retry period

    N1->>LE: Participate in Election
    N2->>LE: Participate in Election
    N3->>LE: Participate in Election
    LE->>N1: Elected Leader
    N2->>N1: Discover Leader
    N3->>N1: Discover Leader
    N1->>N2: Share Peer List
    N1->>N3: Share Peer List
    N2->>N3: Establish P2P Connection
    
    Note over N1,N3: P2P Network Formed
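
Lease-based leader election of the kind shown above can be sketched as follows. This is an illustrative model, not the Kubernetes leader election API: the `Lease` class and its method are hypothetical, and time is passed in explicitly so the example stays deterministic. Only the 10s lease duration from the diagram is modeled; the renew deadline and retry period govern how often candidates call the method and are left out.

```python
from dataclasses import dataclass
from typing import Optional

LEASE_DURATION_S = 10.0  # lease duration from the diagram


@dataclass
class Lease:
    """A single shared lease; whoever holds it is the leader."""

    holder: Optional[str] = None
    renewed_at: float = 0.0

    def try_acquire_or_renew(self, candidate: str, now: float) -> bool:
        """Take the lease if it is free or expired, or renew it if already held."""
        expired = self.holder is None or now - self.renewed_at >= LEASE_DURATION_S
        if expired or self.holder == candidate:
            self.holder = candidate
            self.renewed_at = now
            return True
        return False
```

In this model a second node cannot take over while the leader keeps renewing. Once renewals stop, the lease expires 10 seconds after the last renewal and the next candidate that retries becomes leader.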

State Management and Content Advertisement

Shows how content availability is maintained and advertised in the P2P network, including periodic refresh cycles and event-driven updates.

sequenceDiagram
    participant ST as State Tracker
    participant CD as Containerd
    participant P2P as P2P Router
    participant DHT as DHT Network
    participant MT as Metrics

    Note over ST,DHT: Content TTL: 10 minutes
    Note over ST,DHT: Refresh: Every 9 minutes

    loop Every 9 minutes
        ST->>CD: List Images
        CD-->>ST: Image List
        
        loop For each image
            ST->>P2P: Advertise(image_keys)
            P2P->>DHT: Provide(keys)
        end
        
        ST->>MT: Update Metrics
    end

    CD-->>ST: Image Event (Create/Update/Delete)
    ST->>P2P: Update Advertisement
    ST->>MT: Update Metrics
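
The relationship between the 10 minute TTL and the 9 minute refresh can be checked with a small model. This is a sketch, not Spegel's code: `AdvertisementTable` and its methods are hypothetical names, and timestamps are plain seconds passed in by the caller.

```python
KEY_TTL_S = 10 * 60          # content TTL from the diagram
REFRESH_INTERVAL_S = 9 * 60  # refresh interval from the diagram


class AdvertisementTable:
    """Tracks when each advertised key expires."""

    def __init__(self):
        self._expires_at = {}  # key -> absolute expiry time in seconds

    def provide(self, key: str, now: float) -> None:
        """Advertise a key, or refresh it, for one more TTL."""
        self._expires_at[key] = now + KEY_TTL_S

    def withdraw(self, key: str) -> None:
        """Stop advertising a key, for example after an image delete event."""
        self._expires_at.pop(key, None)

    def is_advertised(self, key: str, now: float) -> bool:
        return self._expires_at.get(key, float("-inf")) > now
```

Because the refresh interval is shorter than the TTL, a key that is refreshed on schedule never lapses. A key whose node disappears stops being refreshed and expires at most 10 minutes after its last advertisement.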

Content Resolution Process

Shows how content is located and retrieved from peers in the network, including peer selection and retry mechanisms.

sequenceDiagram
    participant SR as Spegel Registry
    participant P2P as P2P Router
    participant DHT as DHT Network
    participant PR1 as Peer 1
    participant PR2 as Peer 2

    SR->>P2P: Resolve(content_key)
    P2P->>DHT: FindProviders(key)
    
    par Parallel Resolution
        DHT-->>P2P: Found Peer 1
        DHT-->>P2P: Found Peer 2
    end

    P2P->>SR: Return First Available Peer
    
    Note over SR,PR2: Default 20ms timeout
    Note over SR,PR2: 3 retry attempts
    
    alt Try Peer 1
        SR->>PR1: Request Content
        PR1-->>SR: Stream Content
    else Peer 1 Fails
        SR->>PR2: Request Content
        PR2-->>SR: Stream Content
    end
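
The peer selection step, where the next peer is tried when the first one fails, can be sketched like this. The function name `fetch_from_peers` and the `fetch` callback are hypothetical, and a failed peer is assumed to surface as a `ConnectionError`.

```python
def fetch_from_peers(peers, fetch):
    """Try each resolved peer in order and return (peer, content) from the first that works.

    `fetch` is a caller-supplied function that takes a peer address and
    returns the content bytes, raising ConnectionError on failure.
    Returns (None, None) when every peer fails or the list is empty.
    """
    for peer in peers:
        try:
            return peer, fetch(peer)
        except ConnectionError:
            # This peer is unreachable; move on to the next candidate.
            continue
    return None, None
```

When the result is `(None, None)`, the caller would answer with a 404 so that the pull falls back to the external registry.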

Data Flow Paths

Shows the content paths and system control flows, including peer transfers and fallback mechanisms.

graph LR
    subgraph "Content Paths"
        CD[Containerd]
        SP[Spegel]
        P[Peers]
        ER[External Registry]
        CS[Content Store]
        
        CD -->|Request| SP
        SP -->|Check| P
        P -->|Content| SP
        SP -->|Return| CD
        CD -->|Store| CS

        SP -->|404| CD
        CD -->|Fallback| ER
    end

    subgraph "P2P Operations"
        P2P[P2P Network]
        DHT[DHT]
        ST[State Tracker]
        
        P2P -->|Advertise| DHT
        DHT -->|Discover| P2P
        ST -->|Update| P2P
    end

Failure Handling

Shows how the system handles three failure scenarios: no peer is found, the connection to a peer fails, and the content received from a peer is corrupted.

sequenceDiagram
    participant CD as Containerd
    participant SR as Spegel Registry
    participant P2P as P2P Router
    participant PR as Peer
    participant ER as External Registry

    Note over SR,ER: Failure Scenarios

    alt Peer Not Found
        CD->>SR: Request Content
        SR->>P2P: Resolve(key)
        P2P--xSR: No Peers Available
        SR-->>CD: 404 Not Found
        CD->>ER: Fallback Request
    end

    alt Peer Connection Failed
        SR->>PR: Request Content
        PR--xSR: Connection Failed
        SR->>P2P: Resolve(key) Retry
        P2P-->>SR: Alternative Peer
    end

    alt Content Corrupted
        SR->>PR: Request Content
        PR-->>SR: Stream Content
        SR--xCD: Verification Failed
        CD->>ER: Fallback Request
    end
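
The verification in the corrupted content scenario relies on digests being content hashes. A minimal sketch of such a check, assuming `sha256:` digests and a hypothetical helper name `verify_digest`, looks like this. It does not claim which component performs the check in practice.

```python
import hashlib


def verify_digest(content: bytes, digest: str) -> bool:
    """Check that `content` hashes to `digest`, given in the form 'sha256:<hex>'."""
    algorithm, _, expected = digest.partition(":")
    if algorithm != "sha256":
        # Only sha256 is handled in this sketch.
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    return hashlib.sha256(content).hexdigest() == expected
```

Because the digest is derived from the bytes themselves, content served by a peer can be checked without trusting that peer. A mismatch means the content is discarded and the request falls back to the external registry.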

Metrics Collection

Shows how metrics are collected and organized across the system components.

graph TB
    subgraph "Metrics Sources"
        RQ[Registry Requests]
        P2P[P2P Operations]
        ST[State Changes]
    end

    subgraph "Metric Types"
        CT[Counters]
        HT[Histograms]
        GT[Gauges]
    end

    subgraph "Prometheus Metrics"
        MR[mirror_requests_total]
        RD[resolve_duration_seconds]
        AI[advertised_images]
        AK[advertised_keys]
        RL[request_latency]
        IF[requests_inflight]
    end

    RQ --> CT
    RQ --> HT
    P2P --> HT
    P2P --> GT
    ST --> GT

    CT --> MR
    HT --> RD
    HT --> RL
    GT --> AI
    GT --> AK
    GT --> IF
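
The three metric types in the diagram behave differently, which a small standard-library model can show. This is a simplified sketch, not the Prometheus client library: the classes are hypothetical, and the histogram stores per-bucket counts, whereas Prometheus exposes cumulative bucket counts.

```python
import bisect


class Counter:
    """Only ever increases, e.g. mirror_requests_total."""

    def __init__(self):
        self.value = 0

    def inc(self, amount=1):
        if amount < 0:
            raise ValueError("a counter cannot decrease")
        self.value += amount


class Gauge:
    """Can go up and down, e.g. advertised_images."""

    def __init__(self):
        self.value = 0

    def set(self, value):
        self.value = value


class Histogram:
    """Counts observations per bucket, e.g. resolve_duration_seconds."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        # One extra bucket at the end catches values above every bound.
        self.counts = [0] * (len(self.upper_bounds) + 1)

    def observe(self, value):
        # A value equal to a bound lands in that bound's bucket.
        self.counts[bisect.bisect_left(self.upper_bounds, value)] += 1
```

A request handler would increment the counter once per request, the state tracker would set the gauges after each advertisement pass, and each resolve call would record its duration in the histogram.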