g3

Architecture

Data flow from S3 clients through the g3 gateway into Google Drive and Gmail

flowchart TD
    CLIENT["S3 Clients"]
    AUTH["SigV4 Auth"]
    ROUTER["HTTP Router"]
    PUT["PutObject"]
    GET["GetObject"]
    HEAD["HeadObject"]
    LIST["ListObjects"]
    MULTI["Multipart Store"]
    SQLITE["Metadata Index"]
    DRIVE["Google Drive API"]
    GMAIL["Gmail API"]
    DRIVESTORE["Drive Files"]
    GMAILSTORE["Gmail Emails"]
    METRICS["Prometheus"]
    TRACING["Tempo"]

    CLIENT -->|"S3 API requests"| AUTH
    AUTH -->|"bucket resolved"| ROUTER
    ROUTER --> PUT
    ROUTER --> GET
    ROUTER --> HEAD
    ROUTER --> LIST
    ROUTER --> MULTI
    MULTI -->|"stream parts"| PUT
    PUT -->|"upload data"| DRIVE
    PUT -->|"insert metadata email"| GMAIL
    PUT -->|"record"| SQLITE
    GET -->|"lookup file ID"| SQLITE
    GET -->|"download data"| DRIVE
    HEAD -->|"local query"| SQLITE
    LIST -->|"prefix query"| SQLITE
    DRIVE --> DRIVESTORE
    GMAIL --> GMAILSTORE
    ROUTER -->|"/metrics"| METRICS
    ROUTER -->|"OTLP gRPC"| TRACING

    classDef client fill:#172554,stroke:#60a5fa,color:#dbeafe
    classDef server fill:#1e293b,stroke:#334155,color:#e2e8f0
    classDef google fill:#132a1f,stroke:#22c55e,color:#dcfce7
    classDef local fill:#1e293b,stroke:#60a5fa,color:#dbeafe
    classDef obs fill:#2d2513,stroke:#f97316,color:#fef3c7

    class CLIENT client
    class AUTH,ROUTER,PUT,GET,HEAD,LIST,MULTI server
    class DRIVE,GMAIL,DRIVESTORE,GMAILSTORE google
    class SQLITE local
    class METRICS,TRACING obs

Storage Model

Object Data (Google Drive)

Object data is stored as individual files in a root Drive folder (s3/ by default). Drive accepts files of up to 5 TB, so each object is stored whole, without chunking. Each file is named bucket/key for identification.

Object Metadata (Gmail + Metadata Index)

Each object has a corresponding Gmail email:

  • Subject: s3://bucket-name/path/to/key (used for search and identification)
  • Body: JSON metadata (content_type, etag, size, drive_file_id, user metadata)
  • No attachment – object data lives in Drive

A metadata index (SQLite or PostgreSQL) caches this data along with the Gmail message ID and Drive file ID. This index is the primary lookup path for all read operations. SQLite is the default for single-node deployments; PostgreSQL allows the service to run on any node in a cluster without persistent local storage.
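As a rough illustration of the email layout described above, the following Go sketch builds the subject line and JSON body for one object. The struct name, the helper functions, and the `user_metadata` JSON key are assumptions made for this example; only the other field names come from the list above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ObjectMetadata mirrors the JSON body fields listed above.
// The "user_metadata" key name is an assumption for this sketch.
type ObjectMetadata struct {
	ContentType  string            `json:"content_type"`
	ETag         string            `json:"etag"`
	Size         int64             `json:"size"`
	DriveFileID  string            `json:"drive_file_id"`
	UserMetadata map[string]string `json:"user_metadata,omitempty"`
}

// Subject builds the s3://bucket/key subject line (hypothetical helper).
func Subject(bucket, key string) string {
	return "s3://" + bucket + "/" + key
}

// ParseSubject splits a subject line back into bucket and key.
// Everything after the first "/" is treated as the key.
func ParseSubject(subject string) (bucket, key string, err error) {
	if !strings.HasPrefix(subject, "s3://") {
		return "", "", fmt.Errorf("subject %q lacks the s3:// prefix", subject)
	}
	bucket, key, ok := strings.Cut(strings.TrimPrefix(subject, "s3://"), "/")
	if !ok || bucket == "" || key == "" {
		return "", "", fmt.Errorf("subject %q is not s3://bucket/key", subject)
	}
	return bucket, key, nil
}

func main() {
	meta := ObjectMetadata{
		ContentType:  "image/jpeg",
		ETag:         "placeholder-etag", // illustrative value only
		Size:         1024,
		DriveFileID:  "placeholder-drive-id", // illustrative value only
		UserMetadata: map[string]string{"owner": "alice"},
	}
	body, err := json.Marshal(meta)
	if err != nil {
		panic(err)
	}
	fmt.Println(Subject("photos", "2024/cat.jpg"))
	fmt.Println(string(body))
}
```

Because the subject encodes both bucket and key, the index could in principle be rebuilt by searching Gmail for `s3://` subjects; whether g3 actually does this is not stated here.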

API Call Budget

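| Operation    | Drive API Calls | Gmail API Calls  | Index Queries       |
|--------------|-----------------|------------------|---------------------|
| PutObject    | 1 (upload)      | 1 (insert email) | 1 (insert record)   |
| GetObject    | 1 (download)    | 0                | 1 (lookup file ID)  |
| HeadObject   | 0               | 0                | 1 (lookup metadata) |
| ListObjects  | 0               | 0                | 1 (prefix query)    |
| DeleteObject | 1 (delete file) | 1 (delete email) | 1 (delete record)   |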
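To illustrate why HeadObject and ListObjects need no Google API calls, here is a minimal in-memory stand-in for the metadata index. It is only a sketch: the real index is SQLite or PostgreSQL, and the `Index`, `Record`, and method names below are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// objectID identifies one object; buckets and keys are kept separate so a
// prefix match can never cross a bucket boundary.
type objectID struct{ Bucket, Key string }

// Record holds what the index caches per object (field names are assumptions).
type Record struct {
	Size           int64
	ETag           string
	DriveFileID    string
	GmailMessageID string
}

// Index is an in-memory stand-in for the SQLite/PostgreSQL metadata index.
type Index struct{ rows map[objectID]Record }

func NewIndex() *Index { return &Index{rows: make(map[objectID]Record)} }

// Insert corresponds to the single index write made by PutObject.
func (ix *Index) Insert(bucket, key string, r Record) {
	ix.rows[objectID{bucket, key}] = r
}

// Head answers a HeadObject-style lookup from the index alone.
func (ix *Index) Head(bucket, key string) (Record, bool) {
	r, ok := ix.rows[objectID{bucket, key}]
	return r, ok
}

// List answers a ListObjects-style prefix query and returns sorted keys.
func (ix *Index) List(bucket, prefix string) []string {
	var keys []string
	for id := range ix.rows {
		if id.Bucket == bucket && strings.HasPrefix(id.Key, prefix) {
			keys = append(keys, id.Key)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	ix := NewIndex()
	ix.Insert("photos", "2024/cat.jpg", Record{Size: 1024, DriveFileID: "drive-1", GmailMessageID: "msg-1"})
	ix.Insert("photos", "2024/dog.jpg", Record{Size: 2048, DriveFileID: "drive-2", GmailMessageID: "msg-2"})
	ix.Insert("photos", "2023/owl.jpg", Record{Size: 512, DriveFileID: "drive-3", GmailMessageID: "msg-3"})

	fmt.Println(ix.List("photos", "2024/")) // [2024/cat.jpg 2024/dog.jpg]
	if r, ok := ix.Head("photos", "2024/cat.jpg"); ok {
		fmt.Println(r.Size, r.DriveFileID) // 1024 drive-1
	}
}
```

A GetObject would use the same lookup to obtain the Drive file ID and then make its one Drive download call.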

Multipart Uploads

S3 multipart upload parts are buffered individually in memory:

  1. CreateMultipartUpload allocates an upload ID and in-memory part map
  2. UploadPart buffers each part keyed by part number
  3. CompleteMultipartUpload sorts parts and streams them via io.MultiReader into PutObject
  4. PutObject streams data through an MD5 hasher directly into the Drive upload

Abandoned uploads are cleaned up by a background goroutine (1-hour TTL, 10-minute sweep).

Observability

  • Prometheus metrics on the configurable /metrics endpoint cover HTTP requests, Gmail/Drive API calls, and operational state
  • OpenTelemetry traces export via OTLP gRPC with server spans for S3 requests and client spans for Gmail/Drive API calls
  • Structured JSON logs include trace_id and span_id for correlation in Grafana Loki + Tempo
  • Audit logging records security-relevant operations with request ID correlation

Request Flow

  1. S3 client sends a signed request (SigV4)
  2. Auth layer validates the signature and resolves the target bucket
  3. Router creates a server span and generates or adopts a request ID
  4. Request is dispatched to the appropriate handler
  5. Writes: data uploaded to Drive, metadata email inserted in Gmail, record stored in the metadata index
  6. Reads: metadata resolved from the metadata index, data downloaded from Drive
  7. Metadata-only operations: resolved entirely from the metadata index with zero Google API calls
  8. Response is written with appropriate S3 headers and XML
  9. Metrics are recorded and an audit log entry is emitted