g3

An S3-compatible HTTP gateway that uses Gmail and Google Drive as the storage backend.

Object data is stored in Google Drive files. Gmail emails serve as metadata pointers: each email body holds JSON with the Drive file ID, ETag, size, and user metadata. A local SQLite index eliminates API calls for metadata-only operations. Buckets map to Gmail labels. g3 is designed for write-once/read-rarely workloads like offsite backups, where Google’s 15 GB of free storage becomes a durable, API-accessible backup target.

How It Works

S3 Concept        Google Mapping
Bucket            Gmail label (s3/bucket-name)
Object data       Google Drive file
Object metadata   Gmail email body (JSON with Drive file ID)
Object key        Email subject (s3://bucket/path/to/key)
ETag              MD5 hex digest of content
Metadata index    Local SQLite database

S3 API Coverage

Operation                 Supported   Notes
PutObject                 Yes         Streams to Drive via resumable upload, inserts metadata email in Gmail
GetObject                 Yes         Downloads from Drive using cached file ID
HeadObject                Yes         Local SQLite lookup, zero API calls
DeleteObject              Yes         Removes Drive file, Gmail email, and index record
ListObjectsV2             Yes         Local SQLite query, zero API calls
ListBuckets               Yes         Lists all labels under the configured prefix
CreateBucket              Yes         Creates a Gmail label
HeadBucket                Yes         Checks bucket existence
GetBucketLocation         Yes         Returns empty constraint (us-east-1)
CreateMultipartUpload     Yes         In-memory part buffering
UploadPart                Yes         Parts 1-10000, max 100 concurrent uploads
CompleteMultipartUpload   Yes         Streams assembled parts into PutObject
AbortMultipartUpload      Yes         Discards buffered parts

Features

  • Drive + Gmail hybrid storage – object data in Drive via resumable upload (up to Drive's 5 TB per-file limit), metadata in Gmail emails
  • Local SQLite index – HeadObject and ListObjects resolve locally with zero API calls
  • S3-compatible API – works with the AWS CLI, s3cmd, any S3 SDK
  • Multipart upload – large files via standard S3 multipart protocol
  • Dual API quota pools – Drive (12,000 req/min) and Gmail (250 units/sec) operate independently
  • SigV4 authentication – standard AWS Signature Version 4 request signing
  • Prometheus metrics – request counts, latency, Gmail/Drive API metrics (g3_ prefix)
  • OpenTelemetry tracing – distributed traces with OTLP gRPC export
  • Log/span correlation – trace_id and span_id injected into structured JSON logs
  • Audit logging – security-relevant operations logged with request ID correlation
  • YAML configuration – environment variable expansion (${VAR} syntax)
  • Graceful shutdown – clean drain on SIGINT/SIGTERM
  • Health checks – /health (liveness) and /health/ready (readiness)

Prerequisites

Each user needs a Google Cloud project with the Gmail and Drive APIs enabled (free, no billing required):

  1. Go to Google Cloud Console
  2. Create a project (or use an existing one)
  3. Navigate to APIs & Services > Library and enable both Gmail API and Google Drive API
  4. Navigate to APIs & Services > Credentials
  5. Click Create Credentials > OAuth client ID
  6. Select application type Desktop app, name it (e.g., “g3”)
  7. Copy the client ID and client secret
  8. Navigate to OAuth consent screen, set to External, and add your email as a test user

Getting Started

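# Clone and build
git clone https://github.com/afreidah/g3.git
cd g3
make build

# Obtain a refresh token (opens a browser for the OAuth2 flow).
# Put the printed token in gmail.refresh_token, e.g. via GMAIL_REFRESH_TOKEN.
./g3 auth --client-id YOUR_CLIENT_ID --client-secret YOUR_CLIENT_SECRET

# Start the gateway
./g3 -config config.yaml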

Configuration

server:
  listen_addr: "0.0.0.0:9000"       # Listen address (default: 0.0.0.0:9000)
  log_level: "info"                  # debug, info, warn, error (default: info)
  read_timeout: "5m"                 # HTTP read timeout (default: 5m)
  write_timeout: "5m"                # HTTP write timeout (default: 5m)
  shutdown_timeout: "30s"            # Graceful shutdown deadline (default: 30s)

gmail:
  client_id: "${GMAIL_CLIENT_ID}"          # Google OAuth2 client ID (required)
  client_secret: "${GMAIL_CLIENT_SECRET}"  # Google OAuth2 client secret (required)
  refresh_token: "${GMAIL_REFRESH_TOKEN}"  # OAuth2 refresh token from g3 auth (required)
  user: "me"                               # Gmail user (default: me)
  label_prefix: "s3"                       # Gmail label prefix for buckets (default: s3)

database:
  driver: "sqlite"                         # "sqlite" or "postgres" (default: sqlite)
  path: "/data/g3/metadata.db"             # SQLite: database file path (default: g3-metadata.db)
  # PostgreSQL options (used when driver is "postgres"):
  # host: "haproxy-postgres.service.consul"
  # port: 5433
  # database: "g3"
  # user: "${G3_DB_USER}"
  # password: "${G3_DB_PASSWORD}"
  # ssl_mode: "require"
  # max_conns: 5

buckets:
  - name: "backups"                        # Bucket name (maps to Gmail label s3/backups)
    credentials:
      - access_key_id: "mykey"             # S3 access key for this bucket
        secret_access_key: "mysecret"      # S3 secret key for this bucket

telemetry:
  metrics:
    enabled: true                          # Enable Prometheus endpoint (default: false)
    path: "/metrics"                       # Metrics path (default: /metrics)
  tracing:
    enabled: false                         # Enable OpenTelemetry tracing (default: false)
    endpoint: "tempo:4317"                 # OTLP gRPC endpoint
    insecure: true                         # Use insecure gRPC connection
    sample_rate: 1.0                       # Trace sampling rate 0.0-1.0

All string values support ${ENV_VAR} expansion, making it easy to inject secrets from Vault, Nomad templates, or environment variables.

Usage

Basic operations with the AWS CLI

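# Credentials and endpoint must match the g3 config
export AWS_ACCESS_KEY_ID=mykey
export AWS_SECRET_ACCESS_KEY=mysecret
export AWS_ENDPOINT_URL=http://localhost:9000

# Create a bucket (a Gmail label)
aws s3 mb s3://backups

# Upload an object
aws s3 cp backup.tar.gz s3://backups/daily/backup.tar.gz

# List objects under a prefix
aws s3 ls s3://backups/daily/

# Download an object
aws s3 cp s3://backups/daily/backup.tar.gz ./restored.tar.gz

# Delete an object
aws s3 rm s3://backups/daily/backup.tar.gz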

As an s3-orchestrator backend

g3 can be added as a backend in s3-orchestrator alongside other S3-compatible providers:

backends:
  - name: "gmail"
    endpoint: "http://g3.service.consul:9000"
    region: "us-east-1"
    bucket: "backups"
    access_key_id: "mykey"
    secret_access_key: "mysecret"
    force_path_style: true
    quota_bytes: 15000000000    # 15 GB Google storage limit

CLI Subcommands

Command          Description
g3 or g3 serve   Start the S3 gateway server
g3 auth          Obtain a refresh token via OAuth2 browser flow
g3 sync          Rebuild SQLite metadata index from Gmail
g3 validate      Validate a config file without starting the server
g3 version       Print version and Go runtime information
g3 help          Show available commands

g3 sync

g3 sync -config /path/to/config.yaml

Scans all Gmail emails under the configured label prefix and populates the local SQLite metadata index. Use this to recover the index after data loss, after migrating to a new host, or to index objects written before the SQLite layer was added.

g3 auth

g3 auth --client-id <id> --client-secret <secret> [--port <port>]

Opens a browser for Google OAuth2 authorization requesting gmail.modify and drive.file scopes. After approval, prints the refresh token to stdout. The --port flag sets the localhost callback port (default: auto-assigned).

g3 validate

g3 validate -config /path/to/config.yaml

Parses and validates the configuration file, checking all required fields and defaults. Exits 0 on success, 1 on failure with error details.

Architecture

              S3 Clients (aws cli, s3-orchestrator, SDKs)
                             |
                        [SigV4 Auth]
                             |
                   g3 S3 HTTP Server
                   /         |        \
            PutObject   GetObject   ListObjects ...
                   |         |           |
              [SQLite Metadata Index]    |
              /         |         \      |
     Drive Upload   Drive Download   Local Query
         |              |
    Gmail Insert    (data from Drive)
    (metadata email)
         |
   Google Drive (object data)  +  Gmail (metadata emails)

Storage model

  • Object data is stored as Google Drive files in a root folder (s3/ by default). g3 imposes no size limit of its own; Drive supports files up to 5 TB.
  • Object metadata is stored as Gmail emails with JSON in the body containing the Drive file ID, content type, ETag, size, and user metadata. No attachment.
  • Metadata index (SQLite or PostgreSQL) maps bucket/key to Gmail message ID, Drive file ID, and metadata. HeadObject and ListObjects resolve entirely from the index with zero API calls. GetObject and DeleteObject use the cached IDs to skip Gmail search. SQLite is the default for single-node deployments; PostgreSQL allows the service to run on any node in a cluster.
  • Buckets map to Gmail labels under the configured prefix (e.g., s3/backups).
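
To make the metadata email body concrete, here is a minimal sketch of what such a JSON payload could look like in Go. The struct name and JSON key names are assumptions chosen for illustration; the README only states which pieces of information the body contains.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// objectMeta is a hypothetical shape for the JSON body of a metadata
// email. The fields mirror the storage model; key names are assumed.
type objectMeta struct {
	DriveFileID  string            `json:"drive_file_id"`
	ContentType  string            `json:"content_type"`
	ETag         string            `json:"etag"`
	Size         int64             `json:"size"`
	UserMetadata map[string]string `json:"user_metadata,omitempty"`
}

// encodeMeta serializes the metadata for use as an email body.
func encodeMeta(m objectMeta) (string, error) {
	b, err := json.Marshal(m)
	return string(b), err
}

// decodeMeta parses an email body back into metadata.
func decodeMeta(body string) (objectMeta, error) {
	var m objectMeta
	err := json.Unmarshal([]byte(body), &m)
	return m, err
}

func main() {
	body, err := encodeMeta(objectMeta{
		DriveFileID: "hypothetical-file-id",
		ContentType: "application/gzip",
		ETag:        "5d41402abc4b2a76b9719d911017c592",
		Size:        5,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(body)

	m, err := decodeMeta(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(m.DriveFileID, m.Size)
}
```

Keeping the body as plain JSON means a rebuild such as g3 sync only needs to parse email bodies to recover the index, which fits the recovery story described in this README.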

Data flow

Write path (PutObject):

  1. Upload object data to Google Drive
  2. Insert metadata-only email in Gmail with Drive file ID
  3. Record metadata in local SQLite index

Read path (GetObject):

  1. Look up Drive file ID from SQLite index (or Gmail email on cache miss)
  2. Download object data from Google Drive

Metadata path (HeadObject, ListObjects):

  1. Query local SQLite index – zero API calls

Multipart uploads

S3 multipart upload parts are buffered individually in memory. On CompleteMultipartUpload, parts are streamed in order via io.MultiReader into the PutObject path (Drive upload + Gmail metadata) without assembling into a single buffer. Abandoned uploads are cleaned up after 1 hour.

Limits: 100 concurrent uploads, part numbers 1-10000.

Observability

Prometheus Metrics

Available at /metrics when telemetry.metrics.enabled is true.

Metric                          Type        Labels
g3_requests_total               Counter     method, status_code
g3_request_duration_seconds     Histogram   method
g3_request_size_bytes           Histogram   method
g3_response_size_bytes          Histogram   method
g3_inflight_requests            Gauge       method
g3_gmail_api_requests_total     Counter     operation, status
g3_gmail_api_duration_seconds   Histogram   operation
g3_gmail_storage_bytes          Gauge       (none)
g3_objects_total                Gauge       bucket
g3_audit_events_total           Counter     event
g3_build_info                   Gauge       version, go_version

Tracing

When telemetry.tracing.enabled is true, g3 exports traces via OTLP gRPC. Each S3 request produces a server span, and each Gmail/Drive API call produces a child client span. Custom attributes use the "g3." prefix (e.g., g3.bucket, g3.key, g3.gmail.message_id).

Trace IDs and span IDs are automatically injected into JSON log output for correlation in tools like Grafana Loki + Tempo.

Health Checks

Endpoint            Behavior
GET /health         Always returns 200 {"status":"ok"}
GET /health/ready   Returns 200 after startup, 503 during initialization or shutdown

Limitations

  • Google storage quota: 15 GB shared across Gmail, Drive, and Photos. Objects count against this limit.
  • API rate limits: Drive allows 12,000 requests/user/minute. Gmail allows 250 quota units/second. Sufficient for backup workloads.
  • Eventual consistency: Gmail search indexing has a small delay. Objects not yet in the metadata index may take a few seconds to appear via Gmail search fallback.
  • Memory usage: Multipart upload parts are buffered in memory until completion. PutObject streams data directly to Drive without buffering the full object.
  • Metadata persistence: SQLite requires a persistent volume; if lost, run g3 sync to rebuild from Gmail. PostgreSQL avoids this by using a shared database.

Project Structure

cmd/g3/              Entry point and subcommands (serve, auth, sync, validate, version)
internal/
  audit/              Request ID generation, context propagation, audit logging
  auth/               SigV4 signature verification, bucket registry
  backend/
    types.go          ObjectBackend interface, MetadataStore interface, result types
    gmail.go          Gmail + Drive hybrid backend (PutObject, GetObject, HeadObject, DeleteObject)
    gmail_list.go     ListObjects, ListBuckets, CreateBucket
    gmail_chunked.go  Legacy chunked object support (read-only for old data)
    email.go          MIME email construction and parsing
    search.go         Gmail search query builder
  config/             YAML config loading, validation, defaults
  store/
    sqlite.go         SQLite metadata index (local, requires persistent volume)
    postgres.go       PostgreSQL metadata index (shared, via pgx/v5 + sqlc)
    sqlc/             sqlc-generated query code
    migrations/       Goose SQL migrations
  server/
    server.go         HTTP routing, auth, spans, audit logging
    objects.go        PUT, GET, HEAD, DELETE handlers
    list.go           ListObjectsV2 handler
    buckets.go        ListBuckets, CreateBucket, HeadBucket, GetBucketLocation
    multipart.go      Multipart upload store and handlers
    helpers.go        S3 XML responses, path parsing, metadata extraction
  telemetry/
    metrics.go        Prometheus metric definitions
    tracing.go        OpenTelemetry initialization, span helpers
    tracehandler.go   slog handler for trace/span ID injection
    logbuffer.go      Circular log buffer for operational visibility

License

MIT