
README
An S3-compatible HTTP gateway that uses Gmail and Google Drive as the storage backend.
Object data is stored in Google Drive files. Gmail emails serve as metadata pointers with JSON in the body containing the Drive file ID, ETag, size, and user metadata. A local SQLite index eliminates API calls for metadata-only operations. Buckets map to Gmail labels. Designed for write-once/read-rarely workloads like offsite backups, where Google’s 15 GB of free storage becomes a durable, API-accessible backup target.
How It Works
| S3 Concept | Google Mapping |
|---|---|
| Bucket | Gmail label (s3/bucket-name) |
| Object data | Google Drive file |
| Object metadata | Gmail email body (JSON with Drive file ID) |
| Object key | Email subject (s3://bucket/path/to/key) |
| ETag | MD5 hex digest of content |
| Metadata index | Local SQLite database |
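The mapping in the table above can be sketched as a few pure functions. This is a minimal illustration, not g3's actual code: the function names (labelForBucket, subjectForKey, etagFor) are hypothetical, and the label prefix "s3" is taken from the example in the table.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// labelForBucket returns the Gmail label for a bucket, e.g. "s3/backups".
// The prefix is assumed to be configurable; "s3" matches the table above.
func labelForBucket(prefix, bucket string) string {
	return prefix + "/" + bucket
}

// subjectForKey returns the email subject that identifies an object,
// e.g. "s3://backups/db/dump.sql".
func subjectForKey(bucket, key string) string {
	return "s3://" + bucket + "/" + key
}

// etagFor returns the MD5 hex digest of the object content.
func etagFor(content []byte) string {
	sum := md5.Sum(content)
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(labelForBucket("s3", "backups"))
	fmt.Println(subjectForKey("backups", "db/dump.sql"))
	fmt.Println(etagFor([]byte("hello")))
}
```

Keeping the mapping deterministic means the label and subject for any bucket/key pair can be recomputed without consulting the index.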
S3 API Coverage
| Operation | Supported | Notes |
|---|---|---|
| PutObject | Yes | Streams to Drive via resumable upload, inserts metadata email in Gmail |
| GetObject | Yes | Downloads from Drive using cached file ID |
| HeadObject | Yes | Local SQLite lookup, zero API calls |
| DeleteObject | Yes | Removes Drive file, Gmail email, and index record |
| ListObjectsV2 | Yes | Local SQLite query, zero API calls |
| ListBuckets | Yes | Lists all labels under the configured prefix |
| CreateBucket | Yes | Creates a Gmail label |
| HeadBucket | Yes | Checks bucket existence |
| GetBucketLocation | Yes | Returns empty constraint (us-east-1) |
| CreateMultipartUpload | Yes | In-memory part buffering |
| UploadPart | Yes | Parts 1-10000, max 100 concurrent uploads |
| CompleteMultipartUpload | Yes | Streams assembled parts into PutObject |
| AbortMultipartUpload | Yes | Discards buffered parts |
Features
- Drive + Gmail hybrid storage – object data in Drive via resumable upload (no size limit), metadata in Gmail emails
- Local SQLite index – HeadObject and ListObjects resolve locally with zero API calls
- S3-compatible API – works with the AWS CLI, s3cmd, any S3 SDK
- Multipart upload – large files via standard S3 multipart protocol
- Dual API quota pools – Drive (12,000 req/min) and Gmail (250 units/sec) operate independently
- SigV4 authentication – standard AWS Signature Version 4 request signing
- Prometheus metrics – request counts, latency, Gmail/Drive API metrics (g3_ prefix)
- OpenTelemetry tracing – distributed traces with OTLP gRPC export
- Log/span correlation – trace_id and span_id injected into structured JSON logs
- Audit logging – security-relevant operations logged with request ID correlation
- YAML configuration – environment variable expansion (${VAR} syntax)
- Graceful shutdown – clean drain on SIGINT/SIGTERM
- Health checks – /health (liveness) and /health/ready (readiness)
Prerequisites
Each user needs a Google Cloud project with the Gmail and Drive APIs enabled (free, no billing required):
- Go to Google Cloud Console
- Create a project (or use an existing one)
- Navigate to APIs & Services > Library and enable both Gmail API and Google Drive API
- Navigate to APIs & Services > Credentials
- Click Create Credentials > OAuth client ID
- Select application type Desktop app, name it (e.g., “g3”)
- Copy the client ID and client secret
- Navigate to OAuth consent screen, set to External, and add your email as a test user
Getting Started
Configuration
All string values support ${ENV_VAR} expansion, making it easy to inject secrets from Vault, Nomad templates, or environment variables.
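As a rough illustration of how ${ENV_VAR} expansion can work, the sketch below uses Go's standard os.Expand. Whether g3 uses this exact call is an assumption, and the function name expandConfigValue and the variable G3_CLIENT_SECRET are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// expandConfigValue replaces ${VAR} references in raw with the values
// returned by lookup. Variables that lookup does not know expand to "".
func expandConfigValue(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

func main() {
	// Hypothetical variable name, set here only for the demonstration.
	os.Setenv("G3_CLIENT_SECRET", "example-secret")
	fmt.Println(expandConfigValue("secret=${G3_CLIENT_SECRET}", os.Getenv))
}
```

Note that os.Expand also accepts the bare $VAR form, so a literal dollar sign in a config value would need care under this approach.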
Usage
Basic operations with the AWS CLI
As an s3-orchestrator backend
g3 can be added as a backend in s3-orchestrator alongside other S3-compatible providers:
CLI Subcommands
| Command | Description |
|---|---|
| g3 or g3 serve | Start the S3 gateway server |
| g3 auth | Obtain a refresh token via OAuth2 browser flow |
| g3 sync | Rebuild SQLite metadata index from Gmail |
| g3 validate | Validate a config file without starting the server |
| g3 version | Print version and Go runtime information |
| g3 help | Show available commands |
g3 sync
Scans all Gmail emails under the configured label prefix and populates the local SQLite metadata index. Use this to recover the index after data loss, after migrating to a new host, or to index objects written before the SQLite layer was added.
g3 auth
Opens a browser for Google OAuth2 authorization requesting gmail.modify and drive.file scopes. After approval, prints the refresh token to stdout. The --port flag sets the localhost callback port (default: auto-assigned).
g3 validate
Parses and validates the configuration file, checking all required fields and defaults. Exits 0 on success, 1 on failure with error details.
Architecture
Storage model
- Object data is stored as Google Drive files in a root folder (s3/ by default). No size limit beyond Drive's own cap of 5 TB per file.
- Object metadata is stored as Gmail emails with JSON in the body containing the Drive file ID, content type, ETag, size, and user metadata. No attachment.
- Metadata index (SQLite or PostgreSQL) maps bucket/key to Gmail message ID, Drive file ID, and metadata. HeadObject and ListObjects resolve entirely from the index with zero API calls. GetObject and DeleteObject use the cached IDs to skip Gmail search. SQLite is the default for single-node deployments; PostgreSQL allows the service to run on any node in a cluster.
- Buckets map to Gmail labels under the configured prefix (e.g., s3/backups).
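To make the metadata email body concrete, here is a minimal sketch of a JSON document carrying the fields listed above. The struct name and the JSON field names (drive_file_id, content_type, and so on) are assumptions for illustration; the actual schema used by g3 is not shown in this README.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// objectMetadata mirrors the fields the README says are stored in the
// email body. All JSON key names here are hypothetical.
type objectMetadata struct {
	DriveFileID  string            `json:"drive_file_id"`
	ContentType  string            `json:"content_type"`
	ETag         string            `json:"etag"`
	Size         int64             `json:"size"`
	UserMetadata map[string]string `json:"user_metadata,omitempty"`
}

// encodeMetadata serializes the metadata for use as an email body.
func encodeMetadata(m objectMetadata) (string, error) {
	b, err := json.Marshal(m)
	return string(b), err
}

// decodeMetadata parses an email body back into metadata.
func decodeMetadata(body string) (objectMetadata, error) {
	var m objectMetadata
	err := json.Unmarshal([]byte(body), &m)
	return m, err
}

func main() {
	body, err := encodeMetadata(objectMetadata{
		DriveFileID:  "example-file-id",
		ContentType:  "application/octet-stream",
		ETag:         "5d41402abc4b2a76b9719d911017c592",
		Size:         5,
		UserMetadata: map[string]string{"source": "nightly-backup"},
	})
	fmt.Println(body, err)
}
```

Because the email body is self-describing, an index rebuild only needs to read each message and decode it; no Drive calls are required to recover the metadata.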
Data flow
Write path (PutObject):
- Upload object data to Google Drive
- Insert metadata-only email in Gmail with Drive file ID
- Record metadata in local SQLite index
Read path (GetObject):
- Look up Drive file ID from SQLite index (or Gmail email on cache miss)
- Download object data from Google Drive
Metadata path (HeadObject, ListObjects):
- Query local SQLite index – zero API calls
Multipart uploads
S3 multipart upload parts are buffered individually in memory. On CompleteMultipartUpload, parts are streamed in order via io.MultiReader into the PutObject path (Drive upload + Gmail metadata) without assembling into a single buffer. Abandoned uploads are cleaned up after 1 hour.
Limits: 100 concurrent uploads, part numbers 1-10000.
Observability
Prometheus Metrics
Available at /metrics when telemetry.metrics.enabled is true.
| Metric | Type | Labels |
|---|---|---|
| g3_requests_total | Counter | method, status_code |
| g3_request_duration_seconds | Histogram | method |
| g3_request_size_bytes | Histogram | method |
| g3_response_size_bytes | Histogram | method |
| g3_inflight_requests | Gauge | method |
| g3_gmail_api_requests_total | Counter | operation, status |
| g3_gmail_api_duration_seconds | Histogram | operation |
| g3_gmail_storage_bytes | Gauge | – |
| g3_objects_total | Gauge | bucket |
| g3_audit_events_total | Counter | event |
| g3_build_info | Gauge | version, go_version |
Tracing
When telemetry.tracing.enabled is true, g3 exports traces via OTLP gRPC. Each S3 request produces a server span, and each Gmail/Drive API call produces a child client span. Custom attributes are prefixed with g3. (e.g., g3.bucket, g3.key, g3.gmail.message_id).
Trace IDs and span IDs are automatically injected into JSON log output for correlation in tools like Grafana Loki + Tempo.
Health Checks
| Endpoint | Behavior |
|---|---|
| GET /health | Always returns 200 {"status":"ok"} |
| GET /health/ready | Returns 200 after startup, 503 during initialization or shutdown |
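The behavior in the table can be approximated with a readiness flag that is flipped once startup completes and cleared when shutdown begins. This is a sketch under that assumption; the type and function names (health, routes, probe) are hypothetical, and the body of the readiness response is left empty because the README does not specify it.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// health tracks whether the server is ready to take traffic.
type health struct {
	ready atomic.Bool
}

// routes registers the liveness and readiness endpoints.
func (h *health) routes() *http.ServeMux {
	mux := http.NewServeMux()
	// Liveness: always 200 while the process is running.
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusOK)
		fmt.Fprint(w, `{"status":"ok"}`)
	})
	// Readiness: 503 until the flag is set, and again once it is cleared.
	mux.HandleFunc("/health/ready", func(w http.ResponseWriter, r *http.Request) {
		if !h.ready.Load() {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// probe issues an in-process GET and returns the response status code.
func probe(mux *http.ServeMux, path string) int {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	h := &health{}
	mux := h.routes()
	fmt.Println(probe(mux, "/health"), probe(mux, "/health/ready"))
	h.ready.Store(true) // startup finished
	fmt.Println(probe(mux, "/health/ready"))
}
```

Separating the two endpoints lets an orchestrator stop routing traffic during shutdown without restarting a process that is still draining requests.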
Limitations
- Google storage quota: 15 GB shared across Gmail, Drive, and Photos. Objects count against this limit.
- API rate limits: Drive allows 12,000 requests/user/minute. Gmail allows 250 quota units/second. Sufficient for backup workloads.
- Eventual consistency: Gmail search indexing has a small delay. Objects not yet in the metadata index may take a few seconds to appear via Gmail search fallback.
- Memory usage: Multipart upload parts are buffered in memory until completion. PutObject streams data directly to Drive without buffering the full object.
- Metadata persistence: SQLite requires a persistent volume; if lost, run g3 sync to rebuild from Gmail. PostgreSQL avoids this by using a shared database.
Project Structure
License
MIT