GCE VM Workstation Provisioning for Autonomous Coding Agents
📋 Related Issue: Issue #411 - Provision GCE VM Workstations for Autonomous Coding Agents
Overview
This document provides step-by-step gcloud commands for provisioning GCE VM workstations to host autonomous AI coding agents (Cursor, Antigravity). These VMs will run the agent IDEs with full access to agent-specific Git SSH keys and GitHub PATs stored in GCP Secret Manager.
Prerequisites
- User must be authenticated with `gcloud auth login`
- User must have `Project Creator` and `Billing Account User` roles
- Billing account ID available (list with `gcloud beta billing accounts list`)
Step 1: Create the GCP Project
# Configuration
PROJECT_ID="permitproof-dev-workstations"
PROJECT_NAME="permitproof-dev-workstations"
BILLING_ACCOUNT_ID="018A1F-2219A5-D47906" # CodeProof.app billing account
REGION="us-central1"
ZONE="us-central1-a"
# Create the project
gcloud projects create "${PROJECT_ID}" \
--name="${PROJECT_NAME}"
# Link billing account
gcloud beta billing projects link "${PROJECT_ID}" \
--billing-account="${BILLING_ACCOUNT_ID}"
Note: All subsequent commands explicitly use `--project="${PROJECT_ID}"` rather than relying on a default project.
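Before moving on, it is worth confirming the project exists and billing actually linked; a quick check using standard gcloud commands:

```shell
# Confirm the project exists and is active
gcloud projects describe "${PROJECT_ID}" --format="value(projectId,lifecycleState)"

# Confirm billing is linked (should print True)
gcloud beta billing projects describe "${PROJECT_ID}" \
  --format="value(billingEnabled)"
```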
Step 2: Enable Required APIs
# Compute Engine API - for VM instances
gcloud services enable compute.googleapis.com --project="${PROJECT_ID}"
# Secret Manager API - for accessing agent credentials
gcloud services enable secretmanager.googleapis.com --project="${PROJECT_ID}"
# IAM API - for service account management
gcloud services enable iam.googleapis.com --project="${PROJECT_ID}"
# Cloud Resource Manager API - for project-level operations
gcloud services enable cloudresourcemanager.googleapis.com --project="${PROJECT_ID}"
# OS Login API - for SSH access management (optional but recommended)
gcloud services enable oslogin.googleapis.com --project="${PROJECT_ID}"
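The five enable calls above can also be batched; `gcloud services enable` accepts multiple services in a single invocation:

```shell
# Equivalent single call - enables all five APIs at once
gcloud services enable \
  compute.googleapis.com \
  secretmanager.googleapis.com \
  iam.googleapis.com \
  cloudresourcemanager.googleapis.com \
  oslogin.googleapis.com \
  --project="${PROJECT_ID}"
```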
Step 3: Provision Network Infrastructure
3.1 Create VPC Network
NETWORK_NAME="agent-workstation-network"
# Create custom VPC network
gcloud compute networks create "${NETWORK_NAME}" \
--project="${PROJECT_ID}" \
--subnet-mode=custom \
--description="Network for AI agent workstations"
3.2 Create Subnet
SUBNET_NAME="agent-workstation-subnet"
SUBNET_RANGE="10.0.0.0/24"
gcloud compute networks subnets create "${SUBNET_NAME}" \
--project="${PROJECT_ID}" \
--network="${NETWORK_NAME}" \
--region="${REGION}" \
--range="${SUBNET_RANGE}"
Network Resource Costs
| Resource | Pricing | Estimated Monthly Cost |
|---|---|---|
| VPC Network | Free | $0 |
| Subnet | Free | $0 |
| Cloud Router | Free | $0 |
| Cloud NAT Gateway | $0.045/hour | ~$32/month |
| Cloud NAT Data Processing | $0.045/GB egress | ~$5-10/month (varies by usage) |
| Firewall Rules | Free | $0 |
| Network Total | | ~$37-42/month |
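The NAT gateway figure in the table follows directly from the hourly rate, using ~730 hours per month:

```shell
# Cloud NAT gateway: $0.045/hour * ~730 hours/month = ~$32.85/month
awk 'BEGIN { printf "%.2f\n", 0.045 * 730 }'
```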
Cloud NAT Cost Optimization Options
| Option | Pros | Cons | Monthly Savings |
|---|---|---|---|
| Use Private Google Access only | Free, no NAT needed for GCP APIs | Cannot reach GitHub, npm, apt repos | ~$37 |
| Ephemeral external IP on VM | No NAT cost, direct internet | Less secure (public IP), IP changes on restart | ~$37 |
| Delete NAT when idle | Pay only when needed | Manual/scripted process, VMs can't reach internet when deleted | Variable |
| Single NAT for all VMs | Already shared by default | N/A - this is the default behavior | N/A |
Recommendation: If agents only need occasional internet access (git clone, apt update), consider:
- Enable Private Google Access for GCP services (Secret Manager, GCS) - always free
- Create NAT only when needed, delete when done
- Or use ephemeral external IP if security is acceptable
# Enable Private Google Access on subnet (free access to Google APIs)
gcloud compute networks subnets update "${SUBNET_NAME}" \
--project="${PROJECT_ID}" \
--region="${REGION}" \
--enable-private-ip-google-access
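To verify the flag took effect, describe the subnet; the field should read True:

```shell
# Check that Private Google Access is enabled on the subnet
gcloud compute networks subnets describe "${SUBNET_NAME}" \
  --project="${PROJECT_ID}" \
  --region="${REGION}" \
  --format="value(privateIpGoogleAccess)"
```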
# Delete NAT when not needed (saves ~$32/month)
# (ROUTER_NAME and NAT_NAME are defined in Step 3.3 below)
gcloud compute routers nats delete "${NAT_NAME}" \
--project="${PROJECT_ID}" \
--router="${ROUTER_NAME}" \
--region="${REGION}" \
--quiet
# Recreate NAT when needed
gcloud compute routers nats create "${NAT_NAME}" \
--project="${PROJECT_ID}" \
--router="${ROUTER_NAME}" \
--region="${REGION}" \
--nat-all-subnet-ip-ranges \
--auto-allocate-nat-external-ips
Note: There is no multi-tenant/shared NAT across projects or idle/sleep mode for Cloud NAT. It's either running (~$32/month) or deleted.
3.3 Create Cloud NAT (for outbound internet access)
ROUTER_NAME="agent-workstation-router"
NAT_NAME="agent-workstation-nat"
# Create Cloud Router
gcloud compute routers create "${ROUTER_NAME}" \
--project="${PROJECT_ID}" \
--network="${NETWORK_NAME}" \
--region="${REGION}"
# Create Cloud NAT
gcloud compute routers nats create "${NAT_NAME}" \
--project="${PROJECT_ID}" \
--router="${ROUTER_NAME}" \
--region="${REGION}" \
--nat-all-subnet-ip-ranges \
--auto-allocate-nat-external-ips
3.4 Create Firewall Rules
# Allow IAP tunneling for SSH only
# IAP uses Google's IP range 35.235.240.0/20
gcloud compute firewall-rules create "${NETWORK_NAME}-allow-iap-ssh" \
--project="${PROJECT_ID}" \
--network="${NETWORK_NAME}" \
--direction=INGRESS \
--priority=1000 \
--action=ALLOW \
--rules=tcp:22 \
--source-ranges="35.235.240.0/20" \
--description="Allow SSH via IAP tunneling"
# Allow internal communication between VMs
gcloud compute firewall-rules create "${NETWORK_NAME}-allow-internal" \
--project="${PROJECT_ID}" \
--network="${NETWORK_NAME}" \
--direction=INGRESS \
--priority=1000 \
--action=ALLOW \
--rules=tcp:0-65535,udp:0-65535,icmp \
--source-ranges="${SUBNET_RANGE}"
Security: VMs have no public IP. SSH via IAP tunneling, remote desktop via Chrome Remote Desktop (no firewall rules needed - uses Google's relay servers).
Step 4: Agent Identity Configuration
Each autonomous agent VM uses a service account from its dedicated agent project, not a shared workstation service account. This maintains strict isolation between agents.
| Service Account | Agent Type | VM Host Project | VM Instance(s) |
|---|---|---|---|
| `ai-swe-agent@construction-code-expert-crsr.iam.gserviceaccount.com` | Cursor | permitproof-dev-workstations | alex-cursor-autonomous-agent |
| `ai-swe-agent@construction-code-expert-agy.iam.gserviceaccount.com` | Antigravity | permitproof-dev-workstations | alex-antigravity-autonomous-agent |
| `ai-swe-agent@construction-code-expert-agy2.iam.gserviceaccount.com` | Antigravity | permitproof-dev-workstations | philip-antigravity-autonomous-agent |
Note: Multiple VMs can share the same agent identity - they all commit as the same GitHub user.
4.1 Prerequisites
Agent GCP projects and service accounts are provisioned via Terraform (similar to Issue #384 for the -crsr project).
This playbook assumes the following already exist:
- Agent GCP project (e.g., `construction-code-expert-crsr`, `construction-code-expert-agy`)
- Service account `ai-swe-agent@{project}.iam.gserviceaccount.com`
- Secret Manager access configured in `construction-code-expert-admin`
4.2 Verify Agent Service Account
# Agent type to project suffix mapping:
# cursor => crsr, antigravity => agy
PROJECT_SUFFIX="agy" # or "crsr", "agy2", etc.
AGENT_PROJECT_ID="construction-code-expert-${PROJECT_SUFFIX}"
AGENT_SA_EMAIL="ai-swe-agent@${AGENT_PROJECT_ID}.iam.gserviceaccount.com"
# Verify service account exists
gcloud iam service-accounts describe "${AGENT_SA_EMAIL}" --project="${AGENT_PROJECT_ID}"
# Verify Secret Manager access (from admin project)
ADMIN_PROJECT_ID="construction-code-expert-admin"
gcloud secrets get-iam-policy cursor-autonomous-agent-ssh-key --project="${ADMIN_PROJECT_ID}" \
| grep "${AGENT_SA_EMAIL}" && echo "✅ SSH key access configured"
Step 5: Provision GCE VM Instance
5.1 Startup Script
The base software installation script is version-controlled at:
cli/sdlc/gce-workstation/install-base-software.sh
This script installs:
- XFCE desktop environment
- Chrome Remote Desktop (no port/firewall config needed)
- Google Chrome browser
- Development tools (git, curl, docker, jq, etc.)
- Google Cloud CLI
- GitHub CLI
- Java (Temurin 21)
Assumption: All commands below assume you are running from the root of the repo on your local machine.
5.2 Create the VM Instance
VM Naming Convention
VMs are named with the pattern: {owner}-{agent-type}-autonomous-agent
Examples:
- `alex-antigravity-autonomous-agent` - Alex's Antigravity agent
- `philip-antigravity-autonomous-agent` - Philip's Antigravity agent
- `alex-cursor-autonomous-agent` - Alex's Cursor agent
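The convention can be captured in a small helper (the function name `vm_name` is illustrative, not part of the repo):

```shell
# Build a VM name per the {owner}-{agent-type}-autonomous-agent convention
vm_name() {
  echo "${1}-${2}-autonomous-agent"
}

vm_name alex antigravity    # -> alex-antigravity-autonomous-agent
vm_name philip antigravity  # -> philip-antigravity-autonomous-agent
```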
Create VM Command
⚠️ Important: Set ALL variables in a single shell session before running the create command. Variables from earlier sections (Step 1, Step 3) must still be set.
# ============================================
# STEP 1: Set variables from earlier sections
# (if not already set in current shell)
# ============================================
PROJECT_ID="permitproof-dev-workstations"
ZONE="us-central1-a"
NETWORK_NAME="agent-workstation-network"
SUBNET_NAME="agent-workstation-subnet"
# ============================================
# STEP 2: VM-specific configuration
# ============================================
OWNER="alex" # Owner: "alex", "philip", etc.
AGENT_TYPE="antigravity" # Agent Type: "cursor" or "antigravity"
PROJECT_SUFFIX="agy" # Project suffix: cursor => crsr, antigravity => agy
VM_NAME="${OWNER}-${AGENT_TYPE}-autonomous-agent"
MACHINE_TYPE="e2-standard-4" # 4 vCPUs, 16 GB RAM
BOOT_DISK_SIZE="100GB"
# GCE VM base image (not Docker) - Ubuntu 22.04 LTS from Google's public images
IMAGE_FAMILY="ubuntu-2204-lts"
IMAGE_PROJECT="ubuntu-os-cloud"
# Agent type to project suffix: cursor => crsr, antigravity => agy
AGENT_PROJECT_ID="construction-code-expert-${PROJECT_SUFFIX}"
AGENT_SA_EMAIL="ai-swe-agent@${AGENT_PROJECT_ID}.iam.gserviceaccount.com"
# ============================================
# STEP 3: Create VM
# ============================================
# --scopes="cloud-platform": Grants VM access to ALL GCP APIs.
# Actual permissions are controlled by IAM roles on the service account.
# This is the recommended approach - manage access in IAM, not scopes.
gcloud compute instances create "${VM_NAME}" \
--project="${PROJECT_ID}" \
--zone="${ZONE}" \
--machine-type="${MACHINE_TYPE}" \
--network="${NETWORK_NAME}" \
--subnet="${SUBNET_NAME}" \
--no-address \
--service-account="${AGENT_SA_EMAIL}" \
--scopes="cloud-platform" \
--image-family="${IMAGE_FAMILY}" \
--image-project="${IMAGE_PROJECT}" \
--boot-disk-size="${BOOT_DISK_SIZE}" \
--boot-disk-type="pd-ssd" \
--metadata-from-file="startup-script=cli/sdlc/gce-workstation/install-base-software.sh"
echo "Created VM: ${VM_NAME} with identity: ${AGENT_SA_EMAIL}"
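The agent-type-to-project-suffix mapping used above can be made explicit with a small case statement (the helper name `suffix_for` is illustrative; variants like `agy2` for a second Antigravity identity are still assigned manually):

```shell
# Map agent type to its default project suffix
# (cursor => crsr, antigravity => agy)
suffix_for() {
  case "$1" in
    cursor)      echo "crsr" ;;
    antigravity) echo "agy" ;;
    *)           echo "unknown agent type: $1" >&2; return 1 ;;
  esac
}

suffix_for antigravity   # -> agy
```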
Validate Startup Script Completion (Required)
⚠️ IMPORTANT: The startup script takes ~5-10 minutes. Wait for it to complete successfully before proceeding!
# ============================================
# STEP 4: Wait for and validate startup script
# ============================================
# The VM will boot quickly, but the startup script runs in the background.
# You MUST verify it completed successfully before using the VM.
# Check startup script status (run every 1-2 minutes until complete)
gcloud compute instances get-serial-port-output "${VM_NAME}" \
--project="${PROJECT_ID}" --zone="${ZONE}" 2>&1 | tail -30
# ✅ SUCCESS: Look for these lines:
# "✅ Base software installation complete!"
# "startup-script exit status 0"
#
# ❌ FAILURE: If you see:
# "Script "startup-script" failed with error: exit status 1"
# Then check the logs above for the error message and fix the script.
# Alternative: SSH in and check the journalctl logs
gcloud compute ssh "${VM_NAME}" --project="${PROJECT_ID}" --zone="${ZONE}" --tunnel-through-iap \
--command="sudo journalctl -u google-startup-scripts.service | tail -50"
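The "check every 1-2 minutes" step can be automated with a polling loop; a sketch, keying on the `startup-script exit status 0` marker noted above:

```shell
# Poll the serial console until the startup script reports success
# (gives up after ~20 minutes)
for attempt in $(seq 1 20); do
  if gcloud compute instances get-serial-port-output "${VM_NAME}" \
       --project="${PROJECT_ID}" --zone="${ZONE}" 2>/dev/null \
     | grep -q "startup-script exit status 0"; then
    echo "✅ Startup script completed"
    break
  fi
  echo "Still running (attempt ${attempt}/20)..."
  sleep 60
done
```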
Create Custom Image (Optional - Faster Scaling)
After the startup script completes (~10-15 min), you can create a custom image for faster provisioning of additional VMs.
What's IN the image (generic, agent-agnostic):
- XFCE desktop environment
- Chrome Remote Desktop
- Dev tools: git, docker, java, gcloud CLI, gh CLI
What's NOT in the image (added per-VM):
- Service account identity (`--service-account` flag at VM creation)
- GitHub SSH key (fetched by `setup-agent-credentials.sh` from Secret Manager)
- GitHub PAT (fetched by `setup-agent-credentials.sh` from Secret Manager)
- Repo clone (done in post-provisioning)
Note: The same base image works for Cursor, Antigravity, or any agent. The agent identity is determined by the service account attached at VM creation time.
# 1. Wait for startup script to complete (check serial port output)
gcloud compute instances get-serial-port-output "${VM_NAME}" \
--project="${PROJECT_ID}" --zone="${ZONE}" 2>&1 | tail -20
# Look for: "startup-script exit status 0" or similar completion message
# 2. Stop the VM (required before creating image)
gcloud compute instances stop "${VM_NAME}" \
--project="${PROJECT_ID}" --zone="${ZONE}"
# 3. Create custom image from the VM's disk
# Name reflects it's a generic base image (no credentials baked in)
IMAGE_NAME="dev-workstation-base-$(date +%Y%m%d)"
gcloud compute images create "${IMAGE_NAME}" \
--project="${PROJECT_ID}" \
--source-disk="${VM_NAME}" \
--source-disk-zone="${ZONE}" \
--family="dev-workstation-base" \
--description="Generic dev workstation: XFCE, Chrome Remote Desktop, dev tools. No credentials. $(date +%Y-%m-%d)"
# 4. Restart the original VM
gcloud compute instances start "${VM_NAME}" \
--project="${PROJECT_ID}" --zone="${ZONE}"
echo "✅ Custom image created: ${IMAGE_NAME}"
echo " Use --image-family=dev-workstation-base for new VMs (no startup script needed)"
Creating new VMs from custom image:
# New VMs boot in ~1 min instead of ~15 min
# The --service-account determines the agent identity (not the image)
gcloud compute instances create "philip-antigravity-autonomous-agent" \
--project="${PROJECT_ID}" \
--zone="${ZONE}" \
--machine-type="${MACHINE_TYPE}" \
--network="${NETWORK_NAME}" \
--subnet="${SUBNET_NAME}" \
--no-address \
--service-account="ai-swe-agent@construction-code-expert-agy.iam.gserviceaccount.com" \
--scopes="cloud-platform" \
--image-family="dev-workstation-base" \
--image-project="${PROJECT_ID}" \
--boot-disk-size="${BOOT_DISK_SIZE}" \
--boot-disk-type="pd-ssd"
# Note: No --metadata-from-file needed - base software already installed
# Agent credentials are fetched post-provisioning via setup-agent-credentials.sh
Auto-Shutdown When Idle (Cost Optimization)
GCE has scheduled start/stop (time-based) but no native idle detection. Use this custom script to auto-shutdown after inactivity:
Script: cli/sdlc/gce-workstation/install-idle-shutdown.sh
Monitors for:
- Active SSH sessions
- Chrome Remote Desktop sessions
- CPU usage (< 5% = idle)
# SSH into VM
gcloud compute ssh "${VM_NAME}" --project="${PROJECT_ID}" --zone="${ZONE}" --tunnel-through-iap
# NOTE: Repo is private - clone first (see Step 6.1), then run:
cd ~/construction-code-expert
# Install with default 2-hour timeout
sudo ./cli/sdlc/gce-workstation/install-idle-shutdown.sh
# Or with custom timeout (e.g., 60 minutes)
sudo ./cli/sdlc/gce-workstation/install-idle-shutdown.sh 60
Manage the service:
sudo tail -f /var/log/idle-shutdown.log # View logs
sudo systemctl stop idle-shutdown # Disable temporarily
sudo systemctl disable idle-shutdown # Disable permanently
Cost Impact: With 2-hour idle shutdown, a VM used 8 hours/day costs ~$40/month instead of ~$120/month (24/7).
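That saving can be sanity-checked with rough arithmetic, assuming ~8 active hours plus a ~2-hour idle tail per day, against the ~$120/month 24/7 figure above:

```shell
# (8 active + 2 idle) hours/day * 30 days = 300 billed hours/month
# 300/730 of the ~$120 24/7 cost -> roughly $49/month, i.e. ~$40-50
awk 'BEGIN { printf "%.0f\n", 120 * (8 + 2) * 30 / 730 }'
```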
Example Configurations
| Owner | Agent Type | Project Suffix | VM Name |
|---|---|---|---|
| alex | antigravity | agy | alex-antigravity-autonomous-agent |
| philip | antigravity | agy2 | philip-antigravity-autonomous-agent |
| alex | cursor | crsr | alex-cursor-autonomous-agent |
Note: The VM is hosted in the workstation project (`permitproof-dev-workstations`) but runs with the identity of the agent-specific service account from its dedicated project (`crsr` or `agy`). Multiple VMs can share the same agent identity - they all commit as the same GitHub user.
5.3 Access Methods
SSH via IAP Tunnel
gcloud compute ssh "${VM_NAME}" \
--project="${PROJECT_ID}" \
--zone="${ZONE}" \
--tunnel-through-iap
Note: Requires `roles/iap.tunnelResourceAccessor` on the project or VM.
Remote Desktop via Chrome Remote Desktop
Chrome Remote Desktop uses Google's relay servers - no firewall rules or external IP needed.
One-time setup (run on VM via SSH):
# Generate setup command at: https://remotedesktop.google.com/headless
# Then run the provided command on the VM, e.g.:
DISPLAY= /opt/google/chrome-remote-desktop/start-host \
--code="4/xxxx" \
--redirect-url="https://remotedesktop.google.com/_/oauthredirect" \
--name=$(hostname)
Connect from any device:
- Visit https://remotedesktop.google.com/access
- Sign in with your Google account
- Click on the VM name to connect
Step 6: Post-Provisioning Setup (SSH into VM)
After the VM is provisioned, SSH in and complete the setup:
# SSH into the VM
gcloud compute ssh "${VM_NAME}" \
--project="${PROJECT_ID}" \
--zone="${ZONE}" \
--tunnel-through-iap
6.1 Setup Agent Credentials (Inside VM)
The credential setup script is version-controlled at:
cli/sdlc/gce-workstation/setup-agent-credentials.sh
This script:
- Auto-detects the agent type from the VM's service account identity
- Fetches SSH key and GitHub PAT from Secret Manager
- Configures Git identity and GitHub CLI
- No manual configuration needed
# NOTE: Repo is private - must clone first (not curl from raw.githubusercontent.com)
# Recommended: Use gh CLI (tokens stored securely, not in shell history)
# First authenticate gh CLI with a PAT or browser login:
gh auth login
# Then clone the repo:
gh repo clone sanchos101/construction-code-expert
cd construction-code-expert
./cli/sdlc/gce-workstation/setup-agent-credentials.sh
# Alternative: Clone via SSH (if SSH key is already configured)
git clone git@github.com:sanchos101/construction-code-expert.git
cd construction-code-expert
./cli/sdlc/gce-workstation/setup-agent-credentials.sh
The script will automatically detect whether this is a Cursor or Antigravity VM based on the service account and configure accordingly.
6.2 Install IDE (Cursor or Antigravity)
# For Cursor IDE
# Download and install the Cursor AppImage
# AppImages require FUSE; on Ubuntu 22.04 install libfuse2 first
sudo apt-get install -y libfuse2
curl -fsSL https://download.cursor.sh/linux/appImage/x64 -o ~/cursor.AppImage
chmod +x ~/cursor.AppImage
# For Antigravity IDE
# TODO: Add Antigravity installation steps when available
Step 7: Verification
# Verify VM is running
gcloud compute instances describe "${VM_NAME}" \
--project="${PROJECT_ID}" \
--zone="${ZONE}" \
--format="table(name,status,networkInterfaces[0].accessConfigs[0].natIP)"
# List all VMs in the project
gcloud compute instances list --project="${PROJECT_ID}"
# Check startup script logs
gcloud compute instances get-serial-port-output "${VM_NAME}" \
--project="${PROJECT_ID}" \
--zone="${ZONE}"
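The checks above can be collapsed into a single pass/fail test; a sketch:

```shell
# Fail fast if the VM is not in RUNNING state
status=$(gcloud compute instances describe "${VM_NAME}" \
  --project="${PROJECT_ID}" --zone="${ZONE}" \
  --format="value(status)")
if [ "${status}" = "RUNNING" ]; then
  echo "✅ ${VM_NAME} is running"
else
  echo "❌ ${VM_NAME} is in state: ${status}" >&2
  exit 1
fi
```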
Cost Estimation
Per-VM Costs
| Resource | Specification | Estimated Monthly Cost |
|---|---|---|
| GCE VM (e2-standard-4) | 4 vCPU, 16 GB RAM, 24/7 | ~$100-120 |
| Boot Disk (100GB SSD) | pd-ssd | ~$17 |
| Per-VM Total | | ~$117-137/month |
Note: No external IP costs - using IAP tunneling for access.
Shared Network Infrastructure (One-Time per Project)
| Resource | Specification | Estimated Monthly Cost |
|---|---|---|
| VPC Network | Custom mode | Free |
| Subnet | 10.0.0.0/24 | Free |
| Cloud Router | For NAT | Free |
| Cloud NAT Gateway | Per gateway | ~$32 |
| Cloud NAT Data Processing | ~100GB egress | ~$5 |
| Firewall Rules | SSH, RDP, Internal | Free |
| Network Total | | ~$37/month |
Total Cost Scenarios
| Scenario | VMs | Monthly Cost |
|---|---|---|
| Single agent (Cursor OR Antigravity) | 1 VM | ~$160-185 |
| Both agents (Cursor AND Antigravity) | 2 VMs | ~$280-330 |
| Both agents + dev/test VMs | 4 VMs | ~$520-620 |
Cost Optimization Tips
- Preemptible/Spot VMs: ~60-80% savings for non-critical/interruptible workloads
- Schedule VM start/stop: Run only during working hours (reduces to ~$50-60/VM)
- Smaller machine type: Use `e2-medium` (2 vCPU, 4GB RAM, ~$50/month) for lighter workloads
- Committed Use Discounts: 1-year commitment saves ~37%, 3-year saves ~55%
- No external IPs: Using IAP tunneling eliminates IP costs and improves security
Troubleshooting
Cannot access Secret Manager secrets
# Verify service account has access
gcloud secrets get-iam-policy cursor-autonomous-agent-ssh-key \
--project="${ADMIN_PROJECT_ID}"
# Check VM service account
gcloud compute instances describe "${VM_NAME}" \
--project="${PROJECT_ID}" \
--zone="${ZONE}" \
--format="value(serviceAccounts[0].email)"
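A more direct test is to attempt the read itself from inside the VM, using the SSH-key secret name from Step 4.2 (Cursor shown; substitute the secret for your agent type). Output is discarded; only success or failure matters:

```shell
# Run inside the VM: read one version with the VM's attached identity
gcloud secrets versions access latest \
  --secret="cursor-autonomous-agent-ssh-key" \
  --project="construction-code-expert-admin" >/dev/null \
  && echo "✅ Secret Manager access OK" \
  || echo "❌ Access denied - check IAM bindings on the secret"
```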
Chrome Remote Desktop connection fails
# Check the Chrome Remote Desktop host service on the VM
gcloud compute ssh "${VM_NAME}" --project="${PROJECT_ID}" --zone="${ZONE}" --tunnel-through-iap \
--command="sudo systemctl status chrome-remote-desktop"
# If the host is not registered, re-run the headless setup
# from https://remotedesktop.google.com/headless (see Step 5.3)
VM startup script errors
# View startup script output
gcloud compute instances get-serial-port-output "${VM_NAME}" \
--project="${PROJECT_ID}" \
--zone="${ZONE}" | tail -100
References
- Issue #411 - Provision GCE VM Workstations
- Issue #319 - Agent Secret Management
- GCE Documentation
- Secret Manager Documentation