GitApplied
Technical reference · Round 0

Architecture

How everything fits together: cloud, network, services, data. A guide to the boring decisions and the seams between them. Written so a new engineer can read it once and know where to look the next time something breaks.

1. The shape of it

GitApplied is a job-search atelier: a kanban board for job postings, an LLM-assisted scrape-and-tailor pipeline, a document library, and a Chrome extension that captures jobs while the user browses. The system is a single AWS account in us-east-1 with two EC2 instances, a Postgres database, an S3 bucket, and a small set of managed services for email, billing, and DNS. Everything else is application code.

                          ┌─────────────────────────────────────────┐
                          │            User · browser · ext          │
                          └────────────────┬────────────────────────┘
                                           │ HTTPS
                                           ▼
                                  Route 53  ──►  app.githired.com
                                           │
                                           ▼
                        ┌──────────────────────────────────────────┐
                        │   ALB · TLS 1.3 · ACM cert · 2 AZs       │
                        │   /api/* → API target group              │
                        │   default → Web target group             │
                        └────────┬─────────────────────┬───────────┘
                                 │                     │
                          private subnet         private subnet
                                 ▼                     ▼
                         ┌──────────────┐      ┌──────────────┐
                         │   Web EC2    │      │   API EC2    │
                         │   nginx +    │      │   Go + Gin   │
                         │   React SPA  │      │   chromedp   │
                         └──────────────┘      └──────┬───────┘
                                                      │
                  ┌───────────────┬───────────────────┼──────────────┐
                  ▼               ▼                   ▼              ▼
           Secrets Manager   RDS Postgres        S3 bucket       SES v2
           (DB + app JSON)   16.4 · gp3          documents/      transactional
                             single-AZ           prefix only     email
                                                      │
                                                      ▼
                                            ┌─────────────────┐
                                            │   OpenAI API    │
                                            │   Stripe API    │
                                            └─────────────────┘

Region us-east-1, VPC 10.0.0.0/16 over us-east-1a and us-east-1b, three-tier subnet layout (public / app / db). All compute lives in private subnets with no public IPs; outbound internet goes through a single NAT gateway.


2. AWS & cloud topology

Almost everything in production is provisioned with Terraform under infra/terraform/; the exceptions are listed under Out-of-band AWS. The list below is everything AWS knows about us, grouped by purpose.

Compute

Service | Role | Notes
EC2 · web | nginx + React SPA | t4g.small, arm64 AL2023, gp3 20GB encrypted, IMDSv2 required. Runs the built React bundle via docker compose.
EC2 · api | Go API server | t4g.small, arm64 AL2023. Runs the API container plus an embedded headless Chromium for scraping JS-heavy job postings.
ALB | Public edge | 2-AZ ALB. TLS 1.3 termination via ACM cert. Path-based routing: /api/* → API TG, default → Web TG. drop_invalid_header_fields on.
NAT Gateway | Egress for private subnets | Single NAT (one AZ). Outbound is just container pulls and outbound HTTPS, so a single SPOF is acceptable today.

Data & storage

Service | Role | Notes
RDS Postgres 16 | Primary datastore | db.t4g.micro, gp3 20GB autoscaling to 100GB, single-AZ, encrypted, 7-day backups, deletion protection on. Schema migrations live in internal/database/migrations and run on API boot.
S3 · documents bucket | User documents | All resumes and cover letters are stored under the documents/ prefix. IAM policy on the EC2 role restricts ListBucket to that prefix only. Reads are served via short-lived pre-signed URLs.
S3 · docs bucket | Public docs site | Created out-of-band in the console (see Out-of-band AWS below).
ECR · api / web | Image registry | Two repos, scan-on-push, lifecycle policy keeps the last 20 images.

Identity & secrets

Service | Role | Notes
Secrets Manager · db | Postgres credentials | JSON blob with username, password, host, port, dbname, and a pre-rendered url. Password generated by Terraform random_password.
Secrets Manager · app | Non-DB secrets | JWT signing key, OpenAI key, S3 bucket name, all six Stripe variables. User-data renders .env from this blob on every boot, so a rotation is a reboot, not a redeploy.
IAM · ec2 role | Instance role | Reads the two specific secret ARNs, pulls from ECR (AmazonEC2ContainerRegistryReadOnly), and gets PutObject/GetObject/DeleteObject scoped to documents/*. AmazonSSMManagedInstanceCore attached for shell access without SSH keys.
IAM · GitAppliedDeploy | CI deploy role | OIDC trust to GitHub Actions. Used to push images to ECR and restart EC2. Created in the console; not in Terraform.

DNS, TLS, edge

Service | Role | Notes
Route 53 | Authoritative DNS | Apex, www., and app. records are A-aliases to the ALB. Cert validation records also live here.
ACM | TLS certificate | SAN covering apex + www + app. DNS-validated. Attached to the HTTPS listener with TLS 1.3 policy.

Messaging

Service | Role | Notes
SES v2 | Transactional email | Password reset, welcome, email verification. SDK call from internal/email/mailer.go. Falls back to a stdout logger in dev when EMAIL_FROM_ADDRESS is empty.

Out-of-band AWS (not in Terraform)

  • S3 docs bucket for the marketing/docs site.
  • GitAppliedDeploy OIDC role used by GitHub Actions.
  • gh-api IAM user for any non-OIDC automation.
  • SSH keypairs used for break-glass instance access.

These were created in the console because they predate the Terraform layout or because rotating them via Terraform would be more risk than benefit. If you create new infrastructure, default to Terraform.


3. Network & security

VPC layout

VPC CIDR 10.0.0.0/16, three subnet tiers spread across us-east-1a and us-east-1b:

  • Public: ALB and the NAT gateway live here. Inbound from the internet on 80/443 only.
  • App (private): the web and API EC2 instances. No public IPs. Outbound goes through the NAT.
  • Database (private): RDS subnet group only. No route to the internet.

Security-group chain

Four security groups, chained by reference rather than by IP so the rules survive any IP change.

internet ──► alb-sg :443,:80
                  │
                  ▼  (referenced by SG, not CIDR)
              web-sg :80          api-sg :8080
                                       │
                                       ▼  (referenced by SG)
                                  rds-sg :5432

The web and API SGs accept ingress only from the ALB SG. The RDS SG accepts ingress only from the API SG. The ALB SG is the only thing open to the internet. Adding a worker tomorrow is one SG ingress rule.

Hardening defaults

  • IMDSv2 required on both EC2 instances.
  • EBS volumes encrypted by default.
  • RDS storage encrypted, deletion protection on, 7-day backups, final snapshot.
  • ALB drop_invalid_header_fields enabled.
  • TLS policy ELBSecurityPolicy-TLS13-1-2-2021-06 (negotiates TLS 1.3 and still accepts TLS 1.2).
  • S3 IAM scoping to the documents/* prefix; the role cannot enumerate anything else.

Application-level guards

  • CORS reflects either the configured APP_URL or any chrome-extension:// origin; credentials are allowed and the allow-origin is set per-request (no wildcard).
  • Three rate-limiter buckets (token-bucket, in-process): an auth limiter keyed by IP, a per-user API limiter, and a tighter AI limiter shared across /extract, /tailor, application-question generation, and skill-match endpoints.
  • JWT cookies for the SPA, Bearer API tokens for the Chrome extension. The bearer token is hashed at rest, and last_used_at is updated on a best-effort basis.
  • Per-job ownership middleware wraps every /jobs/:id/... route so child handlers can trust the path param.
  • Feature-tier middleware gates Base-only endpoints (job posting breakdown, skill match, document editor, interview prep) and Premium-only endpoints (resume tailor, cover letter, application answers).
  • Stripe webhook is mounted outside the v1 group; the Stripe signature is the authentication.

4. Data plane & storage

Postgres

One Postgres 16 database under the githired schema. Migrations are versioned SQL files embedded into the binary at build time and run on API boot via golang-migrate. The current series ends at migration 35; the latest migrations cover job submissions, user preferences, resume profile snapshots, requirement matches, cover-letter drafts, and per-column timestamps.

Repositories live in internal/database/, one file per aggregate. Notable entities:

  • Users: auth, profile, tier, trial window, preferences, Stripe customer ID.
  • Jobs: the kanban card. Has scraping status, autofill-ready flag, outcome, normalized URL, and a per-job example cover letter.
  • Job-scoped children: contacts, interview rounds, prep notes, job notes (with pinning and threaded comments), application questions, generated documents (multi-version resumes + cover letters), job posting highlights, requirement matches, cover-letter drafts, submissions.
  • Documents: the user’s uploaded resumes and cover letters; the actual bytes live in S3.
  • Resume profile: structured snapshot of a parsed resume (experiences, accomplishments, education). There is a base profile per user and a per-job snapshot under job_resume_profiles.
  • Auth tables: password reset tokens, email verification tokens, API tokens (hashed), Stripe billing fields.

S3 documents

Every uploaded resume / cover letter and every generated export is keyed under documents/<user>/<doc>. The API serves bytes by issuing a short-lived pre-signed GET URL; the browser fetches the file directly. The same wrapper supports PutObject, CopyObject, and DeleteObject, and falls back to the EC2 instance-role credential chain when static keys are not configured.
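
A small sketch of the documents/<user>/<doc> key layout, assuming opaque string IDs. DocumentKey and OwnedBy are hypothetical helper names, not functions from the S3 wrapper; the point is only that IDs containing path separators should never reach the key.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const documentsPrefix = "documents/"

// DocumentKey builds the object key for one user's document.
// Rejecting separators keeps an ID from escaping documents/<user>/.
func DocumentKey(userID, docID string) (string, error) {
	for _, id := range []string{userID, docID} {
		if id == "" || id == "." || id == ".." || strings.ContainsAny(id, "/\\") {
			return "", errors.New("invalid identifier")
		}
	}
	return documentsPrefix + userID + "/" + docID, nil
}

// OwnedBy reports whether key sits under the given user's prefix.
func OwnedBy(key, userID string) bool {
	return strings.HasPrefix(key, documentsPrefix+userID+"/")
}

func main() {
	key, err := DocumentKey("u1", "d9")
	fmt.Println(key, err)                // documents/u1/d9 <nil>
	fmt.Println(OwnedBy(key, "u1"))      // true
	fmt.Println(OwnedBy(key, "someone")) // false
}
```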

Secrets & configuration

The API reads its full configuration from environment variables (pkg/config). In dev, a .env file is loaded from one of several relative paths; in prod, the EC2 user-data script reads the two Secrets Manager blobs and writes /opt/gitapplied/.env before docker compose up.
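
The merge that user-data performs can be illustrated in Go, although the real script is not shown here and is presumably shell. RenderEnv is a hypothetical name, and upper-casing the JSON keys is an assumption about how blob fields map to variable names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// RenderEnv merges flat JSON secret blobs into KEY=value lines.
// Later blobs win on duplicate keys; output is sorted so it is deterministic.
// Assumption: keys are upper-cased; values are written without quoting.
func RenderEnv(blobs ...[]byte) (string, error) {
	merged := map[string]string{}
	for _, blob := range blobs {
		var kv map[string]any
		if err := json.Unmarshal(blob, &kv); err != nil {
			return "", fmt.Errorf("parse secret blob: %w", err)
		}
		for k, v := range kv {
			merged[strings.ToUpper(k)] = fmt.Sprint(v)
		}
	}
	keys := make([]string, 0, len(merged))
	for k := range merged {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var sb strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&sb, "%s=%s\n", k, merged[k])
	}
	return sb.String(), nil
}

func main() {
	db := []byte(`{"host":"db.internal","port":5432}`) // illustrative values
	app := []byte(`{"OPENAI_API_KEY":"sk-test"}`)
	env, err := RenderEnv(db, app)
	if err != nil {
		panic(err)
	}
	fmt.Print(env)
}
```

Values containing spaces or newlines would need quoting before this could be used for a real .env file.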


5. Application services

The Go API is a single binary (cmd/server) using Gin as the HTTP layer. There is no microservice split today; instead, the binary is composed of a handful of cohesive packages under internal/.

Package | Responsibility
internal/auth | Tier & feature catalog. EffectiveTier resolves a stored tier + trial window into the tier the user should be treated as at request time. Mirrored in web/src/auth/features.ts.
internal/middleware | JWT/Bearer auth, per-IP and per-user rate limiters, feature-gate, job-ownership.
internal/handlers | HTTP handlers, one file per resource. The router in cmd/server/server.go wires them all up.
internal/database | SQL repositories, the embedded migration source, and the CardDataLoader that fans out per-job sub-reads.
internal/services | S3 wrapper and the LLM-backed services (skill match, tailor, resume text extraction).
internal/extractor | The scrape-and-parse pipeline. A site-specific scraper handles Greenhouse / Lever / Ashby / LinkedIn / Indeed / Workday; chromedp renders JS-heavy pages; the LLM enrichment step uses an OpenAI model to fill in skills, responsibilities, benefits, company, salary, etc.
internal/billing | Stripe SDK wrapper. Treats Stripe as the source of truth for subscription state; the app stores Stripe IDs and reacts to webhook events.
internal/email | SES v2 mailer for transactional email, with a stdout fallback in dev. Disposable-domain blocklist lives next to it.
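
One way the extractor could choose a site-specific scraper is by host suffix. The sketch below is an assumption about the approach, not a copy of internal/extractor: PickScraper, the "generic" fallback name, and the host suffixes are all illustrative.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// siteSuffixes maps a host suffix to a scraper name.
// The suffixes are assumptions about each board's public hostnames.
var siteSuffixes = []struct{ suffix, scraper string }{
	{"greenhouse.io", "greenhouse"},
	{"lever.co", "lever"},
	{"ashbyhq.com", "ashby"},
	{"linkedin.com", "linkedin"},
	{"indeed.com", "indeed"},
	{"myworkdayjobs.com", "workday"},
}

// PickScraper returns the scraper name for rawURL, or "generic"
// (a hypothetical fallback) when no known board matches.
func PickScraper(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	host := strings.ToLower(u.Hostname())
	if host == "" {
		return "", fmt.Errorf("no host in %q", rawURL)
	}
	for _, s := range siteSuffixes {
		// Match the domain itself or any subdomain, never a lookalike prefix.
		if host == s.suffix || strings.HasSuffix(host, "."+s.suffix) {
			return s.scraper, nil
		}
	}
	return "generic", nil
}

func main() {
	for _, raw := range []string{
		"https://boards.greenhouse.io/acme/jobs/1",
		"https://acme.wd5.myworkdayjobs.com/careers/job/1",
		"https://example.com/careers",
	} {
		name, err := PickScraper(raw)
		fmt.Println(name, err)
	}
}
```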

The hot paths

  1. POST /api/v1/extract: scrape a URL or parse pasted text, run the LLM enrichment, return a structured job. Rate-limited per user against the shared AI bucket.
  2. POST /api/v1/jobs/from-extension: the Chrome extension posts the active tab’s URL and HTML; the API reuses the LLM enrichment without re-scraping and creates a pending job card.
  3. POST /api/v1/tailor: Premium-gated. Tailors a resume against a job. Uses OpenAI.
  4. POST /api/v1/jobs/:id/auto-skill-match: Base-gated. Matches the user’s resume bullets to the job posting’s requirements.
  5. POST /api/v1/jobs/:id/questions/generate + /polish: Premium-gated. Drafts and polishes free-form application answers.
  6. POST /api/v1/billing/checkout / /portal: redirect to Stripe-hosted Checkout and Customer Portal. We never see the card.
  7. POST /api/v1/billing/webhook: Stripe webhook receiver. Mounted outside the auth-required group; signature is the auth.

6. Frontend & extension

Web app

React 19 + TypeScript SPA under web/src/, built with react-scripts and a Tailwind CSS layer compiled separately. State management uses Zustand stores for auth, board sort, board data, bullet selection, and theme. The rich-text experience uses Tiptap; drag-and-drop on the kanban uses dnd-kit; resume rendering uses docx-preview, mammoth for parsing, and html2pdf.js for export.

The SPA ships as an nginx container with a single try_files $uri $uri/ /index.html rewrite for client-side routing. Hashed static assets are cached one year, immutable.

Chrome extension

Manifest V3, built with Vite + CRXJS under extension/. Three execution contexts:

  • Popup: user-facing UI for “save this job posting” and account status.
  • Background service worker: calls the API with the Bearer token and proxies messages from content scripts.
  • Content scripts: injected on the supported job boards (Greenhouse, Lever, Ashby, LinkedIn, Indeed, Workday). They grab document.documentElement.outerHTML on demand so the API can parse pages it cannot reach from outside.

The extension authenticates via a one-time token issued from Settings → Connections. The public half of the signing key is pinned in manifest.config.ts so the extension ID is deterministic across dev, CI, and the Chrome Web Store.

Mocks

This document, and everything else under web/mocks/, is a static HTML mock that shares a self-contained design-token CSS file (mocks.css) with no React, no build step, and no shared component library. The mocks ship the design system; the production SPA is migrating onto the same tokens.


7. Third-party services

Service | Use | How it’s wired
OpenAI | LLM enrichment, resume tailor, cover-letter drafting, skill match, application-answer drafting and polish | OPENAI_API_KEY in Secrets Manager. Direct HTTPS calls from the API container. Costs are bounded by the AI rate-limiter bucket (per-user) on the server.
Stripe | Subscription billing for Base and Premium tiers, monthly and yearly | Six secrets in Secrets Manager: the secret key, the webhook secret, and four price IDs. Checkout and Customer Portal sessions are server-created; the app never touches a card. Tier state is driven by webhooks.
SES v2 | Password reset, welcome, email verification | SDK call from the API. AWS region inherited from AWS_REGION. Falls back to a stdout logger in dev.
Google Fonts | Inter, Source Serif 4, JetBrains Mono | Linked from every HTML entry-point with preconnect.
Lucide icons | Iconography in mocks and the SPA | UMD bundle in mocks; lucide-react in the SPA.
GitHub Actions | CI & deploy | Authenticates to AWS via the GitAppliedDeploy OIDC role. Builds two images, pushes to ECR, restarts EC2.
Chrome Web Store | Extension distribution | Manifest V3 zip, deterministic extension ID via pinned public key.

8. Build, deploy & release

Images

Two Dockerfiles, one per service:

  • Dockerfile (API): multi-stage Go build (Go 1.26-alpine) producing a static binary, packaged on a debian:13-slim runtime that ships chromium, ca-certificates, fonts, and tini. tini is non-optional: chromedp spawns chromium subprocesses that would otherwise pile up as zombies under the Go server.
  • web/Dockerfile: multi-stage Node 25 build producing a static React bundle, served by nginx:1.29-alpine with the SPA-fallback rewrite.

Deploy flow

  1. A push to develop triggers a CI build via the GitAppliedDeploy OIDC role.
  2. Both images are built for linux/arm64 (matching t4g.*) and pushed to their ECR repos. ECR scan-on-push runs.
  3. The /deploy workflow opens a release branch (release-YYYYMMDD-<sha>) and an associated PR from develop into main; merging that PR is the production cutover.
  4. EC2 instances are restarted, or replaced when their user-data changes (user_data_replace_on_change makes Terraform recreate the instance). Either way, user-data runs on boot, pulls the new image, and writes a fresh .env from Secrets Manager.

Schema migrations

SQL files in internal/database/migrations are embedded into the API binary at build time. On startup, the API runs migrate up against the githired schema. There is no separate migration job; deploying the new API is the migration.


9. Identity, auth & tiers

Two credential paths

  • JWT cookie: signed with the JWT secret from Secrets Manager, issued at login. Secure flag in production. Used by the SPA.
  • Bearer API token: user-issued from Settings → Connections, stored hashed (HashAPIToken) in api_tokens. Used by the Chrome extension. last_used_at is touched asynchronously on every request.
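
A sketch of storing the bearer token hashed at rest. HashAPIToken is named in this document, but its body is not shown: the SHA-256 digest, the token format, and the NewAPIToken and TokenMatches helpers below are assumptions for illustration.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// NewAPIToken returns a random token that is shown to the user once.
func NewAPIToken() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

// HashAPIToken is a stand-in for the real function of the same name.
// Assumption: an unsalted SHA-256 digest, which is adequate only because
// the token is high-entropy random data rather than a user-chosen password.
func HashAPIToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// TokenMatches compares a presented token with a stored hash in constant time.
func TokenMatches(presented, storedHash string) bool {
	got := HashAPIToken(presented)
	return subtle.ConstantTimeCompare([]byte(got), []byte(storedHash)) == 1
}

func main() {
	token, err := NewAPIToken()
	if err != nil {
		panic(err)
	}
	stored := HashAPIToken(token) // only this value would be persisted
	fmt.Println(TokenMatches(token, stored))
	fmt.Println(TokenMatches("wrong-token", stored))
}
```

Only the hash is stored, so a leaked api_tokens table does not yield usable credentials.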

Tiers

Three tiers: Free, Base, and Premium. Tier is stored on the user row; trial state is a trial_ends_at timestamp. The effective tier is computed per request:

  • Free + active trial → treated as Premium until expiry.
  • Base → Base. Premium → Premium. Free + no trial → Free.

Features map to a minimum tier in internal/auth/features.go. The RequireFeature middleware looks up the effective tier and returns 402 if the gate fails. The list is mirrored in web/src/auth/features.ts so the UI can disable controls before the request goes out.

Email verification & password reset

Both flows use a single-use token persisted in Postgres with an expiry; the token is sent by SES with a deep link into the SPA, where the user posts back to /api/v1/auth/verify-email or /api/v1/auth/reset-password.


10. Observability & operations

Honest answer: minimal. The app relies on the free CloudWatch metrics AWS already publishes for ALB, EC2, and RDS, plus Gin’s default request logging. There are no CloudWatch alarms in Terraform yet. That’s appropriate for “no users yet” and inappropriate the moment that changes.

The planned monitoring increments, in order:

  1. ALB access logs to S3: turn on before opening signups. Unlocks every retroactive “was something weird happening at 02:14?” question.
  2. ~10 CloudWatch alarms wired to SNS → email: ALB 5xx rate, ALB unhealthy host count, RDS free storage, RDS CPU, EC2 status check failures.
  3. Application metrics: per-route request duration, DB query time, error rates. CloudWatch Embedded Metric Format from the Go binary, no agent.
  4. External synthetic check on /health every minute from outside the region, paging on two consecutive failures.

Break-glass access

The AmazonSSMManagedInstanceCore policy is attached to the EC2 role, so aws ssm start-session works without SSH keys for routine debugging. SSH keypairs created in the console exist as a fallback.


11. Scaling path

Up to the first thousand active users, we scale vertically: bump instance_type and db_instance_class. These are one-line changes in terraform.tfvars plus a brief restart.

After that, the order is:

  1. ASG at web/API tier: replace the single aws_instance.{web,api} with launch templates + autoscaling groups. Target groups already exist; the wiring change is small. Trigger: sustained target response time p95 > 500ms for 15 min, or CPU > 70%.
  2. Multi-AZ RDS: flip multi_az = true. Cost roughly doubles. Trigger: first paying customers, or the first scheduled maintenance we can’t take an outage for.
  3. Second NAT gateway: one per AZ. Trigger: we lose a deploy because one AZ had a NAT outage.
  4. CloudFront in front of the ALB: reuse the ACM cert, cache static asset paths, leave /api/* pass-through. Trigger: regular non-US latency > 200ms p95, or ALB egress starts to dominate.
  5. AWS WAF managed rules: Common, KnownBadInputs, SQLi. Trigger: first credential-stuffing pattern in ALB access logs.
  6. In-API LRU → read replica → ElastiCache, in that order, only if the DB is the bottleneck. Trigger: RDS CPU > 60% sustained, or specific endpoints have DB time dominating their p95.
  7. ECS on Fargate: the moment we’re running more than ~4 services, or rolling back is costing real human time. We skip EKS unless there’s a concrete platform feature we need from it.
  8. Distributed rate limiter: today the token-bucket limiters are in-process; switch to Redis the moment we run more than one API replica.

12. Principles

The choices above are downstream of a small set of principles. Naming them keeps future decisions consistent.

The cost of an architecture isn’t its AWS bill; it’s the surface area you have to keep in your head.
  • Boring beats clever. EC2 + Docker over ECS until we feel the pain. SES over a third-party email vendor. Postgres over a managed search index. Each “we should also…” gets a counter-question: what signal am I waiting for that says this is now worth the cost?
  • Stripe is the source of truth for billing. The app stores Stripe IDs and reacts to webhook events; it does not mutate subscription tier directly. The card never touches our servers.
  • One Go binary, many handlers. A microservice split is justified by independent scaling or independent ownership. We have neither, so we don’t pay for either.
  • Security groups by reference, not by IP. If we add a worker, it joins the right SG and the rules just work.
  • Rotation is a reboot. User-data renders .env from Secrets Manager on every boot, so a credential rotation never requires a redeploy.
  • Mock UI first. Every user-facing change starts as a static mock in web/mocks/ on fixture data, then earns its schema and API.
  • Write the trigger down. Every “we deliberately didn’t build X” is paired with the signal that will change our minds. Otherwise “not yet” quietly becomes “never.”