Architecture Overview

This overview describes the system design, the service boundaries, and the key architectural decisions that guide how the platform is built and evolved. The goals are predictable behavior, independent deployability, and clear ownership.

Guiding Principles

  • Separation of concerns — each service owns a single bounded context and exposes a well-defined contract (Domain-Driven Design).
  • Composition over inheritance — small, focused modules are composed at the application layer.
  • Twelve-Factor App — config in environment, stateless processes, explicit dependencies, port binding, and parity between dev/prod.
  • Fail fast, recover gracefully — validate at the boundary; use timeouts, retries with jitter, and circuit breakers for downstream calls.
  • Observability by default — structured logs, metrics, and distributed traces (OpenTelemetry) across every hop.
  • Secure by design — least privilege, defense in depth, and threat modeling on new surfaces (STRIDE).
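The "retries with jitter" part of the fail-fast principle can be sketched as follows. This is a minimal illustration, not the platform's actual client code: the helper names (backoffDelayMs, retryWithJitter), the 100 ms base, and the 2 s cap are all assumptions, and the "full jitter" strategy shown is one common choice among several.

```typescript
// Hypothetical helpers for illustration; names and defaults are assumptions.

/** Full-jitter backoff: a random delay in [0, min(capMs, baseMs * 2^attempt)). */
function backoffDelayMs(
  attempt: number,
  baseMs = 100,
  capMs = 2000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

/** Calls `fn` up to `maxAttempts` times, sleeping a jittered delay between tries. */
async function retryWithJitter<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  delayFor: (attempt: number) => number = backoffDelayMs,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Do not sleep after the final failed attempt.
      if (attempt < maxAttempts - 1) await sleep(delayFor(attempt));
    }
  }
  throw lastError;
}
```

Randomizing the delay spreads retries from many clients over time, so a recovering dependency is not hit by synchronized bursts. A fuller version would also bound each attempt with a timeout and sit behind a circuit breaker, as the principle above lists.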

System Context (C4 Level 1)

At the highest level, the platform sits between end users and a small set of trusted upstream and downstream systems.

C4 system context: actors and external systems.

Layered Architecture (C4 Level 2)

The platform is divided into three primary layers — presentation, application, and data. Each layer communicates only with the layer directly below it. Cross-cutting concerns (auth, logging, caching) are handled in middleware.

Container view: presentation, application, and data layers.
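The "each layer talks only to the layer directly below" rule can be illustrated with a small sketch. Every name here (User, UserRepository, UserService, renderUserCard) is hypothetical, and an in-memory map stands in for the real data store.

```typescript
// Hypothetical types and classes; not taken from the platform's codebase.

// Data layer: the only layer that touches storage.
interface User {
  id: string;
  name: string;
}

interface UserRepository {
  findById(id: string): User | undefined;
}

class InMemoryUserRepository implements UserRepository {
  private readonly rows = new Map<string, User>([
    ["u1", { id: "u1", name: "Ada" }],
  ]);

  findById(id: string): User | undefined {
    return this.rows.get(id);
  }
}

// Application layer: depends on the data layer only through its interface.
class UserService {
  private readonly repo: UserRepository;

  constructor(repo: UserRepository) {
    this.repo = repo;
  }

  displayName(id: string): string {
    const user = this.repo.findById(id);
    if (!user) throw new Error(`user ${id} not found`);
    return user.name.toUpperCase();
  }
}

// Presentation layer: calls the application layer, never the repository.
function renderUserCard(service: UserService, id: string): string {
  return `<h1>${service.displayName(id)}</h1>`;
}

const service = new UserService(new InMemoryUserRepository());
console.log(renderUserCard(service, "u1")); // <h1>ADA</h1>
```

Because the application layer sees only the UserRepository interface, the storage implementation can be swapped (for tests or a migration) without touching presentation code.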

Request Lifecycle

A typical request flows through middleware (auth, rate limiting, tracing), into a route handler, then into a domain service that orchestrates persistence and outbound calls. Idempotency keys are used for write operations.
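One way a domain service might honor idempotency keys on writes is sketched below. The names (Order, OrderService, createOrder) are hypothetical, and the in-memory map is an assumption standing in for a shared store; a production version would need an atomic check-and-set and would typically also verify that a repeated key carries the same payload.

```typescript
// Hypothetical domain service; a Map stands in for a shared, durable store.

type Order = { id: string; amountCents: number };

class OrderService {
  private readonly byKey = new Map<string, Order>();
  private nextId = 1;

  /** Creates an order once per idempotency key; repeats return the stored result. */
  createOrder(idempotencyKey: string, amountCents: number): Order {
    const existing = this.byKey.get(idempotencyKey);
    if (existing) return existing; // Retry of a request that was already handled.

    const order: Order = { id: `ord_${this.nextId++}`, amountCents };
    this.byKey.set(idempotencyKey, order);
    return order;
  }
}

const orders = new OrderService();
const first = orders.createOrder("key-123", 500);
const retried = orders.createOrder("key-123", 500);
console.log(first.id === retried.id); // true
```

With this shape, a client that times out and resends the same write gets the original result back instead of creating a duplicate.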

Sequence: authenticated request with cache-aside read.
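The cache-aside read named in the sequence above could look roughly like this. It is a synchronous sketch under stated assumptions: CacheAside and its loader are hypothetical names, a Map stands in for an external cache, and the TTL value is arbitrary. Real cache and database calls would be asynchronous.

```typescript
// Hypothetical cache-aside wrapper; Map stands in for an external cache.

type Loader<V> = (key: string) => V | undefined;

class CacheAside<V> {
  private readonly cache = new Map<string, { value: V; expiresAt: number }>();
  private readonly load: Loader<V>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(load: Loader<V>, ttlMs: number, now: () => number = Date.now) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.cache.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.value; // Cache hit.

    const value = this.load(key); // Cache miss: read from the source of truth.
    if (value !== undefined) {
      this.cache.set(key, { value, expiresAt: this.now() + this.ttlMs });
    }
    return value;
  }

  /** Call after a write so the next read repopulates the entry. */
  invalidate(key: string): void {
    this.cache.delete(key);
  }
}

let dbReads = 0;
const profiles = new CacheAside<string>((key) => {
  dbReads++; // Counts how often the backing store is consulted.
  return `profile:${key}`;
}, 60000);

profiles.get("u1");
profiles.get("u1");
console.log(dbReads); // 1
```

The application, not the cache, owns the loading logic: it checks the cache first, falls back to the store on a miss, and invalidates on writes.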

Key Decisions (ADR Summary)

  • Next.js App Router — co-located routing, layouts, streaming, and React Server Components reduce client JS.
  • Tailwind CSS — utility-first; no runtime CSS-in-JS overhead; tokens defined in globals.css.
  • Shadcn UI on Radix — accessible primitives composed with Tailwind; ownership of component source.
  • TypeScript strict mode — catches errors at compile time; use of the any type requires justification.
  • OpenTelemetry — vendor-neutral instrumentation across services.
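As an illustration of the strict-mode decision, untrusted input can be typed as unknown and narrowed with a type guard rather than passed around as any. The FeatureFlags shape and both function names are hypothetical, chosen only for this sketch.

```typescript
// Hypothetical payload shape and helpers, for illustration only.

type FeatureFlags = { darkMode: boolean; maxItems: number };

/** Type guard: narrows an unknown value to FeatureFlags at runtime. */
function isFeatureFlags(value: unknown): value is FeatureFlags {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v["darkMode"] === "boolean" && typeof v["maxItems"] === "number";
}

function parseFeatureFlags(json: string): FeatureFlags {
  // JSON.parse returns any; annotating as unknown forces an explicit check.
  const parsed: unknown = JSON.parse(json);
  if (!isFeatureFlags(parsed)) throw new Error("invalid feature flags payload");
  return parsed; // Narrowed by the guard.
}

console.log(parseFeatureFlags('{"darkMode":true,"maxItems":20}').maxItems); // 20
```

The compiler then rejects any property access on the parsed value until the guard has run, which is the kind of compile-time check the decision above is after.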

Quality Attributes

  • Availability — target 99.9%; multi-AZ; automated failover; runbooks for top incident classes.
  • Performance — p95 server response time < 300 ms; Largest Contentful Paint (LCP) < 2.5 s on 4G mobile.
  • Security — OWASP ASVS Level 2; quarterly dependency and SAST review.
  • Scalability — stateless services behind a load balancer; horizontal scaling on CPU and queue depth.
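To make the availability and performance targets concrete, the arithmetic behind them can be sketched as below. The 30-day window, the nearest-rank percentile method, and the sample latencies are assumptions for illustration; the source does not state how the targets are measured.

```typescript
// Hypothetical helpers; the measurement window and percentile method are assumptions.

/** Allowed downtime in minutes for an availability target over a window of days. */
function downtimeBudgetMinutes(availability: number, windowDays: number): number {
  return (1 - availability) * windowDays * 24 * 60;
}

/** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// 99.9% over an assumed 30-day window allows roughly 43.2 minutes of downtime.
console.log(downtimeBudgetMinutes(0.999, 30).toFixed(1)); // 43.2

// Made-up latency samples in milliseconds.
const latenciesMs = [120, 80, 310, 150, 95, 200, 110, 130, 90, 140];
const p95 = percentile(latenciesMs, 95);
console.log(p95, p95 < 300); // 310 false
```

In this made-up sample a single slow request pushes p95 over the 300 ms target even though the median is 120 ms, which is why the target is stated as a tail percentile rather than an average.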