— 00 Services
Six lanes of senior engineering work.
Built for teams facing real scale, real deadlines, and real production constraints. Each engagement is scoped tightly, priced clearly, and ends with a working system and a written handoff.
Backend & API engineering.
Design and build robust backend services, APIs, and microservices using modern JVM stacks — Java, Kotlin, Scala, Spring Boot, Quarkus, Akka, and reactive frameworks. For teams that need scalable service design, clean architecture, production-ready APIs, or help modernising existing backend systems.
What it covers
- REST & WebSocket API design, contract-first or evolve-in-place
- Microservice decomposition & domain-driven design
- Reactive backends — Spring WebFlux, Project Reactor, Akka
- Modernising Java monoliths into scalable services
- Polyglot persistence — Cassandra, Couchbase, PostgreSQL, Mongo
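A flavour of the contract-first style from the first bullet, as a sketch: the API contract is a plain Java record, with its invariants enforced at construction rather than scattered across handlers. The names are hypothetical, and a real engagement would serialise with Jackson under Spring Boot or Quarkus rather than by hand.

```java
// Hypothetical contract-first DTO: the record shape IS the contract,
// and invariants fail fast at construction time.
record OrderResponse(String orderId, long amountCents, String currency) {
    OrderResponse {
        if (orderId == null || orderId.isBlank())
            throw new IllegalArgumentException("orderId is required");
        if (amountCents < 0)
            throw new IllegalArgumentException("amountCents must be non-negative");
        if (currency == null || currency.length() != 3)
            throw new IllegalArgumentException("currency must be an ISO-4217 code");
    }

    // Hand-rolled JSON for the sketch; production code would use a mapper.
    String toJson() {
        return "{\"orderId\":\"" + orderId + "\",\"amountCents\":" + amountCents
             + ",\"currency\":\"" + currency + "\"}";
    }
}
```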
A typical engagement
Two weeks of investigation in your repo and traces, a written architecture decision record, then 4–10 weeks of hands-on implementation alongside your team. We pair, we PR, we leave the system better than we found it.
Deliverables
- Written architecture review (PDF + Markdown)
- Working code, in your repo, with tests
- ADRs for every non-obvious decision
- Runbooks for what we changed
Duration / shape
4–12 weeks · fixed price for scoped work, T&M for open-ended · weekly written status · one demo per fortnight.
Distributed systems & event-driven architecture.
Build resilient, asynchronous systems using Kafka, RabbitMQ, CQRS, event sourcing, streaming pipelines, and reactive architecture patterns. For teams designing systems that have to handle high traffic, complex data flows, and real-time business requirements.
What it covers
- Kafka — topology design, partitioning, ACLs, schema evolution
- CQRS & event sourcing on the JVM
- Streaming pipelines (Kafka Streams, Akka Streams, Spark)
- Idempotency, exactly-once semantics, ordering, replay
- Service choreography & back-pressure under load
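The idempotency bullet is the one teams most often get wrong, so here is the core idea as a toy sketch: a consumer that applies each event at most once by tracking seen event IDs, so redeliveries and replays become no-ops. In production the seen-set lives in a durable store sharing a transactional scope with the side effect; the names here are illustrative.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy idempotent consumer: at-least-once delivery in, at-most-once effect out.
class IdempotentConsumer {
    private final Set<String> seen = new HashSet<>();
    private final List<String> applied = new ArrayList<>();

    // Returns true if the event was applied, false if it was a duplicate/replay.
    boolean handle(String eventId, String payload) {
        if (!seen.add(eventId)) return false; // redelivery: skip the side effect
        applied.add(payload);                 // the "side effect" stands in for a DB write
        return true;
    }

    List<String> applied() { return applied; }
}
```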
A typical engagement
Discovery against your live traffic patterns, a target architecture written up as an ADR, then embedded delivery building the new pipeline alongside your engineers. Includes load and resilience tests as part of the work.
Deliverables
- Event-flow & topic design documentation
- Production-grade streaming services in your repo
- Load & chaos test results, with the harness left behind
- On-call runbooks for the new components
Duration / shape
6–16 weeks · fixed-price discovery, T&M for the build · pair-programmed with your team.
Cloud-native engineering.
Design, deploy, and operate services on AWS and Azure with Kubernetes, Docker, Terraform, Jenkins, ArgoCD, and modern CI/CD pipelines. We support cloud migrations, platform setup, infrastructure automation, service deployment, and production environment hardening.
What it covers
- AWS & Azure architecture review
- Kubernetes & container platform design (EKS / AKS)
- Infrastructure as code — Terraform, Helm
- CI/CD pipelines — Jenkins, ArgoCD, GitOps workflows
- Secrets, deployment safety nets, production hardening
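As a flavour of the infrastructure-as-code bullet, the kind of module call a rebuild converges on — one reusable module, explicit inputs, tags your on-call can grep. Every name and variable here is hypothetical; the real module surface depends on your platform.

```hcl
# Illustrative shape of a reusable IaC module call (all names hypothetical).
module "service_cluster" {
  source = "./modules/eks-cluster"

  name               = "payments-prod"
  kubernetes_version = "1.29"
  node_instance_type = "m6i.large"
  min_nodes          = 3
  max_nodes          = 12

  tags = {
    team  = "platform"
    owner = "on-call-platform"
  }
}
```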
A typical engagement
Discovery week, written infrastructure assessment, then a focused rebuild of the parts that matter — usually one of the deploy pipeline, the environment topology, or a cost-and-reliability hotspot.
Deliverables
- Infrastructure assessment & target architecture
- Working IaC modules, in your repo
- Migration plan with reversible steps
- Documentation your on-call can actually read
Duration / shape
3–10 weeks · fixed-price discovery, T&M for the build · embedded with your platform team.
Performance, scalability & reliability.
Identify bottlenecks, run non-functional tests, improve throughput, reduce latency, and prepare systems for production load. Includes Gatling-based peak, soak, rolling-deployment and linear-scalability testing — the same approach used to harden APIs for the 16M-concurrent-user NFL playoff broadcast on Peacock.
What it covers
- Non-functional test design — peak, soak, rolling, scalability
- Gatling test harnesses you can run in CI
- Latency & throughput profiling on hot paths
- Database & cache tuning under load
- Observability — Prometheus, Grafana, ELK, AppDynamics
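"Measure before and after" starts with agreeing on the numbers. A minimal sketch of the nearest-rank percentile we baseline against before touching any code — the class and method names are ours for illustration; real runs come out of the Gatling harness, not hand-rolled maths.

```java
import java.util.Arrays;

// Nearest-rank percentile over a latency sample, in milliseconds.
// This is the p50/p99 we quote in the before/after report.
class LatencyReport {
    static long percentile(long[] samplesMs, double p) {
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length); // nearest-rank method
        return sorted[Math.max(0, rank - 1)];
    }
}
```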
A typical engagement
Two weeks to baseline current behaviour and build a Gatling harness, then 2–6 weeks of targeted optimisation. We measure before and after, in numbers, and leave the test harness behind so the team can keep doing this.
Deliverables
- Performance baseline & bottleneck report
- Reusable Gatling test suite checked into your repo
- Before/after metrics dashboard
- Concrete code & config changes that moved the numbers
Duration / shape
3–10 weeks · best done before a known scaling event (launch, sale, broadcast).
Technical leadership & delivery support.
Support engineering teams with architecture decisions, technical discovery, stakeholder alignment, mentoring, code reviews, technical planning, and delivery ownership. Useful for startups and scaleups that need senior engineering leadership without hiring a full-time principal engineer or tech lead.
What it covers
- Fractional tech-lead / staff-engineer support
- Architecture & technology selection reviews
- Engineering hiring & interview design
- Stakeholder alignment & sprint / scrum facilitation
- Mentoring, code review culture, ADR practice
A typical engagement
Monthly retainer with a standing weekly call, async availability on Slack, and ad-hoc deep-dives on whatever's burning that week. Quarterly written strategy memos.
Deliverables
- Standing weekly call & async support
- Written architecture & hiring memos
- Interview rubrics & sample loops
- Code-review & ADR templates for the team
Duration / shape
3–12 month retainer · 1 or 2 days / week equivalent · capped scope, written boundaries, no Slack-at-midnight.
AI product engineering.
Help teams integrate AI into products and workflows using RAG, LangChain, AI agents, vector databases, and practical production-focused patterns. The focus is not on AI hype — it's on building useful, maintainable, business-relevant AI capabilities that survive contact with paying customers.
What it covers
- Retrieval-augmented generation (RAG) & vector search (Pinecone, pgvector)
- LangChain & agent loops, tool use, structured output
- Evals, prompt engineering, model & vendor selection
- Cost, latency, caching, model routing
- Safety, guardrails, observability for LLM systems
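The retrieval half of RAG is, at its core, nearest-neighbour search over embeddings. A toy sketch with tiny hand-made vectors and cosine similarity — in a real system the embeddings come from a model and the search runs in pgvector or Pinecone, per the list above; all names here are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy RAG retrieval step: rank documents by cosine similarity to a query embedding.
class Retriever {
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na  += a[i] * a[i];
            nb  += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Returns the ids of the top-k documents most similar to the query.
    static List<String> topK(double[] query, Map<String, double[]> docs, int k) {
        return docs.entrySet().stream()
                .sorted((x, y) -> Double.compare(cosine(query, y.getValue()),
                                                 cosine(query, x.getValue())))
                .limit(k)
                .map(Map.Entry::getKey)
                .toList();
    }
}
```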
A typical engagement
Two weeks of evals and instrumentation to find out what's actually broken, then 4–8 weeks turning a fragile demo into a system you can ship to paying customers without holding your breath.
Deliverables
- Eval harness & baseline measurements
- Production-grade retrieval & agent code
- Cost / latency / quality dashboards
- Model & vendor selection memo
Duration / shape
4–12 weeks · suits teams with an existing prototype that needs to harden, not greenfield "let's add AI" projects.
Sound like a fit? Start with a call.