About Sqwish

Sqwish auto-tunes every layer of an AI stack in real time so companies optimise for business outcomes, not just latency or cost. We close the loop between production data, user behaviour and model choices, letting product teams ship faster and win in crowded GenAI markets.
We’re a fast-moving, ambitious team that cares deeply about what we build and how we build it. Speed is part of our DNA - we ship early, iterate quickly, and treat momentum as a core advantage. But we never trade quality for haste: curiosity, thoughtful decisions, and a high bar for craft guide every part of our work. We’re humble learners in a rapidly shifting field and genuinely excited to build alongside others who bring care, pace, and clarity to their craft.
Find out more at https://sqwish.ai
This is a flexible internship tailored to your strengths and interests. Successful candidates will be matched to one (or a blend) of the following tracks: ML Research/Engineering, Backend Engineering, or MLOps/Platform.

You’ll ship real features with real impact, supported by engineers and researchers who care about craft and learning.
The problems you’ll tackle
• Designing, training and evaluating models that learn from continuous, real-world data.
• Building APIs and services that handle high request volumes with tight latency budgets.
• Streaming and structuring outcome signals to close the loop between product usage and model behaviour.
• Creating robust CI/CD for data → training → deployment → rollback, across AWS/GCP.
• Instrumenting systems with metrics and traces; defining SLOs that reflect user experience.
**Example responsibilities (we’ll shape these with you)**

ML Research/Engineering
• Prototype and test learning algorithms aligned to product goals and constraints.
• Fine-tune models; define reward functions and evaluation metrics beyond static benchmarks.
• Keep great research hygiene: clear hypotheses, structured logging, reproducible setups.
• Analyse data and model behaviour to prioritise improvements and guide next experiments.
Backend Engineering
• Build production-grade Python services; contribute to Rust/C++ for latency-critical paths where helpful.
• Model clean domain boundaries; expose REST/gRPC interfaces and Kafka topics.
• Own build → release with Terraform, Helm and GitHub Actions; run load tests and iterate fast.
• Instrument services with Prometheus/OpenTelemetry; set and track SLOs.
MLOps/Platform
• Build CI/CD pipelines for ML workflows; automate infra (Terraform, Helm, GitHub Actions).
• Design container-based inference services (Docker, Kubernetes) with clear rollback strategies.
• Set up experiment tracking, dataset/version management, and model governance (e.g. MLflow/W&B).
• Profile and optimise inference (latency, memory, batch serving) on CPU/GPU; monitor drift and performance.
We don’t expect mastery of every bullet - strength in some areas plus the drive to learn the rest beats a perfect checklist.
Nice to have (all teachable on the job)
• Familiarity with ML training/fine-tuning, RL/reward modelling, or evaluation design.
• Exposure to distributed training, inference optimisation (ONNX/TensorRT/quantisation), or GPU workloads.
• Comfort with cloud infra (AWS/GCP), Docker/Kubernetes, Terraform, or observability stacks.
• Experience with high-throughput APIs, streaming architectures, or message queues (Kafka).
• Contributions to research papers, internal tooling, or open-source.
What to expect
• Salary: $1.2k–2.5k/month, based on location and experience.
• Duration: minimum 10–12 weeks (longer by mutual agreement).
• Timing: available year-round (Summer/Winter).
• Location: Cambridge (preferred); remote internships are also available.
APPLY: