Senior DevOps Engineer
Willow runs AI agents on behalf of enterprises. Every agent action spins up isolated, ephemeral compute, talks to sensitive systems, and gets logged for audit. The infrastructure underneath has to be fast enough that users don't notice it, secure enough that CISOs sleep at night, and scalable enough that growth never becomes an infra problem.
That's your job - and you own it end to end. From git push to production traffic, from the first alert to the post-mortem, it's yours.
What you'll actually do
- Own the path from commit to production. Pipelines, rollbacks, preview environments, deploys. When something breaks at 2am, you're the one who fixes it - and the one who makes sure it can't break that way again.
- Run Kubernetes across SaaS, on-prem, and hybrid deployments. Some of our enterprise customers can't send their data to the cloud. You'll design infra that runs in all three without forking the stack.
- Design for isolation and scale. Pod-per-session on EKS, with cold starts you'd be proud of, noisy-neighbor problems that never reach customers, and capacity that grows with the customer base - not ahead of it.
- Build observability that devs can extend on their own. Loki, Prometheus, and Grafana wired up so the next engineer can add a dashboard or alert without filing a ticket. You build the foundation; everyone else builds on top.
- Make the secure path the easy path. Network policies, IAM, secrets, audit trails - designed so doing the right thing is also the fastest thing. Enterprise compliance is table stakes; we sell to companies that read every line of our SOC 2.
- Templatize everything reusable. Helm charts simple enough that a dev with Claude can modify them. Infra patterns documented enough that the second service onboards in an hour, not a week. Your job isn't to be the bottleneck - it's to remove it.
You'll be a fit if
- You've run K8s in production where downtime had consequences - both managed (EKS) and bare-metal/on-prem. The on-prem part is non-negotiable.
- You're deep on cloud (preferably AWS) - networking, IAM, EKS, the parts that bite you at scale.
- You think about security as part of the design, not a review step at the end.
- You build for the team, not for yourself. Your Terraform is readable, your Helm charts are forkable, your dashboards make sense to someone seeing them for the first time.
- You move fast. Decisions in hours, not weeks. You'd rather ship the 80% solution today and iterate than perfect it for a month.
- You've owned infrastructure end to end at a startup or small team - not one slice at a big company.
Bonus points
- You've built or operated infrastructure for AI workloads, browser automation, sandboxed execution, or anything where "untrusted code in a pod" is the threat model.
- You've been the first or second infra hire somewhere before.
Why Willow
You'll build the infrastructure layer for the AI era.
AI adoption is accelerating faster than any technology shift before it. Soon, every employee will delegate thousands of tasks to AI agents - running in parallel, accessing sensitive data, performing actions across internal systems and SaaS tools. Today, this layer has zero governance.
Willow is building the control layer for AI agents: zero-trust authentication, app-aware permissions, centralized audit trails. We enable adoption without compromising security. Every agent, every tool, one control plane.
We're already in production with forward-thinking enterprises. The market is moving fast - and we're positioned to lead it.
What else:
- Founding impact: You're not joining a team - you're building it. Early equity, real ownership, direct influence on product and company direction.
- Ship to real customers: No theoretical exercises. Enterprises are using what we build today.
- Exceptional team: Small, senior, low-ego. Everyone here can build.
- The right moment: Strong traction, active fundraising, and a market that's exploding. This is the window.
