About the Role
- Scale’s rapidly growing Global Public Sector team is focused on using AI to address critical challenges facing the public sector around the world. Our core work consists of:
- Creating custom AI applications that will impact millions of citizens
- Generating high-quality training data for national LLMs
- Upskilling and advisory services to spread the impact of AI
- As a Principal AI Ops Architect, you will design and develop the production lifecycle of full-stack AI applications, while supporting end-to-end system reliability, real-time inference observability, sovereign data orchestration, high-security software integration, and the resilient cloud infrastructure required for our international government partners.
- At Scale, we’re not just building AI solutions—we’re enabling the public sector to transform their operations and better serve citizens through cutting-edge technology. If you’re ready to shape the future of AI in the public sector and be a founding member of our team, we’d love to hear from you.
Responsibilities
- Own the production outcome: Take full accountability for the long-term performance and reliability of AI use cases deployed across international government agencies.
- Ensure full-stack integrity: Oversee the end-to-end health of the platform, ensuring seamless integration between the AI core and all full-stack components, from APIs to UI, to maintain a responsive and production-ready environment.
- Scale the feedback loop: Build automated systems to monitor model performance and data drift across geographically dispersed environments, ensuring the right levels of reliability.
- Navigate global compliance: Manage the technical lifecycle within diverse regulatory frameworks.
- Incident command: Lead the response for production issues in mission-critical environments, ensuring rapid resolution and building the guardrails to prevent them from happening again.
- Bridge the gap: Translate deep technical performance metrics into clear insights for senior international government officials.
- Drive product evolution: Partner with our Engineering and ML teams to ensure the lessons learned in the field directly influence the technical architecture and decisions of future use cases.
Requirements
- Experience: 6+ years in a high-impact technical role (SRE, FDE or MLOps) with experience in the public sector.
- Global perspective: Familiarity with international government security standards and the complexities of deploying sovereign AI.
- System architecture proficiency: Proven experience maintaining production-grade applications, with a deep understanding of the full request lifecycle, connecting frontend/API layers to the backend and AI core.
- Modern AI stack expertise: Proficiency in coding and in modern AI infrastructure, including Kubernetes, vector databases, agentic development, and LLM observability tools.
- Ownership: You treat every production deployment as your own. You race toward solving hard problems before the customer even sees them.
- Reliability: You understand that in the public sector, a model failure may be a risk to public safety or privacy.
- Customer communication: The ability to explain to a high-ranking official why the performance of the system has degraded and how we are fixing it.
Benefits
- Our holistic approach to supporting Scaliens includes comprehensive health coverage, dental and vision insurance, mental healthcare services, and more. PTO policies and accommodating schedules ensure you'll get time off when you need it to relax and recharge. Note that our offerings may vary by region as we strive to respond to the unique needs of Scaliens around the globe.
- Continuously learn and grow through an annual learning & development stipend, leadership breakfasts, manager training, speaker series, and ERG membership.
- We welcome guests to our offices, and you can expect to see Scalien families and friends around. Join local happy hours, and accept invites to game nights, book clubs, and many other employee-led community events.
- Balancing work and family is essential, and Scale understands the importance of having adequate leave policies in place to promote a healthy home and work life.