Lead Architect, Runtime

SambaNova Systems

IT
Palo Alto, CA, USA
Posted on Jan 14, 2025

The era of pervasive AI has arrived. In this era, organizations will use generative AI to unlock hidden value in their data, accelerate processes, reduce costs, and drive efficiency and innovation, fundamentally transforming their businesses and operations at scale.

SambaNova Suite™ is the first full-stack, generative AI platform, from chip to model, optimized for enterprise and government organizations. Powered by the intelligent SN40L chip, the SambaNova Suite is a fully integrated platform, delivered on-premises or in the cloud, combined with state-of-the-art open-source models that can be easily and securely fine-tuned using customer data for greater accuracy. Once adapted with customer data, customers retain model ownership in perpetuity, so they can turn generative AI into one of their most valuable assets.

We’re seeking a Lead Architect, Runtime to join our talented Runtime team—a group of engineers who have a proven track record of building software that directly powers advanced AI workloads and scientific computing. As a key technical leader, you will be responsible for designing and architecting a high-performance, distributed, and scalable software runtime that supports our broad array of data-flow applications, including machine learning training and inference, data processing pipelines (ETL), and HPC applications.

In this role, you will have the opportunity to define and deliver the architecture of our entire runtime stack, driving everything from OS-level integration to performance profiling, networking, and optimization, while working closely with hardware teams to design the most efficient systems.

Key Responsibilities:

  • Architectural Leadership: Lead the design, development, and performance optimization of the software runtime stack, ensuring it meets the high-performance and scalability requirements of ML, AI, and HPC applications.
  • Runtime System Design: Architect embedded software infrastructure to enable smooth integration of high-level applications with the underlying hardware, including OS interface/integration, partitioned workload orchestration, fault management, and inter-node communication.
  • Hardware Interaction: Oversee and guide the low-level integration between software and hardware components, ensuring efficient chipset initialization, monitoring, and fault management.
  • Technical Strategy: Drive the technical direction for the Runtime Engineering team, ensuring the design and implementation of software that delivers performance and scales efficiently with our next-generation AI hardware and platforms based on our Reconfigurable Dataflow Architecture.
  • Tooling and Profiling: Lead the design and development of tools and performance profilers, empowering customers to configure, deploy, and optimize their workloads on SambaNova’s Datascale systems.
  • Mentorship and Team Development: Inspire and guide the team to continuously improve development processes, coding standards, and collaboration practices. Foster a culture of excellence, accountability, and technical growth.
  • Cross-functional Leadership: Collaborate with hardware, software, and product teams to define requirements and ensure seamless integration between hardware and system software components.

Skills and Qualifications:

  • Strong Software Engineering Background: Proven experience building, testing, and tuning software for distributed, high-performance systems. In-depth knowledge of operating systems and runtime stacks.
  • Real-Time Operating Systems (RTOS): Hands-on experience with RTOS and system-level software that directly interfaces with hardware.
  • High-Performance Computing (HPC): Expertise in designing and optimizing systems that handle massive parallel workloads, including machine learning training and inference tasks that involve billions of operations per second.
  • Low-Level System Understanding: Deep understanding of hardware-software interaction, including registers, device memory management, and the intricacies of accelerator design. Experience working with ASIC accelerators is highly desirable.
  • Distributed Systems Expertise: Familiarity with distributed systems architecture, including networking, communication protocols, and the challenges of scaling compute resources efficiently.
  • Toolchain Expertise: Hands-on experience with software development tools such as Git, Jenkins, and Jira, with an ability to drive automation and continuous integration efforts.
  • Cross-Disciplinary Knowledge: Ability to work at the intersection of hardware and software, designing systems that optimize both performance and reliability.

Preferred Experience:

  • ASIC/FPGA Expertise: Experience designing or working closely with custom hardware accelerators (ASICs, FPGAs, etc.) and understanding low-level interactions.
  • Cloud and Data Center Experience: Familiarity with deploying high-performance systems in distributed, cloud, or data center environments.

What We Offer:

  • Opportunity to work on cutting-edge technologies that power the next generation of AI and ML applications.
  • A collaborative, dynamic environment where your ideas and leadership will have a direct impact on the success of the company.
  • A chance to work with some of the brightest minds in the industry and contribute to groundbreaking innovations in AI, HPC, and distributed computing.

If you're passionate about designing high-performance, distributed systems and want to lead the architectural evolution of AI infrastructure, we want to hear from you.

Annual Salary Range and Level

The base salary for this position ranges from $200,000/year up to $250,000/year. This range is based on role, level, and location and reflects the salary target for new hires in the US. Individual pay within the range will depend on a number of factors, including a candidate’s job-related qualifications, skills, competencies and experience, and location.

Benefits Summary for US-Based Full-Time Direct Employment Positions

(The Recruiter will provide benefit details for non-US-based roles)

SambaNova offers a competitive total rewards package, including the base salary, plus equity and benefits. We cover 95% of the premium for employee medical insurance and 77% of the premium for dependents, and we offer a Health Savings Account (HSA) with employer contribution. We also offer Dental, Vision, Short/Long-Term Disability, Basic Life, Voluntary Life, and AD&D insurance plans, in addition to Flexible Spending Account (FSA) options such as Health Care, Limited Purpose, and Dependent Care. Our library of well-being benefits available to you and your dependents includes a full subscription to Headspace, Gympass+ membership with access to physical gyms, One Medical membership, counseling services with an Employee Assistance Program, and much more.

Submission Guidelines

Please note that in order to be considered an applicant for any position at SambaNova Systems, you must submit an application form for each position for which you believe you are qualified.

If you are a new, recent (within the last two years), or upcoming college graduate and are interested in opportunities with SambaNova Systems, please apply through our university job listings.

EEO Policy

SambaNova Systems is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to age (40 and over), color, disability, gender identity, genetic information, marital status, military or veteran status, national origin/ancestry, race, religion, creed, sex (including pregnancy, childbirth, breastfeeding), sexual orientation, or any other applicable status protected by federal, state, or local laws.