Sr/Principal Data Reliability Engineer

Lam Research
$111,000.00 - $244,000.00
United States, California, Fremont
4650 Cushing Parkway
Jul 01, 2025
The Group You'll Be a Part Of

The Global Information Systems Group is dedicated to the success of Lam through providing best-in-class and innovative information system solutions and services. Together, we support users globally with data, information, and systems to achieve their business objectives.

The Impact You'll Make

The Senior / Principal Data Platform Reliability Engineer will own end-to-end platform stability, reliability, and operational excellence. This role requires deep technical expertise combined with a diplomatic yet assertive leadership style to provide objective operational insights, identify critical weaknesses, and drive improvement across AI and Analytics products in production.

What You'll Do
  • Own and enforce best practices in data pipeline and data platform reliability, availability, scalability, and maintainability.
  • Collaborate closely with Directors of AI and Analytics to understand goals, architectures, and production deployments.
  • Proactively monitor production deployments, identify bottlenecks, issues, and improvement opportunities, and communicate them clearly and objectively.
  • Develop and continuously refine SLIs, SLOs, and SLAs for the platform, and hold the responsible parties accountable for meeting them.
  • Serve as the authoritative voice for operational excellence and customer experience, ensuring objective feedback is regularly integrated into product enhancements.
  • Foster transparency in technical evaluations and objectively report operational insights to all stakeholders, including senior management.
  • Assess current platform capabilities, use cases, and future-state transformation requirements to benchmark data operations reliability.
  • Assess current observability capabilities, then define a roadmap for and implement the future-state observability capability.
  • Support disaster recovery planning and testing in partnership with infrastructure teams, application teams, and domain architects.
  • Assist in troubleshooting operational issues and lead root cause and corrective action analysis.
  • Mentor and train the existing SRE/Ops team to enhance their technical competencies and assertiveness in engaging with engineering leadership.
Who We're Looking For
  • Bachelor's degree or equivalent in Engineering or Computer Science.
  • 8+ years of direct experience managing production-grade data platforms (analytics pipelines, machine learning infrastructure, and big data frameworks).
  • Strong hands-on technical skills in cloud infrastructure (Azure, AWS, GCP), containerization (Kubernetes), data platforms (Snowflake, Databricks, Microsoft Fabric, BigQuery), and observability tools (Prometheus, Grafana, Datadog).
  • Exceptional analytical skills, especially in troubleshooting and proactive risk identification.
  • Demonstrated ability to communicate clearly, objectively, and assertively, particularly when challenging existing viewpoints or practices.
  • Proven experience leading cross-functional discussions among diverse and senior technical stakeholders.
  • Strong interpersonal skills to diplomatically manage organizational politics without compromising technical integrity.
  • Experience in transforming reactive operations into proactive, scalable SRE functions.
Preferred Qualifications
  • Experience in organizations with competing analytical and AI teams.
  • Certifications in cloud (e.g., AWS Solutions Architect Professional, Google Professional Cloud Architect, Azure Solution Architect).
Our Commitment

We believe it is important for every person to feel valued, included, and empowered to achieve their full potential. By bringing unique individuals and viewpoints together, we achieve extraordinary results.

Lam Research ("Lam" or the "Company") is an equal opportunity employer. Lam is committed to and reaffirms support of equal opportunity in employment and non-discrimination in employment policies, practices and procedures on the basis of race, religious creed, color, national origin, ancestry, physical disability, mental disability, medical condition, genetic information, marital status, sex (including pregnancy, childbirth and related medical conditions), gender, gender identity, gender expression, age, sexual orientation, or military and veteran status or any other category protected by applicable federal, state, or local laws. It is the Company's intention to comply with all applicable laws and regulations. Company policy prohibits unlawful discrimination against applicants or employees.

Lam offers a variety of work location models based on the needs of each role. Our hybrid roles combine the benefits of on-site collaboration with colleagues and the flexibility to work remotely, and fall into two categories: On-site Flex and Virtual Flex. In 'On-site Flex', you'll work 3+ days per week on-site at a Lam or customer/supplier location, with the opportunity to work remotely for the balance of the week. In 'Virtual Flex', you'll work 1-2 days per week on-site at a Lam or customer/supplier location, and remotely the rest of the time.

Salary

CA San Francisco Bay Area Salary Range for this position: $111,000.00 - $244,000.00.

The above salary range for this position is relevant only to applicants who reside or work onsite in the San Francisco Bay Area, California. Salary offers will depend on factors that include the location you work from, your level, education, training, specific skills, years of experience and comparison to other employees already in this role. Actual salary may vary from salary offered due to numerous factors including but not limited to unpaid time off, unpaid leave, company mandated shutdown, and other relevant factors.

Our Perks and Benefits

At Lam, our people make amazing things possible. That's why we invest in you throughout the phases of your life with a comprehensive set of outstanding benefits.
