The Operating Systems Engineering team is trusted to establish the tooling and architectural standards for all aspects of operating system platforms, focusing on Linux but also including Solaris, AIX, and Windows. We focus on deep troubleshooting of complex performance issues across all application and system stacks. We benchmark and certify the production readiness of new OS releases and patches, and we provide engineering leadership for systems administration and application SRE teams.
What's in it for you:
You'll work with modern, open-source tooling while maintaining mission-critical systems hosting a wide array of applications for the Terminal product. Systems Stability Engineers will trust you as an escalation point, and you'll regularly collaborate with them to maintain the stability and performance of operating systems and servers. We'll depend on you to advise on the design, architecture, and utilization of enterprise-class operating systems, with a particular focus on Linux.
You’ll need to have:
• Deep understanding of the Linux kernel including the virtual memory, VFS, IPC, network, and process scheduling subsystems
• Understanding of interrupt handling, IRQs and IRQ affinity, and processor sets/cgroups
• Demonstrated experience with performance tuning, including tradeoffs between low latency and throughput, hardware and BIOS-related tuning, and effective use of NUMA
• Ability to create robust testing and certification processes that comprehensively evaluate the impact of hardware changes, tunables, and system software updates on the application stack
• Proficiency in reading and debugging C source code to troubleshoot kernel-space issues
We’d love to see:
• Experience applying formalized performance analysis methodologies, such as the USE Method, to complex problems
• Experience programming in Python or Ruby
• Familiarity with configuration management tools such as Chef, Puppet, Ansible, or SaltStack