Senior Systems Engineer HPC - R-21841

Posted:
10/5/2025, 5:05:04 PM

Location(s):
Haryana, India ⋅ Gurugram, Haryana, India

Experience Level(s):
Expert or higher ⋅ Senior

Field(s):
DevOps & Infrastructure ⋅ IT & Security ⋅ Software Engineering

Workplace Type:
Hybrid

Responsibilities:

System Administration & Maintenance: Install, configure, and maintain HPC clusters (hardware, software, operating systems), perform regular updates/patching, manage user accounts and permissions, and troubleshoot/resolve hardware or software issues.

Performance & Optimization: Monitor and analyse system and application performance, identify bottlenecks, implement tuning solutions, and profile workloads to improve efficiency.

Cluster & Resource Management: Manage and optimize job scheduling, resource allocation, and cluster operations using tools such as Slurm, LSF, Bright Cluster Manager / Base Command Manager, OpenHPC, and Warewulf.

Networking & Interconnects: Configure, manage, and tune Linux networking (TCP/IP, DNS, routing) and high-speed HPC interconnects (InfiniBand, Ethernet) to ensure low-latency, high-bandwidth communication.

Storage & Data Management: Implement and maintain large-scale storage and parallel file systems (Lustre, Ceph, GPFS), ensure data integrity, manage backups, and support disaster recovery.

Security & Authentication: Implement security controls, ensure compliance with policies, and manage authentication and directory services such as LDAP and Active Directory.

DevOps & Automation: Use configuration management and DevOps tooling (Ansible, Terraform, Jenkins, Git) to automate deployments, application packaging (RPM/DEB), and system configuration.

User Support & Collaboration: Provide technical support, documentation, and training to researchers; collaborate with scientists, HPC architects, and engineers to align infrastructure with research needs.

Planning & Innovation: Contribute to the design and planning of HPC infrastructure upgrades, evaluate and recommend hardware/software solutions, and explore cloud-based HPC solutions where applicable.

Rackspace

Website: https://www.rackspace.com/

Headquarters Location: San Antonio, Texas, United States

Employee Count: 1001-5000

Year Founded: 1998

IPO Status: Public

Last Funding Type: Private Equity

Industries: Big Data ⋅ Cloud Computing ⋅ Cloud Infrastructure ⋅ IaaS