Posted:
11/14/2024, 4:00:00 PM
Location(s):
Columbia, Maryland, United States ⋅ Maryland, United States
Experience Level(s):
Senior
Field(s):
IT & Security
The Northrop Grumman Microelectronics Design and Applications (MDA) Business Area's (BA) Research Computing (RC) organization seeks high-performance computing (HPC) professionals to support its growing teams of scientists and engineers. The MDA RC organization is responsible for the design, development, deployment, security, operation, monitoring, management, and support of all computational hardware and software used for basic research, modeling and simulation, circuit design, and circuit testing. RC oversees multiple large, highly complex compute clusters that users of diverse backgrounds, roles, and responsibilities rely on in their daily work. RC is searching for HPC operations, administration, and applications experts who are excited to learn new technologies and contribute to a highly dynamic organization.
Oversee operation of a high-performance compute cluster
Lead team of HPC Systems Administrators
Investigate, diagnose, and resolve acute system faults
Ensure system performance aligns with customer requirements
Maintain software deployments
Maintain security compliance
Monitor and maintain hardware
Contribute to design of new high-performance compute clusters
Assess and respond to customer requests for cluster modifications
Interface with user support staff
Assess new technology for benefits and risks
Assess and report on cluster operational risks and propose mitigation strategies
Position can be worked out of Annapolis Junction or Linthicum sites
Bachelor's degree with 8 years' relevant experience; Master's degree with 6 years' experience; or PhD with 4 years' experience. Will consider 4 years' additional experience in lieu of a degree.
Strong Linux (Red Hat) systems administration proficiency
Strong knowledge and experience with concepts of high-performance computing system operations, including cluster management, multi-user login environments, job scheduling, and networked file systems
Strong knowledge and experience maintaining compliance with Security Technical Implementation Guides (STIGs)
Strong knowledge and experience with compiling software
Strong knowledge and experience monitoring and maintaining high-performance compute cluster hardware
Experience directing technical work of a small team of Linux Systems Administrators
Strong written and verbal communication skills
U.S. citizen with the ability to obtain a TS/SCI clearance with full-scope polygraph
Preferred Qualifications:
Active TS/SCI with full-scope polygraph security clearance
IAT Level II certification
Experience with MPI implementations
Experience with high-speed, low-latency network fabrics
Experience with parallel file systems
Experience with GPUs
Website: https://northropgrumman.com/
Headquarter Location: Falls Church, Virginia, United States
Employee Count: 10001+
Year Founded: 1994
IPO Status: Public
Last Funding Type: Grant
Industries: Data Integration ⋅ Manufacturing ⋅ Remote Sensing ⋅ Security ⋅ Software