Data Manager (Bioinformatics & HPC)

Posted:
11/14/2024, 4:00:00 PM

Location(s):
New York, United States ⋅ Town of Amherst, New York, United States

Experience Level(s):
Mid Level ⋅ Senior

Field(s):
Data & Analytics

Workplace Type:
On-site

POSITION SUMMARY

Since its beginning, SFARI (Simons Foundation Autism Research Initiative) has partnered with families and clinical centers across the country to build large and diverse cohorts of well-characterized individuals with autism or with specific genetic alterations associated with neurodevelopmental risk. These include the Simons Simplex Collection (SSC), Simons Searchlight, Autism Inpatient Collection, and SPARK. The principles of community-based participatory research have been key to all cohort activities. The Simons Foundation Informatics group manages the collection and distribution of large-scale aggregate and deidentified clinical and genomic data from these cohorts, which are made available to autism researchers through SFARI Base, a clearinghouse for autism and autism-related research data and biospecimens supported by SFARI.

The Bioinformatics team at the Simons Foundation is seeking a full-time data manager. The position will be at the forefront of the Foundation’s open data initiatives and will improve data organization, documentation, and access for downstream researchers. The ideal candidate will have strong experience managing large-scale data in a Linux or HPC environment, experience following and improving data management SOPs, and outstanding attention to detail and organization.

The candidate will be responsible for overseeing an extensive and growing collection of genomics data (150,000+ whole-exomes and 15,000+ whole-genomes) and other biomedical data across the Simons Foundation autism cohorts and neuroscience collaborations. The data manager will report to the Deputy Director of Bioinformatics and will work alongside bioinformatics engineers, software engineers, and data analysts on the Bioinformatics team, as well as with Informatics' engineering team, the SDBR (SFARI Data and Biospecimen Repository) team, and the Simons Foundation’s Cloud Systems team.

This full-time position is based on-site in the Simons Foundation offices in New York City.

Responsibilities

  • Data organization and stewardship
    • Coordinate and track data receipt from vendors, research collaborators, and external investigators
    • Develop data processing pipelines for common data cleaning and data organization needs
    • Perform incoming data cleaning as needed, including de-identifying sample identifiers, metadata standardization, and data organization, following data management SOPs
    • Perform quality control checks on incoming data and released datasets
    • Maintain and improve data management SOPs, including quality, compliance, and privacy considerations
    • Maintain a data catalog to document dataset descriptions, file locations, SFARI Base availability, and backup/archive statuses
  • Data sharing and support
    • Support data sharing for SFARI investigators and various SFARI cohorts and collaborations
    • Manage sharing on SFARI Base of datasets received from external investigators
    • Help with periodic minor updates to internally generated datasets released on SFARI Base
    • Manage data-related questions from external investigators about released datasets
    • Support data access for cloud platforms

MINIMUM QUALIFICATIONS

  • Education
    • Bachelor’s or Master’s degree in computer science, data science, engineering, or a related field
    • Minimum of 3 years of experience in data management
  • Required skills
    • Experience managing large-scale data in an HPC environment
    • Strong Linux and Bash experience, including data permissions management
    • Strong Python experience, including Pandas and similar packages
    • Experience using Git/GitHub
    • Strong organizational skills and attention to detail
    • Effective oral and written communication skills
    • Ability to thrive in collaborative environments
  • Helpful skills
    • Project management experience
    • Familiarity with genomics data and file types
    • Familiarity with popular bioinformatics command-line tools
    • Experience with data processing and data storage on cloud platforms
    • Experience creating and/or hosting dashboards or platforms to visualize and/or interact with data

REQUIRED APPLICATION MATERIALS

  • Resume

  • Cover letter stating your interest in the position

  • Code repository or example code

COMPENSATION AND BENEFITS

  • The full-time annual compensation range for this position is $130,000 – $145,000, depending on experience.

  • In addition to competitive salaries, the Simons Foundation provides employees with an outstanding benefits package.

THE SIMONS FOUNDATION'S DIVERSITY COMMITMENT

Many of the greatest ideas and discoveries come from a diverse mix of minds, backgrounds and experiences, and we are committed to cultivating an inclusive work environment. The Simons Foundation actively seeks a diverse applicant pool and encourages candidates of all backgrounds to apply. We provide equal opportunities to all employees and applicants for employment without regard to race, religion, color, age, sex, national origin, sexual orientation, gender identity, genetic disposition, neurodiversity, disability, veteran status, or any other protected category under federal, state and local law.