Job Opportunities in Qatar


October 17, 2024

Origen Middle East

Doha

OTHER & FULL TIME


HPC SYSTEMS ADMINISTRATOR

Responsibilities:
a. Cluster Management:
i. Install, configure, and maintain HPC and AI cluster hardware and software.
ii. Monitor cluster performance and resource utilization.
iii. Ensure high availability and reliability of cluster resources.
iv. Manage job scheduling and workload management systems (e.g., SLURM, PBS, Torque).
b. System Administration:
i. Perform regular system updates, patches, and upgrades.
ii. Manage user accounts, permissions, and security protocols.
iii. Implement and maintain system backup and recovery procedures.
iv. Troubleshoot and resolve hardware and software issues.
c. Performance Optimization:
i. Analyze and optimize system performance and resource usage.
ii. Implement tuning parameters and configurations for improved performance.
iii. Collaborate with users to optimize their applications and workflows for HPC and AI environments.
d. User Support:
i. Provide technical support and training to users.
ii. Assist users with job submission, monitoring, and troubleshooting.
iii. Develop and maintain documentation and user guides.
e. Security and Compliance:
i. Implement and enforce security policies and procedures.
ii. Monitor system security and respond to incidents.
iii. Ensure compliance with relevant regulations and organizational policies.
f. Research and Development:
i. Stay updated with the latest technologies and trends in HPC and AI.
ii. Evaluate and recommend new hardware, software, and methodologies.
iii. Collaborate with research teams to support new projects and initiatives.
g. Collaboration:
i. Work closely with IT, research, and development teams to understand their requirements and provide solutions.
ii. Participate in cross-functional projects and initiatives.
iii. Communicate effectively with stakeholders and management.
h. Documentation and Reporting:
i. Maintain detailed documentation of system configurations, procedures, and changes.
ii. Generate reports on system performance, usage, and incidents.
iii. Document and share best practices and lessons learned.
Qualifications:
a. Education:
i. Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field. Master's degree preferred.
b. Experience:
i. Proven experience in HPC/AI and/or mainframe cluster administration.
ii. Strong background in Linux/Unix system administration.
iii. Experience with job scheduling and workload management systems.
iv. Familiarity with AI and machine learning frameworks and tools.
c. Skills:
i. Proficiency in scripting languages (e.g., Python, Bash).
ii. Strong analytical and problem-solving skills.
iii. Excellent communication and teamwork skills.
iv. Knowledge of networking, storage, and security concepts.
d. Certifications (Optional):
i. Certified HPC Administrator (CHPCA)
ii. IBM or other Mainframe certifications
iii. Red Hat Certified Engineer (RHCE)
iv. AWS Certified Solutions Architect
v. Nutanix Certified Professional (NCP)
vi. Nutanix Certified Advanced Professional (NCAP)
vii. Nutanix Platform Expert (NPX)
viii. Other relevant certifications
Job Types: Full-time, Permanent
Experience:
  • Linux/Unix: 2 years (Preferred)
  • HPC cluster administration: 2 years (Preferred)
  • Installing and configuring HPC/AI clusters: 2 years (Preferred)
Language:
  • English (Required)
  • Arabic (Required)
