Country
Romania
Job Family
Technology
Our Mission
At Consumer Panel Services GfK, our mission is clear: to provide key information on who buys what, where, how much, how often, and most importantly, why (or why not). We take pride in delivering high-quality, uninterrupted data, expert consultancy, and unparalleled expertise to empower our clients to deliver superior customer experiences at every stage of the shopper journey.
Job Description
We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our team. In this role, you will be responsible for ensuring the reliability, availability, and performance of our cloud-based infrastructure, primarily on AWS, through automation, monitoring, and proactive incident resolution. You will leverage your expertise in Infrastructure as Code (IaC) with Terraform, container orchestration with Kubernetes, and Windows server administration to maintain and improve our systems.
Key Responsibilities:
- Cloud Infrastructure Management: Design, build, and maintain scalable and reliable infrastructure on AWS, ensuring high availability and disaster recovery strategies are in place.
- Infrastructure as Code (IaC): Implement and manage infrastructure using Terraform, ensuring consistency, version control, and automation across environments.
- Kubernetes Orchestration: Deploy, manage, and troubleshoot containerized applications using Kubernetes, focusing on scalability, performance, and security.
- Windows Server Administration: Monitor, manage, and optimize Windows server environments, ensuring timely patching, security hardening, and performance tuning.
- Monitoring & Incident Response: Implement monitoring solutions to ensure observability of system performance, and manage incident responses to address outages and degradations.
- Automation & Optimization: Identify and automate manual processes to enhance system efficiency, reduce downtime, and improve the overall developer and user experience.
- Collaboration: Work closely with development teams, operations, and security to ensure smooth deployments, continuous integration, and compliance with best practices.
- Security: Ensure infrastructure is secure, compliant with industry standards, and aligned with the company’s security policies.
Requirements:
- Solid understanding of Windows Server administration, including Active Directory, patch management, and performance optimization.
- Proven experience as a Site Reliability Engineer or in a similar role.
- Knowledge of AWS services (EC2, S3, RDS, VPC, IAM, etc.) and architecture best practices.
- Knowledge of Infrastructure as Code (IaC) with Terraform for managing and automating cloud environments.
- Experience with Kubernetes for container orchestration in production environments.
- Proficiency in monitoring tools (e.g., Prometheus, Grafana, CloudWatch) and experience in setting up alerting and logging systems.
- Knowledge of automation and configuration management tools such as Ansible or Puppet.
- Experience with CI/CD pipelines and DevOps practices.
- Strong scripting skills (e.g., PowerShell, Python, Bash) for automating tasks.
- Understanding of security best practices for cloud and containerized environments.
We are an ethical and honest company that is wholly committed to its clients and employees. We are proud to be an inclusive workplace for all and are committed to equal employment opportunity, focusing on all of our employees reaching their full potential.
We respect and value every employee regardless of race, ethnicity, gender, sex, sexual orientation, age, personality, experience, culture, faith, socio-economic status, or physical or mental disabilities.
We endorse the core principles and rights set forth in the United Nations Declaration of Human Rights and the Social Charter of Fundamental Rights of the European Union, promoting the universal values of human dignity, freedom, equality, and solidarity.
Learn more about who we are and everything we do on:
https://www.gfk-cps.com/about-gfk-cps