SRE Engineer – Assistant Manager
Description
The Role: We are seeking a highly skilled and experienced ELK SRE Engineer to join our dynamic team. In this role, you will be responsible for the design, implementation, maintenance, and optimization of our Elasticsearch, Logstash, and Kibana (ELK) stack, ensuring its reliability, scalability, and performance. You will play a crucial part in providing robust logging, monitoring, and analytics solutions that are critical to our operational insights and incident response.
Responsibilities:
- Finacle Engineer:
- Maintain and troubleshoot issues in the Finacle core banking system.
- Ensure smooth functioning of modules like customer management, deposits, loans, and payments.
- Customize Finacle modules using scripting languages (e.g., Finacle Scripting Language, PL/SQL).
- Develop new features or modify existing ones based on business requirements.
- Maintain detailed documentation of customizations, configurations, and troubleshooting procedures.
- Troubleshoot complex issues related to data ingestion, search performance, and Kibana visualizations.
- Site Reliability Engineering (SRE) Principles:
- Apply SRE principles to the ELK stack, focusing on automation, observability, and continuous improvement.
- Develop and implement monitoring and alerting solutions for the ELK infrastructure and data pipelines.
- Define and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for ELK services.
- Conduct post-incident reviews to identify root causes and implement preventative measures.
- Data Ingestion and Pipelines:
- Design, implement, and optimize data ingestion pipelines using Logstash, Beats (Filebeat, Metricbeat, Heartbeat, etc.), Kafka, or other relevant technologies.
- Develop custom Logstash filters and configurations to parse, enrich, and transform log data.
- Ensure data quality, integrity, and security throughout the ingestion process.
- Collaboration & Mentorship:
- Work closely with development, operations, and security teams to understand their logging and monitoring requirements.
- Provide expertise and guidance on best practices for using the ELK stack.
- Create documentation, runbooks, and training materials for ELK users and administrators.
- Mentor junior engineers and contribute to a culture of knowledge sharing.
- Automation:
- Automate ELK deployment, configuration, and operational tasks using tools like Ansible, Terraform, Puppet, or Chef.
- Develop scripts (Python, Go, Bash) to streamline common ELK administration tasks.
Qualifications
Required Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
- 5+ years of experience working with and managing large-scale ELK (Elasticsearch, Logstash, Kibana) deployments.
- Strong understanding of Elasticsearch architecture, performance tuning, and scaling strategies.
- Proficiency in configuring Logstash pipelines and Beats for various data sources.
- Experience with Kibana for dashboard creation, visualization, and alerting.
- Solid experience with Linux/Unix operating systems.
- Experience with the Azure cloud platform.
- Familiarity with SRE principles and practices, including SLOs, SLIs, and error budgets.
- Strong problem-solving skills and the ability to troubleshoot complex distributed systems.
- Excellent communication and collaboration skills.
Preferred Qualifications:
- Experience with Kafka or other message queuing systems.
- Knowledge of other monitoring tools (Dynatrace, Datadog).
- Familiarity with security best practices for the ELK stack.
- Certifications related to Elasticsearch or cloud platforms.