Senior Software Engineer - OpenStack

MulticoreWare
Full-time
On-site
Chennai, Tamil Nadu, India

Key Responsibilities

Debugging and Troubleshooting:

·       Investigate and resolve complex software issues within OpenStack environments (particularly those running on Ubuntu), including networking, compute, and storage.

·       Diagnose and troubleshoot problems related to Kubernetes container orchestration, including pod failures, service outages, and networking issues.

·       Debug and analyze issues with Docker containers and their interaction with the underlying system.

·       Analyze and resolve issues related to Ceph distributed storage, including data replication, performance tuning, and storage availability.

·       Troubleshoot Octavia load balancers, including the underlying L2/L3 networking issues, to ensure reliable load balancing for cloud-native applications.

Incident and Problem Management:

·       Lead incident resolution efforts for platform outages or performance degradation, coordinating across different teams to ensure swift recovery.

·       Perform root cause analysis (RCA) and provide long-term fixes for recurring or critical issues.

·       Document incident postmortems to prevent future occurrences and improve processes.

Performance Optimization:

·       Analyze performance bottlenecks across the cloud stack, including OpenStack components, Kubernetes, and Ceph, and implement optimizations to improve reliability and efficiency.

·       Optimize networking setups, including Octavia load balancers, to enhance cloud service delivery.

·       Monitor and improve containerized application performance and scaling across Docker and Kubernetes clusters.

Cloud Platform Maintenance:

·       Assist in upgrading and maintaining cloud infrastructure, ensuring that all components (Ubuntu, OpenStack, Kubernetes, Ceph, etc.) are kept secure and up to date.

·       Participate in the deployment of software updates, security patches, and configuration changes in a controlled manner with minimal downtime.

Automation and Tooling:

·       Build and maintain automation scripts for monitoring, troubleshooting, and resolving cloud platform issues, focusing on OpenStack, Kubernetes, Ceph, and Docker environments.

·       Implement and optimize Infrastructure as Code (IaC) solutions to improve the deployment and configuration of cloud resources.

Required Skills and Qualifications

·       3+ years of experience with cloud platforms, with a specific focus on Ubuntu, OpenStack, Kubernetes, Ceph, Octavia load balancers, and Docker.

·       Strong debugging skills and familiarity with cloud and software debugging tools.

·       Experience with networking, compute, and storage components in OpenStack.

·       Hands-on experience with containerization (Docker) and orchestration (Kubernetes).

·       Familiarity with Ceph distributed storage solutions and troubleshooting storage issues.

·       Experience with monitoring and logging tools, such as Prometheus, Grafana, and Elasticsearch.

·       Solid understanding of networking principles, including L2/L3 networking, load balancing (Octavia), and SDN (Software Defined Networking).

·       Proficiency in a scripting language such as Python or Bash for automation.

·       Strong communication skills and the ability to work in a collaborative environment.