Principal DevOps Engineer/SRE
Location: Redwood City, CA
Posted On: 04/05/2022
Requirement Code: 57212
As a Principal DevOps Software Engineer, you will actively interface with software developers, product managers, test engineers, and administrators on projects to design and develop the build, release, and deploy toolchain for DevOps while providing on-call support. You should be able to identify, troubleshoot and resolve issues quickly and effectively, sometimes under pressure. Responsibilities include capacity planning, high availability engineering, performance tuning, and automation/tools development.
You should have strong leadership skills, experience managing infrastructure through multiple product releases, and a passion for reliability and security. You will work with management to set priorities and track operational metrics. Excellent communication skills and teamwork are a must!
• Design and develop the build, release, and deploy toolchain for DevOps
• Set up, manage, and maintain parity across development, staging, and production application environments in cloud infrastructure
• Provide a release cadence across multiple environments
• Prototype and develop cloud-native architecture solutions for application needs
• Design and implement monitoring infrastructure
• Provide support for production operations
• A Bachelor's degree in Computer Science or a related field, with 10+ years of experience in a Software Reliability Engineering, Systems Engineering, or DevOps role, is required
• Strong ability to architect development toolchains and cloud infrastructure
• Strong knowledge of Linux systems and internals
• Experience developing software to automate production systems in one of the following languages: Python, Ruby, Java, or Golang (Python or Golang preferred)
• Strong working knowledge of AWS cloud infrastructure (EC2, RDS, VPC peering, Route 53, S3, Auto Scaling)
• Strong experience with container technology, including Kubernetes and Docker
• Strong experience provisioning infrastructure through IaC (preferably Terraform) and with cloud automation principles
• Good understanding of networking and related protocols; must have a strong understanding of the fundamentals (HTTP, DNS, TLS)
• Proficiency with source control and CI/CD pipelines (e.g., Git, Jenkins, Harness)
• Demonstrated experience troubleshooting problems and working with a team to resolve web-scale production issues
• Strong experience with configuration management, monitoring, and systems tools (e.g., Salt, Ansible, Chef, Nagios, Graphite, Fluentd, Vector); Ansible is preferred
• Good understanding of MySQL and PostgreSQL databases
• Experience working with cloud-based technologies (CDN) is highly desirable
• Drive to build robust automated logging, monitoring, and alerting systems with tools such as Splunk, New Relic, and CloudWatch
• Exposure to messaging and pub/sub systems (e.g., RabbitMQ, ActiveMQ, Kinesis, Kafka)
• Experience troubleshooting critical development systems (build failures, critical web services)
• Experience with release management processes and controls
• Experience with secrets management solutions (KMS, HSMs, HashiCorp Vault)
• Experience with Linux package management tools (e.g., rpm, deb, fpm)
• Exposure to security technologies related to perimeter security, web application scanning, and firewall systems
• Working knowledge of at least one distributed systems technology (e.g., ZooKeeper, Consul)
• Familiarity with NoSQL technologies (e.g., Redis, DynamoDB)