DevOps Engineer
RAW Labs is a rapidly expanding Swiss enterprise data technology company that was spun out of École Polytechnique Fédérale de Lausanne (EPFL) by Prof. Anastasia Ailamaki and a team of highly successful engineers and scientists from, amongst others, CERN, Cisco and Salesforce.
At RAW Labs we have developed novel and highly innovative technologies to interrogate massive quantities of data in different formats that are held in a variety of data stores across the enterprise’s infrastructure and the Cloud. By leveraging these innovative technologies, RAW Labs has built a Cloud-based Data Sharing platform for creating and maintaining APIs. The RAW Labs platform enables our customers to cost-effectively exploit all forms of data and make it available as Data Products. Enterprises use RAW Labs’ platform to drive ML/AI, business intelligence and data analytics applications without first having to undertake costly ETL/ELT operations, and to securely share data in hours, not days.
RAW Labs is funded by a group of highly sophisticated and experienced technology investors and is advised by, amongst others, Prof. Martin Odersky (creator of Scala), Prof. Mike Franklin (co-creator of Spark), Dr. Alon Halevy (from Facebook’s AI team) and CIOs of large enterprises.
Our R&D team is based in two development centers: one in Lausanne, Switzerland, and the other in Athens, Greece. The successful applicant will work either in one of these offices or remotely, with access to either office as needs dictate, and will work in European time zones.
As DevOps Engineer, you will play a key role in the operation of the RAW platform. This includes, among other tasks, managing and scaling up the customer-facing and internal-facing infrastructure as well as coordinating the quality assurance processes required for a successful release.
You are a passionate engineer with demonstrable experience: detail-oriented, with great oral and written communication skills, able to multitask, and a proven team player. You know how to deliver projects on time and interact with both technical and non-technical colleagues. You want to be a major factor in the success of our customers.
Your role is part of the engineering team and is strategic to the success of our company. We are planning on rapid growth, which paves the way for great career opportunities.
Responsibilities:
- Manage and improve the production infrastructure, which includes Kubernetes clusters, multiple relational databases, HDFS, Hive, and Kafka, both on-premises and in the cloud (especially AWS).
- Analyze behavior of production systems, run benchmarks, collect results, and provide tools to help analyze results.
- Together with the engineering team, develop new benchmarks and test suites that reflect customer scenarios.
- Deploy clusters in the cloud for customers as well as for specialized testing.
- Design, implement, monitor, and maintain automated deployment to production, ensuring a stable process.
- Ensure system reliability by verifying deployments through monitoring and automated testing.
- Help to troubleshoot production issues as needed.
- Collaborate, educate and work across teams to simplify and scale the tasks involved in building and shipping software through improved tooling, automation, and communication.
- Write playbooks and rehearse scenarios to ensure we have an efficient incident response to support our uptime commitments to our customers.
- Look for automation opportunities and implement them.
- Work on emergency planning and resolution processes for customer support cases.
- Participate in customer support as needed, ready to jump into a verified emergency and organize the restoration of service.
- Participate in the maintenance of our continuous integration system (based on Jenkins and GitHub Actions).
- Participate in the maintenance of our build and test system (based on SBT with custom-made components).
- Participate in the maintenance of our software packaging (including Docker images, Python packages and others).
Requirements:
- University degree in computer science or engineering or equivalent experience.
- At least 2 years of experience in a DevOps/SRE/Ops role.
- Experience with Kubernetes.
- Experience with major Cloud Providers, esp. AWS.
- Extensive experience with one or more monitoring tools/frameworks (e.g. CloudWatch, Grafana, Prometheus, Elastic Stack, etc.).
- Expert knowledge of networking and VPCs.
- Debugging and troubleshooting skills, with an enthusiastic attitude to support and resolve customer problems.
- Basic knowledge of SQL is required to use the RAW platform.
- Experience with CI/CD tools, e.g. Jenkins, GitHub Actions, Artifactory.
- Excellent written and verbal English.
- Great oral and written communication skills.
Nice to have:
- Experience in operating big data technologies such as Hadoop, Spark, HDFS. Experience with Kafka is also a nice-to-have.
- Experience in Infrastructure-as-Code (Terraform, Packer, Ansible).
- Knowledge of Python, Scala, Java, Go, Bash.
- Experience in enterprise technologies such as Auth0, OAuth, Kerberos, Active Directory, LDAP, etc.
- Experience with Java benchmarking tools: JMH, Java Mission Control.
What we offer:
- Being at the front-line building one of the greatest enterprise technology success stories.
- Working shoulder to shoulder with the greatest academics and practitioners in the field of Big Data and Data Meshes to solve the most challenging problems that the world’s largest enterprises face when trying to explore their data troves.
- Using the most modern technologies and techniques in your day-to-day work to solve challenging real-life problems.
- Learning directly from some of the industry’s best minds.
- The opportunity to be a key member of the team we are building as we grow.
- An attractive compensation package including equity upside.
We also have other benefits that will keep you happy:
- Dedicated budget for training, professional development, and participation in conferences.
- State-of-the-art equipment.
- Great facilities when working from the office and support for remote working.
- Regular, inspiring team-building events.
- Flexibility in working hours and location.