Technical Internships
About Us
RAW Labs is a rapidly expanding Swiss enterprise data technology company that was spun out of École Polytechnique Fédérale de Lausanne (EPFL) by Prof. Anastasia Ailamaki and a team of highly successful engineers and scientists from, among others, CERN, Cisco and Salesforce.
At RAW Labs we have developed novel technologies to interrogate massive quantities of data in different formats, held in a variety of data stores across an enterprise infrastructure and in the cloud. Leveraging this core technology, RAW Labs has built a cloud-based data sharing platform for creating and maintaining APIs. The RAW platform enables our customers to exploit all forms of data to create curated Data Products via our DataOps infrastructure, and to securely share data in hours, not days. Enterprises use RAW Labs’ platform to drive ML/AI, business intelligence and data analytics applications without having to build and maintain complex data engineering infrastructures.
RAW Labs is funded by a group of highly sophisticated and experienced technology investors and is advised by technology luminaries including Prof. Martin Odersky (creator of Scala), Prof. Mike Franklin (co-creator of Spark) and Dr. Alon Halevy (from Facebook’s AI team).
Where you’ll be
Our R&D team is based in two development centers: one in Lausanne, Switzerland, and the other in Athens, Greece. Successful applicants will work in or near either office, with remote working available by agreement. Note that access to one of the offices will be required for face-to-face meetings with your project supervisor.
About the Role
We are seeking a number of highly talented, innovation-driven Interns to help with our research and further differentiate the RAW platform.
Each Intern will be assigned one of our projects (see below), based on preference, availability, suitability and business priorities.
As an Intern you will be assigned a technical supervisor, who will typically be a senior engineer from our staff. Whilst you are working with us, you will be fully integrated into the core engineering team and hence will get to see how a commercial technology company develops its product. And, of course, if you excel during the Internship period, there will be opportunities to join us permanently as we grow.
About You
First and foremost, we are looking for candidates with a passion for new technology, an inquisitive mind, a self-starting approach, and a can-do attitude. You may be working on problems that have not been solved yet, or have only been partially solved. We are looking for:
- A university degree in computer science or engineering; any postgraduate experience is a bonus.
- Commercial experience is a benefit, not a prerequisite.
- Evidence of an innovative project you have undertaken in the software and/or data space.
- Great oral and written communication skills, preferably in English, but we have French, Greek and Portuguese speakers too.
For technical skills, we can tailor the project to the successful candidate’s experience. Here are some of the technologies we use currently:
- Scala and/or Java and/or Kotlin, and SQL.
- Development of distributed / big data systems, especially Spark
- Development of Visual Studio Code Extensions
- Cloud service provider stacks, e.g., AWS
- Benchmarking and profiling tools, e.g., JMH, Apache JMeter
- Container technologies such as Kubernetes and/or Docker
- CI/CD tools, e.g., Jenkins, Artifactory, and DevOps tooling, e.g., Terraform, Docker Compose, Ansible
- Security frameworks/libraries/providers, e.g., Auth0
Our Projects
Please indicate in your application which of the projects interest you. You may select more than one:
1. Data Lineage
For Whom: Students interested in data management, automated data governance, traceability, metadata interoperability, standards and data catalogs. Duration: Flexible, ranging from 2 to 4 months
RAW’s query engine allows users to build complex analytics that integrate multiple data sources. These data analytics libraries can be built and shared with other users.
The goal of this project is to develop tools to determine and expose the lineage of data transformations in RAW, in the form of a catalog and REST APIs that can be easily consumed by UIs as well as users. The APIs to expose data lineage will be an extension of the core metadata APIs being developed at RAW.
This project requires knowledge of Scala/Java, as well as SQL.
2. GraphQL API interface
For Whom: Students with an interest in API development and in GraphQL vs. REST comparisons; experience with or knowledge of Apollo GraphQL, Node.js, etc. is useful. Duration: Flexible, ranging from 2 to 4 months
Currently we support the generation of REST APIs; however, we are interested in generating GraphQL interfaces, as these often work well for analytical workloads and user-defined questions.
The goal of the project will be to prototype and evaluate a GraphQL interface for the RAW platform. In addition to the development of the back-end, some consideration will be required for the UI and UX part, and metadata/API catalog.
3. Short-term Development Projects
For Whom: Students looking for short, well-defined implementation projects using Scala/Java, and who have no more than 2 months available. Duration: 2 months
We have a number of short-term development projects; these are well-defined and scoped projects that give you the opportunity to contribute to an advanced codebase and gain practical experience. These projects include, but are not limited to, the following:
- Support OData: The Open Data Protocol (OData) is used for consuming queryable REST APIs. The goal of this project is to implement the Open Data Protocol to expose RAW data.
- Support OpenAPI: The OpenAPI specification is used to describe, produce and consume web services. RAW also allows users to easily create “data-based Web Services”. The goal of this project is to implement the latest OpenAPI standards in RAW, so that our APIs conform to the standards.
- JSON Schema: JSON Schema provides a way to describe the structure of complex data. The goal of this project is to implement support for JSON Schema in RAW.
- Add API invocation commands: RAW creates APIs, but they are easier to use if the user can copy and paste a ready-made invocation for Python, Java, Excel/Power BI, Postman, or some other technology directly from our app. The goal of this project is to add such invocation commands to our RAW user catalog application.
- Development of RAW SQL demos: At RAW Labs, we are constantly developing new “data products” using the RAW platform. The goal of this project is to help develop a variety of demos. Unlike the other projects, this one primarily involves the use of the RAW SQL language.