Please note as of November 2022 we have upgraded our language – see our documentation and quick tour for details. These examples below are using previous version of RQL, and will be refactored shortly.
In this tutorial we are going to learn the basics of RAW. If you are familiar with SQL it should be very straightforward.
Types in RAW
There are two families of data types in RAW: nested types and primitive types.
The nested types are collection
and record
. They are called nested types because they take inner types and can nest arbitrarily.
The primitive types are strings, numbers and temporal types.
The full list of types is available in the documentation.
For instance: collection(record(a: int, b: collection(string)))
is a type that represents a collection of records. Each record has two fields: field a
is an int (aka an integer number) and field b
is a collection of strings.
First Query
One of the key features of RAW SQL is the ability to query structuers other than relational database tables. It can, for instance read several text formats like CSV, JSON, XML, AVRO or Parquet among others.
For instance, let’s write a first query on a CSV file.
The READ function in RAW will infer the format and structure of the data and we can assign this to the airports
collection for later use. airports
is a collection of records, just like a relational table is. To project all columns in the file, and return the first 5 rows we can code the following function:
airports := READ( "https://raw-tutorial.s3.amazonaws.com/airports.csv");
airports5() := {
SELECT * FROM airports LIMIT 5
}
This could also be written:
airports5() := {
SELECT *
FROM
READ( "https://raw-tutorial.s3.amazonaws.com/airports.csv") LIMIT 5
}
It is often convenient to use an object to hold a collection for instance. This can make the code more readable and easier to maintain, however the underlying query is optimised identically.
Code is encapsulated in functions, this is handy if we want to reuse them locally or make REST API calls using them. By convention, our code files’ extensions are .rql and there are Visual Studio Code extensions available to manage and run them. For this tutorial, all functions are implemented in the tutorial.rql
file.
First API
To create and subsequently execute an API endpoint we use basic YAML declarations like the following:
raw: 0.9
endpoint: GET
code: rql
codeFile: tutorial.rql
declaration: airports5
metadata:
title: RAW SQL Tutorial
description: Tutorial
tags:
- tutorial
format: json
computeClass: normal
enabled: true
security:
public: true
Executing the above endpoint:
https://api.raw-labs.com/tutorial/five-airports
will produce a collection of records like the following:
collection(
record(
AirportID: int,
Name: string,
City: string,
Country: string,
IATA_FAA: string,
ICAO: string,
Latitude: double,
Longitude: double,
Altitude: int,
Timezone: double,
DST: string,
TZ: string
)
)
[
{
"AirportID": 1,
"Name": "Goroka",
"City": "Goroka",
"Country": "Papua New Guinea",
"IATA_FAA": "GKA",
"ICAO": "AYGA",
"Latitude": -6.081689,
"Longitude": 145.391881,
"Altitude": 5282,
"Timezone": 10,
"DST": "U",
"TZ": "Pacific/Port_Moresby"
},
{
"AirportID": 2,
"Name": "Madang",
"City": "Madang",
"Country": "Papua New Guinea",
"IATA_FAA": "MAG",
...
We can execute more complex queries using regular SQL language features. For example:
airports := READ("https://raw-tutorial.s3.amazonaws.com/airports.csv");
airports_per_country() := {
SELECT Country, COUNT(*) AS Number_Airports
FROM airports
GROUP BY Country
ORDER BY Number_Airports DESC
}
Which will produce something like this:
collection(
record(
Country: string,
Number_Airports: long
)
)
[
{
"Country": "United States",
"Number_Airports": 1697
},
{
"Country": "Canada",
"Number_Airports": 435
},
{
"Country": "Germany",
"Number_Airports": 321
},
{
"Country": "Australia",
"Number_Airports": 263
},
{
"Country": "Russia",
"Number_Airports": 249
},
...
Here we see countries with the number of airports per country.
With the exception of the READ
keyword, this query looks like normal SQL. The READ
keyword, however, means users do not have to create tables and load data into them, or discover the schema. This is done automatically by RAW.
Passing Parameters
Now let’s add a parameter to the query and compute the number of airports for a given country.
airports := READ("https://raw-tutorial.s3.amazonaws.com/airports.csv");
airports_for_country(country: string) := {
SELECT Country, COUNT(*) AS Number_Airports
FROM airports AS a
WHERE a.Country = country
GROUP by a.Country
}
You can note that the function receives a parameter which is of type string
and named country
, this is super handy to design REST APIs. This parameter will automatically become a parameter in the URL endpoint, for example:
https://api.raw-labs.com/tutorial/airports-for-country?country=Germany
The other notable element is the “airports AS a” clause which is needed here, otherwise, trying to write the following, would not work.
// THIS DOESN'T WORK!
airports_for_country(country: string) := {
SELECT Country, COUNT(*) AS Number_Airports
FROM airports
WHERE airports.Country = country
GROUP by airports.Country
}
Here airports
in the WHERE clause would refer to the airports
variable (which is a collection) and airports.Country
would not mean a thing comparable to the country
string parameter.
The source code for this tutorial can be found here on GitHub, and more Tutorial links can be found below.

Georges Lagardère, VP Customer Experience, RAW Labs.
Get Started for Free
More Useful Links:
- Have Questions / Comments? Ask away – join our Community discussion!
- Read our Tutorials and Getting Started Docs
- Why not follow us on LinkedIn, Twitter, YouTube
- Like code? head on over to GitHub
- Developer? Join us! we are looking for bright minds – at all levels of seniority, in databases, distributed systems, UI/UX.