Machine-generated data analytics

Perform analytics directly on machine-generated data, without reformatting it or loading it into a database first.



Within any industrial machinery or manufacturing environment, data is being created with ever-increasing volume, detail and frequency. Machines supplied by different vendors generate highly varied data, which may change over time due to machinery upgrades, part replacements or sensor recalibrations. Understanding your manufacturing process, tracing products or predicting failures requires integrated analysis of this data. Dealing with the varied and changing data structures across hundreds or thousands of machines in your factories is a huge challenge: data pipelines are complex and fragile, and when they fail, analytic capability is halted whilst problems are fixed.


Cost and complexity increase because you rely on data operations to find issues, and on data engineering to fix those caused by changes to machine data structures and then re-process the failed records. Time-to-market for new analytical capabilities and new data products extends to weeks. Meanwhile, the business is without the crucial answers it needs for product traceability or predictive maintenance.


RAW allows heterogeneous data in any format to be queried at speed and at huge scale, without the need to load it into a database and normalize it first. RAW copes with changes to data structures through rules that process structure or content changes, either as they occur or after they have occurred. RAW also performs complex transformations in place on machine data and maintains fast caches based on user behavioural patterns, even over millions of files of machine-generated data. This frees your resources from fixing problems so they can create and enhance data products and analytical features.

USE CASE Product traceability

An industrial manufacturer has a large estate of machines generating millions of files of data in proprietary data formats, structures and descriptions. These are collected and stored daily in an Amazon S3 data lake. Users, both inside and outside the company, want to trace product provenance. This includes traceability queries on a single part number, the history of a part across machines and processes, and larger-scale analytics across all products or processes.

RAW NoDB queries machine-generated data directly in the format produced by each machine, applying mapping and transformation rules to provide consistent output. RAW NoDB exposes the results as web service APIs, so the data can be accessed by applications or served as data products through Business Intelligence applications.
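
To make the idea of mapping rules concrete, here is a minimal sketch in plain Python. It is not RAW's query language or API; the machine names, field names, sample records and the `normalize` and `trace` helpers are all hypothetical, chosen only to illustrate how per-machine rules can give consistent output for a part-number trace.

```python
import csv
import io
import json

# Hypothetical raw outputs from two machines; formats and fields are illustrative.
PRESS_CSV = (
    "part_no,ts,temp_c\n"
    "P-100,2021-03-01T08:00:00,181.5\n"
    "P-200,2021-03-01T08:05:00,179.0\n"
)
WELDER_JSON = '[{"PartNumber": "P-100", "time": "2021-03-01T09:30:00", "amps": 142}]'

# Mapping rules (assumed): source field name -> common field name, per machine.
RULES = {
    "press": {"part_no": "part_number", "ts": "timestamp"},
    "welder": {"PartNumber": "part_number", "time": "timestamp"},
}


def normalize(machine, record):
    """Apply a machine's mapping rules; unmapped fields go under 'measurements'."""
    rules = RULES[machine]
    out = {"machine": machine, "measurements": {}}
    for key, value in record.items():
        if key in rules:
            out[rules[key]] = value
        else:
            out["measurements"][key] = value
    return out


def read_all():
    """Read each source in its native format and yield normalized records."""
    for row in csv.DictReader(io.StringIO(PRESS_CSV)):
        yield normalize("press", row)
    for row in json.loads(WELDER_JSON):
        yield normalize("welder", row)


def trace(part_number):
    """Return one part's history across machines, ordered by ISO timestamp."""
    events = [e for e in read_all() if e["part_number"] == part_number]
    return sorted(events, key=lambda e: e["timestamp"])


history = trace("P-100")
for event in history:
    print(event["timestamp"], event["machine"], event["measurements"])
```

If a machine upgrade renames a field, only that machine's entry in `RULES` needs to change in this sketch; the trace logic stays the same.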

USE CASE Predictive maintenance

A manufacturer collects data from machine sensors, maintenance history and product performance/failure events. This data exists in several file formats and structures and is collected in an Amazon S3 data lake. Financial costs from a relational data warehouse and product reference data from SAP ERP are both needed to produce input data sets for training a predictive maintenance machine learning model.

RAW NoDB enables a data scientist to deliver the data needed for the machine learning model directly from many data sources and formats using a single SQL-like query language. RAW NoDB scales in the cloud via our Query-as-a-Service platform, leaving you to just ask the questions and not worry about scaling the infrastructure. RAW integrates directly with Python code, so the data processing and the ML model can share the same familiar development environment, e.g. a Jupyter notebook.
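
As a rough illustration of the kind of input data set described above, the sketch below combines three sources using only the Python standard library. It does not use RAW or its query language. The sample data, table and column names are hypothetical, and an in-memory SQLite database stands in for the relational data warehouse.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data; field names are illustrative only.
SENSOR_CSV = "machine_id,vibration_mm_s\nM1,2.0\nM1,4.0\nM2,1.0\n"
FAILURES_JSON = '[{"machine_id": "M1", "failed": true}]'


def load_costs():
    """Read repair costs from an in-memory SQLite stand-in for the warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE repair_cost (machine_id TEXT, cost REAL)")
    conn.executemany(
        "INSERT INTO repair_cost VALUES (?, ?)",
        [("M1", 1200.0), ("M2", 300.0)],
    )
    costs = dict(conn.execute("SELECT machine_id, cost FROM repair_cost"))
    conn.close()
    return costs


def build_training_rows():
    """Join sensor readings, failure events and costs into one row per machine."""
    readings = {}
    for row in csv.DictReader(io.StringIO(SENSOR_CSV)):
        readings.setdefault(row["machine_id"], []).append(
            float(row["vibration_mm_s"])
        )
    failed = {f["machine_id"] for f in json.loads(FAILURES_JSON) if f["failed"]}
    costs = load_costs()
    return [
        {
            "machine_id": machine,
            "mean_vibration": sum(values) / len(values),
            "repair_cost": costs.get(machine),
            "failed": machine in failed,
        }
        for machine, values in sorted(readings.items())
    ]


rows = build_training_rows()
for row in rows:
    print(row)
```

The resulting list of dictionaries could be passed to whichever ML library is used in the same notebook; the hand-written joins here are the part a single query language is meant to replace.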


With RAW NoDB and our Query-as-a-Service, you can leverage your machine-generated data to produce better data products. Here’s why:

  • Data polyglot: reads and writes any kind of file format or data structure
  • Single query language, providing one interface to many sources
  • Output data views and APIs for both humans and applications to consume
  • “NoDB” approach removes unnecessary data copies: query directly at source and save on ETL and data wrangling
  • Python integration for data scientists to augment and integrate with existing code
  • Smart caching, provided by our Query-as-a-Service, leaving you to concentrate on the business problem

Take a tour of our documentation and tutorials today!