Product Traceability: a Manufacturing Use Case

Background
A major European manufacturing corporation challenged us to help them provide a product traceability API on top of their machine data. They needed an API that user interfaces could connect to, so that users could search by serial number and see the series of events, machines and processes that a particular part encountered.
The company has many factories and hundreds of machines from different vendors. Each machine performs a process on components and records details including date/time, location, machine, serial numbers, and other metrics collected from the machine.
Why was this a problem?
- There are millions of machine-generated files sitting in a data lake on AWS in S3 buckets, with each machine emitting files and pushing them to cloud storage.
- Each machine uses a different file format (multiple variants of JSON and XML). Some machines are many years old, others are brand new, and firmware upgrades can change the format of the data.
- Their existing solution was ETL-based. Because of the fluidity of the file formats and data quality issues, the data pipelines broke frequently. Each failure had to be investigated and remediated, and processing then had to be restarted, including the failed backlog.
- They needed to perform integrated analytics alongside other enterprise data (for example an ERP system, or reference values used as lookup data), and they were unable to report on historical data.
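To make the format problem concrete, here is a minimal Python sketch of reading one machine event from either JSON or XML into a common shape. It is an illustration only, not how the RAW platform works internally. The field names and their aliases (`serialNumber`, `SerialNo` and so on) are hypothetical, and the sketch assumes each payload holds a single flat record.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical field-name variants; real machine files will differ.
FIELD_ALIASES = {
    "serial_number": ("serial_number", "serialNumber", "SerialNo"),
    "machine": ("machine", "machineId", "Machine"),
    "timestamp": ("timestamp", "dateTime", "Timestamp"),
}


def _pick(raw, canonical):
    """Return the first non-empty alias value found in raw, else None."""
    for alias in FIELD_ALIASES[canonical]:
        if alias in raw and raw[alias] not in (None, ""):
            return raw[alias]
    return None


def parse_record(text):
    """Parse one JSON or XML payload into a dict with canonical keys."""
    stripped = text.lstrip()
    if stripped.startswith("<"):
        # XML: treat each direct child element as a field.
        root = ET.fromstring(stripped)
        raw = {child.tag: (child.text or "").strip() for child in root}
    else:
        # Anything else is assumed to be a flat JSON object.
        raw = json.loads(stripped)
    return {name: _pick(raw, name) for name in FIELD_ALIASES}
```

A field that cannot be found comes back as `None` instead of raising an error, so an unexpected format change shows up as a data quality exception to review and does not stop processing.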

Machine-generated data accessibility using the RAW Data Product Platform
We implemented a solution that addresses the four problems above, along with extended functionality covering the following:
- Accessing the data through APIs, Business Intelligence tools, Excel and other UIs
- Searching by serial number
- Returning the latest state of any part
- Returning historical information and changes to that part as it progressed through the manufacturing process
- Handling changes to data structures gracefully over time
- Finding and dealing with data quality exceptions without involving IT
- Integrating with other data sources (ERP, database) using SQL syntax for wider analytics
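As a rough illustration of the serial number search, latest state and history items in the list above, the Python sketch below works over an in-memory list of events. It does not use the RAW platform or its API. The function names and the event keys (`serial_number`, `timestamp`) are assumptions made for this example, and timestamps are assumed to be ISO 8601 strings.

```python
from collections import defaultdict
from datetime import datetime


def build_index(events):
    """Group events by serial number, each group sorted oldest first."""
    index = defaultdict(list)
    for event in events:
        index[event["serial_number"]].append(event)
    for history in index.values():
        history.sort(key=lambda e: datetime.fromisoformat(e["timestamp"]))
    return index


def history_of(index, serial_number):
    """Return every recorded event for a part, in time order."""
    return list(index.get(serial_number, []))


def latest_state(index, serial_number):
    """Return the most recent event for a part, or None if it is unknown."""
    history = index.get(serial_number)
    return history[-1] if history else None
```

Calling `build_index` once and then `latest_state(index, "SN-1")` or `history_of(index, "SN-1")` answers the two lookups without rescanning every event. A production service would run the same kind of lookup against the data lake, not against a Python list.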
Summary
Our RAW Data Product Platform is a great choice for building APIs over complex, heterogeneous, machine-generated data at scale. Its features make these types of problems faster and simpler to solve. For more details, you can download our solution brief or the white paper on our platform architecture, and see the links below.

Jeremy Posner, VP Product & Solutions, RAW Labs.
Want to learn more?
- Give RAW a try: Get Started for free!
- Follow us on LinkedIn or Twitter, or join the conversation on Reddit
- Read our Tutorials and Getting Started docs
- Like code? Head over to GitHub and look at our demo APIs
- Developer? Join us! We are looking for bright minds at all levels of seniority, in databases, distributed systems and UI/UX.