About the client
Our client is one of the world's leading logistics providers. The company provides sea, air, road, and contract logistics, as well as integrated logistics solutions.
Challenge
The client ran a legacy set of on-premises systems for logistics management and goods tracking. Because the operating environment, vehicle sensor types, and business processes changed rapidly, a significant share of analytics required manual or batch processing. Meanwhile, modern standards for parcel and vehicle tracking call for real-time data processing. The client needed a cloud-agnostic solution that would satisfy both business needs and non-functional requirements.
Solutions
The Divectors team proposed the design of a brand-new cloud data processing solution, along with the implementation of an MVP for the client.
Key decisions the Divectors team made during system design:
- Use Amazon Web Services, as the client already had numerous AWS resources.
- Use Airflow for process orchestration. On AWS it is available as Amazon Managed Workflows for Apache Airflow (MWAA), and it can be deployed on other clouds using Kubernetes services.
- Use AWS CloudWatch and Datadog for transparent system auditing and logging.
- Introduce a new data warehouse (DWH) on Snowflake as a company-wide reference data repository. The DWH should serve as centralized storage for structured, processed information and also hold “raw” data in the staging layer.
- Implement data processing as a set of data pipelines built on top of dbt models with embedded Jinja templates.
- A centralized data repository such as the DWH is intended to support data exploration and analytics workloads and to serve as a data source for other repositories. Such repositories are rarely effective for product-facing requests, which require fast response times.
The diagram below shows that ingestion and processing can be performed using a combination of Airflow (orchestration and invocation of the Snowflake COPY command) and dbt.
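As a rough illustration of that flow, the sketch below lists the steps an orchestrator such as Airflow might run in order: a Snowflake COPY INTO load into the staging layer, followed by dbt transformation and tests. It is a minimal sketch and not the client's implementation. The table name, stage name, file format, dbt selector, and the `Step` helper are all hypothetical, and in practice each step would be wrapped in an Airflow task.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    """One unit of work the orchestrator would run as a task (hypothetical helper)."""
    name: str
    kind: str      # "snowflake_sql" or "shell"
    command: str


def copy_into_statement(table: str, stage: str, file_type: str = "JSON") -> str:
    """Build a Snowflake COPY INTO statement that loads staged files into a raw table."""
    return (
        f"COPY INTO {table} "
        f"FROM @{stage} "
        f"FILE_FORMAT = (TYPE = '{file_type}')"
    )


def build_ingestion_plan(table: str, stage: str, dbt_selector: str) -> List[Step]:
    """Return the steps in the order the orchestrator should run them."""
    return [
        # 1. Load raw files from the stage into the staging layer.
        Step("load_raw", "snowflake_sql", copy_into_statement(table, stage)),
        # 2. Build the dbt models that transform the raw data.
        Step("transform", "shell", f"dbt run --select {dbt_selector}"),
        # 3. Run dbt tests against the same models.
        Step("validate", "shell", f"dbt test --select {dbt_selector}"),
    ]


plan = build_ingestion_plan(
    table="staging.raw_shipment_events",  # hypothetical staging table
    stage="shipment_events_stage",        # hypothetical named stage
    dbt_selector="staging+",              # hypothetical dbt selector
)

for step in plan:
    print(f"{step.name}: {step.command}")
```

Keeping the plan as plain data separates the decision about what runs, and in which order, from the orchestrator that executes it. This is one way to keep such a pipeline portable across clouds.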
Results
- The implemented solution is fully based on cloud services and meets the defined SLAs, RTO, and RPO, which helps ensure business continuity and service availability.
- The solution provides transparent, detailed operational auditing and reporting for MVP platform activity.
- The solution scales and preserves data integrity as data moves from different data sources to the Snowflake analytical engine.
- Thanks to IaC deployment and solution standardization, it can be used by different divisions to meet various reporting and data analytics needs.
- The efficiency of analytics data processing increased because human intervention in data processing was minimized.