Delhivery, a highly efficient, low-cost global logistics platform, processes about 1 TB of data per day from shipment tracking, GPS, biometrics, handheld device clients and other sources, and uses real-time, data-driven insights to make business decisions.
As data volumes surged, critical questions arose: "Where does my data reside?", "How do I define my rapidly changing datasets?", "What other applications are associated with the datasets?", "How is my data classified?" and, finally, "Who has access to my data?". These questions pointed towards data democratisation at scale as the way to get real-time business insights and improve decision-making.
A data catalog-based approach is useful here because it helps maintain the necessary business metadata. Combined with data discovery and data lineage tracking, it gives businesses quality data insights through visualisation tools.
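As a rough illustration of the kind of business metadata a catalog maintains, the sketch below models one catalog entry whose fields map onto the five questions above (location, definition, associated applications, classification, access). It is a minimal, hypothetical example: the class names, field names and sample values are assumptions made for illustration, and it does not use or reflect the Apache Atlas or Amundsen APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class CatalogEntry:
    """Hypothetical catalog record; every field name here is illustrative."""
    name: str
    location: str                      # "Where does my data reside?"
    schema: Dict[str, str]             # "How do I define my datasets?"
    consumers: List[str] = field(default_factory=list)  # associated applications
    classification: str = "internal"   # "How is my data classified?"
    readers: Set[str] = field(default_factory=set)      # "Who has access?"

    def can_read(self, principal: str) -> bool:
        # Access check against the readers recorded in the catalog.
        return principal in self.readers


class Catalog:
    """In-memory stand-in for a catalog service, for illustration only."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def get(self, name: str) -> CatalogEntry:
        return self._entries[name]

    def search(self, term: str) -> List[str]:
        # Simple discovery: match the term against dataset and column names.
        term = term.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if term in e.name.lower()
            or any(term in column.lower() for column in e.schema)
        )


catalog = Catalog()
catalog.register(
    CatalogEntry(
        name="gps_pings",                           # made-up dataset name
        location="s3://example-bucket/gps_pings/",  # made-up location
        schema={"vehicle_id": "string", "lat": "double", "lon": "double"},
        consumers=["route_dashboard"],
        classification="restricted",
        readers={"analytics_team"},
    )
)
```

With this in place, `catalog.search("vehicle")` finds the dataset by column name, and `catalog.get("gps_pings").can_read("analytics_team")` answers the access question from the same record.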
To support data democratisation and data catalog approaches, organisations have been adopting 'data mesh' as an architectural pattern. Delivering such an architecture involves several tooling and other considerations.
The objective of this discussion is to describe how data mesh was implemented at Delhivery, using a combination of AWS analytics services and open-source tools (Apache Atlas and Amundsen).