The context, the need 📍
Processing over 10 million parcels daily, Geopost faces a massive volume of data that requires an advanced data engineering approach to exploit in real time. To handle this workload, the company needed artificial intelligence solutions to support tasks such as shipper and recipient identity verification, parcel compliance, carbon footprint calculation, and logistics network optimization.
However, managing this flow of complex data and transforming it into usable information requires a solid, high-performance infrastructure. This is where the digital transformation of Geopost's data architecture comes in, with the aim of developing agile AI solutions that respond to operational requirements in real time.
The approach, the solution 🛠
To meet these needs, we implemented a comprehensive Data Engineering approach, providing Geopost with solutions tailored to their data infrastructure and operational challenges. At the heart of this approach is a scalable Datalakehouse based on S3 and Dremio, designed to handle continuous data flows and power AI applications.
Our support focused on two main areas:
- Consulting and auditing of data infrastructure: we analyzed and benchmarked existing solutions before recommending and deploying a Datalakehouse architecture capable of handling large volumes of data efficiently.
- Developing data pipelines: we developed Kafka pipelines to process these massive data streams in real time, enabling critical AI services such as carbon footprint calculation and delivery time forecasting.
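To make the pipeline idea concrete, here is a minimal sketch of the kind of per-message processing a Kafka consumer in such a pipeline might perform. The event schema (`parcel_id`, `weight_kg`, `distance_km`), the emission factor, and the topic name are illustrative assumptions, not Geopost's actual implementation.

```python
import json

# Illustrative constant: kg of CO2 per kg of parcel per km transported.
# A real system would use mode-specific, audited emission factors.
EMISSION_FACTOR_KG_CO2_PER_KG_KM = 0.0001

def process_parcel_event(raw_message: bytes) -> dict:
    """Parse one parcel event and enrich it with a carbon estimate."""
    event = json.loads(raw_message)
    footprint = (event["weight_kg"] * event["distance_km"]
                 * EMISSION_FACTOR_KG_CO2_PER_KG_KM)
    return {"parcel_id": event["parcel_id"],
            "carbon_kg_co2": round(footprint, 4)}

# In production this function would run inside a Kafka consumer loop,
# e.g. with confluent-kafka (hypothetical topic name):
#   consumer.subscribe(["parcel-events"])
#   while True:
#       msg = consumer.poll(1.0)
#       if msg is not None and msg.error() is None:
#           result = process_parcel_event(msg.value())

# Standalone usage example with a synthetic event:
msg = json.dumps({"parcel_id": "P123", "weight_kg": 2.0,
                  "distance_km": 350}).encode()
print(process_parcel_event(msg))
```

Keeping the processing logic in a pure function like this makes it easy to unit-test independently of the Kafka wiring.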
Main activities performed ✅
The realization of this solution relied on advanced technologies, as well as AI models tailored to Geopost's needs.
Key activities included:
- Optimization of internal processes: automation of manual tasks (e.g. document and photo parsing) and implementation of a data search and synthesis engine.
- Forecasting and analysis of logistics flows: machine learning algorithms for forecasting delivery times and managing optimal routes.
- Real-time calculation of compliance and carbon footprint: processing package information to assess regulatory compliance and environmental impact in real time.
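The compliance check described above can be sketched as a set of rules applied to each parcel as it flows through the pipeline. The rules and limits below are hypothetical examples for illustration, not Geopost's actual regulatory criteria.

```python
# Assumed limits for illustration only.
MAX_WEIGHT_KG = 31.5   # hypothetical parcel weight limit
MAX_GIRTH_CM = 300     # hypothetical length + girth limit

def check_compliance(parcel: dict) -> list:
    """Return the list of rule violations for one parcel (empty = compliant)."""
    violations = []
    if parcel["weight_kg"] > MAX_WEIGHT_KG:
        violations.append("overweight")
    length, width, height = parcel["dimensions_cm"]
    # Girth rule: length plus twice the sum of width and height.
    if length + 2 * (width + height) > MAX_GIRTH_CM:
        violations.append("oversized")
    if not parcel.get("customs_declared", True):
        violations.append("missing_customs_declaration")
    return violations

# Usage example with a synthetic non-compliant parcel:
print(check_compliance({"weight_kg": 40,
                        "dimensions_cm": (120, 60, 60),
                        "customs_declared": True}))
```

Because each rule is independent, new regulatory checks can be added without touching the pipeline itself.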
The technical stack, the models used 🤖
The platform combines S3 object storage and Dremio for the Datalakehouse, Kafka for real-time data pipelines, and machine learning models for delivery time forecasting and route optimization.
The results, the benefits obtained ✨
Thanks to this new infrastructure and the AI solutions implemented, Geopost has been able to industrialize its artificial intelligence and data engineering capabilities. The main benefits achieved include:
🥰 Improving customer satisfaction
Thanks to faster, more accurate services, including delivery time prediction and parcel compliance management.
📉 Optimizing logistics processes
Reduced operating costs and lead times thanks to better route planning and automated flow management.
🌱 Reducing environmental impact
Real-time carbon footprint calculations enable Geopost to better manage its sustainable development initiatives.
🎯 Scalability and flexibility
Datalakehouse architecture can handle growing volumes of data, while remaining flexible to integrate new AI services in the future.
Finally, with the support of a team of dedicated experts (8 FTEs - 2 Tech Leads, 4 Data Scientists, 4 Data Engineers), Geopost has succeeded in modernizing its data infrastructure and considerably improving the efficiency of its services, positioning the company as a leader in the use of AI for logistics management.
Conclusion
The implementation of a modern data infrastructure and the creation of Kafka pipelines for real-time data processing have enabled Geopost to take a decisive step towards optimizing its operations. The results match the stakes: reduced costs, improved customer satisfaction, and an infrastructure ready to support future growth.
Useful links 🔗
🦹 Our custom Data Engineering projects
🍾 Success story: Geopost - How GEOPOST manages 30M conversations a year in 18 languages with ILLUIN Dialogue