Streaming Analytics on 50M Events Per Day with Confluent Cloud at Picnic

What are useful practices for migrating a system to Apache Kafka® and Confluent Cloud, and why use Confluent to modernize your architecture?

Dima Kalashnikov (Technical Lead, Picnic Technologies) is part of a small analytics platform team at Picnic, an online-only, European grocery store that processes around 45 million customer events and five million internal events daily. An underlying goal at Picnic is to make decisions as data-driven as possible, so Dima's team collects events on all aspects of the company—from new stock arriving at the warehouse, to customer behavior on their websites, to statistics related to delivery trucks. Data is sent to internal systems and to a data warehouse.

Picnic recently migrated from their existing solution to Confluent Cloud for several reasons:

Ecosystem and community: Picnic liked the tooling present in the Kafka ecosystem. Being a small team means they can't devote extra time to building boilerplate-type code such as connectors for their data sources or extensive monitoring functionality. Picnic also has analysts who use SQL, so the team appreciated the processing capabilities of ksqlDB. Finally, they found that help isn't hard to locate if one gets stuck.

Monitoring: They wanted better monitoring. Specifically, they found it challenging to measure SLAs with their former system because they couldn't easily detect the positions of consumers in their streams (see the short sketch after the episode links below).

Scaling and data retention times: Picnic is growing, so they needed to scale horizontally without having to worry about manual reassignment. They also hit a wall with their previous streaming solution with respect to how long they could retain data, which is a serious issue for a company that makes data-first decisions.

Cloud: Another consequence of being a small team is that they don't have the resources for extensive maintenance of their tooling.

Dima's team was extremely careful and took their time with the migration. They ran a pilot system alongside the old system to make sure it could achieve their fundamental performance goals: complete stability, zero data loss, and no performance degradation. They also wanted to validate its costs.

The pilot was successful, and a second, IoT pilot is now in the works that uses Confluent Cloud and Debezium to track the robotics data coming from their automated fulfillment center. And it's a lot of data: Dima mentions that the robots in the center generate data sets as large as their customer event streams.

EPISODE LINKS
Picnic Analytics Platform: Migration from AWS Kinesis to Confluent Cloud
Picnic Modernizes Data Architecture with Confluent
Data Engineer: Event Streaming Platform
Watch this podcast in video
Kris Jenkins’ Twitter
Streaming Audio Playlist
Join the Confluent Community
Learn more with Kafka resources on Confluent Developer
Live demo: Event-Driven Microservices with Confluent
Use PODCAST100 to get $100 of free Confluent Cloud usage
Building Data Streaming App | Coding In Motion
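The monitoring pain point mentioned above was not being able to see where consumers are in their streams. As a rough illustration of how that visibility looks on Kafka, here is a minimal Java sketch that uses the Kafka AdminClient to compute per-partition consumer lag (end offset minus committed offset). The bootstrap server and consumer group name are placeholders rather than Picnic's actual configuration, and a real Confluent Cloud connection would also need SASL/SSL credentials.

```java
import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ConsumerLagCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder connection settings; a real Confluent Cloud cluster would
        // also need SASL_SSL security properties and API-key credentials.
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Committed offsets, per partition, for a hypothetical consumer group.
            Map<TopicPartition, OffsetAndMetadata> committed =
                    admin.listConsumerGroupOffsets("analytics-pipeline")
                         .partitionsToOffsetAndMetadata()
                         .get();

            // Latest (end) offsets for the same partitions.
            Map<TopicPartition, OffsetSpec> latest = committed.keySet().stream()
                    .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest()));
            Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> endOffsets =
                    admin.listOffsets(latest).all().get();

            // Lag = end offset minus committed offset; this is the "position of
            // consumers in their streams" that the team wanted to track for SLAs.
            committed.forEach((tp, meta) -> {
                long lag = endOffsets.get(tp).offset() - meta.offset();
                System.out.printf("%s lag=%d%n", tp, lag);
            });
        }
    }
}
```

In practice a check like this would run on a schedule and export the lag values to a metrics system so SLA alerts can fire when consumers fall behind.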

About the Podcast

Streaming Audio features all things Apache Kafka®, Confluent, real-time data, and the cloud. We cover frequently asked questions, best practices, and use cases from the Kafka community—from Kafka connectors and distributed systems, to data mesh, data integration, and modern data architectures built with Confluent and cloud Kafka as a service. Join our hosts as they stream through a series of interviews, stories, and use cases with guests from the data streaming industry. Apache®, Apache Kafka, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.