Program
Times are shown in CDT (GMT-5)
Monday, July 18, 2022
10:00
10:20 - 10:40
The adoption, current state and future of Apache Beam at Twitter
by Lohit Vijayarenu
10:40 - 11:00
Google's investment on Beam, and internal use of Beam at Google
by Kerry Donny-Clark
11:00 - 11:20
Tayloring pipelines at Spotify
by Rickard Zwahlen
12:00
12:00 - 12:50
Vega: Scaling MLOps Pipelines at Credit Karma using Apache Beam and Dataflow
by Debasish Das & Vishnu Venkataraman
12:00 - 12:50
Houston, we've got a problem: 6 principles for pipelines design taken from the Apollo missions
by Israel Herraiz & Paul Balm
12:00 - 12:25
RunInference: Machine Learning Inferences in Beam
by Andy Ye
12:30 - 12:55
Speeding up development with Apache Beam (Adobe Experience Platform)
by Constantin Scacun
13:00
Lunch
14:00
14:00 - 14:50
Powering Real-time Data at Intuit: A Look at Golden Signals powered by Beam
by Omkar Deshpande, Dunja Panic, Nick Hwang & Nagaraja Tantry
14:00 - 14:50
How the sausage gets made: Dataflow under the covers
by Pablo Estrada
14:00 - 14:25
State of the Go SDK 2022
by Robert Burke
14:30 - 14:55
Spanner Change Streams to BigQuery replication using Dataflow
by Haikuo Liu
15:00
15:00 - 15:50
Protecting the Internet at Scale
by Alfredo Gimenez
15:00 - 15:50
Improving Beam-Dataflow Pipelines for Text Data Processing
by Sayak Paul & Nilabhra Roy Chowdhury
15:00 - 15:25
Introduction to the benchmarks in Apache Beam
by Alexey Romanenko
15:30 - 15:55
From script slums to beam skyscrapers
by Shailesh Mangal
16:00
Coffee break
16:15
16:15 - 16:40
Data Integration on cloud made easy using Apache Beam
by Parag Ghosh
16:45 - 17:10
How to break Wordle with Beam and BigQuery
by Inigo San Jose
16:15 - 16:40
Detecting Change-Points in Real-Time with Apache Beam
by Devon Peticolas
16:45 - 17:10
Strategies for caching data in Dataflow using Beam SDK
by Zeeshan
16:15 - 17:05
Collibra’s Telemetry Backbone - OpenTelemetry and Apache Beam
by Alex Van Boxel
17:15
17:15 - 18:05
New Avro serialization and deserialization in Beam SQL
by Talat Uyarer
17:15 - 18:05
Implementing Cloud Agnostic Machine Learning Workflows with Apache Beam on Kubernetes
by Charles Adetiloye & Alexander Lerma
17:15 - 18:05
Challenges of capturing change streams with Beam
by Nancy Xu
18:05
Reception 18:05 - 20:00 hrs
Tuesday, July 19, 2022
10:00
10:20 - 10:40
Beam's last year, and what to expect for next!
by Kenn Knowles
10:40 - 11:00
Palo Alto Networks' massive-scale deployment of Beam
by Talat Uyarer
12:00
12:00 - 12:50
Optimizing a Dataflow pipeline for cost efficiency: lessons learned at Orange
by Jérémie Gomez & Thomas Sauvagnat
12:00 - 12:25
Oops, I wrote a Portable Beam Runner in Go
by Robert Burke
12:30 - 12:55
Combine by Example - OpenTelemetry Exponential Histogram
by Alex Van Boxel
12:00 - 12:50
Visually build Beam pipelines using Apache Hop
by Matt Casters
13:00
Lunch
14:00
14:00 - 14:25
Unified stream and batch pipelines at LinkedIn using Beam
by Shangjin Zhang & Yuhong Cheng
14:30 - 14:55
How to benchmark your Beam pipelines for cost optimization and capacity planning
by Roy Arsan
14:00 - 14:50
Beam in production
by Ragy Abraham
14:00 - 14:50
Log ingestion and data replication at Twitter
by Praveen Killamsetti & Zhenzhao Wang
15:00
15:00 - 15:50
Streaming NLP infrastructure on Dataflow
by Alex Chan & Angus Neilson
15:00 - 15:50
Relational Beam: Process columns, not rows!
by Andrew Pilloud & Brian Hulette
15:00 - 15:50
Migration Spark to Apache Beam/Dataflow and hexagonal architecture + DDD
by Mazlum Tosun
16:00
Coffee break
16:15
16:15 - 17:05
Developing PulsarIO Connector
by Marco Robles
16:15 - 16:40
Writing a native Go streaming pipeline
by Danny McCormick & Jack McCluskey
16:45 - 17:10
Beam as a High-Performance Compute Grid
by Peter Coyle & Raj Subramani
16:15 - 16:40
Playing the Long Game - Transforming Ricardo's Data Infrastructure with Apache Beam
by Tobias Kaymak
16:45 - 17:10
Use of shared handles for Cache reuse across DoFn’s in Python
by Amruta Deshmukh
17:15
17:15 - 18:05
Online clustering and semantic enrichment of textual data with Apache Beam
by Alexandru Balan
17:15 - 18:05
Scaling up pandas with the Beam DataFrame API
by Brian Hulette
17:15 - 17:40
Apache Beam backend for open source Scalding
by Navin Viswanath
Wednesday, July 20, 2022
09:00
09:00 - 12:00
Scio track
09:00 - 12:00
Apache Beam on Amazon Kinesis Data Analytics (KDA)
by Amar Surjit & Subham Rakshit
09:00 - 09:25
Real time liveness status of industrial sensors with Apache Beam on Dataflow runner and Yugabyte
by Kamaljeet Singh
09:30 - 09:55
Supporting ACID transactions in a NoSQL database with Apache Beam
by Jan Lukavský
10:00 - 10:50
Error handling with Apache Beam and Asgarde library
by Mazlum Tosun
11:00 - 11:25
Effective detecting and preventing abuse on LinkedIn with Beam streaming processing
by Rui Han
11:30 - 11:40
Beam Playground: discover, learn and prototype with Apache Beam
by Daria Malkova
11:40 - 11:50
GCP Beam Common Customer Issues
by Svetak Sundhar
11:50 - 12:00
The Ray Beam Runner Project: A Vision for Unified Batch, Streaming, and ML
by Patrick Ames, Jiajun Yao & Chandan Prasad
12:30
12:30 - 15:00
Splittable DoFns in Python: a hands-on workshop
by Israel Herraiz & Miren Esnaola
12:30 - 15:00
Beam Cross Language Transforms in Python, with Google Cloud Dataflow
by Wei Hsia & Sergei Lilichenko