Program

Times are shown in CDT (GMT-5)

Monday, July 18, 2022

09:50

09:50 - 10:00.

Welcome

by Beam Summit Team

Room: 204

10:00 - 10:25.

Google's investment on Beam, and internal use of Beam at Google

by Kerry Donny-Clark

Room: 204

10:25 - 10:50.

Machine learning design patterns: between Beam and a hard place

by Lak Lakshmanan

Room: 204

10:50 - 11:15.

Tailoring pipelines at Spotify

by Rickard Zwahlen

Room: 204

11:15 - 11:40.

The adoption, current state and future of Apache Beam at Twitter

by Lohit Vijayarenu

Room: 204

11:40

Break

12:00

12:00 - 12:50.

Vega: Scaling MLOps Pipelines at Credit Karma using Apache Beam and Dataflow

by Debasish Das, Vishnu Venkataraman & Raj Katakam

Room: 204

12:00 - 12:50.

Houston, we've got a problem: 6 principles for pipelines design taken from the Apollo missions

by Israel Herraiz & Paul Balm

Room: 203

12:00 - 12:25.

RunInference: Machine Learning Inferences in Beam

by Andy Ye

Room: 202

12:30 - 12:55.

Speeding up development with Apache Beam (Adobe Experience Platform)

by Constantin Scacun & Alexander Falca

Room: 202

13:00

Lunch

14:00

14:00 - 14:50.

Powering Real-time Data at Intuit: A Look at Golden Signals powered by Beam

by Omkar Deshpande, Dunja Panic, Nick Hwang & Nagaraja Tantry

Room: 204

14:00 - 14:50.

How the sausage gets made: Dataflow under the covers

by Pablo Estrada

Room: 203

14:00 - 14:25.

State of the Go SDK 2022

by Robert Burke

Room: 202

14:30 - 14:55.

How to break Wordle with Beam and BigQuery

by Iñigo San Jose Visiers

Room: 202

15:00

15:00 - 15:50.

BlueVoyant: Detecting Security Dumpster Fires on the Internet

by Alfredo Gimenez, Adam Najman, Tucker Leavitt & Tyler Flach

Room: 204

15:00 - 15:50.

Migration Spark to Apache Beam/Dataflow and hexagonal architecture + DDD

by Mazlum Tosun

Room: 203

15:00 - 15:25.

Introduction to performance testing in Apache Beam

by Alexey Romanenko

Room: 202

15:30 - 15:55.

From script slums to beam skyscrapers

by Shailesh Mangal

Room: 202

16:00

Break

16:15

16:15 - 16:40.

Data Integration on cloud made easy using Apache Beam

by Parag Ghosh

Room: 204

16:45 - 17:10.

Collibra’s Telemetry Backbone - OpenTelemetry and Apache Beam

by Alex Van Boxel

Room: 204

16:15 - 16:40.

How to benchmark your Beam pipelines for cost optimization and capacity planning

by Roy Arsan

Room: 203

16:45 - 17:10.

Strategies for caching data in Dataflow using Beam SDK

by Zeeshan

Room: 203

16:15 - 17:15.

Cloud Spanner change streams and Apache Beam

by Haikuo Liu, Nancy Xu & Le Chang

Room: 202

17:15

17:15 - 18:05.

New Avro serialization and deserialization in Beam SQL

by Talat Uyarer

Room: 204

17:15 - 18:05.

Implementing Cloud Agnostic Machine Learning Workflows with Apache Beam on Kubernetes

by Charles Adetiloye & Alexander Lerma

Room: 203

18:05

Reception 18:05 - 20:00 hrs

Tuesday, July 19, 2022

10:00

10:00 - 10:25.

Where is Beam leading Data Processing now?

by Kenn Knowles

Room: 204

10:25 - 10:50.

Palo Alto Networks' massive-scale deployment of Beam

by Talat Uyarer

Room: 204

10:50 - 11:15.

Beam as a High-Performance Compute Grid

by Peter Coyle & Raj Subramani

Room: 204

11:20

Break

12:00

12:00 - 12:50.

Optimizing a Dataflow pipeline for cost efficiency: lessons learned at Orange

by Jérémie Gomez & Thomas Sauvagnat

Room: 204

12:00 - 12:25.

Oops, I wrote a Portable Beam Runner in Go

by Robert Burke

Room: 203

12:30 - 12:55.

Combine by Example - OpenTelemetry Exponential Histogram

by Alex Van Boxel

Room: 203

12:00 - 12:50.

Visually build Beam pipelines using Apache Hop

by Matt Casters

Room: 202

13:00

Lunch

14:00

14:00 - 14:25.

Unified Streaming and Batch Pipelines at LinkedIn using Beam

by Shangjin Zhang & Yuhong Cheng

Room: 204

14:30 - 14:55.

Detecting Change-Points in Real-Time with Apache Beam

by Devon Peticolas

Room: 204

14:00 - 14:50.

Beam in Production: Working with DataFlow Flex templates and Cloud Build

by Ragy Abraham

Room: 203

14:00 - 14:50.

Log ingestion and data replication at Twitter

by Praveen Killamsetti & Zhenzhao Wang

Room: 202

15:00

15:00 - 15:50.

Streaming NLP infrastructure on Dataflow

by Alex Chan & Angus Neilson

Room: 204

15:00 - 15:50.

Relational Beam: Process columns, not rows!

by Andrew Pilloud & Brian Hulette

Room: 203

15:00 - 15:50.

Error handling with Apache Beam and Asgarde library

by Mazlum Tosun

Room: 202

16:00

Break

16:15

16:15 - 17:05.

Developing PulsarIO Connector

by Marco Robles

Room: 204

16:15 - 16:40.

Writing a native Go streaming pipeline

by Danny McCormick & Jack McCluskey

Room: 203

16:45 - 17:10.

Beam data pipelines on microservice architectures

by Pragalbh Srivastava

Room: 203

16:15 - 16:40.

Use of shared handles for Cache reuse across DoFn’s in Python

by Amruta Deshmukh

Room: 202

16:45 - 17:10.

Playing the Long Game - Transforming Ricardo's Data Infrastructure with Apache Beam

by Tobias Kaymak

Room: 202

17:15

17:15 - 18:05.

Online clustering and semantic enrichment of textual data with Apache Beam

by Konstantin Buschmeier

Room: 204

17:15 - 18:05.

Scaling up pandas with the Beam DataFrame API

by Brian Hulette

Room: 203

17:15 - 17:40.

Apache Beam backend for open source Scalding

by Navin Viswanath

Room: 202

Wednesday, July 20, 2022

09:00

09:00 - 12:00.

Scio in-depth workshop

by Michel Davit, Israel Herraiz, Claire McGinty, Kellen Dye & Annica Ivert

Room: 202

09:00 - 09:25.

Real time liveness status of industrial sensors with Apache Beam on Dataflow runner and Yugabyte

by Kamaljeet Singh

Room: 201 (remote)

09:30 - 09:55.

Supporting ACID transactions in a NoSQL database with Apache Beam

by Jan Lukavský

Room: 201 (remote)

10:00 - 10:50.

Improving Beam-Dataflow Pipelines for Text Data Processing

by Sayak Paul & Nilabhra Roy Chowdhury

Room: 201 (remote)

11:00 - 11:10.

Beam Playground: discover, learn and prototype with Apache Beam

by Daria Malkova

Room: 201 (remote)

11:10 - 11:20.

GCP Beam Common Customer Issues

by Svetak Sundhar

Room: 201 (remote)

11:20 - 11:30.

The Ray Beam Runner Project: A Vision for Unified Batch, Streaming, and ML

by Patrick Ames, Jiajun Yao & Chandan Prasad

Room: 201 (remote)

12:00

Lunch

12:30

12:30 - 15:00.

Splittable DoFns in Python: a hands-on workshop

by Israel Herraiz & Miren Esnaola

Room: 202