AWS re:Invent 2019: Using Amazon EMR to build a Spark ecosystem at Opendoor (STP302)

Published on Dec 07, 2019

In this session, you learn how Opendoor, an online home-buying and selling service, manages medium-sized real estate data using Spark. The session covers the journey from in-house data processing solutions on Kubernetes, through benchmarking against providers such as Amazon EMR and Databricks, to migrating to Amazon EMR to achieve a balance of cost and performance. As a cost-conscious startup with heavy data processing needs, Opendoor had to balance cost, performance, availability, and developer experience when designing its ETL and machine learning platform. This session focuses on how Opendoor optimized its data workflow by migrating Spark workloads from Kubernetes to Amazon EMR.