AWS Video Catalog

AWS re:Invent 2018: [REPEAT 1] A Deep Dive into What's New with Amazon EMR (ANT340-R1)

Published on Dec 01, 2018

Amazon EMR is one of the largest Spark and Hadoop service providers in the world, enabling customers to run ETL, machine learning, real-time processing, data science, and low-latency SQL at petabyte scale. In this session, we introduce design patterns such as using Amazon S3 instead of HDFS, taking advantage of both long- and short-lived clusters, using notebooks, and other architectural best practices. We discuss lowering cost with Auto Scaling and Spot Instances, and security best practices for encryption and fine-grained access control. We showcase key improvements made to the service in 2017. We cover improvements to the Amazon EMR API, best practices for using Spot Instances on their own and with Auto Scaling, improvements to Amazon S3 performance on Amazon EMR, and security, including authorization and authentication. We pair each of these with a demo or customer use case to illustrate the benefits. If you are an existing Amazon EMR user, you will walk away with a thorough understanding of the improvements made in 2018 and how they benefit you. If you are a new Amazon EMR user, you will gain an understanding of common use cases and how other customers are using Amazon EMR.

An introduction to Amazon EMR - Amazon Web Services (3:01)

AWS re:Invent 2018: [REPEAT 1] A Deep Dive into What's New with Amazon EMR (ANT340-R1) (1:01:29)

AWS re:Invent 2018: Hadoop/Spark to Amazon EMR, Architect It for Security & Governance (ANT312) (55:47)

2021

AWS Summit Online ASEAN 2021 | Run Amazon EMR at low cost - Amazon EC2 Spot Instances & Amazon MWAA

How can I access applications on an Amazon EMR cluster if the cluster is in a private subnet?

Amazon EMR on EKS - Build Custom Images for Apache Spark on Kubernetes

AWS What's Next ft. Amazon EMR Studio | AWS Events

How can I modify the Spark configuration in an Amazon EMR notebook?

Amazon EMR Support for Targeted ODCR

EMR on EKS - Optimizing Apache Spark jobs on EMR on EKS

EMR on EKS - Orchestrating workflows with Apache Airflow

EMR on EKS - Accessing a Hive metastore or Glue Data Catalog

EMR on EKS - Running Apache Spark jobs on EMR on EKS

Amazon EMR on Amazon EKS - What is EMR on EKS?

Using Amazon EMR Studio to Launch EMR clusters in AWS Service Catalog

Intro to Amazon EMR Studio

Incremental Data Processing using Delta Lake with EMR

AWS re:Invent 2020: What’s new with Amazon EMR

AWS re:Invent 2020: Run big data analytics faster at lower cost with Amazon EMR

AWS re:Invent 2020: How Nielsen built a multi-petabyte data platform using Amazon EMR

AWS re:Invent 2020: Run Spark on Kubernetes with Amazon EMR on Amazon EKS

AWS re:Invent 2020: Introducing EMR Studio: a new notebook-first IDE experience

AWS re:Invent 2020: Implement data access controls for multi-tenant Amazon EMR clusters

AWS re:Invent 2020: Turbocharging query execution on Amazon EMR
