AWS Video Catalog

AWS re:Invent 2015 | (BDT208) A Technical Introduction to Amazon Elastic MapReduce

Published on Oct 12, 2015

Amazon EMR provides a managed framework that makes it easy, cost-effective, and secure to run data processing frameworks such as Apache Hadoop, Apache Spark, and Presto on AWS. In this session, you learn the key design principles behind running these frameworks in the cloud and the feature set that Amazon EMR offers. We discuss the benefits of decoupling compute and storage, and strategies for taking advantage of the scale and parallelism that the cloud offers while lowering costs. Additionally, you hear from AOL’s Senior Software Engineer about how they used these strategies to migrate their Hadoop workloads to the AWS cloud, and the lessons learned along the way. Specifically, you learn the benefits of decoupling storage and compute and allowing them to scale independently; how to run Hadoop, Spark, Presto, and other supported Hadoop applications on Amazon EMR; how to use Amazon S3 as a persistent data store and process data directly from Amazon S3; deployment strategies and how to avoid common mistakes when deploying at scale; and how to use Spot Instances to scale your transient infrastructure effectively.

AWS re:Invent 2015 | (BDT305) Amazon EMR Deep Dive and Best Practices (49:13)

AWS re:Invent 2015 | (BDT208) A Technical Introduction to Amazon Elastic MapReduce (50:45)

Deep Dive: Amazon Elastic MapReduce (48:04)

2021

AWS Summit Online ASEAN 2021 | Run Amazon EMR at low cost - Amazon EC2 Spot Instances & Amazon MWAA

How can I access applications on an Amazon EMR cluster if the cluster is in a private subnet?

Amazon EMR on EKS - Build Custom Images for Apache Spark on Kubernetes

AWS What's Next ft. Amazon EMR Studio | AWS Events

How can I modify the Spark configuration in an Amazon EMR notebook?

Amazon EMR Support for Targeted ODCR

EMR on EKS - Optimizing Apache Spark jobs on EMR on EKS

EMR on EKS - Orchestrating workflows with Apache Airflow

EMR on EKS - Accessing a Hive metastore or Glue Data Catalog

EMR on EKS - Running Apache Spark jobs on EMR on EKS

Amazon EMR on Amazon EKS - What is EMR on EKS?

Using Amazon EMR Studio to Launch EMR clusters in AWS Service Catalog

Intro to Amazon EMR Studio

Incremental Data Processing using Delta Lake with EMR

AWS re:Invent 2020: What’s new with Amazon EMR

AWS re:Invent 2020: Run big data analytics faster at lower cost with Amazon EMR

AWS re:Invent 2020: How Nielsen built a multi-petabyte data platform using Amazon EMR

AWS re:Invent 2020: Run Spark on Kubernetes with Amazon EMR on Amazon EKS

AWS re:Invent 2020: Introducing EMR Studio: a new notebook-first IDE experience

AWS re:Invent 2020: Implement data access controls for multi-tenant Amazon EMR clusters

AWS re:Invent 2020: Turbocharging query execution on Amazon EMR
