AWS Video Catalog

AWS re:Invent 2016: Netflix: Using Amazon S3 as the fabric of our big data ecosystem (BDM306)

Published on Dec 01, 2016

Amazon S3 is the central data hub for Netflix's big data ecosystem. We currently have over 1.5 billion objects and 60+ PB of data stored in S3. As we ingest, transform, transport, and visualize data, we find this data naturally weaving in and out of S3. Amazon S3 gives us the flexibility to use an interoperable set of big data processing tools like Spark, Presto, Hive, and Pig. It serves as the hub for transporting data to additional data stores and engines like Teradata, Redshift, and Druid, as well as for exporting data to reporting tools like MicroStrategy and Tableau. Over time, we have built an ecosystem of services and tools to manage our data on S3. We have a federated metadata catalog service that keeps track of all our data. We have a set of data lifecycle management tools that expire data based on business rules and compliance requirements. We also have a portal that allows users to see the cost and size of their data footprint. In this talk, we’ll dive into these major uses of S3, as well as many smaller cases where S3 smoothly addresses an important data infrastructure need. We will also provide solutions and methodologies for building your own S3 big data hub.

AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (BDM401) (57:06)

AWS re:Invent 2016: Netflix: Using Amazon S3 as the fabric of our big data ecosystem (BDM306) (46:53)

Using Amazon S3 for Apache HBase on Amazon EMR (1:00)

2021

Why am I getting an Access Denied error from the Amazon S3 console while I modify a bucket policy?

AWS AMER Summit May 2021 | What’s new with Amazon S3

Configure PII redaction using Amazon S3 Object Lambda Access Points | Amazon Web Services

AWS AMER Summit 2020 | Best practices for building a data lake on Amazon S3

Mobilewalla: S3 Access Monitoring Using ML

AWS Pi Week 2021: Demo - S3 Object Lambda | AWS Events

AWS Pi Week 2021: S3 Object Lambda - add code to Amazon S3 GET requests to process data | AWS Events

AWS Pi Week 2021: S3 & Lambda - flexible pattern at the core of serverless applications | AWS Events

AWS Pi Week 2021: Building serverless applications with Amazon S3 | AWS Events

AWS Pi Week 2021: Live coding - Uploading media to Amazon S3 from web & mobile apps | AWS Events

AWS Pi Week 2021: The best compute for your storage - Amazon S3 & AWS Lambda | AWS Events

AWS Pi Week 2021: Serverless on Amazon S3 - Introducing S3 Object Lambda | AWS Events

AWS Pi Week 2021: Demo - Monitor your Amazon S3 inventory & identify sensitive data | AWS Events

AWS Pi Week 2021: Demo - Amazon S3 security posture management and threat detection | AWS Events

AWS Pi Week 2021: Demo - Amazon S3 and VPC Endpoints | AWS Events

AWS Pi Week 2021: Advanced networking with Amazon S3 and AWS PrivateLink | AWS Events

AWS Pi Week 2021: Protecting your data in Amazon S3 | AWS Events

AWS Pi Week 2021: Managing access to your Amazon S3 buckets and objects | AWS Events

AWS Pi Week 2021: S3 Block Public Access overview and demo | AWS Events

AWS Pi Week 2021: Securing Amazon S3 with guardrails and fine-grained access controls | AWS Events

AWS Pi Week 2021: Accelerating your migration to Amazon S3 | AWS Events

AWS Pi Week 2021: Modernizing your data archive with Amazon S3 Glacier | AWS Events

AWS Pi Week 2021: Amazon S3 Strong Consistency | AWS Events

AWS Pi Week 2021: Building modern data lakes on Amazon S3 | AWS Events

AWS Pi Week 2021: Managed file transfers to Amazon S3 and EFS over SFTP, FTPS, and FTP | AWS Events

AWS Pi Week 2021: Amazon S3 foundations: Best practices for Amazon S3 | AWS Events

AWS Pi Week 2021: Architecting for high availability on Amazon S3 | AWS Events

AWS Pi Week 2021: Beyond eleven nines - How Amazon S3 is built for durability | AWS Events

AWS Pi Week 2021: Backup to Amazon S3 and Amazon S3 Glacier | AWS Events

AWS Pi Week 2021: Amazon S3 Replication: For data protection & application acceleration | AWS Events

AWS Pi Week 2021: Amazon S3 storage classes primer | AWS Events

AWS Pi Week 2021: Optimize and manage data on Amazon S3 | AWS Events

How do I raise RFCs for cross-account Amazon S3 bucket access in different AMS accounts?

Amazon S3 File Gateway Overview - On-Premises Backup to the AWS Cloud

Create Lookout For Vision Dataset Using Images In Amazon S3

AWS Supports You: Diving Deep into Amazon S3 Security Features

Quickly Create Rekognition Custom Labels Dataset Using Images In Amazon S3

IoT All the Things | S3 E1 | All in with James Gosling: Behind the Scenes with AWS IoT Greengrass V2

How to Automate Data Transfer from Amazon S3 to Zendesk using Amazon AppFlow

15 years of Amazon S3 - Building an Evolvable System

S3 Object Lambda Overview

15 years of Amazon S3 - Security is Job Zero

15 years of Amazon S3 - Accelerating Data Movement

15 years of Amazon S3 - Foundations of Cloud Infrastructure

AWS Storage Partners Celebrate 15 Years of Amazon S3

Local File to S3 to QuickSight Dashboard

Happy 15th Birthday Amazon S3

Back to Basics: Bulk Data Storage

AWS re:Invent 2020: Dropbox cuts costs with cold metadata store using Amazon DynamoDB and S3

AWS re:Invent 2020: Amazon S3 foundations: Best practices for Amazon S3

AWS re:Invent 2020: Architecting for high availability on Amazon S3

AWS re:Invent 2020: Accelerate your migration to Amazon S3

AWS re:Invent 2020: Advanced techniques for building with Amazon S3 in .NET applications

AWS re:Invent 2020: Closing the lid on public S3 buckets: Preventing S3 bucket exposure – Part 1

AWS re:Invent 2020: A defense-in-depth approach to Amazon S3 security and access

AWS re:Invent 2020: Secure your file transfers to Amazon S3 over SFTP, FTPS, and FTP

AWS re:Invent 2020: Achieving unparalleled scale on Amazon S3: Manage and analyze your data

AWS re:Invent 2020: Best practices for archiving large datasets with AWS

AWS re:Invent 2020: Amazon S3 to AWS Lambda: A flexible pattern at the core of serverless apps

AWS re:Invent 2020: Closing the lid on public S3 buckets: Preventing S3 bucket exposure – Part 2

AWS re:Invent 2020: What’s new with Amazon S3
