Using Amazon S3 for Apache HBase on Amazon EMR

Published on Nov 21, 2016

For more information about Apache HBase on Amazon S3, see the Amazon EMR documentation: http://amzn.to/2gapiYy. You can use Amazon S3 as the data store for Apache HBase on Amazon EMR through the EMR File System (EMRFS). With Amazon S3 as the data store, your cluster’s storage is separated from its compute nodes. This lets you reduce costs by sizing the cluster for your compute requirements instead of paying to store your entire dataset with 3x replication in the on-cluster Hadoop Distributed File System (HDFS).
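
As a rough illustration of what enabling this looks like, the sketch below builds the configuration JSON that Amazon EMR expects when HBase should keep its root directory on Amazon S3. It assumes the `hbase` and `hbase-site` configuration classifications described in the Amazon EMR documentation. The helper name `hbase_on_s3_configurations` and the bucket path `s3://example-bucket/hbase-root` are hypothetical placeholders, so check the property names against the documentation for your EMR release before relying on them.

```python
import json


def hbase_on_s3_configurations(root_dir: str) -> list:
    """Build an EMR configuration list that points HBase at an S3 root directory.

    `root_dir` is a placeholder for your own bucket and prefix.
    """
    if not root_dir.startswith("s3://"):
        raise ValueError("root_dir must be an s3:// URI")
    return [
        # Tells HBase on EMR to use S3 instead of on-cluster HDFS for storage.
        {"Classification": "hbase", "Properties": {"hbase.emr.storageMode": "s3"}},
        # Sets the HBase root directory to the given S3 location.
        {"Classification": "hbase-site", "Properties": {"hbase.rootdir": root_dir}},
    ]


if __name__ == "__main__":
    # Hypothetical bucket and prefix, for illustration only.
    configs = hbase_on_s3_configurations("s3://example-bucket/hbase-root")
    print(json.dumps(configs, indent=2))
```

The printed JSON could be saved to a file and supplied as the cluster configuration when the cluster is created, for example through the `--configurations` option of `aws emr create-cluster`.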