
hadoop cookbook

Requirements

This cookbook may work on earlier versions, but these are the minimum versions it has been tested against:

Chef 11.4.0+

CentOS 6.4+

Ubuntu 12.04+

This cookbook assumes that you have a working Java installation. It has been tested using version 1.21.2 of the java cookbook with Oracle Java 6. If you plan to use Hive with a database other than the embedded Derby, you will need to provide that database and set it up before starting the Hive Metastore service.

Usage

This cookbook is designed to be used with a wrapper cookbook or a role that supplies the settings for configuring Hadoop. The services should work out of the box on a single host, but only minimal validation is done that the resulting Hadoop configuration actually works. The cookbook is attribute-driven and is suitable for use with either chef-client or chef-solo, since it does not use any server-based functionality. The cookbook defines service definitions for each Hadoop service, but it does not enable or start them by default.
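A wrapper role for a single-host setup might look like the following sketch. The role name, recipe names, and attribute values here are illustrative assumptions, not taken from the cookbook itself; check the cookbook's recipes for the exact recipe names it ships.

```ruby
# Hypothetical role file (roles/hadoop_single_node.rb) driving this
# cookbook from a role, as the Usage section describes. Recipe names
# and values are assumptions for illustration only.
name 'hadoop_single_node'
description 'Single-host Hadoop with an HDFS NameNode'
run_list(
  'recipe[hadoop::default]',
  'recipe[hadoop::hadoop_hdfs_namenode]'   # assumed recipe name
)
default_attributes(
  'hadoop' => {
    'core_site' => {
      'fs.defaultFS' => 'hdfs://localhost:8020'
    }
  }
)
```

Because the cookbook defines but does not start services, the wrapper is also where you would enable and start the ones you want.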

Attributes

Attributes for this cookbook define the configuration files for Hadoop and its various services. Hadoop configuration files are XML files containing name/value property pairs. The attribute name determines both the file in which the property is placed and the property's name; the attribute value is the property's value. For example, the attribute hadoop['core_site']['fs.defaultFS'] configures a property named fs.defaultFS in core-site.xml in hadoop['conf_dir']. All attribute values are taken as-is, and only minimal checking is done on them, so it is up to you to provide a valid configuration for your cluster.
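The mapping from attribute tree to XML can be sketched in plain Ruby. This is not the cookbook's actual template, just a minimal illustration of how each key under hadoop['core_site'] becomes a property element in core-site.xml; the property values are made up.

```ruby
# Minimal sketch (plain Ruby, not the cookbook's template) of how the
# attribute tree maps onto a Hadoop XML config file: each name/value
# pair under hadoop['core_site'] becomes a <property> in core-site.xml.
require 'cgi'

hadoop = {
  'core_site' => {
    'fs.defaultFS'        => 'hdfs://localhost:8020',  # illustrative value
    'io.file.buffer.size' => '65536'                   # illustrative value
  }
}

properties = hadoop['core_site'].map do |name, value|
  "  <property>\n" \
  "    <name>#{CGI.escapeHTML(name)}</name>\n" \
  "    <value>#{CGI.escapeHTML(value.to_s)}</value>\n" \
  "  </property>"
end

core_site_xml = "<configuration>\n#{properties.join("\n")}\n</configuration>"
puts core_site_xml
```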

| Attribute Tree | File | Location |
| --- | --- | --- |
| flume['flume_conf'] | flume.conf | flume['conf_dir'] |
| hadoop['capacity_scheduler'] | capacity-scheduler.xml | hadoop['conf_dir'] |
| hadoop['container_executor'] | container-executor.cfg | hadoop['conf_dir'] |
| hadoop['core_site'] | core-site.xml | hadoop['conf_dir'] |
| hadoop['fair_scheduler'] | fair-scheduler.xml | hadoop['conf_dir'] |
| hadoop['hadoop_env'] | hadoop-env.sh | hadoop['conf_dir'] |
| hadoop['hadoop_metrics'] | hadoop-metrics.properties | hadoop['conf_dir'] |
| hadoop['hadoop_policy'] | hadoop-policy.xml | hadoop['conf_dir'] |
| hadoop['hdfs_site'] | hdfs-site.xml | hadoop['conf_dir'] |
| hadoop['log4j'] | log4j.properties | hadoop['conf_dir'] |
| hadoop['mapred_env'] | mapred-env.sh | hadoop['conf_dir'] |
| hadoop['mapred_site'] | mapred-site.xml | hadoop['conf_dir'] |
| hadoop['yarn_env'] | yarn-env.sh | hadoop['conf_dir'] |
| hadoop['yarn_site'] | yarn-site.xml | hadoop['conf_dir'] |
| hbase['hadoop_metrics'] | hadoop-metrics.properties | hbase['conf_dir'] |
| hbase['hbase_env'] | hbase-env.sh | hbase['conf_dir'] |
| hbase['hbase_policy'] | hbase-policy.xml | hbase['conf_dir'] |
| hbase['hbase_site'] | hbase-site.xml | hbase['conf_dir'] |
| hbase['jaas'] | jaas.conf | hbase['conf_dir'] |
| hbase['log4j'] | log4j.properties | hbase['conf_dir'] |
| hive['hive_env'] | hive-env.sh | hive['conf_dir'] |
| hive['hive_site'] | hive-site.xml | hive['conf_dir'] |
| hive['jaas'] | jaas.conf | hive['conf_dir'] |
| oozie['oozie_env'] | oozie-env.sh | oozie['conf_dir'] |
| oozie['oozie_site'] | oozie-site.xml | oozie['conf_dir'] |
| spark['log4j'] | log4j.properties | spark['conf_dir'] |
| spark['metrics'] | metrics.properties | spark['conf_dir'] |
| spark['spark_env'] | spark-env.sh | spark['conf_dir'] |
| tez['tez_env'] | tez-env.sh | tez['conf_dir'] |
| tez['tez_site'] | tez-site.xml | tez['conf_dir'] |
| zookeeper['jaas'] | jaas.conf | zookeeper['conf_dir'] |
| zookeeper['log4j'] | log4j.properties | zookeeper['conf_dir'] |
| zookeeper['zoocfg'] | zoo.cfg | zookeeper['conf_dir'] |

Distribution Attributes

hadoop['distribution_version'] - Specifies which version of hadoop['distribution'] to use. Defaults: 2.0 if hadoop['distribution'] is hdp, 5 if it is cdh, and 0.8.0 if it is bigtop. It can also be set to develop when hadoop['distribution'] is bigtop, to allow installing from development repositories without GPG validation.

APT-specific settings

hadoop['apt_repo_url'] - Provide an alternate apt installation source location. If you change this attribute, you are expected to provide a path to a working repo for the hadoop['distribution'] used. Default: nil

RPM-specific settings

hadoop['yum_repo_url'] - Provide an alternate yum installation source location. If you change this attribute, you are expected to provide a path to a working repo for the hadoop['distribution'] used. Default: nil
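Putting the distribution and repository attributes together, a wrapper cookbook pointing at an internal mirror might set something like the following. The mirror URL is purely illustrative; as noted above, whatever you provide must be a working repo for the chosen hadoop['distribution'].

```ruby
# Hypothetical wrapper-cookbook attributes file (attributes/default.rb).
# Distribution/version values come from the Distribution Attributes
# section above; the mirror URL is an assumption for illustration.
default['hadoop']['distribution'] = 'cdh'
default['hadoop']['distribution_version'] = '5'
default['hadoop']['yum_repo_url'] = 'http://mirror.example.com/cdh/5/x86_64/'
```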

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.