Hadoop v3 Offerings

This blog covers some of the new features Hadoop v3 has to offer existing or new Hadoop customers. It is a good idea to familiarize yourself with these features, so that in case you want to move to Hadoop or upgrade your cluster from an older version, you will know what you can try and experiment with on your cluster!

I will be covering installation and upgrade to Hadoop v3 in separate blogs, as this one is strictly focused on the features of Hadoop v3.

The Overview:

So, let’s have a look at the history of Hadoop version 3, which was released at the end of last year, on 13-December-2017. What a nice Christmas surprise for the community! All thanks to the hard-working committers who made this happen.

After four alpha releases and one beta release, 3.0.0 is generally available. 3.0.0 consists of 302 bug fixes, improvements, and other enhancements since 3.0.0-beta1. Altogether, 6,242 issues were fixed as part of the 3.0.0 release series since 2.7.0.

If you are keen on details about the JIRAs reported and addressed, you can have a look at the link below:

The salient features of Hadoop v3:

As we have already taken a look at the history, let me jot down some features introduced as part of this new release:

Minimum required Java version increased from Java 7 to Java 8

Support for erasure coding in HDFS

YARN Timeline Service v.2

Shell script rewrite

Shaded client jars

Support for Opportunistic Containers and Distributed Scheduling

MapReduce task-level native optimization

Support for more than 2 NameNodes

Default ports of multiple services have been changed

Support for Microsoft Azure Data Lake and Aliyun Object Storage System filesystem connectors

Intra-data node balancer

Reworked daemon and task heap management

S3Guard: Consistency and Metadata Caching for the S3A filesystem client

HDFS Router-Based Federation

The API-based configuration of Capacity Scheduler queue configuration

YARN Resource Types

Now I will cover details of the features that are on my favourites list and that should help readers understand them technically. Note: at this point I can’t cover in-depth details of each feature, as that would make the blog clumsy and boring, which I don’t want at all.

Hadoop Erasure Coding: Erasure coding is a method for durably storing data with significant space savings compared to replication. Standard encodings like Reed-Solomon (10,4) have a 1.4x space overhead, compared to the 3x overhead of standard HDFS replication. Since erasure coding imposes additional overhead during reconstruction and performs mostly remote reads, it has traditionally been used for storing colder, less frequently accessed data. Users should consider the network and CPU overheads of erasure coding when deploying this feature. To understand more about this feature you can refer to the listed link:
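
As a quick hands-on sketch (assuming the built-in RS-10-4-1024k policy and a hypothetical /cold-data directory), the new hdfs ec sub-command can be used like this:

hdfs ec -listPolicies                                         # show the erasure coding policies known to the cluster
hdfs ec -enablePolicy -policy RS-10-4-1024k                   # enable the Reed-Solomon (10,4) policy
hdfs ec -setPolicy -path /cold-data -policy RS-10-4-1024k     # apply it to a directory of cold data
hdfs ec -getPolicy -path /cold-data                           # verify which policy is in effect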

Namenode HA with more than 2 nodes: With this feature, a customer can have more than two NameNodes in an Active/Standby setup. In earlier releases, NameNode HA was an Active/Passive implementation that tolerated the failure of only one NameNode. To achieve a higher degree of fault tolerance, a customer can now implement NameNode HA with more than two NameNodes, backed by a Quorum Journal Manager and fencing.
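
A minimal hdfs-site.xml sketch, assuming a nameservice called mycluster and three NameNode hosts nn1/nn2/nn3 (only the HA-related properties are shown):

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <!-- three NameNodes instead of the earlier limit of two -->
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2,nn3</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1.example.com:9820</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2.example.com:9820</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn3</name>
  <value>nn3.example.com:9820</value>
</property>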

Changes in default ports of multiple services: With this feature, the ports of Hadoop services such as the NameNode, Secondary NameNode, DataNode, and KMS have been moved out of the Linux ephemeral port range (32768-61000). In earlier versions, having these service ports in the ephemeral range sometimes caused conflicts with other applications and created problems at service startup. For example, the NameNode RPC port moved from 8020 to 9820, the NameNode web UI from 50070 to 9870, the DataNode data transfer port from 50010 to 9866, and the KMS from 16000 to 9600; check the 3.0.0 release notes for the full mapping.

Intra-data node balancer: Remember the below command for balancing the Hadoop cluster when we add new DataNodes or for other admin-specific tasks. However, adding or replacing disks can lead to significant skew within a single DataNode. This situation was not handled by the earlier HDFS balancer utility, which concerns itself with inter-, not intra-, DataNode skew. The new intra-DataNode balancer (disk balancer) takes care of this and can rebalance data across the disks of a DataNode.

hdfs balancer
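
The new intra-DataNode work is exposed through the separate hdfs diskbalancer tool (dfs.disk.balancer.enabled must be set to true in hdfs-site.xml); a rough workflow, assuming a hypothetical DataNode hostname, looks like this:

hdfs diskbalancer -plan datanode1.example.com        # create a plan describing how to move blocks between that node's disks
hdfs diskbalancer -execute /system/diskbalancer/<date>/datanode1.example.com.plan.json   # run the generated plan
hdfs diskbalancer -query datanode1.example.com       # check the status of the running plan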

HDFS Router-Based Federation: HDFS Router-Based Federation adds an RPC routing layer that provides a federated view of multiple HDFS namespaces. This is similar to the existing ViewFs and HDFS Federation functionality, except that the mount table is managed on the server side by the routing layer rather than on the client.

YARN Timeline Service v2: Timeline Service v2 addresses two major challenges: improving the scalability and reliability of the Timeline Service, and enhancing usability by introducing flows and aggregation, which were lacking in the earlier version.

YARN Resource Types: This feature enables user-defined countable resources, with which a Hadoop cluster admin can define countable resources such as GPUs, software licenses, or locally attached storage. This is in addition to CPU and memory, which were already part of earlier releases.
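
As a sketch, a custom countable resource (here a made-up name, resource1) is declared cluster-wide in resource-types.xml, and each NodeManager then advertises its capacity for it in node-resources.xml:

<!-- resource-types.xml -->
<configuration>
  <property>
    <name>yarn.resource-types</name>
    <value>resource1</value>
  </property>
  <property>
    <!-- optional unit for the new resource type -->
    <name>yarn.resource-types.resource1.units</name>
    <value>G</value>
  </property>
</configuration>

<!-- node-resources.xml on each NodeManager -->
<configuration>
  <property>
    <name>yarn.nodemanager.resource-type.resource1</name>
    <value>5G</value>
  </property>
</configuration>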

Developing custom Maven plugin using Java 5 annotations

Maven provides lots of built-in plugins for developers, but at some point you may find the need for a custom Maven plugin. Developing a custom Maven plugin using Java 5 annotations is very simple and straightforward.

You just need to follow the steps below to develop a custom Maven plugin using Java 5 annotations:

Steps:

1. Create a new project with the pom packaging set to “maven-plugin”.

2. Add the below dependencies to your plugin pom:
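
The original snippet is not reproduced here, but a plugin pom typically needs the core mojo API plus the annotations artifact; the version numbers below are just examples:

<dependencies>
  <!-- core mojo API (AbstractMojo, MojoExecutionException, ...) -->
  <dependency>
    <groupId>org.apache.maven</groupId>
    <artifactId>maven-plugin-api</artifactId>
    <version>3.5.0</version>
  </dependency>
  <!-- Java 5 annotations (@Mojo, @Parameter); needed only at build time -->
  <dependency>
    <groupId>org.apache.maven.plugin-tools</groupId>
    <artifactId>maven-plugin-annotations</artifactId>
    <version>3.5</version>
    <scope>provided</scope>
  </dependency>
</dependencies>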

3. Since Maven 3.0 we can use Java 5 annotations to develop custom plugins. With annotations it is not necessary for the mojo superclass to be in the same project, as long as the superclass also uses annotations. To use annotations in mojos, add the maven-plugin-annotations dependency (shown above) to your plugin pom file.
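
With those dependencies in place, a minimal annotated mojo might look like the sketch below ("sayhello" and param1 are hypothetical names used for illustration):

import org.apache.maven.plugin.AbstractMojo;
import org.apache.maven.plugin.MojoExecutionException;
import org.apache.maven.plugins.annotations.Mojo;
import org.apache.maven.plugins.annotations.Parameter;

// Hypothetical mojo: "sayhello" is the mojo name used on the command line.
@Mojo(name = "sayhello")
public class SayHelloMojo extends AbstractMojo {

    // Can be overridden from the command line with -Dparam1=...
    @Parameter(property = "param1", defaultValue = "world", required = false)
    private String param1;

    public void execute() throws MojoExecutionException {
        getLog().info("Hello, " + param1);
    }
}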

4. You can execute your plugin from the command line with the following command:

mvn pluginGroupId:artifactID:version:mojoName

To shorten the command, add the lines below to the pluginGroups section of Maven’s settings.xml file. This tells Maven to search the repository for this groupId:

<pluginGroups>
  <pluginGroup>plugin group id</pluginGroup>
</pluginGroups>

After this you can run your plugin simply by providing the goal prefix and mojo name. The command to run the plugin will look like this:

mvn goalPrefix:mojoName

5. Configuring the goalPrefix:

To create a goalPrefix, add the maven-plugin-plugin to your Maven plugin pom. It creates the plugin descriptor for any mojos found in the source tree and includes it in the jar; it can also generate report files for the mojos and update the plugin registry.
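
A sketch of the maven-plugin-plugin configuration, using a hypothetical prefix "sample" so that mojos can be run as mvn sample:<mojoName>:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-plugin-plugin</artifactId>
      <version>3.5</version>
      <configuration>
        <!-- "sample" is a placeholder; e.g. mvn sample:sayhello -->
        <goalPrefix>sample</goalPrefix>
      </configuration>
    </plugin>
  </plugins>
</build>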

6. You can pass external parameters to your plugin from the command line, and you can also set default values for your parameters in case they are not sent from the command line.

The command to run the plugin while passing a parameter is:

mvn goalPrefix:mojoName -Dparam1=acd

If you set the parameter’s required attribute to false, then there is no compulsion to pass the parameter from the command line.

As we all know, Maven by default scans source files in src/main/java and test files in src/test. Similarly, if you want your plugin to scan files in a particular folder of the project it is applied to, you can do this by injecting the org.apache.maven.project.MavenProject object as a parameter.
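
A sketch of a mojo that injects the current MavenProject and scans a configurable folder (names and defaults here are illustrative; the MavenProject class additionally requires the org.apache.maven:maven-core artifact as a provided dependency):

import java.io.File;

import org.apache.maven.plugin.AbstractMojo;
import org.apache.maven.plugins.annotations.Mojo;
import org.apache.maven.plugins.annotations.Parameter;
import org.apache.maven.project.MavenProject;

// Hypothetical mojo that scans a folder inside the project it is applied to.
@Mojo(name = "scan")
public class ScanMojo extends AbstractMojo {

    // The current Maven project is injected so the mojo can resolve project paths.
    @Parameter(defaultValue = "${project}", readonly = true, required = true)
    private MavenProject project;

    // Folder to scan, relative to the project base directory; defaultValue is used when -DscanDir is not passed.
    @Parameter(property = "scanDir", defaultValue = "src/main/resources")
    private String scanDir;

    public void execute() {
        File dir = new File(project.getBasedir(), scanDir);
        String[] entries = dir.list();
        getLog().info("Found " + (entries == null ? 0 : entries.length) + " entries in " + dir);
    }
}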

You can also set default values for your parameters by setting the parameter’s property/defaultValue in the mojo (or in the maven-plugin-plugin configuration section). If you don’t want to set a default, keep these custom property fields blank; these property fields are the property values you declared in the mojo.

9. Using your plugin in the main project: to use your plugin in another project, add the plugin in the build section of that project’s pom.
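
For example, with the hypothetical coordinates used earlier, the consuming project’s pom could bind the mojo to a lifecycle phase like this:

<build>
  <plugins>
    <!-- coordinates are placeholders for your own plugin -->
    <plugin>
      <groupId>com.example.plugins</groupId>
      <artifactId>sample-maven-plugin</artifactId>
      <version>1.0-SNAPSHOT</version>
      <executions>
        <execution>
          <phase>compile</phase>
          <goals>
            <goal>sayhello</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>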

10. To release the plugin, copy the plugin’s jar and other dependent jars from your local .m2 repository and release them to the client or QA.

With this we are done with developing a custom Maven plugin with Java 5 annotations. Please let me know if you have any doubts or suggestions. The sample project is available on GitHub; you can download it from the link below:

Please note: the above commands did not work because of the error given below.

"message" : "org.apache.ambari.server.controller.spi.SystemException: An internal system exception occurred: Could not delete service component from cluster. To remove service component, it must be in DISABLED/INIT/INSTALLED/INSTALL_FAILED/UNKNOWN/UNINSTALLED/INSTALLING state

The only option was to remove this component completely from ambari database and restart ambari-server/agent processes.

Writing Custom Gradle Plugin using Java

What is Gradle?

Gradle is a build automation tool that builds on the concepts of Apache Ant and Maven. Gradle avoids traditional XML-based configuration by introducing a Groovy-based domain-specific language. A Gradle project has .gradle build files instead of pom.xml files. Gradle was designed for multi-project builds and supports incremental builds.

A Gradle plugin groups together reusable pieces of build logic which can then be used across many different projects and builds. We can use any language whose compiled code gets converted to bytecode for developing a custom Gradle plugin. As Gradle is mainly designed around the Groovy language, it is very easy to develop a Gradle plugin using Groovy, but let’s see how to develop a custom Gradle plugin using Java:

5. All the user-defined values for the custom plugin are provided through an extension object, so create an extension class and register it with the plugin as shown above to receive inputs from the user. If the user does not provide input, the default values will be assumed.

6. Create the extension class, which is similar to a Java POJO class: it contains the user-defined properties and their getter/setter methods. If the user provides values for these properties at runtime, those values will be used; otherwise the defaults will be considered. A sketch is shown below.
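
A sketch of such an extension class, reusing the samplePlugin/sampleFilePath names that appear later in this post; the plugin’s apply() method would register it with project.getExtensions().create("samplePlugin", SamplePluginExtension.class):

// Hypothetical extension class exposing the samplePlugin { ... } block to the build script.
public class SamplePluginExtension {

    // Default value used when the consuming build does not set the property.
    private String sampleFilePath = "src/main/resources";

    public String getSampleFilePath() {
        return sampleFilePath;
    }

    public void setSampleFilePath(String sampleFilePath) {
        this.sampleFilePath = sampleFilePath;
    }
}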

For logging inside your custom plugin, use slf4j or any other logging framework of your choice. If you want to fail the build on an exception in your task, throw a TaskExecutionException, which will cause a build failure and report the task and the cause; it accepts the task object and a Throwable as input.

Here DefaultTask is the standard Gradle task implementation class, and we need to extend it while implementing custom tasks. The @TaskAction annotation marks a method as the action method; whenever the task executes, this method will be run.
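
A sketch of a custom task following that pattern (class and property names are the illustrative ones used elsewhere in this post):

import org.gradle.api.DefaultTask;
import org.gradle.api.tasks.TaskAction;
import org.gradle.api.tasks.TaskExecutionException;

// Hypothetical custom task: the @TaskAction method is what runs when the task executes.
public class SampleTask extends DefaultTask {

    // Normally populated from the samplePlugin extension by the plugin or the build script.
    private String sampleFilePath;

    public void setSampleFilePath(String sampleFilePath) {
        this.sampleFilePath = sampleFilePath;
    }

    public String getSampleFilePath() {
        return sampleFilePath;
    }

    @TaskAction
    public void run() {
        try {
            // getLogger() returns an slf4j-compatible logger.
            getLogger().info("Running sample task against {}", sampleFilePath);
            // ... the plugin's real work would go here ...
        } catch (Exception e) {
            // Throwing TaskExecutionException fails the build and attaches the task and cause.
            throw new TaskExecutionException(this, e);
        }
    }
}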

As we are developing the Gradle plugin using Java, add the apply plugin: 'java' line. Whatever external dependencies your plugin depends upon, add them in the dependencies section; the repositories in which they should be looked up go in the repositories section. Mention the group id and version of the plugin in the group and version properties.
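
A minimal build.gradle sketch for the plugin project (group/version are placeholders; newer Gradle versions would use implementation instead of compile):

apply plugin: 'java'

group = 'com.example.gradle'
version = '1.0-SNAPSHOT'

repositories {
    mavenCentral()
}

dependencies {
    // Gradle API needed to compile the plugin, task and extension classes.
    compile gradleApi()
    compile localGroovy()
    // Any other external libraries the plugin needs go here as well.
}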

To make the plugin available to other projects, it should be published to a repository or its archives should be uploaded; for this purpose use either the publishing or the uploadArchives functionality.

To publish the plugin to your local Maven repository, use the following command:

gradle clean build publishToMavenLocal

If you are uploading the plugin archives to a configured repository, then use the command below:

gradle clean build uploadArchives

11. To use the plugin in another project, make the following changes in that project’s build.gradle file.

Here the plugin dependency must be defined in the buildscript section, and to tell Gradle which repositories to scan for the plugin dependency, add a repositories section inside the buildscript section; this repositories section must come ahead of the dependencies section. After this, add the apply plugin line.

The sample task provided will be used for executing our plugin logic; the value of type in the task is the path of our custom task class. Whatever custom arguments we want to provide to the plugin need to be defined in the task section. If we want to run with the default parameters, comment out the samplePlugin.sampleFilePath line in the task section, as in the sketch below.
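
Putting it together, the consuming project’s build.gradle might look like the sketch below, assuming the hypothetical plugin id, extension, and task class used above:

buildscript {
    // repositories must come before dependencies so Gradle knows where to find the plugin
    repositories {
        mavenLocal()
    }
    dependencies {
        classpath 'com.example.gradle:sample-gradle-plugin:1.0-SNAPSHOT'
    }
}

apply plugin: 'sample-plugin'

// values for the extension exposed by the plugin
samplePlugin {
    sampleFilePath = 'src/main/resources'
}

// task of our custom type; comment out the samplePlugin.sampleFilePath line to fall back to defaults
task sampleTask(type: com.example.gradle.SampleTask) {
    sampleFilePath = samplePlugin.sampleFilePath
}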

Kafka integration with Ganglia

In this blog post I will show you Kafka integration with Ganglia. This is a very interesting and important topic for those who want to do benchmarking or measure performance by monitoring specific Kafka metrics via Ganglia.

Before going ahead, let me briefly explain what Kafka and Ganglia are.

#
# Ganglia monitoring system php web frontend
#
Alias /ganglia /usr/share/ganglia
<Location /ganglia>
Order deny,allow
# "Allow from all" is very important or else you won’t be able to see the ganglia web UI
Allow from all
Allow from 127.0.0.1
Allow from ::1
# Allow from .example.com
</Location>

cluster {
name = "hadoopkafka"
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}
/* The host section describes attributes of the host, like the location */
host {
location = "unspecified"
}
/* Feel free to specify as many udp_send_channels as you like. Gmond
used to only support having a single channel */
udp_send_channel {
#bind_hostname = yes # Highly recommended, soon to be default.
# This option tells gmond to use a source address
# that resolves to the machine's hostname. Without
# this, the metrics may appear to come from any
# interface and the DNS names associated with
# those IPs will be used to create the RRDs.
#mcast_join = 239.2.11.71
host = 172.30.0.81
port = 8649
#ttl = 1
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
#mcast_join = 239.2.11.71
port = 8649
#bind = 239.2.11.71
#retry_bind = true
# Size of the UDP buffer. If you are handling lots of metrics you really
# should bump it up to e.g. 10MB or even higher.
# buffer = 10485760
}

Minimum user id error while submitting mapreduce job

This is a small blog post to help you solve the minimum user id error when submitting a MapReduce job in Hadoop.


Error:

Application application_XXXXXXXXX_XXXX failed 2 times due to AM Container for appattempt_ XXXXXXXXX_XXXX _XXXXXX exited with exitCode: -1000
For more detailed output, check application tracking page:http://<your RM host>:8088/proxy/application_ XXXXXXXXX_XXXX /Then, click on links to logs of each attempt.
Diagnostics: Application application_ XXXXXXXXX_XXXX initialization failed (exitCode=255) with output: Requested user hive is not whitelisted and has id 501, which is below the minimum allowed 1000
Failing this attempt. Failing the application.

This means that there is a property for the minimum allowed UID value which has been set to 1000. In the above example, the hive user has UID 501, which is less than 1000.

To solve this, we either need to:

Update UID of user hive to a unique value greater than or equal to 1000

OR

Update the property value to 500 so that hive user UID meets the minimum value.

We will go with option 2 here, i.e. update the property value to 500 so that the hive user’s UID meets the minimum value.

If you are using Ambari to manage your Hortonworks cluster, then:

1. Login to Ambari UI with user having privileges to edit configurations.

2. Navigate to YARN configurations.

3. Go to Advanced yarn-env.sh

4. Update “Minimum user ID for submitting job” to 500

Resubmit your job now and it should just run fine!

Similarly if you are using Cloudera Manager, find the same property in YARN configurations and update it.

If you are not using any of the management UIs, then you can try finding this property in the directory where your Hadoop conf files are located; usually you will find them in /etc/hadoop/conf.
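
If the cluster uses the LinuxContainerExecutor, the limit is typically the min.user.id entry in container-executor.cfg; a sketch (values are examples):

# /etc/hadoop/conf/container-executor.cfg (path may differ per distribution)
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin
min.user.id=500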

Tune Hadoop Cluster to get Maximum Performance (Part 2)

In the previous part we saw how we can tune our operating system to get maximum performance for Hadoop; in this article I will focus on how to tune the Hadoop cluster itself to get a performance boost at the Hadoop level.

mapreduce.map.java.opts / mapreduce.reduce.java.opts

The heapsize of the JVM (-Xmx) for the mapper or reducer task.

This value should always be lower than mapreduce.[map|reduce].memory.mb.

yarn.app.mapreduce.am.resource.mb

The amount of memory for the ApplicationMaster container.

yarn.app.mapreduce.am.command-opts

The heapsize (-Xmx) for the ApplicationMaster.
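
As an illustration, a mapred-site.xml sketch with hypothetical sizes, keeping each heap (-Xmx) at roughly 80% of its container memory:

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1638m</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx3276m</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>2048</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>-Xmx1638m</value>
</property>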

yarn.nodemanager.resource.cpu-vcores

The number of cores that a node manager can allocate to containers is controlled by the yarn.nodemanager.resource.cpu-vcores property. It should be set to the total number of cores on the machine, minus a core for each daemon process running on the machine (datanode, node manager, and any other long-running processes).

mapreduce.task.io.sort.mb

Default value – 100MB

This is a very important property to tune. When a map task is in progress, it writes output into a circular in-memory buffer. The size of this buffer is fixed and determined by the mapreduce.task.io.sort.mb property.

When this circular in-memory buffer fills up to the spill threshold (mapreduce.map.sort.spill.percent, 80% by default), spilling to disk starts (in parallel, using a separate thread). Note that if the spilling thread is too slow and the buffer becomes 100% full, the map cannot proceed and has to wait.

io.file.buffer.size

Hadoop uses a buffer size of 4KB by default for its I/O operations; we can increase it to 128KB to get better performance. This value can be increased by setting io.file.buffer.size=131072 (value in bytes) in core-site.xml.

dfs.client.read.shortcircuit

Short-circuit reads – When reading a file from HDFS, the client contacts the datanode and the data is sent to the client via a TCP connection. If the block being read is on the same node as the client, then it is more efficient for the client to bypass the network and read the block data directly from the disk.

We can enable short-circuit reads by setting this property to “true”
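
Short-circuit reads also need a UNIX domain socket shared between the DataNode and its clients, so a typical hdfs-site.xml sketch looks like this (the socket path is an example):

<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <!-- domain socket path shared by the DataNode and local clients -->
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>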

mapreduce.task.io.sort.factor

Default value is 10.

Now imagine a running map task: each time the memory buffer reaches the spill threshold, a new spill file is created, so after the map task has written its last output record there could be several spill files. Before the task finishes, the spill files are merged into a single partitioned and sorted output file.

The configuration property mapreduce.task.io.sort.factor controls the maximum number of streams to merge at once.

mapreduce.reduce.shuffle.parallelcopies

Default value is 5

The map output file is sitting on the local disk of the machine that ran the map task.

The map tasks may finish at different times, so the reduce task starts copying their outputs as soon as each completes.

The reduce task has a small number of copier threads so that it can fetch map outputs in parallel.

The default is five threads, but this number can be changed by setting the mapreduce.reduce.shuffle.parallelcopies property.

I tried my best to cover as much as I can; there are plenty of things you can do for tuning! I hope this article was helpful to you. What I recommend is to try tuning the above properties while considering the total available memory, the total number of cores, etc., and then run benchmarking tools such as TeraGen and TeraSort to get results; keep tuning until you get the best out of it!

Tune Hadoop Cluster to get Maximum Performance (Part 1)

I have been working on production Hadoop clusters for a while and have learned many performance tuning tips and tricks. In this blog I will explain how to tune a Hadoop cluster to get maximum performance. Just installing Hadoop for a production cluster or a development POC does not give the expected results, because the default Hadoop configuration settings are chosen with minimal hardware in mind. It is the responsibility of the Hadoop administrator to understand the hardware specs: the amount of RAM, the total number of CPU cores, physical vs. virtual cores, whether hyper-threading is supported by the processor, the NIC cards, the number of disks mounted on the DataNodes, etc.

For Better Understanding I have divided this blog into two main parts.

1. Tune your Hadoop Cluster to get Maximum Performance (Part 1) – In this part I will explain how to tune your operating system in order to get maximum performance for your Hadoop jobs.

2. Tune your Hadoop Cluster to get Maximum Performance (Part 2) – In this part I will explain how to modify your Hadoop configurations parameters so that it should use your hardware very efficiently.

How will OS tuning improve the performance of Hadoop?

Let’s get started and see what parameters we need to change on OS level.

1. Turn off the Power savings option in BIOS:

This will increase the overall system performance and Hadoop performance. You can go to your BIOS settings and change the mode from power saving to performance-optimized (this option may be named differently on your server depending on the vendor). If a remote console command line is available, you can use racadm commands to check the status and update it. You need to restart the system for the change to take effect.

2. Open file handles and files:

By default the open file limit is 1024 for each user, and if you keep the default you may face java.io.FileNotFoundException: (Too many open files) and your job will fail. To avoid this scenario, set the open file limit to a higher number like 32832, or to unlimited.

Commands:

ulimit -S 4096
ulimit -H 32832

Also, please set the system-wide file descriptor limit using the command below:

sysctl -w fs.file-max=6544018

The above kernel variable is set only for the running session, so we need to make it permanent by adding it to /etc/sysctl.conf. Just edit /etc/sysctl.conf and add the value below at the end of it:

fs.file-max=6544018

3. FileSystem Type & Reserved Space:

To get maximum performance for your Hadoop jobs, I personally suggest using the ext4 filesystem, as it has some advantages over ext3 such as multi-block and delayed allocation. How you mount your filesystem also makes a difference, because if you mount it with the default options there will be excessive writes to update file and directory access times, which we do not need in the case of Hadoop. Mounting your local disks with the noatime option will surely improve performance by disabling those excessive and unnecessary writes to disk.

Note – the noatime option also covers nodiratime, so there is no need to mention that separately.

Many of you must be aware that after formatting a partition with ext4, 5% of the space is reserved for special situations, for example so that root can still delete files when the disk is 100% full. In the case of Hadoop we don’t need to reserve that 5% space, so please remove it using the tune2fs command.

Command:

tune2fs -m 0 /dev/sdXY

Note – 0 indicates that 0% space is reserved.

4. Network Parameters Tuning:

Network parameter tuning also helps to get a performance boost! This is kind of risky stuff, because if you are working on a remote server and you make a mistake while updating the network parameters, you can lose connectivity and may not be able to connect to that server again until you correct the configuration mistake via an IPMI/iDRAC/iLO console. Modify these parameters only when you know what you are doing.

Modifying net.core.somaxconn to 1024 from the default value of 128 helps Hadoop because it increases the listen queue between the master and slave services, so ultimately the number of connections between masters and slaves can be higher than before.

Command to modify net.core.somaxconn:

sysctl -w net.core.somaxconn=1024

To make the above change permanent, simply add the variable value below at the end of /etc/sysctl.conf:

net.core.somaxconn=1024

MTU Settings:

Maximum Transmission Unit. This value indicates the maximum size of a packet/frame that can be sent over the network interface. By default the MTU is set to 1500, and you can tune it to 9000; frames larger than the default size are called jumbo frames.

Command to change value of MTU:

You need to add MTU=9000 in /etc/sysconfig/network-scripts/ifcfg-eth0 (or whatever your Ethernet device name is). Restart the network service for this change to take effect.

Note – Before modifying this value, please make sure that all the nodes in your cluster, including switches, support jumbo frames; if not, then *PLEASE DO NOT ATTEMPT THIS*.

5. Transparent Huge Page Compaction:

This Linux feature can be really helpful in getting better performance for applications, including Hadoop workloads; however, one sub-part of Transparent Huge Pages, called compaction, causes issues with Hadoop jobs (it causes high processor usage while defragmenting memory). When I was benchmarking a client’s cluster I observed fluctuations of ~15% in the output, and when I disabled compaction the fluctuation was gone. So I recommend disabling it for Hadoop.

Command:

echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

In order to make above change permanent, please add below script in your /etc/rc.local file.

if test -f /sys/kernel/mm/redhat_transparent_hugepage/defrag; then echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag ;fi

6. Memory Swapping:

For Hadoop, swapping reduces job performance; you should keep as much data in memory as possible and tune the OS so that it swaps only in an OOM (OutOfMemory) situation. To do so, set the vm.swappiness kernel parameter to 0.

Command:

sysctl -w vm.swappiness=0

Please add below variable in /etc/sysctl.conf to make it persistent.

vm.swappiness=0

I hope this information helps someone who is looking for OS-level tuning parameters for Hadoop. Please don’t forget to give your feedback via comments, or ask questions if you have any.
Thank you! I will publish the second part next week.