Java requires third-party and user-defined classes to be on the JVM's classpath, typically supplied through the "-classpath" command-line option when the JVM is launched.
MapReduce jobs are executed in separate JVMs on the TaskTracker nodes, and sometimes you need to use third-party libraries in the map/reduce task attempts.
A ClassNotFoundException occurs when a required library is not found on the classpath of a node running a map/reduce task.
Below are the different ways to avoid ClassNotFoundException in Hadoop.
1. Include the JAR in the "-libjars" command-line option of the "hadoop jar ..." command
The JAR will be placed in the distributed cache and made available to all of the job's task attempts.
More specifically, you will find the JAR in one of the ${mapred.local.dir}/taskTracker/archive/${user.name}/distcache/… subdirectories on the local nodes.
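For example, a submission could look like this (the JAR names and paths here are hypothetical; note that "-libjars" is a generic option, so it only takes effect if your driver class parses its arguments with GenericOptionsParser, typically by implementing Tool and running through ToolRunner):

hadoop jar my-job.jar com.example.MyJob \
  -libjars /local/path/to/opencsv.jar,/local/path/to/guava.jar \
  /input /output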
2. Include the referenced JAR in the lib subdirectory of the submittable JAR.
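As a sketch of how such a JAR can be assembled with the standard jar tool (the directory names and the dependency JAR are hypothetical):

mkdir -p build/lib
cp -r classes/* build/
cp /local/path/to/opencsv.jar build/lib/
jar -cvf my-job.jar -C build .

When the job runs, Hadoop unpacks the job JAR on the task nodes and adds everything under its lib subdirectory to the task classpath.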
Points 1 & 2 are preferred when the JARs
- are small,
- change often, and
- are job-specific.
3. Install the JAR on the cluster nodes.
The easiest way is to place the JAR into the $HADOOP_HOME/lib directory, as everything in this directory is added to the classpath when a Hadoop daemon starts.
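A minimal sketch, assuming passwordless SSH to the workers and a file called nodes that lists their hostnames (both hypothetical):

for host in $(cat nodes); do
  scp /local/path/to/opencsv.jar ${host}:${HADOOP_HOME}/lib/
done

Remember that the TaskTrackers pick up the new classpath only after they are restarted.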
The same guiding principles apply to native code libraries that need to be run on the nodes (JNI or C++ pipes).
- You can put them into the distributed cache with the "-files" option,
- include them in archive files specified with the "-archives" option, or
- install them on the cluster nodes.
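For instance, a native library could be shipped through the distributed cache like this (the library name and paths are hypothetical); the file is then symlinked into each task attempt's working directory:

hadoop jar my-job.jar com.example.MyJob \
  -files /local/path/to/libmynative.so \
  /input /output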
If the dynamic library linker is configured properly, the native code should be available to your task attempts. You can also modify the environment of the job's running task attempts explicitly by setting the JAVA_LIBRARY_PATH or LD_LIBRARY_PATH variables:
hadoop jar <your jar> [main class] \
  -D mapred.child.env="LD_LIBRARY_PATH=/path/to/your/libs" ...