Anyone who has used Hadoop knows that jobs sometimes get stuck. Hadoop is powerful and innovating at a tremendous rate, but it still has many rough edges. As Hadoop practitioners, we all spend a lot of effort smoothing over those rough edges to keep Hadoop and Hadoop jobs running well for our customers and organizations. In this class, we will look at a typical problem encountered by a Hadoop user and discuss its implications for the future of Hadoop development. We will also walk through the solution to this kind of problem, with step-by-step instructions and the specific code we used to identify the issue.
One of the instructor's customers asked why his Hive job seemed to be stuck. The job had been running for more than a day, with 43 out of 44 mappers successfully completed. To figure out what was happening, he used the Hadoop 2 Resource Manager user interface, which provides status and diagnostic information about jobs running on Hadoop. Stepping back from the details, Hadoop did provide the information required to understand the problem with this Hive job. On the negative side, the job kept launching new map attempts after it hit an error that should have been terminal. To make matters worse, the relevant error message was hard to spot even on the right page, which itself was buried five levels deep in the user interface.
As a community, we need to work together to improve this kind of experience for our industry. Now that Hadoop 2 has shipped, we believe the Hadoop community will be able to focus its energies on rounding off rough edges like these, and this class will give advanced users tools and strategies for identifying issues with jobs and keeping them running smoothly. Attend this class to:
- Learn techniques for identifying the source of specific problems within Hive jobs
- Join a broad discussion of the parts of Hadoop that cause the most issues for data scientists
- See how to use the Hadoop 2 Resource Manager user interface to get the information you need to solve specific issues
- Discuss the next steps we must take as an industry to make Hadoop easier for everyone