Controlling Hadoop Jobs using Oozie Cognitive Class Exam Answers

by IndiaSuccessStories

Introduction to Controlling Hadoop Jobs using Oozie

Oozie is a workflow scheduler system to manage Apache Hadoop jobs. It allows you to define a workflow of dependent jobs, which can be Hadoop MapReduce jobs, Hive queries, Pig scripts, and others, and schedule their execution. Here’s a basic introduction to controlling Hadoop jobs using Oozie:

1. Workflow Definition

Oozie workflows are defined in XML, using the Hadoop Process Definition Language (hPDL), in a file named workflow.xml. A workflow is a directed graph of actions connected by transitions, and each action represents a Hadoop job or another task to be executed. Example actions include:

  • Hadoop MapReduce Action: Executes a MapReduce job.
  • Pig Action: Executes a Pig script.
  • Hive Action: Executes a Hive query.
  • Shell Action: Executes a shell script.
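As a rough illustration of such a definition, the sketch below embeds a small hPDL workflow in a Python string and lists its actions using only the standard library. The workflow name, node names, the script name clean.pig, and the ${...} variables are hypothetical placeholders, and the schema version (0.5) is an assumption that may differ on your cluster.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow: node names, the Pig script name, and the ${...}
# variables are placeholders, not values taken from the course material.
WORKFLOW_XML = """
<workflow-app xmlns="uri:oozie:workflow:0.5" name="demo-wf">
    <start to="mr-node"/>
    <action name="mr-node">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
        </map-reduce>
        <ok to="pig-node"/>
        <error to="fail"/>
    </action>
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>clean.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Failed at ${wf:lastErrorNode()}</message>
    </kill>
    <end name="end"/>
</workflow-app>
"""

# Namespace of the workflow schema (version assumed for this sketch).
NS = {"wf": "uri:oozie:workflow:0.5"}


def list_actions(xml_text):
    """Return (action name, action type) pairs in document order."""
    root = ET.fromstring(xml_text)
    actions = []
    for action in root.findall("wf:action", NS):
        # The first child of <action> is the element naming the action type.
        action_type = action[0].tag.split("}", 1)[1]
        actions.append((action.get("name"), action_type))
    return actions


print(list_actions(WORKFLOW_XML))
# [('mr-node', 'map-reduce'), ('pig-node', 'pig')]
```

Each action declares an ok and an error transition, which is how the sequence of actions is expressed in the XML.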

2. Workflow Control Nodes

An Oozie workflow is built from two kinds of nodes: control nodes, which define the workflow structure and execution path, and action nodes, which perform the work:

  • Start Node: Entry point of the workflow.
  • End Node: Marks the end of the workflow.
  • Kill Node: Ends the workflow in an error state rather than a successful one.
  • Action Nodes: Define individual actions to be executed. These nodes specify what action to execute and any input/output specifications.
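  • Decision Nodes: Select one of several alternative execution paths by evaluating predicates, written as EL expressions, in a switch-case structure.
  • Fork and Join Nodes: Enable parallel execution of actions. A fork splits one path into several concurrent paths, and the matching join waits until all forked paths have completed.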
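The node types above can be seen together in a small example. The sketch below is illustrative only: the workflow, its node names, and the variables inputDir and outputDir are hypothetical, and the schema version is assumed. It parses the XML and builds a map from each node to the nodes it can transition to.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow combining start, decision, fork, join, kill, and end.
CONTROL_XML = """
<workflow-app xmlns="uri:oozie:workflow:0.5" name="control-demo">
    <start to="check-input"/>
    <decision name="check-input">
        <switch>
            <case to="split">${fs:exists(inputDir)}</case>
            <default to="end"/>
        </switch>
    </decision>
    <fork name="split">
        <path start="branch-a"/>
        <path start="branch-b"/>
    </fork>
    <action name="branch-a">
        <fs><mkdir path="${outputDir}/a"/></fs>
        <ok to="merge"/>
        <error to="fail"/>
    </action>
    <action name="branch-b">
        <fs><mkdir path="${outputDir}/b"/></fs>
        <ok to="merge"/>
        <error to="fail"/>
    </action>
    <join name="merge" to="end"/>
    <kill name="fail">
        <message>Failed at ${wf:lastErrorNode()}</message>
    </kill>
    <end name="end"/>
</workflow-app>
"""

NS = {"wf": "uri:oozie:workflow:0.5"}  # schema version assumed


def transitions(xml_text):
    """Map each node name to the list of nodes it can transition to."""
    root = ET.fromstring(xml_text)
    graph = {}
    for node in root:
        kind = node.tag.split("}", 1)[1]
        name = node.get("name", kind)  # <start> carries no name attribute
        if kind in ("start", "join"):
            graph[name] = [node.get("to")]
        elif kind == "decision":
            # <case> and <default> both carry a "to" attribute.
            graph[name] = [branch.get("to") for branch in node.find("wf:switch", NS)]
        elif kind == "fork":
            graph[name] = [p.get("start") for p in node.findall("wf:path", NS)]
        elif kind == "action":
            graph[name] = [node.find("wf:ok", NS).get("to"),
                           node.find("wf:error", NS).get("to")]
        else:  # kill and end are terminal
            graph[name] = []
    return graph


for name, targets in transitions(CONTROL_XML).items():
    print(name, "->", targets)
```

Both forked branches point at the same join node, which is what lets the join wait for all of them before the workflow moves on.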

3. Workflow Configuration

Each action in an Oozie workflow requires configuration specific to its type. For example, a MapReduce action specifies the job's JAR file, input and output paths, and other job-specific properties. This configuration lives in the XML workflow definition, and its values can be parameterized with variables that are set in job.properties.
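To illustrate how variables in a properties file relate to the ${...} placeholders used in the XML, the sketch below parses a hypothetical job.properties and substitutes simple variables. This is a simplified stand-in written for illustration, not Oozie's actual EL engine; the host names, ports, and paths are assumed placeholders.

```python
import re

# Hypothetical job.properties content; hosts, ports, and paths are placeholders.
JOB_PROPERTIES = """\
# cluster endpoints
nameNode=hdfs://localhost:8020
jobTracker=localhost:8032
inputDir=/user/demo/input
outputDir=/user/demo/output
oozie.wf.application.path=${nameNode}/user/demo/apps/demo-wf
"""


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve(value, props):
    """Replace ${name} with props[name]; unknown names are left untouched."""
    return re.sub(r"\$\{([\w.]+)\}",
                  lambda m: props.get(m.group(1), m.group(0)),
                  value)


props = parse_properties(JOB_PROPERTIES)
print(resolve(props["oozie.wf.application.path"], props))
print(resolve("${nameNode}${inputDir}", props))
```

The property oozie.wf.application.path tells Oozie where the workflow application lives in HDFS, while the remaining entries supply values for variables referenced in workflow.xml.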

4. Coordinator and Bundle Applications

  • Coordinator Application: Allows you to schedule and manage recurrent workflows based on time and data availability.
  • Bundle Application: Groups a set of coordinator applications so that they can be started, stopped, suspended, and resumed as a unit.
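A coordinator definition that combines a time-based schedule with a data dependency might look like the sketch below. Everything specific in it is an assumption made for illustration: the application name, dates, dataset name, URI template, the variable workflowAppPath, and the schema version (0.4). The Python code only parses the XML and summarizes it.

```python
import xml.etree.ElementTree as ET

# Hypothetical coordinator: runs daily and waits for that day's dataset.
COORDINATOR_XML = """
<coordinator-app xmlns="uri:oozie:coordinator:0.4" name="daily-demo"
                 frequency="${coord:days(1)}"
                 start="2024-01-01T00:00Z" end="2024-12-31T00:00Z"
                 timezone="UTC">
    <datasets>
        <dataset name="logs" frequency="${coord:days(1)}"
                 initial-instance="2024-01-01T00:00Z" timezone="UTC">
            <uri-template>${nameNode}/data/logs/${YEAR}/${MONTH}/${DAY}</uri-template>
            <done-flag>_SUCCESS</done-flag>
        </dataset>
    </datasets>
    <input-events>
        <data-in name="input" dataset="logs">
            <instance>${coord:current(0)}</instance>
        </data-in>
    </input-events>
    <action>
        <workflow>
            <app-path>${workflowAppPath}</app-path>
            <configuration>
                <property>
                    <name>inputDir</name>
                    <value>${coord:dataIn('input')}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
"""

NS = {"c": "uri:oozie:coordinator:0.4"}  # schema version assumed


def summarize(xml_text):
    """Collect the scheduling attributes and dataset names of a coordinator."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "frequency": root.get("frequency"),
        "timezone": root.get("timezone"),
        "datasets": [d.get("name") for d in root.findall("c:datasets/c:dataset", NS)],
        "workflow": root.findtext("c:action/c:workflow/c:app-path", namespaces=NS),
    }


for key, value in summarize(COORDINATOR_XML).items():
    print(key, "=", value)
```

The frequency attribute covers the time-based trigger, while the dataset and input-events elements describe the data whose arrival the coordinator waits for before starting the workflow.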

5. Workflow Execution

Once a workflow is defined and configured, you submit it to Oozie for execution. Oozie runs the actions in the order that the workflow definition dictates, detects the completion of each action through callbacks and polling, and can retry actions that fail.

6. Oozie CLI and Web UI

  • Oozie CLI: Provides command-line tools (oozie command) to interact with Oozie, submit workflows, check status, and manage jobs.
  • Oozie Web UI: Web-based interface to monitor workflows, view job history, and manage coordinator and bundle applications.
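The sketch below assembles a few common invocations of the oozie command as argument lists. It only builds and prints the commands and does not run them. The server URL and the job ID are assumed placeholders; substitute the values for your own installation.

```python
import shlex

# Assumed values: adjust the server URL and job ID for your installation.
OOZIE_URL = "http://localhost:11000/oozie"
JOB_ID = "0000000-000000000000000-oozie-oozi-W"  # hypothetical job ID


def oozie_job_command(*args, oozie_url=OOZIE_URL):
    """Build an 'oozie job' command line as a list of arguments."""
    return ["oozie", "job", "-oozie", oozie_url, *args]


submit = oozie_job_command("-config", "job.properties", "-run")  # submit and start
info = oozie_job_command("-info", JOB_ID)                        # check status
kill = oozie_job_command("-kill", JOB_ID)                        # stop the job

for command in (submit, info, kill):
    print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it could later be passed to subprocess.run without shell quoting issues.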

7. Error Handling and Notifications

Oozie provides mechanisms for error handling and notifications:

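  • Actions can specify retry policies, for example through the retry-max and retry-interval attributes of an action node.
  • Each action declares an error transition, which typically leads to a kill node or to a clean-up or notification step.
  • You can send email notifications on job completion or failure, for example with the email action.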
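A workflow that retries an action and sends an email when it ultimately fails could be sketched as follows. The node names, the address ops@example.com, the retry values, and the schema versions are all assumptions chosen for illustration. The unit of retry-interval is minutes according to the Oozie user-retry documentation, which is worth verifying for your version.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow: "load" is retried, then failure triggers an email.
RETRY_XML = """
<workflow-app xmlns="uri:oozie:workflow:0.5" name="retry-demo">
    <start to="load"/>
    <action name="load" retry-max="3" retry-interval="1">
        <fs>
            <mkdir path="${outputDir}"/>
        </fs>
        <ok to="end"/>
        <error to="notify"/>
    </action>
    <action name="notify">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>ops@example.com</to>
            <subject>Workflow ${wf:id()} failed</subject>
            <body>Failed node: ${wf:lastErrorNode()}</body>
        </email>
        <ok to="fail"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed at ${wf:lastErrorNode()}</message>
    </kill>
    <end name="end"/>
</workflow-app>
"""

NS = {"wf": "uri:oozie:workflow:0.5"}  # schema version assumed


def retry_settings(xml_text):
    """Return {action name: (retry-max, retry-interval)} as written in the XML."""
    root = ET.fromstring(xml_text)
    return {
        action.get("name"): (action.get("retry-max"), action.get("retry-interval"))
        for action in root.findall("wf:action", NS)
        if "retry-max" in action.attrib
    }


def error_targets(xml_text):
    """Return {action name: node reached when the action fails}."""
    root = ET.fromstring(xml_text)
    return {
        action.get("name"): action.find("wf:error", NS).get("to")
        for action in root.findall("wf:action", NS)
    }


print(retry_settings(RETRY_XML))
print(error_targets(RETRY_XML))
```

Routing the error transition through the email action before the kill node means the notification is sent only after the retries are exhausted. The email action also depends on SMTP settings in the Oozie server configuration, which are outside the scope of this sketch.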

Conclusion

Oozie provides a powerful framework for managing and coordinating Hadoop jobs through workflows. It simplifies the task of scheduling and monitoring complex job dependencies and executions. By defining workflows and coordinating actions, Oozie helps in efficiently managing data processing tasks in a Hadoop environment.

Controlling Hadoop Jobs using Oozie Cognitive Class Certification Answers

Question 1: Oozie definitions written in the Hadoop Process Definition Language (hPDL) are encoded in which of the following files?

  • workflow.txt
  • workflow.html
  • workflow.json
  • workflow.xml

Question 2: Oozie detects job completion via callback and polling. True or false?

  • False
  • True

Question 3: The Oozie expression language (EL) provides access to all of the following except

  • error codes
  • workflow job size
  • application name
  • workflow job id

Question 1: Which of the following can trigger the start of an Oozie job?

  • The Oozie CLI
  • Data
  • An application call to the API
  • Time
  • All of the above

Question 2: The Oozie coordinator works with Central European Time (CET). True or false?

  • False
  • True

Question 3: The Coordinator Job uses all of the following files except

  • job.properties
  • coord-config-default.xml
  • coordinator.properties
  • coordinator.xml

Question 1: Which of the following statements about the BigInsights Workflow Editor is correct?

  • It displays a read-only diagram to show the overall workflow
  • It runs in an Eclipse environment
  • It supports complex Oozie workflows without requiring knowledge of the Oozie XML XSD schema
  • It’s a new feature, and it was introduced to BigInsights in version 2.0
  • All of the above

Question 2: You can use the BigInsights Workflow Publishing Wizard as a graphical tool to create and modify a workflow.xml file. True or false?

  • False
  • True

Question 3: Which of the following statements is NOT correct?

  • The InfoSphere BigInsights Tool for Eclipse is essentially an Eclipse module with BigInsights add-ins.
  • At a higher level, we can link multiple applications to run in sequence.
  • We cannot build sub-workflows in a workflow.
  • Deployed applications can be scheduled.

Question 1: What is the primary purpose of Oozie in the Hadoop architecture?

  • To provide logging support for Hadoop jobs
  • To support the execution of workflows consisting of a collection of actions
  • To support SQL access to relational data stored in Hadoop
  • To move data into HDFS

Question 2: How are Oozie workflows defined?

  • Using the Java programming language
  • Using JSON
  • Using a plain text file that defines the graph elements
  • Using hPDL

Question 3: Control nodes in an Oozie Workflow can contain all of the following except

  • Start
  • Fork
  • Pig
  • End
  • Kill

Question 4: A workflow job can be executed from

  • A Java API
  • A Web-server API
  • The command line
  • All of the Above

Question 5: Where do the workflow.xml, config-default.xml, JAR, and .so files need to be stored prior to Oozie workflow job execution?

  • On a web-server
  • In HDFS within a defined directory structure
  • On the local file system where you are executing the job
  • None of the above

Question 6: What is the purpose of the Oozie Coordinator?

  • To invoke workflows when some external event occurs
  • To invoke workflows when data becomes available
  • To invoke workflows at regular intervals
  • All of the above

Question 7: Which of the following need to be stored in HDFS?

  • coordinator.xml only
  • coord-config-default.xml only
  • coordinator.properties only
  • coordinator.xml and coord-config-default.xml only
  • coordinator.xml and coordinator.properties only

Question 8: The Oozie coordinator can be executed from

  • A Java API
  • A Web-server API
  • The command line
  • All of the Above

Question 9: How is an Oozie coordinator configured?

  • Using the Java programming language
  • Using JSON
  • Using a plain text file that defines the workflow schedule
  • Using XML

Question 10: By defining a dataset template as part of the coordinator.xml file, you can use the coordinator to trigger a workflow when an updated dataset has arrived in HDFS. True or false?

  • True
  • False

Question 11: coordinator.properties can be used to establish

  • values for variables used in workflow.xml
  • values for variables used in coordinator.xml
  • the location of the coordinator job in HDFS
  • All of the above

Question 12: job.properties can be used to establish

  • The location of the workflow job in HDFS, only
  • Values for variables used in workflow.xml, only
  • The actions to perform at each stage of the workflow, only
  • Values for variables used in workflow.xml, and the actions to perform at each stage of the workflow
  • The location of the workflow job in HDFS, and values for variables used in workflow.xml

Question 13: The kill node is used to indicate a successful completion of the Oozie workflow. True or false?

  • True
  • False

Question 14: The join node in an Oozie workflow will wait until all forked paths have completed. True or false?

  • True
  • False

Question 15: Decision nodes can be used to select from multiple alternative paths through an Oozie workflow. True or false?

  • True
  • False
