ADM 202 - Configure a MapR Cluster (MapR v6)
Lesson 4: Configure Users and Cluster Parameters

Transcript

Recall from the previous lesson what a high-performing cluster looks like. It does not mean running expensive hardware; more importantly, it means the cluster runs smoothly and efficiently. Ideally, memory should be at up to 98% of capacity, the CPU should be running at full capacity, and there should be zero swapping between memory and disk while jobs run.

The performance of a MapR cluster can be managed and evaluated through the MCS Dashboard, which gives administrators a single place for configuring, monitoring, and managing MapR clusters. It can be accessed using the syntax shown at the bottom of the screen.

The MCS can configure and monitor many different things, but we will focus only on the items of specific interest for measuring and obtaining high performance. For a more comprehensive discussion of the MCS, you can revisit the Admin 2000 curriculum or review the online documentation.

Two major features exposed by the MCS dramatically simplify cluster administration: MapR Heatmaps and YARN Metrics.

MapR Heatmaps give administrators an instant read on the health of the nodes in the cluster, providing easy-to-interpret icons that represent the state of each node. Heatmaps can show the overall health of a cluster, or more detailed node properties such as CPU, disk, or memory utilization.
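To make the performance guidelines from the transcript concrete, here is a minimal Python sketch that checks one node's utilization sample against them. It is an illustration only and not part of the MCS or any MapR API: the `NodeSample` type, the `guideline_findings` function, and the 90% CPU floor are all assumptions introduced here. Only the 98% memory ceiling and the zero-swap goal come from the transcript.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeSample:
    """Hypothetical utilization snapshot for one node (not a MapR type)."""
    name: str
    memory_pct: float    # memory in use, as a percentage of capacity
    cpu_pct: float       # CPU in use, as a percentage of capacity
    swap_used_mb: float  # swap currently in use, in megabytes


MEMORY_CEILING_PCT = 98.0  # from the transcript: memory up to 98% of capacity
CPU_BUSY_FLOOR_PCT = 90.0  # assumption: treat 90% or more as "maxed out"


def guideline_findings(sample: NodeSample) -> List[str]:
    """Return the guidelines this sample misses; an empty list means none."""
    findings = []
    if sample.memory_pct > MEMORY_CEILING_PCT:
        findings.append("memory above 98% of capacity")
    if sample.cpu_pct < CPU_BUSY_FLOOR_PCT:
        findings.append("CPU not fully utilized")
    if sample.swap_used_mb > 0:
        findings.append("swap in use")
    return findings


# Toy values, invented for illustration.
healthy = NodeSample("node-a", memory_pct=95.0, cpu_pct=99.0, swap_used_mb=0.0)
swapping = NodeSample("node-b", memory_pct=99.5, cpu_pct=40.0, swap_used_mb=512.0)

print(guideline_findings(healthy))
print(guideline_findings(swapping))
```

In practice the numbers would come from whatever monitoring source you already use; the point is only that each guideline maps to a simple threshold check.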

2.1.1: Tabs

Click the tabs
  • 1. Definition
  • 2. How it Works
  • 3. Pros and Cons
  • Uses labeled training data
  • Learns relationships between given inputs to a given output
  • Desired output must be part of the labeled data, to be known
  1. Load labeled input data
  2. Train model on the data: the connection between input variables and output is made
  3. Apply new data to algorithm
  4. Provides output
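The four steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a method named in the course: it uses a nearest-centroid classifier chosen here for brevity, and the `train` and `predict` functions, the sample values, and the labels are all invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def train(samples: Sequence[Tuple[Point, str]]) -> Dict[str, Point]:
    """Step 2: learn one centroid (mean point) per label from labeled samples."""
    grouped: Dict[str, List[Point]] = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(points) for column in zip(*points))
        for label, points in grouped.items()
    }


def predict(model: Dict[str, Point], features: Point) -> str:
    """Steps 3 and 4: apply new data to the model and return the output label."""
    def squared_distance(centroid: Point) -> float:
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    # The predicted label is the one whose centroid is closest to the input.
    return min(model, key=lambda label: squared_distance(model[label]))


# Step 1: load labeled input data (toy values, invented for illustration).
labeled = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((8.0, 9.0), "high"),
    ((9.0, 8.0), "high"),
]

model = train(labeled)
print(predict(model, (2.0, 1.0)))
```

Note that the model can only ever return a label that appeared in the training data, which is the point of the third bullet under Definition: the desired output must be part of the labeled data.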

2.1.2: Flow Steps

Click the numbers
Performance Guidelines

2.1.3: Accordion

Click the plus signs

1. Clickstream Data

Click files:
Large files of nested data

Log files:
Flat text and stored in folders saved by year, month, day

2. MapR-DB Table

3. Parquet File

4. JSON File
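As an illustration of the log-file layout described under Clickstream Data (folders saved by year, month, day), here is a small Python sketch that builds the folder path for a given date. The `log_dir_for` function, the `/logs` root, and the zero-padded `YYYY/MM/DD` naming are assumptions made for this example; the course does not specify the exact folder names.

```python
from datetime import date
from pathlib import PurePosixPath


def log_dir_for(root: str, day: date) -> PurePosixPath:
    """Build a year/month/day folder path under root (assumed zero-padded)."""
    return (
        PurePosixPath(root)
        / f"{day.year:04d}"
        / f"{day.month:02d}"
        / f"{day.day:02d}"
    )


# Hypothetical root directory, used only for illustration.
print(log_dir_for("/logs", date(2018, 3, 7)))
```

A date-partitioned layout like this lets a job read a single day, month, or year of logs by pointing at the matching folder.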

2.1.4: Knowledge Check

Click the answer

What is the output of a custom function?

A. Multiple rows
B. A SQL data type
C. A string only
D. A single row

Correct answer: D. A single row

No matter which interface you implement for your function, the single or multiple interface, the output is a single row.