DEV 350 - MapR Streams Essentials
Lesson 1: Introduction to MapR Streams

Summarize the Motivation Behind MapR Streams

1.1.1: Why MapR Streams?

Many big data sources are event-oriented.

Examples of events: stock price changes, user activity, sensor readings.

Today’s applications need to process high-velocity data as soon as possible
and handle high-volume workloads.

Today’s applications need to provide real-time results with extremely low latency.

 

1.1.2: Examples of Diverse Data Sources

What is a possible example for each data source?

Social Media Feeds: real-time social media alerts

Financial Transactions: real-time fraud alerts

Geotagging Data: real-time navigation information

Internet of Things: real-time sensor data

Web Browser Clickstreams: real-time advertising

Application Metrics: real-time event analytics

1.1.3: Analyze Data

What if you need to analyze data as it arrives?

Batch processing cannot answer "What is happening right now?"

Analyzing data as it arrives requires linking several distributed applications together in real time.

1.1.4: Organize Data

What if you need to organize data as it arrives?

Integrating data sources can be complicated

Topics organize events into categories

Producers publish to topics

Consumers subscribe to topics
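
The relationship between topics, producers, and consumers can be sketched with a small in-memory model. This is only an illustration of the publish/subscribe idea, not the MapR Streams API: the `ToyBroker` class, the topic names, and the sample events are all hypothetical.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a messaging system (hypothetical, not MapR Streams)."""

    def __init__(self):
        # Maps each topic name to the callbacks of its subscribed consumers.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """A consumer registers interest in one topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        """A producer publishes an event; only that topic's consumers see it."""
        for callback in self._subscribers.get(topic, []):
            callback(event)


broker = ToyBroker()
fraud_alerts = []      # consumer of the "transactions" topic
sensor_readings = []   # consumer of the "sensors" topic

broker.subscribe("transactions", fraud_alerts.append)
broker.subscribe("sensors", sensor_readings.append)

broker.publish("transactions", {"card": "1234", "amount": 9500})
broker.publish("sensors", {"id": "s1", "temp_c": 21.5})
```

Each consumer receives only the events for the topic it subscribed to, which is what lets topics act as categories.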

1.1.5: Process a High Volume of Data

What if you need to process a high volume of data as it arrives?

Traditional message queues often struggle to handle high volumes of data

Partitions spread the load across multiple servers

Producers are load-balanced across partitions

Consumers can be grouped for faster performance
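
As a rough sketch of how partitioning and consumer groups spread work, the example below hashes message keys to partitions and divides the partitions among the consumers in a group. The function names, the CRC32 hash, and the round-robin assignment are assumptions chosen for illustration; the course does not state which strategies MapR Streams uses.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a message key to a partition using a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, consumers):
    """Divide partitions round-robin among the consumers of one group."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Producer side: messages with the same key always land in the same partition.
partitions = [[] for _ in range(4)]
for key in ["user-1", "user-2", "user-3", "user-1"]:
    partitions[partition_for(key, 4)].append(key)

# Consumer side: two consumers in a group each read half of the partitions.
group = assign_partitions(4, ["consumer-a", "consumer-b"])
```

Because each consumer in the group reads a disjoint set of partitions, adding consumers (up to the number of partitions) lets the group read in parallel.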

1.1.6: Message Recovery

What if you need to recover messages in case of server failure?

If there is no replica, we risk data loss

With MapR Streams, all messages are replicated

Producers send to, and consumers read from, the primary partition

Suppose Server 2 goes down

Producers and consumers are re-routed to a replica
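
The recovery scenario above can be modeled in a few lines. This is a toy sketch under stated assumptions, not how MapR Streams is implemented: the `ReplicatedPartition` class, the server names, and the rule for choosing a new primary are all hypothetical.

```python
class ReplicatedPartition:
    """Toy model: one partition copied to several servers, one of them primary."""

    def __init__(self, servers):
        self.logs = {server: [] for server in servers}  # one log per replica
        self.alive = set(servers)
        self.primary = servers[0]  # assumption: the first server starts as primary

    def send(self, message):
        """Producers write via the primary; every live replica gets a copy."""
        for server in self.alive:
            self.logs[server].append(message)

    def read_all(self):
        """Consumers read from whichever server is currently primary."""
        return list(self.logs[self.primary])

    def fail(self, server):
        """Mark a server as down and, if it was primary, promote a survivor."""
        self.alive.discard(server)
        if server == self.primary:
            if not self.alive:
                raise RuntimeError("no replica left: messages are lost")
            # Assumption: pick the surviving replica with the lowest name.
            self.primary = sorted(self.alive)[0]


partition = ReplicatedPartition(["server-2", "server-1", "server-3"])
partition.send("m1")
partition.send("m2")

partition.fail("server-2")   # the primary goes down
partition.send("m3")         # producers are re-routed to the new primary
recovered = partition.read_all()
```

After the failure, `recovered` still contains the messages sent before Server 2 went down, because a replica already held copies of them.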

1.1.7: Real-Time Access

What if you need real-time access to live data distributed across multiple clusters and multiple data centers?

If there is no replica outside the cluster, data may not be available

Streams are collections of topics

Streams allow high availability and disaster recovery

1.1.8: Knowledge Check

What is the benefit of each of the following features of MapR Streams?

Replication: fault tolerance

Partitioning: scalability

Topics: organization
