Crossing the Streams With Azure Event Hubs and Stream Analytics

What’s covered?

Azure Stream Analytics is a real-time analytics and complex event-processing engine designed to analyze and process high volumes of fast-streaming data from multiple sources simultaneously. It supports the notion of a Job, each of which consists of an input, a query, and an output. Azure Stream Analytics can ingest data from Azure Event Hubs (including Azure Event Hubs for Apache Kafka), Azure IoT Hub, or Azure Blob Storage. The query, which is based on the SQL query language, can be used to easily filter, sort, aggregate, and join streaming data over a period of time.

Assume you have an application that accepts orders from customers and sends them to Azure Event Hubs. The requirement is to process the “raw” orders data and enrich it with additional customer info such as name, email, location, etc. To get this done, you can build a downstream service that consumes these orders from Event Hubs and processes them. In this example, that service happens to be an Azure Stream Analytics job (which we’ll explore later, of course!).

To build this app, we would need to fetch the customer data from an external system (for example, a database): for each customer ID in the order info, we would query that system for the customer details. This will suffice for systems with low-velocity data or where end-to-end processing latency isn’t a concern, but it poses a challenge for real-time processing of high-velocity streaming data.

Of course, this is not a novel problem! The purpose of this blog post is to showcase how you can use Azure Stream Analytics to implement a solution. Here are the individual components: Azure Event Hubs to ingest the raw orders, Azure Blob Storage to hold the static reference customer data, and an Azure Stream Analytics job to join and enrich them.

An individual order is a JSON payload that looks like this:
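The original sample payload isn’t reproduced here, so here’s a minimal sketch. Per the rest of the post, the customer ID appears as id in the orders stream; amount and currency are invented fields purely for illustration:

```json
{
  "id": "1",
  "amount": 150.75,
  "currency": "USD"
}
```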

The Azure Stream Analytics query is the workhorse of our solution! It joins (a continuous stream of) orders data from Azure Event Hubs with the static reference customers data, based on the matching customer ID (which is id in the customers data set and id in the orders stream).

In this section, you’ll create the required Event Hubs topics, upload the reference customer data to an Azure Blob Storage container, and configure the Event Hubs and Blob Storage inputs of the Azure Stream Analytics job.

Please note that you need to create two topics: orders, to which the raw order info is sent, and customer-orders, to which the Azure Stream Analytics job will write the enriched orders.
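In Event Hubs terms, each Kafka topic corresponds to an Event Hub. If you’d rather script this than click through the portal, something along these lines should work with the Azure CLI (the resource group and namespace names are placeholders):

```bash
az eventhubs eventhub create --resource-group <RESOURCE_GROUP> --namespace-name <EVENT_HUBS_NAMESPACE> --name orders
az eventhubs eventhub create --resource-group <RESOURCE_GROUP> --namespace-name <EVENT_HUBS_NAMESPACE> --name customer-orders
```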

Save the JSON below to a file and upload it to the storage container you just created.
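The original file isn’t reproduced here, so here’s a minimal sketch of the reference data. Per the post, each customer record carries an id (the join key) plus name, email, and location; the values below are made up:

```json
[
  { "id": "1", "name": "Jane Doe", "email": "jane@example.com", "location": "Seattle" },
  { "id": "2", "name": "John Doe", "email": "john@example.com", "location": "Denver" },
  { "id": "3", "name": "Ada Lovelace", "email": "ada@example.com", "location": "London" }
]
```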

To configure Azure Event Hubs Input:

Open the Azure Stream Analytics job you just created and configure Azure Event Hubs as an Input. Here are some screenshots which should guide you through the steps:

Choose Inputs from the menu on the left

Select + Add stream Input > Event Hub

Enter the Event Hubs details. The portal conveniently lets you choose from the existing Event Hubs namespaces and Event Hubs in your subscription, so all you need to do is pick the right one.

To configure Azure Blob Storage Input:

Choose Inputs from the menu on the left

Select Add reference input > Blob storage

Enter/choose Blob Storage details

Once you’re done, you should see the following Inputs:

Azure Stream Analytics allows you to test your streaming queries with sample data. In this section, we’ll upload sample data for orders and customer information for the Event Hubs and Blob Storage inputs respectively.

Open the Azure Stream Analytics job, select Query and upload sample orders data for Event Hub input

Save the JSON below to a file and upload it.
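The original sample file isn’t shown either, so here’s a small, made-up set of orders matching the payload sketch from earlier (customer IDs 1 and 2 exist in the reference data):

```json
[
  { "id": "1", "amount": 150.75, "currency": "USD" },
  { "id": "2", "amount": 20.5, "currency": "USD" }
]
```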

Open the Azure Stream Analytics job, select Query and upload sample customer data for the Blob storage input

You can upload the same JSON file that you uploaded to Blob Storage earlier.

Now, configure and run the below query:
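The query itself isn’t reproduced here, so here’s a minimal sketch. It assumes the input aliases are orders (Event Hubs) and customers (Blob Storage reference data), the output alias is customer-orders, and the field names from the sample payloads above; adjust all of these to match your own configuration. If you’re only testing with sample data and haven’t created the output yet, drop the INTO clause.

```sql
SELECT
    o.id AS customerid,
    c.name,
    c.email,
    c.location,
    o.amount,
    o.currency
INTO
    [customer-orders]
FROM
    [orders] o
JOIN
    [customers] c ON o.id = c.id
```

Because customers is a reference data input, the JOIN doesn’t need the temporal (DATEDIFF) condition that stream-to-stream joins in Stream Analytics require.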

Open the Azure Stream Analytics job, select Query and follow the steps as depicted in the screenshot below:

Select Query > enter the query > Test query and don’t forget to select Save query

The query JOINs the orders data from Event Hubs with the static reference customers data (from Blob Storage) based on the matching customer ID (which is id in the customers data set and id in the orders stream).

It was nice to have the ability to use sample data for testing our streaming solution. Let’s go ahead and try this end to end with actual data (orders) flowing into Event Hubs.

An Output is required to run a Job. To configure the Output, select Output > + Add > Event Hub

Enter the Event Hubs details: the portal conveniently lets you choose from the existing Event Hubs namespaces and Event Hubs in your subscription, so all you need to do is pick the right one.

In the Azure Stream Analytics interface, select Overview, click Start and confirm

Wait for the Job to start; you should see the Status change to Running

Start a consumer to listen to the Event Hubs output topic

Create a kafkacat.conf file with Event Hubs info:
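The file contents aren’t shown above; for the Event Hubs Kafka endpoint, a kafkacat.conf along these lines should work (replace the namespace and connection string placeholders with your own values):

```
metadata.broker.list=<EVENT_HUBS_NAMESPACE>.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanisms=PLAIN
sasl.username=$ConnectionString
sasl.password=Endpoint=sb://<EVENT_HUBS_NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=<KEY_NAME>;SharedAccessKey=<KEY>
```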

Let’s first start the consumer process that will connect to the output topic (customer-orders), which will receive the enriched order information from Azure Stream Analytics. After that, in another terminal, you’ll send order info to the orders topic.

In a terminal, start the consumer.
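The exact command isn’t shown above; something along these lines should work, assuming kafkacat 1.4 or later (which can read the settings file via -F):

```bash
kafkacat -F kafkacat.conf -C -t customer-orders
```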

This will block, waiting for records from customer-orders.

In the other terminal, you can send order data to the orders topic via stdin. Simply paste the orders one at a time and observe the output in the consumer terminal:
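As with the consumer, the producer invocation and the sample orders aren’t shown above; here’s a sketch using the same made-up values as earlier (customer ID 42 deliberately has no match in the reference data):

```bash
kafkacat -F kafkacat.conf -P -t orders
```

Then paste, for example:

```
{"id": "1", "amount": 150.75, "currency": "USD"}
{"id": "2", "amount": 20.5, "currency": "USD"}
{"id": "42", "amount": 10.0, "currency": "USD"}
```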

The output you see on the consumer terminal should be similar to this:
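Assuming the sample data and the query sketched above, the enriched records might look roughly like this (the exact fields depend on your SELECT list):

```
{"customerid":"1","name":"Jane Doe","email":"jane@example.com","location":"Seattle","amount":150.75,"currency":"USD"}
{"customerid":"2","name":"John Doe","email":"john@example.com","location":"Denver","amount":20.5,"currency":"USD"}
```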

As expected, you won’t see an enriched event for orders placed by customers whose ID isn’t present in the reference customer data (in Blob Storage), since the JOIN criteria is based on the customer ID.

This brings us to the end of this tutorial! I hope it helps you get started with Azure Stream Analytics and test the waters before moving on to more involved use cases.

In addition to this, there’s plenty of material for you to dig into.

High-velocity, real-time data poses challenges that are hard to deal with using traditional architectures; one such problem is joining these streams of data. Depending on the use case, a custom-built solution might serve you better, but it will take a lot of time and effort to get right. If possible, you might want to think about extracting parts of your data processing architecture and offloading the heavy lifting to services that are tailor-made for such problems.

In this blog post, we explored a possible solution for implementing streaming joins using a combination of Azure Event Hubs for data ingestion and Azure Stream Analytics for data processing using SQL. These are powerful, off-the-shelf services that you can configure and use without setting up any infrastructure, and, thanks to the cloud, the underlying complexity of the distributed systems involved in such solutions is completely abstracted away.
