Once a project is selected, you can go to its overview page, which shows the number of jobs and pipelines and their status.
On the left menu, go to Settings → Parameters to view the parameters created at the project level. These are the connection parameters used by our demo job.
The Jobs tab lists the existing jobs within the project, with a search bar and sorting and filtering options.
Open the Job Designer with the Palette panel shown, then switch from the Palette tab to the Params tab to see the job parameters.
We use two tables from an MSSQL database.
The first, Event, stores event data: Event ID, Venue ID, Event Category, Event Name, and Start Time.
The second, Venue, contains venue data: Venue ID, Venue Name, Venue City, Venue State, and Venue Seats Quantity.
We use the following stages in the job:
* Read
* Join
* Cache
* GroupBy
* Transformer
* Sort
* Filter
* Write
For the Read stages, MSSQL storage is used because we read the data from the two MSSQL tables, Event and Venue.
For MSSQL parameters such as JDBC URL, User, Password, Schema, and Table, we use the parameters created earlier via the Settings menu.
Both Read stages are connected to the Join stage, where the Event and Venue tables are joined.
In the Join stage we use the Inner join type, and the join key is venueid.
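The inner join can be pictured in plain Python. This is only a conceptual sketch with made-up sample rows, not code generated by the tool:

```python
# Conceptual sketch of the Join stage (sample rows and column values are
# assumptions for illustration, not the demo data).
events = [
    {"eventid": 1, "venueid": 10, "eventname": "Concert A"},
    {"eventid": 2, "venueid": 20, "eventname": "Play B"},
    {"eventid": 3, "venueid": 99, "eventname": "Orphan C"},  # no matching venue
]
venues = [
    {"venueid": 10, "venuename": "Arena", "venuestate": "NY"},
    {"venueid": 20, "venuename": "Theater", "venuestate": "CA"},
]

# Inner join on venueid: keep only rows whose key appears in both tables.
venues_by_id = {v["venueid"]: v for v in venues}
joined = [
    {**e, **venues_by_id[e["venueid"]]}
    for e in events
    if e["venueid"] in venues_by_id
]
# "Orphan C" is dropped because venueid 99 has no matching venue row.
```

An inner join keeps only matching rows; events without a known venue simply disappear from the result.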
The Cache stage keeps intermediate data (in our case, the join result) in memory before it is loaded to multiple destinations.
To write the joined Event and Venue data to Elasticsearch, we selected Elasticsearch storage in the Write stage configuration.
Its parameters, such as nodes, port, user, and password, were also pre-created via the Settings → Parameters menu option.
To load the data to AWS S3, we partition it by venue state using the Partition By option, so the resulting dataset on S3 is split into multiple files, one set per venue state.
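Partitioning can be sketched as bucketing rows by the partition column; each bucket then becomes its own file set in the output. The rows and the S3 path layout below are illustrative assumptions:

```python
# Sketch of Partition By on venue state: rows are bucketed so each state's
# data lands in its own output location (sample rows are made up).
from collections import defaultdict

rows = [
    {"eventname": "Concert A", "venuestate": "NY"},
    {"eventname": "Play B", "venuestate": "CA"},
    {"eventname": "Gala C", "venuestate": "NY"},
]

partitions = defaultdict(list)
for row in rows:
    partitions[row["venuestate"]].append(row)

# On S3 this corresponds to one file set per partition value, e.g. a layout
# like .../output/venuestate=NY/part-0000 (hypothetical path).
for state, part in sorted(partitions.items()):
    print(f"venuestate={state}: {len(part)} row(s)")
```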
We filter the data to the specific time frame we are interested in, then load the filtered data to a local file in Parquet format, partitioned by eventname.
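The Filter stage keeps only rows whose start time falls inside the chosen window. The dates and rows below are assumed purely for illustration:

```python
# Sketch of the Filter stage: keep events within a chosen time frame
# (sample rows and window dates are assumptions).
from datetime import datetime

rows = [
    {"eventname": "Concert A", "starttime": datetime(2023, 6, 1)},
    {"eventname": "Play B", "starttime": datetime(2023, 9, 15)},
]
window_start = datetime(2023, 5, 1)
window_end = datetime(2023, 7, 1)

# Half-open window: start inclusive, end exclusive.
filtered = [r for r in rows if window_start <= r["starttime"] < window_end]
```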
With the GroupBy stage, we group the joined data by venue city and apply the SUM aggregation function to total the venue seats per city.
We use the Transformer stage here to rename a couple of columns.
With the Sort stage, we sort the data by the number of seats in ascending order.
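The GroupBy, Transformer, and Sort steps above can be sketched together in plain Python. The sample rows and the renamed column names are assumptions, not the actual demo schema:

```python
# Sketch of GroupBy (SUM of seats per city), Transformer (column rename),
# and Sort (ascending by seat total). Sample data is made up.
from collections import defaultdict

rows = [
    {"venuecity": "Boston", "venueseats": 500},
    {"venuecity": "Boston", "venueseats": 300},
    {"venuecity": "Austin", "venueseats": 200},
]

# GroupBy venuecity with SUM(venueseats).
totals = defaultdict(int)
for r in rows:
    totals[r["venuecity"]] += r["venueseats"]

# Transformer: rename columns (target names "city"/"seats" are hypothetical).
result = [{"city": city, "seats": seats} for city, seats in totals.items()]

# Sort ascending by seat total.
result.sort(key=lambda r: r["seats"])
```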
Finally, the result is written to the DB2 table count_seats.
Note: we use the existing DB2 connection called DB2, which was previously created via the Settings → Connections option, so it is visible to all jobs within the project "Demo Visual Flow".
Now let's run the job. First, its status changes to Pending, then to Running.
Now our job has succeeded, and the status is shown as Succeeded on the Job Designer control bar.
The Logs button on the job control bar leads you to the Logs screen.
With the info button, you can learn more about each stage and how you can use it to manage your data.