Solution:
Data Migration
Tags:
ETL
IBM DataStage
Data Migration
VisualFlow

A large IT company focused on the development of digital solutions maintained vast amounts of data on its company structure, projects, customers, and 3,000+ employees, including their education certificates.

The existing system provided more functionality than the company needed, so it was paying for features it did not use. The decision to change was driven by cost.

Challenges

1. Migrating ETL jobs from an existing application to a modern, cloud-native, scalable, and open-source alternative (cost of ownership was the main reason for the change).
2. The license was about to expire, which would cause the imminent shutdown of the servers hosting the old database and IBM DataStage, so engineers had to complete the migration as quickly as possible.
3. The solution had to be low-code, intuitive, and offer job visualization options so that non-technical data analysts could use it.

Solutions

  • ETL jobs migration
    Migrate all ETL jobs and data pipelines from IBM DataStage to the Visual Flow low-code solution.
  • Configuration
    Install and configure the new target database. Develop corresponding jobs and data pipelines in the Visual Flow solution.
  • Training
    Educate the client’s data and non-data analysts on how to use their new software. Create extensive technical documentation for the delivered Visual Flow solution.
  • Simple UI
    Create a simple drag-and-drop UI panel for building ETL job pipelines.
  • Deadline-driven
    Efficient planning, clear goals, and an approved work scope made it possible to finish before the imminent shutdown of the servers hosting the old target database and IBM DataStage.

Results

  • 5 times lower operational costs

    The migration decreased the cost of ownership of the ETL system by at least five times compared to the original DataStage usage.

  • 3 times improved performance

    The IBM DataStage logic has been optimized and successfully migrated to Visual Flow, resulting in 3 times improved performance.

  • Usability: tasks completed 8-10 times faster

    The Data Analysis team is now able to build new jobs and pipelines 8 to 10 times quicker, using the simple drag-and-drop UI panel and the technical documentation for the Visual Flow solution.

Sooner or later, many companies face data transformation challenges in the course of their business operations. That was the case with one of our clients, who approached the Visual Flow professional services team with a request to migrate ETL jobs from their current application to something modern and open source.

The Story of the Project and the Business Behind It

Let us start with some general information about our client. It is a large, tech-driven organization focused on developing custom digital solutions. It invests heavily in the education of its software engineers to improve their hard and soft skills.

The customer’s system was built around IBM DataStage. It held ETL data on the company structure, projects, customers, employees, and their education certificates. For reference, it covered 3,000+ employees. The data included completed and passed courses, internal promotion and compensation increase lists, and information for monitoring the availability of engineers for potential projects.

Our client faced a two-part problem. First, IBM DataStage provided more functionality than they needed, meaning they were paying for services they did not use. Pricing was the main reason for the change.

Second, the fast-approaching expiration date of the hosting license created a sense of urgency for the data migration.

So, they approached the Visual Flow team to help them migrate away from DataStage and the old servers. We needed to find a solution that would meet our client’s tech stack requirements: modern, cloud-native, scalable, and open-source. It also needed to be intuitive and offer low-code and job visualization options. These last features were required because they greatly assist the non-technical data analysts and HR staff in their day-to-day ETL routines.

Goals and Objectives for the Visual Flow Team

After a negotiation and planning phase, the Visual Flow team defined clear goals and agreed on an approved work scope:

  • Migrate all ETL jobs and pipelines from IBM DataStage to the Visual Flow software solution.
  • Enable HR analysts and resource managers to build simple ETL job pipelines whenever they need new data for their evaluation and appraisal initiatives.
  • Create a simple drag-and-drop UI for creating ETL job pipelines.
  • Install and configure the new target database.
  • Create extensive technical documentation for the delivered Visual Flow solution.
  • Develop corresponding jobs and pipelines in the Visual Flow solution.
  • Educate the client’s data and non-data analysts on how to use their new software.

The project was complicated by the requirement to complete all of this work within one month without compromising the quality, reliability, or performance of the ETL processes. As mentioned, the deadline was dictated by the upcoming shutdown of the servers.

How Visual Flow Responded to the Challenges

Our team had to be well prepared to meet such ambitious objectives on time, so we established and followed a clear plan with detailed milestones:
  • Install a new target database.
  • Create schemas and tables.
  • Establish a relationship between tables in the new target database.
  • Document the old DataStage logic.
  • Make changes to the old logic, keeping in mind the features of the original Visual Flow solution.
  • Install and configure the Visual Flow solution.
  • Create a project inside the Visual Flow solution.
  • Create and validate connections to source and target databases.
  • Create Visual Flow jobs in the designer so that they mirror the jobs in DataStage.
  • Create pipelines, the analog of sequence jobs in DataStage.
  • Check the performance of jobs and pipelines and fix errors.
  • Perform data validation between old and new target databases.
  • Optimize the performance of the Visual Flow solution.
  • Disable scheduling for DataStage sequence jobs.
  • Add a schedule to the Visual Flow cron pipelines.
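
The case study does not describe how the data validation milestone was carried out. As a rough illustration only, one common approach is to compare per-table row counts and a content fingerprint between the old and new target databases. The sketch below uses Python's built-in sqlite3 module with in-memory databases as stand-ins for the real systems; the employees table, its columns, and the helper functions are hypothetical and are not part of Visual Flow or DataStage.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key_column):
    """Return (row_count, sha256 hex digest) for a table, rows ordered by key."""
    # Table and column names are interpolated into the SQL text, so they must
    # come from a trusted, hard-coded list and never from user input.
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    digest = hashlib.sha256()
    row_count = 0
    for row in cursor:
        digest.update(repr(row).encode("utf-8"))
        row_count += 1
    return row_count, digest.hexdigest()


def validate_migration(old_conn, new_conn, tables):
    """Compare each table in both databases and return a dict of mismatches."""
    mismatches = {}
    for table, key_column in tables.items():
        old_count, old_hash = table_fingerprint(old_conn, table, key_column)
        new_count, new_hash = table_fingerprint(new_conn, table, key_column)
        if old_count != new_count:
            mismatches[table] = f"row count differs: {old_count} vs {new_count}"
        elif old_hash != new_hash:
            mismatches[table] = "row contents differ"
    return mismatches


def make_demo_db(rows):
    """Build a hypothetical in-memory database standing in for a target DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, certificate TEXT)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    demo_rows = [(1, "Employee A", "Course 1"), (2, "Employee B", "Course 2")]
    old_db = make_demo_db(demo_rows)
    new_db = make_demo_db(demo_rows)
    # An empty dict means no differences were found for the listed tables.
    print(validate_migration(old_db, new_db, {"employees": "id"}))
```

In a real migration the two databases usually run on different engines, whose drivers may return different Python types for the same value, so rows would likely need to be normalized before hashing.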

The Business Outcomes of Our Work

Business-level projects are less about meeting technical requirements than about the outcomes achieved. Here is a summary of the results of this project:

  • The IBM DataStage logic has been optimized and migrated to Visual Flow, resulting in 3 times improved performance.
  • Engineers have transferred all data from the old target database to the new one. Now it is Cognos-compatible and suitable for building reports without changing the logic.
  • The Data Analyst team can now build new jobs and pipelines 8 to 10 times quicker.
  • The cost of ownership of the ETL system was decreased by at least 5 times compared to the original DataStage usage.

Do you have a data migration or a similar ETL project in mind?

Are you inspired by the results Visual Flow delivers for its clients? If so, contact us today! Our professional services team will always be happy to provide you with extra details or schedule a free strategy session.
