Data Pipelines

How to Build an ETL Pipeline without Code

Written by JD Prater
June 15, 2022

External data ingestion is essential for modern companies. But bringing in external data from a variety of sources is time-consuming, tedious, and manual work. Differences in data formats, files, sources, and systems make it difficult to seamlessly ingest data. The data must be wrangled, cleaned, and properly formatted for it to easily pass from one system to the next. 

For example, let’s say you're a global technology supplier working with multiple global distributors and manufacturers. And you need to ingest product catalog information every day, often multiple times a day. This process is critical for you to be able to keep your database up-to-date to forecast and plan better.

However, each distributor and supplier sends you data in a different format: some expect you to pull it via APIs, others provide nightly CSV dumps on FTP servers, and still others send email attachments. Bringing in all of this external data ends up involving a patchwork of manual copy-and-paste work.

Typically, ingesting this data takes time, effort, and technical know-how. This is an expensive approach requiring resources from engineering and data teams to complete. This recurring, repetitive work eats up time that can be better spent on tasks that drive business impact, such as developing models, performing analysis, or building products.

Thankfully, there's a way to simplify and automate external data ingestion that unlocks growth, boosts productivity, and consumes fewer resources.

How to Build a No-Code ETL Pipeline

Ingesting external data is now simpler thanks to AI and no-code data transformations. The Osmos External Data Platform simplifies onboarding messy, non-conformant external data into your operational systems. Plus, you can do this with zero engineering costs and completely automate the process.

In this example, the source is product catalog data, delivered as a CSV file in an SFTP folder, that needs to be ingested into Snowflake.

Step 1: Create a Source Connector

The first thing you do is build an Osmos Pipeline, starting with a source connector. Osmos lets you ingest data from a variety of formats and sources, such as databases, CSVs, and APIs. In this example pipeline, the source is SFTP.

List of connectors

After selecting the source, you fill in additional information such as the connector name, host name, port number, user name, etc. With Osmos, you have access to step-by-step instructions that walk you through each item. Once you save the information, the source connector is created.
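To make the polling concrete, here is a rough sketch of the kind of logic a source connector automates behind the scenes. In a hand-rolled version, a library such as paramiko would supply the SFTP directory listing; here the listing is passed in as a plain list so the selection logic stays self-contained, and all file names are illustrative.

```python
# Sketch only: the file-selection step a source connector automates.
# In a hand-rolled pipeline, a library such as paramiko would supply
# the directory listing (e.g. sftp.listdir()); it is passed in here
# so the logic is self-contained. File names are hypothetical.

def new_csv_files(listing, already_ingested):
    """Return CSV filenames from an SFTP listing not yet ingested."""
    return sorted(
        name for name in listing
        if name.lower().endswith(".csv") and name not in already_ingested
    )

# Example: one new catalog file has landed since the last run.
listing = ["catalog_0614.csv", "catalog_0615.csv", "readme.txt"]
seen = {"catalog_0614.csv"}
print(new_csv_files(listing, seen))  # ['catalog_0615.csv']
```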

Step 2: Connect to the Destination Connector

Your incoming data needs a destination. In this example, you want to ingest the product catalog CSV file into Snowflake so you can use the pre-built Snowflake connector to get started quickly. The schema for this connector is defined by the selected columns from the query. 
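Conceptually, "the schema is defined by the selected columns" means every incoming row must supply those columns before it can land in Snowflake. A minimal sketch of that check, using hypothetical column names:

```python
# Minimal sketch of the schema check a destination connector implies:
# incoming rows must supply every column the Snowflake query selected.
# Column names here are hypothetical.

EXPECTED_COLUMNS = {"product_id", "product_name", "price", "quantity"}

def missing_columns(row):
    """Return destination columns absent from an incoming row."""
    return EXPECTED_COLUMNS - row.keys()

row = {"product_id": "123", "product_name": "Widget", "price": "9.99"}
print(missing_columns(row))  # {'quantity'}
```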

Step 3: Map and transform your data without code

You’ve created the source and destination connectors. Now, it's time to map and transform the data to match your Snowflake schema. 

This is where you would typically have an engineer write code. For example, suppose you need to map each EAN value to a product ID. That would normally mean someone writing, testing, and maintaining Python scripts or SQL queries to clean up the data.
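For illustration, here is the kind of one-off cleanup script this replaces. The column names, the EAN-to-product-ID rule (stripping leading zeros), and the currency format are all hypothetical; a real script would mirror whatever each partner actually sends.

```python
import csv
import io

# Illustrative only: the kind of one-off cleanup script an engineer
# would otherwise write and maintain. Column names, the EAN format,
# and the mapping rule are hypothetical.

RAW = """EAN,Name,Price
0042100005264,Widget A,"$9.99"
0042100005271,Widget B,"$12.50"
"""

def clean(raw_csv):
    rows = []
    for rec in csv.DictReader(io.StringIO(raw_csv)):
        rows.append({
            "product_id": rec["EAN"].lstrip("0"),      # map EAN -> product ID
            "product_name": rec["Name"].strip(),
            "price": float(rec["Price"].lstrip("$")),  # strip currency symbol
        })
    return rows

print(clean(RAW)[0])
# {'product_id': '42100005264', 'product_name': 'Widget A', 'price': 9.99}
```

Every time a partner changes their file layout, a script like this has to be updated, re-tested, and redeployed, which is exactly the recurring maintenance burden described above.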

Mapping the source data to the destination

But with Osmos, you can validate, clean up, and restructure incoming data to fit your Snowflake schema without having to write code. Simply provide our AI engine with a few examples, and it figures out which transformations are required to clean up the data.

Data transformations

Our low-code External Data Platform eliminates the headaches of ingesting external data by teaching machines how to automatically clean it, fit it into the right formats, and send it where it needs to go.

Step 4: Schedule and Automate the No-Code ETL Pipeline

Once the data mapping and transformations are complete, you can select the time, day, and frequency at which you want the pipeline to run. Osmos turns this painful one-off process into a repeatable, automated solution.

Step 5: Run/Test the ETL Pipeline

It’s now time to test the ETL pipeline. Once it runs, the incoming data is transformed and ingested into your destination. You can then query Snowflake to ensure the record count matches the number of rows in the CSV file.
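That reconciliation check can be scripted in a few lines. In this sketch the Snowflake side is a stand-in value that would really come from a query such as `SELECT COUNT(*) FROM product_catalog` (table name hypothetical); the CSV side is counted locally.

```python
import csv
import io

# Sketch of the count reconciliation: the Snowflake figure is a
# stand-in for a SELECT COUNT(*) query result (table name hypothetical);
# the CSV side is counted locally.

def csv_row_count(raw_csv):
    """Count data rows (excluding the header) in a CSV payload."""
    reader = csv.reader(io.StringIO(raw_csv))
    next(reader, None)  # skip the header row
    return sum(1 for _ in reader)

raw = "product_id,name\n1,Widget A\n2,Widget B\n3,Widget C\n"
snowflake_count = 3  # stand-in for the query result
assert csv_row_count(raw) == snowflake_count
print("counts match")
```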

Within a few minutes, Osmos turned a painful product catalog CSV cleanup and ingestion process into an ETL pipeline that just works automatically.

Congrats on finding a better way to automate your external data ingestion from your partners, distributors, and manufacturers.

Osmos No-Code ETL Pipelines Simplify How You Ingest External Data

Building no-code ETL pipelines to ingest external data from partners, vendors, suppliers, and manufacturers does more than simply save time. It creates outsized value for businesses by cutting the cost of managing data, freeing up your technical teams, producing cleaner, more accurate product catalogs, and letting you scale your intake to access a broader set of products.

The future of external data ingestion is about letting your systems talk directly to your partners' systems, and Osmos is leading this charge. Explore how Osmos Pipelines can help your company quickly ingest data from external parties, without writing a line of code.

With Osmos you are in full control of your data onboarding. Companies that want to make data imports as fast and efficient as possible look to our no-code data pipelines. It's perfect for businesses that need to:

  • Control how external data is ingested. Ingest clean data from your customers and partners every time.
  • Control how frequently the data is imported. Osmos Pipelines can be set up on a recurring schedule or manually triggered to run at any time so you never miss a dataset.
  • Control the customer experience. Automated data imports help you provide a premium customer experience by streamlining operations and communication.

The Definitive Guide to Data Onboarding

Discover the benefits and challenges of customer data onboarding, how to create an efficient process, and see how the right approach can accelerate business growth.

