How to Automate FTP Data Ingestion with No-Code ETL Pipelines
Customer onboarding is an exciting time for both you and your new customers. But sometimes the process can bring more stress than excitement. When signing on new customers, you’re embarking on a partnership that will hopefully be able to scale as both parties change and grow. So setting the right tone early is key to a successful, long-term customer experience.
A key part of the customer onboarding journey is being able to quickly and accurately ingest their data, so they can start seeing the value of using your product(s). For many MarTech and eComm companies, onboarding customer data is a long, slow, and tedious process. The longer it takes, the longer your customers wait to see value, especially if ingestion is an ongoing process.
A popular way customers share data is through an FTP connection. While there are benefits to this method, there are significant challenges when trying to scale the process.
What Are FTP and SFTP?
File Transfer Protocol (FTP) is a standard communication protocol used to transfer files between a server and a client over a computer network. FTP makes it easy to move and migrate large data files between systems.
SFTP, the SSH File Transfer Protocol (often called Secure File Transfer Protocol), is another way to transfer files and access data, but more securely: it runs over an encrypted SSH connection, protecting sensitive data from interception, and it requires authentication before any transfer takes place.
Both FTP and SFTP are popular ways to share data because they offer reliability and, in SFTP's case, added security.
- Secure: FTP servers can require a username and password, and SFTP encrypts all traffic
- File size: FTP handles large files well
- Custom: FTP servers can be set up for programmatic access
- Reliable: FTP and SFTP support guaranteed file delivery
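As a sketch of what programmatic access looks like in practice, here is a minimal Python example using the standard library's ftplib. The host, credentials, and paths are hypothetical placeholders; note that ftplib's FTP_TLS implements FTPS (FTP over TLS), which is distinct from SFTP — SFTP access typically uses a third-party library such as paramiko.

```python
from ftplib import FTP, FTP_TLS

def connect(host: str, user: str, password: str, use_tls: bool = True):
    """Open an FTP connection, preferring FTPS (FTP over TLS) when supported."""
    ftp = FTP_TLS(host) if use_tls else FTP(host)
    ftp.login(user=user, passwd=password)
    if use_tls:
        ftp.prot_p()  # encrypt the data channel, not just the login
    return ftp

def download_file(ftp, remote_path: str, local_path: str) -> None:
    """Stream a remote file to local disk in binary mode."""
    with open(local_path, "wb") as fh:
        ftp.retrbinary(f"RETR {remote_path}", fh.write)

# Usage (hypothetical server and paths):
# ftp = connect("ftp.example.com", "svc_user", "secret")
# download_file(ftp, "/drops/customers.csv", "customers.csv")
# ftp.quit()
```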
The Challenges of Sharing Data via FTP
While FTP is great for one-off large data transfers, there are three significant challenges when trying to scale the process for your customers and partners.
1) Lots of teams involved on both sides
Sharing data like CSVs via FTP folders between companies might be inevitable, but it can be a tedious exercise that requires a lot of coordination between your internal teams and your customers.
For example, let’s say your big retail customer needs to run an email campaign next week. To share files, your process probably looks something like this:
- Your retail customer’s IT or engineering team pulls customer data from their database as a CSV.
- They drop that CSV into an FTP folder.
- Their team then sends an email letting you know the file is ready. Or maybe your FTP provider has programmatic alerts that notify you when a new file is added.
- Your CS team then pings or creates a ticket to inform the eng team to retrieve the file.
- That sounds easy enough, except your customer’s CSV doesn’t match your system’s schema.
- To tackle this issue, the engineering team uses a combination of Python, Airflow, and other tools to clean up and restructure the data to fit your system.
- The data then gets uploaded into your system so you can run a successful email campaign.
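The schema-mismatch step above is where most of the engineering effort tends to go. A minimal sketch of that remapping in Python, using only the standard library (the column names here are hypothetical):

```python
import csv
import io

# Hypothetical mapping from a customer's column names to our internal schema.
COLUMN_MAP = {
    "Email Address": "email",
    "First Name": "first_name",
    "Last Name": "last_name",
}

def remap_csv(raw: str, column_map: dict[str, str]) -> list[dict[str, str]]:
    """Rename a customer's CSV columns to match the internal schema,
    dropping any columns we don't recognize."""
    reader = csv.DictReader(io.StringIO(raw))
    rows = []
    for row in reader:
        rows.append({column_map[k]: v for k, v in row.items() if k in column_map})
    return rows

customer_csv = (
    "Email Address,First Name,Last Name,Loyalty Tier\n"
    "ava@example.com,Ava,Lee,Gold\n"
)
print(remap_csv(customer_csv, COLUMN_MAP))
# [{'email': 'ava@example.com', 'first_name': 'Ava', 'last_name': 'Lee'}]
```

In a real pipeline this mapping would differ per customer, which is exactly why hardcoding one per customer becomes a maintenance burden.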
2) Incredibly slow and tedious process
Onboarding your customer’s data for the first time is a slow process that can take anywhere from 4-6 weeks. It takes so long because it requires significant resources and cross-functional coordination from your internal teams to be successful.
To automate and speed up the workflow, some companies have their engineering teams hardcode data pipelines for each customer. Once the setup is built, subsequent data “drops” are completed in hours rather than days, depending on the file size.
However, this approach requires your engineering team’s time to fix and maintain each customer’s data pipeline whenever things go wrong. That’s a tedious process, especially when it has to be replicated for every new customer and partner. Plus, the only way it scales is by hiring more engineers as you gain more customers.
3) Manual data cleaning
Part of what makes customer data ingestion a lengthy process is being able to clean and validate the incoming data. Customers format their data one way and you format yours another. That’s fine until the data leaves one party’s system to be added to another.
This data cleanup process requires heavy coding, which means a lot of engineering time and resources spent on data wrangling. Your engineering team could be working on core products instead of manually going column by column to match, parse, and clean data.
Thankfully, there’s a better way!
Osmos has significantly streamlined this lengthy data ingestion process for MarTech companies. We’ve taken onboarding new customer data from 4-6 weeks down to 30 minutes with fewer internal resources. Now your customers can take advantage of your products and services faster than ever.
Automate data ingestion from FTP folders
Ingesting customer data isn’t a one-and-done process. To speed things up, you can schedule Osmos Pipelines to run down to the minute, automating the entire process from FTP folder to your target system (database, API, etc.). Osmos Pipelines can handle millions of records per minute and large file sizes to scale with your business needs. Get fresher data with fewer resources by automating your customer data ingestion, and free up your teams to focus on providing a better customer experience.
No-code data transformations
Never deal with custom scripts and spreadsheets again. With Osmos, you can quickly clean, validate, and format customer data into your target systems.
We’ve made complex data cleanup simple with no-code data transformations. You can now clean data with a few examples or formulas. This improves the accuracy of the data you’re receiving while speeding up the process for both your customers and internal teams.
Ingest customer data with fewer resources
While sending CSVs via FTP is a great way to exchange data between businesses, this method doesn’t get you across the finish line without significant time and resources.
Osmos gives your customers the freedom to share their data on their terms while allowing you to gain control of your data onboarding. Our low-code External Data Platform eliminates the headaches of ingesting external data by teaching machines how to automatically clean it, fit it into the right formats, and send it where it needs to go.
It's vital for growth that your company makes customer data ingestion easy. Our low-code solutions make implementation straightforward without a heavy lift from your engineering team! And our intuitive UI makes it easy for non-technical team members to learn and use.
Book a demo to learn more!