Fireside Chat: Fixing the First Mile of Data Ingestion

Written by 
Kirat Pandya
August 17, 2022

At Osmos, we care a lot about the first mile of data ingestion. Join me and Matt Robinson, CTO of Rahi, as we discuss why the way we ingest external data is changing and the impact it's having on business growth.

Transcript

Kirat Pandya: Hey everyone. Thanks for joining us. I'm Kirat Pandya. I'm the CEO at Osmos. Osmos is a low-code External Data Platform that helps companies scale their data intake while reducing costs.

What we help organizations do is figure out how to increase the number of business relationships that they can maintain by increasing the number of data relationships that they can simplify and automate at scale.

With me today is Matt Robinson, who is the CTO at Rahi Systems. Thank you, Matt, for taking the time to talk to me today and please tell us a little bit about Rahi Systems to get us started.

Matt Robinson: Sure thing. So, first off, thanks for having us. It's nice to be a part of this. We've been engaged with Osmos for a long time and absolutely happy and delighted to be a part of a chat through what we're using it for and how it applies.

So Rahi has been around for about nine years, about 40 offices and 30 countries. And our primary goal is to service large customers. In fact, we work with seven of the top 10 hyperscalers today.

Rahi is focused on delivering solutions and services for customers. They're really intended to maximize the performance, scalability, efficiency of very complex business environments that we see today.

Our teams have very deep expertise in data center, infrastructure, compute, storage, networking, security, but we also cover a lot of things like end user, AV as well as cloud computing.

And as the CTO of the organization, part of my job is to continually be on the lookout for new technologies and new trends. And we became engaged together some time ago, a few years now, as we started that journey and a few of our technology trend changes.

Kirat Pandya: Yeah. No, it's been a fun ride. One of these fireside chats, how they always go, people expect that we brought you here to say good things about Osmos. But what I really want to do is get into the invisible pain.

Most people have never come across a picture of the kinds of data challenges that an organization like Rahi goes through to operate its business day to day at scale, how painful the external data problem is, and your proactive approach to dealing with it.

So on that front, I know you guys focus heavily on providing outstanding customer support and build a strong customer success culture around it.

How did you bridge the gap between the human process of relationships with all your vendors, customers, partners, and the data stack that your team is empowered with, if you will, to solve that problem so they can provide the best experience for your customers?

How did you bridge the gap between the human process of relationships and the data stack?

Matt Robinson: Yeah, it's a great question. And if I think about our cultural foundations, one of our key tenets is really putting the customers first. So it's important for us to have that solid relationship as we get into these complex engagements with them.

But at the same time, we're constantly looking at ways that we can improve that experience through the use of data, not just in terms of how we make engagements faster, but also looking at automation, simplification by removing touchpoints, where things may have traditionally been handled by a person.

It's really become key for us to let technology drive the movement of data. And that includes transactions that we have with customers, engagements, orders, and really to ensure that our people are focused on what's important, which is really talking to, engaging, and being directly involved with what our customers need every day.

Kirat Pandya: That makes total sense. Since our first days of engagement with your team, we've thought of the external data, bringing external data to Rahi as sort of a first mile of data ingestion problem, right?

Data never just goes through, gets ingested, and it's done. It's a process. And so your business, looking outside in, again, seems to be fundamentally powered by receiving data from customers, vendors, distributors, manufacturers, logistics providers, and the list is massive.

One of our engineers was joking, "Rahi looks like a giant mux of data. The data moves around and then physical goods and services follow it. It's crazy."

So our team was surprised with how complex this is and how many different formats you guys deal with day-to-day. At what point... What broke the camel's back, right? At what point was it, "Enough is enough, you need tech to solve this"?

What broke the camel's back?

Matt Robinson: Well, yeah. And I think it's very true. We have so many ingestion points, so many different customers, vendors, partners that we're constantly having to figure out how to absorb all of that.

And I think for us, if we looked at it, the point where we saw this transition was really when we started to measure the business and the ability to scale given we've been continually growing over the last nine years. That scale was being impacted based upon the number of hires we brought on board.

And when people become the inflection point to where you can't scale your business, unless you can start hiring and keeping up at that, well, then you've really got to take a look at some alternatives.

Obviously, the diversity of different data intake points, the types of data we're using, not to mention a number of ERP consolidations, and now centralizing on one common ERP. All of those had to be broken down into the people impact.

And really from that point, deciding that putting technology in place, and a data platform from Osmos, for us, could really just help eliminate that people-scaling challenge and ultimately let us focus and pivot away from this idea of, "Okay, we're going to bring somebody on board, we're going to train them up and get them used to how we manage all of this intake," and really moving towards process standardization and data pipeline management, which is fundamentally easier and more consistent as a whole.

Kirat Pandya: So if I were to take the simplistic sort of view of what I've seen happen... Tell me if I've got this right. There was a point where in order to scale your business, you were also scaling the number of people to keep up with the data relationships.

And what you were trying to do was break that interlock to be able to scale the business without necessarily having to continually add a linear amount of people with it.

Matt Robinson: Right. Because the intake of new opportunity, new orders, new business, shouldn't have to be impacted based upon how quickly we can hire and have those people come into place.

Kirat Pandya: Yeah, yeah.

Matt Robinson: It's a scale that can't be handled in a linear way over time. Ultimately, something has to give in that process. And culturally, again, we want to look at our customers about the business problems they're having and not let that interaction be purely about the order or the process for which all these logistics and global warehousing efforts take place.

Kirat Pandya: Makes total sense. I'm sure your customers appreciate the completely reduced friction they have in terms of working with Rahi, right. So, makes sense.

Matt Robinson: Yeah, absolutely.

Kirat Pandya: So one of the things that I see is, we have customers come to us and they're like, "Hey, what are the best practices out there?"

From that perspective, what advice would you give organizations in a similar position, your peers at organizations that look a lot like Rahi, giant data muxes, if you will?

What advice would you give organizations in a similar position?

Matt Robinson: Right. Well, it is funny. It's like, "Well, we can't give them too much of a competitive advantage. They're not allowed to watch this broadcast." No.

But in all seriousness specific to our own investments in this, logistics and global warehousing and these large order managements, I would encourage leaders who feel they're in a similar scenario to really start looking at all the touch points that their businesses have with the various suppliers, vendors, partners, customers.

You think you might have a handle on it. You probably don't. And digging into that is really important to set a baseline for what is the scale of the problem you're running into.

I'd say, from there, you need to take a look at where data is passed back and forth, and really try to decide which of those intersections can be automated or simplified through the use of technology. And sometimes it's looking at which problems or sets are bigger or more impactful to the business, which can be another way in which to index on the way in which data goes back and forth.

I personally don't think we're going to get away from, we'll say, traditional communication scenarios, because there are so many times where data flows in and out of a company through very manual, as you describe, messy processes. The idea of spreadsheets, emails with document attachments, cloud drive shares, the list is huge.

And all of these scenarios can really have a time metric as well as a cost metric for them. And for us, we found that that could either be reduced or completely eliminated with Osmos.

Kirat Pandya: Got it. Got it. No, that makes sense. So when you set out to find a solution to this problem around messy, external data, what were your key goals? What were you looking to solve?

To be honest with you, again, going by the theme of where I started, I'm looking to really understand how you thought about this problem from a first-principles perspective.

What were your key goals? What were you looking to solve?

Matt Robinson: Yeah. And there's a few key points. I mean, we could always look at the good, fast, cheap triangle and say, "That's the strategy here." But if I looked specifically to what we were doing, I'd say our primary goal was really to reduce costs.

If you think about scaling out with people instead of technology, that really means that we had to evaluate what it would mean to say, "Okay, I'm going to put technology in place of all of these individuals and move towards that model and what was the cost impact and what was the long term TCO associated to that."

But the next metric that was important for us was accuracy. Believe it or not, our accuracy rates with our resources were really, really high. And when you bring technology into the picture, and we're all engineers, or have been engineers at one point, that can introduce inaccuracies.

And that's something, in terms of working with suppliers, with deadlines and commitments, that we absolutely needed to avoid. And especially when data's messy, which was the case for us and frankly, I think it's the case for tons of companies out there, it's really important to have people identify those little nuances.

And that can be in line items, it could be in key values that are located in the wrong location in the data set that's coming in. Even descriptions can be important to carry over in a consistent way. And having those converted properly into the new format is absolutely important.

So for us, we absolutely needed to make sure that the data was cleaned up in a way that ensured everything in the order management process worked smoothly. So those are two goals: accuracy and cost.

I'd say the third is really around speed. Can we speed up the process? Obviously, having people do this work and really trying to drive it around batch jobs was important, but again, giving time back to those team members to engage our suppliers, engage our customers, really improves our presence as a whole and provides a benefit to Rahi in terms of increasing that customer interaction.

Humans + machines for messy data cleanup

Kirat Pandya: Makes sense. I mean, I'm recalling back to the early days of that first project we worked on, and it was this problem of, you had a customer who had a customer part number that needed to map to a manufacturer part number. And when you think about accuracy for this kind of messy data, it really struck me that humans can be very accurate. It's a question of time and energy spent, right?

Matt Robinson: That's right.

Kirat Pandya: And on the other hand, a machine will make mistakes that are different from a human's. And so it's not one to one where, "Hey, the manufacturer part number was in the description field and we extract it." And then you also have to account for the fact that when the machine does it wrong, make sure you leave enough breadcrumbs for the human to come back and fix it and be like, "Hey yeah, you chopped off the last number accidentally, machine. You got it wrong."
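Editor's illustration: the breadcrumb idea described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not how Osmos or Rahi actually implement it. The part-number pattern, the `Extraction` record, and the `extract_part_number` helper are hypothetical names invented for this example. The point it tries to show is that the machine keeps the untouched source text, the exact span it matched, and a review flag, so a person can later see what was extracted and correct it.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: a manufacturer part number looks like 2-4 capital letters,
# a dash, then 4-8 digits (e.g. "APC-8841"). Real catalogs will differ.
PART_PATTERN = re.compile(r"\b([A-Z]{2,4}-\d{4,8})\b")


@dataclass
class Extraction:
    """Extracted value plus the breadcrumbs a reviewer needs to check it."""
    value: Optional[str]              # extracted part number, or None
    source_field: str                 # which field it was pulled from
    source_text: str                  # the original text, left untouched
    span: Optional[Tuple[int, int]]   # (start, end) of the match in source_text
    needs_review: bool                # True when a human should take a look
    note: str                         # why the record was (or was not) flagged


def extract_part_number(row: dict, field: str = "description") -> Extraction:
    """Pull a part number out of a free-text field, keeping breadcrumbs."""
    text = row.get(field, "")
    matches = list(PART_PATTERN.finditer(text))
    if not matches:
        return Extraction(None, field, text, None, True, "no part number found")
    first = matches[0]
    if len(matches) == 1:
        return Extraction(first.group(1), field, text, first.span(1),
                          False, "single match")
    # Several candidates: take the first, but flag the row for a human.
    return Extraction(first.group(1), field, text, first.span(1), True,
                      f"{len(matches)} candidates; took first")


if __name__ == "__main__":
    result = extract_part_number({"description": "Rack PDU, mfr part APC-8841 qty 4"})
    print(result.value, result.needs_review, result.note)
```

Because the original text and the match span travel with the extracted value, a reviewer who spots a chopped-off digit can see where the machine looked and fix the record without hunting for the source row.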

So it's been such an interesting learning journey working with your team on this very specific problem. The other thing I would add, Matt, is it was incredible watching your team execute, through the pandemic, some of the stuff you guys did in terms of keeping organizations functional, like the AV rigs you guys sent to your customers so people could work effectively from home.

That took some incredible logistics at a time where logistics was coming to a grinding halt. So, clearly you guys are doing stuff right.

Matt Robinson: Yeah. We're proud of the efforts that the Rahi team has put in overall. I think everything from order management, through logistics, our warehouses, our sales operations.

The pandemic, I think was a struggle for everybody. And while trying to implement a new technology solution, while also going through some of these behaviors around what work meant and how things were impacted from a supply chain standpoint, it was tough.

But I'd say overall, we're really proud about where we're at now, given so many of the changes. I would say that even today, seeing a lot of the supply chain impacts and maybe I'll say just for the date, this is second half of '22 calendar year, seeing some of the supply chain challenges that are existing around these popcorn parts out in the manufacturing sector is also creating challenges, not just in terms of how the orders are coming in, but how they stack up over time.

So this data problem, whether we're talking about pandemic impacts to how data and orders come in and what we can do to simplify that for our customers, to now, how do we deal with challenges and orders where maybe what you're looking for isn't out there, and we start to have to query a lot more suppliers to try to provide the same type of need at scale. It's an interesting challenge, I think, for us overall.

It's been very helpful for us to work with Osmos and know that the data's coming in consistently in that way.

Kirat Pandya: That's good to hear. It sounds like, especially in a volatile pricing, logistics, and availability environment, it's a constant churn of suppliers. You need vendor pricing catalogs brought in. So yeah, makes total sense.

So from that perspective, where does an External Data Platform fit into your data stack? And how does it fit into and help tie back to your overall technology strategy?

Where does an External Data Platform fit into your data stack?

Matt Robinson: Yeah, it's an interesting question because obviously there's a multifaceted way in which Rahi works and engages customers today. In terms of where we started and where we continue to work, obviously Osmos Pipelines, as you know, is key to a lot of our business and the ingestion of orders from suppliers, merging in and connecting it to our ERP data, dealing with part number stuff, as you mentioned, just so that we can streamline all of those elements together for our suppliers and customers and some number of internal use cases.

We also use it to extract part information for quoting purposes, which actually helps activate some of our sales organizations so that we can simplify quote creation, which again, if you think about it being that first mile, as you described earlier, it then feeds into sales operations and feeds into other order systems, which then goes to logistics.

So, the ability to get things right up front really trickles down into benefits long term through the entire cycle. But to that multifaceted scenario, looking forward, we're looking for scenarios where we're not only solving some of our own problems and seeing how Osmos enables us, but we're really trying to take it a step further.

So now with a cloud practice that we have, and having this ability to engage customers around data transformation, pipelining, and maybe even migration, we see Osmos as being a very powerful tool, not just for our cases, but for some of our big customers as well.

So we're engaged with you to really explore, "Well, how do we take that partnership and be able to work now that we've built our own skills with Osmos and our ability to correct our own data sets? How can we take that forward to our customers and find opportunities to engage them with their own data migration and transformation needs just beyond our own internal use cases?"

So we see this as a continued growth between the organizations. Obviously, we're enjoying being a customer, but we see that there's a lot of other opportunities out there for other customers.

Freeing up teams with technology

Kirat Pandya: That's great to hear. I'm super excited about us working together on that space. I think bringing together Rahi's expertise of having solved the problem in-house with some of Osmos' technology is going to be pretty cool and exciting to see in the real world, in industry, sort of building that out together.

Matt Robinson: It's funny because I've always enjoyed picking up, I hate to say VI, but I used... that's my favorite editor. I've come from the old school days and being able to get to sit down with some Python code and write out some mechanisms to manipulate data and really change things around.

But it got to a point where I was like, "Well, let me show you this really cool interface with Osmos," and I could bring some data in and show how you can... "Oh, check out how the program synthesis is adjusting the data. Look how it's actually re-manipulating these fields below there and showing how you can join or transition data from one format to another."

It's actually really exciting. And while I enjoy coding and I enjoy sitting down and looking at how I may be able to do this manually, it's really nice to see how the tool simplified that experience. It made it so much easier, to where it almost starts pulling you away from the code side and really having you focused on what the platform can enable.

So in that context, I'm always going to try to hold onto my programming roots, but it's also nice to know out there that there's an easier way to do some things. There's always a better way to build a better mousetrap, if you would.

Kirat Pandya: Oh, a hundred percent. The other piece, Matt, is I see how, since you guys have adopted Osmos, people within your organization who were previously tied to living in Excel or Google Sheets are now empowered to take what was a single cleanup process every day to automation, things that they couldn't have done before.

So I view this as taking engineers like the two of us away from code, so be it, but empowering the rest of the organization to also pitch in and help move the needle, if you will, on automation.

Matt Robinson: Yeah, absolutely. I mean, we've got phenomenal engineers here at Rahi. They do everything with cloud, to data, to AV. I mean, you name it. We have just excellent, excellent engineers, but to make the technology approachable in such a way that now people in sales operations or people even in marketing, or in finance can use this as a mechanism to improve their data sets.

It's a really powerful tool. I mean, there... And even if I go back to the early days of working with Osmos where we were initially working through how you looked at the connectors on each end and where data's flowing, now, it's like, "Hey, just give me an ugly CSV file or give me a bad Excel file, and now I'll help you clean it up."

That's really approachable. And that's a very different model than somebody on my end where an engineer would come in and maybe build out all the macros or write some code to clean that up. It can now be just a simple drag and drop operation.

Kirat Pandya: Maybe an engineer who opens up VI, right?

Matt Robinson: Yeah, exactly. Exactly. I've exposed my weakness. I am an old school editor person at heart.

Kirat Pandya: Same page, same page. My team tries to take my laptop away and be like, "No more code for you." So, no, I hear you. Hey Matt, thank you so much for taking the time to just teach us more about what powers the real world, if you will.

Matt Robinson: Thanks Kirat. Thanks everybody.

Kirat Pandya: All right. Thanks everyone in the audience for listening in and we at Osmos are excited to help you go on a similar journey here in terms of next leveling the way you deal with messy, external data.

Kirat Pandya

CEO & Co-founder