Season 2 · Episode 1

Data Observability, Customer-Led Growth, and Confidence with Barr Moses

Barr Moses talks with Sam about bringing DevOps into data engineering, building a data startup, and letting joy guide your way to creating impact. Learn how being data-driven depends on systems of people and trust.

Published June 10th, 2021  |  27:26 Runtime

Episode Guest

Barr Moses

Co-Founder and CEO at Monte Carlo

Episode Transcript

Hi, I'm Sam Ramji and you're listening to Open Source Data. We're kicking off season two today with Barr Moses, CEO and Co-Founder of Monte Carlo. Welcome Barr.

 Hi Sam. Great to be here.

Great to have you on the show.

We like to start each conversation asking our guests what open source data means to them. So what does it mean to you?

 For me, I think it's really about the strong community around open source and data. It's about the people and the energy and the movement. And I think what we're seeing in recent years is kind of this explosion of solutions, and options, and tools, and so much movement happening, in data and in open source in particular. You know, we're seeing more complex architectures, more data sources, more teams using data, more companies actually becoming data-driven -- it's no longer just a term.

And so there is some tension between trying to make data actually open, while also ensuring it's compliant and reliable, secure, and safe. And so there's a bunch of new challenges that are emerging as a result of that. Questions like data trust, data discoverability, data ownership, new roles in data, etcetera.

So I'm really excited for the community and the movement to help figure out how to do that. I think in that tension, in that creativity is where the magic will come. So really excited about that over the next few years.

And you're taking a pretty big stand in this environment, right? You're creating a category of data observability, which is kind of a big deal, right? 'Cause if you can't see it, it can't really be open. You know, I think about the scene in Indiana Jones and the Raiders of the Lost Ark, where at the end they're putting the Ark of the Covenant in a giant warehouse, right? So if your data looks like that, it's not really open, even if you had the keys, 'cause you can't discover it. So I'd love to hear you talk a little bit about data observability and what it's like to think about category creation and how you explain it to folks.

 In software engineering, the concepts, or the DevOps best practices, have been well known, right, and well understood. And the concept of observability is a very important one in software engineering, right? Observability helps engineering teams make sure that their apps and infrastructure are up and running and reliable. So everything that we take for granted today, like, for example, the fact that we can watch Netflix whenever we want, or, you know, we can go to Google and search for something, all of that depends on apps being up and running 24/7, right? We really take that for granted.

For some reason in the data space, we're still catching up on all of those concepts, not just observability, but all the concepts of DevOps and best practices. But in particular on the concept of observability and reliability of data, we have a lot to improve.

What do I mean by that? I mean that 5 to 10 years ago there were only a very small handful of people in the company that were actually using data. And maybe they were using that just to look at a couple of numbers once a quarter and that's it. And so in those cases, you could do a lot of things very, very manually. And maybe you didn't even need to think about things at scale or automation or, you know, machine learning. But today, when you have a high and growing percentage of people in the company who are actually relying on data, you have people making decisions based on data, using it in real time, powering products with data in real time. In particular, in industries like FinTech and e-commerce and media, where their reliance on data is incredible. But also, you know, in other companies like in B2B and marketplaces. Really, you know, no company is that far from using data. And in that new reality where we have real-time decision-making with lots of people, lots of data sources, the stakes are higher.

You can no longer shove things under the rug or, you know, do things really manually or overlook that report or that number that looked a little bit wrong and you're not sure why, but ugh, let's just not talk about it and figure it out later. We can't do that today. And because of that, we need to start figuring out how we solve these problems at scale in an automated way.

And so observability is a concept in DevOps which is basically the idea of understanding the health of a system based on the outputs of that system. And so what if you took that concept and applied that to data? What if we could understand whether the data is reliable, or whether the data can be trusted, by observing its outputs?

And I think this is probably the biggest challenge that we have to actually adopting data. Because if you look at companies trying to use their data, the biggest problem that ends up happening is people are like, "Um, the data doesn't look right. So let's just not use it. Let's just resort to gut-based decision making, no big deal."

And so not being able to trust your data is, in my opinion, the number one challenge to actually becoming data-driven and adopting data and fulfilling the promise of data. Definitely an important one to figure out pretty quickly.

There's a really nice parallel that you drew between DevOps and maybe what we could call DataOps, right? A movement towards more trustworthy, more reliable real-time data. One of the things that makes DevOps work is the visualization of a pipeline. Not just because it tells you whether or not you can trust the system, but also because it tells the team how to behave. You know if it's up or down, you know which link broke, you know what's red, you know what screen.

And then you can take a corrective action pretty quickly, hopefully blaming the process and not blaming the people, right? Everybody can kind of cluster around and we can be better humans together as we solve it. Data pipelines have been pretty fraught with availability issues and complexity. Somebody changes a system ACL and all of a sudden part of the pipeline breaks.

So there's a lot of fragility in data pipelines that we haven't seen in software development in many, many years, because we have so much experience now in what we can call DevOps and code pipelines. That seems like it's been an inspiration for you in what you built. I'd love to hear you talk a little bit about, first, why has data gotten away with this more monolithic and lagging behavior for so long? Because it seems like it's several years behind compute and applications. And then, how do pipelines change the world for data engineering?

I totally agree with you on what it looks like today, and on how the ability to bring that visual of the pipeline helps start that discussion. It reminds me of when I was at Gainsight. You mentioned this: I led the data and analytics team, and we basically called it the GonG team.

It was Gainsight on Gainsight, because we were using Gainsight data internally to make decisions. And I remember just literally every Monday morning or something like that, something would break. Someone would call us out. Our CEO or our customers would be like, WTF, why is the data wrong again? What happened? And then we'd start this mad rush to figure out what broke. Is it at the report level? Is it somewhere in the transformation? Or is it, you know, a data source where maybe someone changed the API? There could be so many things that would break.

 And actually, I remember vividly, I gathered the team together, we got into a room and we kind of whiteboarded what the pipeline actually looks like. And we're like, okay, here are all the data sources. And here's the next step right after that. And here's the next step right after that. And here's the transformation that we're doing right after that. And here's the report. We just mapped it out manually. And we're like, okay, now we can start understanding what breaks where, and trying to triage that. And then we could start asking questions like: is everything arriving on time at each of the different points of the pipeline? Are the transformations actually accurate? Does this logic make sense? And is the volume of data right?

Is it the same volume of data that we got yesterday, right? All these questions that were just really hard to answer. And I remember thinking to myself, how are other people solving this? Asking myself, am I crazy? Is the world crazy? Are you crazy? What is going on here?

How is there not an easier way to solve this? And, you know, I think to your point, the first thing is to start blaming, and, you know, I was blaming myself. I was like, oh man, we really don't have the tools to solve this. How am I ever gonna change the situation that we're in? And as the person responsible for data, that's the worst, right? You're like, I'm responsible for one thing: getting the data right.

I think that's when I realized that I wanted to understand what other people are doing. And I actually spoke to, you know, several hundred data teams, asking, like, what are you doing to deal with this?

 The good news was that I was not alone. And the bad news was that I was not alone. Many other teams, or almost all teams, actually struggled with this as well. I think a lot of this has to do with the fact that there are a lot of best practices that were developed in other industries that we have not adopted in data quite yet. And we were trying to make better use of the data that we had, but we did not have the solutions and the tools for that. And when you look at engineers, they have tools like AppDynamics and Datadog and PagerDuty. All of those are technological advances that help engineering organizations perform well. You'd have to be crazy to run an engineering organization without something like that, right?

And then for some reason, data teams are like, oh yeah, let me go manually map out everything and, like, try to figure it out. And that just blew my mind ... that difference. That seemed to me just a totally crazy world to live in. And it was clear to me that this problem is only going to get worse. And in 5 to 10 years from now, there's going to be lots more people like me who are asking themselves, are we crazy? Why isn't there a better way to do this? And it was just really clear to me that the world will have a data observability solution, will have something like Datadog or PagerDuty or AppDynamics, but for data teams, and that we need that.
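Two of the whiteboard questions in this answer, whether data is arriving on time and whether the volume matches yesterday's, can be written down as simple automated checks. What follows is only a minimal sketch in Python, not how Monte Carlo or any other product works. The function names, the 12-hour delay, and the 50% tolerance are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta


def is_fresh(last_loaded_at, now, max_delay):
    """Return True if the latest load landed within the allowed delay."""
    return now - last_loaded_at <= max_delay


def volume_looks_normal(today_rows, yesterday_rows, tolerance=0.5):
    """Return True if today's row count is within +/- tolerance of yesterday's.

    The tolerance is a hypothetical threshold; a real system would likely
    learn it from history instead of hard-coding it.
    """
    if yesterday_rows == 0:
        return today_rows == 0
    change = abs(today_rows - yesterday_rows) / yesterday_rows
    return change <= tolerance


# Hypothetical observations for one table.
now = datetime(2021, 6, 10, 9, 0)
last_load = datetime(2021, 6, 10, 3, 0)

print(is_fresh(last_load, now, max_delay=timedelta(hours=12)))
print(volume_looks_normal(today_rows=4_000, yesterday_rows=10_000))
```

Checks like these only look at the outputs of the pipeline (load timestamps and row counts), which is the sense of observability described earlier in the conversation.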

Now you've developed a very particular point of view on prescriptive guidance for what the data team ought to be constructed out of. Right. What are the elements? What are their responsibilities? You do a lot of work with your customers on helping them understand, kind of get out of the fog of war.

 Maybe everybody is not a data analyst. Maybe everybody's not a data scientist, but how do you construct those? Can you talk a little bit about that?

 As people, myself included, obviously, we have a tendency to be very self-centered, or kind of start with, like, well, we care about this, like an outside-in view. And one of the things that I've really learned from working at Gainsight and helping create the customer success category is to try to fight hard for that and actually start designing with the customer in mind. And that customer might be a data scientist or a data engineer, or it might not be. It might be someone in marketing or in sales or in support.

 In data, if you think about data as a product, or if you think about consumers of data, those can be either internal consumers or external consumers of data.

At the end of the day, everyone has customers. We all have customers. And oftentimes when we work with data teams, they will tell us: we are responsible for the data, but we can't actually fix it. It's our engineering counterparts who are making a lot of the changes and who can actually help fix the data, but we are on the receiving end of that. And so who actually should be accountable for this, and, you know, who's the owner of the data when it goes wrong? And it's a great question, right? And that's actually a people question and a process question.

 And I think that starts with thinking through how all of us, in a way, need to be customer-centric. If I'm designing a pipeline, I need to make sure that the pipeline is working well and that the data that's riding in the pipeline is accurate too. We call this the good pipeline, bad data problem. We can design the best architecture in the world, but when the data hits the downstream consumer and they don't understand why the data is wrong, then in a way, the job has failed. Even if you were unaware of that.

When designing an organization, when designing a new process, when thinking about how to make data powerful in your organization, the lens that I always try to take is starting with our customer and designing around that. Who is the customer of what we're doing? What are their requirements? What do they need, what do they care about? What do they wake up sweating about at night? And continuing to ask ourselves that, right? So if I'm with our team, I always ask, like, what problem are we trying to solve? We can have all the technology in hand, the best algorithms, the best solutions, but what are we actually trying to solve? Let's make sure we solve that in particular.

Yeah. And your power can only be as strong as the power of your customer and your promise to the customer, right? One of the things that you kind of pointed out is that there is a lot of politics in every organization, and I've used the distinction that politics is the game of who gets to say what about what.

And data is so often a second-class citizen because the political power is in the apps team. And we think about app-driven data. And so then it's trapped in a microservice and it's kind of ETL'd out, and you've got to get permission. You've got to talk to the engineers, versus being able to go and have a new class of customers. Maybe the chief marketing officer, maybe the chief revenue officer, and, like, what are you doing to serve them? How can you elevate the power of the promise of data? And then you can start to walk back downstream and construct the right org. Does that seem consistent with your experience?

 Yeah, I think that's exactly right. Clarifying where the consumers of data are, why this data is being used, who should care about this, right? Is that the chief marketing officer, or our customers, or someone else? And then also starting to put together how we collaborate around that problem. What contracts can we have between us? What are some SLAs that we need to put in place to make sure that, you know, we're all holding up our end of the contract?

Again, these are things that are pretty straightforward in software engineering, but we haven't adopted them yet. We're not reinventing the wheel here, which I think is the good news.
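To make the idea of a contract or SLA between data producers and consumers a little more concrete, here is one hypothetical way to write it down in Python. Everything in it (the `DataSLA` class, its fields, the table and team names) is an assumption for illustration and not something described in the episode.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DataSLA:
    """A hypothetical agreement between a data producer and a consumer."""
    table: str
    owner: str                # team accountable when the SLA is missed
    consumer: str             # who depends on this data
    max_staleness: timedelta  # how old the data is allowed to get


def find_breaches(slas, staleness_by_table):
    """Return the SLAs whose observed staleness exceeds the agreed limit."""
    return [
        sla for sla in slas
        if sla.table in staleness_by_table
        and staleness_by_table[sla.table] > sla.max_staleness
    ]


# Made-up SLAs and observations.
slas = [
    DataSLA("orders", owner="platform-eng", consumer="finance",
            max_staleness=timedelta(hours=6)),
    DataSLA("campaigns", owner="data-eng", consumer="marketing",
            max_staleness=timedelta(hours=24)),
]
observed = {"orders": timedelta(hours=9), "campaigns": timedelta(hours=2)}

for sla in find_breaches(slas, observed):
    print(f"{sla.table}: notify {sla.owner}, consumer {sla.consumer} is affected")
```

Naming an owner and a consumer on each SLA is one way to answer the accountability question raised earlier: when the data goes wrong, it is already written down who fixes it and who needs to know.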

It's super exciting, right? To be able to do the cross-domain appropriation. Now, as CEO, you have to make real, particular decisions about how you run the business, right? Your philosophy, your ethics, your aesthetics, your principles of what company you are.

And in Silicon Valley, there's a lot of conversation about product-led growth businesses, but you have kind of a contrarian opinion about product-led growth. And you've talked about customer-led growth. What can you say about what brought you to that perspective, and maybe frame the perspective so that others can follow?

 Yeah. You know, I think this goes a little bit to the question that we talked about before. What is our starting point? When we started the company, we had this very theoretical idea that I just talked about: in data, we should have something like Datadog or AppDynamics to better serve our customers and have reliable data. Okay, great. What does that mean? In theory, go design a product based on that, right?

 Part of creating a new category means that you're building a product that was never built. You're selling that product to customers who've never bought a product like yours. So how do you actually define what to build?

One option is to start with, let's actually build the product that we think this should be, and then try to sell that. And the other approach, which we took, was let's start with our customers. So we actually got access to customers' data and started working with them on their problems before we even wrote a single line of code. And we actually just asked them, like, tell us about the most recent time you had data downtime. Show us what it looks like in your data. What are the symptoms of that? So when that engineer made a change to the schema upstream, and that resulted in a downstream table that had a freshness problem, and then one of the reports went completely wrong, and, you know, your COO was really unhappy with that, show me that in the data. Tell me what that looks like, right? Let's make it really concrete.

 And we did that with so many companies. We actually compiled basically a repository, or an analysis, of what that looks like. And based on that, based on real conversations and real data, that's how we designed the product. And we've been building the product since, with our customers, hand in hand. And so as, you know, CEO and founder of a company, you have to straddle both: what do customers want, and really solving the customers' problems at hand, and also innovating, because you're creating a totally new category. You're building something that has never been built before.

 And so when you think about customer-led growth, it comes from both, you know, designing your product around the needs of your customer, and also recognizing that all of the answers to all of your questions lie within your customers, not anywhere else. That's really what should be guiding your strategy and your operations and your product. It starts and ends with your customers, actually.
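The incident described in this answer starts with an engineer changing a schema upstream. One simple way to surface that kind of change is to compare two snapshots of a table's schema. The sketch below assumes each snapshot is a plain Python dict mapping column name to type name. The column names are invented, and this is not a description of how Monte Carlo detects schema changes.

```python
def schema_diff(before, after):
    """Compare two {column: type} snapshots and report what changed."""
    before_cols, after_cols = set(before), set(after)
    return {
        "added": sorted(after_cols - before_cols),
        "removed": sorted(before_cols - after_cols),
        "retyped": sorted(
            col for col in before_cols & after_cols
            if before[col] != after[col]
        ),
    }


# Hypothetical snapshots taken on consecutive days.
yesterday = {"order_id": "int", "amount": "float", "region": "string"}
today = {"order_id": "string", "amount": "float", "country": "string"}

print(schema_diff(yesterday, today))
```

A diff like this says what changed but not what it broke. Connecting it to the stale downstream table and the wrong report would also need lineage information, which this sketch leaves out.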

 Such a contrarian perspective, because as a consumer-based economy, I think it's kind of natural for us in the US to think of “you are what you buy”: the tools that you use define your job and who you are. Whereas you're really coming from an outcome-driven point of view, right? What are the outcomes that you'll produce for the customer? You are the outcomes that you can cause.

 Yeah, very much. What we do is shaped by who we are and the experiences that we've had. You know, I grew up in Israel, moved to the Bay Area at a relatively young age. And so I really got used to adapting to new environments.

 My parents, they couldn't be more different. So my dad, actually, is a physics professor. So very science-driven and intellectual, and likes to discuss, like, intellectual topics. And my mom, she's actually a meditation and dance teacher. So way more spiritual and centered and connected to herself. Basically any question that I would ask them, I would literally get opposite answers. There would never be any consensus. I was pushed to develop my own opinions and perspectives very early on, and being able to recreate that in a startup is fun, because for any question in a startup, you can get five people that will tell you one thing and five people that will tell you the exact opposite. And there's an abundance of strong opinions on all of these questions, and at the end of the day, you have to mark your path, and you're making the bet on what you think, like, the right product feature is, what type of customers to work with, you know, when and how to launch, what function to build and when. All that stuff, there's no playbook. You've got to go create it.

I'm not an engineer by background, but I love science and data and, you know, love working with the best data teams in the world. And, you know, I was part of the customer success story and the creation of that category, and was really fortunate to work with amazing people there.

We bring a lot of that to what we do today. So, at Monte Carlo we really focus on our customer pain points and putting our customers at the center. We try to make our content really approachable, whether you're an engineer or not, whether you're a data scientist, an ML engineer, or a data analyst. We write a lot about data downtime. We could have made it extremely technical and, you know, written only about the deepest set of schema changes and duplicates and all that stuff, but we focus on the stories. We focus on, you know, how it impacts people, what people care about, what are some things that you can actually just adopt and start using tomorrow.

And so, you know, I think when you're creating a new category, you're creating a new product, and that also requires adapting to all these new environments. And if you think about the data industry today, it's insanely different than where it was two years ago. I remember when we started the company, it was right around when Tableau and Looker got acquired, and those were some of the first big acquisitions kind of in the data space. And you know what we've seen since, with, you know, the IPO of Snowflake, right, which is the largest software IPO of all time. Having that be in the data industry is amazing, right? And I think it really speaks to the future of the data industry and its importance overall.

So, yeah, I think that's a very powerful sentence. You know, "you are what you do." And it has lots to do with the experiences and stories that have shaped our lives. Yeah. 

 There's a lot of boldness in that too, because the whole thing can be almost summarized as decision-making under uncertainty.

Like, you're always in something of an information vacuum. So you gotta have your mental model, you gotta have your aesthetics, you gotta make the call again and again and again. And it's that compounding effect of all the decisions that gives you, like, a style of play and a character to your products and a culture for your company.

 And it's ironic because we're in the data industry. So we're supposed to be using data, but in a startup, you have exactly zero data to base the decision on, especially early on. Of course, as you develop, you know, you collect more and more data. But really early on, when it's just the first handful of people, the first handful of customers, there is a lot of decision-making under ambiguity and with very few concrete data points, and you still need to make a decision and move forward.

So we always talk about that internally, you know, how data-driven we are and how data-driven we should be, given we're a data company. And yet, the world doesn't always present that option to you.

 Especially in the earliest stages, right? You are doing just tremendous amounts of inference.

Which is something that we're so good at. And then we become amazed that computers can do some levels of inference and we call it artificial intelligence. I really prefer calling it machine learning, because really we can see what the inference that's being made is. When you can find me an AI product manager, then I will eat my words and probably eat my entire computer. I'll just put it in a blender, make a smoothie out of it. But I don't think I'm in any danger.

 I'm really curious to hear some of your stories around what brought you into data as an industry. And then maybe close with, what guidance do you have for people who are maybe just graduating out of, you know, one of the hardest years in recent memory, and trying to figure out what to do next with their lives. There's a lot of interest and excitement in data. And now that it's 2021 it's become a lot more obvious. But let's start with how you got into it, because data was a lot less obvious as a choice when you took that path.

 I was born and raised in Israel. So I served in the Israeli Air Force as a commander of a data analyst unit. It started a little bit back then, but then I moved to the Bay Area, sunny California. Studied math and stats, which really felt kind of like Disneyland but for my brain. You know, there were just so many interesting classes and different things that I could take. And I was like, oh man, where do I even start? Right. Every different class seemed like a different rollercoaster that I could go on. I loved that period of time. I mean, it was really hard, you know, it kicked my butt for sure. But I just loved that period of time.

And since then, I got to work with data teams in different environments as a consultant at Bain and Company. Oftentimes I actually learned about how to make strategic decisions without data. That's the first time where I really saw how companies will collect as much data as they can about a decision. But then at the end of the day, oftentimes, you know, you don't have the perfect data and you still need to make a decision about M&A or a product launch or a new market. And you make all these decisions with imperfect data, and you learn how to do that and how to, you know, be confident about that and how to drive a hypothesis-driven workstream.

I've worked with data in different formats. I wouldn't say that it's a particularly conscious choice. When people ask me, what do I want to be when I grow up? It's typically like, you know, I never want to grow up. I really optimize locally, if you will.

I think some of the most interesting work that I'm able to do in the data industry is because of how much is happening and changing in the data industry. I think markets where there's lots of movement and dynamics and change are really a place where you can help shape your career. And so, you know, for folks that are interested in learning new things and pushing themselves, there's no shortage of that in the data industry in particular.

And for me specifically, I joined Gainsight, which was a small startup when I joined. You know, we helped create the category, helped build a number of different teams, worked with great people. And honestly, I just wanted to do that all over again. I'm just excited about working with great people on really big missions and making a change and leaving an impact. And that's really what this is about. Every day, every minute, having a lot of fun.

 I feel that everyone listening can be uplifted by your optimism and your enthusiasm. So I'd love to close on a piece of advice that you have for somebody who's early in their career, or maybe just graduated, or maybe is trying to figure out what to do with their excitement about math or statistics or computer science or anything, and is being pulled into this expanding universe of data where there's growth and where the Coke tastes better.

 I think it's my role and the company's role to find a really strong match between what the company needs and what your desires or energy are. That's where the magic happens, when you find a strong fit between what people love working on and the problem that needs to be solved. And so, I think you can try to over-optimize for a lot of different things in your career. For me, it was, and I think still is, about having fun. I think life is just too short to do anything else. You know, I think if you actually optimize for that, you will also solve really hard problems and will work on things that matter and will have an impact. And so I think listening to what brings you joy and fulfillment in your work is where you'll also be most successful. Listening to that inner voice that's telling you maybe to be a little bit irresponsible and have fun. I actually think in the longer term it pays off to listen to that inner voice.

 That is awesome advice, Barr. Thank you so much for your time and your generosity of spirit.

 Yeah, of course. Thank you so much for having me.