
Season 2 · Episode 5

Cloud-Native, Open-Source, and Collaborative with Eric Brewer and Melody Meckfessel

What's the future of open source and distributed systems? Google Fellow & VP of Infrastructure Eric Brewer, Observable CEO Melody Meckfessel, and DataStax Chief Strategy Officer Sam Ramji explore the state of the art, the near future, and grand challenges for the next decade in cloud-native data.

Published August 5th, 2021  |  37:19 Runtime

Episode Guests

Eric Brewer & Melody Meckfessel

Episode Transcript

Sam:

Well, hello everyone. I am delighted to be hosting a conversation with Melody Meckfessel and Eric Brewer today about cloud-native data. We're going to spend about 25 minutes exploring the state of the art, the near future, and grand challenges for the next five to 10 years in cloud-native data. This is an awesome opportunity to talk with two of the people who have done more with cloud-native and DevOps than most folks in the industry.

Melody Meckfessel spent 14 years at Google, where she built the infrastructure and led a thousand-person engineering team that did all of Google's DevOps, then brought that to the public in Google Cloud DevOps. About a year ago, Melody formed a new company: she became CEO of Observable, which focuses on data visualization and builds on the high-performing open source project D3.js.

Eric Brewer is a Google Fellow, the father of Kubernetes, and an author of Spanner and many other cloud-native core capabilities. I learned a great deal in the last few years from Eric and his hard-won experience building distributed systems for search and for core infrastructure. Eric is also a professor emeritus of Computer Science at Berkeley. So good morning, everybody.

Melody Meckfessel:

Good morning.

Eric Brewer:

Morning.

Sam:

So, Melody, I thought you might lead us out and talk a little bit about what you're seeing in the structure of cloud-native data. You've had the brag book of what it means to do DevOps at high speed; relating that to data, I'd love to hear what you're seeing in the world every day.

Melody Meckfessel:

Thanks, Sam. I am really interested in the human element: the folks who are actually writing the software and having to interact with the data. We know there are more people writing software than ever before. We can call those folks developers, but I like to call them creators, because they come from diverse educational and domain backgrounds.

And it's fascinating to look at the rise of open source, right? We sort of take it for granted now. The open source way of developing is here to stay, and we know that today's software is either open source or built on top of open source. So for humans, it means we're more dependent on code written by other people, and we're also more likely to share code that we've written. Those are the social norms we now operate in.

But there are a lot of challenges in that, right? There's a lot of duplication out there, there are a lot of tools, and DevOps really brought us this idea of: how can we automate away the toil, and how can we bring in cultural practices that let us focus on creation? Then we have to reconcile that with the fact that open source is deep and difficult to learn, and ask what we can do to make it easier. So I really think there's a call to action here around improving how we understand and develop software and the applications we're building.

But the interesting part to me is that we need to talk about the data as well when we're writing the software, and about the complexity we have to navigate with that data: data distributed across multiple environments and multiple formats. I see a real risk that our understanding is becoming disconnected from that data. So when I started looking into how we make developers productive, and how we get them closer to understanding the data, visualization was kind of an awakening for me. It is a powerful tool for thinking.

And if you look at the power of D3 and what it's done for people using visualization to understand data, it's a new model for coming to an understanding. The visualization is not the end; it's the means to the end, and the insight is what it gives us. It taps into our human visual system so that it helps us think more effectively, and it exposes the underlying data and code in a way that lets us question, explore, and learn as we build cloud-native apps and try to manage this complex data environment we're in.

So I really see a lot of power in bringing visualization to data and to software development, and in empowering not just what we call developers but also non-developers who want to create and interact with data in powerful ways, and to build apps in a way that draws from a lot of the lessons of DevOps but brings in this more intuitive way of interacting with data.

And I think there's so much opportunity to move beyond looking at static text on a screen as we write code and build the next generation of cloud-native apps. So those are some of the things I'm seeing out in the ecosystem, and I'd love to hear more from Eric about his perspective.

Eric Brewer:

Well, thank you, Melody. It's great to be here today, even if virtually. I was involved in open-sourcing Kubernetes and the reasoning behind that, and that certainly has gone well, I think it's fair to say. When I think about why we did that and what we wanted to get out of it, a few things stick out. The first, of course, is that we wanted it to be a platform for everyone, something that everyone could contribute to, and open source is by far the best way to do that.

But we were also trying to change the whole industry, right? The goal here is to make the way you do development and the way you run applications much more application-centric, and much less based on the virtual machine, OS, and libraries that you historically had to deploy, which really had nothing to do with your application per se, right?

So containers are a big part of that. A containerized structure for your application allows you to decouple some of this, but then Kubernetes is really more about the further decoupling: how do you decouple the policy for running things? How do you decouple things like SLAs from the particular services? That's what a lot of Istio is about.

And so a lot of my focus now is on how we do automation at the Kubernetes level and above, to make the experience better both for application writers and for application operators. How do you run a highly scalable service? How do you auto-scale? These are things that need to be done through automation, and when done that way, they're a great experience for developers. So I think the two are deeply tied together.
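To make the auto-scaling point concrete: the horizontal autoscaling built into Kubernetes follows a simple proportional rule, sketched below in TypeScript. The function and numbers here are illustrative, not Kubernetes source code.

```typescript
// A toy version of the auto-scaling decision described above: automation
// watches a utilization signal and picks a replica count, so no human has to.
// This is the proportional rule used by the Horizontal Pod Autoscaler.

function desiredReplicas(
  current: number,
  observedUtilization: number, // e.g. average CPU across pods, 0..1
  targetUtilization: number    // e.g. 0.6
): number {
  return Math.max(1, Math.ceil(current * (observedUtilization / targetUtilization)));
}

console.log(desiredReplicas(4, 0.9, 0.6)); // 6 replicas to bring CPU back to target
console.log(desiredReplicas(4, 0.3, 0.6)); // 2 replicas when load drops
```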

The second broad area I'm looking at this year in particular is open source supply chain security, which is really about how you trust the software you're importing, right? We are all now importing many different packages, and not all of them are trustworthy. So how do you figure out which ones are? How do you tell when you're at risk? The short answer is, we as an industry don't have a great answer to that yet. We don't even have good metadata about which versions are affected by a vulnerability.
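For a sense of what such metadata could look like, here is a minimal sketch, loosely modeled on the open OSV vulnerability schema (field names simplified; not the full spec). With structure like this, "am I affected?" becomes a query rather than a human reading prose.

```typescript
// Machine-readable vulnerability metadata, loosely OSV-shaped.

interface AffectedRange {
  introduced: string; // first affected version (inclusive)
  fixed?: string;     // first fixed version (exclusive), if any
}

interface Advisory {
  id: string;
  summary: string;
  package: { ecosystem: string; name: string };
  ranges: AffectedRange[];
}

// Naive comparison, sufficient for plain x.y.z versions.
function cmp(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    if ((pa[i] ?? 0) !== (pb[i] ?? 0)) return (pa[i] ?? 0) - (pb[i] ?? 0);
  }
  return 0;
}

// With structured ranges, checking exposure is automatable.
function isAffected(adv: Advisory, version: string): boolean {
  return adv.ranges.some(
    (r) => cmp(version, r.introduced) >= 0 && (!r.fixed || cmp(version, r.fixed) < 0)
  );
}

const advisory: Advisory = {
  id: "EXAMPLE-2021-0001", // hypothetical advisory, not a real CVE
  summary: "Prototype pollution in example-lib",
  package: { ecosystem: "npm", name: "example-lib" },
  ranges: [{ introduced: "1.0.0", fixed: "1.4.2" }],
};

console.log(isAffected(advisory, "1.3.0")); // true
console.log(isAffected(advisory, "1.4.2")); // false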

And we're not even great at tracking vulnerabilities, because it's a very human process today: you write in English what the vulnerability is and what versions it affected. But that's not the automation I'm seeking for all the rest of the platform. So we need to automate vulnerability tracking and automate dependency management; there are a lot of pieces here that would make this whole experience significantly better. In the past we didn't care, because you mostly wrote your own software in house and you trust the code your team wrote.

Now, most of your code isn't written by your team or even your company, so why should you trust it? The answer is, "Well, it's open source," and that's kind of a good answer, but we can make it better. Finally, I'll just add that because of COVID, we ended up having a thousand interns who needed projects they could do from home, and now they're working on open source as well, on all these projects: not just Kubernetes, but things like TensorFlow, Android, Chrome, and Linux. We're hoping they'll develop a lifelong love for open source. This is really a skill I'd like every young developer to have.

Sam:

That was awesome: a thousand Google open source internships this year. Thank you, Eric. You know, it's always a funny thing to look back and ask, "Who had the metaphors that helped us understand the world?" I think Tim O'Reilly was one of the people who really helped so many of us get a sense of the world we're in. He said, "The internet architecture is small pieces loosely joined."

And I think it spoke to the construction of open source as well. It's hard to imagine being cloud-native without open source. So the thing that's been attracting my attention the most is the state of data in open source and in cloud-native, because as those small pieces start to move together more effectively, they can be joined loosely. You can have data that has many different opinions; you can have many different data infrastructures.

But the real transformation I'm seeing in cloud-native data is this: most databases were built in an era where they had nothing they could depend on, and they had to guarantee you awesome raw performance. So they tend to be more bare metal. Then it moved to virtualization, where they still wanted a lot of control over the system under virtual environments. But the move to cloud-native is really fundamentally transformative.

So with Cassandra, what we found is that it's a good structure for a cloud-native database, because it's able to refresh itself, replicate effectively, and spread the nodes, and it uses eventual consistency to get global scale.

But it used to have to take care of all the lower layers by itself, so of course it did that in a slightly monolithic and quirky way. Being able to embrace that idea of small pieces loosely joined, letting go, kind of exhaling, trusting Kubernetes to take on a bunch of that, also lets us improve the lives of operators. One of Melody's quotes in our Google DevOps days was, "No grumpy humans," right? And what makes grumpy humans is having to do the same routine work all the time.

One of the really cool things about Kubernetes is the operator pattern. We've learned a lot actually managing Cassandra on premises, and we've been able to extract a lot of that into a Kubernetes operator for Cassandra. You know, there are a lot of people working on cloud-native data; I don't think any of us has the right solution yet.
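As a rough illustration of the operator pattern Sam describes, here is a conceptual sketch of a reconcile loop. Real operators, including the Cassandra one, typically run in Go against the Kubernetes API; the types and helper functions below are hypothetical stand-ins.

```typescript
// The operator pattern in miniature: a control loop that compares desired
// state (the spec) with observed state and converges them, encoding once
// the routine toil that would otherwise make for grumpy humans.

interface CassandraSpec { replicas: number; version: string }
interface CassandraStatus { readyReplicas: number; version: string }

interface Cluster {
  spec: CassandraSpec;      // what the user asked for
  status: CassandraStatus;  // what actually exists
}

async function reconcile(cluster: Cluster): Promise<void> {
  const { spec, status } = cluster;

  // Scale one node at a time, the way a careful human operator would.
  if (status.readyReplicas < spec.replicas) {
    await addNode(cluster);
  } else if (status.readyReplicas > spec.replicas) {
    await drainAndRemoveNode(cluster);
  } else if (status.version !== spec.version) {
    await rollingUpgrade(cluster, spec.version);
  }
  // Otherwise: actual matches desired; nothing to do.
}

// Hypothetical effectors standing in for real Kubernetes API calls.
async function addNode(c: Cluster) { c.status.readyReplicas++; }
async function drainAndRemoveNode(c: Cluster) { c.status.readyReplicas--; }
async function rollingUpgrade(c: Cluster, v: string) { c.status.version = v; }
```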

But what we're trying to do is figure out how we take what happened in the 2000s, where people like Andy Bechtolsheim and his company Arista and others created a networking infrastructure that could let us simultaneously address billions of devices,

and the 2010s, where this team, Melody and Eric and others, built a compute infrastructure that could scale to billions of nodes, billions of workloads. This next piece, where we bring data along for that ride, I find really fascinating. That's what gets my propeller hat spinning these days.

So let's talk for a bit about the state of the art: what you're seeing, Melody and Eric, in the practices of operating data and sharing data. I think you've got two really different views, because Melody, these days you're focused very much on the end-user experience and the author, right, the editor who's making the data consumable to people through visualization.

And Eric, you're looking at the ability to automate the production of data, right? The infrastructure for data pipelines, among other things. So I think you'll have really different points of view. I'd love to hear from both of you, maybe starting with Melody.

Melody Meckfessel:

Yeah, thanks, Sam. There are three areas I would highlight around data and data exploration. The first builds on a lot of the lessons and evolution of DevOps: we want the exploration of data for developers and creators to be as real-time and as interactive as possible. One of the things we're doing at Observable is you're working in an environment where there's no compiling required. It runs like a spreadsheet.
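A miniature of that spreadsheet-like behavior, for readers who haven't seen it: cells declare their inputs, and changing a value recomputes everything downstream. This sketch only illustrates the idea; it is not Observable's actual runtime API.

```typescript
// Spreadsheet-style reactivity: edit a value and dependents update,
// with no compile or rerun step.

type Compute = (...inputs: unknown[]) => unknown;

class Cell {
  private dependents: Cell[] = [];
  public value: unknown;

  constructor(private inputs: Cell[], private compute: Compute) {
    inputs.forEach((i) => i.dependents.push(this));
    this.recompute();
  }

  set(v: unknown) {
    this.value = v;
    this.dependents.forEach((d) => d.recompute());
  }

  private recompute() {
    if (this.inputs.length > 0) {
      this.value = this.compute(...this.inputs.map((i) => i.value));
      this.dependents.forEach((d) => d.recompute());
    }
  }
}

// data -> mean -> label: edit the data, and the label updates immediately.
const data = new Cell([], () => undefined);
data.set([3, 5, 7]);
const mean = new Cell([data], (xs) => {
  const a = xs as number[];
  return a.reduce((s, x) => s + x, 0) / a.length;
});
const label = new Cell([mean], (m) => `mean: ${m}`);
console.log(label.value); // "mean: 5"
data.set([10, 20]);       // no recompile: dependents just update
console.log(label.value); // "mean: 15"
```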

So when you're writing code and you're interacting with the data, you're getting immediate, real-time feedback. That ability to learn and explore quickly through iteration is dramatically changing what we think of as data science, in terms of the ability to model, explore, and analyze data, and to really move beyond (laughs, I make a joke about this) the workflow where you work toward some output, you get to the visualization, and then you make an image of it and put it in some other format like a slide deck. We need to change that.

And I think the reason we need to change that is that we need to get folks across the organization connected to the data itself, and to the logic and the decision making that happens around that data. So, bringing an environment where everyone can explore and interact, and making it as transparent as possible.

The second thing I'm seeing is that in the open source community there are thousands of examples out there. So one of the things we're trying to do is really harness and bring together communities like D3 and Vega-Lite, so that you don't have to start from a blank slate, right? You can find an example, you can fork it, you can bring in your data, and you can immediately get up and running.

And that model is something we're used to; we just need to bring the community around it, right? Importing code so you don't have to write it yourself. The final thing I'm seeing is sharing and collaborating. We don't write software alone, especially in this complex environment we're navigating, so the ability to collaborate and share what you know matters. I love the example of the internships Google's enabling; I think of all those folks who are now going to be able to share and collaborate out in the open.

And I'm very inspired by our ability to break down the walls between the tools and the people: data scientists and developers working more closely together and sharing information; data analysts, data storytellers, and other types of creators coming together to do the exploration, to have examples to start from, and to share and collaborate. In my view of the world, I want everyone to be part data scientist, so we don't have any barriers around exploring the data and getting to the insight we want to find.

Sam:

I love that quote: "Everyone is part data scientist." I learned a lot from you when I got to work with you at Google as you were building out the SRE tooling. I never really understood what an SRE did, or what they were, until I saw the tools. And I realized, "This is data science in operations." Right?

Melody Meckfessel:

Yeah.

Sam:

So that idea of the visualization shaping your cognition, and then using that collaborative toolset to look at different visualizations so the group could think differently about why the system was down, or what issues were showing up in production. Because that's where science is practiced, right? When your systems are down, (laughs) you need science, because the system is way too complicated to just reason through by logic alone.

Melody Meckfessel:

That's right. Yeah.

Sam:

And you said something else super interesting: you talked about forking data. You and I haven't had a chance to talk about this a lot (again, I've learned from you in your current role), but maybe you could talk a little bit about the code experience, because I don't think most people think of data as something you fork, or of code as something you get to do a lot with around data. People probably associate R or Python with the

Melody Meckfessel:

Yeah.

Sam:

... kinds of things you build dashboards and data visualization art with, but you've done something really different with JavaScript. And I think it's super interesting.

Melody Meckfessel:

Yeah. So the motivation for me with what we're building in the platform is that writing data visualizations, creating them, is really challenging (laughs). Not only do you have an idea or a question you want to ask, but then you need to go to the data, figure out how to bring the pieces together, and model what you're trying to explore.

And then you need to apply the logic, right, and get to a visualization you can interact with. That could be sales regions in a particular area; it could be the best algorithm to apply to the application I'm building, or the best configuration dashboard of what's happening with my service right now, and how I would visualize that.

It's difficult. Those data pipeline steps are challenging. And the way most people work now is that the data scientists are in their own toolset, working in their environment, and when they have something interesting they throw it over the wall to the developer team to turn it into something that's interactive and can reach more folks within the organization or out in the world. So we're starting to make it easier for folks to get started in the same environment: you find an example of the visualization you want. Maybe it's D3, maybe it's Vega-Lite.

You find that example (there are thousands of them out there) and you fork it, right? So you've created your own version of it. And then you can start to tinker: swap out the data, change the data, change the visualization, from simple things like the labels to actually changing the logic in the JavaScript, in the code.

And because it's reactive, you get to see those updates in real time. Again, I think that immediacy and that level of interactivity matter. I don't have to start from a blank slate or (laughs) some sort of blank terminal; I can take something and say, "Oh, this looks like it's close to what I'm trying to do. I'm going to fork it." Then I start to bring my own data in and get to the outcome much faster. The velocity is higher, the iteration is faster, and ultimately it's getting to a better decision and a better outcome.

And so I just think it's so powerful. All of this forking, this model, came from the open source community, and we probably take it for granted, but it's so powerful. And I think we have the potential to reach many more folks who want to create but don't consider themselves developers, right? They're data hackers or data storytellers, or they have access to code now because they're starting from an example, and they can tweak it and learn much faster.
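A minimal sketch of that "fork it, swap the data" workflow: a D3 bar chart where your own data and labels are the only lines you need to touch. It assumes the d3 package is installed and an empty <svg id="chart"> element exists on the page; the data values are made up.

```typescript
// Fork an example chart, then change only the data: everything below the
// data constant just reacts to whatever you swap in.
import * as d3 from "d3";

// Swap in your own data here.
const data = [
  { label: "us-east", value: 42 },
  { label: "eu-west", value: 17 },
  { label: "ap-south", value: 29 },
];

const width = 400, height = 200;
const x = d3.scaleBand()
  .domain(data.map((d) => d.label))
  .range([0, width])
  .padding(0.1);
const y = d3.scaleLinear()
  .domain([0, d3.max(data, (d) => d.value) ?? 0])
  .range([height, 0]);

d3.select("#chart")
  .attr("width", width)
  .attr("height", height)
  .selectAll("rect")
  .data(data)
  .join("rect")
  .attr("x", (d) => x(d.label) ?? 0)
  .attr("y", (d) => y(d.value))
  .attr("width", x.bandwidth())
  .attr("height", (d) => height - y(d.value));
```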

Sam:

It sounds transformative. When you said the state of the art was to throw it over the wall, that reminded me of the bad old days before DevOps.

Melody Meckfessel:

Yeah.

Sam:

So in some ways this sounds like a DevOps transformation for data pipelines and data understanding. Eric, you've been at the forefront of cloud-native data; you've seen it and built it at scale, right? I think you're starting to see some changes in patterns from large blankets of distributed cloud-native data. I'd love to hear about that, and then maybe more pointedly: you've been pushing Kubernetes to do a better job of embracing statefulness, because it's largely solved the stateless environments.

There's an architecture under statefulness which is not just StatefulSets; it's what you're doing with data management and how you're doing storage. And I think you've got a lot of hard-won thoughts about what needs to happen in Kubernetes for that.

Eric Brewer:

Yeah. It's a super interesting space, and there are a bunch of large trends going on that tie indirectly to what Melody just said. One phrase I've been using a lot internally is data transformation. I distinguish that from the classic model of data management, which is basically, "I have a database and I update it in place." That's what most traditional applications do: they store their data, they make changes, they write over the previous data with new data.

And that certainly made sense when you had a limited amount of storage and didn't have room to write it twice. But when you say the word fork, it implies that you're making a copy, and I actually think the velocity change we're seeing is exactly because of the copy. When you transform data, you're really making a copy and then transforming it, and that gives you some wonderful freedom.

In particular, you and I can each have our own copy, and you can muck with yours however you like, right? Which means you get velocity, you get autonomy, you get independence on the things you want to do. In fact, the only way to do large-scale collaboration is through this copy model. We can't actually give access control to the database rows to all of the people in open source; it wouldn't work, right? In fact, Git is the same thing. Git for code is basically a fork model where you make copies of things; maybe we can reintegrate your changes back in, but they are actually separate things and you work on them autonomously.

So there are lots of reasons the transform model is just fundamentally better. For example, think of all the fields of science. You're doing science, you have raw data from your experiments, and you don't ever want to change that data, right? You should never be writing over the raw experimental data. You should be transforming it as you learn; as you apply different pieces of statistics or machine learning to that data, you produce new things that are outputs.

So it's much more of a data pipeline model, like Apache Beam, for example, and I see that model actually works quite well on Kubernetes, because it's easier to do transformations and pipelines than it is to do live, update-in-place database management. So we can actually take advantage of that.

It's a better fit for velocity, it's a better fit for collaboration, and it's particularly a better fit when you don't want to lose the original data.
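A small sketch of the contrast Eric is drawing, in plain TypeScript (Beam-like in spirit only; no framework assumed): each stage reads its input and writes a brand-new output, so the raw data is never touched.

```typescript
// The transformation model: raw data is immutable, and every stage
// produces a new copy rather than updating in place.

interface Reading { sensor: string; celsius: number }

const raw: readonly Reading[] = [
  { sensor: "a", celsius: 21.3 },
  { sensor: "b", celsius: -999 },  // sentinel for "bad reading"
  { sensor: "a", celsius: 22.1 },
];

// Stage 1: clean. Produces a new array; `raw` is untouched.
const cleaned = raw.filter((r) => r.celsius > -100);

// Stage 2: derive. Again a new output, never an in-place update.
const fahrenheit = cleaned.map((r) => ({
  sensor: r.sensor,
  fahrenheit: r.celsius * 9 / 5 + 32,
}));

// If stage 2 had a bug, fix it and rerun: the inputs still exist, unchanged.
console.log(raw.length, cleaned.length, fahrenheit.length); // 3 2 2
```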

Sam:

Mm-hmm (affirmative).

Eric Brewer:

Right. So lots of places will move to this model if they haven't already, and I think many have without realizing that it's actually a different model. Now, we still need update-in-place databases to manage accounts and so on; there are lots of reasons for traditional transactions, and Kubernetes needs to be better at that. Also, today's state of the art will give you a node with a disk that you can manage yourself, right? And make sure you get the same disk every time.

So if you need to do logging, you have a place to do it. But that's not all that helpful; it's sufficient, but not super helpful. Whereas in the pipeline case, it's much easier. You can go back to some classic things like MapReduce: one reason it worked well is that it's a transformation pipeline and it never messes with the original data. It just produces new data.

So doing it in parallel, doing it with fault tolerance, is really easy, because if you screw it up, you can just start over, right? Literally, you just rerun the partial transformation. So that's a nice model, and I think people should adopt it more explicitly. Many have, but we should talk about it as a separate thing from classic data management. That will, I think, give people some freedom to feel differently about how they want to share things, and even what they want to share.
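A sketch of why pure transformations make fault tolerance easy: a failed step can simply be rerun, because it never modified its input. The retry helper and the flaky step below are illustrative, not from MapReduce or any real framework.

```typescript
// Rerunning a partial transformation is safe when the step is pure:
// it builds a fresh output instead of writing over its input.

async function withRetry<I, O>(
  step: (input: I) => Promise<O>,
  input: I,
  attempts = 3
): Promise<O> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await step(input); // safe to rerun: `input` is never mutated
    } catch (err) {
      lastErr = err;
    }
  }
  throw lastErr;
}

// A transformation that sometimes fails mid-flight. A retry starts from
// a clean slate every time, because nothing was overwritten.
async function normalize(words: readonly string[]): Promise<string[]> {
  if (Math.random() < 0.3) throw new Error("transient failure");
  return words.map((w) => w.toLowerCase());
}

withRetry(normalize, ["Map", "Reduce"]).then((out) => console.log(out));
```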

Sam:

Yeah. You've said that DevOps is all about velocity, right? Doing more things on your own and not having to wait on others. I hear echoes of that in your comment.

Eric Brewer:

Data transformation does for data velocity what decoupling developers from operators does for code velocity, right? If you can do a launch without a launch calendar and a whole bunch of reviews, because your infrastructure gives you guardrails that keep you safe, that's what gives you velocity for application deployment and development. Data transformation has the same feel: you can transform data without much risk to the original data, which means you have much more freedom to do so.

Sam:

It also sounds like this is in keeping with your thinking on having a larger number of smaller databases, right, against a common backplane or fabric.

Eric Brewer:

Yeah. I want everyone to have their own copy of the pieces they need to work on. It's good for lots of reasons, including basic availability: if you have copies, it's easy to keep one of them around (laughs). But I think it's more about the autonomy. Does your group have the ability to work with the data and transform it as needed, on your own schedule, with your own processes? That's what gives teams the feeling that they can actually make progress quickly. It's that autonomy. So we want that for both data and for applications.

Sam:

One of the things we're also seeing is the ability to support all of that at a level of economic efficiency. So there's a pattern that I don't think has been used a lot, but it's starting to come up, and we've been developing it recently with Astra, our Cassandra for Kubernetes that runs in GKE as well as EKS: multitenancy.

So multitenant Cassandra gives us a chance to leverage so much of what's great about Kubernetes, but also to be able to say, "Hey, for these fast-on, fast-off workloads, for experimental trial workloads, it's cheap and efficient to start something, get going, spin it up, spin it down."

So those kinds of emergent pressures, where you have to do this and you have to do it within the limits of our economic reality, are pushing a bunch of technical innovation, right? Because we're all service providers now. It's not like the old days, where we said, "Throw your software over the wall and the TCO is the customer's problem." Now (laughs) TCO is about our margin, so we have to do a substantially better job there.

I guess I'd love to get your thoughts on the next five to 10 years. Both of you are in the thick of it; you're at the cutting edge. Another great quote, William Gibson's: "The future is already here. It's just unevenly distributed." I think you're both working in spaces and roles where you're at the edge of the future, so from where you are, you can see even farther.

So for the next five to 10 years, if you were to lay it out, what do you think is the grand challenge in your domain that's going to be really hard to solve and really worth solving? Something that anybody listening can sign up for and say, "I want to do more of that. I now see where the future is going." I'd love to hear what's on your mind for grand challenges. Maybe we'll start with Eric and finish with Melody.

Eric Brewer:

Sounds great. So I would say a few things that I think are long-term issues. I already mentioned security, which is really about making it safe to do all this collaboration that we still have to do. We'll get there; it's solvable, but it's an industry problem that no single player can fix.

But in terms of challenges that are worth thinking about quite a bit, I think there's plenty more to do on state management in something like Kubernetes. It's really quite challenging, and I think the models we have are not quite right. For example, within Google there are storage teams, and basically, if you're not a storage team, you don't get to do state management. That's because it's viewed as very hard to do, and it is hard to do.

Conversely, if you're not in storage, then you don't have to do state management, because you can make RPCs to the state management systems someone else is running, and life is good. That actually does increase productivity for those teams, so I think the dichotomy has been useful to us. But again, the traditional model, where data is in one database that has a lot of access control and is managed and updated in place, will remain important.

We need to do that better, particularly at the low end, in the sense that if you want a million small databases, that's actually a hard problem today; we're much better at one big database, in some sense, because we can invest resources in SREs and things like that to keep the one big database working well.

We need lots of databases to give teams the autonomy we want them to have, even if those databases are built out of fractions of larger ones. But for this other model, data transformation, the state management itself is kind of easy to deal with; what we don't quite have are the right tools for it, and we definitely don't have the right metadata for it. For example, if you want to collaborate and pick up data from someone else, you need to know a lot about that data if you want to use any automation at all.

It doesn't need a full database schema, but it needs enough metadata that automation is possible, and we're kind of missing that layer today. We're missing it in open source on the code side as well: again, I can't really tell whether I have a vulnerable version, which versions of the included libraries are actually affected by the vulnerability.

It's very hard to tell, because it's done in comments and human text, not in metadata that enables automation. The same thing is going to be true for data, and we're not there yet. How do you label your data such that people can even know what the axes are, what the units are? Do you want to know the provenance of the data, right?

[inaudible 00:30:04] The consequence of the transform model is that all your inputs have a history, and you might need to know that history in order to interpret the data correctly. And we don't have a way to store that history today, other than in a verbal description, if at all. In fact, if previous transformations were done on the data you've imported, you might like to see whether they were done correctly.

I remember [inaudible 00:30:31] with a cooking stove invention: it improved the quality of cooking stoves, which improved the environment and had all these positive health effects. And when we went to prove it (it's not my team; a team I helped), they had this great chart that showed it was working better than the old cooking stoves, but then we couldn't actually figure out how they created the chart, right? Because it was done with Perl scripts and various things, and those had all been modified.

So the actual code used to generate the chart was not available. Now, that's a data science rookie error, frankly, but the systems don't make it easy to get it right. So that whole provenance piece is also missing, and it affects not just the quality of data but also its security.
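A sketch of what that missing metadata layer might look like: enough structure about a dataset's columns, units, and transformation history that automation, and audits like the cookstove chart, become possible. The field names here are hypothetical, not an existing standard.

```typescript
// Hypothetical dataset metadata: axes and units for interpretation,
// plus a provenance trail back to the raw data, so a chart can always
// be traced to the exact code that produced it.

interface ColumnMeta {
  name: string;
  unit: string;       // e.g. "celsius", "count"
  description: string;
}

interface TransformRecord {
  tool: string;       // what produced this version of the data
  codeRef: string;    // where the exact code lives (commit, notebook, etc.)
  inputs: string[];   // IDs of the datasets this was derived from
  timestamp: string;
}

interface DatasetMeta {
  id: string;
  columns: ColumnMeta[];
  provenance: TransformRecord[]; // full history back to the raw data
}

const stoveStudy: DatasetMeta = {
  id: "stove-emissions-v3",
  columns: [
    { name: "stove_model", unit: "n/a", description: "Stove under test" },
    { name: "pm25", unit: "µg/m³", description: "Particulate emissions" },
  ],
  provenance: [
    {
      tool: "clean_readings.ts",
      codeRef: "repo@commit-abc123", // recorded, so the chart is reproducible
      inputs: ["stove-emissions-raw"],
      timestamp: "2021-08-01T00:00:00Z",
    },
  ],
};
console.log(stoveStudy.provenance[0].codeRef);
```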

Sam:

Melody, that sounds like the perfect segue to [crosstalk 00:31:21].

Melody Meckfessel:

So I want to pop up a level. So many companies, across every vertical market and every domain, feel so much urgency around trying to wrangle and make sense of their data to make better decisions, right? And you combine that with what's happening in the market, which is essentially a skills crisis around data scientists: demand is far outpacing supply.

You can see something has to change. The good news is there are lots of developers; the developer community is growing, and the diversity of that community is growing. I really think the time is right to evaluate and understand how we increase data literacy in the world over the next five to 10 years. And that means access to data. Eric brings up a great point: if more people are empowered and have the autonomy to explore data and ask those questions, better outcomes are going to come out of that new workflow.

So the next five to 10 years, I believe, are about how we bring access to the data and, to Eric's point, how we bring transparency to what's coming out of the data. That means more people can see the underlying data and its constructs, can see the logic being applied, and (this is the visualization piece I'm passionate about) can more intuitively understand the data because it's in visual form. Visualization allows more intuitive learning, right? More interactivity and exploration.

And that's extremely powerful, because it comes back to the core principles of DevOps: faster iteration, faster understanding. You ship faster, you have happier users, and you have a collective team that's more informed about the data; they can see the code and the data combined together. So that increase in data literacy, opening up the toolset, the skills, and the access, increases the approachability of those tools to a diverse audience, right?

Bringing together the data scientists, the developers, the data analysts, the business analysts, the financial analysts, or the hobbyist working on a project, so that they can collaborate and share: I think that's the journey for the next five to 10 years.

And it really is the crux of what people mean when they say the democratization of data: empowering humans of many different backgrounds to have access and transparency, to learn and to collaborate together. So that's what really inspires me for the next five to 10 years. And I'm looking forward to the amazing things people are going to create in the visualization space.

Sam:

That's so awesome. You know, literacy is the foundation of citizenship. And as society continues to evolve faster than humans do, our ability to make people data literate, so they can be part of a data-driven society as data citizens, matters whether or not you're a scientist. My kids are studying environmental science and nanoengineering; you wouldn't think of them as data scientists or developers, but they need these tools, and they need that literacy at the tips of their fingers, to understand, to inform, and to persuade.

Yeah, we've got an amazing decade ahead. I think we can learn so many lessons from what hyperscalers like Google have done, right? They invented things because they had to, not for any theoretical purpose. And I think in this next decade, the grand challenge I see will be: how do you make everything data-driven in a pragmatic way?

How do you have infrastructure that you can just depend on, that scales up and scales down, that gives you the data you want and moves things where they need to be, that looks at the load balancers and the invocation sequences and pre-caches data for you, warms it up, moves it where it needs to be? But all for the sake of understanding how we take data and up-level it to information, and up-level that to knowledge.

How can companies that are trying to make sense of their domain do a good job for their customers, and therefore a good job for their employees? How can they assemble all of that into a graph of data that's alive and rich and living, and that they can interpret? There's a way, faster than human speed, for all these small pieces of data and information to come into the system and actually be understood, right? How do you have a dynamic, active ontology that allows cognition to be augmented? Rather than artificial intelligence, that's looking more at augmented intelligence.

And how does the computer really become a great partner to the human? I think that will take all of the work that both of you, and many others, have done for decades, as we take cloud-native into the next decade, which I think is going to be awesome for open source, awesome for computation, and awesome for collaboration.

So with that, I want to thank you both so much for taking the time to be part of this conversation. I'm looking forward to many more in real life, and hopefully a few more here in the virtual world before too much time passes.

Melody Meckfessel:

Thanks, Sam.

Eric Brewer:

Thank you so much.