Distributed Data Show 64
We talk with Josh Perryman of Expero about the current state of the art in enterprise data architectures, how he sees that changing in the future to include a broader set of database and streaming technologies, and how to deal with the resulting complexity.
0:15 - Welcoming Josh back to the show to talk about enterprise data architecture
1:39 - The state of the art in enterprise data architectures: relational databases. Problems arise when relational databases in third normal form can’t support the joins required on the read path at scale.
3:20 - The toolset for solving these problems - it starts with understanding your entire stack and creating your data model appropriately
5:05 - You can’t just create any data model and expect it to scale. You need to think about data locality - getting your data on the same node, the same partition, or even the same location on disk.
7:40 - The emerging state of the art - a deconstruction of the traditional database architecture into its components: storage, transaction logs, indexes, etc. Now we can mix and match different technologies according to our application needs, for example, Cassandra for persistence and Kafka for streaming.
10:24 - On the similarity of Kafka to the traditional transaction log from the relational database world
12:02 - Managing the complexity of the polyglot persistence approach. We need better tools for managing interfaces and schemas across technologies and service boundaries.
15:15 - The role of analytics, machine learning and distributed tracing in complex enterprise data architectures
16:45 - Wrapping up and teasing some potential topics for future episodes with Josh
Developer Relations at DataStax