Distributed Data Show Episode 63
We talk with Josh Perryman of Expero about his experiences building highly scalable and performant applications using relational databases, graph databases and sometimes even both at the same time.
0:15 - Jeff welcomes Josh to the show and finds out what a “data junkie” is
1:31 - Josh got into graph databases by way of consulting in high performance computing - a client struggling with relational performance asked him to look at graph solutions
3:41 - He started by working on proofs of concept with multiple graph databases
4:49 - In this particular case, it turned out that rewriting the entire backend to use graph wasn’t necessary, because they were able to optimize the relational queries.
8:47 - Lesson learned: the ideal solution may involve both relational and graph databases. Josh recommends using the Command Query Responsibility Segregation (CQRS) pattern to help determine where to use graph.
11:37 - In a nutshell, the CQRS pattern separates reads from writes. Graphs can really shine on read performance, especially for complex queries involving multiple hops. The tradeoff is write amplification: you pay a performance penalty up front on the write.
15:43 - When using multiple databases, abstract the interactions behind a data layer. Josh favors using GraphFrames for loading data into DSE Graph.
19:25 - Creating an abstraction layer, perhaps using a Data Access Object (DAO) pattern, gives an ideal place in your architecture where you can manage the performance and scalability of your data access
22:19 - Wrapping up
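The CQRS and data access layer ideas from 11:37 through 19:25 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation discussed in the episode: RelationalStore, GraphStore and FollowDao are hypothetical in-memory stand-ins, not the API of DSE Graph or any other product. The write path (command) updates both stores, which is where the write amplification shows up, while the multi-hop read (query) is served from the graph side only.

```python
from collections import defaultdict, deque


class RelationalStore:
    """Hypothetical stand-in for the relational system of record."""

    def __init__(self):
        self.follows = []  # rows of (follower, followee)

    def insert_follow(self, follower, followee):
        self.follows.append((follower, followee))


class GraphStore:
    """Hypothetical stand-in for a graph database used as the read model."""

    def __init__(self):
        self.adjacency = defaultdict(set)

    def add_edge(self, src, dst):
        self.adjacency[src].add(dst)

    def within_hops(self, start, max_hops):
        """Breadth-first traversal: vertices reachable in at most max_hops."""
        seen = {start}
        frontier = deque([(start, 0)])
        while frontier:
            node, depth = frontier.popleft()
            if depth == max_hops:
                continue  # do not expand beyond the hop limit
            for neighbor in self.adjacency.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    frontier.append((neighbor, depth + 1))
        seen.discard(start)
        return seen


class FollowDao:
    """Data access layer: callers never touch either store directly."""

    def __init__(self, relational, graph):
        self._relational = relational
        self._graph = graph

    def record_follow(self, follower, followee):
        # Command side: one logical write becomes two physical writes.
        # This is the up-front cost (write amplification) of the read model.
        self._relational.insert_follow(follower, followee)
        self._graph.add_edge(follower, followee)

    def reachable(self, user, max_hops):
        # Query side: the multi-hop read goes to the graph store only.
        return self._graph.within_hops(user, max_hops)


relational = RelationalStore()
graph = GraphStore()
dao = FollowDao(relational, graph)
for follower, followee in [("alice", "bob"), ("bob", "carol"), ("carol", "dave")]:
    dao.record_follow(follower, followee)

two_hops = dao.reachable("alice", 2)
print(sorted(two_hops))
```

Because every caller goes through FollowDao, a store could be swapped, cached or tuned behind it without changing application code, which is the point of managing performance and scalability at the abstraction layer.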
Developer Relations at DataStax