0:08 Miko Pawlikowski
Hello and welcome to Conf42Cast, another exciting episode. My name is Miko Pawlikowski and today with me is Luke Feeney, the CCO at TerminusDB. Hello Luke, how are you doing?
0:21 Luke Feeney
I'm fantastic. Thanks a lot for having me, Miko. It's great to be here.
0:24 Miko Pawlikowski
Fantastic. I'll take fantastic. Fantastic is in short supply, so I'll take as much as I can. All right. So, to confuse you a little, as we do with all of our guests, a weird question: what's your favorite galaxy?
0:36 Luke Feeney
My favorite galaxy? Okay, so I'm a big fan of the Iain Banks Culture novels, which are kind of, you know, sci-fi on a grand space scale. And it's all about, you know, artificially intelligent ships who talk to each other. But in one of those, there's these, kind of, lizard guys, who at one point get into a big war with the Culture. And they all end up in Andromeda. And they're incredibly long-lived. So, I'll go for Andromeda. I also like it because it's going to crash into the Milky Way in a few billion years and destroy everything. So, what's not to like in a galaxy that's going to crash into another one and destroy everything?
1:18 Miko Pawlikowski
And that's probably the best answer I've heard so far to that question. Yeah, the impending doom in a few billion years, something to look forward to, exciting stuff. All right. That's a good start. So, one of the things that I don't usually see when I talk to tech people is having been a diplomat for, was it 15 years? Previously, for like 14-15 years? That's from my mental maths on your LinkedIn profile. So, I'm hoping we can talk a little bit about that. And I'm really curious how that affected your journey. And, you know, what made you jump to begin with. And in general, you've been to quite a few places. So, more on that in a minute. But let's first talk about TerminusDB. So, full disclosure, I watched a few of your presentations and scanned through the website. And for those who haven't seen any of them, one of them basically starts with Luke introducing an Excel file and saying: "Here's 20 different projects that do Git for data and they're all inferior to our stuff". So, what's TerminusDB? Where do you start with that?
2:24 Luke Feeney
Yeah, thanks Miko. So, we are an open-source graph database and document store. And we're an immutable database. We kind of try to be a Git for data and also then provide a GitHub for data. So, we have TerminusHub, which operates like a GitHub for data. And the database then itself is like a Git for data. And I wouldn't say the others are inferior, they're just different. And they do different things. And there's definitely, you know, especially in machine learning operations, there's been a real proliferation of tools that like to say that they're Git for data. And it's almost become a joke. In one of those videos, I was talking at the Knowledge Graph Conference recently, and I pulled out a screenshot from the MLR Slack. And somebody had said, "Okay, let's set up a channel called 'Bad startup ideas' and it'll be all hilarious. And we'll have a joke". And one of the other participants put up "Git for data" as a terrible startup idea. Because there's so many of them that call themselves 'Git for data', because there's a huge hole. Git, and distributed revision control in general, has had such an enormous impact on software development. And now, as we get from the era of the lone data scientist into the era of the data engineer, when you're trying to build reliable pipelines that deliver value into the business, you need something like Git to make that all happen. So, TerminusDB is one of the types of solutions for that, that'll allow you to have versions of your data through a pipeline. So you can have dev, you can then stage to main, exactly like that. We kind of try and mirror some of those Git semantics and Git processes. But in a proper database that's queryable, where you can query deltas, you can query diffs between different states of the database.
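The dev-to-main flow and the "query diffs between states" idea described above can be sketched in a few lines of Python. This is only a toy illustration of Git-like semantics over immutable snapshots; it is not TerminusDB's storage engine or API, and every name in it (`Commit`, `VersionedStore`, `promote`, `rollback`, the sample triples) is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class Commit:
    """An immutable snapshot of the data plus a pointer to its parent."""
    triples: FrozenSet[Triple]
    parent: Optional["Commit"] = None


class VersionedStore:
    """Toy store (hypothetical): each branch name points at its latest commit."""

    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {"main": Commit(frozenset())}

    def branch(self, source: str, name: str) -> None:
        # The new branch shares the source's head commit; no data is copied.
        self.branches[name] = self.branches[source]

    def commit(self, branch: str, add: Iterable[Triple] = (),
               remove: Iterable[Triple] = ()) -> Commit:
        # Old commits are never modified; a new snapshot is layered on top.
        head = self.branches[branch]
        new = Commit((head.triples - frozenset(remove)) | frozenset(add), head)
        self.branches[branch] = new
        return new

    def diff(self, old: str, new: str) -> Dict[str, FrozenSet[Triple]]:
        # The delta between two states is itself just data you can inspect.
        a, b = self.branches[old].triples, self.branches[new].triples
        return {"added": b - a, "removed": a - b}

    def promote(self, source: str, target: str) -> None:
        # Fast-forward only: assumes target has not moved since branching.
        self.branches[target] = self.branches[source]

    def rollback(self, branch: str) -> None:
        # Step the branch back to the previous snapshot, if there is one.
        head = self.branches[branch]
        if head.parent is not None:
            self.branches[branch] = head.parent


store = VersionedStore()
store.commit("main", add=[("alice", "works_at", "acme")])
store.branch("main", "dev")
store.commit("dev", add=[("bob", "works_at", "acme")],
             remove=[("alice", "works_at", "acme")])
delta = store.diff("main", "dev")
print(delta)
```

Because every commit is immutable, promoting dev to main and rolling main back are both just pointer moves, which is the property that makes rollback cheap in this kind of design.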
4:29 Miko Pawlikowski
Okay, so many things to touch upon now. You dropped so many casually: open source, distributed source control, Git. You've mentioned both documents and some kind of schema, from what I understand. So, maybe let's try to decompose that a little bit into smaller bits. So, I think it's probably arguable that distributed source control and, in general, modern source control the way that we see it now is probably one of the biggest inventions in writing software in recent history. And then, I guess, there's the GitHub that you also mentioned, this entire layer of socializing on top of that, the entire culture of Gits and memes and all of that, which makes it into basically a club, in a sense, right? A society, rather than just a code exchange. So, when you say 'Git for data', it kind of sounds a bit like 'Uber for X', right? When all of a sudden, everybody had this idea of saying 'Uber for X', right? When you say that, is most of the value that you're hoping to build in the actual data inside of the GitHub for data you are talking about? Or is it more in the tooling?
5:46 Luke Feeney
So, that's a really interesting question. So, what we would see is that one of the things that goes with Git as a software development system is the microservices revolution that allows distributed teams to break down complex code bases and work much more efficiently in terms of development time. And what we want to be able to do similarly in data is to allow those teams, those domain teams, to become much more efficient by working with distributed data and data products. And really, we see the value as a combination of organizational shift, where you can devolve responsibility for dealing with data products out to domain teams, by enabling them with technology. And revision control is an absolutely essential piece of that. Because in order to trust domain teams to actually deal with the data effectively, you have to be able to roll back. And then, for anybody that is operating on that, you want to be able to have branches. And actually, we found that describing them as branches can be confusing in a data setting. And we're now talking more about version graphs: within any one database or data product, you have a series of version graphs. And those version graphs might be like dev and main, like you'd have in Git. With main being the production one, that's running some application somewhere within your enterprise or a web app. And then dev could be the one that you mess around with. But you could also have, you know, version graphs like the one that has no personally identifiable information, that you can surface up to the rest of the company. And you can manage that just much more effectively if it's managed in a decentralized team way. So, that's really the sort of enterprise architecture that we're hoping to enable. Now, we're at the beginning of that journey with TerminusDB. But that's really the vision of where we create value within the enterprise. Maybe I'll just turn back a little bit and give a short bit of our trajectory.
Because we're a university spin-out. We came out of Trinity College in Dublin. I'm sitting here in Dublin, Ireland, right now. We were a very large research project, funded by the European Commission, to build the technical architecture for something called Seshat. It's a global history databank. So, bringing together all of the economic and social datasets from all of human history. And then providing them in a single machine-readable format, so that people could do advanced analytics and looking at long journey. So, we came out of that world of just really complex schema and complex collaboration amongst a lot of different people, and built the software there. And then spun it out of the university. And when we first came out, we were looking at large-scale implementations of graph in enterprise. And now we've gone down the more open-source, more open-science type of route, to try and build a community around that.
8:53 Miko Pawlikowski
Okay, so that makes perfect sense. But I'm wondering, from a perspective of someone who just heard about it and wants to kick the tires, how different is it? You mentioned some of the problems with using Git and GitHub for data, like querying the revisions and stuff like that. What's the experience like? Is it more like Git or like a database? I think I saw somewhere the query language WOQL? What's that like?
9:21 Luke Feeney
So, WOQL is a datalog. Datalog's been hanging around for 40 years as a query language, but it's finally now coming a little bit to center stage. And there are a bunch of up-and-coming databases that are using WOQL. Or not using WOQL, but using datalogs and variants of datalogs. You have databases like Datomic that people might be aware of, Grakn, OpenCrux. So, there's a bunch of new up-and-coming databases that are using datalog as a query language, because it can just contain a lot of complexity. And as you get into lots of complex data, your SQL runs out of headroom, and I'm thinking of something like datalog, which WOQL was based on. So, we'd say it's quite SQL-like. Obviously, there's some learning to do between it and SQL for people that are picking up the database. We're more of a database than a Git type system. So, we borrow a lot of those Git functions in order to describe things that happen in the database. But we're a proper database with a proper full query language. You can do all the database type of things. The sorts of use cases we're focused on are more analytics at the moment than being an operational database for the back end of a high-throughput application. Just because we see a lot of problems in analytics, it's just not a solved problem. And, I won't go back to it again, but the truth is, I think for a lot of people probably listening to this, that the cloud data warehouse or the data lake, data swamp, they're bottlenecks that you have to go through in order to kind of solve things. So distributing things in the same way that software revision control, like Git, allows you to distribute work there, that just seems like a no-brainer to me.
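To give a feel for the datalog style of querying mentioned above, here is a minimal sketch in plain Python of how a pair of recursive datalog rules is evaluated bottom-up until nothing new can be derived. This is not WOQL syntax and not how any of the named databases are implemented; the `transitive_closure` function and the `depends_on` facts are made up for illustration.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def transitive_closure(edges: Set[Pair]) -> Set[Pair]:
    """Naive bottom-up evaluation of two Datalog-style rules:

        reachable(X, Y) :- edge(X, Y).
        reachable(X, Z) :- edge(X, Y), reachable(Y, Z).
    """
    reachable = set(edges)  # first rule: every edge is reachable
    while True:
        # Second rule: join edge(X, Y) with reachable(Y, Z) on the shared Y.
        derived = {(x, z)
                   for (x, y) in edges
                   for (y2, z) in reachable
                   if y == y2}
        if derived <= reachable:  # fixed point: no new facts were derived
            return reachable
        reachable |= derived


# Hypothetical facts: which component directly depends on which.
depends_on = {("app", "api"), ("api", "db"), ("db", "disk")}
result = transitive_closure(depends_on)
print(sorted(result))
```

The two rules in the docstring are the whole query; the loop is just one simple way to evaluate them. Recursive relationships like this, which come up constantly in graph data, are where a rule-based language tends to stay compact.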
11:13 Miko Pawlikowski
It does, although I'm still trying to picture that. So, we're kind of mixing a lot of different things. You mentioned SQL, which is tables and rows and queries like that, and documents. Which, I guess, makes sense because others have the data natively in some kind of CSV somewhere that they're gonna want to put in there. And you also mentioned graph. I'm curious how all of these different approaches to storing the data and working with it work in terms of discoverability of that thing? What's the magic sauce to put it all together to help with the discoverability, as opposed to just 'I can stick a few CSV files in Git and call it a day', right?
12:00 Luke Feeney
Exactly. People do that. And we see a lot of people that are putting CSVs in Git, in order to say that's revision control. And like, that probably works fine for some people. But once you kind of get into more complexity, then it just all falls apart. The big thing for us is that we've got strong data models and we give strong schema support within the database. And we version schema with code, or I mean with data. So we have versioning of both, all the way down to the base. And, under the hood, we're an RDF database, in fact. But we speak JSON to everything. So we're JSON-LD over the wire. And we realize that everybody now is very comfortable in a JSON-first world and a document-first world. We're shaping ourselves towards that and trying to make it easy for coders, for programmers, for anybody there who just wants to have a document store, to be able to use our database in the way that they like, and not have to go through elaborate building of schema upfront. But be able to do that a little bit later on in your project, when you have a greater idea of where you're going.
13:10 Miko Pawlikowski
Sure. You mentioned 'under the hood', which gets me curious. Can you talk a bit about how it's implemented? What is under the hood, with technologies like languages and all of that? A quick flyover tour?
13:24 Luke Feeney
It's slightly peculiar and speaks to our background, because when we came out of university, we came from the linked data community initially. And so our database is implemented in Rust at the storage layer; the distributed storage layers, the Git-like stuff, is built in Rust. And then the query layer is built in Prolog, which is a logic programming language from the 1970s as well, which isn't very popular with coders these days. So we kind of have the trendiest and hottest language as our storage layer, and then one of the least popular languages of all as our server layer. It's really, really strong on query. So it's quick to develop in and quick to get the functionality we want. That's our base layer, where we implemented the database from the ground up, both the storage engine and the query. We don't really have anybody else's technologies in there. Though now, we are definitely starting to build out more links within the ETL ecosystem and things like that. So the externalities of the database are better served. My advice to all the listeners is: "Don't try and build a database yourself. It's a crazy, terrible idea."
16:56 Miko Pawlikowski
Yeah, a lot of people say that open source is eating the software world now. And I definitely see more and more of this technology being used all over the place now. And even on this podcast, we're seeing more and more companies jump into the open-source-first model, and just work around that for their actual business model. I also noticed that you have very strong commitments on your website, it's like 'now and forever open-source'. I was wondering if that was a dig against, like, the 'don't suss me' kind of wave of changes in licenses recently? Yeah, it is good, left to right,
17:37 Luke Feeney
MongoDB, mostly. Mongo, Elastic, all of these guys, basically leaving open source after. I mean, I kind of understand their reasons, because AWS and these other guys are, you know, launching DocumentDB and trying to come into their space. But they were built by the open-source community. And then, turning around and saying, 'well, we're not open source anymore, when we're a $5 billion business, because it's not enough profit'. It seems to me, from where we're standing, that those companies are enormously successful, and that they should remain committed to open source even in the face of competition. So, it's forever, even though we will launch a cloud platform as well. But we'll stay open-source with the core database forever.
18:22 Miko Pawlikowski
Awesome. So, for everybody who's tuned in so far, and their interest got piqued. I know I certainly am interested. How do you get started with that? What's the best way to go kick the tires, test it with some data? Because with the data stuff, you need some data to actually play with. Do you have, like, a sandbox environment, where people play? Or some kind of tutorial that introduces it? What's the 101?
18:51 Luke Feeney
Yeah, so we have a bunch of tutorials with a bunch of data associated with them. And obviously, the database can be downloaded directly from our website. And we have a Discord community that can point everybody in the right direction of the various different tutorials, with various different physical data. We are about to drop a cloud data platform, which will have a kind of sandbox and a trial and have all of the nice, easy to use, easy to extract data in it, for people to mess around with.
19:22 Miko Pawlikowski
Awesome. So TerminusDB.com.
19:26 Luke Feeney
Yeah, absolutely. And then we have our Discord community, which is really where we do most of our chatting. I mean, we were on Slack, and we've moved across to Discord and we just like it way more, I have to say. The gamers design things in a much more ergonomic way.
19:40 Miko Pawlikowski
Definitely, big fan of Discord. Also big fan of the pricing model. Come by Conversa. Awesome, so I can't wait anymore. Tell me about the diplomat. So, I've never seen anyone move from politics to tech. I've seen a few people move the other way around. What happened? What attracted you? Were you in tech at the same time? Or did you completely switch and change, because you felt like making a change?
20:13 Luke Feeney
Yeah, completely switched and changed. So, I was a diplomat for the Irish government for 13 years. I was in the embassy in South Africa, the embassy in Greece, I served in the United Nations in New York for a while. And then I came back to Dublin, after spending four years in Greece. And it was a great time in Greece, because I was there through an economic crisis and a refugee crisis. It was a really interesting time to be in that part of the world. And Greece is just a fantastic country, full stop. And we also looked after some of the countries in that part of the world, so we were responsible for Serbia and Albania. So we traveled a lot on the Balkan Peninsula. And then I did a year and a bit as the head of the government's Brexit communications. And, during the middle of Brexit, which was a very intense period, an opportunity presented itself to take a jump across into tech. And I thought, this seems like something interesting to do. When I arrived, then I immediately regretted it, in a sense. Because you land and everybody is saying Kubernetes, and Docker, and Terraform. And this next buzzword, this next buzzword, and you feel immediately lost in a world that seems so alien. Diplomacy is sales, like everything. Everything is sales, it's all about selling something. In that case, you're selling Ireland or selling an idea. And again, then you're kind of back into trying to understand something so that you can explain it effectively to somebody else. And really, that's what I tried to do when I came into tech: try and get a decent understanding of the way things work together. And then, try and explain it back to people. Yeah, I mean, it's been a crazy transition, when I think back on how different the worlds are. The beauty of startups is that it's a very relaxed and open environment where people are willing to learn and to share. And that's very different from working in a diplomatic system, especially when you're abroad.
And there, a lot more cards are held much closer to the chest about who knows what, and what are secrets, what are open. And really, tech isn't really like that. I mean, it's a very welcoming and positive learning community most of the time. I mean, obviously, there are dark underbellies to that as well. But there's just a lot of people who want to learn and want to be positive about learning. And that's great. Most of the time, yeah.
22:47 Miko Pawlikowski
Yeah. So basically, moved from selling Ireland to selling tech now.
22:53 Luke Feeney
Yeah, exactly.
22:54 Miko Pawlikowski
That sounds like you wish diplomacy was a bit more open. Is there anything where you wish tech was more like diplomacy? In the other direction.
23:05 Luke Feeney
That's a difficult question. I suppose tech probably is like diplomacy, in a sense. I mean, when you get a lot of these very big corporates like Amazon and Apple, they get to the scale of being as large as states. Obviously, you've got a $2 trillion market capitalization for Apple, and it has all of those branches associated with it, like public affairs, or public policy, who are effectively diplomats for those very large tech companies, who are trying to shift public policy to make it more amenable to those corporate concerns. So, you know, big tech companies, and the biggest ones, have their own diplomatic services out there doing diplomatic things. But what I would take with me is that, within the diplomatic service itself, there's an incredible collegiality. There's a very, very, very strong team spirit within there. You might be presented with other diplomatic services who might have other aims. But on your own team, on the inside, there's a very, very strong support structure that does feel like family. And I missed that. I do miss that, working in a small organization: you can never quite recreate the sort of family feeling you get within a larger organization like that.
24:28 Miko Pawlikowski
That makes sense. Although, I gotta say, an idea of being a diplomat in Athens sounds pretty, pretty damn attractive, especially right now.
24:37 Luke Feeney
Yeah, it was, it was a crazy time. Because, if you think back, it was Yanis Varoufakis. It was Syriza. It was Tsipras. It was all of these larger-than-life figures emerging onto the scene, and you're right there in the front row, seeing all this happen and reporting back on it. So fascinating, a really fascinating time to be there.
25:02 Miko Pawlikowski
And yeah, you're calling this a fascinating time, going through all the crises, including the Brexit one. I'm guessing you really experienced a lot of once-in-a-lifetime events this way.
25:16 Luke Feeney
Yeah, well, the Brexit one was different for me, because I was looking after communications. And I suppose, if anybody's working in a startup or in tech, it's like the day you announce your 50 million Series A or Series B. But every day is like that, in terms of the intensity of communications work. Because Ireland was so central to it. And so many international journalists were interested in Ireland's perspective on Brexit. You just had this constant flow of the media spotlight being on Ireland and on your office, and trying to coordinate that across a system of 35,000 civil servants is very challenging. So, it was just a very intense period. And I think it was very successful from Ireland's perspective. We managed to explain our position very well, I think that kind of got out there. And you know, Britain's gone totally nutty since Brexit, and remains a little bit nutty.
26:14 Miko Pawlikowski
Yeah. And I think that's probably where we should leave that. Some people have very strong opinions on this stuff. Okay, so to shift gears a little bit, because we're almost out of time. I'm curious, out of your very unique career path, if you were to pinpoint a single thing that you did that provided the highest return on investment for your career. It can be anything from a book you read, to a course you attended, to a meditation technique you tried out. What would it be?
26:49 Luke Feeney
Yeah, that's an interesting one again. I mean, it's a difficult thing. Now, I have four young children. And so I have to say that they're obviously the most positive thing that happened, in a sense. Because they provide me with incredible grounding. No matter what's going on, in terms of everything collapsing, or everything being incredibly difficult, they give me great grounding in just being their father and being shouted at, whatever else is going on in life. But, I'd say, one of the things that I've started to do recently, because of seeing all of these Silicon Valley gurus talking about it, is regulating my sleep to a very great degree. So, going to bed at the same time every single night. And that's just been transformational in terms of my daily ability to be productive. It seems so foundational and so simple. But previously, I would have been up working till 2am or 1am, or doing something else. And changing that just allowed me to be much more productive. In terms of a book, geez, I don't know, that's a hard one. There's so many great books out there. When I came into tech, and I wonder if other people know this, but I read "The Phoenix Project", which is like this novel that's based around an IT department that's going through digital transformation, and kind of implementing microservices type architecture. But it's written in novel format. It's a story about a guy who is the head of IT within a corporation that's got a terrible IT department and treats the developers very badly, and how, through his story, he kind of gives rebirth to the company. That's amazing. I really enjoyed that.
28:38 Miko Pawlikowski
Awesome, so much good stuff. So kids, when you listen to that 10 years from now, dad's proud. And remember, he did good. And the sleep thing is also, interestingly, something that I kind of struggled with for a long time. And only recently, during the pandemic, I was kind of forced to regulate that too. And I really saw a big improvement just from getting tired at the same time and going to sleep at the same time. And then yeah, crazy things like waking up without the alarm. So, I can definitely share support for that idea. This really has been great. Thank you so much for your time, Luke. I think we learned quite a lot of interesting stuff about the unusual database/Git/something that TerminusDB seems to be. We learned a little bit about what being a diplomat is like, for Ireland specifically, and got some good book recommendations. So, Luke Feeney, CCO at TerminusDB. And thank you so much for your time.
29:40 Luke Feeney
Thanks, Miko, it was really a pleasure.