Democratizing data, thinking backwards and setting North Star goals with Dr. Donald Kossmann

This post has been republished via RSS; it originally appeared at: Microsoft Research.

Episode 107 | February 19, 2020

Dr. Donald Kossmann is a Distinguished Scientist who thinks big, and as the Director of Microsoft Research’s flagship lab in Redmond, it’s his job to inspire others to think big, too. But don’t be fooled. For him, thinking big involves what he calls thinking backwards, a framework of imagining the future, defining progress in reverse order and executing against landmarks along an uncertain path.

On today’s podcast, Dr. Kossmann reflects on his life as a database researcher and tells us how Socrates, an innovative database-as-a-service architecture, is re-envisioning traditional database design. He also reveals the five superpowers of Microsoft Research and how we can improve science… with marketing.


Transcript

Donald Kossmann: We have been programming devices. We’ve been programming mainframes. We’ve been programming PCs. We’ve been programming the web and so on. I think we need to go to the extreme craziness and think that the world is one big computer. I think this is the big North Star goal that we have. 

Host: You’re listening to the Microsoft Research Podcast, a show that brings you closer to the cutting-edge of technology research and the scientists behind it. I’m your host, Gretchen Huizinga.

Host: Dr. Donald Kossmann is a Distinguished Scientist who thinks big, and as the Director of Microsoft Research’s flagship lab in Redmond, it’s his job to inspire others to think big, too. But don’t be fooled. For him, thinking big involves what he calls thinking backwards, a framework of imagining the future, defining progress in reverse order and executing against landmarks along an uncertain path. 

On today’s podcast, Dr. Kossmann reflects on his life as a database researcher and tells us how Socrates, an innovative database-as-a-service architecture, is re-envisioning traditional database design. He also reveals the five superpowers of Microsoft Research and how we can improve science… with marketing. That and much more on this episode of the Microsoft Research Podcast. 

Host: Donald Kossmann, welcome to the podcast.

Donald Kossmann: Thanks. Thanks for having me.

Host: I like to start by situating my guests. It’s such a research-y term. And you are very impressively situated here. So as a Distinguished Scientist, and the Director of Microsoft Research’s Redmond Lab, what do you hope to accomplish here? What gets you up in the morning?

Donald Kossmann: So what gets me up in the morning are the people. I’m working with an incredible group of people. Researchers, engineers, designers, testers, program managers, biz operations people… They are all amazing and it’s an incredible privilege to be given the opportunity to – to be their advocate. On the research front, what gets me up is democratizing technology. I think banks have democratized money, right? And they’ve made it, for everybody, possible to have money, to grow money. Cars have made it possible for everybody to move around in the world. Right? That is the democratizing mobility. So databases, which is my background, has democratized data, which has made it possible for everybody to get the best value out of their data. If I want to get value out of my data, I need to get the tools to get the value. If we, kind of, go back to the metaphor of a bank, right? How do I get value out of my money from a bank is also that I combine it with other people and the bank pools it and then makes out of the mass something better and bigger and then lets me participate in that. So my data, my genome data alone, is not very useful, but of a large population of people pooling this together, correlating it with things that happen, that is actually very valuable. And I think what we need to still do, and that’s where democratization needs to happen, is that I, as an owner of my data, need to control how it is used and how I get the value back. And at the moment, we have just way too few offerings for that.

Host: Yeah. How does the cloud change that?

Donald Kossmann: Well the cloud, at the beginning, is just like a bank. It’s like a vault where you put your data and it’s also kind of the opportunity to do something with the data. So it is a platform that then allows everybody, at some point, to kind of realize their visions and their dreams on what to do with data and how to create value with that data.

Host: Prior to MSR, you had a lengthy and notable career in academia. I’m going to ask you more specifically about your life and your path to MSR later, but I think it’s worth talking briefly about your time as a professor at ETH Zurich where you were a database person in the computer science department. Tell us a little about the history of database systems and what the landscape looks like now in the era of cloud computing.

Donald Kossmann: One of the things I always say jokingly about databases, is that databases are boring and hard, and that’s why they make so much money. Because nobody wants to do the boring stuff and nobody can do the hard stuff, so it’s kind of a good combination. But essentially, database is a fairly old technology, but it has always been about three things. One thing is value. How do you get the best out of your data, which is, what are the features that you provide, the power of querying the data, of updating it, of correlating it, and doing things with the data? The second thing has been security. How do you make sure that the data stays under your control, that you own it and determine what happens with the data? And the third is, I would call it cost or performance, is making sure that you don’t overpay for the data, right? That it’s kind of cheap to, or kind of gets more and more affordable, to do what you want to do with your data and control it.

Host: Alright. So what did you do as a database professor?

Donald Kossmann: Yeah, so one of the waves I was very involved in was the so-called semi-structured data wave. The best way to process data is if it’s really structured and you know exactly what it is, right? And you have a schema, essentially. And I spent a lot of time working on semi-structured data, which has some structure that you kind of extract and that is kind of like getting good value out of all data, not just your structured data like your bank accounts, but also your email, the books you write, the Word documents you write, getting some value out of that.

Host: Mmm-hmm.

Donald Kossmann: So that was a big phase of mine. Another big phase of mine was distributed databases and how to optimize them and how to make them perform in a very scalable way.

Host: All right. So that’s kind of three waves you’re riding. Is there anything that you see out in the ocean right now that’s a wave coming in that database people might be facing… new challenges that research could address?

Donald Kossmann: I think it’s still about value, security and cost, and will always be in the database world. But I think what we’ve seen of the generations, or the eras of computing, is this pendulum, right? We started with a mainframe computer, which is kind of very centralized. Then we got into the PC era, which is kind of decentralized, where you push to the customer. Then we went to the web, which is, again, centralized. We went back to the mobile phone and smartphone, which is decentralized. Then we went to the cloud, which is, again, logically centralized. And now we are hitting back again in this pendulum to what we now call the edge. And I think we haven’t, in databases, even started to think about the edge because the edge for us is kind of like nine billion new machines. And nobody has thought about deploying databases on nine billion machines. We’re now at hundred thousands or ten thousands of machines in the cloud, but nine billion is yet a totally different thing!

Host: How… how are you even thinking about that?

Donald Kossmann: Well, of course my mental framework is pretty much always the same on the technology side: value, security and cost. But the best way to think about it is, if you believe in this as being the new machine or so, what is the killer application? What do you want to do with this data at the edge? And what are the constraints? So one way to do research is to look at what is happening today and think of one assumption that is going to go away, right? Or that has had it, but usually it’s an assumption that goes away. Of course, it has to be driven by some application so now, if the assumption goes away, it’s centrally managed, what can you do with this data now…

Host: Right. 

Donald Kossmann: …if you have such a system, and then that kind of inspires you to think about how to build such a system.

Host: Before you put on your visionary leader hat – I know you wear a lot of hats around here, Donald – I want you to tell us about some of your own research. Let’s start with Cipherbase, which is a SQL database system that stores and processes strongly encrypted data. Tell us about Cipherbase, and how a database professor got involved in cyber security and cryptography.

Donald Kossmann: When I was at ETH in the late 2000s, I was working with several companies. Among others, I was working with the Swiss banks. And so, there was actually a very big scandal in Switzerland and that is that the German government paid one of the Swiss database administrators to produce a CD of all German customers that had bank accounts in that bank. And of course, the assumption was, those were all tax evaders and most of them were. So the problem with that was that in Switzerland, this is illegal. But in Germany it’s actually totally legal. So what happened is that the Swiss bank came to me, because it was a database administrator and I was a database person, they came to me, and they said, okay, Donald, how can we prevent that that never happens again? And that kind of created my interest into encrypted databases or protecting data from the administrators but still letting the administrator do their job. They do a lot of important things with the database, but they don’t have to really look at the data, right? The business needs to look at the data, but not the database administrator.

Host: Hmm.

Donald Kossmann: And so we developed a bunch of technology, and we worked on that for two or three years and, at some point, a distinguished engineer from Microsoft visited ETH and he came to me and we talked about what I work on and I told him that story and he said, oh, actually we are, at Microsoft, very interested in that problem. And so I visited Microsoft and MSR and I learned about their solution and it was actually much, much better than what I had thought out over three years at ETH. So I said, oh, this is great. I want to work with you. And that was the…

Host: Very interesting. 

Donald Kossmann: …birth of the Cipherbase project. And that’s what, then, later on became the Always Encrypted feature of SQL.

(music plays)

Host: Traditional database architecture has some significant limitations when it hits the cloud, and one of the most exciting projects that you’re even currently involved with is an answer to that. It’s called Socrates. As a kind of setup, I want you to unpack expectations versus reality with the move toward Database-as-a-Service paradigms in the cloud, and how this new architecture compares with the older, what you call, monolithic database architecture.

Donald Kossmann: I think this question is best answered if I give an analogy, and that is retail. There’s the brick and mortar retail and then there’s the online retail. And both are important, just like both database architectures…

Host: Right. 

Donald Kossmann: …will be important, but they were designed with different assumptions and different goals in mind. So the brick and mortar, we want to kind of minimize the movement of goods. So you go there, you try your new fancy suit on, it fits, you go home, you have almost zero returns because logistics are expensive.

Host: Right.

Donald Kossmann: It is also about a kind of very specific experience that people want to have. It is all together. So this is what traditional databases do. They are designed for a particular experience and having a particular assumption. So moving data around is expensive. And the experience is, when I do a query to a database, I want to immediately get the answer…

Host: Right. 

Donald Kossmann: …and I want it to be fast. Now let’s go to online retail. Online retail has this big logistics problem, but it has some other features, right? Essentially you have virtually all products in one hand at one fingertip…

Host: Right.

Donald Kossmann: …and if you think about why online retail is so successful is because it is cheap, right? That’s what got people hooked up, is low cost. And why is it so cheap? Because it never wastes any resources, right? If you look at a shop, there are people working there that, sometimes there are no customers and they are just wasting resources. If you think about an online retailer, there’re no wasted resources. All the workers are constantly working and active. There’s nobody standing around. And the same happens in the cloud. And that is essentially the Socrates architecture. It is really designed for not wasting any resources. And that’s our kind of goal in the cloud to drive down cost and that’s why we separate the resources and you just use resources and put them together as you need them.

Host: All right, so I want you to tell me a little bit more about Socrates, technically, and how you have achieved this reduction in cost and increase in efficiency with the architecture that Socrates presents.

Donald Kossmann: Yeah, so, essentially what it is all about is separating concerns or disaggregating. So, traditional databases are monoliths. All functionality is kind of intertwined and mingled together, but very highly-optimized to have that experience…

Host: Right.

Donald Kossmann: …just like a shop. What Socrates does is it essentially separates compute, storage, and the log… Essentially, it separates concerns to make sure that we can optimize and can utilize these concerns in the best possible way. When we talk about disaggregation, we typically talk about disaggregation of computing resources…

Host: Right.

Donald Kossmann: …and when we talk about the architecture that does it, we talk about decomposing it into mini-services.

Host: Interesting.

Donald Kossmann: So there’s a mini-service that runs queries. There’s a mini-service that logs all the updates that happen. And then there’s a mini-service that serves the data. In the retail environment, there’s a mini-service that gives you the catalog and presents the goods to you. There’s a mini-service that is the warehouse that ships the products to you. And then there’s a mini-service that does the payment. And it’s kind of like the analogy here.
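
The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Socrates implementation: the class names LogService, StorageService and ComputeNode, their methods, and the in-memory list and dictionary that stand in for durable storage are all hypothetical. The only idea taken from the conversation is that running queries, logging updates and serving data are separate services that can be combined as needed.

```python
from dataclasses import dataclass, field


@dataclass
class LogService:
    """Hypothetical mini-service: append-only record of every update."""
    records: list = field(default_factory=list)

    def append(self, key, value):
        self.records.append((key, value))
        return len(self.records)  # position of this record in the log


@dataclass
class StorageService:
    """Hypothetical mini-service: serves data, built by replaying the log."""
    pages: dict = field(default_factory=dict)
    applied: int = 0  # number of log records applied so far

    def catch_up(self, log):
        # Apply only the log records this service has not seen yet.
        for key, value in log.records[self.applied:]:
            self.pages[key] = value
        self.applied = len(log.records)

    def read(self, key):
        return self.pages.get(key)


@dataclass
class ComputeNode:
    """Hypothetical mini-service: runs queries and holds no data of its own."""
    log: LogService
    storage: StorageService

    def put(self, key, value):
        # Writes go to the log only; storage picks them up later.
        return self.log.append(key, value)

    def get(self, key):
        self.storage.catch_up(self.log)
        return self.storage.read(key)


log = LogService()
storage = StorageService()
node_a = ComputeNode(log, storage)
node_b = ComputeNode(log, storage)  # a second compute node, added without copying data

node_a.put("balance:42", 100)
node_a.put("balance:42", 75)
print(node_b.get("balance:42"))  # prints 75
```

Because the compute nodes in this sketch keep no state, a second one can be attached to the same log and storage and immediately see the first node's writes, which is the "put resources together as you need them" point made earlier in the conversation.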

Host: All right, so where is this in the pipeline because we’ve got huge legacy systems. And now you’ve got this new idea that’s optimized for the cloud…

Donald Kossmann: In some sense, the good news is we don’t have to change the API. Kind of another analogy is if you buy an electric vehicle, you don’t have to relearn how to drive…

Host: No!

Donald Kossmann: …right? So you’ve changed the engine and you’ve done something really big underneath, and that’s one of the big achievements of the engineering effort of Socrates, is that we didn’t change the API. So it’s all under the hood!

Host: Failure has many faces and not all of them are ugly! You’ve had some experience with failure. Sometimes you’ve even called it miserable failure. So tell us about some work that you’ve done that didn’t work out and what lessons you learned while you were at it.

Donald Kossmann: My favorite story is that of a failed start-up. So I’ve also done start-ups. That’s, kind of for me, is part of the academic experience that you do start-ups. And I had a grandiose failure of a start-up.

Host: I’m not laughing at you…

Donald Kossmann: Yeah, well, it’s actually fun now! It was a little bit less fun at the time. So in 2006, Amazon kind of started with the cloud and I started a company that built a semi-structured database for the cloud. And it was a semi-structured database because I had been working on semi-structured databases. And it was the cloud because I thought the cloud was really cool and was going to be a game-changer and I wanted to be the first there! The problem was that these were two big bets. It was a big bet on the cloud and it was a big bet on semi-structured data. And if I had just made the cloud bet, I would have been great. I mean, the ideas of separating compute and storage, that’s exactly what I did at that time. And the semi-structured bet, I think it’s still going to pan out. It’s still really important. It just didn’t pan out at the same time. And with a start-up, if you make two bets, they need to all pan out at the same time. And that’s just not going to happen, right? So bet on one miracle rather than two or three! Finding the one miracle, that is the art of doing a start-up, but also doing a research project.

Host: Let’s talk about superpowers. You wrote a blog post, which I loved, where you compared and contrasted super powers of academia, of product groups, of start-ups and Microsoft Research. So give us a superpower breakdown of these various institutions and entities and where you land personally on what we might call the value proposition of Microsoft Research.

Donald Kossmann: Essentially, the… what I believe the five super powers that the company gave us – this is really when Bill Gates kind of founded Microsoft Research – this is freedom. We can freely collaborate with everybody in the company. We are not tied to any organizational structure. The second one is, we have time. We don’t have product deadlines, shipping deadlines and so we have time to really think things through.

Host: Mmm-hmm.

Donald Kossmann: The third one is, we take risks. We can fail fast. We don’t have legacy. If we find that an idea is stupid, we just kill it. We just stop working on it, right?! This is different from product groups. Creativity is a big part of our culture. We generate ideas constantly. This is kind of part of our job. And the fifth one is we build stuff, we execute, and, of course, we do that with the product groups.

Host: Right.

Donald Kossmann: And so coming to your question, I think every kind of organization has a different mix of these.

Host: Right.

Donald Kossmann: I mean, academia is creative. Our product groups are creative, right? Start-ups have some of these. But this combination is unique. And so, if we want to innovate, which is kind of our mission, and what we want to achieve, that’s how we create value to the company, we have to use these five super powers. We were talking about some projects like Always Encrypted or Cipherbase, that’s exactly something that academia cannot do because academia doesn’t have the execution part.

Host: Right. 

Donald Kossmann: They just don’t have the resources to do it. A start-up cannot do it either because these projects take time and the time to do this, a start-up just doesn’t have. And so that’s what we’re looking for and it’s actually amazing, in this time, how many projects really need exactly this combination of super powers.

Host: There’s been a long-standing debate between what I might call pure research purists and another group that I would call team tech transfer, who are entrepreneurs. And the argument stems around purpose of research, and how you measure it. And one side is always yelling science! and the other is always yelling impact! but you’ve actually argued that the argument is becoming moot. Why?

Donald Kossmann: Well, because it’s both, right? And so I would have to kind of drill down a little bit what I think a good research project does and it has essentially three components. It has scientific insight, right? Some idea, some secret sauce. The second piece is it has execution. It executes on something. It creates something. And the third one is, I call it marketing, but what it really means is having clarity on the impact. And the interesting thing is that the execution and marketing make the science better. I cannot explain it, but it’s happening right now. When we do science and we execute, that actually is a feedback loop to our science. We see things that we wouldn’t have seen if we hadn’t executed on it. Or creating the clarity on how this is going to change the world makes us kind of question assumptions that we might have not done if we had just stayed in the scientific world, and actually makes the science much more interesting. This is what I find so amazing about the job that I have and about Microsoft Research, if I see how researchers kind of get this insight and they say, yes, the execution makes my science better and the impact makes my science better, this is kind of like really deeply gratifying.

(music plays)

Host: Most people think, somewhat logically, that in order to innovate we need to think forward, or think ahead. And you suggest, in another provocative blog post, that we actually need to think backwards. Tell us what you mean by thinking backwards and then unpack why we need to do it, why it’s hard to do it, and what happens when we do it.

Donald Kossmann: Yeah. So I wrote this blog post as a reaction to comments that I heard often: “Well, we’ll cross that bridge when we get there.” And often what happens is, you never get to that bridge or when you get there you’re really stuck, right?

Host: Totally unprepared.

Donald Kossmann: Unprepared and you don’t know what’s going on. So what thinking backwards does is, it starts with, what we call at Microsoft, defining a North Star goal, a really good North Star goal. And then not immediately jump, oh, what is the best direction to this North Star goal? But kind of creating landmarks. And I call them landmarks because milestones are kind of like forward thinking, Milestone 1… what is your Milestone 1? But I actually think about landmark N minus 1. Because really what we do is we navigate uncertainty. We don’t know where we will go. But if we know, oh, there is somewhere there, I need to get there. I don’t know exactly what will happen on the path, but I know the dimensions. I know I can go west and east. I can go north and south. I know essentially how I can maneuver. And if I know the landmarks, right? then I can get there. And if I do get stuck, it kind of helps me not to get frustrated. So if I know this is my landmark and I get stuck, I hit a dead end, which happens to all of us, I will find a solution to get to the landmark, or I will redefine the landmark. It gives me much more clarity to deal with these situations.

Host: Okay.

Donald Kossmann: Whereas if you move forward and you hit a dead end, you’re stuck. And then you often give up and get frustrated.

Host: Well, Donald, we’ve reached a part in the podcast where I always ask my guests what could possibly go wrong? And I do this because every line of research that has potential for great good also has potential for great risk or great harm. And as a leader, you don’t only have to worry about your own stuff. You have to worry about all the stuff of all people that you shepherd and supervise. So what, if anything, keeps you up at night, metaphorically, and what responsibility do you have to identify and then try to mitigate the potential risks of the work that you do and the work that the people here do?

Donald Kossmann: So as a high-order bit, I’m an optimist and I just, well, move forward. We’re just talking about thinking backwards, but I actually always think there is a way out. And one of the reasons why I think that is because, in the bad situations I’ve been in life, there’s always somebody with me, I’ve always managed to never be alone because if things get bad, it’s much better not to be alone. I have also, I have this, I think one of my biggest strengths is I can detach myself from myself. So sometimes if things go really wrong, I can look at myself and say, well, Donald, you really screwed up. Okay. And then I have a different perspective and it helps me to move on. In Microsoft Research we are about risk taking. We’ve created something called Failure to Lunch, which is a seminar series where people of the lab talk about their failure and we celebrate the kind of, what we call, smart risk-taking but usually there’s something to be learned. And we celebrate failure and that is great, I think.

Host: All right. So let’s move over and not talk about failure. Let’s say you succeed wildly in some of these technologies that you’re chasing, that are your North Star goals, and they have unintended consequences. How do you mitigate that?

Donald Kossmann: That, of course, is a great question and I think, when I started computer science, we were innocent. I remember writing grant proposals and the question about ethical concerns, it was a no brainer. And now, everything we do has an ethical side. We are dealing with technology that is dangerous and we know it, right? It can all be misused in many ways. As scientists, we have a responsibility to think about how our technology can be misused and we have a big, big responsibility to educate society and do our best to explain the technology and possible misuses of technology. If we do that, we kind of do the right thing and we also play our role. It is not our decision how our technology is used. We just need to be responsible and develop technology that makes the world a better place. We should think about the positive sides, but again, I’m an optimist.

Host: All right, it’s story time. We’ve heard a bit about your academic and professional life. Let’s rewind a bit and hear how you got there.

Donald Kossmann: How I got into computer science is not a heroic story. I didn’t know anything else what to do. It’s kind of more random than really by design. I come from a family of lawyers. And so I wanted to always become a chartered accountant, which is also… I mean, database is just as boring as that… 

Host: You can make a lot of money!

Donald Kossmann: Yeah! And, and so I went to the Harvard summer school and I did a course on programming and it just infected me. I got the bug and I love programming. And so that kind of changed my plans. I studied computer science and I think I just got lucky.

Host: So from there you went to be a professor. You got a PhD somewhere in the mix there. You were a professor at ETH Zurich. Connect the dots for us.

Donald Kossmann: Here is the story. I… So I had been working on this Cipherbase project with MSR and probably I was somehow on the watch list and I had visited. And I got an offer to join MSR. And for me, it was actually pretty clear that I would decline the offer. Unfortunately, not for my wife, or fortunately, not for my wife! So my wife literally said, “Donald, you can stay happily senile at ETH or you can start from scratch.” And I thought, well, actually both of those options are pretty bad, right? I don’t want to start from scratch, but I don’t want to be senile either. And so, what we ended up deciding, that I start from scratch, because, why have one career if you can have two careers, right?

Host: Right.

Donald Kossmann: And so now I’m in my second career. I started from scratch and it has been quite a ride.

Host: Tell us something interesting about yourself that we might not know, whether it’s a characteristic, a life event, a side quest, something you’ve done and how it has affected or impacted your life or career? And if it didn’t even affect yours, maybe somebody else’s?

Donald Kossmann: One thing I did this summer, I wrote a small book. It’s called Wunder Informatik… The Miracle of Computer Science. So I actually wrote it in German because I have four children, three daughters, I wrote it essentially for my daughters because they are kind of asking me all these questions. Why did you study computer science? How did you get there? And I never had a good answer. I felt like I weaseled my way through a whole career as a professor without knowing why to study computer science or why I was so lucky. I had to reflect on this and what is it that makes computer science so special? I wrote it. I evangelized it a little bit in Switzerland. That’s why I wrote it in German because I was invited to give a commencement speech at one of the… a high school in Switzerland, and that was kind of, I used that opportunity to kind of advertise the book and so I got a lot of feedback from that. It has been great because, as it always is, when you teach you learn more than anybody else, right?

Host: We’ve talked about the importance of thinking backwards and establishing North Star goals. Maybe you could give our listeners a little primer on how they could go about setting their own North Star goals, including what they should do if they run into one of those dead ends that you talked about earlier.

Donald Kossmann: What I believe my biggest problem as a lab director, or as a researcher, is defining the right goal. That is the biggest problem of all. Essentially it defines our ambitions. Getting the right level of ambition, and rising because the opportunity is rising, that’s a very difficult task for a researcher. So I think that the thinking backwards framework is actually great to define a North Star goal and to get to the right level of ambition. And the way it works is, if you don’t find a path backward from your goal to where you are, your ambition is probably too high, right? Starting civilization in space, right? is probably too big of an ambition because we cannot execute on it, right? But if you kind of have not interesting landmarks, if they are boring, if they are not inspiring you, then your ambition was probably too small. And so the framework allows you to, first of all, reason what your goal is and kind of dream of it and the implications that it has and the impact, but it is also a way to kind of keep you honest and validate things.

Host: As we close, I want to give you the last word. And as the leader of the MSR Lab in Redmond, you’re in a unique position to offer some advice and inspiration to our listeners. What’s the next big North Star goal for Microsoft Research?

Donald Kossmann: Yeah, so of course now I have to think big and I have to think beyond to be inspiring. So I think we have been programming devices. We’ve been programming mainframes. We’ve been programming PCs. We’ve been programming the web and so on. I think we need to go to the extreme craziness and think that the world is one big computer. I think this is the big North Star goal that we have. And I think, to break it down, and we were talking about the edge and the cloud, we are kind of making the world programmable by injecting computers, or micro controllers, into everything and that way, we make the world programmable. But at the moment we’re still doing that in isolation. And I would love us to think of it as a one big system that we should program. And of course, we should think about, again, what are the things that we can enable? What are the killer applications of that computer? What are the ways to optimize it kind of in the same way as Socrates? How to secure it? If everything is connected, how do you draw lines? And, essentially, how to program it in an efficient way so that everybody can take advantage of this world computer. Again, it ties into our superpowers. We have the freedom to work on this. We have the skills to execute, maybe not on the complete vision, but on pieces, important pieces, once we have clarity about the landmarks. We have time to do that. We can take risks, right? Some of the things will fail on that path. We have all the ingredients here that you need to address these really, really big dreams.

Host: Donald Kossmann, thanks for taking time away from your own North Star goal and coming in!

Donald Kossmann: Thank you so much!

(music plays)

To learn more about Dr. Donald Kossmann and how thinking backwards is moving us forward, visit Microsoft.com/research 
