Neural architecture search, imitation learning and the optimized pipeline with Dr. Debadeepta Dey



Episode 108 | February 26, 2020

Dr. Debadeepta Dey is a Principal Researcher in the Adaptive Systems and Interaction group at MSR, and he’s currently exploring several lines of research that may help bridge the gap between perception and planning for autonomous agents, teaching them to make decisions under uncertainty, and even to stop and ask for directions when they get lost!

On today’s podcast, Dr. Dey talks about how his latest work in meta-reasoning helps improve modular system pipelines and how imitation learning hits the ML sweet spot between supervised and reinforcement learning. He also explains how neural architecture search helps enlighten the “dark arts” of neural network training and reveals how boredom, an old robot and several “book runs” between India and the US led to a rewarding career in research.

Transcript

Debadeepta Dey: We said, like, you know what? Agents should just train themselves on when to ask during training time. Like, when they make mistakes, they should just ask and learn to use their budget of asking questions back to the human at training time itself. When you are in the simulation environments, we used imitation learning as opposed to reinforcement learning. Because you are in simulation, you have this nice programmatic expert. An expert need not be just a human being, right? Or a human teacher. It can also be an algorithm.

Host: You’re listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I’m your host, Gretchen Huizinga.

Host: Dr. Debadeepta Dey is a Principal Researcher in the Adaptive Systems and Interaction group at MSR and he’s currently exploring several lines of research that may help bridge the gap between perception and planning for autonomous agents, teaching them to make decisions under uncertainty, and even to stop and ask for directions when they get lost!

On today’s podcast, Dr. Dey talks about how his latest work in meta-reasoning helps improve modular system pipelines, and how imitation learning hits the ML sweet spot between supervised and reinforcement learning. He also explains how neural architecture search helps enlighten the dark arts of neural network training and reveals how boredom, an old robot and several “book runs” between India and the US led to a rewarding career in research. That and much more on this episode of the Microsoft Research Podcast.

(music plays)

Host: Debadeepta Dey, welcome to the podcast!

Debadeepta Dey: Thank you.

Host: It’s really great to have you here. I talked to one of your colleagues early on because I loved your name. You have one of the most lyrical names on the planet, I think.

Debadeepta Dey: Thank you.

Host: And he said, we call him 3D.

Debadeepta Dey: That’s right. That’s right, yeah!

Host: And then you got your PhD and they said, now we have to call him 4D!

Debadeepta Dey: That’s right. Oh, yes. Yes, so the joke amongst my friends is, like, well, I became a dad, so that’s 5D, but they’re like, well, we’ll have to wait twenty, thirty years; if you become the director of some institute, that will be a sixth D and whatnot. But the Ds are getting harder to accumulate.

Host: Right! And it’s also like the phones… 3G, 4G, 5G.

Debadeepta Dey: Exactly.

Host: When does it end? Well I’m so glad you’re here. You’re a principal researcher in the Adaptive Systems and Interaction, or ASI group, at Microsoft Research and you situate your work at the intersection of robotics and machine learning, yeah?

Debadeepta Dey: That’s right.

Host: So before I go deep on you, I’d like you to situate the work of your group. What’s the big goal of the Adaptive Systems team and what do you hope to accomplish as a group – or collectively?

Debadeepta Dey: ASI is one of the earliest groups at MSR, right? Like, you know, because it was founded by Eric, and if you dig into the history of MSR groups, many groups have spun off from ASI, right? So instead of a thematic group, I would say ASI is more like a family. ASI is different from most groups because it has people who have very diverse interests, but there are certain common themes which tie the group together, and I would say it is decision-making under uncertainty. There’s people doing work on interpretability for machine learning, there’s people doing work on human-robot interaction, social robotics, there’s people doing work in reinforcement learning, planning, decision-making under uncertainty, but what all of these things have in common is that you have to do decision-making under bounded constraints. What do we know? How do we get agents to be adaptive? How do we endow agents, be it robots or virtual agents, with the ability to know what they don’t know and act how we would expect intelligent beings to act?

Host: All right, well let’s zoom in a little bit and talk about you, and what gets you up in the morning. What’s your big goal, as a scientist, and if I could put a finer point on it, what do you want to be known for at the end of your career?

Debadeepta Dey: You know, I was thinking about it yesterday, and one of the things that leaped out at me is, like, you know, I want to be known for fundamental contributions to decision theory. And by that I don’t mean just coming up with new theory, but also principles of how to apply it, principles of how to practice good decision science in the world.

Host: Well, let’s talk about your work, Debadeepta. Our big arena here is machine learning, and on the podcast I’ve had many of your colleagues who’ve talked about the different kinds of machine learning in their work, and each flavor has its own unique strengths and weaknesses, but you’re doing some really interesting work in an area of ML that you call learning from demonstration, and more specifically, imitation learning. So I’d like you to unpack those terms for us and tell us how they’re different from the other methods, what they’re good for, and why we need them.

Debadeepta Dey: First of all, the big chunk of machine learning that we understand well today is supervised learning, right? You get a data set of labeled data and then you train, basically, a curve-fitting algorithm, right? Like, you are fitting a function approximator to say that if you get new data samples, as long as they are from the same distribution that produced the training data, you should be able to predict what their label should be.

Host: Right.

Debadeepta Dey: Right? And the same holds even for regression tasks. So supervised learning theory and practice is very well understood. I think the challenge that the world has been focusing… or has a renewed focus on in the last five, ten years has been reinforcement learning, right? And reinforcement learning algorithms try to explore from scratch, right? You are learning tabula rasa; you assume that the agent was just born and now has to interact with the world and acquire knowledge. Imitation learning is more middle-ground, where it says, hey, I’m going to learn a policy, or a good way of acting in the world, based on what experts are showing me, right?

Host: Okay.

Debadeepta Dey: And the reason this is powerful is because you can bootstrap learning. It assumes more things, namely that you need access to an expert, or teacher, but if the teacher is available, and is good, then you can very quickly learn a policy which will do reasonable things.

Host: Okay.

Debadeepta Dey: Because all you need to do is mimic the teacher.

Host: So that’s the learning from demonstration? The teacher demonstrates to the agent, and then the agent learns from that and it’s somewhere between just having this data poured down from the heavens, and knowing nothing.

Debadeepta Dey: And knowing nothing, right?

Host: Okay.

Debadeepta Dey: And mostly, in the world, especially in domains like robotics, you don’t want your robot to learn from nothing.

Host: Right.

Debadeepta Dey: Like, you know, to begin tabula rasa, because now you have this random policy that you would start with, right? Because in the beginning you’re just going to try things at random, right?

Host: Right.

Debadeepta Dey: And robots are expensive. Robots can hurt people, and also the amount of data needed is immense, right? Like, the sample complexity of reinforcement learning algorithms, even theoretically, is really high, and so it means that it will be a long, long time before you do interesting things.
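To make the “mimic the teacher” idea concrete, here is a minimal behavior-cloning sketch in Python. It is a toy illustration, not anything from Dey’s work: the observations and the “expert” labels are synthetic stand-ins, and the point is only that imitation learning, in its simplest form, reduces to the supervised curve fitting described above.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Pretend we logged expert demonstrations: an observation vector and the
# discrete action the teacher took in each state. Both are synthetic here.
rng = np.random.default_rng(0)
observations = rng.normal(size=(1000, 8))               # e.g., sensor features
expert_actions = (observations[:, 0] > 0).astype(int)   # stand-in "teacher"

# Imitation learning in its simplest form reduces to supervised learning:
# fit a function approximator mapping observations to the expert's actions.
policy = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
policy.fit(observations, expert_actions)

# At test time the agent acts by mimicking the teacher.
new_observation = rng.normal(size=(1, 8))
print("policy chooses action:", policy.predict(new_observation)[0])
```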

Host: Right. Well, I want to talk a bit about automation. You’ve done some interesting exploration in what you call neural architecture search, or NAS as we’ll call it for short. What is NAS, what’s the motivation for it, and how is it impacting other areas in the machine learning world?

Debadeepta Dey: So NAS is a sub-field of this other sub-field in machine learning colloquially called AutoML right now, right? AutoML’s aim is to let algorithms search for the right algorithm for a given data set. Let’s say this is a vision data set or an NLP data set, and it’s labeled, right? So let’s assume the simpler supervised setting instead of RL. And you go, okay, I’m going to try my favorite algorithms from my toolkit, but you are not really sure: is this the best algorithm? Is this the best way to pre-process the data? Whatnot, right? So the question then becomes, what is the right architecture, right? And what are the right hyper-parameters for that architecture? What’s the learning rate schedule? These are all things we call the “dark arts” of training and finding a good neural network for, let’s say, a new data set, right? So this is more art than science, right? And, as a field, that’s very unsatisfying. Like, it’s all great, the progress that deep learning has made is fantastic. Everybody is very excited, but there’s this dark-art part, which is there, and people are like, well, you just need to build up a lot of practitioner intuition, right? And this is an answer which is deeply unsatisfying to the community as a whole, right? Like, we refuse to accept this as the status quo.

Host: Well, when you’re telling a scientist that it’s art and you can’t codify it…

Debadeepta Dey: Yes.

Host: …that’s just terrible.

Debadeepta Dey: That’s just terrible, and it also shows that, like, you know, we have given up, or we have lost the battle here… and our understanding of deep learning is so shallow that we don’t know how to codify things.

Host: All right, so you’re working on that with NAS, yeah?

Debadeepta Dey: Yes, so the goal in neural architecture search is, let algorithms search for architectures. Let’s remove the human from this tedious “dark arts” world of trying to figure things out from experience. And it’s also very expensive, right? Like, you know, most companies and organizations cannot afford armies of PhDs just sitting around trying things, and it’s also not a very good use of your best scientists’ time, right? And ideally we want this: you bring a data set, let the machine figure out what it should run, and it spits back out the model.
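As a hedged illustration of that “bring a data set, get back a model” loop, here is a toy random-search NAS sketch in Python. The search space, the budget of 20 candidates, and the scoring function are all hypothetical stand-ins; a real NAS system would spend substantial compute inside train_and_score.

```python
import random

# Hypothetical search space: layer count, width, and learning rate.
SEARCH_SPACE = {
    "num_layers": [2, 4, 8],
    "width": [64, 128, 256],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

def train_and_score(arch):
    """Stand-in for actually training the candidate network and returning
    its validation accuracy; a real NAS system spends GPU hours here."""
    scorer = random.Random(str(sorted(arch.items())))  # deterministic fake score
    return scorer.random()

best_arch, best_score = None, float("-inf")
for _ in range(20):  # search budget: 20 candidate architectures
    arch = {name: random.choice(opts) for name, opts in SEARCH_SPACE.items()}
    score = train_and_score(arch)
    if score > best_score:
        best_arch, best_score = arch, score

print(f"best architecture: {best_arch} (score {best_score:.3f})")
```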

Host: Right. Well, the first time we met, Debadeepta, you were on a panel talking about how researchers were using ML to troubleshoot and improve real-time systems on the fly…

Debadeepta Dey: Yeah.

Host: …and you published a paper just recently on the concept of meta-reasoning to monitor and adjust software modules on the fly, using reinforcement learning to optimize the pipeline.

Debadeepta Dey: Yeah.

Host: This is fascinating, and I really loved how you framed the trade-offs for modular software and its impact on other parts of the system, right?

Debadeepta Dey: Right.

Host: So I’d like you to kind of give us a review of what the trade-offs are in modular software systems in general and then tell us why you believe meta-reasoning is critical to improving those pipelines.

Debadeepta Dey: So this project, just for a little bit of fun background, actually started because of a discussion with the Platform for Situated Interaction team and Dan Bohus, who’s in the ASI group and, like, you know, sits a few doors down from me, right?

Host: Yeah.

Debadeepta Dey: And so the problem statement actually comes from Dan and Eric. I immediately jumped on the problem because I believed reinforcement learning and contextual bandits provide feasible lines of attack right now.

Host: So why don’t you articulate the problem…

Debadeepta Dey: Okay.

Host: …writ large, for us.

Debadeepta Dey: Okay. So let me give you this nice example, which will be easy to follow. Imagine you are on a self-driving car team, right? And you are on the software team, right?

Host: Yeah.

Debadeepta Dey: And the software team is divided into many sub-teams, which are building many components of the self-driving car software, right? Let’s say somebody is writing the planner, somebody is writing low-level motor controllers, somebody is writing the vision system, the perception system, and then there are parts of the team where everybody’s integrating all these pieces together so the end application runs, right? And this is a phenomenon which software teams, not just in robotics, but also if you’re developing web software or whatnot, find all the time.

Let’s say you have a team which is developing the computer vision software that detects rocks, and if there are rocks, it will just say that these parts near the robot right now are rocks. Don’t drive over them. And in the beginning, they have some machine-learned model where they collected some data, and that model is, let’s say, sixty, seventy percent accurate. It’s not super nice, but they don’t want to hold up the rest of the team, so they push the first version of the module out so that there is no bottleneck, right? And while they push this out, on the side they’re trying to improve it, right? Because clearly sixty, seventy percent is not good enough, but that’s okay. Like, you know, we will improve it. Three months go by, they do lots of hard work and say, now we have a ninety-nine percent good rock detector, right? So, rest of the team, you don’t need to do anything. Just pull our latest code. Nothing will change for you. You will just get an update and everything should work great, right? So everybody goes and does that, and the entire robot just starts breaking down, right? And here you have done three months of super-hard work to improve rock detection to close to a hundred percent, and the robot is just horrible, right?

And then all the teams get together, like, what happened? What happened is, because the previous rock detector was only sixty, seventy percent accurate, the parameters of downstream modules had been adjusted to account for that. They’re like, oh, we are not going to trust the rock detector most of the time. We are actually going to be very conservative. These kinds of decisions have been made downstream, which actually depend upon the quality of the results coming from upstream, in order to make the whole system behave reasonably. But now that the quality of this module has drastically shifted, even though it is better, the net system actually has not become globally better. It has become globally worse.

Host: Right.

Debadeepta Dey: And this is a phenomenon that large software teams see all the time. This is just a canonical example which is easy to explain, like, you know, if you imagine anything from like Windows software or anything else.

Host: Any system

Debadeepta Dey: Mmm-hmm.

Host: that has multiple parts.

Debadeepta Dey: Yeah. So improving one part doesn’t mean the whole system becomes better.

Host: In fact, it may make it worse.

Debadeepta Dey: In fact, it may make it worse.

Host: Right.

Debadeepta Dey: Just like in NAS, where we are using algorithms to search for algorithms, this is another kind of AutoML, where we are saying, hey, we want a machine-learned monitor to check the entire pipeline and see what it should do to react to changing conditions, right?

Host: Okay.

Debadeepta Dey: So the machine… this monitor is looking at system-specific details like CPU usage, memory usage, the run time taken by each component; it’s monitoring everything. The entire pipeline, as well as the hardware on which it is running and its conditions, right?

Host: Right.

Debadeepta Dey: And it is learning policies to change the configuration of the entire pipeline on-the-fly to try to do the best it can as the environment changes.
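One minimal way to picture such a monitor is as a contextual bandit over pipeline configurations. The sketch below is a generic epsilon-greedy construction with made-up contexts, configurations, and rewards, not the algorithm from the paper; it only shows the shape of the loop: observe system state, pick a configuration, update from end-to-end reward.

```python
import random
from collections import defaultdict

# Hypothetical pipeline configurations the meta-reasoner can switch between.
CONFIGS = ["trust_detector", "downweight_detector", "very_conservative"]

class EpsilonGreedyContextualBandit:
    """Tabular epsilon-greedy over (context, configuration) pairs."""

    def __init__(self, actions, epsilon=0.1):
        self.actions = actions
        self.epsilon = epsilon
        self.counts = defaultdict(int)    # (context, action) -> times chosen
        self.values = defaultdict(float)  # (context, action) -> mean reward

    def choose(self, context):
        if random.random() < self.epsilon:
            return random.choice(self.actions)               # explore
        return max(self.actions,
                   key=lambda a: self.values[(context, a)])  # exploit

    def update(self, context, action, reward):
        key = (context, action)
        self.counts[key] += 1
        # incremental running mean of the observed end-to-end reward
        self.values[key] += (reward - self.values[key]) / self.counts[key]

def observe_context():
    """Stand-in for monitoring CPU, memory, and per-module latency."""
    return "high_load" if random.random() < 0.5 else "low_load"

def run_pipeline(config, context):
    """Stand-in for running the reconfigured pipeline and measuring reward."""
    base = 0.8 if config == "downweight_detector" else 0.5
    return random.gauss(base, 0.1)

bandit = EpsilonGreedyContextualBandit(CONFIGS)
for _ in range(1000):
    context = observe_context()
    config = bandit.choose(context)
    bandit.update(context, config, run_pipeline(config, context))

print("learned preference under high load:",
      max(CONFIGS, key=lambda a: bandit.values[("high_load", a)]))
```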

Host: As the modules change, get better, and impact the whole system. How’s it working?

Debadeepta Dey: We have found it really promising, right? And right now we are looking for bigger and bigger pipelines to prove this out on and see where we can showcase it even better than we already have in the research paper.

Host: Real briefly, tell me about the paper that you just published and what’s going on with that in the meta-reasoning for these pipelines.

Debadeepta Dey: So that paper is at AAAI. It will come out in February, actually in New York next week, and there we showed that you can use techniques like contextual bandits, as well as stateful reinforcement learning, to safely change the configurations of entire pipelines all at once, right?

Host: Wow.

Debadeepta Dey: And keep them from degrading drastically under adversarial changes in conditions, right?

Host: You know, just as a side note, my husband had knee replacement surgery.

Debadeepta Dey: Okay.

Host: But for decades he had had a compressed knee because he blew it out playing football

Debadeepta Dey: Okay.

Host: …and he had no cartilage.

Debadeepta Dey: I see.

Host: So his body was totally used to working in a particular way.

Debadeepta Dey: Yeah.

Host: When they did the knee surgery, he gained an inch in that leg. Suddenly he has back problems.

Debadeepta Dey: Yeah, because now your back has to, like, you know… it’s the entire configuration, right? You can’t just…

Host: No, and it’s true of basically every system, including the human body: you push down here, it comes out there!

Debadeepta Dey: No, that’s true. Cars, like, people go, oh, I’m going to put a big tire on my car, and then the entire performance of the car is degraded because the suspension is not adapted.

Host: But it’s a cool tire.

Debadeepta Dey: Yeah, it’s a cool tire, the steering is now rock hard and unwieldy, but the tire looks good though.

(music plays)

Host: Well, let’s talk a little bit more about robots, Debadeepta, since those are your roots.

Debadeepta Dey: Yes.

Host: So, most of us are familiar with digital assistants like Cortana and Siri and Alexa and some of us even have physical robots like Roomba to do menial tasks like vacuuming, but you’d like us to be able to interact with physical robots via natural language and not only train them to do a broader variety of tasks for us, but also to ask us for help when they need it!

Debadeepta Dey: Yeah.

Host: So tell us about the work that you’re doing here. I know that there’s some really interesting threads of research happening.

Debadeepta Dey: This project, the one that you’re referring to, actually started with a hallway conversation with Bill Dolan, who runs the NLP group, after an AI seminar on a Tuesday where we just got talking, right? Because of my previous experience with robotics, and also AirSim, which is a simulation system built with Ashish and Shital and Chris Lovett. And we found that, hey, simulation is starting to play a big role and the community sees that, right? And already, like, you know, for home robotics, not just outdoor…

Host: Sure.

Debadeepta Dey: …things that fly and drive all by themselves and whatnot, people are building rich simulators, right? And every day we are getting better and better data sets, very rich data sets of real people’s homes, scanned and put into AirSim-like environments with Unreal Engine as the backend, or Unity as the backend… game engines have become so good, right? Like, I can’t believe how good game engines are at rendering photorealistic scenes. And we saw this opportunity that, hey, maybe we can train agents to not just react reasonably to people’s commands and language instructions in indoor scenarios, but also to ask for help… because one of the things we saw was that, at the time, we had dismal performance from even the best algorithms. Very complicated algorithms were doing terribly, like six percent success on doing any task specified in language, right? But just like any human being, right? Like, you know, imagine you ask your family member, hey, can you help me? Can you get me this, right?

Host: Yeah.

Debadeepta Dey: Um, while I am working on this, can you just go upstairs and get me this? They may not know exactly what you are talking about, or they may go upstairs and be like, I don’t know. I don’t see it there. Where else should I look? Human beings ask for help. They have an awareness that, hey, we are lost, or I’m being inefficient. I should just ask the domain expert.

Host: Ask for directions.

Debadeepta Dey: Exactly. Ask for directions, and especially when we feel that we have become uncertain and are getting lost, right?

Host: Sure.

Debadeepta Dey: So, in that scenario, we should have our agents doing that as well, right? So let’s say we give a budgeted number of tries to an agent, and this is almost like, if you have seen those game shows where you get to call a friend?

Host: Yeah, a lifeline.

Debadeepta Dey: A lifeline, exactly, right? Like, you know… and let’s say you have three lifelines, right? And so you have to be strategic about how you play those lifelines…

Host: Don’t call me…

Debadeepta Dey: Or at least don’t use them up on easy questions.

Host: Right.

Debadeepta Dey: Right? Like, you know, something like that. But also there’s this trade-off like hey, if you mess up early in the beginning and you didn’t use the lifeline when you should have, you will be out of the game, right? So you won’t live in the game long enough, right?

Host: Yeah.

Debadeepta Dey: So there’s this strategy. So we said, you know what? Agents should just train themselves on when to ask during training time. Like, when they make mistakes, they should just ask and learn to use their budget of asking questions back to the human at training time itself, right? When you are in these simulation environments, we used imitation learning as opposed to reinforcement learning, and we were just talking about imitation before. Because you are in simulation, you have this nice programmatic expert. An expert need not be just a human being, right? Or a human teacher. It can also be an algorithm which has access to lots more information at training time… you would not have that information at test time, but if, at training time, you have that information, you try to mimic what that expert would do, right? And in simulation, you can just run a planning algorithm, which is just like a shortest-path algorithm, and learn to mimic what the shortest-path algorithm would do at test time, even though now you don’t have the underlying information to run the planning algorithm. And with that, we also built in the ability for the agent to become self-aware, like, “I’m very uncertain right now. I should ask for help,” and it greatly improved performance, right?
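A schematic sketch of those two ingredients, a programmatic shortest-path expert and a budgeted “ask for help” rule, is below. Everything here is a toy stand-in: the grid world, the BFS expert, and especially the fixed confidence threshold, which substitutes for the learned asking policy Dey describes.

```python
from collections import deque

# Toy grid world: a set of free cells; actions are the four compass moves.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def expert_action(free_cells, start, goal):
    """Programmatic expert: BFS shortest path over free cells, returning the
    first move along that path (None if at the goal or unreachable)."""
    if start == goal:
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            node = cell
            while parent[node][0] != start:  # backtrack toward the start cell
                node = parent[node][0]
            return parent[node][1]           # the expert's first move
        for name, (dr, dc) in MOVES.items():
            nxt = (cell[0] + dr, cell[1] + dc)
            if nxt in free_cells and nxt not in parent:
                parent[nxt] = (cell, name)
                queue.append(nxt)
    return None

def act_or_ask(policy_probs, budget_left, threshold=0.8):
    """Ask the expert when the agent's own policy is unconfident and the
    question budget is not exhausted; otherwise act on the best guess.
    (A fixed threshold stands in for the learned asking policy.)"""
    best, confidence = max(policy_probs.items(), key=lambda kv: kv[1])
    if confidence < threshold and budget_left > 0:
        return "ask"
    return best

# Tiny demo: a 3x3 open grid, agent at (0, 0), goal at (2, 2).
free = {(r, c) for r in range(3) for c in range(3)}
print("expert says:", expert_action(free, (0, 0), (2, 2)))
print("uncertain agent:", act_or_ask({"up": 0.4, "down": 0.35, "left": 0.25},
                                     budget_left=2))
print("confident agent:", act_or_ask({"down": 0.9, "up": 0.1}, budget_left=2))
```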

Host: Yeah, yeah, yeah.

Debadeepta Dey: Of course, we are asking for more information, strategically, so I don’t think it’s a fair comparison to just compare it to the agent which doesn’t get to ask.

Host: Right.

Debadeepta Dey: But we showed that, like, you know, compared to randomly asking, or asking only at the beginning or at the end, or various natural baselines that you would think of, learning how to ask gives you a huge boost.

Host: Well, Debadeepta, this is the part of the podcast where I always ask my guests what could possibly go wrong? And when we’re talking about robots and autonomous systems and automated machine learning, the answer is, in general, a lot!

Debadeepta Dey: Yeah.

Host: That’s why you’re doing this work.

Debadeepta Dey: Right.

Host: So since the stakes are high in these arenas, I want to know what you’re thinking about, specifically. What keeps you up at night and, more importantly, what are you doing about it to help us all get a better night’s sleep?

Debadeepta Dey: So in robotics and self-driving cars, drones, even for home robotics, safety is very critical, right? Like, you know, you are running robots around humans, close to humans, in the open world, and not just in factories, which have cordoned-off spaces, right? So robots can be isolated from humans pretty reasonably there, but not inside homes and on the road, right?

Host: Or in the sky.

Debadeepta Dey: Or in the sky, absolutely. The good thing is, the regulatory bodies are pretty aware of this. And even the community as a whole realizes that you can’t just go and field a robot with not-well-tested machine learning algorithms or decision-making running, right? So there are huge research efforts right now on how to do safe reinforcement learning. I’m not personally involved a lot in safe reinforcement learning, but I work closely with, for example, the reinforcement learning group in Redmond, the reinforcement learning group in New York City, and there are huge efforts even within MSR on doing safe reinforcement learning, safe decision-making, safe control… I sleep better knowing that these efforts are going on, and there are also huge efforts, for example, in ASI, and people working on model interpretability…

Host: Right.

Debadeepta Dey: People working on pipeline debugging, and ethics and fairness, including at other parts of MSR and Microsoft, and the community in general, so I feel like people are hyper-aware. The community is hyper-aware. Everybody is also very worried that we will get an AI winter if we over-promise and under-deliver again, so we need to make our contributions very realistic and not just feed all the hype going around. The things that I’m looking forward to doing are, for example, in meta-reasoning. We were thinking about how to do safe meta-reasoning, right? Just the fact that the system knows that it’s not very aware and should not be taking decisions blindly. These are beginning steps. Without doing that, you won’t be able to make decisions which will evade dangerous situations. You first have to know, I’m in a dangerous spot because I am making decisions without knowing what I am doing, right? And that’s the first key step, and even there, we are a ways away.

Host: Right. Well, interestingly, you talk about Microsoft and Microsoft Research, and I know Brad Smith’s book Tools and Weapons addresses some of these big questions in that weird space between regulated and unregulated, especially when we’re talking about AI and machine learning, but there are other actors out there that have access to – and brains for – this kind of technology that might use it for more nefarious purposes, or might not even follow best practices. So how is the community thinking about that? You’re making these tools that are incredibly powerful, um…

Debadeepta Dey: Yeah, so that is a big debate right now in the research community, because oftentimes what happens is that we want to attract more VC funding, we want to grow bigger, it’s a land grab, so everybody wants to show that they have better technology and races to production or deployment.

Host: First to deploy

Debadeepta Dey: First to deploy, right? And then first to convince others, even if it’s not completely ready, means that you maybe get, like, you know, the biggest share of the pie, right? It is, indeed, very concerning, right? Even without robotics, right? Even if you just have services, machine learning services and whatnot, right?

Host: Right.

Debadeepta Dey: And what do we do about things which are beyond our control, right? We can write tooling to verify any model which is out there, do interpretability, find where the model has blind spots… That we can provide, right? Personally, what I always want to do is be the anti-hype person. I remember there was this tweet at the most recent NeurIPS, where Lin Xiao, who won the Test of Time award, which is a very hard award to win, for his paper from almost twelve years ago, started his talk saying, oh, this is just a minor extension of Nesterov’s famous theorem, right? Like, you know… And Subbarao Kambhampati tweeted that, hey, in this world where everybody has pretty much invented, or is about to invent, AGI, it’s so refreshing to see somebody say, oh, this is just a minor extension of…!

Host: It’s an iteration.

Debadeepta Dey: Yeah. And most work is that, right?

Host: Yeah.

Debadeepta Dey: Like, irrespective of the fancy articles you see in pop-sci magazines, robots are not taking over the world right now. There are lots of problems to be solved, right?

Host: All right, well, I want to know a little more about you, Debadeepta, and I bet our listeners do too. So tell us about your journey, mostly professionally, but where did you start? What got a young Debadeepta Dey interested in computer science and robotics and how did you end up here at Microsoft Research?

Debadeepta Dey: Okay, well, I’ll try to keep it short. But the story begins in undergrad, in engineering college in New Delhi. The Indian system for getting into engineering school is a very tough, all-India entrance exam, and then, depending upon the rank you get, you either get in to good places or you don’t, right? And that’s pretty much it. It’s that four-hour or six-hour exam, and how you do on it matters. And it is so tough that you prepare a lot for it. And often what happens is, after you get to college, the first year is really boring, okay? Because, I remember, we already knew everything that was in the curriculum for the first two years of college…

Host: Just to get in.

Debadeepta Dey: Yeah, just to get in, and so you’re like, okay, we have nothing to do. And so I remember, the first summer after the first year of college, a bunch of us friends were just bored, so we were like, we need to do something, man, because we are going out of our minds. And we were like, hey, how about we do robotics? That seems cool. Okay, first of all, none of us knew anything about robotics, right? But this is, like, young people’s hubris, right? Like, you know…

Host: You don’t know what you don’t know.

Debadeepta Dey: …yeah, like confidence of the young. I guess that’s needed at some point.

Host: Yeah.

Debadeepta Dey: You should not get jaded too early in life, so we were like, okay, we are going to do robotics, and we are going to build a robot, and we are going to take part in this competition in the US in two, three years’ time, but we need to just learn everything about robotics, right? And, okay, you must understand, this is, like, you know, pre-… the internet was there, but the kind of online course material you have now, especially in India, we didn’t have anything. There was nobody to teach robotics, and this was a top school, right? And there was, like, one dusty robot in the basement of, I think, the mechanical engineering department, which had not been used in, like, ten years. Nobody even knew where the software was or anything. Like, we went and found some old, dusty book on robotics… But luckily what happened is, because we were in Delhi, somebody had returned from CMU. Anuj Kapuria had started this company called Hi-Tech Robotics. So we kind of got a meeting with him and we just started doing unpaid internships there, right? We were like, we don’t care… Because he actually knew what robotics was, right?

Host: Right.

Debadeepta Dey: Because he had come right from CMU, finishing his master’s, and he was starting this company. He would sometimes go to the US, and it was so dire that we would ask, will you buy this book for us and bring it back from the US? Because there’s nobody here… We can’t even find that book, right?

Host: Right.

Debadeepta Dey: And so I got my first taste of modern-day robotics and research there, and then, in undergrad, after the end of my third year, I did an internship at the Field Robotics Center at Carnegie Mellon. And then after that, I finished my master’s and PhD there: I came back to India, finished, and then went back to the US. And that’s how I got started, mostly because of, I would say, pure perseverance. I’m well aware I’m not the smartest person in the room, but, as somebody who is now at Google told me right before I started at Intel Research, finishing a PhD is ninety-nine percent perseverance. And research, like almost all big things in life, is all perseverance. You just have to stick at it, right, through the ups and the downs. And luckily enough, I also had fantastic advisors. CMU was a wonderful place. Coming to MSR in the middle of my PhD also re-energized me.

Host: Would it be fair to say you’re not bored anymore?

Debadeepta Dey: Um no, no! Not at all! Like you know, nowadays, we have the opposite problem! We are like…

Host: Too much.

Debadeepta Dey: Too many cool problems to work on and yeah, not enough time, yeah.

Host: Tell us something we don’t know about you. I often ask this question in terms of how a particular character trait or defining moment led to a career in research, but I’m down for an anecdote even if it doesn’t relate to that.

Debadeepta Dey: So my mother is a history professor in India, and, growing up with her, I read a lot, because she would bring me all kinds of books. Not just history, but literature and everything, and I was very good at English literature, and I always wanted to be an English professor. I never wanted to do anything with CS. In fact, I was actually kind of bad at math. I remember I flunked basic calculus in grade 11, right? Mostly because of not paying attention and whatnot, but all of that was very boring, and the way math was predominantly taught at the time was in this very imperialistic manner. Here’s a set of rules, go do this set of rules and keep applying them over and over. And I was like, why? This all seems very punitive, right? But my mother one day sat me down and said, look, you’re a good student; here are the economic realities, at least in India. I am one in a thousand who makes a living from the humanities; most people don’t, and will not make it, and it’s very difficult to actually get a living wage out of being an English professor, at least in India. And you are good at science and engineering. Do something there. At least you will make enough money to pay your bills, right? But there’s always this part of me which believes that if there were a parallel life, if only I could be an English professor at a small, rural college somewhere, that would work out great as well!

Host: As we close, I want to frame my last question in terms of one of your big research interests and you started off with it: decision-making under uncertainty.

Debadeepta Dey: Yeah.

Host: Many of our listeners are at the beginning of their career decision-trees, but absent what we might call big data for life choices, they’re trying to make optimal decisions as to their future in high tech research. So what would you say to them? I’ll give you the last word.

Debadeepta Dey: The one thing I have found, no matter what you choose, be it technology, arts… and this is particularly true for becoming good at what you do, is pay attention to the fundamentals, right? Like, I have never seen a great researcher who doesn’t have mastery over the fundamentals, right? This is just like going to the gym. You are not going to bench press four hundred pounds the first day you go to the gym. That’s just not going to happen, right? So a lot of people are like, well, I’m in this Calculus 101, it seems boring and whatnot, and I don’t know why I’m doing this, but all of that stuff, especially if you are going to be in a tech career… math is super useful. Just try to become very, very good at the fundamentals. The rest kind of takes care of itself. And wherever you are, irrespective of the prestige of your university, even that doesn’t matter. One of the principles that we have found true, especially for recruiting purposes, is, always pick the candidate who has really strong fundamentals, because it doesn’t matter what the rest of the CV says; with really good fundamentals, we will make something good out of that. So if you just focus on that, wherever you are in the world, you will be good!

Host: Debadeepta Dey, this has been so much fun. Thanks for coming on the podcast and sharing all these great stories and your great work.

Debadeepta Dey: Thank you. I had a lot of fun as well!

(music plays)

To learn more about Dr. Debadeepta Dey and how researchers are helping your robot make good decisions, visit Microsoft.com/research.

