What Will AI Mean for Everyday Life?

Originally aired on June 9, 2020 @ 1:00 AM - 1:30 AM EDT

Best of: Internet Summit - 2017

Anthony Goldbloom - CEO, Kaggle
Jen Taylor - Head of Product, Cloudflare

Transcript (Beta)

All right, I'm going to go ahead and get right into it to keep us rolling. I'm Jen Taylor, Head of Product here at Cloudflare, and I am pleased to be joined by Willie, who's responsible for IBM's developer ecosystem, total developer experience, and developer product offerings and strategy. Previously, he was SVP and GM of cloud networking and emerging products at Akamai. To Willie's left is Anthony. Anthony is the co-founder and CEO of Kaggle, the world's largest data science and machine learning community. Kaggle's a platform for predictive modeling and analytics competitions in which companies and researchers post data, and statisticians and data miners compete to produce the best models for predicting and describing data. Google acquired Kaggle earlier this year and is building upon it. So our focus today is really, what does AI mean for everyday life? Now, I feel like I'm hearing a lot about AI. I'm hearing a lot of buzz about AI, but what is your assessment of where we are and how it's making a difference?
Well, I'm happy to start. First of all, I think we're in an unprecedented time in terms of how fast things are going in relation to AI and AI adoption. It's probably the buzz. I think that's an interesting era that we're in. Most of the time, AI, from a consumer perspective, has a shadow cast over it by things like HAL from 2001: A Space Odyssey or Skynet, where there's a negative connotation associated with it. But I think these technologies are going to do a tremendous amount in terms of patient-assisted care, and in allowing consumers to do better things in terms of selecting what they buy. I mean, if you think about it, it's fueled in tremendous part by the data infusion that we have.
If you go all the way back to the 2004, 2006 timeframe, Facebook, iPhone, AWS, it just created this tremendous amount of data velocity that we're experiencing right now. The combination of the availability of compute resources and that data set is really a lot of what's fueling this AI. Depending on the stat, you might hear 90% of the world's data will be created or has been created in the last two years. You hear a lot of that kind of stuff. Well, I think AI and its application, these cognitive systems that we're talking about, are the things that will help us deal with that kind of information overload. And I think the big difference in the systems is that we like to think that they have four attributes. They know how to understand, they know how to reason, they know how to learn, and they know how to interact. And I think that's a big difference from the systems that we typically think of, where we're programming those systems. These systems have these different attributes. And I think that's why there's a lot of interest and a lot of uncertainty and a lot of opportunity.
My take on this, I actually don't like the word AI. I much prefer machine learning, which is where a lot of the more pragmatic use cases have come from. And I think that we've hit, over the last 15 years, two inflection points. The first was the rise of a set of techniques called ensemble decision trees. So things like random forest and gradient boosting machines. They allowed us to do much more powerful things on structured data problems. So we can much more accurately predict fraud if you're a credit card company, or do credit scoring, or predict insurance claims. Not very exciting use cases, but the things we were doing with linear regression and logistic regression, not very accurately, we could now do substantially more accurately. And that was sort of a jump that we made 15 years ago.
The reason AI is such a hot topic at the moment is because of a probably more significant jump that was made about five years ago, which is known as deep learning, deep neural networks. And what this has unlocked is the ability to do very exciting things with unstructured data. So for the longest time, people wanted to automate radiology, for instance. Take medical images and try and do things like diagnose breast cancer. And there were little pockets of small levels of success. Deep neural networks came along, and all of a sudden, in use case after use case, we're seeing machine learning is able to be as accurate or, when the data is sufficient, more accurate than humans. And so when we talk about artificial intelligence, I typically prefer to talk about machine learning because that's where the pragmatic use cases are coming from.
You know, what Anthony refers to, I think, is that at its core, the algorithms and the advancements in the algorithms are the things that are contributing to that. In my role, a lot of it is solutions on top of these base algorithms. So you take a look at things that we've done in oncology, as an example. To exactly Anthony's point, the ability to train a system for oncology, for example, I think the Watson system actually knows eight cancers right now. I have six. And so a big portion of that is leveraging these algorithms for accuracy. But the effectiveness is really based upon the training sets that you're feeding it, right? So do you have the right training sets, and are they authoritative in that particular area? And then the humans who are actually training these things, are they the right humans? It's no different than data in, data out. Who is training these systems is going to determine their level of accuracy. The result of that is a couple of things.
There's one, you know, we need to hold the systems to the same level of standards that we hold, lots of times, the humans who are participating in this. And that means they have to pass the same tests. They have to have the same accreditation. We have to hold them to that. It doesn't matter if they're driving cars or they're recommending something else, from that standpoint. But I think that the importance there, and a design principle that we see, is that it's always assisted, right? So take the idea that a physician is utilizing a tool. If your design is that it's there as assisted patient care, as opposed to replacement of the physician, your design goals and principles will be different. And we always see it, whether it be a customer service rep or a physician, it's always going to be assisted.
Do you feel the same, Anthony, that the future is assisted?
Not necessarily. So I feel like there are... Humans spend a lot of time doing very repetitive things, and I actually think a lot of those repetitive tasks can be automated. And so radiology is one example where, you know, maybe it's kind of unpalatable to go from where we are now to total automation. So the path may involve being assisted, but I think the end result is that algorithms will be our future radiologists. I think there are other professions as well. Anything that involves a lot of repetition. So, you know, there are a heck of a lot of auditors out there. Auditing is a very repetitive task. I think the majority of auditing probably will be automated. Mundane legal tasks. The number of times you see an NDA that is so similar to the last NDA you looked at, I think that will be fully automated. You know, maybe what happens, maybe there is probably an element of... So some things will be completely automated.
There will probably be an element of tasks that will be, you know, sort of a nice combination. So the algorithm will do the routine, very simple ones, you know, I have very high confidence that this is an NDA that is safe to sign or this is a case of breast cancer, and then it kicks off the more challenging cases, the ones that it's not sure about, to a human. That's probably the degree to which I think they're the future, the end state is stateware.
I agree with that. So the... But in that, I think we're asking humans to raise their game, right? So the whole idea is that you can get rid of these commodity tasks. You know, we have a retailer who does something like... They had 20 customer service reps who were getting something like 20,000 conversations a month. They focused in on 40 primary use cases that typically come in. I don't know about you guys, but when I call Comcast and try to get something fixed, it's usually the same type of thing, and I go through the same type of question over and over again. And what they were able to do is say, OK, how do we take these four use cases, which are common, against those 20,000 per month, how do you give that back to the agents so that they can work on the complex things, and how do you reduce the time from one and a half days to five minutes? I think those are the kinds of compression pieces. Even back when IBM did their first thing here, it wasn't exactly what I'll call AI, but it was something called Deep Blue that competed in chess against Garry Kasparov, right? It was the first system that actually won against a Grand Champion.
Even in that scenario, at the end, Kasparov said, you know, at some point I actually see that it'll be me and my supercomputer against my opponent and his computer, because they'll take care of the commodity moves, and then the creativity for winning the match will actually come from the Grand Champion. So I think that level of assist is what I'm implying along those lines.
Well, and kind of, Anthony, as you were saying earlier, one of the great leaps forward we've seen in the last five years is the ability to process a bunch of unstructured data. And really it's the ability for systems to apply machine learning on a huge corpus of unstructured data and give insights to somebody who may only see a small sliver, to help do that kind of team-based, trend-based analysis.
I mean, the things that Anthony is talking about, like the unstructured data, if you even look at just the life sciences, 90% of that is not structured data. They're, you know, handwritten reports. They're radiology graphs. They're these types of things, and you have to be able to digest those things. In oncology, someone said these folks read 200 to 300 research reports a year, right? There are 10,000 new articles every month on 100 clinical trials. How do these guys do that? It's impossible for them to actually do all those pieces, so.
Similar or related, you know, radiologists can look at, I don't know how many images, let's say 1,000 images. No, probably more actually. Let's say a radiologist can look at 3,000 images a year. A machine can look at 3,000 images a second, right? And so as long as the task is suitable for machine learning and there's a very clear, you know, objective, cancer, no cancer, machines have an unfair advantage. They can train on much more data.
It seems that there's a great deal of promise with this technology.
What are the things, you've got a room full of technologists here and online, what are the things that we need to be doing as a community to help us realize the potential benefit of these technologies in everyday life?
Companies are not currently well set up to take advantage of machine learning. I'm at Google now, and I think Google does it more effectively than others, by far. And a really good example, there's a team within Google, Google Brain, that sort of infuses deep neural networks into the various Google products. And so when we use the voice assistant, or we use photos and we search our photos by name or whatever and it finds pictures of me, that is the Google Brain team going out and infusing various products with deep neural networks. I'm not aware of another company doing it nearly as successfully. There's a bit of a shift that I think more traditional companies need to make. People talk about software engineering as a 10x profession, where you can get a really good software engineer and they're, you know, 10x more productive than the average. I think that machine learning is a significantly higher leverage profession again. I mean, if you're a bank and you're deciding who gets credit and who doesn't, one algorithm decides who gets credit and who doesn't across a massive customer base. Now, that is an extremely high leverage algorithm. In the hands of an extremely good machine learner, that will generate a huge ROI. In the hands of somebody who is less experienced or maybe not as capable, it generates a huge amount of damage. And so, you know, look at large companies and other high leverage positions. The CEO is a very high leverage profession. They get paid a lot. And so, you don't need a huge team of very strong machine learners. You need a handful of outstanding machine learners.
Pay them a hell of a lot because it is a very high leverage profession. And that is probably the best way to get the right talent in and think about talent. Then there's also this issue where systems are not really set up for productionizing machine learning algorithms. You know, we had banks as customers and they're using mainframes to do their credit scoring. Implementing a deep neural network on a mainframe would be challenging. And so, there's the issue of thinking the right way about the talent, not that you need huge teams, but that you need small teams that are extremely talented, and also making them able to do their job by having an architecture that will allow them to actually push these algorithms into production. On the latter, I actually think the cloud will help a lot. As workloads move to the cloud, it becomes much easier to get algorithms productionized.
You know, the things that Anthony points out, I truly believe that the developer is this era's doctor, engineer, lawyer, really is what it comes down to. Maybe not quite lawyer, but doctor, engineer. Clearly, solution sets that have leverage and help whatever field they're going after. One of the things that we see in development teams right now is that the data team and the application team used to be able to work separately, right? And so one of the things that we see predominantly is that there's more need for data scientists and people who understand machine learning, and that those teams will collaborate more and more. If you think of data as the fuel for the AI, then that's a really important dynamic to think about with your development teams as you move forward with these types of systems. And it's not the norm yet in terms of the architecture that we see.
You know, just like we had roles during big data like data scientists and data engineers. Those roles didn't exist by definition and name. As we go into this next set of computing architectures, we'll have something similar in terms of new roles and a division of labor that comes up. But Kaggle did a tremendous amount in terms of building a community of folks who had this kind of expertise. And I think that has to become pervasive in most folks' systems for them to get the benefit that they want.
So we talked a little bit about where we are. We talked a little bit about how we got here and the things we need to do to move it forward. Looking forward five years, 10 years, 15 years, how far do you think we should be taking machine learning and AI, and how far can we?
I like the somewhat hackneyed William Gibson quote that the future is already here, it's just not widely distributed. So already I think the challenges in realizing the benefits of machine learning are really organizational. I talked about Google Brain as being an exemplar of how you productionize machine learning in a company. And so seeing that model or other successful models adopted in companies will massively increase the amount of machine learning that touches us in everyday life. Then there are also some areas that are still in late stage research that will probably make their way into real pragmatic use cases. Probably the furthest along is a technique called reinforcement learning, which automates trial and error. So at the moment we've seen the likes of DeepMind use it for playing Atari games, where the first game it moves left and it dies, and the second game it moves right and it doesn't die, and over time it plays enough games that it learns what gets it a good score and eventually it dominates these games, whether it's Go or Atari games.
And we're starting to see people thinking about using reinforcement learning in stock trading, ad targeting, and actually as an input to better training neural network architectures. It's got a lot of promise. It's quite an abstract technique at this point, and it's not clear what the killer apps will be, but I suspect there will be some. Similarly, generative models are a new area of machine learning that is pretty exciting. So you may have seen some of the Stanford demos where they'll take an image and be able to write a caption for it. There are exciting use cases there, like for instance somebody who is visually impaired being able to have the scene described to them, so you have a camera or glasses or something that is taking a picture of what you're seeing and describing what's around you. There are exciting use cases like that that I think some of these new techniques are going to unlock. So it's a combination of deep neural networks starting to make their way into existing use cases, as well as the promise of some new areas of research and what they'll bring.
I mentioned that humans have to kind of raise their game. Part of it is that people who are in specialized professions often do, because the commodity things will be handled for you. Anthony mentioned things like these structured environments, like tax code or something along those lines. There's no reason to believe that that can't be replaced by a system with natural language question and answer. And you actually get very, very good recommendations just because it's so structured and the systems know how to learn to give this kind of recommendation. So across the surface area, think about how many times you actually use that. Tax, I have a cold, I do these other things. Right now, it's just that lots of times the data is not leveraged.
You know, there's less than 1% of all data being used as these training sets. Over the course of these next years, as we understand that this is the model by which we go, all these systems will have an embedded knowledge or intelligence in them. And with something as structured as, think about tax, you have these years of things that it can learn from. You should be able to get a very, very good answer from those types of systems. If those are readily available, that's what the Internet will provide, access to those particular systems at a commodity level. It should be a rich environment, whether it be shopping, medical, buying a car, it doesn't really matter. But those are the types of systems that we should be expecting as we see this kind of thing advance.
Yeah. You know, I think part of what you've talked about is it replacing commodity activities, and you've talked about it being used more broadly, as with Google Brain. I think a lot about the development of the technology, but I also think a lot about the development of the trust that it will take for these technologies to become more widespread. Like, you know, my kids at home, they inherently trust that they can talk to Alexa and the answer that Alexa gives them is the right answer. I'm slightly more skeptical, but I am also significantly older. How should we think about helping build the trust for broader adoption here?
That's actually, when I was talking about organizations not being set up for it, I actually missed the, you know, there is the market being ready, right? So am I going to be comfortable trusting my cancer diagnosis from the machine? I think that building trust is use case specific.
So radiology has been a bit of a theme in this discussion, so we'll use that as an example. One thing you can start off with is having a machine operate alongside a radiologist, right? And so look at the agreement rates. And, you know, with medical diagnosis, you actually do eventually know what the right answer is, right? After a biopsy has happened, for instance, you know for sure. And so over time, articles get published, the machine builds up a track record, and trust gets built. And, you know, if the machine is lower performing than an oncologist, then maybe it's just always a second check. If it's higher performing, then maybe eventually it takes over. But I feel like there's no general answer to how to build trust. I feel like it's very, very, very use case specific.
I agree. I think you have to build these systems on some principles, like transparency. So, you know, I think anytime you're interacting, whether you're a consumer or medical or any type of thing, you kind of want to know, am I dealing with a human or am I dealing with a system? So that's one level of transparency. Depending on where it is, you want to know what taught this thing. So you know, for the system you're talking to, what and who were the people who taught it, because it has credibility no different than when you go to a human for that same level of advice. And then when it generates a recommendation, you'd like to know the data set that it made that recommendation from. And so those types of things, I think, are important as you go through that. And again, from our perspective, human assisted is a core principle.
We do see that the commodity pieces will go away, but human assisted is a design principle that will yield the type of system that I think everybody's looking for.
Right. Well, it's also interesting if you think about it, like no human is perfect. And so there's the notion of what are our expectations of the system versus our expectations of other humans.
No algorithm is perfect either. And that's something that makes it harder to trust machine learning, because it will make mistakes. The Tesla will have an accident. And it's going to happen. Humans have accidents. So really, does the Tesla have accidents at a lower rate than humans? It probably does. But when it has an accident, boy, is that going to get a lot of attention.
But it doesn't drink. It's not sleep deprived. It has some advantages from that network.
It's not texting.
Exactly. Exactly.
Okay, so I could keep asking questions, but I am going to turn it over to the audience now. Question there in the back.
I am a physician at Stanford, so I see this in my personal life, and the research also confirms that doctors spend only about 30% of our time taking care of patients and 70% of our time creating data and entering it, mostly for billing purposes. So the controversial question I have is, are we developing artificial intelligence at the expense of our human natural intelligence?
That's a good question. You know what? Our perspective is that it isn't at the expense. How do we free up more time for you so that you can, in many cases, look at more data and assess what is being presented to you? How can you take, in many cases, the power of all these doctors who have helped train the system and bring it to you, so you can be more efficient and have more data to make your own decision and a better prescription for a particular patient?
So in some of these cases, we believe that what we're doing is giving you more leverage in terms of the data set that you have, and in some cases the efficiency. So, as much as possible, you won't have to key in a bunch of data. Lots of that stuff will be learned, read, listened to by a machine and interpreted and put into the data pool.
I think our future work lives will be more interesting because a lot of the mundane, repetitive tasks get taken away. There is, of course, a... I can't pronounce the word. A side effect, I guess, is a good medical term. (Laughter) Which is, OK, so let's say a huge proportion of our roles are currently mundane and all of a sudden that stuff goes away. Does that mean there are fewer of us needed? Are there going to be new jobs that open up to replace the ones that get automated? I think it's... Certainly historically we've gone through waves of automation, and as something gets automated, a whole lot of new professions that we never could have imagined end up existing. It's hard to know in this case, right? Is the disruption happening so quickly that we don't have enough time to retrain or, you know, for the structures of our economy to adapt? It's a little bit scary and, to be honest, an open question, but if the structure of our economy does change and new roles open up, I think all of our work lives will be more interesting because the mundane stuff gets taken away.
Question over here.
Just to follow up on what you said there. Last month you gave a talk about what jobs might be replaced by machine learning. I've got a 20-year-old daughter in college. Do you have any career advice for her? Any skills she should make sure she has?
I don't know. It's really tough. I think that computer programming and machine learning is a good bet.
The basic heuristic that I like is if the job involves creativity and sort of connecting dots in disparate ways, then, you know, there is no machine learning technique that... Machine learning is very good at training on this thing and doing this thing. It's not good at taking this bit of information and this bit of information and combining them to do that. And I'm not aware of any machine learning techniques, even in early stages of development, that are even remotely capable of doing that. So that's probably a good heuristic to use. If it involves connecting disparate threads or some element of creativity, then it's probably a good career direction to go into.
I think that's consistent. I mean, again, in each one of the industrial... The agricultural revolution, the manufacturing revolution, the idea that it was going to replace everything was always there in each one of these particular stages. And then we found occupations, job specialties, for the advancement. And that's my point about how just about everybody has to raise their game from that standpoint. And I think we're just in that same era.
Okay, we have time probably to take one more question.
Would you be willing to replace jury trials with artificial intelligence? And if not, why not?
Probably at lower levels of the legal system, yes, because it's only... Once you get to... I'm not a lawyer, so this is based on sort of my arm's length understanding of how courts work. But the Supreme Court gets the very novel cases, right? The ones that are going to be path breaking in some way. And so that should still be done by humans. But to the extent that you're getting the same kind of rote cases again and again and again at lower level courts, it seems like a very plausible idea.
Okay, well, thank you both so much for taking the time to talk with us today.
I'm very happy that machine learning probably won't replace panels going forward because conversations like this are incredibly powerful. So thank you guys very much.
Thank you. Thank you.