Dina Genkina: Hi. I’m Dina Genkina for IEEE Spectrum’s Fixing the Future. This episode is brought to you by IEEE Xplore, the digital library with over 6 million pieces of the world’s best technical content. In the November issue of IEEE Spectrum, one of our most popular stories was about code that writes its own code. Here to probe a little deeper is the author of that article, Craig Smith. Craig is a former New York Times correspondent and host of his own podcast, Eye On AI. Welcome to the podcast, Craig.
Craig Smith: Hi.
Genkina: Thank you for joining us. So you’ve been doing a lot of reporting on these new artificial intelligence models that can write their own code to whatever capacity that they can do that. So maybe we can start by highlighting a couple of your favorite examples, and you can explain a little bit about how they work.
Smith: Yeah. Absolutely. First of all, the reason I find this so interesting is that I don’t code myself. And I’ve been talking to people for a couple of years now about when artificial intelligence systems will get to the point that I can talk to them, and they’ll write a computer program based on what I’m asking them to do, and it’s an idea that’s been around for a long time. And one thing is a lot of people think this exists already because they’re used to talking to Siri or Alexa or Google Assistant or some other virtual assistant. And you’re not actually writing code when you talk to Siri or Alexa or Google Assistant. That changed when OpenAI built GPT-3, the successor to GPT-2 and a much larger language model. And these large language models are trained on huge corpuses of data and based primarily on something called a transformer algorithm. They were really focused on text. On human natural language.
But kind of a side effect was that there’s a lot of HTML code out on the internet. And GPT-3, it turns out, learned HTML code just as it learned English natural language. The first application of these large language models’ ability to write code came from GitHub. Together with OpenAI and Microsoft, they created a product called Copilot. And it’s pair programming. I mean, oftentimes when programmers are writing code, they work in teams. In pairs. And one person writes kind of the initial code and the other person cleans it up or checks it and tests it. And if you don’t have someone to work with, then you have to do that yourself, and it takes twice as long. So GitHub created this thing based on GPT-3 called Copilot, and it acts as that second set of hands. And so when you begin to write a line of code, it’ll autocomplete that line, just as it happens with Microsoft Word now or any word processing program. And then the coder can either accept or modify or delete that suggestion. GitHub recently did a survey and found that coders can code twice as fast using Copilot to help autocomplete their code than if they were working on their own.
Genkina: Yeah. So maybe we could put a bit of a framework to this. So I guess programming in its most basic form, like back in the old days, used to be with these punch cards, right? And when you get down to what you’re telling the computer to do, it’s all ones and zeros. So the base way to talk to a computer is with ones and zeros. But then people developed more complicated tools so that programmers don’t have to sit around and type ones and zeros all day long. And with programming languages, there are simpler programming languages and slightly more sophisticated, higher-level programming languages, so to speak. And they’re kind of closer to words, although definitely not natural language. They will use some words, but they still have to follow this somewhat rigid logical structure. So I guess one way to think about it is that these tools are kind of moving on to the next level of abstraction above that, or trying to do so.
Smith: That’s right. And that started really in the forties, or I guess in the fifties at a company called Remington Rand. A woman named Grace Hopper introduced a programming language that used English language vocabulary. So that instead of having to write in symbols, mathematical symbols, the programmers could write import, for example, to ingest some other piece of code. And that has started this ladder of increasingly efficient languages to where we are today with things like Python. I mean, they’re primarily English language words and different kinds of punctuation. There isn’t a lot of mathematical notation in them.
So what’s happened with these large language models, what happened with HTML code and is now happening with other programming languages, is that you’re able to speak to them instead of— as with CodeWhisperer or Copilot, where you write in computer code or programming language and the system autocompletes what you started writing, you can write in natural language and the computer will interpret that and write the code associated with it. And that opens up this vista of what I’m dreaming of, of being able to talk to a computer and have it write a program.
The problem with that is that, as I was saying, natural language is so imprecise that you need to learn to speak or write in a very constrained way for the computer to understand you. Even then, there’ll be ambiguities. So there’s a group at Microsoft that has come up with this system called TiCoder. It’s just a research paper now. It hasn’t been productized. But you tell the computer that you want it to do something in very spare, imprecise language. And the computer will see that there are several ways to code that phrase, and so the computer will come back and ask for clarification of what you mean. And that interaction, that back-and-forth, then refines the meaning or the intent of the person who’s talking or writing instructions to the computer to the point that it’s adequately precise, and then the computer generates the code.
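To make that back-and-forth concrete, here is a minimal Python sketch of the general idea, not of the Microsoft system itself. It assumes a vague request ("sort the words") for which a generator has proposed several plausible programs. The candidate programs, the probe input, and the helper names `distinct_outputs` and `narrow` are all hypothetical, chosen only for illustration. The sketch runs every candidate on one probe input, shows the differing results as the clarifying question, and keeps only the candidates that match the user's answer.

```python
from typing import Callable, Dict, List

Program = Callable[[List[str]], List[str]]

# Hypothetical candidate programs a code generator might propose for the
# vague request "sort the words". Illustrative only, not real system output.
CANDIDATES: Dict[str, Program] = {
    "alphabetical, case-sensitive": lambda words: sorted(words),
    "alphabetical, ignoring case": lambda words: sorted(words, key=str.lower),
    "shortest first": lambda words: sorted(words, key=len),
}


def distinct_outputs(candidates: Dict[str, Program], probe: List[str]) -> List[List[str]]:
    """Collect the different results the candidates give for one probe input.

    More than one distinct result means the request is still ambiguous.
    """
    outputs: List[List[str]] = []
    for program in candidates.values():
        result = program(probe)
        if result not in outputs:
            outputs.append(result)
    return outputs


def narrow(candidates: Dict[str, Program], probe: List[str], expected: List[str]) -> Dict[str, Program]:
    """Keep only the candidates that agree with the user's expected result."""
    return {name: p for name, p in candidates.items() if p(probe) == expected}


if __name__ == "__main__":
    probe = ["pear", "Fig", "apple"]
    # The system would show these options and ask which one the user meant.
    for option in distinct_outputs(CANDIDATES, probe):
        print("Did you mean:", option)
    # Simulated user answer; a real system would collect this interactively.
    remaining = narrow(CANDIDATES, probe, ["apple", "Fig", "pear"])
    print("Remaining interpretation:", list(remaining))
```

In this sketch the clarifying question is just a choice among concrete outputs, which lets someone who does not read code still say which behavior they intended.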
So I think eventually there will be very high-level data scientists that learn coding languages, but it opens up software development to a large swath of people who will no longer need to know a programming language. They’ll just need to understand how to interact with these systems. And that will require them to understand, as you were saying at the outset, the logical flow of a program and the syntax of programming languages, and be aware of the ambiguities in natural language.
And some of that’s already finding its way into products. There’s a company called Akkio that has a no-code platform. It’s primarily a drag-and-drop interface. And it works on tabular data primarily. But you drag in a spreadsheet and drop it into their interface, and then you click a bunch of buttons on what you want to train the program on. What you want the program to predict. These are predictive models. And then you hit a button, and it trains the program. And then you feed it your untested data, and it will make the predictions on that data. It’s used for a lot of fascinating things. Right now, it’s being used in the political sphere to predict who in a list of 20,000 contacts will donate to a particular political party or campaign. So it’s really changing political fundraising.
And Akkio has just come out with a new feature which I think you’ll start seeing in a lot of places. One of the issues in working with data is cleaning it up. Getting rid of outliers. Rationalizing the language. You may have a column where some things are written out in words. Other things are numbers. You need to get them all into numbers. Things like that. That kind of clean-up is extremely time-consuming and tedious. And Akkio has, well, they’ve actually tapped into a large language model. So they’re using a large language model. It’s not their model. But you just write in natural language into the interface what you want done. Say you want to combine three columns that give the day of the week, the month, and the year. You want to combine that into a single number so that the computer can deal with it more easily. You can just tell the interface by writing in simple English what you want. And you can be fairly imprecise in your English, and the large language model will understand what you mean. So it’s an example of how this new ability is being implemented in products. I think it’s pretty amazing. And I think you’ll see that spread very quickly. I mean, this is all a long way from my talking to a computer and having it create a complicated program for me. These are still very basic.
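For a sense of what that kind of clean-up looks like when written by hand, here is a small Python sketch. It is not Akkio's code. It assumes three hypothetical columns named `day`, `month`, and `year`, with `day` taken to be the day of the month, and it assumes months may appear either as numbers or as English names. The function names `month_number` and `combine_date_columns` are made up for the example. The single number it produces is Python's ordinal day count from `datetime.date.toordinal()`.

```python
from datetime import date

# Three-letter English month abbreviations mapped to month numbers.
MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}


def month_number(value) -> int:
    """Turn 11, "11", "Nov" or "November" into the integer 11."""
    text = str(value).strip().lower()
    if text.isdigit():
        return int(text)
    # Names are matched on their first three letters (an assumption).
    return MONTHS[text[:3]]


def combine_date_columns(day, month, year) -> int:
    """Collapse day, month, and year into one ordinal day number."""
    return date(int(year), month_number(month), int(day)).toordinal()


if __name__ == "__main__":
    # Hypothetical rows mixing words and numbers, as in a messy spreadsheet.
    rows = [
        {"day": "15", "month": "November", "year": "2022"},
        {"day": 16, "month": 11, "year": 2022},
    ]
    for row in rows:
        row["date_number"] = combine_date_columns(row["day"], row["month"], row["year"])
        print(row)
```

Consecutive days map to consecutive integers, which is the property that makes a single number easier for a predictive model to work with than three mixed-format columns.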
Genkina: Yeah. So you mention in your article that this isn’t actually about to put coders out of a job, right? So is it just because you think it’s not there yet, that the technology’s not at that level? Or is that fundamentally not what’s happening in your view?
Smith: Well, the technology certainly isn’t there yet. It’s going to be a very long time before, well, I don’t know that it’s going to be a long time because things have moved so quickly. But it’ll be a while yet before you’ll be able to speak to a computer and have it write complex programs. But what will happen, and will happen, I think, fairly quickly, is that with things like AlphaCode in the background and things like TiCoder that interact with the user, people won’t need to learn computer programming languages any longer in order to code. They will need to understand the structure of a program, the logic and syntax, and they’ll have to understand the nuances and ambiguities in natural language. I mean, if you turned it over to someone who wasn’t aware of any of those things, I think it would not be very effective.
But I can see that computer science students will learn C++ and Python because you learn the basics in any field that you’re going into. But the actual application will be through natural language working with one of these interactive systems. And what that allows is just a much broader population to get involved in programming and developing software. And we really need that because there is a real shortage of capable computer programmers and coders out there. The world is going through this digital transformation. Every process is being turned into software. And there just aren’t enough people to do that. That’s what’s holding that transformation back. So as you broaden the population of people that can do that, more software will be developed in a shorter period of time. I think it’s very exciting.
Genkina: So maybe we can get into a little bit of the copyright issues surrounding this because, for example, GitHub Copilot sometimes spits out bits of code that are found in the training data that it was trained on. So there’s a pool of training data from the internet, like you mentioned in the beginning, and the output that the auto-completer suggests is some combination of all the inputs, maybe put together in a creative way, but sometimes just straight copies of bits of code from the input. And some of these input bits of code have copyright licenses.
Smith: Yeah. Yeah. That’s interesting. I remember when sampling started in the music industry. And I thought it would be impossible to track down the author of every bit of music that was sampled and work out some kind of a licensing deal that would compensate the original artist. But that’s happened, and people are very quick to spot samples that use their original music if they haven’t been compensated. In this realm, to me, it’s a little different. It’ll be interesting to see what happens. Because the human mind ingests data and then produces theoretically original thought, but that thought is really just a jumble of everything that you’ve ingested. Yeah. I had this conversation recently about whether the human mind is really just a large language model that has trained on all of the information that it’s been exposed to.
And it seems to me that, on the one hand, it’s impossible to trace every input for any particular output as these systems get larger. And I just think it’s unreasonable to expect every piece of human creative output to be copyrighted and tracked through all of the various iterations that it goes through. I mean, you look at the history of art. Every artist in the visual arts is drawing on his predecessors and using ideas and things to create something new. I haven’t looked at any particular cases where it’s glaring that the code or the language is clearly identifiable as coming from one source. I don’t know how to put it. I think the world is getting so complex that, once creative output is out there, unless it’s something like sampling for music where it’s clearly identifiable, it’s going to be impossible to credit and compensate everyone whose output became an input to that computer program.
Genkina: My next question was about who should get paid for code by these big AIs, but I guess you kind of suggested a model where all the training data get a little bit of— everyone responsible for the training data would get a little bit of royalties for every use. I guess, long term that’s probably not super viable because a few generations from now there’s going to be no one that contributed to the training data.
Smith: Yeah. But that is interesting, who owns these models that are written by a computer. It’s something I really haven’t thought about. And I don’t know if you’ll cut this out, but have you read anything about that topic? About who will own it? If AlphaCode becomes a product, DeepMind’s AlphaCode, and it writes a program that becomes extremely useful and is used around the world and generates potentially a lot of revenue, who owns that model? I don’t know.
Genkina: So what is your expectation for what will happen in this arena in the coming 5 to 10 years or so?
Smith: Well, in terms of auto-generated code, I think it’s going to progress very quickly. I mean, transformers came out in 2017, I think. And two years later, you have AlphaCode writing complete programs from natural language. And now you have TiCoder in the same year with a system that refines the natural language intent. I think in five years, yeah, we’ll be able to write basic software programs from speech. It’ll take much longer to write something like GPT-3. That’s a very, very complicated program. But the more that these algorithms are commoditized, the more I think combining them will be easier. So in 10 years, yeah, I think it’s possible that you’ll be able to talk to a computer, and again, not an untrained person, but a person that understands how programming works, and program a fairly complex program. It kind of builds on itself, this cycle, because the more people that can participate in development, that on the one hand creates more software, but it also frees up sort of the high-level data scientists to develop novel algorithms and new systems. And so I see it as accelerating and it’s an exciting time. [music]
Genkina: Today on Fixing the Future, we spoke to Craig Smith about AI-generated code. I’m Dina Genkina for IEEE Spectrum and I hope you’ll join us next time on Fixing the Future.