Authoring a New Language for the EVM w/ Charles Cooper (Vyper)

**Speaker B:** Hello, and welcome back for another episode of the Strange Water podcast. Thank you so much for joining us this week. Let's get really basic. Let's say that you want to build on Ethereum, or more specifically, let's say that you want to create and deploy a program to the world computer. Your first step: you need to actually write out the program. You need to explicitly write the specific instructions that you want Ethereum to execute every time that it runs that program. And by default, that probably means you're writing your code using a language called Solidity. But here's a little under-the-hood secret. Ethereum doesn't actually understand Solidity. It's much too abstract and verbose for a computer. What actually happens is that your Solidity code is sent to a compiler, which translates your human-understandable code into machine-understandable bytecode. This paradigm is the magic behind the vast majority of programming. We write our code in a language that's easy for humans to work with, but we transform it to machine code when it's actually time to run it. So here's the question to kick off today's episode. If we can translate Solidity to EVM bytecode, can we translate a new language into the same bytecode? Maybe one that has analyzed the pain points of Solidity and has fixed them?

Charles Cooper is a core developer of Vyper, an alternative programming language for the Ethereum blockchain and other similar blockchains. While non-devs might not be familiar with Vyper, you're definitely familiar with Vyper programs, for example Curve Finance. Throughout this conversation, you'll not only learn how programming languages actually work, but also how properly designing a language can completely change how developers are able to build. Here's the core insight that I want you to listen for. The most important part of being a builder is the ability to clearly communicate what you want to happen.
And as much as that is a statement about the skills of a builder, it is also a warning that the tools you choose affect your ability to communicate your ideas. And so picking, and in Charles's case building, the right tools is just as important as putting them to use. One more thing before we begin. Please do not take financial advice from this or any podcast. Ethereum will change the world one day, but you can easily lose all of your money between now and then.

**Speaker C:** All right, let's talk Vyper. Charles, thank you so much for joining us on the Strange Water podcast.

**Speaker A:** Thanks for having me, Rex.

**Speaker C:** Of course. So, before we get to Vyper and the meat and bones of Ethereum, I'm a huge believer that the most important part of every conversation is the people that are part of it. So with that as a frame, will you give us a little bit of your background and tell us how you found Ethereum, and what about it was so compelling or fascinating to you that you're willing to dedicate your life and career and reputation towards it?

**Speaker A:** That's a really interesting question. I guess I started out my career in financial tech. So I have been doing various HFT and automated trading for a long time. And then I think in 2017 or so I started to become aware of Bitcoin and Ethereum, and I looked into what it was doing and I was like, wow, this is really cool. You can do anything. You can replace companies. Anything that you want to do that normally requires these legal contracts and intermediaries, you can do with smart contracts. So I started looking into it and I was like, okay, what can I do here? What am I suited for? And I always had kind of an interest in programming languages. And I've always believed that the programmer's most important tool is being able to work effectively.
And the way you work effectively... I actually studied linguistics in college, which is actually a science. It's not like a fake thing. It's not like physics, but it is a science. And I do think that being able to express yourself well, easily, concisely and effectively is very important. It's the most leverage you can have as a programmer.

**Speaker C:** Sorry, before you dig into what kind of spiritually got you excited about the work you're doing now, how did you get exposed to Ethereum and crypto? Was it just through the mania of 2017, where you happened to be exposed to it through the Internet or Reddit, or was it something that was going on with the high-frequency trading firm you were working at? How did you see Ethereum and think, oh my God, this is cool, and not just part of the deluge of information in modern life?

**Speaker A:** Yeah, I think what happened was I read an article in the Wall Street Journal about Bitcoin being used in Venezuela and I was like, oh, this is the real thing. This is actually useful. This was early 2017, end of 2016, and I logged into Coinbase to buy some Bitcoin. And I saw Ethereum, and I think somebody had mentioned it to me before or I'd seen it somewhere on the Internet. And I was like, that's interesting. What is that? So I started looking into it more.

**Speaker C:** Nice. Yeah. A big thing that I always like to say to Americans or Westerners is that the idea that crypto applications are coming, use cases are coming, just reflects the very privileged and very soft life that we live.
And for people like Venezuelans, for people in war-torn countries, and for people that are persecuted politically or by gangs or whatever, crypto has a use case today, and it's really about ownership of your wealth and the ability to walk away. And so it's super cool that you recognized that early on. Took me a while to get there.

**Speaker A:** Yeah.

**Speaker C:** So I cut you off earlier. You were saying that, through your education and just your personal passion, you really came to the table with this belief that the ability to express yourself is the most important or powerful tool in a programmer's toolkit. So when you entered blockchain, what did you see?

**Speaker A:** So I was like, this is interesting. I'd like to start contributing, and I want to do it in the best way I know how. So, you know, I had some background in compilers and interpreters, and I was like, I want to work on smart contract programming languages. And so I started looking at what was out there. I tried Solidity, I tried Vyper. There were a lot of interesting projects at the time. I remember there was Rholang. That was a long time ago. And I thought that Vyper was kind of this ideal balance between practicality and, I don't like using this term lightly, but elegance. You know, programming is sometimes like math. It should be beautiful in a way, but not at the cost of not being able to solve engineering problems.

**Speaker C:** So before we go deep into programming languages and what Vyper is and why it's important, let's start at the highest level, which is: what is a programming language and how does it relate to a program? And just so, Charles, you know where my head's at:
can you walk us through a little bit of the process from idea in someone's head, to code, to an actual program?

**Speaker A:** Yeah, that's a funny question, because people ask me all the time when I tell them I'm a compiler engineer. They say, what's a compiler? And I'm like, okay, let's put it this way: you know Java or C or Python, and most people, even lay people, have heard of those. And it's like, well, your computer is made out of sand. It doesn't know what Java is, right? And a lot of people, if they think about it, they're like, oh, right, that makes sense. So what a programming language is, actually, is for humans. A programming language is some set of quasi-linguistic constructs which lets you express a program. And then there's this thing called the compiler, which takes this program, which is written in this high-level programming language, Python or C or Vyper or Java, and it converts it, loosely speaking, into machine code. And machine code is a bunch of instructions that a CPU actually knows how to execute, like add, multiply, divide, move something from one register to another.

**Speaker C:** So what you're saying is that when we look at lines of code and we think, oh my God, this isn't English, this isn't a language, this is for computers... us computer scientists know that actually anything that's written with real letters is something that is meant for humans and is meant to be accessible, even if it looks crazy and weird and whatever. And what you're doing, and what the job of a compiler is, is to be a translator that says: we're going to take the way that humans think about code and we're going to take the way that computers think about code, and we are going to act as the, I mean, not the translator, right? Because it's really one way. But we are going to interpret everything that the humans wrote and then rewrite it for the machines. Is that correct?
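
As a rough illustration of the translation step described above, here is a minimal sketch of a toy compiler in Python. It assumes a made-up stack machine with `PUSH`, `ADD`, `SUB` and `MUL` instructions, loosely in the spirit of a stack-based VM such as the EVM; the instruction names and the helper functions `compile_expr` and `run` are hypothetical and are not real EVM bytecode or part of any real compiler.

```python
import ast

# Hypothetical instruction names for a toy stack machine (not real EVM opcodes).
OPCODES = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_expr(source):
    """Translate a human-readable arithmetic expression into toy instructions."""

    def emit(node):
        # Integer literal: push it onto the stack.
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return ["PUSH %d" % node.value]
        # Binary operation: compile both operands, then emit the operator.
        if isinstance(node, ast.BinOp) and type(node.op) in OPCODES:
            return emit(node.left) + emit(node.right) + [OPCODES[type(node.op)]]
        raise SyntaxError("unsupported construct in this toy language")

    return emit(ast.parse(source, mode="eval").body)


def run(program):
    """Execute the toy instructions the way a simple stack machine would."""
    stack = []
    for instr in program:
        if instr.startswith("PUSH "):
            stack.append(int(instr[5:]))
            continue
        right, left = stack.pop(), stack.pop()
        if instr == "ADD":
            stack.append(left + right)
        elif instr == "SUB":
            stack.append(left - right)
        elif instr == "MUL":
            stack.append(left * right)
    return stack.pop()


program = compile_expr("2 + 3 * 4")
print(program)       # the instruction list the "machine" sees
print(run(program))  # the result of executing it
```

The machine in this sketch never sees the text `2 + 3 * 4`; it only sees the instruction list. Any other source language that compiles to the same instructions would run just as well, which is the point made in the conversation about Solidity, Vyper and the EVM.
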
**Speaker A:** Yes, exactly.

**Speaker C:** And then I think the important thing to recognize, especially here in blockchain world, is that just because you have one high-level language, like Solidity, and then you have one computer or VM, which is the EVM, the EVM just needs the instructions. If you can translate Solidity into those instructions, there's no reason that you can't create or pick a different language than Solidity. As long as you have a compiler that knows what the machine's expecting and knows how to do the translation, there's no reason not to. Let me just be very frank: Ethereum doesn't understand Solidity, it understands machine code. And there's no reason to limit ourselves to just this one translation of Solidity to machine code. Is that right?

**Speaker A:** Yeah. And you can actually write bytecode by hand. I mean, you can write raw EVM code. I do that all the time for fun. Right. And besides Vyper and Solidity, there's a lot of low-level languages which target the EVM. I think Huff is a fairly popular thing these days. It's basically an assembler with macros. So you are not exactly writing bytecode, but you're writing things that are fairly close to bytecode, and it gives you some glue to glue them together in different ways. And that's fairly popular recently, especially as a learning tool to understand how the EVM works.

**Speaker C:** And so, again, because you have this background in linguistics and you have some experience in interpreters and compilers, did you enter the space thinking, Solidity isn't good enough, we need something better? Or did you enter the space thinking, programming languages, if it's a real technology, there should be more than one, and I just want to contribute to that diversity?
What drew you to the programming language level of the stack?

**Speaker A:** I tried programming some things in Solidity and I was like, this doesn't seem very fun. This was a long time ago. This was Solidity 0.4-something, I think. Anyways, I'm not here to say bad things about other programming languages, but I started writing some contracts and I was like, wow. So Solidity is, let's put it this way, very powerful. It lets you do a lot of things, but there's a lot of ways you can shoot yourself in the foot. There's a lot of footguns in the language. And so I started writing some smart contracts and I was like, ooh, this seems a little bit weird or dangerous. And I remember opening up the Vyper docs, the website, and looking at the language goals, and I was like, this is really cool, because the top three design goals of Vyper are simplicity, readability and safety. And it's really important for programs to be readable. It's really, really important for them to be reasonable. And I use reasonable in the technical sense, which is that you are able to reason about the program. You can always add more features, you can always make things faster, you can always make them more efficient, but it's really, really hard to make things more reasonable. And so looking at Vyper's goals, I was like, this is really good, this is really important. And I thought that the best way for me to work on making smart contract programs more reasonable was to work on Vyper, basically.

**Speaker C:** Yeah, that makes a lot of sense, and I think it's always applicable to think like that.
And for those of us that don't have a computer science background, it cannot be overstressed that having something that is understandable and readable and just accessible to the human brain is so much more valuable than you might think. Because we're at an age in technology where things are so complex. It's really easy to create something that works, where you have no idea how it works. And that's great until it doesn't work, and then you're pretty screwed. So I definitely understand how much focus can be put on just accessibility to humans. But that's in normal programming, right? On top of that, in our realm we live in the EVM, where if something goes wrong, you can't fix it. You cannot undo the glitch, you cannot just hot-swap the code. Whatever happens is out there forever. And so do you have any thoughts or follow-ups on why Vyper specifically, and those attributes of Vyper, called you, as opposed to, as you said, any of the other languages that were possible at the time?

**Speaker A:** You know, there's a saying in programming that if it's twice as difficult to debug a program as to write it, then you'd better write it being half as clever as you know how, otherwise you're never going to be able to debug it. And one piece of feedback I get about Vyper a lot is that it's so simple, it's very readable. Non-programmers can kind of write Vyper contracts. And there's a lot of technical things which also go into that. So for one thing, Vyper, at least the syntax, the way it looks, is based on Python. And Python is well known for being syntactically simple and approachable. I mean, Python is known to look like, quote unquote, pseudocode, which is what code looks like when you're drafting it. So Vyper also kind of looks like pseudocode, which is like if you were to strip the code down to its bare bones, to the essence of what it means, take out...
**Speaker C:** All of the curly braces and the semicolons, and instead of making it look like code, make it look like instructions. Right?

**Speaker A:** Yeah. Syntactic noise. And that's kind of a UX thing. It's an appearance thing. It's how you dress it up, the clothes. But it's important. And then there's always been a focus on Vyper being statically analyzable. To give you a bit of background, programs exist in two realms. One is static and one is dynamic. Static is the source code itself. It's the meaning of the program without running it. And dynamic, or maybe you call it runtime behavior, is what happens when you actually run the program. That's what it actually does at runtime. And Vyper is designed to be very statically analyzable. So you can write a tool to figure out a lot of things about how this program behaves at runtime.

**Speaker C:** Well, actually, can you break that down a little bit further? Why are some types of languages statically analyzable, while some really struggle with that, or maybe aren't even capable of it? What is different about the language?

**Speaker A:** So there's a lot of things that can make programs hard to analyze, and they usually have to do with how much dynamism is available in the language. This has a lot to do with how storage and memory are allocated. I'm trying to think of an example. So, for instance, in C, there's two basic ways of allocating things. One is on the stack and one is on the heap, with `malloc`. And when things are allocated on the stack, that's usually better, because you know where they are statically. Whereas when they're allocated on the heap, there's some kind of indirection you have to do at runtime in order to find out where something is. So if X is allocated on the stack, you kind of know where it is.
You don't have to run the program. But if it's allocated on the heap, sometimes you need to actually run the program in order to find out where it is. And this is kind of related to a general property of programs that's known as Turing completeness.

**Speaker C:** So, sorry, just to unpack what you said there. The stack and the heap are technical terms, and we don't really need to know what they mean. All that we need to know is that the stack is super, super local. It's created in this instant, and we know exactly where it is. It's close to us. The heap is like the Library of Congress. Everything that we need is accessible there, but it takes a lot of overhead and searching to go figure out where that data is and bring it back. Right. And the point that you're making is that if your language isn't careful about where it's storing data, or relies much too heavily on the heap, then it's really hard to understand what all the data is and what it's going to do and how it's going to change without actually running the program.

**Speaker A:** Yes, exactly. There's a lot of differences, obviously, but as far as we're concerned, the big difference is that with the stack, you know where things are just by reading the program. Or not exactly by reading: you analyze it, and you might have to write some things down in a notebook, but you can figure out where they are without running the program. And with things on the heap, you might not. So one analogy I can make in smart contract land is that in Vyper, every time you access storage, you can actually know which variable is being accessed, because you'll have something in the syntax. It'll be like `self.my_variable`. Whereas Solidity has this concept of storage pointers, so you can actually calculate some number while the program is running and then access that storage.

**Speaker C:** So again, just to unpack: in Vyper world you say variable X equals whatever.
And because of the way Vyper works, we always know where X is and we can always check that value. But with Solidity, where the data is is this dynamic thing that is changing all the time. And so the only way to know what is at the end of this pointer to this data location is actually to run the program, and at the point that you're interested in, go check what all the specific variables are and then go actually look at it.

**Speaker A:** And it might be different depending on what inputs you run the program with. Whereas in Vyper, you can tell what variables are being accessed just by reading the program. You don't even need a compiler. You can just read the program, or you can write a tool which reads it for you, like an analyzer or something. An analyzer is kind of like a very stripped-down compiler.

**Speaker C:** Got it. So the idea, just to wrap this all up, is that some languages, based on their fundamental construction, create programs that are easy to analyze. And when you can analyze them, you can debug them, you can audit them, you can, as a human, figure out: if I put in this, then I'm going to get this result. They're very auditable, if you will. Whereas dynamic programs do have a lot of those capabilities, but the way to get them is to just run the program and watch what happens as it's happening. And that's fine in a lot of cases, but that's very dangerous in EVM world, where you can't undo things that go wrong.

**Speaker A:** Yeah, and I want to talk a bit about Turing completeness. Basically all modern programming languages are what's called Turing complete. And what Turing complete means is that the language can emulate a Turing machine. And a Turing machine is like the quintessential computer. And the reason Turing completeness is important is because Turing machines are subject to what's called the halting problem.
So you can't, in general, tell if a program will terminate just by analyzing it. You have to actually run it. And by running it, you might run forever. And this is a really, really important concept in programming language theory, which is that you don't actually know what the program... I mean, that's why programming languages are so powerful, but you don't actually know what it's going to do at runtime. So in all Turing-complete languages, you are able to write programs which are undecidable. But all programming languages, being ways to express a program, also have restrictions. And where these restrictions are kind of makes or breaks a programming language. Right. So all programming languages are going to have these areas where they're undecidable. But part of the art of programming, or the difficulty in programming language design, is figuring out which kinds of programs should be undecidable, and which kinds of program-level constructs should be obvious just by reading the program. Is that making sense?

**Speaker C:** Yeah, that totally does. But before we continue, let me summarize it, and tell me where I'm wrong. What you're saying is we start with this concept of a Turing machine. And what that essentially means is that if you can prove something is like a Turing machine, then you know that that thing is capable of doing all things that a Turing machine can do. And why that's important is because a Turing machine is basically, as you say, a generic computer. And so if we know that our language can do anything that a Turing machine can do, then our language can do everything that a computer can do. The paradox, or the problem, is that if we know that something is Turing complete, that also, by the definition of Turing complete, means we can't tell if a program is ever going to stop once it starts.
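
To make the "you have to actually run it" point concrete, here is a small Python sketch built on the Collatz iteration, a well-known loop for which nobody has proved termination for every positive integer. The function name `collatz_steps` and the `max_steps` budget are illustrative choices, not anything from the conversation; the budget is only a loose analogy to how a metered environment caps execution.

```python
def collatz_steps(n, max_steps=None):
    """Count iterations of the Collatz rule until n reaches 1.

    Reading this function does not tell you whether the loop ends for every
    positive integer; that is an open mathematical question. The optional
    max_steps budget (an assumption of this sketch) lets the caller give up
    instead of potentially running forever.
    """
    steps = 0
    while n != 1:
        if max_steps is not None and steps >= max_steps:
            return None  # budget exhausted, outcome unknown
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(collatz_steps(6))                 # a short run
print(collatz_steps(27))                # a much longer run from a small input
print(collatz_steps(27, max_steps=10))  # gives up: None
```

The only way this sketch learns how long an input takes is by running the loop, and two nearby inputs can behave very differently. That is the kind of runtime-only knowledge that a statically analyzable language tries to keep to a minimum.
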
**Speaker A:** And I want to add that this halting problem, the basic definition is that we can't decide, we can't tell, if the program is going to halt. But it can also mean, and I'm not stretching the definition, there's some way of understanding this, that you can't predict what a program is going to do. So the program is undecidable.

**Speaker C:** And so, okay. We want a Turing machine because that means that it can do anything a computer can do. But we're nervous about Turing machines because if we're given code that is Turing expressible, Turing complete, whatever term you want to use, then we cannot know what's going to happen from running that code without actually running the code. Is that correct?

**Speaker A:** Yeah, exactly. Actually running the code with specific inputs. So it might do something completely different. It might blow up if you give it different inputs.

**Speaker C:** Great. Let's say it takes whole numbers as inputs, and everything works fine. It completes in a millisecond. But if you give it a fraction, then it will continue forever until the end of time. And in regular computing, that's totally fine. You run your program in a little sandbox. You make sure that that doesn't happen. And if it does, then you can kill it. Or even in production, you can kill it. Right. In Ethereum, that's not an option. You do not control the computer. And the whole point of it is that if you give it instructions, it will continue forward.
And so the real problem that we're trying to get to here is that when you're designing a programming language, even though, by the definition of Turing completeness, we can't know what the end result of this computation or this program is going to be, we want to make design choices that make analyzing the program, and maybe holistically understanding what it's going to do, a lot easier. Is that right?

**Speaker A:** Yeah. And maybe that way of looking at it is a little bit theoretical. Maybe another way of looking at it is saying that in programming, sometimes there's things that you do that are kind of straightforward. You know, you take two numbers and you add them together. And then there's other kinds of things, like modifying storage or the file system or interacting with the outside world, which are more risky. And a good programming language should make those things really clear and really easy to analyze.

**Speaker C:** And so I'm not sure how to ask this question in a way that doesn't get us too technical, but what are the actual types of design decisions that you're making in order to achieve analyzability? Do you maybe have an example off the top of your head of something that Solidity does that makes it really hard to analyze, where Vyper has made a design decision to make it easier?

**Speaker A:** Yeah, I think the concept of storage pointers is a really good one. So Vyper doesn't have storage pointers. You can't access some variable without actually naming it. So if you have two variables in a contract, X and Y, you can't touch Y without actually saying `self.y`.

**Speaker C:** Got it. And in Solidity, or, you know, in the more classic languages, like in C or C++ world, you could pretty easily access or change the data in Y by basically manipulating the data in X to change the pointer to be pointing to the space where Y is, but in a way where the program doesn't really understand that. I guess that's very hard to unpack if you're not a programmer. But the point is that Vyper is setting explicit rules: even if, from a technical standpoint, it's possible to access this memory, we're creating language rules that say if you want to access the memory, you need to do it in this very clean and safe and readable way. Is that correct?

**Speaker A:** Yeah, I would say it's explicit.

**Speaker C:** Got it. And is that really the quintessential thing that we're getting at here, that in order to make a language more statically analyzable, you need to find all of the places of ambiguity and enforce specificity?

**Speaker A:** Yeah. And you want to think about how people are going to use the program. You need to really get into the mind of a programmer. So a language designer is like a programmer's programmer. Right. And literally so. You need to get into the mind of a programmer and be like, okay, I mean, we can talk about theory all day, right? But at the end of the day, you really need to get into the user's mind and be like, okay, what are they going to do? And then is that decidable? Is that analyzable? Can we look at this program and know what it does? Or is there going to be some kind of issue where the programmer takes some language feature and then makes it do something that isn't analyzable? Where if you are an auditor and you're reading this code, you have to think about the behavior of the code at runtime, you know.

**Speaker C:** Yeah, and I guess, this is so basic, you're going to laugh at it.
But let's imagine a world where we have our variable X, and when we instantiate or create the variable, there's a number in there, right? An integer, let's say. And in some languages, the compiler will enforce that. It will not create the program if at any point it sees that in this variable X, instead of trying to put a number in there, you're trying to put a word or some other type of object. The compiler will literally see this and say, oh, this is expecting a number, we got a word, and error. Abort everything, we can't compile. And then in some of these looser languages, again, C is the perfect example, that has no typing at all, you say let variable X equal your integer, and then maybe 10 steps down the line you call a subroutine that takes whatever the value at X is and replaces it with a string. And then 20 steps later, when you go look at X and you're expecting a number and you get a string, the compiler didn't notice that, but your program will blow up at runtime because what it was expecting wasn't there.

**Speaker A:** C is statically typed. So normally, if you try to assign something of a different type to X, then the compiler will complain. But in C it's also very easy to do a cast. So you can easily convince C that X is now a string, or rather a pointer to a character.

**Speaker C:** Got it. But at a high level, I guess, to bring it to its core: you as a programming language designer want to make as many choices as possible that remove ambiguity and remove flexibility, without actually removing the ability for the programmer to express the program that they want to make. Is that kind of the battle you're always fighting?

**Speaker A:** Yeah, I'd say that's a good way of characterizing it.

**Speaker C:** Yeah.
So you're almost like a sculptor who starts with like the full amount of material and tries to chisel away like, all the excess until you have the functional work of art in the middle. **Speaker A:** Yeah. Or you're like, design. You're like, creating a tool set for sculptors so that it's, like, easy for them to do what they want to do, but also not easy to screw up. **Speaker C:** Makes a lot of sense. All right, so we've talked a lot about the fact that you can have multiple languages. Then we talked about how Vyper has made these design choices around reasonability and security and simplicity. Can we start to talk a little bit about what are the types of applications that really benefit from the Vyper approach? And I know the most famous, the big one, is Curve. So if you want to kind of talk a little bit about Curve or any other type of project, like, who do you think the best candidates to be thinking in Vyper as opposed to Solidity, are? **Speaker A:** I think, actually people who are learning about how to write smart contracts. Vypr is actually a very good entry level smart contract language. And that's not to say, you know, you can't do more powerful things with it. I mean, otherwise you wouldn't be able to build Curve. Exactly. Protocols. But Vyper kind of, you know, as much as possible, tries to strip away the noise. **Speaker C:** Yeah. And it, I mean, it sounds very similar to Python. Right. Where it's like, it's so much about readability and just like, understandability that, like, by learning to program in it, you can immediately focus on what you're trying to do as opposed to, like, the weird quirks of, like, the language itself. And so I think that's what you're saying is why, like, Python is so. Sorry, why Viper is so powerful. Is it because it allows developers to jump to thinking about their apps as quickly as possible as opposed to, like, worrying about learning a new language? **Speaker A:** Yes, exactly. **Speaker C:** So that's Viper is. 
It is today. Can you talk a little bit about, like, what does the development roadmap look like for a language? Like, what are the types of things that you guys are working on and thinking about? And what do you want to turn Vyper into in the long term? **Speaker A:** I think of Vyper as kind of like Swift for iOS. I want it to be the most natural language for programming smart contracts. As to what goes into a language's maintenance over time, you're always wanting to add more language features. And a lot of the way that happens is you actually look at how the language is being used in real life, reading code on GitHub or whatever, and you're like, that looks unsafe or that looks not very ergonomic. Like what can we add to the language? Or can we design a feature which makes that easier? Is there some, is there a new kind of tool, a wrench that we can add to the language that's going to allow people to do this task in a safer way or in a more ergonomic way? **Speaker C:** So like for example, you might read on GitHub and see like the exact same function or function paradigm in like 40% of projects. And you might realize like, oh, if everyone is doing this, maybe it could be served by like having a built in feature of the language. **Speaker A:** Yes, exactly. And let me give you an example of that. Whenever you call an ERC20 token like transfer, whatever, according to the spec, it's supposed to return some boolean true or false. And what exactly that true or false means is subject to interpretation. But that's not actually the important part. The important part is that there's a lot of ERC20 tokens which are known to be non compliant, so they don't return anything. And so when you call them, normally when you call some function in another contract, the compiler will insert code to check that the contract you called actually returned back enough data. 
In this case, when you have a boolean that's being returned from a contract, you should get at least 32 bytes. And so there's all these non compliant contracts which are not returning any bytes. And so in older Vyper code bases you see a lot of this stuff which is like raw_call. So there's a way to bypass the compiler's, you know, checks or whatever, and the compiler's way of constructing the call to this external contract and reading data, and that's called raw_call. And what you'll see in a lot of these code bases is there's like this raw_call and then people are like constructing the call data manually. They're like computing the method ID for this transfer thing and then they're like turning all these numbers into bytes. You see this like abi_encode or concat and then you are going to check if it actually returned any data. And if it did return any data, then you make sure that it's true. And if it didn't, then you just kind of assume that the target contract would have reverted if the transfer didn't succeed or whatever. And so these are blocks of code that are like 10 lines long and they're like really weird and they're all over the place, anytime you want to interact with an ERC20 token. And I don't remember exactly how it came up, but I was like reading some contracts and I was like, this looks like really tedious. What if we added something to the way calls are constructed which would remove all this boilerplate? And so, you know, I thought about it and discussed it with a few people and eventually decided on adding this keyword arg, this extra argument to all calls. And it's called default_return_value. And default_return_value will just put in a value when the contract you call doesn't return enough data instead of reverting. 
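The check Charles describes can be modeled as a small decoding function. The sketch below is Python, not Vyper, and it is a simplified model of the behavior as described in this conversation rather than the compiler's actual generated code; the function name `decode_transfer_result` is hypothetical, and real ABI decoding is stricter about which 32-byte words count as a valid boolean.

```python
def decode_transfer_result(returndata: bytes, default_return_value=None) -> bool:
    """Interpret the bytes handed back by an ERC20 `transfer` call (simplified model)."""
    if len(returndata) >= 32:
        # A compliant token returns an ABI-encoded bool: one 32-byte big-endian word.
        return int.from_bytes(returndata[:32], "big") != 0
    if default_return_value is not None:
        # Not enough data came back: substitute the caller's default instead of failing.
        return default_return_value
    # No default supplied: treat the short return data as an error (a revert on chain).
    raise ValueError("call returned fewer than 32 bytes")


compliant_true = (1).to_bytes(32, "big")   # token that returns True
compliant_false = bytes(32)                # token that returns False
non_compliant = b""                        # token that returns nothing at all

print(decode_transfer_result(compliant_true))
print(decode_transfer_result(compliant_false))
print(decode_transfer_result(non_compliant, default_return_value=True))
```

Without a default, the empty return data raises an error, which mirrors the failure that pushed developers toward hand-written raw_call boilerplate.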
So now instead of doing this raw_call, constructing the call data, checking the length of the data that the contract returned you, whatever, now you just do like token.transfer and then you add default_return_value equals true. Because that's what people are doing anyways. **Speaker C:** Got it. So just to be like, let me make sure I understood what you said. You said before, like, you could, it's possible to, you know, request this data from ERC20. And because the ERC20 contract is non compliant, they would give you nothing or give you garbage or whatever. And in previous versions of Vyper, like, that would cause the whole program to fail. **Speaker A:** Yeah. And people would have to write all this boilerplate code in order to deal with that. **Speaker C:** Yeah, and yeah, exactly. You and the team realized that like, every single person was writing variations of the exact same boilerplate. And you realized not only can we save everyone a bunch of time, but because everyone's writing their own, like, boilerplate as a safety check, like, some people are making mistakes and getting it wrong and causing like, havoc. So you just thought, you know what, a new feature of this language should be that like, should solve this problem for us. And that's how you upgraded Vyper. **Speaker A:** Yeah. And the neat thing about this feature is that it's actually clear what you are doing, which in my opinion is actually the most important part. So like, before you would see like raw_call with a bunch of like, data. And then you have to like, figure out what the programmer was trying to do with the return data. And then like, you have to read the code and try to understand what's going on. And if you don't know, if you're not familiar with the pattern, it takes some time to get used to it. 
But if you see default_return_value equals true, like you just look up in the docs what that means and you're like, oh, sometimes contracts don't return enough data, don't return enough bytes, and default_return_value just gives you a way to deal with that. **Speaker C:** Got it. And so would you say that like most of the upgrades, if not all of the upgrades that you make to Vyper are either like straight up security patches or this idea of looking for what developers are spending their time on as non productive work and then figuring out, okay, is that something that can be incorporated into the language? **Speaker A:** I'd say that's about 30% of it. Another 30% is finding ways to compile to EVM code that's like more optimized. It's either smaller or uses less gas. That takes a lot of time. And then I'd say giving people more data structures is important, which is kind of like this language feature, but it's also like data structures are their own thing. And then a big thing that I've been working on for, or actually planning for, a long time, and a lot of people are, I think, excited to see it, and also like, "Charles, why isn't this done yet?", is Vyper's module system, which is really important because the alternative is to copy paste code and developers should have ways to reuse code which isn't just copy paste. And maybe this is too much hesitation on my part, but I've been going about it very slowly for a long time because I recognize that like, once you introduce this module system, and a good module system should feel like part of the language, not like it was bolted on as an afterthought, it's kind of hard to change it. And so I want to kind of get it right from the get go. And there's a few features which I think are really important in Vyper's upcoming module system. One is there's no inheritance. So Vyper is firmly in what I would call this contract oriented language paradigm, as opposed to like an object oriented language. 
So contract oriented means that the programming experience revolves around a contract. And contracts, in my opinion, should not have inheritance. Because one way to put it is that inheritance is a little bit like pointer aliasing, except for code. So we talked earlier about like figuring out which variable some storage pointer refers to and how that's kind of sometimes really hard to reason about. And inheritance is kind of like that for code. So when you have a lot of inheritance in a code base, what can happen is that you kind of need to figure out how something is instantiated, how some object is instantiated in order to figure out what it actually does. And by that I mean like what code is actually being run at runtime. So if you have like foo.bar, which bar is that? And this can be like actually a really hard problem when you're reading a code base. **Speaker C:** Yeah, so sorry, let me just unpack that a little bit. For all the non super technical people, what Charles is essentially saying is that today, if you like, let's say you create like a little function that all it does is take two numbers and add them together. Today if you wanted to use that in two different places in your project, you would literally highlight it all, copy it, and then paste it into the different part of your project. And like, if you, if you know anything about computer science, like, that is not what you're supposed to do. And the reason, very simply is you might decide at some point that you want to make a small change. Maybe instead of plus, it's multiplication. And you might go do that on your first place that you originally wrote the code, you might do it on the second one. But if you forget to do it on every single one, then sometimes your program is going to behave in weird ways. 
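Charles's "foo.bar, which bar is that?" question can be made concrete with a tiny example. This is a Python illustration of dynamic dispatch under inheritance, not Vyper code; the classes `Token` and `FeeToken` and their methods are invented for the sketch.

```python
class Token:
    def fee(self):
        return 0

    def transfer_cost(self, amount):
        # Which fee() runs here? Reading this line alone cannot tell you;
        # it depends on how the object was instantiated.
        return amount + self.fee()


class FeeToken(Token):
    def fee(self):
        # Overrides the parent, changing what transfer_cost() does
        # without touching the line that makes the call.
        return 5


plain = Token().transfer_cost(100)
taxed = FeeToken().transfer_cost(100)

print(plain, taxed)
```

The same call site, `self.fee()`, runs different code depending on the instance, which is the analyzability cost he wants Vyper's module system to avoid.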
What inheritance and modules and all of this conversation that Charles has just had is about is: every time it's the same code, we want it to exist in one place and then we want to use tools like importing from other files to make that code available without having to like manually replicate it and have it live in different places. And then as Charles was saying, like, that's a super powerful, super basic feature of modern programming languages. The problem is it creates the exact same analyzability issues that we were speaking of before, which is if your code isn't explicitly with the rest of the code, you have to go find it in other files. And it can be really hard to track down what you're actually trying to do statically right before you run the program. You might be stuck in just having to let it run and see what happens. And so all to say, it's a very hard design problem. Like it's super powerful, but also introduces a ton of surface area for failure. And like, totally makes sense to me that people are just all on your case about adding it, but also, like, there's not an easy way to do it. **Speaker A:** Yeah, so like, one very high priority design goal for Vyper's module system is that like, when you see some function call, you kind of know what it does. You don't have to do any reasoning to figure out which function is being called. I mean, you might have to like open up a file, but you don't have to like guess which file to open up. **Speaker C:** Yeah, you don't have to open up a file, and then that file, depending on the inputs, will reference one of three other files. You know, like, it really should be super linear that if this code is called, I know exactly what it's supposed to mean. Yeah, cool. Well, in the last few minutes here, I would be remiss if I didn't take the opportunity to ask you to reflect or to talk a little bit about the Vyper hack. Um, and just like, I don't. There's no reason to get too deep into it. 
The reality is that like you program stuff, you put it out in the world, like people are going to look for ways to break it and sometimes they do and like that's just life. But I just wonder if you have any higher level reflections on like what happened and like what you learned from that at the programming language level. And yeah, just how Vyper has changed after going through that crucible. **Speaker A:** So I think one important takeaway, which is kind of obvious if you think about it, but this like really hammers it home, is that the bar for correctness for smart contract compilers is really high. It's much higher actually than Web2, because in Web2 if you find a bug in some code, you just kind of patch it and fix it and move on. Whereas with Vyper or Solidity or any smart contract compiler, it's not enough to just keep the current version up to date and bug free. You also need to like always be checking old versions of the compiler because there could be contracts that are out there that were, you know, deployed with Vyper 0.3.0 and they're immutable, right. They can't just upgrade to the latest Vyper version; you can't just recompile it. Because in Web2 that's pretty much the mitigation most of the time: you just like send people new code, everybody recompiles and then like they can move on with their life. But in Web3, in smart-contracts-are-immutable land, that's not really possible. And so we've kind of started to approach security in a much more, um. So I want to choose my words very carefully here. We've started to approach security in a much more robust way. So first of all, we've onboarded several very good audit firms to not just review Vyper's code like on a regular basis, but also to approach Vyper security in like a multi faceted way, so to speak. So that includes, like, writing tooling to help find bugs in Vyper. That includes kind of review of older compiler versions. 
That includes kind of, you know, differential testing. So, like, if we take the same contract and compile it with, you know, different versions of Vyper, does it behave the same? And if it doesn't, then there's like some kind of problem. And also starting these kind of regular, or, we've had one, we've started one competitive audit actually of Vyper, and we're probably going to do this on a regular basis going forward, since it, you know, seems to be a very robust way of getting a lot of eyes on the code base. **Speaker C:** So I guess, like, final question on this. Did this experience make you think of having upgradeable smart contracts differently at all? I mean, I guess what were your feelings on should smart contracts be immutable or upgradable? And did this experience, like, alter your ideas on what best practices are? **Speaker A:** Not at all. **Speaker C:** Okay, so what's the best practice? **Speaker A:** I mean, it's very tempting. It's very attractive, right? You know, you compile this contract with some version of Vyper or Solidity or whatever, and then this contract is immutable, right? But then sometime later, it turns out that there's a compiler bug in that specific version that, you know, you were using and you want to upgrade it, you want to just like, recompile the contract. With the new version of the compiler, it should do the same thing. So there should be like, no problem with that. You just recompile it and redeploy it and then there's no more bug. And that's actually more problematic than it seems on the surface. So, first of all, I think that immutable smart contracts are a good thing in general as a design choice. You know, you see this contract and you want to interact with it. And not everybody maybe has the technical capacity to do this, but in principle you can, like, read the bytecode and you kind of know what it does. 
And it's very desirable from a safety perspective in a lot of ways to know that, like, if you do something with this contract today and you interact with it tomorrow, like, it's also, it's always going to be the same contract. And this idea that recompiling a contract doesn't change its behavior isn't actually true. For instance, the Vyper team could like, just like completely change the language. So like x equals one does like, something completely different in two versions than it did today. And that's like, obviously, you know, a bit of a silly example, but that kind of thing happens all the time, which is that the behavior of the language changes subtly between versions. So if the storage layout ever changes, you know, between versions X and Y, and then you recompile your contract with version Y, it's actually going to have different storage than you originally started. And that's basically a very similar problem that upgradable contracts have already, except that it's not really hidden in the compiler. But people already, you know, need to be very careful when they're upgrading contracts that they use the same storage as they were before. And I think that there's this kind of perception that the source code of a program kind of defines what the program does, but that's not actually true. It would be an exaggeration to say that the compiler takes your source code, does some stuff with it, and then like outputs whatever bytecode it wants. But in some ways it's actually not entirely incorrect. Languages have ambiguities. They have like a lot of undocumented behavior. They have undefined behavior, which is very well known in C. I'll take a digression which is that like in C there's areas of the spec which are termed undefined behavior. And these are things like division by zero, accessing invalid memory. 
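The storage layout hazard mentioned above can be sketched with a toy model. This is Python with a deliberately hypothetical layout rule (one slot per variable, assigned in order); it does not describe how any real Vyper version lays out storage, and the names `assign_slots`, `layout_v1`, and `layout_v2` are invented for illustration.

```python
def assign_slots(variables):
    # Hypothetical layout rule: one storage slot per variable, in the given order.
    return {name: slot for slot, name in enumerate(variables)}


# Two imaginary compiler versions that order the same variables differently.
layout_v1 = assign_slots(["owner", "balance"])
layout_v2 = assign_slots(["balance", "owner"])

# On-chain storage is just slot -> value, and it outlives the code that wrote it.
storage = {}
storage[layout_v1["owner"]] = "0xA11CE"
storage[layout_v1["balance"]] = 1000

# Code recompiled with the second layout reads the storage written under the first.
owner_seen_by_v2 = storage[layout_v2["owner"]]
balance_seen_by_v2 = storage[layout_v2["balance"]]

print(owner_seen_by_v2, balance_seen_by_v2)
```

The source code never changed, yet the recompiled program reads the balance where it expects the owner, which is the same care upgradable contracts already require around storage.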
And I think what the spec says, or maybe this is just the general convention, is that when the compiler sees code with undefined behavior, it's allowed to do whatever it wants. And in practice this usually just means that it can optimize the code better, but it can also fire the missiles or it can emit code that fires the missiles or whatever. And so the source code is actually not the final say on program behavior. What the program actually does is defined by its bytecode. And so this concept that you can just like recompile a contract and it'll do what you want it to is basically flawed. The program is really defined by the bytecode. And yeah, I mean, if that were the case, then you could like take independently implemented versions of the Vyper compiler and like compile the same program and you would get the same bytecode. But actually whenever you take, you know, two compilers and you, sorry, you compile the same source code, you're going to get like different machine code. **Speaker C:** Yeah. So sorry, just to focus that. So when it comes to like developers creating dapps, like their choice is either to like issue an immutable smart contract or to issue a smart contract that points to a proxy and we can change which proxy it's pointing to. Therefore, we can kind of like upgrade the smart contract. And I think the point that you're trying to make is that, like, this immutability is like a feature, not a bug of Ethereum, and we should, like, really, like, understand and appreciate that. And yes, like, it introduces some risks, like what happened with Vyper in the hack before. But, like, we're gaining some very powerful things from that, which is the ability to know that whatever that smart contract did yesterday will happen tomorrow. And, like, you don't want to be so dismissive of that. Is that correct? **Speaker A:** Yeah. Yeah. **Speaker C:** Cool. 
All right, man, I'm so sorry. I feel like we could keep going on for a long time, but coming up on an hour now, so before I let you go, will you just let the audience know where they can find you, how they can learn more about Vyper and if they want to contribute, what should they do? **Speaker A:** Yeah, so there's a lot of good ways to get involved. If you want to learn more about the Vyper source code or contribute, you can go to the vyperlang GitHub, which is github.com/vyperlang, and Vyper is of course spelled V-Y-P-E-R. You can follow Vyper on Twitter, which is also twitter.com/vyperlang. You can also follow me on Twitter, which is big_tech_sux. And you can also join our Discord and the link to that is on our GitHub page. There's not a really easy, human readable Discord link. **Speaker C:** Cool. All right, man. Charles, thank you so much. This is like, a really, really cool conversation to me because there's not a lot of people that are thinking about, like, really tooling and like, design principles and to make basically this space more accessible and like a platform that can be built on. And so I just have so much respect for, like, seeing a problem, seeing a need, and going out there and building something that is not only, like, cool and works, but is robust enough for, you know, the beating heart of DeFi 2.0 or however you want to put it. But, like, the proof is in the pudding with Vyper. Right? And it is for sure with, like, the compiler you've built and everything that's to come and all that stuff, but, like, the real proof for everything is, like, are people using it? And so it's very cool that, like, just like we have client diversity, we now have language diversity. And I think that's, in the end, going to be such a huge, like, part of the journey to the world computer. So, Charles, thank you so much. Really appreciate it and hope to talk to you soon. 
**Speaker A:** Thanks so much for having me, Rex.
