FLOSS Weekly

Episode 792 transcript

17 July 2024
FLOSS-792

Jonathan: Hey, this week Jeff Massey joins me and we talk with Sylvestre Ledru about the re-implementation of ls, cp and a bunch of other utilities in Rust. But they're not doing it for the reason you probably immediately think of. And to find out more, you're gonna want to stay tuned. This is FLOSS Weekly, episode 792, recorded July 16th: Rust Coreutils.

Hey folks, it is time for FLOSS Weekly. That's the show about Free, Libre, and Open Source Software. I'm your host, Jonathan Bennett, and today we have a great show. We are talking with Sylvestre Ledru about core utils, but not the regular core utils; we're going to be talking about the Rusty core utils.

It's going to be fun. It is, of course, not just me. We've got Jeff Massey with us today. Hey, Jeff. Welcome, sir.

Jeff: Welcome. Glad to be on. You know, this topic's interesting. It is. I'm looking forward to it.

Jonathan: So I, I asked Jeff to be the co host and he says, I don't know anything about Rust. I'm like, yes, but you know about core utils.

Yeah, I guess I do.

Jeff: My other, other podcast job. You know, we, we hit on that quite a bit. So a lot, a lot of core utils.

Jonathan: Yes. Well, when you're covering Linux, when it's a Linux show, you do a lot of core utils. Now, have you ever gone and grabbed the Rust core utils?

Have you tried them? Have you dabbled with them at all?

Jeff: I have. Not the entire package, but there's a few of them I've actually played with, and, you know, so I can say I have used them.

Jonathan: That's more than I could do, actually. I'm definitely aware of them. We've covered them in the past.

Well, let's not spend any more time talking about the project when we've got Sylvestre Ledru here. Let's talk to the man himself and get the lowdown there. So Sylvestre, welcome.

Sylvestre: Nice to see you.

Jonathan: Yeah, it's good to have you. I am super excited to have you here today.

And so, are you the BDFL, the Benevolent Dictator for Life, over the Rusty core utils? Or have you come more recently to the project? What's your history there?

Sylvestre: So for life, I don't know, but I have been involved in the project for the last four years, basically since right before COVID hit.

Jeff: Ah, yes.

Sylvestre: And I have been the main maintainer, but I have three other great maintainers helping me with that work.

Jonathan: And so you didn't start the project, you, you kind of picked the banner back up?

Sylvestre: Yeah, so someone, Jordi, started it like 10 years ago, and then some people worked on the project from time to time, just doing some basic maintenance.

It was hard to get pull requests, and I noticed that the project needed a maintainer who could spend more time on it. So I decided to volunteer, and I'm stuck with the project. Hopefully not for the rest of my life, but for the year to come at least.

Jonathan: Yeah, that's, that's an interesting thing that happens in a lot of, a lot of floss projects, you know, some of them grow enough to where they take on a life of their own.

And so the original maintainer gets to step out, or hire somebody, or, you know, turn it over to a crowd of people. But there are some projects where it's one guy doing it forever until, you know, he finally decides to retire. And sometimes it's a problem. I suppose time will tell which direction the Rust core utils are eventually going to go.

And I'm going to get into that in, in a bit. So let's, let's start though at the very basics. I don't know if Jeff and I know the answer to this question, but let's, let's start with the question of what are the core utils? Like what, where do you use these programs? What are they about? What are, what are core utils?

Sylvestre: So it's a good question. It is easy and not that easy at the same time. If you look at the source code of the first version of Unix in '71, you can still see some of the same programs existing. So chmod, for example, to change the permissions of a file, ls, cp: those commands are part of the coreutils.

And then they grew in different directions. Some people needed different tools, so some of them are sort, cut, and so on. There are tools that we never use in our regular life. So there are some things like pr, which is to format text before printing; there are others like ptx that you rarely use, and tsort to do topological sort.

So we have about 60 to 70 coreutils. We are trying to match what GNU is doing with their implementation. I don't know if it is clear for everyone, but you have different implementations of the coreutils. You have the GNU one, which is de facto the gold standard, but you have also some Unix implementation, InVaried, Dropbox, and so on.

Jonathan: Yeah, and so there are, you know, some other implementations too. BusyBox, for example, has a lot of the core utils as part of the BusyBox binary. In fact, a week or two ago, I made a comment about how, you know, BusyBox doesn't have grep. And somebody in the comments was like, yes, it does.

It has those programs. Okay. Yes, fine. It has those programs built into it, too. But it's interesting. These are old utilities, like, they've been around since 1971. That's a 50-year-old program that's been around. It's incredible. And so this obviously brings us to the next question.

Why, I guess, for one, why do we still care about these? And two, why are we rebuilding them in Rust?

Sylvestre: So, good question. We need them every day. As soon as you open a terminal, you use the coreutils all day long, just to do ls, cp, mv, rm, chmod. So changing the permissions: it is our way to communicate with the file system, with some of the operating system basics.

And so that's why you care about them. Now to the question, why are we interested in implementing them? One of the first reasons is that it is fun to implement those tools. You understand a lot about the system, how it works.

Jonathan: Yeah.

Sylvestre: You know the expression, standing on the shoulders of giants. Those folks designed Unix 50 years ago, and chmod is still working the same way.

cp is mostly working the same way. And it's fascinating to see that the decisions those folks made at the beginning of modern computing are still relevant now. So if you use any BSD, any Linux, you still have the same paradigm. You still use 666 or 777 when you don't know how to set the permissions. And if you look at the code from Unix back in the day, it's still the same paradigm, and we still use the same model.

Now, why are we re-implementing it? It's not about security. If you look at the GNU implementation, they had like 13 CVEs over the last 20 years. So security, which is often one of the arguments for promoting Rust, is in that case not our driver.

Speaking for myself, my main driver is thinking about the next generation. Like, in 50 years, will C still be relevant? And the young generation, will they want to learn C or C++ anymore? And I think Rust is a fancy language. Everybody wants to learn Rust. I think it's the right time now to re-implement that software in a brand new language which is fixing some of the hard things that we still see in C, like memory management or parallelism and those kinds of things.
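For readers who have not met the octal notation mentioned above, here is a minimal Python sketch of what 666 and 777 mean. It only illustrates the permission model; it is not code from either coreutils implementation, and the helper name describe_mode is made up for this example.

```python
import stat


def describe_mode(octal_text):
    """Turn an octal permission string such as "666" into rwx notation."""
    mode = int(octal_text, 8)
    # stat.filemode() expects a full st_mode value, so we add the
    # "regular file" type bit to the permission bits.
    return stat.filemode(stat.S_IFREG | mode)


if __name__ == "__main__":
    for text in ("666", "777", "644"):
        print(text, describe_mode(text))
```

Each octal digit covers one audience (owner, group, others), and within a digit 4 means read, 2 means write and 1 means execute, which is why 666 grants read and write to everyone and 777 adds execute.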

Jonathan: So we have just alienated half of our audience by saying that C is going to go away. No, I think that's a really interesting answer. It fascinates me, you know, like you say, so much of the push for Rust is because of security. And I'm curious. So I assume that you look at the C code of the core utils

to make sure that you're matching the implementation from time to time. And I'm really curious, is there like really old and crusty C? So, for those that don't know, for those that have not looked into hardcore C programming, you can commit cardinal sins in C. Like, you can do nasty, ugly things in C.

And sometimes like if you're writing a kernel, you have to, and I'm just curious, like, as you look into the C code of these old programs, is it crusty? Do you, do your eyes sometimes water and bleed just from looking at it? Are there some of those in there?

Sylvestre: So it's a tricky question. Yes, I look at the source code, but I have to be careful in the way I look at it, because it is GPL code and our implementation is MIT.

So when I look at it, I'm looking at it because I'm contributing upstream. I'm also contributing tests. I wrote a few patches upstream to improve compatibility and error management, and to add tests. In those cases, obviously, I have to read the code, but I'm not reading the code to do the reimplementation on the other side.

I'm spending a lot of time reading upstream tests to make sure that we are compliant. But I'm not looking at the code. The code, however, is complex, not because it is C, but because of the legacy. Like, cp and ls are nightmares to implement because you have a lot of options and you cannot have undefined behavior.

So if you pass option X and option Y, you need to define a behavior, and sometimes that leads to very hard-to-read spaghetti code. So that part is hard. It's more about dealing with the legacy than about C being sometimes hard to read.

Jonathan: Yeah, does Rust give you the tools or are there more modern techniques that have allowed you guys to get away from the spaghetti code?

Sylvestre: So, because we can re-implement the tool from scratch, sometimes it's easier. But sometimes we have some crappy workarounds, like if this option and this option are passed together, then you change the configuration at the start of the function, this kind of thing.

So sometimes we have to do some crappy workarounds. For example, one of the things that I implemented during a long flight was ls --dired (D-I-R-E-D), which is the directory editor mode for Emacs, and that one introduced a bunch of special cases. If you are dealing with a directory,

it's not the same as a file, whether it is recursive or not. So you have to manage plenty of those cases.

Jeff: Mm hmm. So, I got a question for ya. How did you get here? What made you get up one day and say, you know what, I think I really need to get in and rewrite core utils?

Sylvestre: So it's a, it's a longer story.

So my official job is, I'm a director at Mozilla. I have been working at Mozilla on Firefox for the last 10 years, and Mozilla created Rust. So I know most of the core Rust developers. Some of them were in Paris, and I had the chance to work with them every day. And I was jealous of them writing Rust code.

And at my work in theory, I'm not supposed to write code. Like, I'm managing people and projects, I'm not writing code. So I was jealous of those folks, and I was like, I need to learn Rust, and I want to learn Rust, but I don't want, I'm not a student anymore, so I want to do something that is going to last.

And that project was in the right place. Like, the right project to understand how it works and the operating system, you know, that middle ground. It's not a kernel, it's not a high-level application. It is really something that I can understand. And they are self-contained.

Jeff: Well, it would sound then like, based on your day job, you can kind of set the pace, so you're not so beholden to timelines.

You know, there's a little more flexibility in there.

Sylvestre: Yeah, we release Firefox once every four weeks, a major release every four weeks. On this project, I can ship when I want. I don't have any constraints. I can stop working on it for like a month. And so yeah, it's a good way to reconnect with code.

Jeff: Well, and talking about the release schedule. So, albeit, you know, there's some flexibility, when do you think it is going to be ready to, you know, just put in Debian or something and say, okay, here are the new core utils we're going to use?

Sylvestre: So it's a good question. So it depends what you mean by ready.

I have been using it in production for almost two years myself. So all my systems are using that implementation. There are things that we don't implement, but they are usually corner cases or options that I don't use or don't need, or programs that I don't need. I mentioned earlier the pr command to format text for printing.

I have never used it in my life. I only use it when I need to implement some function. And so it's already ready. Now the question is, when is it going to reach a bigger audience? I know that some companies are already using it in production. So a big social network, not Facebook, is using it on embedded devices as far as I know, mostly for license reasons.

I know that some operating systems for cars are also using it for license reasons. And then, this is one of the strengths and weaknesses of open source: maybe others are using it in production and never told me about it.

Jonathan: So I'm curious. You mentioned that some of the utils have sort of weird corner cases that you guys don't support.

Are you planning to eventually add support for those corner cases, or are there some of these things where you have just intentionally said, we're not ever going to do this the way that core utils does? And so I guess the bigger question is, is the goal, you know, 100% sort of bug for bug, maybe not literally, but, you know, exact feature-for-feature compatibility with core utils? Or do you guys feel a little bit of flexibility to say, you know, maybe this decision that was made back in the 1970s was not the right decision, or it's not relevant anymore.

And we're going to update it a little bit. What, what's the, what's the philosophy on that?

Sylvestre: So I have been involved with LLVM and Clang. And one of the successes of Clang, in my opinion, is that the team considers that if they don't implement a GCC flag, it's a bug. And I think it contributed significantly to the success of Clang.

And I'm trying to replicate that model in this project. So yes, we want to reimplement everything. Now, there are corner cases in tools that nobody uses, where maybe it is going to take longer, but our goal is really to implement all the features and all the flags for the most common options and commands.

So ls, cp, mv and so on. We are already passing 100 percent of the upstream tests on some of those commands, but not all of them. So we are working on fixing those. For example, we have a Google Summer of Code student who has been working on re-implementing some of the color functions of ls.

And it's quite hard. Like, I played with it, and ls --color is not easy at all to implement. And so we work with upstream developers and other Rust developers to make it perfect. But we consider that a bug.

Jonathan: Yeah. Oh, I've, I've done just a tiny bit of color work on the, the output of another, another project that I'm working on.

And so you get into ANSI color codes, and then you have to pay attention to things like terminfo. And I can imagine that being just a mess to work with. Oh goodness.

Jeff: So, how many of the utils are 100 percent compatible?

Sylvestre: I don't have the exact number, but we are publishing every day the updated list. So, we can share the link after that podcast, but we have a list.

So, we run the tests, all the GNU tests, every day, and we publish the results. I think we have like 65 upstream programs, and I think 20 to 25 are 100 percent compliant. But what I like to do is, when I reimplement a tool and I notice that all the GNU tests are passing,

and then I find a bug in my code, I'm going to look at whether upstream has a test for it or not. And if they don't, I am going to contribute a new test upstream to make sure that it improves GNU compatibility. So I've been contributing a lot of patches upstream to make sure that their test suite is better and better.

Cause as I said, there are so many combinations that we cannot test everything.

Jeff: Yeah. So, I want to make sure I heard you right. If you find a bug in, say, the GNU utilities, you implement the bug as well?

Sylvestre: No, we are going to report it upstream. And usually the answer is not a bug. It's a feature.

I'm joking. I'm joking. But yeah, for example, the cksum command is a weird command; you can see that it was designed in a weird way. cksum has plenty of arguments. There are two which are --tag and --untagged, and the command is only going to pick up the last one.

Even if they are conflicting with each other, it's only going to pick up the last one, which is quite confusing as a user. So initially I reached out to the GNU project, asking, can I just make the first one conflict with the other one? And they said no, because you are going to break some behavior from the past that someone might have used like 20 years ago.
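The two policies discussed here, "the last flag wins" versus "conflicting flags are an error", can be sketched in a few lines of Python. This is an illustration only: the function resolve_tag_mode and its strict parameter are invented for this example and do not reflect how GNU or the Rust implementation actually parse arguments.

```python
def resolve_tag_mode(args, strict=False):
    """Decide between tagged and untagged output from a list of flags.

    Returns True for tagged, False for untagged, None if neither was given.
    With strict=False the last flag silently wins (the behavior described
    in the interview); with strict=True mixing both flags is rejected.
    """
    tagged = None
    seen = set()
    for arg in args:
        if arg in ("--tag", "--untagged"):
            seen.add(arg)
            if strict and len(seen) > 1:
                raise ValueError("--tag and --untagged cannot be combined")
            # Later occurrences overwrite earlier ones.
            tagged = arg == "--tag"
    return tagged
```

Calling resolve_tag_mode(["--tag", "--untagged"]) returns False because the later flag overrides the earlier one, while the same call with strict=True raises an error. The compatibility concern raised by the GNU developers is that scripts written against the first behavior would start failing under the second.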

It clearly makes sense, and it's clearly a bug, but it has been used in the past. So we are trying to find a good compromise. Sometimes we try to display the same output and the same errors. And sometimes we think that we can do a better job in terms of error management.

So in that case, we are not going to follow exactly the same output as GNU, and we are going to provide better error management.

Jeff: Oh, sorry. I didn't mean to cut you off there. No worries. So, saying that, you know, you're trying to provide a better environment: how's the community, what's the reaction from the community on these tools?

I mean, have you gotten any feedback or?

Sylvestre: Well, I just looked at the number before meeting with you folks, and we have 500 contributors. I'm sorry, I'm lying: 499. Hopefully someone is going to contribute this evening and then prove me wrong. So we have a lot of people who are interested in contributing.

We have a lot of good first bugs. So we know that people are excited by that project. Now, how many people are using it in production? I don't know. But the reaction is usually very positive. Some people are always asking the same question about the license. So, the folks who started that project, they used MIT.

And some people saw it as an attack against the FSF, which is clearly not the case for me. If you look at every Hacker News or Reddit thread about it, they are going to mention the license. So, some people are very vocal about the license. I'm not. I honestly don't care that much, as long as it is OSI compliant.

So we have that one, and there is always a concern about Rust, that it is hard to use, hard to package, and hard to develop in. And for some people Rust is just a trend. I disagree, but some people still think that this language is going to disappear.

Jonathan: Yeah, so okay, several things there I want to ask about, and the first one may be a difficult question, or maybe you have the answer ready, I don't know. But I get why developers like this, right? Because it's something new, a re-implementation of something really old, in a popular language.

I mean, that's just candy to developers. Like, I'm sitting here going, I wonder if I could send a patch in. I'm sure there's something that I could dive in there and work on. I don't know Rust at all, but I'm sure I could work on it. Like, it's just candy for us developers. But for users, for end users, what are the advantages, if there are any, of going to the Rust core utils?

And you said something about how in some use cases there are actually compliance reasons to do so, and I'm really interested in what that is. You mean compliant in what sense? So you said something about car manufacturers using it, and it seemed like there was a legal compliance issue.

Sylvestre: Yeah. It's GPL.

Jonathan: Oh, it's the GPL versus MIT. Oh, okay. And so is there something then for, see, I figured that was maybe with the new EU laws, they are pushing people towards memory-safe languages. And so I thought maybe that was what was going on there. But comment on that, if you would, for a minute, about what the reasons are that regular users might want to use the Rust core utils.

Sylvestre: To me, one of the reasons is performance. For some functions, we are faster than the GNU implementation. So for example, if you do a recursive ls or if you do a cp, we are faster, not because we are better developers, but because we are leveraging some of the tweaks that Rust is providing you for free in the system.

We also have some extensions. We are documenting every extension that we add. I did a presentation at FOSDEM like one year ago, and I mentioned that we have a --progress option in cp and mv, and people clapped in the room. I was very surprised. And then I had to move some files a few months after, and I was like, oh yeah, now I understand why, because it's such a pain with cp: you never know where you are at.

You have to run du on the other side in another terminal to know how many files you still have to transfer and so on. So we have some extensions that are helping the user. We also took some options for cut, for example, from the BSD world. So we can do some extensions.

We are trying to be reasonable. Like, we are really trying to understand if it really provides value to the user, because we know that we are contributing to the fragmentation and to the mess by adding extensions. So we are really trying to be careful.
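To make the idea of a progress option concrete, here is a small Python sketch of a chunked file copy that reports how far along it is. It is not how the Rust cp implements --progress; the function copy_with_progress and its report callback are assumptions made up for this illustration.

```python
import os


def copy_with_progress(src, dst, report, chunk_size=1024 * 1024):
    """Copy src to dst in chunks, calling report(copied, total) after each chunk."""
    total = os.path.getsize(src)
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break  # end of file reached
            fout.write(chunk)
            copied += len(chunk)
            report(copied, total)
    return copied
```

A caller could pass a callback such as `lambda done, total: print(f"{done}/{total} bytes")` to get a running count, which removes the need to watch the destination with du from a second terminal.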

Jonathan: Yeah. Oh, that's, that's interesting.

And yes, for the progress bar, thank you. That has been a long-standing gripe of a lot of people about cp. Okay, so let's go in the direction of packaging. We talked about this a little bit before the show, and I will try to re-pontificate my thoughts on this.

I'm a Linux user. I use Linux, and therefore I am rather fond of using the package manager provided by my Linux distribution. So in Pop!_OS, it's apt; in the Fedora machine behind me, it's DNF. That works great. And there's this issue with languages. Python is one that does this. The Node.js JavaScript sort of ecosystem does this. And then also Rust does this. And in Rust's case, it's Cargo. And you have a package manager just for the language, which for developers is amazing, and it's extremely helpful. The problem that I see there is there's this sort of disconnect: there's now a package manager for your distribution, and there's a package manager for your language.

And how do you then install the packages that you want? Because, you know, there's the obvious advantage of installing them via your distro's package manager. To put a point on this: is Rust hard to package, and has that been a problem for you guys for getting these packages into distros?

Sylvestre: So, I'm, I'm a Debian developer and I uploaded the first version of the Rust compiler in Debian, so I know the pain and I have plenty of scars to show it. So Rust is not easy to package, but it's very similar to Java. So it's not

Jonathan: just me. It's not, it's not just me that has that problem.

Sylvestre: Yeah. So I found it very similar to the Java ecosystem. With Java, not anymore, but it used to move very fast, so you had plenty of different versions of the same library, and you had to package different versions of the same library. So, for example, you had several versions of the XML parser, or several versions of a library to read files, and so on, because upstream is not always following the best API practices, like the semver system.

So you have the same issue in the Rust ecosystem. It's less and less of an issue, because developers are stabilizing the core libraries, the crates. But in a distro like Debian, you need to package all the dependencies independently, and without any network connection. So that means that if you have the Rust coreutils, we have, I think, 300 dependencies.

That means that you need 300 different packages in Debian to be able to upload that version. And when you want to update to a new version of the coreutils, that means that you need to update the dependencies that have been updated. Sometimes it's easy, sometimes it's hard. So yes, it's not that easy.

We have tools in Debian to make your life easier. And other distros probably have similar tools. But it's not specific to Rust. Like, OCaml has the same issue. Python, npm. It's part of the work of the Debian developer, of the packager, currently.

Jonathan: Yeah. Okay, so, I'm very curious. Is Rust core utils available in any distros?

So, you know, can I, in Fedora, let's say, can I install, you know, Rust core utils with DNF, or in one of the Debian derivatives, is it possible there, is, is the packaging worked out anywhere?

Sylvestre: I would phrase the question differently. It is: where is it not available currently? I think it has been packaged in most of the distros.

So it is on Brew. I saw some people packaging it on Windows. I don't know what that means, but there are some people working on it; I think it's winget or something. It has been in Debian for a long time. Ubuntu, Fedora, Arch Linux and so on. So you can do it. Now the question is, how do you update your system to use it?

This one is harder. So for example, on my system, I just override the PATH in every terminal to point to the Rust implementation. When I was trying to evaluate how much work was left to be able to boot a Debian system, I was removing the GNU coreutils to replace it with our implementation, to make sure that I was not using the GNU one.

Linux distros are providing different options. I know that some Linux distros are offering to replace the GNU implementation. On Debian, you have the two implementations next to each other, so you just update your PATH to make it work.
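The PATH override described above works because command lookup walks the PATH directories in order and stops at the first match. The Python sketch below shows that lookup rule with shutil.which; the function resolve_command is made up for this example, and the directory a distribution actually installs the Rust binaries into is not stated in the interview, so any path passed in is an assumption.

```python
import os
import shutil


def resolve_command(name, preferred_dir, base_path=None):
    """Return the executable that `name` resolves to when preferred_dir
    is placed in front of the search path."""
    if base_path is None:
        base_path = os.environ.get("PATH", "")
    # Directories listed first take precedence during lookup.
    parts = [preferred_dir] + ([base_path] if base_path else [])
    return shutil.which(name, path=os.pathsep.join(parts))
```

With a hypothetical install directory such as /opt/rust-coreutils/bin, resolve_command("ls", "/opt/rust-coreutils/bin") would return the copy in that directory if one exists there, and otherwise fall back to whatever the rest of PATH provides.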

Jonathan: Yeah, that's that's, that's probably a bit of a challenge. So most distros won't let you uninstall core utils.

That's kind of a protected package. Are, are any distros, do any of them treat the Rust core utils package as a replacement so that you can do the install of one and then the uninstall of the other? Is that, is that actually possible anywhere to go with just the Rust core utils?

Sylvestre: Yeah, I don't remember if it is Gentoo or Arch, but one of the two is providing that option.

Jonathan: Why am I not surprised that it's Arch? Why am I not surprised?

Yes, so I suppose at some point in the future, Arch users are going to say, by the way, I run Arch and the Rust core utils.

Sylvestre: Yeah, I was at FOSDEM also in February and, you know, some people were talking about Rust next to me, and one of them told me, oh, I'm using the Rust coreutils on my system.

And I was surprised, because it was the first time that a random guy told me about that at a conference. Like, in real life. It was funny, that kind of anecdote.

Jeff: That's great. Well, as a random guy, where could I find the core utils, the Rust versions? Now, I know you listed a bunch of different Linux distributions, but BSD or other Unixes, or, I know you touched on Windows a little bit, but...

Sylvestre: Yeah, so one of our strengths is that we are treating all the supported platforms as Tier 1.

So Windows, Mac, Linux, Android; we have FreeBSD support. Someone has been working on the OpenBSD port. So we treat every platform as Tier 1. If it breaks on Windows, you need to have a good reason for breaking it. So for example, hardlink is one of them. Windows doesn't have support for hardlink, same for Android.

So for this one, we disabled this feature in the code. But we are really trying to support every platform as well as we can. And we have CI on GitHub that runs every PR on these platforms.

Jeff: Oh, that's awesome. So, okay, I'm average guy. Okay. I decide I'm going to go and download the core utils. Am I going to notice a performance difference?

Sylvestre: Good question. It depends on the command and depends on the option. So sort is faster, ls can be faster, and there are other commands, like I think cut, that will be slower. For now we have been focusing on compatibility, and then we will focus on performance. We have some performance wins, but not always.

For example, there is a command, factor, to get the prime numbers, to play with prime numbers, and we are significantly slower than the GNU implementation. Some people recently looked at using some crates which are doing prime number math, and they were pretty bad compared to the math in GNU.

So there are places where we are slower. If someone is into math and wants to do some prime number math, there is space for you to do that in Rust and to make it faster than this implementation.
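For anyone tempted by that invitation, the task itself is easy to state: given an integer, print its prime factors. The Python sketch below uses plain trial division, which is the simplest possible approach and far too slow for large inputs; it is not the algorithm used by GNU or by the Rust implementation, whose speed comes from much more sophisticated number theory.

```python
def factor(n):
    """Return the prime factors of n in ascending order, by trial division."""
    factors = []
    if n < 2:
        return factors
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        # After 2, only odd candidates need to be tried.
        divisor += 1 if divisor == 2 else 2
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors
```

For example, factor(360) yields [2, 2, 2, 3, 3, 5]. The running time grows with the square root of the largest prime factor, which is exactly where a serious implementation has to be smarter.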

Jeff: I bet you there's somebody in the audience that's probably really good at math, you know, so here's your here's your chance to contribute.

So, and it's also good to know that even if you hit 100 percent compatibility, that's not the end of the program. You can then go back and say, okay, it's 100 percent compatible, let's make it faster. So that's very interesting. Now, you mentioned Android, and so I imagine there are some systems that are going to be rather constrained, you know, in both RAM and drive space.

How's the package size? Is there much difference? Is it bigger, smaller, the same?

Sylvestre: We have tricks to make it smaller. Rust is generating significantly bigger binaries than C in general, so if you don't use a trick that I'm going to share, it can be up to 100 or 200 megabytes for the coreutils, because it's much bigger and we don't use a shared library.

So you cannot save space by using that trick. But there is a trick that you can use: just like BusyBox, you have a single binary and then you create symlinks. It is a trick that I'm using in Debian. So you have only one binary, rust-coreutils, and then you create a symlink that is going to be named ls or nl or

cp and so on. And at the end, you only have one binary, so the size is reasonable. I think it's 20 or 30 megabytes. For the memory consumption, it's really dependent on the program. We had a bug report recently saying that our implementation of more, so to read a text, is using a significant amount of memory.

So one of the maintainers, Terts, decided to work on it and decreased the memory footprint by a factor of 10. But even with that, we are still using more memory. And so it's part of the fun of that project also.
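The single-binary trick relies on a program being able to see the name it was started under. The Python sketch below shows the dispatch idea only; the two toy applets and the APPLETS table are invented for this example and say nothing about how the Rust binary is organized internally.

```python
import os


def applet_echo(args):
    """Join the arguments with spaces, like a bare-bones echo."""
    return " ".join(args)


def applet_basename(args):
    """Return the last path component of the first argument."""
    return os.path.basename(args[0].rstrip("/"))


# Hypothetical applet table; a real multicall binary registers every utility.
APPLETS = {"echo": applet_echo, "basename": applet_basename}


def dispatch(argv):
    """Pick the applet from the name the program was invoked under."""
    name = os.path.basename(argv[0])
    args = list(argv[1:])
    if name not in APPLETS and args:
        # Started under the binary's own name: the first argument selects the applet.
        name, args = args[0], args[1:]
    if name not in APPLETS:
        raise ValueError(f"unknown applet: {name}")
    return APPLETS[name](args)
```

If a symlink named echo points at the binary, argv[0] ends in "echo" and dispatch(["/usr/bin/echo", "hi", "there"]) runs the echo applet. Every symlink shares the same file on disk, which is where the space saving comes from.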

Jonathan: So there's a rumor apparently going around that the Rust core utils project is entirely funded by you getting a Euro every time someone asks you why it's MIT instead of GPL.

Sylvestre: Yeah, exactly. Yeah, it's usually a good way to start some trouble that one.

Jonathan: Yeah, so, okay, I'm curious. From what I understand of licensing, if you wanted to, you could actually update the license from MIT to GPL, because they are compatible in that way. Is that something that's ever been considered?

Now, I'm not telling you that this is something you should do. I'm just asking the question.

Sylvestre: Well, I have had that question with plenty of friends and people online. As I was saying earlier, I don't care about the license. I only care that the license is OSI compliant. I think it's a good rule. So I'm not into a license debate, because it's more philosophical than technical.

And we have used that license for a long time, and we got a community, and the community is vibrant. So we have 20 to 30 people contributing to each release, at least 10 newcomers every time. I'm not saying that it is thanks to this license. But changing the license might create a lot of unwanted noise and conversation.

And I don't have time for that.

Jonathan: Yeah, especially since you have some users that are using your package because it's MIT. That would definitely be disruptive. That would be fork bait, let's say. Okay, so one of the other things that you mentioned in the prep is that you guys fuzz core utils.

You do some differential fuzzing as well, and I'm very curious about that. You said there have only been like 13 CVEs in the last 20 years in the upstream core utils. Was any of that found because of your fuzzing?

Sylvestre: No, no, no. When I started fuzzing our implementation, I was excited because I thought that I would find security issues upstream in the GNU implementation, and I didn't find anything.

Not even a single crash.

So it is a testament to the quality of the GNU implementation. I'm in touch often with the two main developers. I'm going to butcher his name, but Pádraig, and Jim Meyering. They are amazing developers and lovely human beings, so it's a pleasure to interact with them.

They are terrific developers, so I'm not surprised that I didn't find any issue. We fuzz not really for security, but for crashes in general, because it can find some weird behavior. So for example, there is a seq command in the GNU coreutils that you can use to generate sequences of numbers.

So integer or float and so on. And when you start fuzzing those things, you find some weird behavior when the numbers are very high or very small or close to one or those kinds of things. So we found bugs. And differential testing, differential fuzzing, helps us find differences with the GNU implementation.

So basically what we do is we generate some code, some Bash script or some input, that we are going to send to those commands. And we look at the error code. If it is zero, that means that it works. If it is not zero, it's an error. And then we are going to look at the standard output and the standard error to see if we are producing the same output.

So with ls it's very important. And we are looking at the error messages. So we do differential fuzzing, not really for security purposes, but for compatibility.
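The comparison step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual harness: the function names `run` and `compare` are made up here, the two commands are supplied by the caller as argument lists, and the random input generation that makes it fuzzing is left out. It runs a reference command and a candidate command on the same input and reports which of exit code, standard output, and standard error differ.

```python
import subprocess

def run(cmd, args, stdin_data=""):
    """Run one command and capture (exit code, stdout, stderr)."""
    proc = subprocess.run(
        [*cmd, *args],
        input=stdin_data,
        capture_output=True,
        text=True,
        timeout=10,
    )
    return proc.returncode, proc.stdout, proc.stderr

def compare(reference_cmd, candidate_cmd, args, stdin_data=""):
    """Return the names of the observable results that differ."""
    ref = run(reference_cmd, args, stdin_data)
    cand = run(candidate_cmd, args, stdin_data)
    labels = ("exit code", "stdout", "stderr")
    return [label for label, r, c in zip(labels, ref, cand) if r != c]
```

An empty list means the two implementations agreed on that input. In practice the error messages often embed the program name, so a real harness would likely need to normalize standard error before comparing it.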

Jonathan: Yeah, interesting. Okay, so one of the other notes that you've got here that I think is interesting to dive into is this idea of dependencies as security, not vulnerabilities, but attack surface, we'll say, with the idea of supply chain attacks.

You know, we talk about Rust as being potentially hardened for security. I'm curious what you think about this idea of there being a bigger attack surface just because, with using Cargo, you've got so many dependencies in each Rust project.

Sylvestre: Yeah, you have to be careful. So, if you look at the dependency tree of our implementation, we have like 300 dependencies.

Some of them are huge, some of them are tiny, and we know what we are depending on at level one, so that means direct dependencies. So, for example, we are using the nix crate, which is a wrapper around some libc functions. We are using lscolors to do the color management of ls. And we are using the selinux crate to do the SELinux features.

And we know those upstream developers and we trust them. And we know that they are usually very good at doing release management and not taking crappy PRs. And we know that those crates are well maintained. However, with the dependency tree, you have some crates low in the stack that might get compromised at some point.

So at Mozilla, and with Google, we worked on a program called cargo vet, which enables the Chrome team and the Firefox team to audit a crate and to share with others that, yes, that crate has been verified and validated and there is no issue with that one. So there are mitigation strategies, but it's a global issue in tech.

Like we saw that with npm a few years ago, it's really a typical vector of attack, injecting some backdoor into a dependency. And of course the XZ story, the recent one, is an amazing example of a supply chain attack. So yeah, it's an issue.

Jonathan: So that's, that's something I was just thinking about with, with XZ introducing a backdoor into SSH, which is still the craziest story ever.

The fact that you're doing A/B testing between the Rust core utils and the upstream core utils, it's an interesting opportunity to maybe find issues like that sooner. Maybe immediately. Particularly if some of that testing happens on various distros with real installs. And I'm not sure exactly how your CI works.

This might be difficult to exactly get at. SSH obviously is not one of the core utils, and so the SSH daemon is not something that's in scope for the Rust core utils project. But just thinking through this, let's just kind of game this out.

If there was a Rust version of OpenSSH, or even of XZ, let's say, you could write a test that would have feasibly caught this difference in behavior, because that's essentially how it was discovered, right? Was he a Debian developer? Anyway, he was a Microsoft developer.

That's right, he was working, I think, on Microsoft Linux. But anyway, he was playing with SSH and suddenly realized, this is behaving differently than I expect it to. Something was taking longer than it was expected to. Ever since that story came up, what has interested me is this idea of, could you automate some testing to discover that sort of difference? Obviously, if you could automate that and run it on every update, you could potentially find stuff like that a lot faster. And it's fascinating, because you have kind of a black box implementation, but it's supposed to be doing the same thing as these core utils, and you're doing it in Rust instead of C, so it gets you kind of this insulated second opinion. So if someone ever tried to do something like that, and obviously with the core utils it'd be extremely difficult, but still, if someone tried to slip one of those things in there, you guys would likely see it, because you have this huge test suite that you're doing A/B comparisons with.

I think that's really fascinating.

Sylvestre: Yeah, you're making a very good point. I think the attacker on XZ knew that. That's why he disabled some feature in OSS-Fuzz. They knew that fuzzing would be a great way to catch those kinds of errors, and that's why they went to OSS-Fuzz and disabled some checks.

Yeah, good point.

Jonathan: Yeah. But in this example though, again, playing this out, what if there were a Rust version of xz or SSH or whatever? The patch that he sent in, that explained it in OSS-Fuzz, it got accepted, because nobody really stopped to dig all the way into it. Whereas if someone had to reimplement that in Rust, you would have to get to the bottom of it and understand, okay, why is this suddenly doing this?

Why is he making this change? And things would not make sense there. And I feel like it might be an opportunity to catch something like that a lot faster. Which, I've got to admit, it was caught extremely fast. Hardly any distros actually shipped it, and it was not live for very long at all.

But I don't know, it just seems like a very interesting opportunity. Maybe we need more utilities that have a Rust version of them as an insurance policy.

You want to talk about that? I've actually been told that there's an idea that the Rust core utils might include some other applications.

Sylvestre: Yeah, so we have been working on the reimplementation of the findutils, which are quite famous, and which are not part of the official core utils.

And we also started some initiatives to do the same with diff and many of the other tools that are core to the Linux distros right now. So util-linux, procps, bsdutils, hostname, login, and so on. So this one is more for fun. One of the things with those projects is that you know what the target is.

So you don't need to bikeshed about what should be the input or the output; you have a reference. So it's very easy to learn Rust by doing it. It's fun if you are into operating systems like I am. You can work on your own and you just have to mimic what the other software is doing. It's really a great way to learn Rust. It's the way I learned Rust, and I'm sure that many people started contributing to those tools and learned Rust thanks to that.

Jeff: So is the plan to just keep expanding the programs you translate into Rust? I mean, are you looking at taking almost all the commands, you know, you'd run in the shell and eventually, you know, 10 years from now or whatever? Convert them all?

Sylvestre: Yeah, I think it's a, it's a good investment for the future of our industry.

Like, you know, there is a Chinese saying that there are two times when you need to plant a tree: 20 years ago and right now. And I think we are at this point that if we want to have good tech in 20 years, we need to start investing now and start replacing those tools now.

Because I feel that we are starting to be old, but the new generation, will they want to learn C to do some maintenance of those tools? And those tools need to evolve, they cannot stay the way they are. I was looking at the GNU coreutils notes, and I saw that they updated some code for the new glibc.

And there are always changes happening: new architectures, and changes in the kernel, in the libc and so on. So we need to provide better tools for the future generation to access tools.

Jonathan: Yeah. I kind of want to jump in and ask quickly, like what what does that look like? Because some of these tools do have changes that get made to them.

And sort of your, your target is kind of a moving target because of that. Is that, is that a challenge in and of itself?

Sylvestre: It can be frustrating. When the GNU implementation is pushing a new release, we update our CI, and we often see like five tests which were green becoming red, and that's always a bit sad.

You know, more work, or they are making changes, and sometimes I'm looking at the changelog upstream, and I realize that I'm the one who made the upstream change and broke my test in the Rust implementation. So sometimes I hate myself for doing it. That happens sometimes. You know what I mean? But it is exciting.

It is very, I didn't know much about the GNU core utils when I started, but there are still discussions happening on the mailing list about adding new options, and changes that are made, so they are living software, so it's healthy.

Jeff: So, okay, the core utils are going to Rust.

Other utilities are going to go Rust. There's a lot of talk, you know, about Rust in the kernel. So with this whole Rust ecosystem, how are the core utils going to integrate into that? Like Nushell, for example, how does that all fit together?

Sylvestre: So the Nushell folks started contributing to our tools because they want to be able to plug themselves directly in Rust into that, and not do some system call, a regular call to the binaries that are provided by GNU.

So they want to be in the same memory space, so they started splitting some of our tools to provide an API for them. So there is more and more integration with Nushell, which is a fascinating example to me. I was very pleased when I saw that because, you know, when you do software, you see people using your software in an unexpected way.

And that one was amazing to me. Then for the kernel: the Rust ecosystem is well designed, so with crates and Cargo, we are shipping some new crates to do some self-contained things. So, for example, the selinux crate was started by someone who contributed to our project to introduce, I think it was cp or chcon, some SELinux feature, and he decided to create a crate, and that crate is used by many other software projects.

We are trying to split our work so that others use it.

Jonathan: Okay, so I am, I'm curious, you said earlier that you've got sort of a good working relationship with the upstream core utils guys. Is there a future where the Rust core utils becomes more official? You know, at some point in the distant future, are the GNU core utils going to be the Rust core utils?

Is, is this something that could happen?

Sylvestre: Not anytime soon, maybe at some point, but for now they are the gold standard. Everybody uses, well, every Linux distro uses that implementation. And as far as I know BSD and Mac are following what they are doing in general, in terms of options. So they are still the gold standard.

Maybe at some point that will change, but not anytime soon.

Jonathan: So there was something else I was going to ask. Are there any distros or projects that are shipping the Rust core utils by default? I could very much imagine a Rust-centric distro, like if it doesn't exist, maybe it needs to, that uses the Rust core utils by default.

Sylvestre: There is one, I forgot the name, but there is a Linux distro based on Rust that is using our tools. I don't remember if it is Redox or something like that, but there are people using it already as a basis for the operating system. And the one that I was mentioning earlier for cars, I think it's called Apertis.

I don't know much about it, but they are using it and shipping by default.

Jeff: Mm hmm. Very cool. So, if you're doing a lot of that, replacing core utils and the other stuff, are you putting GNU out of a job? Is there any animosity or any kind of,

Sylvestre: no

Jeff: friction?

Sylvestre: With Jim and Pádraig, we exchange email often.

So they are very friendly with us. And I love what they are doing. I have a lot of respect for those folks. So there is no tension on that front. For the FSF, I have no idea. I haven't received an email from Stallman yet. Maybe I will after that call, that meeting.

Speaker 5: That would be, that would be fun.

Let us know what he says. He has the most interesting opinion on things.

Jonathan: All right. So let's see, is there, is there anything, any problems that you've run into in the process of doing this that were unexpected, any really difficult problems you can tell us about?

Sylvestre: Yeah, sometimes it's interesting to understand why a developer decided to implement that function this way, or that argument. And sometimes I wish I had a time machine to go back like 15 years and tell that developer, you should not do that this way, you should do that this way. Some of the issues that we often see: if the software is well designed and you have two options doing the opposite, conflicting with each other, you get some error message.

But sometimes you don't have that, so sometimes you have conflicting options and only the last one is going to be used. And you have sometimes some lack of consistency in the GNU coreutils, and it's probably going back to the Unix time. And those things are hard, and sometimes you're like, oh, I wish I had done that differently, because it makes our code uglier than it needs to be sometimes.
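The two behaviors described above, rejecting conflicting options versus silently letting the last one win, can be sketched with Python's standard `argparse` module. This is only an illustration: the `demo` program and its `--verbose` and `--quiet` flags are hypothetical and do not correspond to any particular GNU tool.

```python
import argparse

def last_wins_parser():
    """Both flags write to the same destination, so the last one given wins."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--verbose", dest="mode", action="store_const",
                        const="verbose", default="normal")
    parser.add_argument("--quiet", dest="mode", action="store_const",
                        const="quiet")
    return parser

def strict_parser():
    """Same flags, but declared mutually exclusive: using both is an error."""
    parser = argparse.ArgumentParser(prog="demo")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--verbose", dest="mode", action="store_const",
                       const="verbose", default="normal")
    group.add_argument("--quiet", dest="mode", action="store_const",
                       const="quiet")
    return parser
```

With the first parser, `--verbose --quiet` yields `quiet` and `--quiet --verbose` yields `verbose`. With the second, either combination makes `argparse` print a usage error and exit. A compatible reimplementation has to reproduce whichever of the two the reference tool happens to do for each pair of options.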

Jonathan: So what does the timeline look like? At what point are you going to be able to say, okay, the core utils are done, or as done as they can be, considering that code is still getting written for the upstream core utils? If you kind of look at the progress that you're making now and extrapolate that out, are you six months away from hitting a hundred percent on the tests, or is all of that five years away?

Like, where do you think you're at?

Sylvestre: I think on the main tools, because, let's be honest, you use 20 or 30 percent of the core utils. There are many tools that you never use. Like the topological sort, nobody uses it most of the time. I'm sure that someone is going to say, yeah, I'm using it often, but I never use it in real life.

Speaker 5: Central to my workload. Yeah. Yeah. Yeah.

Sylvestre: Yeah. Yeah. Yeah. So you have tools that you rarely use. So those ones are going to take longer, but it is our goal. I think we can be 100 percent compliant with the main tools within a year or something like that, maybe two years.

Jeff: But yeah, awesome. Oh, that's soon.

Sylvestre: Well, the GSoC helps. Like, having someone to fix all the ls corner cases, it's very helpful with the corners and so on.

Jonathan: Yeah, yeah, that's true. That's true. Okay, is there anything that we did not ask you about? I know this is a hard question. Is there anything we didn't ask you about that you want to make sure and let folks know about?

Sylvestre: Well, what should you do to contribute? We have good first bugs. We are four maintainers who are spending way too much time on that project. We can help with mentoring. And as I was saying earlier, having a reference implementation makes your life significantly easier. We have a test suite that runs in less than a minute.

It's very, very fast to run the whole test suite. GNU takes longer, like 15 minutes to run the tests, mostly because they are using a lot of scripts to run the tests and a lot of different namespaces and memory. For us, it's the same memory space, and so it makes things way faster. So it's very easy to run.

You know very quickly if you are regressing the tools, and we have a lot of trust in our CI. I think the code coverage is like 85 or 86 percent currently, so it's amazing. And that makes your life significantly easier as a developer when you want to start hacking on that software. So contributing is very easy. If you want to learn Rust, it's one of the easy projects to start with, because the tools are very self-contained and there are not many dependencies, not like starting to contribute to Firefox or Chrome or something like that.

Jonathan: So if somebody wants to learn more, where are the places to go to? If I want to jump in and do some work, but I have questions, is there a forum or an IRC or a Discord? Where do you want to send folks to find out more?

Sylvestre: I wish it was on IRC, but it's on Discord.

I'm part of the old guard. But yeah, it was before my time and the community was already there. I don't want to be the old guy saying to the young people, you should use IRC.

Jonathan: I would imagine that there is a way to bridge IRC and Discord. Please pop out your notepad. I don't think so.

Jeff: Well, I'm just thinking, you're saying old guard, I'm the only one here with white on the beard, so, you know.

Speaker 4: I shaved, that's why.

Jonathan: My white is all on top. Okay, so last couple of questions then that we are required to ask everybody, and that is, what is your favorite scripting language and text editor you spend all day in?

Sylvestre: So text editor, depending on what I do. If I'm on a server, I'm going to use Nano. If I need an application that starts quickly and doesn't use 20GB of RAM, I'm going to use Emacs.

And if I'm doing Rust code, I'm going to use VS Code with RLA, so the Rust tool. But that one is using way too much memory, in my opinion. But yeah, it really depends on what I'm trying to do and how much time I'm going to spend, I think. Yeah, and scripting language. I love Bash. I love Bash. I know that you interviewed the Bash author.

So to me, it's a scripting language. It's ugly, but I love it. And Python. I love writing Python also.

Jonathan: Yes, yes. Did you catch the interview we did with Pawel about Amber? Sort of a better Bash language? That one was really interesting. I enjoyed that one a lot.

Sylvestre: Yeah, definitely. When I was listening to him, I was like, yeah, it rings a bell.

Jonathan: Yeah, I'm looking forward to the day when we bring somebody on that's not associated with that project, and they're like, oh yeah, Amber, it's great. It'll be fun. Alright. Thank you, sir. Thank you so much for being here. It was a blast to learn more about the project.

And you know, maybe in six months or a year or so, we'll have to bring you back on to talk about what's changed.

Sylvestre: Sounds good.

Jonathan: Alright. Thank you. Okay, so, what do you think?

Jeff: I think it's awesome. I, you know, With the, with the new language, forward thinking and all the fuzzing and everything and just, you know, and even better defining some of the ways that some of the tools handle, you know, like you said, conflicting switches and, you know, just kind of, kind of cleaning up.

I mean, I think it's awesome.

Jonathan: Yeah, I wonder, and of course now, this is some staircase humor, as it were. I should have asked about this during the show, and we can ask about this next time. But I wonder if there's a future where you can put the Rust core utils in one of two modes.

Like, a bug for bug compatible mode, or a clean up some of the weird stuff mode. Because, you know, like you said, things like the different handling of conflicting switches. It sounds like for now they are, I think specifically you should call that misfeature for misfeature, because they're not bugs. But I could see a future where maybe at compile time or install time you say, you know, turn on the extra candy stuff and fix the old stuff.

But like, just the ability to have a progress bar in the copy command. Like, that's great. I so want that. I sort of want to install the Rust core utils and start using them just for that. Because that drives me nuts. And of course, there's workarounds. There's ways to handle that. But that's, yeah, that's really cool.

Jeff: I've done the DU, like you mentioned, just to go, is this thing still working? What's going on? Let me see, you know, and Oh yeah, it's still going. And you just.

Jonathan: I don't remember if it's the cp command, but it's one of them where the official way to get a progress bar is you send it a system signal.

It's not SIGKILL, but it's one of the other ones, like SIGUSR1, I think. You send it this signal, and it'll tell you what percentage it is. And so you have to open up a second terminal, and you can set up a watch command with a killall and then that signal. But it's just so clunky, it's like, why?

So I'm glad somebody's come along and done that.

Jeff: Yeah, I've even tried that, what you're talking about. I've gotten the bar before, but it was, oh my gosh, you know, I was cutting and pasting out of a guide and like, oh man. Just start it and walk away. It's just less aggravating.

Jonathan: Yup. Oh, it's great. It's great. And we'll have to have the actual upstream core utils project on. See if we can get those guys on, because that'd be a lot of fun to chat about too. We can ask them why there's not a progress bar in cp. Come on.

Jeff: There's 53 years to get it right. What's going on here?

Jonathan: Oh, that's great. All right. You have anything you want to plug before we let folks go?

Jeff: The only thing is check me out, and Jonathan as well and other co-hosts, over on the Untitled Linux Show on the twit.tv network.

Jonathan: Yep, absolutely.

Jeff: We have a lot of fun over there. So, definitely, definitely want to see people over there in a very similar kind of vein as this show.

Other than that, that's all I got. Just thank you for having me on and always a pleasure and had a great time.

Jonathan: Yeah, thanks for being here. Alright, so I will let you know that the plan for next week is to talk with Jay Khatri about Highlight.io. That's going to be a lot of fun. We are recording on Tuesdays.

It's 11:30 Central Time, my time, and we stream off to YouTube. So make sure you go and follow the Floss Weekly YouTube channel, where we are now doing the video interviews as well. We finally got that workflow worked out. And so you can catch the video version there if you want to, or just stick with the audio, you know, whatever, it's up to you.

As far as things for me to plug, I will mention Hackaday. We've got the security column that goes live every Friday morning, and lots and lots of stuff to cover there. Other than that, we sure appreciate you being here. Everybody that watches this live, those that catch us on the download, keep it up!

We'll see you next week on FLOSS Weekly.
