Jonathan: Hey folks, this week Ben Meadors joins me and we talk with Darko Fabian about Semaphore, a recently open-sourced continuous integration and continuous development platform that's got some neat tricks up its sleeve. You might want to check it out, so stay tuned. This is FLOSS Weekly, episode 825, recorded Tuesday, March 18th.
Open source CI with Semaphore.
It's time for FLOSS Weekly. That's the show about free, libre, and open source software. I'm your host, Jonathan Bennett, and today we're talking about continuous integration and continuous development, and I've got a co-host that I share some battle scars with over this exact topic. Ben Meadors, welcome!
Ben: How's it going, Jonathan?
Jonathan: Hey, it is great, it is good to have you here. This is the first time that Ben has been on as a co-host. He's been a guest twice before, I think? And so when we got the guests lined up for today to do the CI/CD thing, my mind immediately went to Ben because, as I said, we have wrestled mainly the GitHub CI/CD system.
And, you know, we were, we were joking before the show. It's like, when you have 27 force pushes in a row to your Git, that's probably what it is that you're fighting with. So.
Ben: Yeah. And it's always the step at the end of the workflow that you're trying to fight.
Jonathan: You got to wait 45 minutes, 30 minutes.
Yeah, exactly. Well, our guest today has something to say about this. Maybe he has some solutions for us. It's Darko Fabian, and let's go ahead and bring him on. Let me push the right buttons to make that happen. Darko, welcome. Welcome to the show.
Darko: Hey, guys. Good to be here.
Jonathan: We are glad to have you.
So, you've got a company, Semaphore, that is sort of in this space. And let's sort of start with that. What's the background of Semaphore, and what does it have to do with CI and CD?
Darko: Yeah. Well, I think it started in 2010 or so. Mm-hmm. Maybe 2009. We were essentially a small, six or seven people, Ruby on Rails consulting shop working for, you know, various kinds of startups.
Those were, I would say, in some ways the golden era of SaaS. So whatever you had on your desktop or, you know, somewhere, you kind of ported it into the cloud. At the time, we were, you know, embracing TDD, BDD, all that good stuff. We just needed a CI, and as such a small team, we all wanted to, you know, just write code and deploy features and all of that.
That was valued, you know, for our size and the nature of our business. So we installed, you know, Jenkins, and the experience was initially amazing. It's super easy to install. But the problems kind of started later, when we needed to scale it and manage it. And the other services that appeared in those days were, you know, GitHub and Heroku.
And GitHub and Heroku, as you guys know, had great UX. And Jenkins was not on that level. So, yeah, what we did is just, okay, let's take that, you know, tiny bit of what we needed and bring it to the cloud. And I would say the main innovation was to say: you connect to GitHub, you click next, next, next, and you can use as many of those agents, you know, that will build your code, and you don't have to worry about them.
And it just turned out that, if you call that bit innovation, I don't know what's the threshold for calling something innovation or not. It changes over time. But that was enough. That was enough for the Ruby on Rails community to embrace us. And that was our beginning. There are other interesting technical challenges and stories along the way, but yeah, we can touch upon them later.
Jonathan: Sure, and absolutely we will. I know we've got questions about the technical thing. I want to jump right to the new news, and that is that you guys are releasing this to the world, or at least parts of it, as an open source project.
Darko: That's correct. So I would say about a month ago, we released Semaphore Community Edition under the Apache 2 license.
It has the vast majority of the features of our cloud offering. More things will, you know, go into the Community Edition. We just have to, you know, prepare the code and all of that. And yeah, there is also an Enterprise Edition, which has existed for a couple of years now, which has all those, like, RBAC and various compliance and security stuff, which is not in the Community Edition.
We are trying not to be too creative, at least in the beginning, with this and do what others have done, and some achieved success, some not, but, you know, the open core model. Yeah, in a nutshell, that's the story so far.
Jonathan: And I know it's only a month in, but how is it going? Have you had a mass exodus of existing customers saying, oh, we're going to host it ourselves? I don't imagine that's going to be the case. Have you started getting pull requests back where people fix things that bug them?
Darko: Well, to start with the first observation, maybe: no exodus, you know, so far.
I mean, when it comes to our cloud customers, we actually don't charge, you know, there's no per-seat cost, so what people are paying for is the resources they use, and the software kind of comes for free. And our customer base in that realm is, you know, just people who want something that works and they don't want to manage it.
They're in that mentality that I described as we were. So it's a bunch of developers that want to do as little ops as possible. So, I mean, there might be some people that, you know, for various reasons, might end up going that route. Our risk estimate in that area is that that's not going to, you know, hurt us in any major way.
And the same thing goes for the customers that are using our Enterprise Edition, where we provide, you know, 24/7 support, we set SLAs in the contracts and all the, you know, security and all the other stuff that they need. And those big companies just want to, you know, manage risk in the best way that they can.
And one of the ways that they manage risk is that they have our, you know, 24/7 support and, you know, there's also a certain license clause. But yeah, those are the two segments. But moving to the next piece. So I mean, in terms of the reactions, as you guys know, our main audience are developers, DevOps folks, so when they heard that something is going to be open source and they can, you know, fix something that they hated for years and we did not fix it, and it's maybe just like a CSS, I mean, the first contribution that we got was something in the realm of, like, what is a selectable element.
So they have a need sometimes to select part of the UI, and they don't want that bullet point, you know, they just want the name or something. So it's, I dunno, it's something non-selectable. So that was the first thing. And yeah, the reaction from the whole customer base was very positive.
Mm-hmm.
Jonathan: Yeah. Well, I mean, you've got, I assume it's github.com/semaphoreio/semaphore. That's the main repo. You're up to 855 stars. You've got 27 forks and it looks like 10 contributors. I'm sure not all of those are from Semaphore. You've got a couple of releases out. Like, for being a month into this project, this looks really healthy.
Like this looks really good.
Darko: Yeah, thanks. To be honest, I didn't have time to really benchmark us against, you know, other people, you know, going open source. I mean, there is a difference. Obviously Semaphore is a mature product, you know, which is going open source, which is unique.
I was searching a lot, you know, to find which other crazy folks, you know, took that route, and there are very, very few that I could, you know, identify.
Jonathan: Yeah, there's a few of them out there, but it's not what usually happens. Yeah. So let's dive in a little bit to, like, some technical questions about the CI itself.
Ben, do you want to kick this off? I know you've got some.
Ben: Yeah. Yeah. And actually one that just came up: you mentioned, you know, kind of getting started. I think you said Ruby on Rails was kind of where you cut your teeth. Can you talk a little bit about the stack that Semaphore runs on?
Is that a Ruby on Rails application or is it kind of a mix?
Darko: Yeah. Well, it started as a Ruby on Rails application. And it was a Ruby on Rails application maybe until 2017 or something like that. Essentially, just a bit of history: after, you know, starting the product and getting the traction, we literally spent maybe four years or so just, you know, scaling and making it more reliable, because you make it reliable for a hundred customers, you know, that can run, I don't know, whatever number of jobs in a day.
And then a couple of months later, you are at maybe an order of magnitude, you know, higher volume of just work that you are processing. And it turns out that your state machines and the patterns that you use and all of that are just not, you know, made for that scale. It literally took quite a while.
And reliability is the number one attribute of a good CI/CD system. If, you know, it doesn't run, it doesn't matter how pretty it is or all the features that it has. And as we kind of just wrapped that up, you know, and it was supporting customers at the decent scale that they needed, it turned out that it lacked some, you know, major features that bigger customers wanted: job dependencies, fan-in, fan-out, you know, all those kinds of things.
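As an illustration of the fan-out and fan-in dependencies mentioned here, the following is a minimal sketch in Python, not Semaphore's scheduler. The job names and the `execution_waves` helper are hypothetical; only the standard-library `graphlib.TopologicalSorter` is real. It groups jobs into waves, where every job in a wave has all of its dependencies finished and could run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a job, each value is the set of jobs it depends on.
PIPELINE = {
    "build": set(),
    "unit_tests": {"build"},         # fan-out: three jobs depend on "build"
    "integration_tests": {"build"},
    "lint": {"build"},
    "deploy": {"unit_tests", "integration_tests", "lint"},  # fan-in: waits for all three
}

def execution_waves(dependencies):
    """Group jobs into waves; jobs in the same wave can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished, unlocking dependents
    return waves

WAVES = execution_waves(PIPELINE)
```

With this input, `WAVES` contains three waves: the build, then the three checks side by side, then the deploy.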
And at that point in time we had a bunch of struggles with, you know, running everything as a monolithic application. The main reason was that you have certain components that need to be very, very stable, you know, and you don't want to shake them a lot because of a UI change or dependency change or whatever.
That would be the scheduling algorithm, that would be the component which is receiving the hooks from GitHub, and the component which is just talking to the agents which are running the jobs. So then we initially started running that monolithic application as, like, multiple instances.
One would just boot a router for accepting GitHub hooks. And, you know, maybe GitHub, you know, during the last decade went down in a somewhat major way a couple of times, and then they DDoS that component. And then to scale that component, you have to scale the whole Ruby on Rails app, which has a significant memory footprint.
So going back to the original question, in 2017 we made a, you know, decision to do, like, a rewrite to support all those fancy things that we did not have, like a highly customizable CI. And on one hand, the hype of microservices was alive. On the other hand, we had that experience already of being chopped up a bit into a couple of services; at that point in time, maybe we had, you know, seven or eight.
And we liked the pattern, and then we chose Elixir. So Elixir is now something like 80 percent of our code. And when we launched that 2.0 version, we were a Ruby on Rails monolith, which had maybe 30 percent of the original code base, plus roughly 15 services that we were running. So I would still classify it as a highly risky move, a rewrite.
Yeah, I would not recommend that to many people. I think it worked for us mainly because we were a couple of years in, we knew the patterns that would work for us, and it ended up working for us. But
Ben: Yeah, that's an all too common story I've heard of, you know, kind of outgrowing the Ruby on Rails stack and pivoting to Elixir, where you have a little bit more inherent scalability, so that's encouraging to hear that you guys were able to kind of move over to that and solve a lot of those problems.
Darko: Yeah, I haven't spoken about the global interpreter lock and the concurrency and parallelism and, you know, those kinds of fights that you get with Ruby that come for free. But yes, it is there. I haven't had the chance to connect yet with José Valim, the creator of the Elixir language.
And I want to hear: is Semaphore maybe the biggest, you know, open source Elixir Phoenix application? So, yeah, maybe it is.
Jonathan: It might be. One of the things that I know GitHub has struggled with, with its CI/CD, or maybe it's more accurate to say users of GitHub have struggled with, is keeping those things secure.
Particularly when you've got instances like, someone has sent us a pull request, we're going to automatically kick off a CI run that uses that pull request, and so essentially you're going to run some untrusted code on your CI/CD service. How does Semaphore handle that? How do you avoid getting owned every time somebody sends a PR and it kicks off a run?
Darko: Yeah, I mean, there are two features that I'm familiar with, there may be something else that I don't know. So I think that the first one that we implemented is that there is a radio button or a checkbox to disable runs from forked repos. That was the first. After that, there was a feature of adding a whitelist.
So I could take maybe GitHub handles from you guys and I can add them to the whitelist. So for other people, the CI will not kick off unless, you know, you are in the whitelist. And I think the last iteration of that in terms of the UX is that, if someone opens a pull request from a fork, you can go and put a comment, /sem-approve.
Which would be kind of a manual approval for a random person from the internet to run the CI job. And it seems that it covered most of the corner cases for a lot of people.
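The three layers described here (a switch that disables fork runs, an allowlist of handles, and a per-pull-request approval comment) can be sketched as one decision function. This is a hedged illustration in Python, not Semaphore's implementation: `ForkPolicy`, `PullRequest`, `should_run_ci`, and the `trusted_commenters` field are hypothetical names, and the approval comment is written as `/sem-approve` following the spoken "slash sem dash approve".

```python
from dataclasses import dataclass, field

APPROVE_COMMAND = "/sem-approve"  # approval comment as described in the interview

@dataclass
class ForkPolicy:
    # Hypothetical settings object; the field names are illustrative only.
    allow_forks: bool = True
    allowed_authors: set = field(default_factory=set)
    trusted_commenters: set = field(default_factory=set)

@dataclass
class PullRequest:
    author: str
    from_fork: bool
    comments: list = field(default_factory=list)  # (commenter, text) pairs

def should_run_ci(pr, policy):
    """Decide whether a pull request may start a CI run."""
    if not pr.from_fork:
        return True   # branches inside the repository are trusted
    if not policy.allow_forks:
        return False  # fork runs are switched off entirely
    if pr.author in policy.allowed_authors:
        return True   # author is on the allowlist
    # Otherwise require an explicit approval comment from a trusted person.
    return any(
        commenter in policy.trusted_commenters and text.strip() == APPROVE_COMMAND
        for commenter, text in pr.comments
    )
```

The design point is that the approval has to come from someone the project trusts: the same comment written by the fork's author does not unlock the run.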
Jonathan: Yeah. Yeah, it's something that, for the longest time, people didn't realize was going to be a security problem, right?
And then suddenly it was discovered that our CI has all of these, they've got tokens, they've got GitHub tokens. Sometimes they have AWS tokens, they have all this stuff. And it kind of was thrust on the scene that those tokens are sometimes accessible in ways that you don't want them to be.
And so I know places like GitHub have done a lot of work to try to make that secure. We got an email not too terribly long ago that GitHub was made aware of a problem with one of the actions, because on GitHub there's this whole community of actions that individuals have written, and one of those actions got maliciously modified to have some code added to it.
And so we then had this issue where it's like, you didn't accept code intentionally that was malicious, but because, you know, you may have used this action, you may have run it. And so, like, that's a security problem that has a lot of hair, and we're really still trying to figure out how to deal with all of that.
Darko: Yeah, I would classify it as, I mean, a somewhat typical open source security problem. As there are so many dependencies that are out there, maintained by, you know, various kinds of people, and we have these, you know, takeovers of, how it's called, xz. Yep, okay, which is on a completely different scale.
Maybe not the best example to use, but there are just thousands and thousands of dependencies, and tracking what's in all of them is a major
Jonathan: So, to go a different direction: we've talked a lot about GitHub, because that's obviously where a lot of people are at. And I do want to ask about the GitHub integration. But I first have to ask, what about GitLab and some of the others? GitLab's not the only other one out there. But is there any Semaphore support for some of those other
services that are not GitHub?
Darko: We do have Bitbucket support and GitLab support.
Jonathan: Okay. And so, like, what does the integration look like? Let's just take GitHub, for instance, because I feel like that's what most people are going to be most familiar with. Does it just sort of drop in and replace all of GitHub's CI
stuff so that you don't have to mess with it? Or does it use some of the same interfaces? Like, what's the user experience like?
Darko: Well, once you, like, have an account, let's say on Semaphore Cloud, and you want to add a repo, you want to connect the repo, there is, like, a GitHub app that needs to be installed.
Mm-hmm. You kind of initiate the process on the Semaphore side; it, you know, kicks you to GitHub. You have to select the repos to which you want to give Semaphore access, or all of your repos. And then you authorize that, and then you go back to Semaphore, and Semaphore, through the API, can list your repositories. You pick the repo in which you want to configure Semaphore; then it's a single-click action.
And in the next step, you need to essentially choose one of our templates for, you know, your CI process, you know, some YAML, maybe play around with it and click next, and you have a first failed build.
Ben: One feature I was kind of interested in, when I was looking through the docs on Semaphore, is I saw mention of an interactive debug with SSH, and that sounded particularly intriguing from my experience that Jonathan mentioned earlier of constantly force pushing to try to kick off a failed feedback loop in a CI/CD pipeline, and trying to figure out what's going on with the constant print of, like, an ls command or something to try to figure out why some file isn't there.
Can you talk a little bit about that feature?
Darko: Absolutely. Yeah, it comes in two flavors. There is a CLI tool that we have. And essentially, if you go to a failed job, there is, like, an icon that you click, you know, debug, and you copy a line, which does sem debug and that job ID. You paste it in your terminal.
And what Semaphore is going to do is spin up the VM or a container, however the job is configured, for you, and it's going to export all the environment variables, credentials, and all of that into that environment. And it's going to SSH you into that. So what would end up happening is, after five seconds or so, you would be SSHed into that box.
And then there is, like, a script, like commands.sh, where we export all the commands that were configured for that job, and you can just kick off that script and it would just run all the commands that were meant to be run there. Or you can, you know, start manually and call the checkout command to check out the code and then crunch slowly through your commands.
And it is extremely useful for those situations that you are describing. And it comes also in another flavor, which is sem attach. So let's say you have a job which should finish in five minutes, but let's say it's 30 minutes in and it's still hanging, let's say hanging on some dot from your test report formatter, and you have no clue what it is. Then you can do a sem attach and give that job ID, and it's going to give you SSH access
into that job. You would have those processes running or being stuck or whatever. And this is the moment where your hardcore sysadmin skills, you know, kick in. The first thing that I suggest to people is that they do strace on the ID of that process which is running, I mean, check out the logs first, but if it's messier than that, then you do the strace there.
And you can maybe see that, if it's some browser-heavy test, it's maybe polling on some external network resource in a loop, or there is a lock, you know, you are waiting on some SQL query which is now never going to end, or those kinds of things. And yeah, I have to say that we kind of built those features more for ourselves than for the customers.
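To make the debug session described above more concrete, here is a rough Python sketch of the idea of turning a job's configuration into a replayable shell script, similar in spirit to the commands script mentioned in the interview. It is an assumption-laden illustration, not how Semaphore generates its script: `render_debug_script`, the environment variables, and the command list are all hypothetical.

```python
import shlex

def render_debug_script(env, commands):
    """Build a shell script that exports the job's environment and replays its commands."""
    lines = ["#!/bin/sh", "set -e  # stop at the first failing command"]
    for name, value in sorted(env.items()):
        # shlex.quote keeps values with spaces or shell metacharacters intact.
        lines.append(f"export {name}={shlex.quote(value)}")
    lines.extend(commands)  # the job's commands, in their configured order
    return "\n".join(lines) + "\n"

# Hypothetical job configuration, for illustration only.
SCRIPT = render_debug_script(
    {"RAILS_ENV": "test", "DB_URL": "postgres://localhost/app test"},
    ["checkout", "bundle install", "bundle exec rspec"],
)
```

Inside the SSH session, one could either run such a script end to end or paste its lines one at a time to see exactly where the job starts to misbehave.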
Well, that's the way you imagine support.
Jonathan: That's the way the best features happen, though. That's right. That's the way the best features happen. Goodness, I tell people all the time, I'm good at X, Y, or Z because I've been bad at it for so long and had to figure out how to make do, right?
It's the same thing. The features that you dog food are absolutely the ones that people get the most use out of.
Darko: Yeah, it's in the realm of like, if it hurts, do it more often. Yes!
Jonathan: Oh, that's so true. What about architectures? What architectures do you support? So one of the things that we've run into is trying to do ARM64 builds on ARM64 runners in CI/CD.
And that's just the tip of the iceberg. You know, there might be people that want to do old school PowerPC, MIPS, RISC, all kinds of stuff out there. What does support for that look like?
Darko: I mean, what is supported is, you know, Mac, any flavor of Linux, and Windows, out of the box. And then there is support also for ones which fall into that category which you mentioned, of, mm-hmm,
like, you know, ARM64 and so on. The agent has actually been open source for many years now. And, you know, people can compile that. I mean, I remember that some people were battling with some, how it's called, AIX. Is that the IBM Unix, right?
Jonathan: Yes.
Darko: That, that's the most exotic thing that I remember.
Plus people were, I think, adding FIPS support for, you know, encryption for the agent.
Speaker 5: Mm-hmm.
Darko: And that's in the realm of exotic, you know, platforms that I'm aware of. But there are many others.
Jonathan: Well, I guess that is one of the huge advantages that you have over GitHub: if that agent is open source, then somebody can just come along and run it on just about whatever platform they want to.
Obviously there might be some technical issues you have to work through to actually make that work.
Darko: But actually, that component is, you know, written in Go. Just to be fair to all the Go folks, there is some Go. I mean, yeah, forcing people to run the Erlang VM on their agents would not be
Jonathan: We interviewed some of the people behind Erlang, and I think specifically the Nerves project.
And that thing is interesting. That's a cool project.
Darko: I have nothing against the Erlang VM. I think the Erlang VM is great. It's just that you cannot beat that single binary, which is, you know, 5 megs or so.
Jonathan: Yes. Yes, it is. It is a different way. It's a different way to look at the world.
Speaker 4: No.
Jonathan: All right. So do you know of, or do you have, people doing crazy things like running this on a Raspberry Pi? Is that sort of in scope, or do you know of anybody doing that?
Darko: I mean, for us, you know, when we started, open source was owned by Travis CI. You'll probably remember those guys.
They're kind of out of the scene now. And then, when it comes to the open source community, you know, GitHub Actions, just by the nature of where the code is, took over the ownership of that, and those kinds of, like, super cool, geeky, fun projects unfortunately never belonged to us in that phase.
So Raspberry Pi is not, but on the other hand it was great for us that, you know, various startups in, you know, various domains did use us for their commercial work. And I would say there are, like, a couple of interesting things here. So let's say Tigera, a company that makes Calico, like a networking layer for Kubernetes.
They need some very custom, you know, networking stuff to be accessible for them to be able to run all that. So that's one of the projects where such capabilities are needed. There are now a lot of people that have different kinds of GPUs and those kinds of things, that also have a lot of data that's sitting somewhere on some very expensive NVMe drives, and then run those kinds of workloads there.
So I would say it's more kind of that, you know, there is a lot of exciting hardware that people use, but it is, um, more in the commercial setting. And, you know, so for Raspberry Pi, no one reported it. I don't know.
Jonathan: I could just tell you, from knowing the open source community and having been around for a while now,
it's coming, right? As this gets popular outside of the enterprise, somebody is going to be like, I've got an old Raspberry Pi. Now, it's probably going to work best on, like, a Raspberry Pi 5 with an NVMe, but I can guarantee you, because in the communities I've been in, I've had this happen.
Somebody will come along and say, I got it running on a Raspberry Pi 1 or the Zero, the original Zero or the Zero 2. And they'll come along and tell you that and you just kinda scratch your head. Like, I didn't even think that was possible.
Darko: Yeah. We are looking forward to that. I mean, honestly speaking, we were missing that, you know, kind of fun element.
Yeah. That was initially very present, but you have to be glad that you're getting these bigger and bigger customers, you know, it makes your whole story more viable. Yes. But on the other hand, those guys have very clear KPIs, you know, and OKRs and all of that.
And it's more like a CTO comes and says, we want our build to finish in under three minutes. And it needs to be parallelized across 60 jobs, you know, in parallel, to achieve that. And then, okay, let's scratch our heads. Let's help those people make that happen. So there is that element of, like, fun is sometimes replaced by serious stuff.
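One common way to approach a target like "finish in under three minutes across N parallel jobs" is to split test files by their recorded durations. The sketch below is a generic greedy heuristic (longest file first, always onto the least-loaded job) written in Python; it is not a description of how Semaphore splits work, and the file names and timings are made up for illustration.

```python
import heapq

def split_across_jobs(durations, job_count):
    """Assign each test file to the currently least-loaded job (greedy LPT heuristic)."""
    heap = [(0.0, index) for index in range(job_count)]  # (total seconds, job index)
    heapq.heapify(heap)
    buckets = [[] for _ in range(job_count)]
    # Longest files first; ties broken by name so the result is deterministic.
    for name, seconds in sorted(durations.items(), key=lambda item: (-item[1], item[0])):
        load, index = heapq.heappop(heap)
        buckets[index].append(name)
        heapq.heappush(heap, (load + seconds, index))
    wall_clock = max(load for load, _ in heap)  # the slowest job sets the build time
    return buckets, wall_clock

# Hypothetical timings in seconds from a previous run.
DURATIONS = {"a_spec": 90.0, "b_spec": 60.0, "c_spec": 45.0, "d_spec": 30.0, "e_spec": 15.0}
BUCKETS, WALL_CLOCK = split_across_jobs(DURATIONS, 2)
```

Here 240 seconds of tests end up as two jobs of 120 seconds each. The heuristic is not guaranteed to be optimal, but it shows why the wall-clock time of a parallel build is set by its slowest job rather than by the total.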
Jonathan: Yes. Yes. That's the danger of being enterprise first and trying to bring the open source along, but it'll get there. You catch the attention of the right people and they bring the fun element with them.
Darko: Yeah. I mean, originally we were not enterprise first, you know, originally we were like, you know, small folks just trying to make their SaaS thing, you know, go to the next level.
And then it was like, literally we had a lot of those customers who were maybe five, 10, 15 people for the first couple of years, and then something takes off. And then in 18 months, they're like 150 or 200 engineers. And yeah, that's how it translated into that realm. And then over time it just kept going in that direction.
And now the biggest team that is using Semaphore is 1,800 people. You probably know the folks, um, Confluent, behind Kafka.
Speaker 5: Mm hmm.
Darko: And we just entered that space over time. And I have to say that it was a good thing for us, although, you know, it was a struggle to achieve it, because as GitHub Actions, you know, came along, they did cut off that, you know, layer of those five, 10 people, you know, small folks for whom GitHub Actions, you know, is enough. It's enough up to a certain scale.
Sure.
Ben: To that point, what is kind of the big value add of Semaphore? You know, if I'm an open source project on GitHub, at what point can you kind of point to the key feature set that Semaphore has, that it can do above and beyond what GitHub Actions does?
Darko: In the open source world,
I don't think that there is, like, a single huge thing for open source people. You know, these debug features, you know, being able to manage who can run the fork pull request and so on. I mean, builds of the vast majority of open source projects tend to be pretty simple.
You know, if it's like most of the libraries, it's relatively simple. For the biggest projects, I agree, you know, someone has to compile Linux and, you know, run all the little stuff and so on. In the commercial realm, it's maybe a bit different, in that you are iterating on a bigger project with a larger team.
There are these debugging features that we spoke about. Then there is a what-you-see-is-what-you-get, like, visual builder. You don't have to write the YAML code. You can draw your pipeline, and it generates the code for you in the background. And I don't know, if you have, like, a 50-people team, there are, you know, a couple of CI geeks that, you know, love to mess with that stuff.
But the vast majority of engineering doesn't want to learn your DSL and, you know, how do you run, you know, conditional expressions for whether something's going to run or not, and so on. And I would say that the biggest other component is that, you know, continuous delivery component, which for us has always been kind of a first-class citizen.
So promotions, which can, you know, trigger manually or automatically; you can model your deployment process, end-to-end tests, and so on. So those are the big differences compared to GitHub Actions. It's not that it's impossible to do that in GitHub Actions, but it's definitely not a first-class citizen.
You cannot see your deployment environment, you know, in such a nice way, with, you know, everything listed: who deployed where, under which conditions, and so on.
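The idea of promotions that fire either automatically under a condition or only on manual request can be modelled in a few lines. This Python sketch is purely illustrative: `Promotion`, `PipelineResult`, `triggered_promotions`, and the branch name `main` are hypothetical and do not reflect Semaphore's actual configuration format.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class PipelineResult:
    branch: str
    passed: bool

@dataclass
class Promotion:
    name: str
    # Predicate for automatic triggering; None means the promotion is manual only.
    auto_when: Optional[Callable[[PipelineResult], bool]] = None

def triggered_promotions(promotions, result, manually_requested=frozenset()):
    """Return the names of promotions that fire for this pipeline result."""
    fired = []
    for promotion in promotions:
        automatic = promotion.auto_when is not None and promotion.auto_when(result)
        if automatic or promotion.name in manually_requested:
            fired.append(promotion.name)
    return fired

PROMOTIONS = [
    # Deploy to staging automatically when the main branch passes.
    Promotion("staging", auto_when=lambda r: r.passed and r.branch == "main"),
    # Production is manual: someone has to request it explicitly.
    Promotion("production"),
]
```

For example, `triggered_promotions(PROMOTIONS, PipelineResult("main", True))` fires only the staging promotion, while passing `{"production"}` as the third argument adds the manual one.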
Jonathan: If somebody is running Semaphore, making use of all this, and they run into a problem with their CI, not necessarily a Semaphore bug, but just, CI is hard and therefore you run into problems,
do you guys have sort of a forum, a way to go to y'all's engineers and say, this is the crazy thing my code is doing, can you please help me? Is that in scope?
Darko: Again, I mean, for the open source, you know, piece, we have a Discord that we've had for a while. Mm-hmm. So some people, you know, will probably be able to give, you know, a piece of advice there.
For the customers, it's, you know, depending on kind of the level of the, you know, support plan, there is a possibility of, you know, hitting P1 and, you know, we jump on a call and we help you. But that's that, you know, commercial, enterprise-ish setting, and for other folks, you know, there's a, you know, typical support forum where, you know, we share advice and tips and, you know, sem debug, sem attach.
Jonathan: Yeah. No, that's great. I don't think anyone reasonable using this expects to get that kind of support for free, right? I think that's fully understood. It's just, it's interesting to know what the options are that are out there, because man, I've lost hours and hours of my life fighting with CI and CD stuff.
Darko: Yeah. I mean, as every project, every company is, you know, fighting for their space in the world. I mean, what our customers think is unique about Semaphore, it's that kind of support that they're not going to get anywhere else, and certainly not from, you know, huge vendors such as, you know, GitHub. You know, you hit the P1 there because your whole team of, you know, let's just say a couple of thousand engineers is stuck, and you have a support plan with us.
We are going to jump on a call with you, you know, even if it's not our fault, you know, to help you out. We have the luxury of doing that. They don't, I don't.
Jonathan: Yeah. That's one of the things I've noticed about GitHub, is that when something goes wrong, even if you can point to, like, here's a reproducible test case
where something is wrong inside the GitHub code, like, getting someone's attention to actually get that fixed is sometimes very difficult. And so I think that in and of itself is, you know, an advantage to working with maybe a little bit smaller team than the behemoth that is GitHub and trying to get support there.
Darko: Yeah, yeah. I'm sure that they're going to get to help you. If you're paying, like, millions or tens of millions a year. Yes. You don't have to
Jonathan: worry about that. That's true. That's true. Us normal folks don't have They're a free customer. Yeah.
Ben: Doesn't get priority support with the millions of other fires they're dealing with, I'm sure, on a day to day basis.
Jonathan: And honestly, though, that's where the fact that all of this is open source for Semaphore comes into play. Because there are bugs in GitHub, or maybe not even bugs, but just quirks in GitHub CI, that I would absolutely go and fix if I could get to it, but it's code that I can't get to, so I can't fix it.
And I think I think, you know, as time goes by and you get more people kind of aware of this as an open source project, you're going to get a lot of those. It's a lot of just little pain points, like your example about this bullet point shouldn't be selectable because I don't want to copy and paste it like that's a trivial thing to fix.
But a very small group of people would realize that that's an issue. But you then also have, like, you've got this real small group of people that go, this is an issue, this is annoying in a certain way, and I know how to fix it. And then you have a larger group of people that are like, oh, it doesn't copy the bullet point now.
Cool. I didn't even realize that annoyed me, but I'm glad that they fixed it right. And so you get sort of those knock on follow on effects, and I'm excited. I'm excited for you guys to see some of that start happening, because it's gonna be cool.
Darko: Absolutely. Absolutely. I mean you know, just a few variables, few numbers can change things drastically as like a response time to a debugging CI request.
The other component is that, you know, by the measures of, you know, super successful startups, we are a small, small team. We are 40 people, you know, roughly half of that, you know, engineering. And as you can imagine, our backlog is huge, you know, for various small improvements that people want to make.
And one of the things that we are kind of counting on, we don't expect anyone to at least initially implement anything bigger, but you know, these small paper cuts. You know, are the things that you're hoping that, you know, people will jump, jump in and, you know, work on.
Jonathan: Is there anything that's been a real challenge in this process of going from closed source to open source?
Like, have there been any particular pain points or we, we talk about Some companies, when they try it, when they even think about doing this, they've got like NDAs. And so some of their code, some of the code are either licensed from another company or there's some NDA keeping them from being able to open source.
It was, you know, it's something of that of that nature. Were there any pain points?
Darko: I would say that this happened in two big jumps. The first one was what I spoke about, that, you know, growing up from serving small shops to serving the bigger companies. And then at some point people came and said, well, Semaphore is great, with the features and so on, but you guys are in the cloud, and our security, compliance, legal teams are not going to take it.
They might take it from, I don't know, GitHub or, you know, AWS, but you guys being like a smaller vendor and all of that, you know, does not satisfy. But, on the other hand, if you would let us run it, we would be happy to pay.
And then it was a process of, roughly speaking, like 18 months of packaging the app so other people can install it and run it. And that was, that was the first jump. And once, once we did that, kind of relatively soon after, we had that experience of, you know, some customers want to contribute to some area, have the influence in some area to a larger extent that, you know, maybe we would not be able to, you know, in a short period of time address, you know, if it, even if it's like a relatively small thing, it really depends.
And that was one of the inspirations for going open source. The other, obviously, being, you know, more people being able to use it. But to get back to your question, then there was the element of open sourcing it. Obviously one of the first things on the list was, you know, security, and the code that was written kind of over a decade in a closed environment, under, you know, how many eyes, you know, 20 people or so looking at it.
And now the whole world will be able to, you know, take a look at that. So there was that period of figuring that stuff out and polishing stuff, also defining what is dedicated to SaaS. So you don't want to open source your billing. And let's say billing is part of your main app, you need to drop that out.
You need to drop out the, the scheduler for, which is somewhat specific for the SaaS, you know, workloads and so on. Not, not, not to go into all the details. So microservices architecture helped here. If it was a monolith, there would have been a lot of cutting. Now we use gRPC in between, and then you can rewrite certain service for this particular, you know, use case and, you know, chop it up.
But I would say that, roughly speaking, it was a period of about a year where the larger part of the team was dedicated to preparing it all, you know, to go open source. You also want to make it as simple as possible to install. And that was not your priority previously in such a huge way.
Jonathan: Indeed. Yeah, that makes sense.
Darko: If anyone wants to take this route, I will be happy to track the chat and, you know, offer maybe some specific piece of advice. It is a significant investment.
Jonathan: Yeah. Yeah, for sure. And then kind of looking forwards in the, the future of the project like the, the maintenance of it going forwards what are your, what are your thoughts on that?
Like it kind of helps. One of the things we think about with open source is maintainer burnout, and if you're paying some of the maintainers, that helps a lot in avoiding burnout. But, like, other than that, what are some of your thoughts going forward about the maintenance of the project?
Darko: Yeah, there are like two major changes that are happening.
One is, you know, making just the source code open source, which we are obviously excited about, but I'm personally more excited about our transition to, you know, building in public and moving the engineering, product, all of those, you know, organizations to, you know, like you guys are streaming right now, we are, you know, streaming our meetings and making all the, you know, designs and architecture discussions and all that public.
I mean, on one hand, that obviously has a huge benefit in, you know, just showing the dedication to, you know, doing stuff in the open, and, you know, other people can learn and join, and it's not just in that code.
There are people around it that you can talk with and so on. So, yeah, I mean, as you mentioned, there are people being paid to contribute to Semaphore, so Semaphore definitely has that component of being, you know, commercial open source. Our goal is... it's now a hundred percent commercial open source.
We want to make it less commercial open source and, you know, move more into the community realm. It will take time, definitely. But we hope to eventually, you know, get more and more people who are just involved as a community and it's not just us who are contributing to it.
Jonathan: Yeah, what what is the funding model?
Like, so obviously you've got your enterprise customers and they're, they're paying for they're paying for the, you know, whatever it's services they're getting. What's the ramifications of going open source on the funding model? And do you have some interesting ideas for how to make how to make money particularly with the open source project?
Darko: I mean, there is, I would say, almost nothing, and definitely nothing ingenious, you know, in this realm. So we started as a SaaS. So you paid for the minutes that you use to build your projects, no per seat cost. You just come in and you pay for what you use. That was initially the case.
On top of that, we added that enterprise model, where there is a per seat cost plus the, you know, support elements to that. And we are a fully bootstrapped company, you know, maybe not directly related to your question, but just to mention that. So when it comes to open source, I mean, those two revenue streams would fund the development going forward as they did so far.
And yeah, we will see. Based on who are the people that get involved, and who are the people that end up running, you know, CI for themselves, we'll see, that might change over time.
Jonathan: Yeah, there's, there's a couple of, there's a couple of revenue streams, ways to make money, that sort of come to mind with this, that you might see some, some traction in.
And the first one is simply, you know, some big company wants to run this internally. They don't want to necessarily pay you a support contract for it, but they say, this is the feature we want to see. Let us contract with you guys to add that to the open source project. And now that happens a lot in various cases.
One of the other really interesting ones, that we're sort of, as the open source community, only now really figuring out, is the new security laws in Europe and the United States. You know, they're beginning to say that there has to be some attestations that come with this product, you know, if you're going to run it in certain places or if you're going to run it as a business, you know, things like that.
And it seems to me that there's a space for open source projects to, and I don't know if this exactly applies to you guys, because you sort of come at this from the other side, but for an open source project to say, okay, fine, you run it, and we will sell you, like, the attestation that this, as far as we know, is CVE free.
And, you know, we will sell to you the list of other programs that are in it, right? The BOM, the Bill of Materials, the Software Bill of Materials. You know, one of the places that comes from is, we've had projects that have this complaint that, oh, this big corporation came to us and started demanding an SBOM from us, and we don't have time to work on that.
And every time my response is, sure you do, you just have to tell them this is your hourly rate to do it, right? And I really think that when it comes to open source, particularly with the interaction with other corporations, there really is some potential there for an outside corporation to really support the project.
And I'm not sure how much of that makes sense with you guys, Because like I said, you came at this from the other direction. You've got the, the corporation first, and then you open source something. I'm curious your thoughts on all that. Yeah.
Darko: Yeah, I mean, for the second piece, that's something that, I mean, SBOM is a big thing. I would say it is somewhat solved in the industry, but the whole problem around, like, making open source secure is not really solved, because you have no clue who contributed to certain libraries and, you know, what could be the motivation behind it. The first thing is something that we talked about.
We don't have many data points. Um, maybe it will be scalable, maybe not. It's, it's, it's hard, hard to say. I mean, I do have a background in consulting. Consulting is a tricky business in its, on its own. Yes. The second piece is, is, you know, is possible to in some realm. But yeah, honestly, I don't have any deep thoughts around that and if that can work out or not.
Jonathan: That is fine. Is there anything like a CLA that contributors have to sign to be able to add code to the project?
Darko: I mean, for the Apache 2 piece, no, but there will be a much smaller portion of code that we will keep under that commercial license piece.
For that one yes, but we are just in the process of kind of moving that code in the, in, in, in the repo. I would say more than half of it is already there, but we are still, you know, polishing some off and just transferring it there.
Jonathan: Makes sense. There is a question that I know Ben, he told me ahead of time he wanted to ask about, and that's a particular language that he is a fan of that I've never ever done anything with, but he probably wants to know whether you have support for that.
Ben: Yeah, yeah, I was looking at your language and database support on the docs, and I do a lot of C# .NET development and have for a number of years, so I was curious if there was any roadmap for that or interest.
Darko: There are some people that are running it, so, I mean, with this Semaphore 2.0 thing that we launched roughly five, six years ago, we tried to make Semaphore as extensible as possible, so people don't have to depend on us, you know, to run that. So it's definitely possible to run. It's just that our roots are in that, you know, mainly that SaaS ecosystem, which makes things with open source technologies for the web, like web apps, and, I don't know, yeah, honestly speaking, there are a few C# projects in that realm. I know that you can run open source stuff with C# also these days, but
Jonathan: You can install C sharp. So this, this is my, my guideline for how, how open source something is or not. If you can go to a Fedora Linux install and just install it with yum, then it is, it's got its ducks in a line for being open source.
Obviously, some projects are more complicated than that, and a newer thing, like Semaphore, is not going to be there because it's so recently open sourced. But the point I'm getting to is, you can install .NET with yum on Fedora, so, like, it's pretty dang open source friendly these days.
Ben: Yeah, it just has a stigma associated with it, where people kind of see it in the same way as old, you know, Java enterprise apps, except it's under Microsoft's thumb and
Jonathan: Well, so many of us grew up working on Windows XP machines, and you'd have to sit there for an hour and a half and pray the whole time that the .NET 3.x install update would work.
Right? Ah, the days of the .NET Framework. Yes, they were dark indeed.
Darko: I'm a little bit older. For me, the Visual Studio experience was mainly tied to Visual Studio four and Visual Basic four, and Visual Basic six.
Jonathan: Oh,
Darko: tho those are nineties. Yeah.
Jonathan: Well, I have some of those memories, but actually doing it professionally, I maintained a lot of Windows XP machines, and boy, we would just sit there and wait for the .NET updates to finish. Oh, it took so long. So I have a little PTSD over .NET. All right, so you've got a note here. One of the things that you wanted us to ask you about is one of our most slash least favorite topics, and that's AI. I can't get away from it these days.
Is there a place inside Semaphore, or inside sort of the development that you guys make happen, for AI assistance? How does that fit without just being AI slop that was added to make our backers, you know, our venture capital backers happy? What's the story there?
Darko: I have, first, a short story to tell. So we are kind of in the process of booting this project of, you know, revamping part of the CD, how it works in Semaphore, so it's much more flexible, and then you can model how you are deploying across different regions, continents, and so on. And that's automated.
And so, not to get into the weeds of it, I'm kind of working on the architecture, you know, the feature design of that, and I'm talking with AI, and I have, you know, a lot of ideas that I paste as, like, Markdown to it. And I say, okay, do you see any problems with it? And so on and so on.
And tell me how you would maybe implement that, you know, what are the other options? And then it literally starts writing Semaphore YAML and saying how it would implement it within the Semaphore features as they are today, you know, and it turns out to be like a hacky workaround, but the effect is the same.
And that's not what I asked at all. I asked the guy, you know, dude, about the backend architecture and the system architecture. So for, you know, writing YAMLs and those kinds of things, I mean, yes, we're probably going to bring in some input box, you know, so you can do it in the app, but there are a lot of these, you know, smart AI LLMs around that you can use to interact with software.
Right. An area where, you know, people are asking us to do more, and, yeah, it's not easy finding the time for that, although it would be hot, is in the realm of that, you know, agentic AI: if something would observe your pipeline, the duration of the pipeline, the frequency of failures, the flaky tests, and all of that.
And maybe not do something really hands on initially, but just based on the observations, give some feedback to people, you know, on what they could do better. It is a kind of thing which is very hard.
You have to really have a process around it. To be good at that, if, if you are doing it in a traditional way, you have to have some, you know, charts and some people and some weeklies, and you know, to review all the flaky tests, to review why is our build getting longer, who contributed this, you know, tests that take two minutes, are you crazy?
You know, all those kinds of things.
Ben: So I'm just imagining an AI Clippy saying, I noticed that your build is 40 percent longer after this commit, right?
Darko: Yes.
Ben: Would you like help with this?
Darko: And then there is a picture of the person who made it. Yeah. Yeah.
Ben: Automatically get blamed and sends an email to their supervisor.
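The kind of observation being joked about here, a build getting noticeably longer after a particular commit, can be sketched without any AI at all. The following is a minimal Python sketch, not anything Semaphore ships: the `(commit_sha, duration_seconds)` record shape, the five-build baseline, and the 40 percent threshold are all assumptions made for illustration.

```python
from statistics import median


def duration_regressions(builds, baseline_size=5, threshold=0.40):
    """Flag commits whose build took much longer than the recent baseline.

    `builds` is a list of (commit_sha, duration_seconds) tuples, oldest first.
    This shape is hypothetical, not a Semaphore API or export format.
    """
    findings = []
    for i in range(baseline_size, len(builds)):
        # Baseline: median duration of the builds just before this one.
        baseline = median(d for _, d in builds[i - baseline_size:i])
        sha, duration = builds[i]
        if baseline <= 0:
            continue
        slowdown = (duration - baseline) / baseline
        if slowdown >= threshold:
            findings.append((sha, round(100 * slowdown)))
    return findings


# Hypothetical build history: durations jump at commit "f6".
history = [
    ("a1", 300), ("b2", 310), ("c3", 295), ("d4", 305),
    ("e5", 300), ("f6", 430), ("g7", 425),
]
regressions = duration_regressions(history)
for sha, percent in regressions:
    print(f"Build for {sha} is {percent} percent longer than the recent baseline")
```

Using a median over the preceding builds means only the commit that introduced the slowdown is flagged, since later builds are compared against a window that already contains the slower run.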
Jonathan: Oh, that's, that's terrible. This is the dystopian future that we do not want. So there is, and I've, I've done a lot of thinking about this, but also talking to people sort of in this broad industry. And there's one, there's one use case for AI in particular that I think really makes sense. And that is AI guided fuzzing.
And I'm curious, just to start with, is there sort of an infrastructure for doing fuzzing inside of Semaphore? And then is that something that someone could, you know, shoelace and bubblegum together to get some AI guided fuzzing happening inside of Semaphore?
Darko: Yeah. I mean, there is no first class implementation for fuzzing within Semaphore.
That's kind of the short answer to that. In terms of AI and fuzzing, I also think that that would be super interesting. I mean, not related to fuzzing in particular, there is another realm which is very interesting to me, and I've spoken to a number of people that share the same views.
So you have a successful 10 year old Ruby on Rails application, let's say, to keep in that world. And the build, when it's spread across, you know, maybe 50 parallel jobs, takes five minutes. So that's okay. So you can imagine you have a crazy amount of tests there. You know, they can be unit, but they can also be those expensive end to end, whatever.
And we were, you know, grepping through the database of tests, you know, and there is, roughly speaking, 25 percent of tests that have not failed in the last three years. Are those tests useful? Are those tests even valid? Maybe they have been always green. Maybe they have never been red.
Jonathan: Maybe it's not possible for them to be red.
Darko: Yes. And there is a certain, I would bet, you know, a few bucks that, you know, 5 percent of the tests can just never be red. But the point being that they are taking away from your velocity and, you know, how fast you can run things and so on.
I mean, if I would retire and have nothing else to do but write academic papers, I would work on those, you know, tests that are just wasteful, you know, and how to figure out what's the percentage of those, and how to eliminate them.
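The query Darko describes, finding tests that have not failed in the last three years, could be sketched roughly as follows. This is a minimal Python illustration under stated assumptions: the `(test_name, finished_at, passed)` record shape is hypothetical and is not how Semaphore stores or exports results. It also only identifies candidates; whether an always-green test is actually useless still needs a human, or some other analysis, to decide.

```python
from datetime import datetime, timedelta


def never_failed_tests(runs, now, window_days=3 * 365):
    """Return (always_green_names, share) for tests run inside the window.

    `runs` is an iterable of (test_name, finished_at, passed) tuples.
    The record shape is an assumption made for illustration.
    """
    cutoff = now - timedelta(days=window_days)
    seen = set()
    failed = set()
    for name, finished_at, passed in runs:
        if finished_at < cutoff:
            continue  # ignore results older than the window
        seen.add(name)
        if not passed:
            failed.add(name)
    always_green = sorted(seen - failed)
    share = len(always_green) / len(seen) if seen else 0.0
    return always_green, share


# Hypothetical test history.
now = datetime(2025, 3, 18)
history = [
    ("test_login", datetime(2024, 1, 10), True),
    ("test_login", datetime(2024, 6, 2), False),
    ("test_signup", datetime(2023, 5, 1), True),
    ("test_signup", datetime(2025, 1, 5), True),
    ("test_legacy", datetime(2019, 1, 1), False),  # failure is outside the window
    ("test_legacy", datetime(2024, 2, 2), True),
    ("test_export", datetime(2024, 9, 9), False),
]
always_green, share = never_failed_tests(history, now)
print(always_green, f"{share:.0%} of tests never failed in the window")
```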
Jonathan: And so the thought here being that that might be something that you could turn an AI loose on and say, hey, look at my everything.
You have access to the CI runs. You have access to the source code. You have access to the glue code that holds it together, the history of runs, all of that. Pull out, you know, 5 percent of the tests that you think might be entirely superfluous, and see what it comes up with.
Yeah, that's that's pretty interesting. Yeah, so, okay, we are, we are basically back down at the bottom of the hour. I'm gonna give Ben a chance to get in any final questions he has, if he has any. And then we're gonna ask Darko if there's anything that we didn't ask him about that he wanted us to.
And that's sort of a that's sort of an interesting question to get, because you have to do set theory in your head to figure out all the things we talked about, and all the things I wanted to talk about, and how much do they overlap. So Ben first, is there anything else you wanted to get in before we let him go?
Ben: Yeah, I was gonna ask, like, how Semaphore kind of fits into, like, containerized deployments and orchestration now. I know a lot of businesses and SaaS products are using technology like Kubernetes now. How does Semaphore kind of fit into that paradigm?
Darko: Yeah, I mean, I guess that maybe indirectly you are referring to GitOps as a pattern of, like, having the spec and description of what the environment should look like. We don't deal directly with Kubernetes. We are not touching those APIs directly.
I mean, over the years we kind of burned ourselves. We had integrations with dozens of, you know, deployment targets over the years, and it was quite expensive, you know, to maintain all those integrations over time. And luckily Kubernetes came along and there is now somewhat of a standard there, but it was also the case that we were busy doing other stuff.
And I would say Flux CD and Argo did that, you know, okay, here's the spec in my repo, I'm going to apply it, I'm going to do the sync. So we never ventured into, let's say, reimplementing or implementing that, you know, sync functionality when it comes to Kubernetes. We kind of sit at a more abstract level.
So what you can very successfully do with Semaphore, with promotions being a first class citizen: you can maintain those YAML definitions for your Kubernetes within Semaphore, and Semaphore can, you know, update the particular YAML file with a new tag of an image. And from that point on, you know, Flux or Argo or something else can take over and do that synchronization process.
And Semaphore can do the check after that process. You know, okay, has this been applied, you know, in a certain way.
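The step described here, a CI job bumping the image tag in a Kubernetes manifest so that a GitOps tool such as Flux or Argo can sync it, might look something like the sketch below. This is a plain Python illustration, not Semaphore's promotion mechanism: the function name, the manifest, and the `registry.example.com/web` image are all hypothetical, and a real pipeline would more likely use a YAML-aware tool than a regular expression.

```python
import re


def set_image_tag(manifest_text, image, new_tag):
    """Replace the tag on every `image: <image>:<tag>` line in a manifest.

    Works on the raw text so comments and formatting are preserved.
    Raises ValueError if the image is not found.
    """
    pattern = re.compile(
        r"^(?P<prefix>[ \t]*(?:-[ \t]*)?image:[ \t]*)"
        + re.escape(image)
        + r":[\w.\-]+[ \t]*$",
        re.MULTILINE,
    )
    updated, count = pattern.subn(
        lambda m: f"{m.group('prefix')}{image}:{new_tag}", manifest_text
    )
    if count == 0:
        raise ValueError(f"image {image!r} not found in manifest")
    return updated


# Hypothetical Deployment manifest kept in the Git repository.
manifest = """\
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.4.2
"""
new_manifest = set_image_tag(manifest, "registry.example.com/web", "1.5.0")
print(new_manifest)
```

In a GitOps setup the CI job would then commit and push the changed file, and the sync tool watching the repository applies it to the cluster.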
Jonathan: Very cool. Very cool. And then is there anything that we didn't ask you about? Anything that, burning that you want to let folks know about?
Darko: Not in particular. What I'm... I, as a student, wanted to contribute to Blender.
I don't know if you know that piece of software for, you know, 3D modeling and so on. I was very unsuccessful in doing that. First, I wanted to contribute to that C or C++ code base. I think it's C. And then I wanted to contribute to the Python plugins. And it was, you know, quite hard. I did some work, but in the end I didn't do anything there.
I ended up writing in C my own, you know, clone of Space Invaders, and lived in my own little cave, having fun. So I'm just very curious, you know, to which extent AI can help people onboard to Semaphore, to contribute something. You know, with those new, I don't know, Claude and so on, the AI does a clone of Semaphore, and you say to the CLI tool or wherever: hey, I would like to make a contribution in this area, to fix this and this in this way, you know, onboard me to the project.
Tell me five things that I need to know before I hop in. So, yeah, that's kind of a question that I'm really curious about, because we do have to work a lot on that, you know, onboarding of people, and make it easy to onboard to an open source project. There is a big to-do there, to do more; we just didn't have time.
So that's the area. I don't know if you guys have seen something there, or what are your thoughts on that?
Jonathan: That's actually a really interesting point. I've made the statement before that when someone goes to work on an open source project that's new to them, or I guess a coding project of any sort, one of the first challenges you run into is sort of like mentally mapping where the code is, right?
And so to put that in more concrete terms, which source file do I even need to go look at to try to figure out how to make this happen? And what you're kind of describing is using AI as a search engine to find what file to start with inside the code. And that's actually really interesting.
I would, I would love to hear people's reports on trying to do that with various bits of source code. That is actually a really interesting thought. I kind of like that.
Ben: Yeah, that seems like a better approach than just kind of using the AI to make a drive by pull request and not understanding what it does.
Which I've seen as well. Yeah. "I have no idea how this works, but YOLO."
Jonathan: Yes, the difference between "write this code for me" and "tell me where to look to write this code."
Ben: Yeah, help me understand.
Jonathan: Yes. One of those things AI is very good at. One of those things AI is not.
Darko: Yeah, we will see. I mean, let me just mention one last thing, for the open source code.
I interacted with the OpenAI API and wanted it to review each of the source code files for me. I mean, on one hand, it didn't really give the... it was very verbose when giving the output, you know, is there something that, you know, shouldn't be there potentially, and so on. It also ended up costing a lot.
If I would run the most expensive model on each of the files, it would be in kind of thousands of dollars. So yeah, I don't know how achievable this one is, or is it my... But yeah, interesting.
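A back-of-the-envelope version of that cost estimate can be sketched in a few lines of Python. Every number here is a made-up assumption for illustration: the file count and sizes, the per-token prices, the four-bytes-per-token rule of thumb, and the length of each review. None of them come from the conversation or from any provider's actual price list.

```python
def estimate_review_cost(
    file_sizes_bytes,
    usd_per_million_input,
    usd_per_million_output,
    output_tokens_per_file=800,
    bytes_per_token=4,
):
    """Rough cost of sending every file to an LLM once for review.

    All parameters are hypothetical inputs; bytes_per_token=4 is only a
    common rule of thumb for source text, not an exact tokenizer count.
    """
    input_tokens = sum(size / bytes_per_token for size in file_sizes_bytes)
    output_tokens = output_tokens_per_file * len(file_sizes_bytes)
    return (
        input_tokens / 1_000_000 * usd_per_million_input
        + output_tokens / 1_000_000 * usd_per_million_output
    )


# Hypothetical repository: 20,000 files of about 8 KB each,
# with hypothetical prices of $15 and $60 per million tokens.
sizes = [8_000] * 20_000
total = estimate_review_cost(
    sizes, usd_per_million_input=15.0, usd_per_million_output=60.0
)
print(f"Estimated cost: ${total:,.0f}")
```

With these assumed inputs the verbose output costs more than reading the code does, which matches the complaint that the reviews were both wordy and expensive.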
Jonathan: Interesting. All right. A couple of questions we are required to ask you before we let you go. And that is, what is your favorite scripting language and text editor?
Darko: Text editor, Vim, scripting language, Ruby.
Jonathan: Ruby and Vim. There we go. All right. Darko Fabian, thank you so much for being here, talking about Semaphore, a really neat project and something that if I ever get a chance to, I'll go check it out. Maybe I'll be the one that runs it on a Raspberry Pi to see how that works.
All right, awesome. Thank you. Yeah. All right. What, what do you think? What are your thoughts?
Ben: Looks interesting. Are you ready to hook all of our GitHub Actions over to Semaphore and start the exodus?
Jonathan: I, maybe let's start on a small scale. Start small. Take one of the small projects and just do a test run on it.
Yeah, I don't know. For some projects, that might make sense, particularly when you get into the realm of self hosting. You know, a real small project, self hosting a runner on a Raspberry Pi, maybe that is the way to go, if there's something that GitHub Actions can't do. And then I can also see, obviously, a really big project or a really big company.
They want to be able to move all that stuff in house and not fiddle with the GitHub runners at all. So there are various places where that makes sense. And I really have the feeling that the more people from the community sort of bang on this and add stuff to it, the more it's going to make sense for all of those different use cases.
Absolutely. All right, then. Thank you for being here. Is there anything that you wanted to plug, let folks know about?
Ben: Sure. If you want to check out any of my open source work, it's at github.com/thebentern. That's my handle. And most of my work is on the Meshtastic project. So if you're interested in communications and off grid technology using LoRa, that's meshtastic.org.
It's a really neat project. It's really fun.
Jonathan: Yeah, absolutely. Appreciate it. If you want to find my stuff, most of it is at Hackaday these days. We appreciate Hackaday being the home of Floss Weekly. That's also where my computer security, not just computer security, but my security column goes live every Friday morning.
We occasionally dive into physical security and lockpicking and all of that fun stuff. So you can check that out. And there's also the Untitled Linux Show, which is still over at TWiT, twit.tv. I think it's twit.tv/uls. And you can find that to stay up to date with what's going on in Linux and hardware and open source and all kinds of stuff we talk about there.
Have a lot of fun. We appreciate it. Thank you everyone that's here that caught us both live and on the download and we will see you next week on Floss Weekly.