Some of the highlights of the show include:
Links:
Transcript
Emily: Hi everyone. I’m Emily Omier, your host, and my day job is helping companies position themselves in the cloud-native ecosystem so that their product’s value is obvious to end-users. I started this podcast because organizations embark on the cloud naive journey for business reasons, but in general, the industry doesn’t talk about them. Instead, we talk a lot about technical reasons. I’m hoping that with this podcast, we focus more on the business goals and business motivations that lead organizations to adopt cloud-native and Kubernetes. I hope you’ll join me.
Emily: Welcome to the Business of Cloud Native. I'm your host, Emily Omier, and today I'm here with Ricardo Rocha. Ricardo, thank you so much for joining us.
Ricardo: It's a pleasure.
Emily: Ricardo, can you actually go ahead and introduce yourself: where you work, and what you do?
Ricardo: Yeah, yes, sure. I work at CERN, the European Organization for Nuclear Research. I'm a software engineer and I work in the CERN IT department. I've done quite a few different things in the past in the organization, including software development in the areas of storage and monitoring, and also distributed computing. But right now, I'm part of the CERN Cloud Team, and we manage the CERN private cloud and all the resources we have. And I focus mostly on networking and containerization, so Kubernetes and all these new technologies.
Emily: And on a day to day basis, what do you usually do? What sort of activities are you actually doing?
Ricardo: Yeah. So, it's mostly making sure we provide the infrastructure that our physics users and experiments require, and also the people on campus. So, CERN is a pretty large organization. We have around 10,000 people on-site, and many more around the world that depend on our resources. So, we operate private clouds, we basically do DevOps-style work. And we have a team dedicated for the Cloud, but also for other areas of the data center. And it's mostly making sure everything operates correctly; try to automate more and more, so we do some improvements gradually; and then giving support to our users.
Emily: Just so everyone knows, can you tell a little bit more about what kind of work is done at CERN? What kind of experiments people are running?
Ricardo: Our main goal is fundamental research. So, we try to answer some questions about the universe. So, what's dark matter? What's dark energy? Why don't we see antimatter? And similar questions. And for that, we build very large experiments.
So, the biggest experiment we have, which is actually the biggest scientific experiment ever built, is the Large Hadron Collider, and this is a particle accelerator that accelerates two beams of protons in opposite directions, and we make them collide at very specific points where we build this very large physics experiments that try to understand what happens in these collisions and try to look for new physics. And in reality, what happens with these collisions is that we generate large amounts of data that need to be stored, and processed, and analyzed, so the IT infrastructure that we support, it’s larger fraction dedicated to this physics analysis.
Emily: Tell me a little bit more about some of the challenges related to processing and storing the huge amount of data that you have. And also, how this has evolved, and how it pushed you to think about containerization.
Ricardo: The big challenge we have is the amount of data that we have to support. So, these experiments, each of the experiments, at the moment of the collisions, it can generate data in the order of one petabyte a second. This is, of course, not something we can handle, so the first thing we do, we use these hardware triggers to filter this data quite significantly, but we still generate, per experiment, something like a few gigabytes a second, so up to 10 gigabytes a second. And this we have to store, and then we have large farms that will handle the processing and the reconstruction of all of this. So, we've had these sort of experiments since quite a while, and to analyze all of this, we need a large amount of resources, and with time.
If you come and visit CERN, you can see a bit of the history of computing, kind of evolving with what we used to have in the past in our data center. But it's mostly—we used to have large mainframes, that now it's more in the movies that we see them, but we used to have quite a few of those. And then we transitioned to physical commodity hardware with Linux servers. Eventually introduced virtualization and private clouds to improve the efficiency and the provisioning of these resources to our users, and then eventually, we moved to containers and the main motivation is always to try to be as efficient as possible, and to speed up this process of provisioning resources, and be more flexible in the way we assign compute and also storage.
What we've seen is that in the move from physical to virtualization, we saw that the provisioning and maintenance got significantly improved. What we see with containerization is the extra speed in also deployment and update of the applications that run on those resources. And we also see an improving resource utilization. We already had the possibility to improve quite a bit with virtualization by doing things like overcommit, but with containers, we can go one step further by doing more efficient resource sharing for the different applications we have to run.
Emily: Is the amount of data that you're processing stable? Is it steadily increasing, have spikes, a combination?
Ricardo: So, the way it works is, we have what we call ‘beam’ which is when we actually have protons circulating in the accelerator. And during these periods, we try to get as much collisions as ...