325: Pranting

35 min • February 8, 2022

Steph is super excited about changing her schedule to dedicate a full day to focus on being a great team lead. Chris talks about his continued adventures in the world of hiring.

Together they answer a listener question about what they consider a “large” table in a database and how they review schema changes.

This episode is brought to you by ScoutAPM. Give Scout a try for free today and Scout will donate $5 to the open source project of your choice when you deploy.

Transcript:

STEPH: I just feel like every time I listen to Celine Dion, there are lots of dramatic hand gestures that have to go with it.

CHRIS: Yep, definitely that. I'm strong team Power of Love.

STEPH: Ooh, yeah, yeah, that's a good one.

CHRIS: Hello and welcome to another episode of The Bike Shed, a weekly podcast from your friends at thoughtbot about developing great software. I'm Chris Toomey.

STEPH: And I'm Steph Viccari.

CHRIS: And together, we're here to share a bit of what we've learned along the way. So, Steph, what's new in your world?

STEPH: Hey, Chris. Oh, I have some exciting news. I am changing up my schedule. It is going to start next week, where as a team lead at thoughtbot, we have been working on finding ways that we can have more time to invest into the team and team-specific initiatives. And most of us spend four days billing on client work, and then we have investment day, which is delightful.

But we're finding as team leads, that's really not enough time to then have the impact that we want in terms of supporting our team and then also having time for mentorship and all the other things that go along with being a team lead and one on ones. So we have been incrementally working towards reducing billing.

So team leads only bill three days a week, and then they have an additional day to really focus on being a great team lead. And I start that new schedule, that new-new schedule next week, and I couldn't be more excited. I think it's going to be wonderful.

I do think there are some challenges that go with it in terms of really balancing, at least this is from the others who have gone before me where they then find it a bit harder in terms of client expectations of saying, "Well, I was billing four days, and I had a larger impact on the codebase and the team. Now I'm dropping to three days. I still need to stay within that constraint. And I want to keep the client team happy." So that seems to be a thing. But I will find out next week how it goes.

CHRIS: Well, I'm very excited for you. That sounds wonderful, frankly. The balancing of the client expectations and then there's sort of now three slices to your work, which there always were, but now you have it delineated in an interesting way. Do you have specific plans for the team lead? So let's say now, nominally, there's one day a week that is dedicated to team lead time. Do you have ideas of what that looks like? Are you planning to pair with your team? Is it longer one on ones? I don't want to seed the question too much with potential answers. So what are you thinking about there?

STEPH: [laughs] Ideas are great. And yes, so I think number one is structure. So right now, one on ones and any support that I need to provide others is more ad hoc, or at least the one on ones those are not ad hoc; they are structured. But they are spread out throughout the week, and then I just context switch between client work and then checking in with others.

Now I can stagger everything on a Thursday or whichever day is going to be my really focused team lead day. So that way, I have all the one on ones on that day. And then yes, I can have more time to pair. So I can say, hey, let's just schedule every other week where you and I hang out, and we pair for an hour, and it can be on their client work. It can be on anything that they'd like to work on. Or, if there's a particular topic they'd like to talk about, we can pair on consulting issues or discussions.

But yes, ultimately, I'd love to do more pairing and then structure one on ones to a specific day and essentially, just really protect that time. Because right now, it feels that client initiatives and work always come first, and then team lead comes second. And I'm excited to balance that more so they have equal priority.

CHRIS: Yeah, that sounds great. I'm super intrigued to see what specific structures fall out of it and how you're experiencing it. I'll be interested to hear how investment time changes for you as a result of this because I remember when I started in the management role, four days a week billing, and then one day a week of investment time. But the investment time then basically went to one on ones and other things like that.

And when I switched to a three-day week, I was able to reclaim some amount of investment time. And it was interesting having that open back up and have that be a consideration. Because definitely one on ones and things like that I think firmly fit within the idea of investment time or investing in the organization and whatnot. But still, there's the like; I'm going to go explore a new framework or something like that that also certainly fits within investment time. So I'll be interested to hear if you find that changes in sort of a specific way.

STEPH: Yeah, I'm really interested in that as well. Because right now, as you mentioned, my investment activities are really focused more around the team and other folks and then Bike Shed. Bike Shed is a really big investment time activity. So I've noticed since becoming a co-host for the show, I talk a lot about code, but I don't necessarily contribute to open-source projects or other internal projects at the rate that I used to. It's now more focused about here and being a co-host and talking about all the things, and that requires some prep for me.

So I'm also interested to see if this will shift my investment time a bit where I do find a little more time to code and then explore just things that I'm interested in. But in the experiment of doing something new, it's always important to then have a way to measure is this a good change? Is this a bad change? So we have been checking in with team leads to say, "Hey, we've changed your schedule to where you're billing one day less. How's that going for you?" Because there's the assumption that this will be great, but you really have to check in with folks to find out.

So Edward Loveall has been sending out a helpful survey and checking in to say, "Hey, how are you feeling about your client work? How are you feeling about your team lead responsibilities? How are you feeling about investment time?" So then you can track your own growth and see is this really helping me? Is this really going in the right direction, or am I just more stressed about everything now? So that's helpful that we are also just looking back to make sure that this is supporting the initiatives that we said it would support.

But that's some of the newness in my world. What's going on in your world?

CHRIS: What's going on in my world? Continued adventures in the world of hiring. So we've got a couple of people in the pipeline now. We've got some folks in the technical interview phase, which we're structuring our technical interview very much inspired by the thoughtbot interview. So it's a pairing session as well as some code review, which is great because I think it's really representative of the actual work that we do.

I believe strongly in not having an interview that is trivia or anything of that sort of thing. I want to see folks at their best as opposed to finding the rough edges. Because I think it's critical to have an interview that really represents the work that we're doing and then also gives candidates an opportunity to show themselves at their best as opposed to trying to hunt out gaps in knowledge or things like that because I think it's easier to shore up a gap of knowledge. But I really want to know what is this person like when they're firing on all cylinders?

So, so far, that's going great. But hiring is a complicated long game. So it will probably be a thing that I'm talking about for some weeks to come. And if anyone out there is listening and is potentially interested in a new adventure, I would love to chat with you. Sagewell Financial is hiring. And it's a wonderful Rails codebase and lots of new opportunities, et cetera, et cetera.

STEPH: As someone that has worked with you, I can absolutely vouch that you are amazing to work with. And I can only imagine the codebase must be...everything we've talked about is really interesting and stellar. So yeah, I love that you're talking about this. I think it's awesome and a great opportunity for folks to get to join Sagewell.

CHRIS: Oh, thanks, Steph. That's very kind of you. But in other unrelated to hiring news, one of the things that I talked about in last week's episode was my search for a new to-do list or a new application to use. And I listed some of the ones that I've been exploring. We got more feedback about that particular segment than any other by like 2X. And there's something to be said there. Maybe the show is just living up to its name.

But so many people are reaching out like, "Oh, have you looked at this one?" And to be clear, I very much appreciate all of the feedback that folks have given. And actually, it has given me a few new things to look at or ways to think about this question. But mostly, I find it very funny that even though we've dabbled in topics like agile, and is it good or bad? Or other contentious ideas [laughs] like that, somehow this idea of what to-do list application should I use by far the most engagement we've seen with our audience.

STEPH: I think it makes sense. Everybody has an opinion. Like you said, we're living up to our name, which is great. Was that great? I don't know. [laughter]

CHRIS: It's something, I'll say that.

STEPH: It's something. But yeah, everybody has felt this pain. They get it. It resonated. But since we do have some people that shared their strategies and their thoughts, did that sway you at all? Are you still going to keep with what you have, or are you going to explore new things?

CHRIS: I consider this project open. I have a project in Things, which is the current to-do list application that I'm using to explore the landscape. But it's basically like, I want to timebox it, find a version that works for me. And right now, I moved to Things, and it's fine. I'm more intrigued by the jobs to be done aspect of it. So as opposed to a particular piece of software and the features that it has or doesn't have, I really want to think about the habits and workflows that I want to make easier and more repeatable.

So particularly, each day, I want to wrap up by cleaning everything up. I like my inbox zero, as you probably know, so doing a little bit of that, and then planning the next day. So I want to have a tool that supports that idea of I want to queue up what I'm going to do in the morning so that tomorrow morning when I start back up, I have a very clear list of things to do. And I can just dive in with what I find to be some of my best thinking time early in the morning.

Similarly, I want to be able to review on a regular basis and know if things are getting stale or overdue. So there are a couple of different workflows that I'm really focusing on. And it's unfortunate because then I look at each piece of software, and I'm like, well, you kind of support this but not totally. So I'm more in a collecting phase right now. I'm thinking about the workflows that I want to have and then finding the different tools and comparing them across those.

But the one thing that I have done at this point is I wrote a little Siri shortcut I think is the name for it. They're called Shortcuts is the name of the application, but if I try and Google that, Google doesn't really know what I'm talking about. They think I'm talking about my phone, but I'm not talking about my phone. I'm talking about my actual computer, but it's little workflow automation stuff on OS X.

And so I have a shortcut now that prompts me for the amount of time, and it defaults to 45 minutes. And then, it will turn on Do Not Disturb for 45 minutes, minimize Slack, because I can't be trusted, and turn on a particular Spotify playlist.

And then there's a little menu bar application that...I wrote a tiny bit of AppleScript; I found it on the internet and actually read it, that finds the top task in my to-do list and puts it in the menu bar. And so now I have all of that. I push a magic button, and I say, "Yes, so I would like to work for 45 minutes on the thing that is at the top of my to-do list." And then all of the noise of the world goes away for 45 minutes or however long I say.

STEPH: I think you just created the next new hot to-do app. [laughs] This sounds like something that I need and love, especially when you're like, it autoplays a playlist for you and shuts down the world and then has you focus. Yeah, I like this. I think this also rings a bell. I feel like Momentum, or something also has similar prompts. But this sounds delightful.

CHRIS: If we're being honest, it's an absolute hodgepodge of a kludge. You have some weird shell scripts and some AppleScripts. And I had to install a weird command-line utility for Spotify to make it happen. But it was one of those like; I'm spent at the end of the day. I just want to tweak on some piece of code. And this was a perfect, productive distraction, is how I would describe it. And when I've used that, I've been very happy. I know the days that I actually lean into that mode of working are better days.

The days where I allow myself to be distracted by Slack throughout the day, although I'm responsive to certain questions, things are not moving as well as they should. And so, I'm trying to be really intentional with having more of these Do Not Disturb sessions throughout the day. I feel bad saying that. I shouldn't because we all should be in agreement that this is the way that we work. But even saying that, I'm like, I'm not that special. I should be reachable, right? [laughs] But I should take even just a short 45-minute break to focus on the work that I actually need to do. It's a struggle.

STEPH: I have struggled with that where I used to always feel such an urgent need that I had to respond to someone as soon as they messaged me. But over time, I've learned that one, things typically aren't as urgent as I will feel that they are. And then two, if you have that type of environment where people aren't expected to just immediately reply to stuff, then you learn to write things in a way that says, "Hey, when you see this, and here's context, and here are the things that I'm looking for. And here's an easy way for you to give feedback."

It just improves the overall communication. I could go on a rant about this. I think we've actually ranted about this before in a very positive way. [laughs] Yes, I think that's great that you are fighting the good fight and turning off the world for 45 minutes to focus on a task.

CHRIS: What's a positive rant? I feel like there's got to be a word for that. [laughs] But I'm trying to try and come up with that. A celebration isn't...is this one of those gaps in language where we don't have a word for a positive rant?

STEPH: Oh, this is going to bother me. [laughs] There's got to be something for a positive rant.

CHRIS: Well, I'm sure German has...some Scandinavian language or German has a more specific version of when one goes off on a rant for many hours about things that they love and are joyous about in the world or something like that. But maybe English is just lacking this, or maybe this is a market opportunity. And we can coin the word, and then it's ours.

STEPH: I think it's just praise or accolades, although that doesn't feel strong enough. Rant feels like such an emotional word that I agree praise doesn't feel strong enough.

CHRIS: It's also spacious. You don't just rant, and it's one word. It's not just like one swear that you yell in the word. No, it's this long rambling thing, and I want that but positive. Maybe it's just called The Bike Shed [laughter] because I think that might be what we do.

STEPH: I love that. I'm trying to smash it together, and all that I can come up with is prant, so that leads with a P.

CHRIS: Yeah, I went there real quick. [laughter] Portmanteau is where I spend most of my time. But prant is just not enough. Okay, we're going to take this offline. I think we should come up with a word. This is our market opportunity. I don't know that we'll make a lot off of this, but we'll have a word then.

STEPH: It's okay. Free things are good. Oh my goodness, this is going to be so trivial and silly. But I've been playing Wordle as the rest of the world has. If you're not playing Wordle, check it out. [laughs] It's delightful. And it's free. But I started playing without really researching who created it and didn't have all of the details behind it.

And then it was earlier this week I found out that the creator of Wordle is Josh Wardle. And that just blew my mind and made me so happy that it just had that alliteration and similarity. And I just hadn't put it together until that moment. And it was just this wonderful, happy bubble of a moment where I was like, oh, that's delightful. [laughs] And I'm pretty sure I texted some people who were like, "Yeah, yeah, we know that." [laughs]

CHRIS: Yes, that was a wonderful positive rant or prant as it were there. And Wordle really is just such a delightful phenomenon that popped out of nowhere and is given away for free by the kindness of Josh Wardle. So yeah, wonderful things on the internet.

Mid-roll Ad

And now a quick break to hear from today's sponsor, Scout APM.

Scout APM is leading-edge application performance monitoring that's designed to help Rails developers quickly find and fix performance issues without having to deal with the headache or overhead of enterprise platform feature bloat. With a developer-centric UI and tracing logic that ties bottlenecks to source code, you can quickly pinpoint and resolve those performance abnormalities like N+1 queries, slow database queries, memory bloat, and much more.

Scout's real-time alerting and weekly digest emails let you rest easy knowing Scout's on watch and resolving performance issues before your customers ever see them. Scout has also launched its new error monitoring feature add-on for Python applications. Now you can connect your error reporting and application monitoring data on one platform.

See for yourself why developers call Scout their best friend and try our error monitoring and APM free for 14 days; no credit card needed. And as an added-on bonus for Bike Shed listeners, Scout will donate $5 to the open-source project of your choice when you deploy. Learn more at scoutapm.com/bikeshed. That's scoutapm.com/bikeshed.

CHRIS: We have a listener question this week. Once again, just as a reminder, everyone, we love getting these listener questions. Feel free to send them into [email protected] or ping Steph or I on Twitter or any number of different ways. There's, I think, a form that you can go to the website, lots of different ways to ask us your questions. But again, we really love them. They let us have more pointed topics to talk about, such as today's topic, which is "What do you consider a quote, unquote, "large table" in a database?" Which is an interesting question, I think.

And so, let me read the question here. "Hey, Steph and Chris, I’ve listened to you (and most of your predecessors) for a while now. I've really been enjoying the conversational style about your actual development struggles." Thank you so much. This comes from Matt, by the way.

"Anyway, something Chris said in Episode 301 triggered a thought for me around large tables and databases and handling them for development tasks. What do you consider a quote, "large table" in a database? What questions/considerations come to mind when you're doing PR work that has a database interaction in it? We recently needed to delete a lot of rows out of a large table, and the team has a lot of discussion around how to handle it without impacting our production users. Curious on your thoughts. Thanks."

So, Steph, what do you think? What's a large database table in your mind?

STEPH: So I don't have a scientific answer for that, but I can give you my gut instinct. So typically, if there's a table that has a million or more records, I'll refer to that table as a large table. And then, if a table has around half a million records, then I start to be more cautious about data changes and how I'm rolling out schema changes. So that's my very loose; this is my feeling of when we're getting into large territory. How about you? Do you have more of a concrete answer?

CHRIS: I don't. And I think it would actually, in a lot of cases, be defined based on the database system that we're working with and, frankly, the RAM available on that system. There are two different sides of it; one is on the write side, like, how quickly are we inserting data into this table? And how quickly is it growing is probably a better question. Maybe there's a ton of data in it, but it's not growing that quickly. And so, we don't need to worry about any runaway characteristics.

The other side of it is how easily can we read from it? And that is the one that's going to be RAM-constrained. Where can we maintain an index efficiently? Can we query effectively and use RAM and whatnot? So a million starts to become an interesting number, probably. But I've worked on plenty of databases where hundreds of millions of rows existed, and we've got efficient indices in place and enough RAM that the database just happily works with that, and there are no problems.

So really, it's a question of like, if we start thinking about having to need to delete data, then that's a large table. If we have one table that is wildly larger than the others in the system, then that is something that I'll keep an eye on. I want to make sure, like, how's that table doing? How's the special table doing?

And often, there is one or two special tables similar to the idea of god objects within a system where these are the one or two classes that have just method after method after method after method. Similarly, there are one or two database tables that often have the lion's share of the data within the system. And so those are the ones that I'm really focused on.

And especially as we get closer to the RAM limit, there's this drop-off that I've seen happen where a system is like, it's fine. We got 250 gigs of RAM; there's no problem. And our database is only 100 gigs. And then a couple of weeks later, suddenly, had a bunch of new users sign up, and suddenly, your data and your indices no longer fit in memory. And now we're paging to disk, and suddenly the performance characteristics of your system just tank. And so it's that sort of thing. Watching growth rates is perhaps more important than the absolute size of any individual table. So yeah, those are some loose thoughts.
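To make the "watch growth rates" point a little more concrete, here is a rough back-of-the-envelope sketch in Python. The 250 GB of RAM and the 100 GB database come from Chris's example; the 15% weekly growth rate and the function name `weeks_until_exceeds_ram` are assumptions invented for illustration, and a real database needs memory for much more than data and indexes.

```python
import math


def weeks_until_exceeds_ram(data_gb: float, ram_gb: float, weekly_growth: float) -> int:
    """Return the number of whole weeks until the data no longer fits in RAM.

    Assumes compound growth at a constant weekly rate, which is a simplification.
    """
    if data_gb > ram_gb:
        return 0
    if weekly_growth <= 0:
        raise ValueError("weekly_growth must be positive for the data to outgrow RAM")
    # Solve data_gb * (1 + g)^n > ram_gb for the smallest whole number n.
    weeks = math.log(ram_gb / data_gb) / math.log(1 + weekly_growth)
    return math.floor(weeks) + 1


# 100 GB of data and 250 GB of RAM are the figures from the conversation.
# The 15% weekly growth rate is a hypothetical number.
weeks = weeks_until_exceeds_ram(100, 250, 0.15)
print(weeks)
```

The absolute size looks comfortable today, but under these assumed numbers the headroom is gone in a matter of weeks, which is why the growth rate is the figure worth tracking.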

STEPH: I like how you used the word interesting. I think that's a nice replacement for the word large. When we get around a million records, things start getting more interesting in how we're rolling out schema changes. And then there's also you touched on usage, which aligns well with I often don't think so much about how many records that we have in a table.

But what's the usage of that table? How many queries or transactions are being executed against that table? Is this a very popular table like the users table? And will running a migration that renders that table inaccessible for a couple of seconds will that be problematic, or is this a table that we write to a lot, but we don't read from very often?

And even if it runs a couple of seconds, it's not likely to have an impact on people using the application. So that's one area I tend to think about first is what's the popularity of this table? And how cautious do I need to be in making sure that we don't block other people from accessing this data?

I also really like how Matt asked the question about what considerations come to mind when you're doing PR work that has a database interaction? That's one of those areas that, honestly, I lean pretty heavily on Strong Migrations to remind me how I can rewrite a migration to avoid or to transfer a blocking operation to a non-blocking operation.

So a really good example is setting a NOT NULL constraint on an existing column. I know that it can be very blocking if you try to do that by default when you first run it, and I will look it up every time. I will check Strong Migrations and say, "Hey, I know you've got some really great docs that will walk me through about adding a check constraint instead," and then making sure that I can then add this new column.

So going forward, the constraint will apply to inserts and updates, but it doesn't validate all the existing data. It's also a really good reminder, that particular example, to start with stricter constraints because it's a lot easier to remove a constraint than to add one later. So that's one consideration that comes to mind.
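The check-constraint approach Steph mentions can be sketched roughly as follows. This is a hedged illustration of the underlying Postgres statements (assuming Postgres 12 or later), written as a small Python helper that only builds SQL strings. It is not the Strong Migrations API and not a Rails migration; the function name `not_null_steps` and the table, column, and constraint names are made up for the example.

```python
def not_null_steps(table: str, column: str) -> list:
    """Build Postgres statements that add NOT NULL without one long blocking scan.

    Only builds strings; running them (ideally across separate deploys) is up to you.
    Identifiers are interpolated directly, so pass trusted names only.
    """
    constraint = f"{table}_{column}_not_null"  # hypothetical naming scheme
    return [
        # 1. Enforced for new inserts and updates; existing rows are not checked yet.
        f"ALTER TABLE {table} ADD CONSTRAINT {constraint} "
        f"CHECK ({column} IS NOT NULL) NOT VALID",
        # 2. Checks the existing rows without blocking normal reads and writes.
        f"ALTER TABLE {table} VALIDATE CONSTRAINT {constraint}",
        # 3. On Postgres 12+, the validated constraint lets this skip the full scan.
        f"ALTER TABLE {table} ALTER COLUMN {column} SET NOT NULL",
        # 4. The check constraint is now redundant.
        f"ALTER TABLE {table} DROP CONSTRAINT {constraint}",
    ]


steps = not_null_steps("users", "email")
for statement in steps:
    print(statement + ";")
```

Splitting the work this way keeps the step that holds the strictest lock very short, and moves the slow part (checking existing rows) into a step that does not block regular traffic.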

I also think the fail fast and fail loudly applies nicely here. So if I'm looking at a PR that is making a schema change, then I want to validate that the application has low timeout values so that way if a migration does take more than 30 seconds to run, then the migration will timeout. And then that will alert the developer to say, "Hey, do you need to think of a new approach or see if there's a way to improve this?" Versus if that migration didn't timeout, then that timeout is going to become user-facing as they start to experience problems with the site.

And then also looking for more performant methods, so using find_in_batches, update_all, delete_all, just checking for the more performant ways that we can update large sets of data. Those are, I think, the top things that I really look for. How about you?
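As a rough illustration of the batching idea, here is a minimal Python sketch using the standard library's `sqlite3` module. It is not Rails and not `find_in_batches` itself; it only shows the same principle in miniature: work through a large set of rows in small chunks, each in its own short transaction, rather than in one giant statement. The `events` table, its columns, and the helper name `delete_in_batches` are invented for the example.

```python
import sqlite3


def delete_in_batches(conn: sqlite3.Connection, cutoff: int, batch_size: int = 1000) -> int:
    """Delete rows with created_at < cutoff in small batches; return rows deleted."""
    total = 0
    while True:
        with conn:  # each batch commits in its own short transaction
            cursor = conn.execute(
                "DELETE FROM events WHERE id IN ("
                "  SELECT id FROM events WHERE created_at < ? LIMIT ?"
                ")",
                (cutoff, batch_size),
            )
        if cursor.rowcount == 0:
            break
        total += cursor.rowcount
    return total


# Build a throwaway in-memory table: 500 rows for each year from 2000 to 2009.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at INTEGER)")
with conn:
    conn.executemany(
        "INSERT INTO events (created_at) VALUES (?)",
        [(year,) for year in range(2000, 2010) for _ in range(500)],
    )

deleted = delete_in_batches(conn, cutoff=2005, batch_size=1000)
remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(deleted, remaining)
```

Because each batch commits separately, other work can interleave between batches, and a failure partway through leaves the earlier batches done rather than rolling back everything.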

CHRIS: Yeah, I think very similar to everything you just said. And broadly, there's a point in time that happens frankly pretty early on in the growth of an application and the data set behind it where you need to start behaving differently with regard to migrations. There's a small period of time where I can just get away with anything.

I actually really love the part before we have any actual users where I'm like, oh, we need to change this fundamentally. I'm just going to drop the table and rebuild it because it's easier than trying to think about how to migrate this data. But so quickly, you get into a place where it's like, nope, sorry, can't do that have to treat this as realistic.

So a bunch of the strategies that you're describing, like indexes concurrently, is one of the things that I'll reach for often because that allows me to decouple the timing there and not...again, the migration timeout that you're talking about is absolutely something that I want to have. Migrations should go through quickly, and if they can't, then we need a different approach. Maybe we need to introduce the new column, write to that one in parallel to the existing column, and then eventually do a switchover. It's definitely more work and involves a couple of deploys to get that done, but that's the unfortunate reality that we have to move to.

I will say one of the things we talked about is like, if we hit that timeout, then we're going to stop that migration. This is a critical feature that I rely on deeply in Postgres, which is that schema migrations or DDL transformations; if I'm saying that correctly, I'm not sure I am, but throwing an acronym out there, it'll be fun. This is actually one feature of Postgres that I really rely on.

My understanding is that Postgres has this; MySQL does not, but I may be off. I know that Postgres has transactional DDL transformation, so schema migration sort of things. I'm adding a column; I'm removing a column, et cetera. Those inherently happen within a transaction, and that's wonderful because if they do timeout, we want to be in a consistent state.

The worst thing I can possibly imagine is being like, we got halfway through, but then we failed, or we lost connection, and so it's half migrated. It's like, oh God, I want to trust my database deeply. That's sort of one of the fundamental things that I have. And I've, over time, pushed more and more into the database and saying let's have check constraints. Let's have null: false and all of these sorts of things so that the data in my database can be deeply trustworthy.

The idea that a schema migration could go awry, and suddenly we've got like, well, half of the rows have these extra columns. What does that even mean? How do you live in that world? So I love this feature of Postgres. I really rely on it. I feel very bad whenever I have to disable it. I think there are some enum-related things that require disabling DDL transactions. And whenever I type that in a migration, I'm like, I don't like this. I'm not happy about this, but it's the world we live in for now.

STEPH: If we're sharing our truths, yeah, adding an index concurrently also you have to remove that DDL transaction and disable it. For a previous project that I was working on, we often ran into that timeout where we'd run a migration, and then it would timeout. And we would then just specify and be like, "Hey, for this migration, I'm going to bump you up to a minute. I'm just going to make it longer."

And that felt questionable at times, but I at least appreciate the explicitness of it where you're making that decision to say, nope, I think this is fine. It’s not going to impact anybody, or we're going to run it in off-hours. I do want to extend this to a minute, and then make sure you do reset it, so it doesn't affect it globally from there on out.

But that's something that you can do, and I have done before, which I feel is important. You still want to know some of your outs in case you do need something like that just to fix things in a moment but then at least be intentional for when you're using it and then communicate to the team like, "Hey, I'm doing this and let me know if you have concerns about it."

For this specific scenario that Matt provided about we recently needed to delete a lot of old rows out of a large table, and the team had a lot of discussion about how to handle this without impacting production users; I happened to have a really nice conversation with Steve Polito, a fellow thoughtboter, about this particular question. And he had a very thoughtful response that I hadn't considered where he suggested starting with deleting the data for a small set of records.

So, for example, if you're working with a users table, you could scope the data deletion to only inactive users and then use a feature flag to disable any interactions that would be affected by that data loss, run that change to delete the data for those inactive users, and then check for unexpected errors or side effects.

So then that way, you have this moment to pause to say, "Hey, did we forget something? Is there something about this application that's still relying on that data that we forgot about? We've only done it for a small amount of users, so we're in a safer space." So then, at that point, you can either repeat those steps for another batch of records or use that feedback to then drop the column with confidence. And that was an approach that I hadn't considered, but I really liked that idea.
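Steve's suggestion could look something like the sketch below, again using Python's built-in `sqlite3` purely for illustration. The `users` and `profiles` tables, the `active` flag, the `health_check` callback, and the `staged_delete` helper are all hypothetical names. In a real application the "check" would be watching error rates or alerts after each step rather than calling a function.

```python
import sqlite3
from typing import Callable


def staged_delete(
    conn: sqlite3.Connection,
    batch_size: int,
    health_check: Callable[[], bool],
) -> int:
    """Delete profile data for inactive users, one small batch at a time.

    Stops as soon as health_check() reports a problem, so any damage is
    limited to the batches already processed. Returns users processed.
    """
    processed = 0
    while True:
        user_ids = [
            row[0]
            for row in conn.execute(
                "SELECT u.id FROM users u JOIN profiles p ON p.user_id = u.id "
                "WHERE u.active = 0 LIMIT ?",
                (batch_size,),
            )
        ]
        if not user_ids:
            break
        placeholders = ",".join("?" * len(user_ids))
        with conn:  # one short transaction per batch
            conn.execute(
                f"DELETE FROM profiles WHERE user_id IN ({placeholders})", user_ids
            )
        processed += len(user_ids)
        if not health_check():
            break  # pause here and investigate before touching more rows
    return processed


# Ten users; even-numbered ids are inactive. One profile row per user.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, active INTEGER)")
conn.execute("CREATE TABLE profiles (user_id INTEGER PRIMARY KEY, bio TEXT)")
with conn:
    for user_id in range(1, 11):
        conn.execute("INSERT INTO users VALUES (?, ?)", (user_id, user_id % 2))
        conn.execute("INSERT INTO profiles VALUES (?, ?)", (user_id, "bio"))

# First pass: one small batch, then stop and look for surprises.
first = staged_delete(conn, batch_size=2, health_check=lambda: False)
# Nothing broke, so continue through the rest of the inactive users.
rest = staged_delete(conn, batch_size=2, health_check=lambda: True)
left = conn.execute("SELECT COUNT(*) FROM profiles").fetchone()[0]
print(first, rest, left)
```

The deliberate stop after the first small batch is the point of the design: it creates the moment to ask "did we forget something?" while only a handful of records are affected.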

CHRIS: Yeah, it's a nice, I'd say methodical approach to what can be a very complex and difficult to wrangle task. I will say I haven't actually explored this too much, but I've always had in the back of my mind, like, if we're deleting data from the application, ideally, we're saying this data is no longer needed.

But I wonder if using table partitioning is an alternative that can be useful in these cases. What if we're able to figure out the correct partitioning? It's often time series sort of stuff. What if we're able to lean into that and say, "Let's partition this by year." And then yeah, we don't use the old data anymore, but it lives in a separate partition. And therefore, I think Postgres is able to do reasonable things with that.

And again, like disk space, we can have a lot more storage on disk, but RAM is really going to be the constraining factor of how much of the index fits in memory. And again, I haven't pushed on this. But I think that's an alternative approach that can be really interesting. But if we do have time-series data, in particular, Postgres is wonderful. But it's not necessarily honed exactly to that use case.

And so, there are a couple of tools that I've kept an eye on right now: ClickHouse, Timescale, and InfluxDB being the three of them. And I think most if not all of them are based on Postgres, but then they build on top of it. And they add some deeper understandings of time series data and how to optimize querying and storing, and all of that. And so, is that an alternative that allows us to still stay in this world but then have a different approach and alleviate some of the burdens that might come with this heavy data that we have?

STEPH: Yeah, all those sound interesting. I haven't heard of some of those. This is why I love chatting with you. You always bring interesting perspectives that I had not considered before, like the partitioning. Just to clarify, partitioning the data is a way of keeping that data, but then it's not indexed. So that way, your system isn't spending as much time making sure that data is easily readable. But then that way, you don't actually delete it, so then it's there should you wish to be like, oh, I wish I hadn't gotten rid of that data.

CHRIS: I think so. I'll be honest; I don't deeply understand it. But I think you basically can say given a giant projects table within your system; we actually may have that logically grouped by user sort of thing. And so we can shard and partition and say, there are ten different buckets of these. And if we optimize it well such that all of the things that are logically together actually live together on disk, then it allows Postgres to be much more efficient.

Similarly, with time-series data, then you can say, use this sort of windowing where it's each month, we get a new bucket. And then it's much easier to query across just that bucket because it's already sort of partitioned down in that way.

But I'll be honest; I'm now speaking well past my actual knowledge. I've never actually worked with it. But it's one of those things that I have in the back of my mind. Like if all of my other tools fail me and if I cannot solve these performance problems in a Postgres system with indexes, and tuning, and other things like that, then I will look to partitioning. So I look forward to that day when the data problems are so massive that I need to table partition.
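For a rough picture of the "each month, we get a new bucket" idea, here is a small sketch that builds the DDL for Postgres declarative range partitioning. The `events` table, its `created_at` column, and the `month_partition_ddl` helper are assumptions for illustration; the statements are only generated as strings, not run.

```python
from datetime import date

# Assumed parent table, partitioned by a timestamp column.
PARENT_DDL = (
    "CREATE TABLE events (id bigint, created_at timestamptz NOT NULL) "
    "PARTITION BY RANGE (created_at);"
)


def month_partition_ddl(parent, year, month):
    """Build the CREATE TABLE statement for one monthly partition of `parent`."""
    start = date(year, month, 1)
    # The upper bound is exclusive: the first day of the following month.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{parent}_{start:%Y_%m}"
    return (
        f"CREATE TABLE {name} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


ddl = month_partition_ddl("events", 2021, 12)
```

Old data can then be set aside one partition at a time, for example with `ALTER TABLE events DETACH PARTITION events_2021_12;`, instead of deleting rows from one giant table.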

STEPH: Got it. Like they say, it's a good problem to have. While adding to the list of tools, there's one that I discovered recently; it's called Safe PG Migrations. And it's very similar to Strong Migrations, where Strong Migrations will warn you and say, "Hey, this is not safe. There are other ways to write this migration." Safe PG Migrations takes a more aggressive approach and will rewrite your migration to be a safer version. And I don't know how I feel about it. I love it, and I hate it. [laughs]

It's one of those the magic is there, and that could be phenomenal. But I get squeamish when things want to rewrite something as important as my migrations. But on the other hand, it is like a really nice default for the team because it's more than a warning. So that way, if you're trying to put something more strict in place for people to follow, then this would be a good way to do that.

CHRIS: I'm very intrigued by that as a tool because if it were obvious, then Postgres would do it. The team behind Postgres does absolutely amazing work. And so if I tell them, "This is the change I want to make to the system," and they're like, "Cool, we're going to do that super inefficiently," and someone else is like, "No, no, no, I can trick it." Postgres is good at tricking itself, is my stance.

So I'd be intrigued as to what secret knowledge they have or what are their caveats where Postgres has to handle every possible edge case. And therefore, it's slower because of pessimistic concerns that it has. But this tool says, "No, no, as long as you're not doing this very terrible thing, you're fine. And we'll rewrite it to a safer, faster version." But I'm just kind of intrigued, like, why do you think you're better than Postgres?

STEPH: [laughs] Why do you think you're better? Well, I do have an example I can provide. It's one that they have on their README. And this one highlights that if you're adding a column to an existing table and that you're adding a default value and no constraint, then they show you how it's rewritten where they set explicitly the lock timeout, and then they will add the column.

And then they will set the default value but not in a way that it's going to do a table scan where it's going to add it for all the existing records; it's going to be for new records. And then they, let's see, they also update the users in batches to then set a default value, and then they will reset the statement timeout because it looks like they are...yeah, because initially, they change it, so they're resetting it to an original value. And then, they set the column Null constraint. I know I just said a lot of things reading from their README.

But they have a good example here that kind of highlights that this is how they rewrite it. So I do find that more reassuring as long as I can then see how it was rewritten, and then I can validate it and confirm it with what I think is appropriate. Then I still have full control. Then it's more of a hey, we rewrote this thing for you. Feel free to review it and then change it as you see fit. As long as I have that final authority, then that makes me feel better about this.
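The sequence Steph reads out can be sketched as an ordered list of Postgres statements. This is not the gem's actual output, just a loose illustration of the general shape; `safe_add_column_plan`, the `users` table, the `admin` column, and the timeout and batch-size values are all assumptions.

```python
def safe_add_column_plan(table, column, sql_type, default,
                         lock_timeout="5s", batch_size=1000):
    """Return SQL steps for adding a NOT NULL column with a default, one small step at a time."""
    return [
        # Fail fast instead of queueing behind other locks.
        f"SET lock_timeout = '{lock_timeout}'",
        # Add the column with no default and no constraint first.
        f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}",
        # The default now applies to new rows only.
        f"ALTER TABLE {table} ALTER COLUMN {column} SET DEFAULT {default}",
        # Backfill existing rows in batches; repeat until no rows are updated.
        f"UPDATE {table} SET {column} = {default} WHERE id IN "
        f"(SELECT id FROM {table} WHERE {column} IS NULL LIMIT {batch_size})",
        # Only add the constraint once every row has a value.
        f"ALTER TABLE {table} ALTER COLUMN {column} SET NOT NULL",
        "RESET lock_timeout",
    ]


plan = safe_add_column_plan("users", "admin", "boolean", "false")
```

Printing `plan` before anything runs is one way to keep the final review with the person writing the migration.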

CHRIS: Got it. That makes sense. And the thing that you're describing, I think, is a semantically different thing than the first migration where it's like, do this thing. And they're like, well, okay, you could. But if instead, you did X, Y, and Z, then it would go way faster and be way easier. That makes a lot of sense. And it feels like shared knowledge wrapped up into a tool, which I'm always a fan of.

STEPH: Yeah, in general, when I think about general strategies for schema changes, there are really three areas that come to mind or three strategies that come to mind. The first one is that we take incremental steps to avoid blocking reads and writes to the table, which then allows you to deploy during business hours or off business hours. That often means just more manual steps that you have to take to make sure that it's safe. And then the other one is scheduling downtime to run a migration. That is a very real option, something that you can do. Or have a fancy setup that utilizes followers for seamless migrations and upgrades.

I feel like that's like the three big buckets that you can fit your strategy within. And it just depends on the needs of your application and users as to which one of those you're ready for or which strategy you need to use. What do you think? Are there any other big buckets that I left out of that list?

CHRIS: No, I think we covered a bunch there. Hopefully, that was useful. Hopefully, it, I don't know, maybe introduced folks to some new ideas or ways to think about this sort of work. And yeah, with that, shall we wrap up?

STEPH: Yeah, I've still got my Wordle to play for the day. So let's wrap up.

CHRIS: The show notes for this episode can be found at bikeshed.fm.

STEPH: This show is produced and edited by Mandy Moore.

CHRIS: If you enjoyed listening, one really easy way to support the show is to leave us a quick rating or even a review on iTunes, as it really helps other folks find the show.

STEPH: If you have any feedback for this or any of our other episodes, you can reach us at @_bikeshed or reach me on Twitter @SViccari.

CHRIS: And I'm @christoomey.

STEPH: Or you can reach us at [email protected] via email.

CHRIS: Thanks so much for listening to The Bike Shed, and we'll see you next week.

ALL: Byeeeeeeeeeee!!!

ANNOUNCER: This podcast was brought to you by thoughtbot. thoughtbot is your expert design and development partner. Let's make your product and team a success.