Don’t bother with migrations for performance-based indices
I used to write database migrations for performance-based indices, like adding an index to a foreign key column. I don’t do it anymore, and do not recommend the practice. Instead, write your SQL against a production-like database (same schema and data), and run it against production directly as part of your “operations” rather than as part of your development workflow.
Caveats: This doesn’t apply to migrations part of the initial schema (may as well do that at once), and it doesn’t apply to indices part of a constraint, like a unique constraint. Also, you may have totally valid special-case exceptions, I’m only speaking as a rule of thumb on project sizes that you’d take advice on my blog from, like from 1-100 devs. When you get past a certain size, you probably already have standards for all these things, so you do you.
Writing formal migrations for performance-based indices doesn’t really have any upside.
- You don’t need them in non-production-like environments because they only matter when there is enough data and traffic to be worthwhile in the first place.
- They do not change the correctness of the code so are not relevant to the code.
- They are impossible to develop locally. You need to develop and test them in a production-like environment anyway.
- They’ll come along with database restores anyway, so your production-like staging or wherever that is based on production data will still get them.
The process I see everyone follow ends up being:
- Run SQL in a prod-like environment to find something that works
- Copy it to code
- Go through code review, merging, and deployment
- CD runs migration on production
- Verify performance has changed
It seems like you can have a much more iterative and faster process to avoid the migration and CD hoops. That said, you still do need a rigorous process for indices- unwinding from “we added too many indices and it hid performance issues” is almost impossible.
So the process I suggest is:
- Iterate on the migration in a prod-like environment
- Get another set of eyes to review the migration and effects
- Document the rationale and performance results (like a GitHub issues comment with the before/after EXPLAIN)
- Run the migration in prod directly
- Verify performance has changed
I mean, performance index migrations in code isn’t the end of the world, but I feel like removing the friction on this encourages us to fix issues faster and explore more. Like, maybe you want to replace a btree with a gin index- it seems silly to have to write a migration (and then another to potentially undo the change). So I encourage teams to avoid the rigamarole of following the code-change path for performance indices and be more open to experimentation
by exercising the same rigor but with a shorter process.