Please reuse your database servers
I’ve been part of a number of “monolith to microservice” transitions and something that I’ve seen a few times is engineering orgs creating separate database servers per-team or even per-service.
Please don’t do this. Please start with a production server, and an “everything else” server, until you outgrow one large server.
Running separate servers, as opposed to separate databases on the same server, is purely engineering overhead and comes with no practical advantages. From an application perspective, connection info is just connection info, so it makes no difference where the DBs are hosted.
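As a rough illustration of the "connection info is just connection info" point, here is a minimal Python sketch. The service names, hostnames, and users are all hypothetical; the only thing it is meant to show is that moving a database to its own server later changes a hostname in the config and nothing in the application.

```python
from urllib.parse import urlsplit

# Hypothetical connection strings: two services, separate databases,
# one shared server.
SHARED_SERVER = {
    "billing": "postgresql://billing_app@db-prod.internal:5432/billing",
    "orders": "postgresql://orders_app@db-prod.internal:5432/orders",
}

# The same services after a later (hypothetical) move to dedicated servers.
# Only the host part of each URL differs.
SPLIT_SERVERS = {
    "billing": "postgresql://billing_app@billing-db.internal:5432/billing",
    "orders": "postgresql://orders_app@orders-db.internal:5432/orders",
}


def describe(url):
    """Return (host, database, user) parsed from a connection URL."""
    parts = urlsplit(url)
    return parts.hostname, parts.path.lstrip("/"), parts.username


def distinct_hosts(config):
    """Return the set of server hostnames a config points at."""
    return {describe(url)[0] for url in config.values()}
```

Each service still gets its own database and its own user in both layouts, so the isolation the application sees is the same; `distinct_hosts` is the only thing that changes.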
From an ops/infrastructure perspective, it’s managing (number of services) * (number of environments) servers, and managing more servers will always be harder than managing fewer (without a PaaS-like investment).
And because it doesn’t matter to the application, adding database servers upfront is over-engineering.
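To put numbers on the (number of services) * (number of environments) point, here is a small Python sketch. The service and environment names are made up for illustration; the shared count of two comes from the "production server plus an everything-else server" starting point described above.

```python
# Hypothetical services and environments, purely for illustration.
SERVICES = ["auth", "billing", "orders", "search"]
ENVIRONMENTS = ["production", "staging", "development"]

# The starting point suggested in the post: one production server
# and one "everything else" server.
SHARED_SERVER_COUNT = 2


def per_service_server_count(services, environments):
    """Servers to manage when every service gets a server per environment."""
    return len(services) * len(environments)


def extra_servers(services, environments):
    """How many more servers the per-service layout needs than the shared one."""
    return per_service_server_count(services, environments) - SHARED_SERVER_COUNT
```

With these four services and three environments, the per-service layout means 12 servers to monitor and maintain instead of 2, and the gap widens with every service added.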
For any benefit you think you’ll get for having a server per-team or per-service, start doing it on one server, and see how useful and productive it is. My experience, and my guess, is that most teams who talk about hypothetical benefits aren’t pursuing them on the servers they currently have, and better tools are available. In many cases the teams don’t understand the tools that already exist in their database ecosystem of choice.
There are two times you may want to break stuff out early. One would be for a high-use, critical path service, like sessions and auth. The other would be for a constant high-load service, like analytics.
I’d be curious if anyone has an argument or experience that goes against this advice. Most of what I’ve seen runs into fundamental misattribution problems, where the choice is unrelated to the benefits (this is true of most rewrites I’ve been a part of or have heard about). Let me know in the comments!
Can you clarify if you are talking about databases that hold canonical data?
About this:
“From an ops/infrastructure perspective, it’s managing (number of services) * (number of environments) servers, and managing more servers will always be harder than managing fewer (without a PaaS-like investment).”
Is that still true if you’re using something like Terraform? Or is Terraform part of what you meant when you spoke of PaaS?
I think Terraform will tear down database servers and then set them up again, but it’s automated, so does it matter?
I don’t especially disagree with this:
“For any benefit you think you’ll get for having a server per-team or per-service, start doing it on one server, and see how useful and productive it is.”
Except maybe you want different teams to have different ssh permissions, and that is easier if they all have their own servers?
Again, can you clarify if you are talking about databases that hold canonical data? On the last few projects I worked on, all the canonical data was in Kafka, so all the SQL databases were just caches, which could easily be torn down and rebuilt. I wrote about that here:
http://www.smashcompany.com/technology/why-are-software-developers-confused-by-kafka-and-immutable-logs
Hi Lawrence. Great questions.
Last question first: as you suspect, I would only apply this advice to servers with canonical/durable data. So in your case, I guess it’d be the Kafka hosting/servers. I guess I would extend it to something like durable Redis servers, like for a background job system. Your article is great BTW. I didn’t finish the whole thing, but it seems to grasp all the issues I’ve seen with microservices as well.
So regarding Terraform, because these are servers holding canonical data, I’m not sure how Terraform gets around most problems (though I am not an expert and may be missing something). The servers still need to be monitored and maintained, for instance. And I think you’d run into problems trying to manage microservices with a monolithic devops script setup, so you’d still need each service to specify some of its own setup and end up essentially duplicating “code.” I am probably wrong on a number of details but unless I’m misunderstanding something I’m not sure how Terraform will help you much in this situation.
Rob, about this:
“So regarding Terraform, because these are servers holding canonical data”
I assume you’d agree that all servers die occasionally, and any robust system needs to be set up to handle the sudden death of a server? Can you explain this:
“And I think you’d run into problems trying to manage microservices with a monolithic devops script setup”
How do you manage the combination of microservices plus servers that die? I end up writing some code for handling restarts. I’m not sure if “monolithic” is the right word, but surely all of us have written some “devops script setup”?
About this:
“so you’d still need each service to specify some of its own setup and end up essentially duplicating ‘code.’”
I agree this happens, but what is the alternative? I write code to restart servers when they die. Some of the code ends up being a bit duplicate, that is true.
In terms of the configuration, the most convenient thing is to put the config variables into the Terraform code, but I know some shops are strongly opposed to that, as it means putting such variables into a git repo, and many shops are opposed to having anything like a username or password in a git repo. But if you don’t have that in a git repo, it’s got to be somewhere. I’ve also relied on Supervisor to inject ENV variables, but then that means manually setting up Supervisor. I don’t think there is any magic solution here. Either one has config info in a repo, or one sets everything up manually. Either approach has certain problems.
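For the ENV-variable route mentioned above, a minimal Python sketch of the application side might look like the following. The variable names (`DB_HOST` and so on) are assumptions made for illustration, not a convention of Supervisor or Terraform; whatever injects the environment would have to set them.

```python
import os

# Hypothetical variable names; whatever injects the environment
# (a process manager, a deploy script) would need to set these.
REQUIRED = ("DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD")


def load_db_config(env=None):
    """Read database settings from the environment, failing loudly if any are missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {key.lower(): env[key] for key in REQUIRED}
```

This keeps credentials out of the repo, but it does not remove the tradeoff: something outside the repo still has to hold the values and put them into the environment.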
Thanks for the feedback, Lawrence. You’re asking very good questions, but I am not an expert (or even very experienced) on most of these topics, and I don’t have good answers or opinions on them. We’ve veered quite far from the central thesis, which is “don’t overcomplicate your infrastructure for no benefit,” and into “you need some benefits, so what is the preferred way to complicate your infrastructure” :P My expertise/conviction is in “a good monolith can solve the vast majority of your problems.”
I can at least affirm your experience that the preferences and best practices of everything dealing with microservices are all over the place :)
Thanks, Rob. I also prefer simplicity. I am curious if you have an opinion about Docker/Kubernetes? I’ve been critical of them because I feel that Terraform gives all the benefits of VMs without the complexity. And much of the complexity seems like it is invested in saving old technology that perhaps should be retired. I wrote up my thoughts here: http://www.smashcompany.com/technology/docker-protects-a-programming-paradigm-that-we-should-get-rid-of