In this post I explain why you should avoid exposing every tiny configuration property in your SaaS application. I focus on configuration for SaaS deployments, which have the unique property that they are fully deployed and managed by either dev or ops teams.
Stop Exposing Config Properties
To show my pain I’ll start with an example. I have a service that receives messages and propagates some of them according to a configured type and severity.
My service is a Spring Boot Java server, so I can pass something like this as a config property to the service: -Dalerts.propagate.conf="ALERT_A:2,ALERT_B:4,ALERT_C:8".
My service runs as a pod in Kubernetes, so this property will be part of the JAVA_OPTS environment variable inside the deployment.yaml. The deployment.yaml is obviously not hard-coded (we are not monkeys); it is built dynamically using Helm charts that take parameters from the CLI. Obviously, building long helm commands manually is for amateurs, so it is automated in Jenkins, and the parameters are taken from Jenkins params and secrets.
Remember that a string with special characters must be escaped. This whole pleasing routine must be done for every config property that you decide to expose. It has to be written and maintained in dev/test-environments, production, etc.
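Even after the value survives every layer, the service still has to parse and validate it. Here is a minimal sketch of that last step, assuming the value is a comma-separated list of type:severity pairs. The class and method names are hypothetical and not taken from the real service.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the names are illustrative only.
final class AlertPropagationConfig {

    private AlertPropagationConfig() {}

    // Parses "ALERT_A:2,ALERT_B:4" into {ALERT_A=2, ALERT_B=4}.
    // Assumption: every entry has the form <type>:<integer severity>.
    static Map<String, Integer> parse(String raw) {
        Map<String, Integer> severityByType = new LinkedHashMap<>();
        if (raw == null || raw.trim().isEmpty()) {
            return severityByType;
        }
        for (String entry : raw.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 2 || parts[0].trim().isEmpty()) {
                throw new IllegalArgumentException("Malformed entry: '" + entry + "'");
            }
            try {
                severityByType.put(parts[0].trim(), Integer.parseInt(parts[1].trim()));
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Severity is not an integer: '" + entry + "'", e);
            }
        }
        return severityByType;
    }
}
```

With this sketch, an entry such as ALERT_C=8 (an '=' typed where ':' was meant) fails fast with an IllegalArgumentException instead of being silently ignored. That is one more piece of code to write and maintain for a single exposed property.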
Is this the easiest way?
Now I have a simple question: “Is this the easiest way?” One might say: you got it all wrong, put the config in a file or even in a database and avoid the hassle! I totally agree, but in real life you can’t always make big changes on the spot and completely replace the config mechanism.
It is very common to think that potential disasters can be avoided by changing some magical configuration property, but I would like to challenge this perception, along with the convention that guides us to expose config properties almost automatically.
I want to show that in SaaS environments where software is continually deployed, exposing properties this way is not only of little use but may even lead to errors and a continuous waste of precious time.
When a production issue occurs, I would like our DevOps teams to have the most accurate and focused knowledge of how to fix it. Many half-documented config properties will not do any good. If no solution can be found and the hints lead to the code, I will be asked to dive into it. Unless I wrote that code relatively recently, I will probably have to investigate and wander about for a while until I find the magical property (assuming we actually have the right property for that exact issue...). Let’s assume it takes me a few hours to find the cause. Then I’ll tell my DevOps colleague what should be changed and we are done. Now, what if there is no such magical property? How long will it take me to modify a hard-coded value, build the application and provide a new version? If your answer is more than 20 minutes, then configuration is not your problem and you should check your build process.
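For comparison, the hard-coded alternative can be as small as the sketch below. The class name is hypothetical. The rule that a message is propagated once its severity reaches the number listed for its type is my assumption about how such values would be interpreted.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class: the three values from the example property, kept in code.
final class AlertPropagationRules {

    private static final Map<String, Integer> MIN_SEVERITY_BY_TYPE;

    static {
        Map<String, Integer> rules = new HashMap<>();
        rules.put("ALERT_A", 2);
        rules.put("ALERT_B", 4);
        rules.put("ALERT_C", 8);
        MIN_SEVERITY_BY_TYPE = Collections.unmodifiableMap(rules);
    }

    private AlertPropagationRules() {}

    // Assumption: propagate when the severity reaches the threshold listed
    // for the type; unknown types are never propagated.
    static boolean shouldPropagate(String type, int severity) {
        Integer minSeverity = MIN_SEVERITY_BY_TYPE.get(type);
        return minSeverity != null && severity >= minSeverity;
    }
}
```

Changing a threshold here is a one-line code change that goes through review and the normal build, with no escaping and no extra config layers in between.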
In my experience, config properties that were not well defined as part of some feature tend to remain unchanged forever. They can be misused, and they add unnecessary overhead to all development and production phases. The deployment process is not always so simple, and properties might be propagated through several config layers (see the example above). You might find yourself escaping those properties multiple times and maintaining them over and over, which makes the process much more error-prone. In addition, there is always a chance of hitting some esoteric bug in one of the layers (e.g., Helm writing integers as floating-point numbers, https://github.com/helm/helm/issues/1707). In some cases the number of properties grows so much that third-party configuration tools become a necessity (and a liability), leaving us with a lot of bureaucracy to manage.
A huge number of unused config properties makes things hard to follow and maintain. In such an environment, it is only a matter of time before you find sensitive properties that have leaked to dangerous places. For instance, a secret might be left as plain text inside a k8s Deployment instead of in a safe k8s Secret. This can be easily detected with Alcide’s Kubernetes Advisor, but it is better to avoid the issue to begin with. In a production cluster there are enough moving parts as it is, and enough security concerns as a result, so you should keep your code as clean and as simple as possible and save your time for developing instead of maintaining unneeded properties.
To summarize, configuration properties are too often a premature optimization. It takes time to manage and use them in dev environments, test environments and eventually in production. An exposed property should either be well defined as part of a feature or have a good explanation. It can also emerge from a real problem that was detected and fixed by exposing a relevant property. A simple guiding question for whether a property is necessary is “must I change this property based on the environment?”
A word about non-SaaS installations
I think that exposing configurations may be a good idea for such environments (on-prem, RT, etc.). You may even want to consider deploying a configuration service that allows modifying configurations without replacing any files on the customer site. Exposing config properties does come at a certain price, but the ability to solve problems by remotely changing a configuration, without bothering your customers with technicalities, is priceless: customers just have to approve the new configuration. I do recommend that, where possible, a configuration should have a good default value, but if there is no such value, make sure the property is clearly documented. That being said, one might wonder, “Should I expose the config-server URL as a local config on the customer’s installation?”, but I’ll leave that for you to decide.
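As a small illustration of the default-value recommendation, the sketch below reads the example property and falls back to a built-in default when it is not set. The class name and the default value are hypothetical. System.getProperty(key, default) is the standard Java call for reading values passed with -D.

```java
// Hypothetical helper: resolves the example property with a built-in default.
final class AlertConfigSource {

    static final String PROPERTY_NAME = "alerts.propagate.conf";

    // Assumed default, used only when nothing was passed with -D.
    static final String DEFAULT_VALUE = "ALERT_A:2,ALERT_B:4,ALERT_C:8";

    private AlertConfigSource() {}

    static String resolve() {
        // Returns the -D value if present, otherwise the default above.
        return System.getProperty(PROPERTY_NAME, DEFAULT_VALUE);
    }
}
```

With a default in place, an installation that never sets the property still behaves sensibly, and the property only needs attention on the sites that actually override it.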