We can do much better

By coming back to the fundamentals...

Software engineer at the intersection of functional programming, product development and open-source

I was really taken aback the other day by this snippet from a Terraform file:

variable "db_username" {
  description = "Postgres username"
  type        = string
}

The Terraform motto is “infrastructure as code”, but what I see above is a configuration file masquerading as a programming language!

Later on, at work, I got into a fight with the AWS Secrets Manager and our EKS Cluster. The problem is simple:

  • The AWS Secrets Manager holds our Postgres database password, and is in charge of rotating it. Wonderful.

  • The EKS Cluster starts a pod where one container uses an environment variable expecting the database password. Great.

Question: how do we pass this piece of information between those two systems?

The path to madness

The current solution consists of precisely aligning several pieces of text:

  • The name of the secret created in the Secrets Manager.

  • The “secrets provider” resource in Kubernetes.

  • The “volumes” section of the Kubernetes pod.

  • The “mounted volume” for a given container.

  • The environment variable with a “secret reference”.

Let’s have a look at those pieces.

The secrets provider

First the secrets provider:

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
    name: database-secrets
spec:
    provider: aws
    parameters:
        objects: |
            - objectName: "rds!db-63dfcfd8-9b62-482a-ac60-faad5f806344"
              objectType: "secretsmanager"
    secretObjects:
    - secretName: database-username-and-password
      type: Opaque
      data:
      - objectName: "rds!db-63dfcfd8-9b62-482a-ac60-faad5f806344"
        key: DATABASE_USERNAME_AND_PASSWORD

This piece of configuration maps a secret name in the AWS Secrets Manager to a Kubernetes Secret. It actually does more than mapping: it also guarantees that both stay synchronized when the secret is rotated.

That’s great, but why does the secret name have to appear twice in this file? That name now lives in 3 places: the Secrets Manager, the parameters definition, and the secretObjects definition. Since everything is a string, nothing forces us to pass a valid secret name in the objectName field. We could have passed rds!db-63dfcfd8-9b62-482a-ac60-faad5f80634 only to discover much later that a digit was missing.
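To make the "missing digit" problem concrete, here is a minimal Haskell sketch of what a typed secret name could look like. It assumes, based only on the example above, that the name has the shape rds!db- followed by a UUID; RdsSecretName, mkRdsSecretName, isUuid and splitOn are hypothetical names invented for this illustration, not part of any AWS or Kubernetes library.

```haskell
import Data.Char (isHexDigit)
import Data.List (stripPrefix)

-- Hypothetical wrapper: a name checked to have the shape "rds!db-<UUID>".
-- (The shape is an assumption taken from the example in this article.)
newtype RdsSecretName = RdsSecretName String
  deriving (Eq, Show)

-- Split a string on a separator character (base has no splitOn).
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- A UUID is 5 hyphen-separated groups of 8, 4, 4, 4 and 12 hex digits.
isUuid :: String -> Bool
isUuid s = map length groups == [8, 4, 4, 4, 12] && all (all isHexDigit) groups
  where
    groups = splitOn '-' s

-- Smart constructor: rejects malformed names at construction time.
mkRdsSecretName :: String -> Either String RdsSecretName
mkRdsSecretName raw = case stripPrefix "rds!db-" raw of
  Just uuid | isUuid uuid -> Right (RdsSecretName raw)
  _                       -> Left ("not an RDS-managed secret name: " ++ raw)
```

With such a type, a field expecting an RdsSecretName cannot receive an arbitrary string, and the truncated name is rejected when the value is built rather than when the pod fails to start.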

We also have just introduced 3 new strings!

  • The name of the secrets provider: database-secrets.

  • The secretName and the key of the secret because Kubernetes secrets are effectively maps of maps of strings.

The volumes section

In the pod deployment file we need a section declaring which volumes are mounted:

volumes:
- name: secrets-provider-volume
  csi:
    driver: secrets-store.csi.k8s.io
    readOnly: true
    volumeAttributes:
      secretProviderClass: "database-secrets"

There, we need to make sure that we use the right name for the secretProviderClass, by using a string defined in another file. But look, we are introducing yet another string for the volume name: secrets-provider-volume!

Mounting a volume

Next we need to mount that volume inside the container that will use the secret:

volumeMounts:
  - name: secrets-provider-volume
    mountPath: /mnt/secrets/database
    readOnly: true

Once more we need to be careful to match the volume name and we add yet another string for the mountPath.

The environment variable

Finally we can declare an environment variable that will hold the secret:

- name: OCKAM_DATABASE_USERNAME_AND_PASSWORD
  valueFrom:
    secretKeyRef:
      name: database-username-and-password
      key: DATABASE_USERNAME_AND_PASSWORD

Like before we need to make sure to reuse the same names that we declared in the secret provider, secretName and key, except that they are now called name and key.

Note that nothing tells us that the volume secrets-provider-volume needs to be mounted, or whether the mountPath, /mnt/secrets/database, is relevant.

In summary

In order to use one AWS Secrets Manager secret in a Kubernetes pod we have to carefully align:

  • One generated string (the secret name).

  • 4 user defined strings.

  • In 10 different spots.

This is the definition of programming by coincidence!
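One way to see how much of this is pure bookkeeping is to write each name once and render the pod fragments from it. The following is a minimal Haskell sketch under that assumption; SecretWiring, databaseWiring and the three rendering functions are hypothetical names, and the output is plain lines of text rather than a validated Kubernetes manifest.

```haskell
-- Hypothetical record: every user-chosen name is written exactly once.
data SecretWiring = SecretWiring
  { providerClass :: String   -- SecretProviderClass metadata.name
  , k8sSecretName :: String   -- secretName / secretKeyRef.name
  , k8sSecretKey  :: String   -- key / secretKeyRef.key
  , volumeName    :: String   -- volumes[].name and volumeMounts[].name
  , mountPath     :: FilePath -- volumeMounts[].mountPath
  , envVarName    :: String   -- env[].name
  }

-- The values used throughout this article.
databaseWiring :: SecretWiring
databaseWiring = SecretWiring
  { providerClass = "database-secrets"
  , k8sSecretName = "database-username-and-password"
  , k8sSecretKey  = "DATABASE_USERNAME_AND_PASSWORD"
  , volumeName    = "secrets-provider-volume"
  , mountPath     = "/mnt/secrets/database"
  , envVarName    = "OCKAM_DATABASE_USERNAME_AND_PASSWORD"
  }

-- Entry for the pod's "volumes" section.
volumeYaml :: SecretWiring -> [String]
volumeYaml w =
  [ "- name: " ++ volumeName w
  , "  csi:"
  , "    driver: secrets-store.csi.k8s.io"
  , "    readOnly: true"
  , "    volumeAttributes:"
  , "      secretProviderClass: " ++ show (providerClass w)
  ]

-- Entry for the container's "volumeMounts" section.
volumeMountYaml :: SecretWiring -> [String]
volumeMountYaml w =
  [ "- name: " ++ volumeName w
  , "  mountPath: " ++ mountPath w
  , "  readOnly: true"
  ]

-- Entry for the container's "env" section.
envVarYaml :: SecretWiring -> [String]
envVarYaml w =
  [ "- name: " ++ envVarName w
  , "  valueFrom:"
  , "    secretKeyRef:"
  , "      name: " ++ k8sSecretName w
  , "      key: " ++ k8sSecretKey w
  ]
```

For instance, putStr (unlines (volumeYaml databaseWiring)) prints the volume entry, and the volume name it uses is by construction the same one that volumeMountYaml emits.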

Can we have a real programming language, please?

This short video is funny:

It basically says “We made all this YAML so that you don’t have to do it”.

One objective of the Unison programming language is to have a cloud offering where declaring services and all sorts of resources can be done programmatically. This has numerous advantages:

  1. We can name variables that hold values, and refer to those names everywhere.

  2. We can type those variables to prevent runtime errors.

  3. We can use abstractions to isolate from implementation details.

What would it look like for our “secret” example to use a programming language?

The secret sauce

If we had a programming language at our fingertips we could start by defining the concept of “secret source”, that can be implemented by an AwsManagedSecret (all the code that follows is pseudo-Haskell-like code):


databaseManagedSecret : IO AwsManagedSecret
databaseManagedSecret = getOrCreateAwsManagedSecret "prod-database" RotationPolicy.MONTHLY

class SecretSource s where
  getSecret : s -> IO Secret

instance SecretSource AwsManagedSecret where
  getSecret = todo

The purpose of a SecretSource is to give us the most recent secret when we ask for it. The real thing would actually give us more, like the frequency of the rotation, or how to access it via the network.

The implementation uses the AWS Secrets Manager, but it could just as well be an Azure Key Vault. You will notice that we give the managed secret a name, but that name is not relevant for the rest of the code.
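For readers who want something that compiles, here is a minimal sketch of the same idea in actual Haskell (note the :: syntax instead of the pseudo-code's single colon). It replaces the cloud services with in-memory stand-ins: InMemorySecret, StaticSecret, rotate and describeSecret are hypothetical names invented for this example, and nothing here talks to AWS or Azure.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

newtype Secret = Secret String
  deriving (Eq, Show)

-- Anything that can hand out the most recent secret on demand.
class SecretSource s where
  getSecret :: s -> IO Secret

-- Stand-in for a managed, rotating secret: the current value
-- lives in a mutable reference instead of a cloud service.
newtype InMemorySecret = InMemorySecret (IORef Secret)

instance SecretSource InMemorySecret where
  getSecret (InMemorySecret ref) = readIORef ref

-- A second, unrelated source: a fixed value that never rotates.
newtype StaticSecret = StaticSecret Secret

instance SecretSource StaticSecret where
  getSecret (StaticSecret s) = pure s

-- Simulates a rotation of the in-memory secret.
rotate :: InMemorySecret -> Secret -> IO ()
rotate (InMemorySecret ref) = writeIORef ref

-- A consumer written once against the interface; it never learns
-- which kind of source it was given.
describeSecret :: SecretSource s => s -> IO String
describeSecret source = do
  Secret value <- getSecret source
  pure ("secret of length " ++ show (length value))
```

Because describeSecret only depends on the SecretSource constraint, swapping one source for another requires no change on the consumer side, and a rotation is picked up on the next call to getSecret.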

Next, we need to define a Kubernetes Secret that uses the SecretSource:

-- A KubernetesSecretFromSource takes care of synchronizing 
-- the secret from the source, including
data KubernetesSecretFromSource = todo

makeKubernetesSecret : SecretSource s => s -> IO KubernetesSecretFromSource
makeKubernetesSecret s = todo

class KubernetesSecret s where
  getSecretMap : s -> IO SecretMap

instance KubernetesSecret KubernetesSecretFromSource where ...

Now, we define an environment variable from a KubernetesSecret:

-- An environment variable getting its value from a KubernetesSecret
-- It is responsible for mounting the right volume when used for creating a Pod
data EnvVarFromKubernetesSecret = todo

makeEnvVarFromKubernetesSecret : KubernetesSecret s => s -> IO EnvVarFromKubernetesSecret
makeEnvVarFromKubernetesSecret s = todo

class EnvironmentVariable v where
  getName : v -> Text
  getValue : v -> IO Text

instance EnvironmentVariable EnvVarFromKubernetesSecret where ...

Finally we can create a Pod, using an EnvironmentVariable which abstracts the details of how the value is retrieved:

createPod : EnvironmentVariable v => v -> ... -> IO Pod
createPod variable ... = do
  -- this mounts the right volume for the pod
  setEnvironmentVariable variable

-- The full configuration
databaseClientPod : IO Pod
databaseClientPod = do
  managedSecret <- databaseManagedSecret
  k8sSecret <- makeKubernetesSecret managedSecret
  envVar <- makeEnvVarFromKubernetesSecret k8sSecret
  createPod envVar

All of this is just a sketch, and the important points are:

  1. We use typed variables to pass information around.

  2. We use interfaces / typeclasses to hide implementation details.

  3. We have library functions that declare exactly what is necessary to build:

    1. A Pod,

    2. An EnvironmentVariable backed by a KubernetesSecret,

    3. A KubernetesSecret from a SecretSource.

These are just good old software engineering practices, with huge benefits:

  • Fewer opportunities to make mistakes,

  • More discoverability of how things work,

  • A lot more fun programming our resources!
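To check that the sketch hangs together, here is a small, pure Haskell model of it that compiles with nothing but base. It is only an illustration under strong assumptions: ManagedSecret, K8sSecret, EnvVar and Pod are hypothetical data types, the naming scheme ("-secret", "-volume", "/mnt/secrets/...") is made up, and the IO and typeclass layers of the pseudo-code are left out to keep it short.

```haskell
-- Hypothetical model types; none of them come from a real Kubernetes library.
newtype ManagedSecret = ManagedSecret { managedName :: String }

-- A Kubernetes Secret that remembers which source it is synchronized from.
data K8sSecret = K8sSecret
  { k8sName :: String
  , k8sKey  :: String
  , k8sFrom :: ManagedSecret
  }

-- An environment variable whose value comes from a K8sSecret.
data EnvVar = EnvVar
  { envName   :: String
  , envSecret :: K8sSecret
  }

-- The parts of a pod spec that the article aligns by hand.
data Pod = Pod
  { podVolumes :: [String]                 -- volume names
  , podMounts  :: [(String, FilePath)]     -- (volume name, mountPath)
  , podEnv     :: [(String, String, String)] -- (variable, secretName, key)
  } deriving (Eq, Show)

-- The Kubernetes-side names are derived from the source (made-up scheme).
makeK8sSecret :: ManagedSecret -> K8sSecret
makeK8sSecret source = K8sSecret
  { k8sName = managedName source ++ "-secret"
  , k8sKey  = "value"
  , k8sFrom = source
  }

makeEnvVar :: String -> K8sSecret -> EnvVar
makeEnvVar = EnvVar

-- The volume and its mount are derived from each variable,
-- so they cannot drift apart from the secret they serve.
createPod :: [EnvVar] -> Pod
createPod vars = Pod
  { podVolumes = map volumeFor vars
  , podMounts  = [(volumeFor v, "/mnt/secrets/" ++ sourceOf v) | v <- vars]
  , podEnv     = [(envName v, k8sName (envSecret v), k8sKey (envSecret v)) | v <- vars]
  }
  where
    sourceOf    = managedName . k8sFrom . envSecret
    volumeFor v = sourceOf v ++ "-volume"

-- The full configuration: only two names are chosen by the user.
databaseClientPod :: Pod
databaseClientPod =
  createPod
    [ makeEnvVar "OCKAM_DATABASE_USERNAME_AND_PASSWORD"
        (makeK8sSecret (ManagedSecret "prod-database"))
    ]
```

In this model the only strings typed by hand are the managed secret name and the environment variable name; everything else is computed, which is the property the hand-written YAML lacks.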

Let’s do it?

If we wanted to go that route there would be a number of obstacles:

  1. I am not sure a mainstream programming language can support that approach if, at some stage, we need some fancy types. However, if we just need type-safe heterogeneous maps we should be covered.

  2. There would be a need for dependency injection in order to be able to easily define a configuration based off another configuration (what “override files” are attempting to do).

  3. The sheer size of the project! The number of data structures and APIs to cover is enormous. I suspect that only a company-backed project (like Pulumi) could sustain this kind of effort.

So, what do you all think? Can we do better? Can we do much better?

Ł

I think you'll love the talk we're preparing for Scalar 2025 because yeah, that's basically the objective we've been working towards since 2021. Check out the agenda, I can't post links because "new user", duh.