Platform Tips #22: Building a Better Alternative To Backstage - Part I

Let's build an alternative to Backstage together, focusing on ease of use, low maintenance, and developer experience. This is Part I of a three-part series.

Apr 20, 2024

Hey Folks 👋,

I'm Romaric, CEO of Qovery, and this is my 22nd Platform Tips post.

Last week, we talked about how we could build a better alternative to Backstage based on 3 pillars.

Over this series of 3 articles, I aim to dig into what these pillars truly entail, both from a product perspective and technically. The goal is to propose a concrete alternative to Backstage that focuses on ease of use, low maintenance, and Developer Experience - so feel free to challenge each of those points and even propose something better 😎. Multiple smart brains are better than one 🧠.

Part I: Ease of use (no code) ← [YOU ARE READING THIS ONE]
Part II: Low maintenance (it just works)
Part III: Intuitive for developers (Developer Experience)

Note 👀: these articles are a great introduction to how Torii (the concrete alternative to Backstage that I am working on) is built. The code is available here but not yet usable. Expect to be able to use Torii by May 2024.

Feel free to take those concepts, adapt and apply them to any alternative to Backstage that you might build - or join and contribute to Torii ⛩️

Ease of Use

The challenge of an Internal Developer Portal is to act as an intermediary layer between two types of technical teams that have different scopes, goals, and backgrounds. On one side, you have the Platform Team, which traditionally comes from an IT background and is in the mindset of providing the right self-service experience to their “customers” - Software Engineers. (An IT background because a lot of “DevOps Engineers” transition to “Platform Engineer” roles.)

DevOps is a methodology, and Platform Engineering is also a methodology… that’s why I use double quotes. But let’s keep it simple here. If you don’t agree, please at least read this article.

On the other side, you have the Software Engineers (SE), who have to deal with software development, shipping features, fixes, and improvements… 90% of them inside organizations have neither the time nor the interest to learn how to be their own DevOps engineer. And let’s be honest: infrastructure is a whole world - there are so many things to learn and to be aware of that it’s almost impossible to be proficient in both Software Development and Infrastructure (if that’s the case for you, please apply at my company - Qovery 😁).

So, how do you deal with those 2 personas and their backgrounds to provide a product that will be configured by Platform Engineers and be used by Software Engineers?

Kubernetes, of course. Just kidding 😬! But actually, it’s not that far off, since IT people have largely adopted YAML thanks to the widespread adoption of Kubernetes. YAML is therefore a good candidate for configuring the complete alternative to Backstage. Even the web UI!

For software engineers, there is no need to reinvent the wheel here; a web UI is, so far, the best and most intuitive interface that you can give to any software engineer.

To summarize:

Platform Engineers interact with YAML only (no React, no proprietary framework…) to configure and manage the Internal Developer Portal.
Software Engineers interact with a Web UI that is intuitive and generated for them.

Ok, let’s get deeper now!

The 3 Building Blocks

The subtitle is self-explanatory - I propose an approach with 3 building blocks:

Input
1. The configuration YAML file provided by Platform Engineers
2. The scripts used and maintained by Platform Engineers
Backend
1. The parser
2. The API
3. The workflow engine
Frontend
1. The generated components
2. The web components used by Software Engineers
3. The user feedback and status

Input

Those are the bare minimum elements that Platform Engineers need to provide to feed the Internal Developer Portal.

YAML Config File

The configuration file is expected to be a YAML file - a format well-known and used by Platform Engineers. The learning curve is close to 0. The file describes what the Self-Service and Catalog look like for the portal.

Here is a format I propose:

auth:
  ... local / SSO / SAML authentication with third party providers
models:
  ... store data for catalogs
catalogs:
  ... customizable views to list all services and get the right documentation
self_service:
  ... configure and provide the self service portal for your developers

Four main components are specified in our configuration file.

“Auth” to manage the user auth and permissions
“Self Service” provides a self-service portal with all the different actions that users can do.
“Catalog” to provide customizable views of all services and their state.
“Models” to store all the data we need in the Catalog

For instance, the self_service section will generate this kind of view:

A YAML describing a new cluster action could be like this:

...
self_service:
  sections:
    - slug: default
      name: Default
      description: Default section
      actions:
        - slug: new-cluster
          name: New Cluster
          description: spin up a new cluster for a temporary time
          icon: target
          fields:
            - slug: name
              title: Name
              type: text
              required: true
            - slug: description
              title: Description
              description: provide a description for your Cluster - what is it good for?
              type: textarea
            - slug: ttl
              title: TTL
              description: Time to live for your cluster (in hours)
              placeholder: 24
              type: number
              required: true
          post_validate:
            - command:
                - python
                - my-scripts/cluster_management.py
                - create
                - --name
                - "{{name}}" # this is a variable that will be replaced by the value of the field 'name' (quoted so that it stays a plain YAML string)
          delayed_command:
            - command:
                - python
                - my-scripts/cluster_management.py
                - delete
                - --name
                - "{{name}}"
              delay:
                hours: "{{ttl}}" # this is a variable that will be replaced by the value of the field 'ttl'

This results in a generated form like this on the web UI:

The idea here is not to get into all the details of the options of a potential YAML file but rather to show that, from a YAML file, we can properly tailor the whole experience for a Self-Service portal. No code is needed, which is a big win compared to what you have to do to get the same result with Backstage.
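To make the `{{name}}` and `{{ttl}}` placeholders concrete, here is a minimal Python sketch of how a backend could substitute form values into a command list. The `render_command` helper and its strict handling of unknown fields are assumptions made for illustration, not Torii's actual implementation.

```python
import re

# Matches {{ field_slug }} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_-]+)\s*\}\}")


def render_command(command, fields):
    """Return a copy of `command` with every {{slug}} replaced by its field value.

    Hypothetical helper: raises KeyError when a placeholder names an unknown field.
    """
    def substitute(match):
        slug = match.group(1)
        if slug not in fields:
            raise KeyError(f"unknown field in template: {slug}")
        return str(fields[slug])

    # Each argument stays a separate list item, so a value is never
    # re-interpreted by a shell, whatever characters it contains.
    return [PLACEHOLDER.sub(substitute, arg) for arg in command]


if __name__ == "__main__":
    command = ["python", "my-scripts/cluster_management.py", "create", "--name", "{{name}}"]
    print(render_command(command, {"name": "demo", "ttl": 24}))
```

Keeping the command as a list of arguments rather than one shell string is a deliberate choice in this sketch: user input only ever lands inside a single argument.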

Scripts

If you look deeper into the YAML configuration file, you can see that you can execute commands and, ultimately, scripts.

...
          post_validate:
            - command:
                - python
                - my-scripts/cluster_management.py
                - create
                - --name
                - "{{name}}" # this is a variable that will be replaced by the value of the field 'name'
...

This is the best way to re-use what Platform Engineers have already built, regardless of the technology. The integration effort is lower, and you’re free to use whatever best fits your requirements.

We will dig deeper into this in Part II, because there is much more to say. I’ll stay concise for now.

Do you use Terraform? Bash? Pulumi? Python? TypeScript? Docker?… You’re free to use whatever you want.

The only requirement is that the script exits with code 0 if everything works fine. It can receive the field inputs as parameters and return an output. But we will dig deeper in Part II. (Does it sound familiar? 😁 It is… but for a different goal.)

Backend

The backend part is in charge of exposing an API that the frontend can consume to reflect properly what Platform Engineers want to expose to Software Engineers.

Configuration Parser

There is nothing really fancy here: a YAML parser parses the configuration provided by the Platform Engineering team when the backend starts. The important thing is to make sure that the configuration file is valid (e.g., checking that the scripts exist and are executable) and to throw an error otherwise, so that problems surface at startup instead of having to be debugged later on.
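As an illustration of that startup check, here is a small Python sketch that walks an already-parsed configuration (a plain dict, as any YAML parser would return) and reports commands that cannot run. The function names, the list of hook keys, and the rule that arguments ending in `.py` or `.sh` are script paths are all assumptions made for this example.

```python
import os
import shutil

# Hook names taken from the YAML examples in this post.
HOOKS = ("validate", "post_validate", "delayed_command")
# Assumption: this is how the sketch recognises script arguments.
SCRIPT_SUFFIXES = (".py", ".sh")


def iter_commands(config):
    """Yield (location, command) for every command list under self_service."""
    for section in config.get("self_service", {}).get("sections", []):
        for action in section.get("actions", []):
            for hook in HOOKS:
                for step in action.get(hook, []):
                    yield f"{action.get('slug')}.{hook}", step.get("command", [])


def find_command_errors(config, base_dir="."):
    """Return a list of human-readable problems; an empty list means the config is usable."""
    errors = []
    for where, command in iter_commands(config):
        if not command:
            errors.append(f"{where}: empty command")
            continue
        # The program itself (python, bash, terraform...) must be executable.
        if shutil.which(command[0]) is None:
            errors.append(f"{where}: program not found: {command[0]}")
        # Script files passed as arguments must exist on disk.
        for arg in command[1:]:
            path = os.path.join(base_dir, arg)
            if arg.endswith(SCRIPT_SUFFIXES) and not os.path.isfile(path):
                errors.append(f"{where}: script not found: {arg}")
    return errors
```

A backend could call `find_command_errors` right after parsing and refuse to start when the list is not empty, printing every problem at once rather than stopping at the first one.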

API

The API exposes the Self-Service, Catalog, and Model configurations through a set of API endpoints. Again, there is nothing really fancy here: most of those endpoints can be read-only, and only a few of them accept inputs from software engineers using the self-service portal. Of course, if some of the form inputs are wrong, the API returns an error.

For example, here is a validation script provided by a Platform Engineer that returns an error to the user:

...
          validate:
            - command:
                - python
                - my-scripts/check_field.py
...

import json
import sys

MAX_LENGTH = 10

if __name__ == '__main__':
    # The backend passes a JSON document as the first argument
    input_json = json.loads(sys.argv[1])
    input_value = input_json['value']

    # A non-zero exit code signals a validation error
    if len(input_value) > MAX_LENGTH:
        print(f"Your value can't be longer than {MAX_LENGTH} characters")
        sys.exit(1)

The whole configuration is done via the YAML file, which conveniently reduces the number of API endpoints needed to configure the interface.
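For the backend side of that validation step, here is a minimal Python sketch of how an API handler could call such a script and turn its exit code into an error for the user. It follows the contract from the example above (JSON as the first argument, non-zero exit code on failure); the `run_validation` name, the timeout, and the choice to surface the script's printed output as the error message are assumptions.

```python
import json
import subprocess


def run_validation(command, value, timeout_seconds=10):
    """Run one validation command; return None on success or an error message.

    The field value is passed as a JSON document appended to the command,
    and a non-zero exit code means the input is rejected.
    """
    payload = json.dumps({"value": value})
    try:
        result = subprocess.run(
            [*command, payload],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # assumption: never let a script hang a request
        )
    except subprocess.TimeoutExpired:
        return "validation timed out"
    if result.returncode == 0:
        return None
    # Whatever the script printed becomes the message sent back to the user.
    return (result.stdout or result.stderr).strip() or "validation failed"
```

With the YAML above, the handler would call something like `run_validation(["python", "my-scripts/check_field.py"], form_value)` and return an error response whenever the result is not `None`.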

Workflow Engine

This is the part of the backend that is responsible for executing the scripts provided by Platform Engineers with the Software Engineers’ inputs, either immediately or after a delay.

The workflow engine must be resilient: it should recover from failures (e.g., a kill -9) and resume script and background executions when possible.
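One way to get that resilience for delayed commands is to persist them before acknowledging the request, so a restarted backend can pick up where it left off. The sketch below uses SQLite from the Python standard library; the `DelayedCommandStore` class, its table layout, and the polling approach are assumptions for illustration, not a description of how Torii does it.

```python
import json
import sqlite3
import time


class DelayedCommandStore:
    """Persist delayed commands so they survive a crash or restart of the backend."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS delayed_commands ("
            "id INTEGER PRIMARY KEY, command TEXT NOT NULL, "
            "run_at REAL NOT NULL, done INTEGER NOT NULL DEFAULT 0)"
        )
        self.db.commit()

    def schedule(self, command, delay_hours, now=None):
        """Store a command to run `delay_hours` from now; return its id."""
        now = time.time() if now is None else now
        cursor = self.db.execute(
            "INSERT INTO delayed_commands (command, run_at) VALUES (?, ?)",
            (json.dumps(command), now + delay_hours * 3600),
        )
        self.db.commit()
        return cursor.lastrowid

    def due(self, now=None):
        """Return (id, command) pairs whose delay has elapsed and that are not done."""
        now = time.time() if now is None else now
        rows = self.db.execute(
            "SELECT id, command FROM delayed_commands "
            "WHERE done = 0 AND run_at <= ? ORDER BY run_at",
            (now,),
        ).fetchall()
        return [(task_id, json.loads(command)) for task_id, command in rows]

    def mark_done(self, task_id):
        """Record that a command finished so it is not run again."""
        self.db.execute("UPDATE delayed_commands SET done = 1 WHERE id = ?", (task_id,))
        self.db.commit()
```

A worker loop would poll `due()`, run each command, and call `mark_done()` afterwards. If the process is killed between the run and `mark_done()`, the command runs again after the restart, so with this design the scripts should tolerate being executed more than once.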

Frontend

We will cover the Frontend part in detail (Generated Components, Interactions, User feedback, and Statuses) in part III.

Conclusion

In this first part, we covered how we provide a simpler experience for Platform Engineers and how it interfaces with the Self-Service and Catalog for Software Engineers. In Part II (next week), we will go deeper into how we lower the maintenance for Platform Engineering teams by re-using what they have already built and building on their background.

Until then, feel free to share your feedback about this post. Have a great week.
