Speeding up software delivery using Kólga

Anders Innovations
15 min read · Mar 4, 2021

By Frank Wickström

TL;DR

We build a microservice-architected application consisting of three services. The services communicate with each other using AMQP, REST APIs, and WebSockets. The application is deployed using our open-source (MIT-licensed) DevOps tool Kólga.

The full source-code for this blog post can be found at https://github.com/andersinno/microservice_kubernetes_webinar.

The preface, the story behind Kólga, can be found on the Anders blog.

Writing and deploying an application with Kólga

Let’s deploy a monorepo consisting of three applications. We will do this using Kólga and GitLab CI; however, GitHub Actions can also achieve the same result.

Prerequisites

Let’s begin with the prerequisites.

  • I will assume that you have a Kubernetes cluster set up.
  • I will assume that you have access to the cluster with the appropriate permissions.
  • I will also assume that you have access to GitLab CI, with your user granted the rights needed to set up environment variables and access package registries.
  • I will assume that you have set up the Kubernetes cluster integration in GitLab for the project. At the time of writing, that is done under “Operations” → “Kubernetes.”

About the applications

We will create and deploy three very small applications: an API for posting scores, a frontend, and a Slack message poster. The projects are tied together using AMQP (via a RabbitMQ server) and a REST API, and the scores are stored in a PostgreSQL database. The frontend will display new scores as they come in by using WebSockets.

The reasoning behind using so many different technologies is mainly to show the flexibility of Kólga and to cover as many microservice use cases as possible in one go.

All applications will include the following:

  • A Dockerfile for creating a Docker image
  • An entrypoint script that is run when the application starts
  • A Poetry configuration for package dependency management
  • A GitLab CI configuration script
  • The application code

All applications will get their configurations from environment variables.

As an added bonus, all applications will be written using asynchronous Python.

Application dependencies

Many of the applications share the same dependencies, so we will go through them once here instead of repeating ourselves later.

Uvicorn

Uvicorn is an ASGI server based on uvloop and httptools. It enables the development of easy-to-write asynchronous web applications using Python. We will use it to run our applications. More information can be found at https://www.uvicorn.org/.

FastAPI

FastAPI is a package that makes it easy to write REST and GraphQL APIs. It also makes it very easy to write asynchronous Python code, enabling us to write web applications that utilize WebSockets. We will use the package mainly due to the small amount of code needed to write an application. More information can be found at https://fastapi.tiangolo.com/.
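
To give a feel for how little code is needed, here is a minimal, hypothetical FastAPI application (not part of this project) served with Uvicorn:

# A minimal FastAPI application served with Uvicorn
from fastapi import FastAPI
import uvicorn

app = FastAPI()


@app.get("/ping")
async def ping() -> dict:
    return {"message": "pong"}


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000, log_level="info")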

AIO Pika

People who have interfaced with RabbitMQ and AMQP before might be familiar with Pika. The AIO Pika package, while not actually built on top of Pika, is used for communicating over AMQP through an asynchronous API. More information can be found at https://aio-pika.readthedocs.io/en/latest/.
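
As a rough sketch of what publishing with AIO Pika looks like, assuming a RabbitMQ broker running locally (the exchange name mirrors the one used later in this post):

# Publishing a message to a fanout exchange with AIO Pika (sketch)
import asyncio

import aio_pika
from aio_pika import ExchangeType, Message


async def main():
    connection = await aio_pika.connect_robust("amqp://guest:guest@localhost/")
    async with connection:
        channel = await connection.channel()
        exchange = await channel.declare_exchange("scores", ExchangeType.FANOUT)
        await exchange.publish(Message(b"hello"), routing_key="")


asyncio.run(main())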

DotEnv

Since our application will be configured using environment variables, we want an easy way to inject environment variables in development. For this, we are using DotEnv, which can read in .env files and inject them into the running application environment. This enables us to fetch those variables using os.environ.get(), for instance. More information can be found at https://github.com/theskumar/python-dotenv
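
For example, a .env file containing BROKER_URL=amqp://guest:guest@localhost/ could be loaded like this (a sketch; the default value is only a local placeholder):

# Loading a .env file and reading configuration from the environment
import os

from dotenv import load_dotenv

load_dotenv()  # reads a .env file from the current working directory, if present
BROKER_URL = os.environ.get("BROKER_URL", "amqp://guest:guest@localhost/")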

About the code

Before we jump into the application code, I want to give a heads-up regarding how the code is presented in this tutorial. For the sake of readability, I will redact certain parts of the code. These parts are marked in the code listings in the following way: # <…redacted X…>. The full source code, including everything, can be found at https://github.com/andersinno/microservice_kubernetes_webinar.

Since we are creating a monorepo, all of the projects will be placed in the same Git repository. The folder structure of the repository is as follows

.
├── README.md
├── docker-compose.yml
├── poster
│   ├── Dockerfile
│   ├── README.md
│   ├── docker-entrypoint.sh
│   ├── poetry.lock
│   ├── poster
│   │   ├── __init__.py
│   │   └── main.py
│   └── pyproject.toml
├── reporter
│   ├── Dockerfile
│   ├── README.md
│   ├── docker-entrypoint.sh
│   ├── poetry.lock
│   ├── pyproject.toml
│   └── reporter
│       ├── __init__.py
│       └── main.py
└── scores
    ├── Dockerfile
    ├── README.md
    ├── alembic
    │   ├── README
    │   ├── env.py
    │   ├── script.py.mako
    │   └── versions
    │       └── 229236b6dde7_score_table.py
    ├── alembic.ini
    ├── docker-entrypoint.sh
    ├── poetry.lock
    ├── pyproject.toml
    └── scores
        ├── __init__.py
        ├── database.py
        ├── main.py
        ├── models.py
        └── schemas.py

Scores API

The scores API will be a Python application with a single REST API endpoint, /scores. It uses SQLAlchemy as an ORM to the database and uses Alembic for handling migrations. If you would like to know more about SQLAlchemy and Alembic, you can read about those at https://www.sqlalchemy.org/ and https://alembic.sqlalchemy.org/.

# scores/scores/main.py
# <...redacted imports...>
# <...redacted environment and Database/AMQP connection setup...>
# <...redacted health check endpoints ...>

async def send_message(loop, score: schemas.Score):
    message = json.dumps(score.dict()).encode()
    connection = await connect(BROKER_URL, loop=loop)

    # Creating a channel
    channel = await connection.channel()
    scores_exchange = await channel.declare_exchange("scores", ExchangeType.FANOUT)

    # Sending the message
    await scores_exchange.publish(
        Message(message, delivery_mode=DeliveryMode.PERSISTENT), routing_key="kolga"
    )
    print(f" [x] Sent '{message}'")
    await connection.close()


@app.post("/scores")
async def create_scores(
    score: schemas.Score, db: Session = Depends(get_db)
) -> schemas.Score:
    score_record = models.Score(**score.dict())
    db.add(score_record)
    db.commit()
    loop = asyncio.get_event_loop()
    loop.create_task(send_message(loop, score))
    return score_record


@app.get("/scores")
def get_scores(db: Session = Depends(get_db)) -> List[models.Score]:
    return [entry for entry in db.query(models.Score).all()]


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000, log_level="info")

The code consists of two main parts: the send_message() function for sending a message over AMQP, and two endpoint functions for handling POST and GET requests to /scores.

REST API and database structure

The REST API here handles both retrieving and creating scores. The get_scores function naively fetches all scores from the database and returns them, JSON-encoded, as a list. The create_scores function expects a score structure to be sent to it and creates a score object in the database using SQLAlchemy. Once the object is saved, it passes the score information to the send_message function, which handles the AMQP communication.
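
To try the endpoint by hand, one could POST a score to it, for example with aiohttp (a sketch assuming the API runs locally on port 8000; the hypothetical payload follows the schema shown below):

# Posting a hypothetical score to a locally running Scores API
import asyncio

import aiohttp


async def post_score():
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "http://localhost:8000/scores",
            json={"user": "alice", "score": 42},
        ) as response:
            print(response.status, await response.json())


asyncio.run(post_score())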

The database model and the schema used by FastAPI for validating a score look like this:

# scores/scores/models.py
from sqlalchemy import Column, Integer, String

from .database import Base


class Score(Base):
    __tablename__ = "Score"

    id: int = Column(Integer, primary_key=True, index=True)
    user: str = Column(String(255), index=True)
    score: int = Column(Integer(), default=1)


# scores/scores/schemas.py
from pydantic.main import BaseModel


class Score(BaseModel):
    user: str
    score: int

    class Config:
        orm_mode = True

The database schema migrations are handled through Alembic. To apply the migrations, the command alembic upgrade head can be used.
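
The migration file itself is not reproduced here, but based on the model above, a migration creating the Score table could look roughly like this (a sketch; the real file is scores/alembic/versions/229236b6dde7_score_table.py):

# Rough sketch of a migration creating the Score table
import sqlalchemy as sa
from alembic import op

revision = "229236b6dde7"
down_revision = None


def upgrade():
    op.create_table(
        "Score",
        sa.Column("id", sa.Integer(), primary_key=True),
        sa.Column("user", sa.String(length=255)),
        sa.Column("score", sa.Integer()),
    )


def downgrade():
    op.drop_table("Score")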

AMQP

Since this is not a tutorial on sending and handling AMQP messages, I will not go into further detail about what is going on there. The main thing that happens is that we connect to an exchange, which sends each message to all listening clients (fanout). Messages sent over AMQP with AIO Pika need to be bytes; this is why we call .encode() on the string representation of the score object.

Reporter Frontend

The reporter frontend will be a Python application with two endpoints, / (index/root) and one for a WebSocket connection. When a client connects to the frontend, they are presented with HTML and JavaScript that connects to the server through a WebSocket and will get all score updates as they come in.

# /reporter/reporter/main.py
# <...redacted imports...>
# <...redacted environment and Database/AMQP connection setup...>
# <...redacted health check endpoints ...>
# <...redacted ConnectionManager code...>

html = (
    # <...redacted HTML and JavaScript...>
)

# WebSocket connection manager
manager = ConnectionManager()


@app.get("/")
async def get():
    return HTMLResponse(html)


@app.websocket("/ws/{client_id}")
async def websocket_endpoint(websocket: WebSocket, client_id: int):
    await manager.connect(websocket)
    async with aiohttp.request("GET", f"{SCORES_API}/scores") as object_names_response:
        if object_names_response.status != 200:
            raise HTTPException(
                status_code=500, detail="Could not get scores"
            )
        scores: List[str] = await object_names_response.json()
        for score in scores:
            await manager.send_personal_message(json.dumps(score), websocket)


async def on_message(message: IncomingMessage):
    """
    on_message doesn't necessarily have to be defined as async.
    Here it is to show that it's possible.
    """
    print(" [x] Received message %r" % message)
    decoded_message = message.body.decode()
    try:
        print(f"Message body is: {json.loads(decoded_message)}")
    except Exception:
        print("Not a JSON message, ignoring")
    print("Broadcasting?")
    await manager.broadcast(decoded_message)

# <...redacted AMQP consumer and connection setup...>

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000, log_level="trace")

The index get() function returns an HTMLResponse, which makes the browser render the HTML sent by the application. The websocket_endpoint() function adds the WebSocket connection to a connection manager, fetches all of the scores from the Scores API's /scores endpoint, and then sends them to the browser.

The application also listens for AMQP messages on the same exchange used by the Scores API. Whenever a new message comes in, the on_message() function is called with it. As all messages are bytes, we start by decoding the message and checking that it contains valid JSON. We then broadcast the decoded message to all connected clients using the connection manager's broadcast() method.
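
The consumer setup itself is redacted in the listing above, but with AIO Pika it could look roughly like the sketch below, assuming the BROKER_URL variable and the on_message() callback from the redacted sections: an exclusive queue is bound to the same fanout exchange and on_message is registered as the callback.

# Rough sketch of the redacted AMQP consumer setup
import aio_pika
from aio_pika import ExchangeType


async def consume(loop):
    connection = await aio_pika.connect_robust(BROKER_URL, loop=loop)
    channel = await connection.channel()
    exchange = await channel.declare_exchange("scores", ExchangeType.FANOUT)

    # Each consumer binds its own exclusive queue to the fanout exchange
    queue = await channel.declare_queue(exclusive=True)
    await queue.bind(exchange)

    # Deliver every incoming message to on_message()
    await queue.consume(on_message, no_ack=True)
    return connection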

Slack Poster

The Slack poster service will post new scores directly to Slack as they come in. It listens on the same AMQP exchange that the other two services use, waiting for messages that contain score information.

# /poster/poster/main.py
# <...redacted imports...>
# <...redacted environment setup for Slack connection...>
# <...redacted health check endpoints ...>

async def on_message(message: IncomingMessage):
    """
    on_message doesn't necessarily have to be defined as async.
    Here it is to show that it's possible.
    """
    print(" [x] Received message %r" % message)
    decoded_message = message.body.decode()
    try:
        json_message = json.loads(decoded_message)
    except Exception:
        print("Not a JSON message, ignoring")
        return None
    await slack_client.chat_postMessage(
        channel=SLACK_CHANNEL,
        text=f":tada: {json_message['user']} just scored {json_message['score']} points",
    )

# <...redacted AMQP consumer and connection setup...>

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000, log_level="info")

The poster utilizes the Python Slack API package to communicate with the Slack API. The application waits for new messages to arrive at the on_message() function, decodes the message bytes, and parses the JSON into a Python dictionary. If the message is successfully parsed, a message is posted to the specified Slack channel.
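
The Slack connection setup is redacted in the listing above; one way to do it, using the async client from the slack_sdk package and reading the token and channel from environment variables (both assumptions, not necessarily what the original code does), could look roughly like this:

# Rough sketch of the redacted Slack client setup
import os

from slack_sdk.web.async_client import AsyncWebClient

SLACK_CHANNEL = os.environ.get("SLACK_CHANNEL", "#scores")
slack_client = AsyncWebClient(token=os.environ.get("SLACK_API_TOKEN", ""))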

Docker

To make development faster and to always deploy the same application, we will use Docker images to run our applications. Docker makes it easy to package all the dependencies and application code into a reusable image that runs on developers' machines just as well as in a production environment.

The Docker images for the applications above are very similar, so we will go through one of them once; the same process applies to each application.

Dockerfile

The Dockerfile that we will be creating utilizes Docker's multi-stage build process. This means that a single file contains multiple image specifications. During the build, later stages can pull in files from earlier stages, which makes for a more efficient build process.

# General Dockerfile for all of the services
# ===================================================
FROM andersinnovations/python:3.9-slim AS build-base
# ===================================================
EXPOSE 8000/tcp

# ===================================
FROM build-base AS poetry
# ===================================
# <...redacted poetry installation ...>
# Note that you do not need to use Poetry for this to work, one could
# just as well use `pip` for installing Python dependencies here. The process
# would be almost the same.

# ===================================
FROM build-base AS base
# ===================================
COPY --from=poetry --chown=appuser:appuser /app/requirements.txt /app/requirements.txt
COPY --from=poetry --chown=appuser:appuser /app/requirements-dev.txt /app/requirements-dev.txt

# ==============================
FROM base AS development
# ==============================
# <...redacted development stage ...>

# ==============================
FROM base AS production
# ==============================
# Install production dependencies
RUN apt-install.sh build-essential libpq-dev libssl-dev \
    && pip install --no-cache-dir -r requirements.txt \
    && apt-cleanup.sh build-essential

# Copy code to image
COPY --chown=appuser:appuser . /app

# Set app user
USER appuser

# Set up start command
ENTRYPOINT ["./docker-entrypoint.sh"]

As we are mainly interested in running this in production, the Poetry installation stage and the development stage are redacted to make the file easier to read. In the Dockerfile, we first install the required Python dependencies and OS packages. Then we copy the application code into the image, and finally, we set up an entrypoint script, which will run when a container is started from the image.

The entry point script is quite simple but effective. It checks that all of the required services are running and then lets the user choose to run a custom start-up command or let the Docker container run as it would in production.

#!/bin/bash
set -e

# Check if the database is available
if [ -z "$SKIP_DATABASE_CHECK" -o "$SKIP_DATABASE_CHECK" = "0" ]; then
    wait-for-it.sh "${DATABASE_HOST}:${DATABASE_PORT-5432}"
fi

# Check if the broker is available
if [ -z "$SKIP_BROKER_CHECK" -o "$SKIP_BROKER_CHECK" = "0" ]; then
    wait-for-it.sh -t 20 "${BROKER_HOST}:${BROKER_PORT-5672}"
fi

# Start server
if [[ ! -z "$@" ]]; then
    echo "Command is $@"
    "$@"
elif [[ "$DEV_SERVER" = "1" ]]; then
    uvicorn scores.main:app --host 0.0.0.0 --reload
else
    gunicorn scores.main:app --bind 0.0.0.0:8000 -k uvicorn.workers.UvicornWorker
fi

CI / CD and Kólga

Now that we have our applications written and we have a Dockerfile, it is time to build, test, and finally deploy. For this, we will use Kólga, the in-house turned open-source tool by Anders.

We are mainly going to focus on the building and the deployment part here, as Kólga does not make any decisions regarding what tests you run. It does, however, make builds and deployments a breeze.

GitLab and GitHub

In this tutorial, we will only cover how to set up Kólga using GitLab, as that is what we commonly use at Anders; however, all of these features also exist for GitHub using GitHub Actions. The Actions can be found at https://github.com/andersinno?q=kolga-.

Setting up the base

We start by creating a base GitLab CI/CD file called .gitlab-ci.yml. This file will contain the configuration for our pipeline.

# .gitlab-ci.yml
include:
  - remote: 'https://raw.githubusercontent.com/<YOUR REPO CONTAINING KOLGA>/v3/.gitlab-ci-base-template.yml'

This will import the base templates included with Kólga but will not set up any pipeline for you by default. If you are not running a mono-repo project, you could try out the .gitlab-ci-template.yml file instead of the base file. That will set up a base template that builds and runs your application by default.

As you might have noticed, you will need to fill in your own repository containing the Kólga code. At the time of writing, we do not support pulling straight from our repositories on GitHub or GitLab, as the base templates include some Anders-specific configuration at the bottom of the base file. The GitHub Actions configuration, however, does not have this limitation and can be used as-is.

Deployment stage

There can be multiple stages in an application's deployment lifecycle. Usually, they end with a production deployment, but the deployments before that vary from product to product and between companies and their policies.

Kólga handles three different types of deployments and can be extended to cover more cases if needed. The three types are “review”, “staging” and “production”. The key difference between these is what they are used for. Staging and production environments are almost identical in how they are deployed and expect dependency applications such as databases to be deployed separately. The review environment is a bit different, however. Review environments are short-lived, non-persistent environments set up on demand when new code has been written, to ease the review of that code. They exist as long as a merge or pull request exists and are removed after that. As they set up temporary instances of the application, temporary dependency applications are also needed for these cases. Kólga helps set up such dependencies and has a set of pre-defined applications that it can run with just a single command.

Application deployments

Since we have three different applications, we want to configure them separately. We will create one project-specific configuration for each application and then import those into the main .gitlab-ci.yml configuration. The configurations look like this:

# poster/.gitlab-ci.yml
build-poster:
  extends: .build
  variables:
    DOCKER_BUILD_CONTEXT: poster
    DOCKER_BUILD_SOURCE: poster/Dockerfile
    DOCKER_IMAGE_NAME: poster

review-poster:
  extends: .review-no-env
  environment:
    name: qa/r/${CI_COMMIT_REF_SLUG}
  variables:
    DOCKER_IMAGE_NAME: poster
    DOCKER_BUILD_SOURCE: poster/Dockerfile
    PROJECT_NAME: poster
    K8S_INGRESS_DISABLED: 1

# reporter/.gitlab-ci.yml
build-reporter:
  extends: .build
  variables:
    DOCKER_BUILD_CONTEXT: reporter
    DOCKER_BUILD_SOURCE: reporter/Dockerfile
    DOCKER_IMAGE_NAME: reporter

review-reporter:
  extends: .review
  environment:
    url: https://$CI_PROJECT_PATH_SLUG-$CI_ENVIRONMENT_SLUG-reporter.$KUBE_INGRESS_BASE_DOMAIN
  variables:
    DOCKER_IMAGE_NAME: reporter
    DOCKER_BUILD_SOURCE: reporter/Dockerfile
    PROJECT_NAME: reporter
    K8S_SECRET_SCORES_API: https://$CI_PROJECT_PATH_SLUG-$CI_ENVIRONMENT_SLUG-scores.$KUBE_INGRESS_BASE_DOMAIN
    K8S_SECRET_REPORTER_URL: https://$CI_PROJECT_PATH_SLUG-$CI_ENVIRONMENT_SLUG-reporter.$KUBE_INGRESS_BASE_DOMAIN

# scores/.gitlab-ci.yml
build-scores:
  extends: .build
  variables:
    DOCKER_BUILD_CONTEXT: scores
    DOCKER_BUILD_SOURCE: scores/Dockerfile
    DOCKER_IMAGE_NAME: scores

review-scores:
  extends: .review
  environment:
    url: https://$CI_PROJECT_PATH_SLUG-$CI_ENVIRONMENT_SLUG-scores.$KUBE_INGRESS_BASE_DOMAIN
  variables:
    DOCKER_IMAGE_NAME: scores
    DOCKER_BUILD_SOURCE: scores/Dockerfile
    PROJECT_NAME: scores

As you can see, they all look very similar, with the main difference being the variables that are getting set.

The build stage for all three projects only differs in the Docker build variables, which tell Kólga where to find the Dockerfile to build (DOCKER_BUILD_SOURCE) and in which context the build should happen (DOCKER_BUILD_CONTEXT). It also specifies what the final image should be named, using DOCKER_IMAGE_NAME.

In the review stage, we specify the URL at which the application should be hosted. Since these environments are created dynamically, we need to use variables in the URL to distinguish between them. In this case, we are using the project path, an environment slug provided by GitLab, and finally the domain specified when you set up the Kubernetes cluster in the GitLab project. The image to be used is defined again; this should match the one in the build stage. We also give each environment a project name to refer to later on in other configurations. These are all things that you can read about in the documentation for Kólga as well.

Finally, we want to configure our applications a bit. This is done by specifying specially crafted environment variables with the prefix K8S_SECRET_<VARIABLE_NAME>. These variables are stripped of the prefix and injected into the running application as environment variables, which the application can then read at start-up or at run time. In this case, we want to pass the URL of the Scores API to the reporter, for instance, along with its own URL to be used when setting up the WebSocket connection.
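
On the application side, nothing special is needed to consume these values; the reporter can read the injected variables like any other environment variable. A small sketch (the default values are hypothetical): K8S_SECRET_SCORES_API arrives in the container as SCORES_API.

# Reading the injected configuration in the reporter (sketch)
import os

SCORES_API = os.environ.get("SCORES_API", "http://localhost:8000")
REPORTER_URL = os.environ.get("REPORTER_URL", "http://localhost:8001")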

The GitLab CI configuration can now import these configurations, and you would have a file that looks like the following.

include:
  - remote: 'https://raw.githubusercontent.com/<YOUR REPO CONTAINING KOLGA>/v3/.gitlab-ci-base-template.yml'
  - local: poster/.gitlab-ci.yml
  - local: reporter/.gitlab-ci.yml
  - local: scores/.gitlab-ci.yml

cleanup_review:
  extends: .cleanup_review

stop_review:
  extends: .stop_review

The two jobs at the end make tearing down review environments easier. They automatically tear down the review environment in case of failure and give the user the option to manually stop each environment in GitLab.

Dependency applications

The dependencies that we need in this case are RabbitMQ and PostgreSQL. Those two have pre-defined setups in Kólga, which will make the setup easier. We will update the configuration to add the capability to set up these services dynamically on each merge request.

include:
  - remote: 'https://raw.githubusercontent.com/<YOUR REPO CONTAINING KOLGA>/v3/.gitlab-ci-base-template.yml'
  - local: poster/.gitlab-ci.yml
  - local: reporter/.gitlab-ci.yml
  - local: scores/.gitlab-ci.yml

service-postgres:
  extends: .review-service
  variables:
    POSTGRES_IMAGE: "docker.io/bitnami/postgresql:12.5.0"
  script:
    - devops deploy_service --track review --service postgresql --env-var DATABASE_URL --projects poster reporter scores

service-rabbitmq:
  extends: .review-service
  script:
    - devops deploy_service --track review --service rabbitmq --env-var BROKER_URL --projects poster reporter scores

cleanup_review:
  extends: .cleanup_review

stop_review:
  extends: .stop_review

Let’s go through what is happening here by breaking down the two service- jobs. In both cases, we start by extending a pre-defined template that contains a few static configurations. Both jobs have a script tag, which runs the command that sets up our service. Let’s split up that command and look at what it is doing.

First, the main Kólga CLI command devops is called, and we specify that we want to deploy a service with the deploy_service argument. The --track argument specifies which type of environment we are deploying to. The track is really just a string that makes it easier to distinguish between environment types; in the case of services, this will almost always have the value review. Then we select which service to run; in our case, this is postgresql and rabbitmq. To connect to these services, we need to expose the connection string to the applications that will use it. This is defined with the --env-var argument, followed by the environment variable your application will use when connecting to the service. Finally, we specify which projects should be given credentials to the service. Remember the PROJECT_NAME variable that we set when configuring the review environments? These are the same names.

If you want to run a specific version of a service, you can also specify which image should be used when deploying it. For instance, for the PostgreSQL service, we specify that we want to use the Bitnami image of Postgres 12.5. The name of the variable used for specifying the image can be found in the service's source code.

Staging and Production

If you also want to set up staging and production environments, you only need to add two small job definitions to the CI config.

staging:
  extends: .staging
  environment:
    url: http://poster.$KUBE_INGRESS_BASE_DOMAIN
  variables:
    DOCKER_BUILD_CONTEXT: reporter
    DOCKER_BUILD_SOURCE: reporter/Dockerfile
    DOCKER_IMAGE_NAME: reporter

production:
  extends: .production
  environment:
    url: http://poster.$KUBE_INGRESS_BASE_DOMAIN
  variables:
    DOCKER_BUILD_CONTEXT: reporter
    DOCKER_BUILD_SOURCE: reporter/Dockerfile
    DOCKER_IMAGE_NAME: reporter

By default, this will deploy to staging every time the master branch is updated and deploy to production every time a tag that follows the pattern r-<number> is created. These can be configured through the only.refs setting in the configuration, however.

That's it

Alright, that was all. Your application will now be built and review environments will be set up for you whenever you create a new merge request in GitLab, and if you want staging and production, you can add those with just a few more lines.

Closing notes

Thank you for staying with us for this long 😄

Hopefully, you have found this tutorial insightful and have seen how you, too, can utilize Kólga for your projects. If you need any help getting started with the tool, or have questions about DevOps in general, don't hesitate to reach out at any time.

Kólga is already used in production by Anders and by companies and organizations such as Visma and the City of Helsinki. It is under continuous development and will stay open source in the future. As with most open-source projects, we highly appreciate contributions from outside Anders.
