The documentation is in a very early stage.
Documentation
- 1: About Microservice Dungeon 2.0
- 2: Game Rules
- 2.1: Map
- 2.2: (TO BE REVISED) Items
- 2.3: (TO BE REVISED) Game Rule Discussions
- 3: Getting Started
- 3.1: Hello MSD
- 3.2: Develop your Player
- 3.3: Getting started with operations
- 4: Architecture
- 4.1: Design Principles
- 4.2: Game Service
- 4.3: Robot Service
- 4.4: Trading Service
- 4.5: Map Service
- 4.5.1: MapType (Aggregate)
- 4.5.2: Map Configuration
- 4.6: Discussion
- 4.7: Learnings
- 5: Player Development
- 5.1: Local Dev Environment
- 5.2: Player Skeletons
- 6: Operations
- 6.1: Kubernetes
- 6.2: Monitoring
- 6.3: AWS Hosting
- 6.4: Bare-Metal Hosting
- 7: Reference
- 8: Glossary
- 9: Wiki
- 9.1: Event Driven Architecture
- 9.2: Databases
- 10: Contribute
1 - About Microservice Dungeon 2.0
Welcome to the Microservice Dungeon! – or simply MSD.
(tbd - elaborate, give some history, team, funding, papers, …)
2 - Game Rules
The following section provides an overview of the game rules. If you’re wondering what the MSD is, check out our About page.
tl;dr
The MSD is a game where player services compete to achieve the highest possible score. Each player service controls a swarm of robots. Robots move across the game board, mine resources, purchase upgrades, and battle each other. Most actions earn points.
The map is a 2-dimensional grid of star systems. Resources are distributed across it. They can be mined, picked up by robots, and sold at space stations. Space stations are also placed on the map and serve as a kind of home base and safe zone. There, robots can sell resources, get repairs, and purchase upgrades. Combat is not allowed on space stations.
Each player has a money account, and starts the game with an initial amount of money. This can be used to purchase new robots. New robots always start at the player’s allocated space station. Selling resources and killing robots increases a player’s account balance. With more money, a player can buy additional robots or upgrade existing ones.
Map
Each game takes place on a map that is visible to players from the start. The map itself is a stationary grid consisting of tiles - similar to a chessboard. Each tile has up to eight adjacent neighbors, except at the edges of the map.
The map for an MSD game is configurable. There are a couple of standard map types, and it is possible to add other types as well for dedicated games. Please refer to the map details page for an in-depth explanation of what elements a map consists of.
Games
The MSD is played in individual games. To participate, players register once and join an open game. This is possible even after the game has started, but late joiners do not receive any compensation for missed playtime. The number of participants is limited and defined at game creation.
Win Condition
A game ends either when an administrator intervenes, or when the predefined game time runs out. The player with the most points wins the game, though rankings are scored in several categories.
Once a player has joined a game, they cannot leave. If they lose all their robots and have no funds left to purchase new ones, their game ends.
Robots
Every player controls robots to compete against other players.
A robot is exactly what you’d expect — a mechanical unit with health points and energy for performing actions, able to move, engage in combat, and upgrade itself through purchased upgrades.
Buying a Robot
Players can purchase new robots at any time during the game using their money. Newly bought robots spawn instantly, either at the player’s dedicated space station or at a station of their choice.
Action-Cooldown / -Queue [to be discussed]
After performing an action, a robot requires a short pause before executing the next one. This cooldown applies regardless of whether the action was successful or not. As a result, robots may not respond immediately to new commands.
Robots queue up actions and execute them in order. Players should carefully plan the number of commands they issue, as each action has a different cooldown duration. For instance, attacking another robot has a shorter cooldown than moving. Upgrades are available to reduce cooldown durations.
External actions, such as applying upgrades or collecting resources, do not trigger a cooldown.
This mechanism is similar to the mining system, with one key difference: Robots execute actions immediately, followed by a cooldown. Mines require processing time before yielding resources.
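To make the mechanism concrete, here is a minimal Python sketch of the queue-then-cooldown behaviour. The action names and cooldown durations are made-up illustration values, not the real game configuration:

```python
# Hypothetical sketch of the action queue / cooldown mechanism.
# Cooldown durations are illustrative only, not real game values.
COOLDOWNS = {"attack": 1.0, "move": 2.0}  # attacking cools down faster than moving


class Robot:
    def __init__(self):
        self.queue = []      # pending actions, executed in FIFO order
        self.ready_at = 0.0  # game time at which the next action may run

    def enqueue(self, action):
        self.queue.append(action)

    def tick(self, now):
        """Execute the next queued action if the cooldown has elapsed."""
        if self.queue and now >= self.ready_at:
            action = self.queue.pop(0)
            # The action itself runs immediately; the cooldown starts afterwards,
            # regardless of whether the action succeeded.
            self.ready_at = now + COOLDOWNS[action]
            return action
        return None
```

With these sample values, a robot that moves at t=0 ignores further commands until t=2, while an attack would only block it for one time unit.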
Energy
Robots have both health points and energy. Some actions consume energy, and if a robot runs out of energy, it pauses and cannot perform further actions until it has recharged enough energy.
Energy regenerates automatically over time, and regeneration is faster on space stations.
Repairing
Robots automatically restore health points over time on space stations. No action required.
Movement
Robots can move horizontally, vertically, and diagonally across the map. Each movement consumes energy and triggers a cooldown. To track enemy robot positions, players must listen for the movement events of other robots.
Fighting [to be discussed]
Robots can attack other robots. Attacks consume energy and trigger a cooldown, regardless of whether the attack is successful or not. To attack, a robot must have enough energy and be in range of the target. Caution, friendly fire is possible!
When an enemy robot is destroyed, the attacker becomes stronger, receives a financial reward and collects the destroyed robot’s resources, regardless of its position.
Mining [to be discussed]
To extract resources, a player starts the mining process at a specific mine. If the player’s robot is on the same planet, it will automatically collect the resources once the mining process finishes.
Starting the mining process consumes energy and triggers a cooldown - even if the mining fails or the robot is elsewhere. Collecting mined resources does not consume energy nor does it trigger a cooldown. But in order to work, the robot must be present on the planet when the mining completes.
Trading
Trading is an essential part of the MSD and takes place at space stations. Players can sell mined resources, purchase new robots, and upgrade existing ones — all in exchange for in-game currency.
Selling Resources
Resources collected by robots can be sold at any space station. To initiate a sale, a robot carrying resources must be present at a space station. Once the sale is started, all resources on that robot are sold to the market, and the player receives the corresponding amount of money in their account.
The value of each resource depends on its rarity. Currently, prices are hardcoded and fixed, but they may fluctuate (based on market demand) in the future. In that case, if a large volume of a specific resource is sold in a short time frame, its market price will drop. Players should consider these fluctuations when planning their mining and trading strategies.
Buying Robots
Players can purchase additional robots at any time during the game. To do so, they must have sufficient funds in their account. New robots are delivered to a space station. If the player does not specify a station, a random one will be chosen.
Purchases are made through robot vouchers, which define how many robots will be spawned. Vouchers are immediately redeemed upon purchase.
Upgrading Robots
Robots can be upgraded to enhance their capabilities. Upgrades can improve:
- Carrying capacity
- Combat strength
- Attack damage
- Health points
- Health point regeneration speed
- Energy
- Energy regeneration speed
- Energy capacity
- Cooldown time
Upgrades are purchased in the form of vouchers, which are tied to a specific robot. To apply an upgrade, the robot must be located at a space station. If the robot is not at a station when the upgrade is purchased, the voucher will be discarded without refund, and the player will be notified.
Debt
Players cannot go into debt. If they do not have enough money to make a purchase, the transaction will be declined and an error message will be published.
2.1 - Map
What is a Map in MSD?
A map is basically a 2-dimensional grid, structured like below. Each tile can be of a specific type, and can be identified (relative to the overall map) by an integer tuple, starting at (0,0) in the lower left corner.
Robots can move over the map in the directions N, NE, E, SE, S, SW, W, and NW (see below). The map boundaries limit the movements, e.g. from tile (0,2) you can only move in northern, eastern, and southern directions, but not to the west.

Star Systems and Voids
The tiles of a map are either star systems or voids. Star systems are connected to each other by hyperlanes, which robots can use to travel from one star system to the next. Since voids do not have hyperlanes, they are effectively barriers for robot movement that robots need to navigate around.
We will create an example map, step by step, to illustrate the map concepts. The image above shows our example map with some barriers on it, on the following tiles:
- (1,1)…(1,4)
- (3,6)…(3,7)
- (7,4)
- (4,3)
- (7,0)…(7,1)
Gravity Areas
Traveling hyperlanes requires energy. The robots have a limited supply of energy, which they need to recharge eventually. An MSD map can have areas with different levels of gravity. Depending on that level of gravity, passing a hyperlane from one star system to the next requires a certain amount of energy (reference point is always the target star system).
The above image shows our map with a “typical” configuration of increasing gravity towards the center of the map. Gravity comes in three levels:
| Gravity Level | Energy needed (default) |
|---|---|
| LIGHT | 1 |
| MEDIUM | 2 |
| INTENSE | 3 |
Please be aware that the energy needed per level might be configured differently than the default for each individual game. So you cannot rely on these values to be hardcoded.
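In code, the cost of a jump might be looked up like this (a sketch; the defaults mirror the table above, but as noted they can be overridden per game):

```python
# Default energy cost per gravity level; individual games may override these.
DEFAULT_ENERGY_PER_LEVEL = {"LIGHT": 1, "MEDIUM": 2, "INTENSE": 3}


def jump_cost(target_gravity, energy_per_level=DEFAULT_ENERGY_PER_LEVEL):
    """Energy needed for one hyperlane jump.

    The reference point is always the *target* star system's gravity level.
    """
    return energy_per_level[target_gravity]
```

Note that only the target system's gravity matters: jumping out of an INTENSE system into a LIGHT one costs the LIGHT amount.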
Space Stations, Resource Mines, and Black Holes
Star systems can have a resource mine or a space station, they can be a black hole, or they are just empty. (For simplicity reasons, a star system - in the current version of this game - can only be one of these things. So it cannot be a resource mine and a space station at the same time, nor can it have more than one resource mine.)
Any number of robots - both allied and enemy - can occupy the same star system. The image below shows our example map “ready to play”.
Space Stations
A space station is a traversable tile that serves as both a trading outpost and a safe zone for robots. Combat is not allowed on space stations. Robots can use space stations to trade, purchase upgrades, and repair themselves. New robots also spawn (are built and launched …) at space stations.
Each player is assigned to a space station, so that new robots for the player fleet will always spawn at the same location. Depending on the number of space stations, several players may share the same station.
Trading and upgrading are only possible at space stations. This means that a robot with its cargo area full of mined resources must travel to a space station in order to sell the resources and to get upgrades. Space stations are neutral and accessible to all robots from all players, i.e. robots can travel to any space station, not only the one they spawned from.
Resource Mines
Some star systems contain mines for resource extraction. Depending on the map type, they can be randomly distributed across the map, or deliberately located only in certain parts of it. Each mine produces a single type of resource, and will continue producing until it is depleted. Each planet can have at most one mine.
There are five types of resources, ranked from most common to rarest (hint: the further back the initial letter is in the alphabet, the more valuable the resource.)
- Bio Matter
- Cryo Gas
- Dark Matter
- Ion Dust, and
- Plasma Cores
Our example map above shows a fairly small, but still typical distribution of resources. Bio matter can be found in 8 locations, cryo gas in 5, dark matter in 3, and the most valuable ion dust and plasma cores in 2, respectively. The numbers below the acronyms in the above map describe the available units of the resource.
Once a mine is depleted, it is closed and disappears from the map. Depending on the particular map type definition, new resources may be discovered once the existing units are partially or fully exhausted. This discovery of new resources will not necessarily be at the same location as the old mine.
Mining takes time. A player service can ask for the mining process to start for a dedicated robot. After a short delay, the resource becomes available on the planet, ready to be picked up (by that particular robot). This happens automatically, assuming the robot is capable of doing so. Robots are by default able to transport the least valuable resource (bio matter) in their cargo area. For all other resources, they need to be upgraded.
All resources are volatile substances. If the robot is not capable of holding the mined resource in its cargo area, the resources remain at the mine and evaporate - i.e. they are lost both to the player who initiated the mining and to all other players.
Black Holes
Black holes can be traversed by robots, but entering a star system with a black hole will - with a certain probability - lead to the robot’s destruction. The default likelihood for destruction is 50%, but this might be configured differently in a particular map.
Summary
Summing up, the tiles in a 2-dimensional MSD map follow this schema [1]:
map = { tile }
tile = star system | void
star system = space station | resource mine | black hole | empty
resource mine = bio matter | cryo gas | dark matter | ion dust | plasma cores
[1] For the aficionados - this is supposed to be a dumbed-down Extended Backus-Naur Form (EBNF) :-)
2.2 - (TO BE REVISED) Items
Items - Draft
Buying Items
Players can buy special items from the market. Probably trading has a price list and can issue items itself. Option 1: Trading owns items
- Trading has a list of items
- Players can use funds to buy items
- Items are saved inside of trading?
- Items are saved inside of player?
- Items are saved in a new service (Item Service)?
Using Items
Players can use special items. The core problem: Depending on functionality, the item must be implemented in the respective service. e.g:
- If the item is a weapon / shield, it must be implemented in the robot service.
- If the item is a mining tool, it must be implemented in the mining(planet) service.
Basic event flow:
- Player uses item
- Item service sends event “item type x used”
- Robot/World service receives event and executes action
2.3 - (TO BE REVISED) Game Rule Discussions
Game Rule Discussions
Trading
- Robot has to initiate the selling instead of trading (not consistent with other trades)
- Does selling cost energy?
- How do we enforce debt consequences? If we do not check money beforehand, what should other services implement as a reaction to “player is in debt”
- Items are not defined yet
Map
I have a problem with the map being just a bunch of planets, right next to each other with some other fields in between. I would be more comfortable with it being a grid of tiles, where each tile is a star system. This either contains planets & optionally mining stations, is empty, a black hole, or has a space station.
One tile is a star system, it has up to 8 neighbors, connected by hyperlanes.
A star system contains 0..3 planets, space stations, mining stations, and black holes.
graph TD
%%{init: {'flowchart': {'nodeSpacing': 80, 'rankSpacing': 100}}}%%
%% Row 1
A1[Star System A1] --- A2[Star System A2]
%% Row 2
B1[Star System B1] --- B2[Star System B2]
%% Row 3
C1[Star System C1] --- C2[Star System C2]
%% Vertical connections
A1 --- B1
A2 --- B2
B1 --- C1
B2 --- C2
%% Contents of some star systems
A1 --> P_A1[Planet]
A1 --> M_A1[Mining Station]
C1 --> P_C1[Planet]
C1 --> M_C1[Mining Station]
B2 --> SS_B2[Space Station]
classDef star fill:#eef,stroke:#333,stroke-width:2px;
class A1,A2,B1,B2,C1,C2 star;
classDef feature fill:#99f,stroke:#333,stroke-width:1px;
class P_A1,M_A1,P_C1,M_C1,SS_B2 feature;
3 - Getting Started
3.1 - Hello MSD
3.2 - Develop your Player
3.3 - Getting started with operations
Overview of System Architecture
ArgoCD
Combined with Sealed Secrets, this is theoretically all you need. General workflow:
- Write manifests, or use Kustomize to pull Helm charts
- Encrypt all secrets with kubeseal < secret.yaml > sealed.yaml
- Push to Git
- Profit
Installing Tools
4 - Architecture
For service boundaries & ownership see the dedicated service pages below.
4.1 - Design Principles
Global Design Principles
In MSD, we maintain a couple of global design principles that are supposed to help in decision making. Every time we face a prioritization decision, these global principles should help decide the issue. Therefore, this is a living document.
Technical Principles
(tbd - just keywords so far)
- JSON as configuration format (so that it can be reused in REST APIs)
4.2 - Game Service
4.4 - Trading Service
Functionalities
- Buying Robots
- Upgrading Robots
- Selling Resources
- (Buying Items)
Aggregates / Owns
- Money
- (Items)?
4.5 - Map Service
Use Cases
Create new Map Type
Trigger: REST call (POST containing the map type specification, format see TODO)
Responsible Aggregate: MapType
What happens?
- Store the specification
- Return the MapType ID
Produced Event(s): MapTypeCreated
Create Map Instance for a new Game, based on a Map Type
Trigger: Event (originating from Game service) that a new game has been created.
Responsible Aggregate: MapType
What happens?
- TODO
Produced Event(s): MapTypeInstanciated
- Generating Map
- Generating Resources
- Mining / Depleting Resources
Aggregates
See list below.
4.5.1 - MapType (Aggregate)
This page documents the design decisions for the MapType aggregate, which is responsible for
specifying a certain type of map, and creating an instance for it.
Configuration Principle
A map type is defined by its size and description. The grid cells are then further configured by a configuration which consists of the following sequential sections.
- Gravity distribution zones
- The gravity distribution is defined via a sequence of map areas, each having a certain gravity level.
- Those grid cells not covered by this definition have the default gravity value.
- Map structure - void / planet definition
- This section consists of several sequential layers of defining either planets or voids.
- A void is effectively a barrier, while a planet can carry a resource, a space station, or a black hole.
- Each new layer overwrites the ones before, allowing for complex distributions:
  - e.g. you could define a maze-like structure by first setting the whole map (or a large section of it) to `void`, and then adding "paths" of `planet`s on top.
  - on the other hand, if you want large open space with one or several barriers in it, you start with a `planet`-defining layer and then add the `void`s as barriers.
- Distributions of resources, space stations, or black holes.
- Distributions consist of an area definition
4.5.2 - Map Configuration
Map configuration is handled by the aggregate MapConfig.
Functionalities
- Generating Map
- Generating Resources
- Mining / Depleting Resources
Aggregates
- Map (Constellation):
- Planets
- Mining Stations
- Resources incl. their distribution
5 - Player Development
5.1 - Local Dev Environment
5.2 - Player Skeletons
6 - Operations
6.1 - Kubernetes
6.2 - Monitoring
6.3 - AWS Hosting
6.4 - Bare-Metal Hosting
This page serves as a reference for how and why we set up the server the way we did. As a short overview:
We are using the following software to manage our server:
- RKE2: A Kubernetes distribution that is easy to install and relatively robust.
- Longhorn: A storage solution for Kubernetes that uses the local hard-drive.
- Nginx-Ingress: A reverse proxy that is used to route traffic to the correct service.
- Cert-Manager: A Kubernetes add-on that automates the management and issuance of TLS certificates from letsencrypt.
- Sealed Secrets: A tool for encrypting Kubernetes Secrets into a format that can be safely stored in a public repository.
- ArgoCD: A declarative, GitOps continuous delivery tool for Kubernetes.
- Kustomize: A tool for joining & customizing YAML configurations.
- Helm: A package manager for Kubernetes that makes installing applications pretty simple.
We are using a single Rocky Linux server running RKE2 (a Kubernetes distribution). All software that runs on the server is defined in our ArgoCD repository. This is picked up by ArgoCD (running on the server) and applied continuously, so any changes to the repository are automatically applied to the server. This also means that manual changes will be discarded within seconds.
Overview Diagram

An overview of (almost) all the components running on our server, note that everything in the blue lines is running in Kubernetes
Let’s dig into the components shown here, starting with the most important:
ArgoCD
The core principle of ArgoCD is pretty simple: if you want to make a change to your cluster, what do you do?
Use kubectl apply -f my-manifest.yaml. This is essentially what ArgoCD does, but in a slightly more sophisticated and declarative way. Every manifest you define in an “Application” (we’ll get to that) is applied to the cluster.
The good thing: you can reference a Git repository, and ArgoCD will automatically apply any changes to the cluster.
Under the hood, ArgoCD uses Kustomize, which is a powerful tool for combining, patching and generally customizing YAML files on the fly. For example, if you have two yaml files:
- ingress.yaml
- ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: game
spec:
  ingressClassName: nginx
  rules:
    - host: game.microservice-dungeon.de
      http:
        paths:
          - backend:
              service:
                name: game
                port:
                  number: 8080
            path: /
            pathType: Prefix
- service.yaml
apiVersion: v1
kind: Service
metadata:
  name: game
spec:
  ports:
    - name: http
      port: 8080
      protocol: TCP
      targetPort: 8080
  selector:
    app.kubernetes.io/instance: game
    app.kubernetes.io/name: game
  type: ClusterIP
You can use Kustomize to combine them into a single file, which can then be applied to the cluster.
In fact, kubectl has built-in support for Kustomize. To know which files to combine how, you have to use
a kustomization.yaml file (named exactly like this), which looks like this:
kind: Kustomization
apiVersion: kustomize.config.k8s.io/v1beta1
resources:
- ingress.yaml
- service.yaml
If you run kubectl kustomize . in the directory where the kustomization.yaml file is located, it will
combine the two files into a single file and print it to the console.
This is what ArgoCD does, but it does it for you. You define an “Application” in ArgoCD, which points to a kustomization.yaml file in a Git repository.
ArgoCD will then automatically apply any changes to the cluster -
just like running kubectl apply -k . in the directory where the kustomization.yaml file is located -
but automatically and continuously as the files change.
Now, we are not only using plain yaml files, but also Helm charts. These are a bit more complex (see Helm if you want to learn what they are and why we use them).
Simply said, Helm is a package manager, which you can use to install applications on your cluster. Most of the time you customize the installation with “values”, which are basically installation
options defined in YAML. Kustomize can be used to render a Helm chart, but you need to use the --enable-helm flag. For example, a kustomization.yaml for cert-manager would look like:
kind: Kustomization
apiVersion: kustomize.config.k8s.io/v1beta1
helmCharts:
- name: cert-manager
repo: https://charts.jetstack.io
version: v1.17.0
releaseName: cert-manager
namespace: cert-manager
valuesFile: certmanager-values.yaml
With the certmanager-values.yaml file looking like this:
crds:
enabled: true
This will install cert-manager into the cert-manager namespace,
using the certmanager-values.yaml file to customize the installation. Under the hood, ArgoCD converts the chart into plain YAML files; you can see this by running kubectl kustomize . --enable-helm in the directory where the kustomization.yaml file is located.
Helm has to be installed on your system for that to work.
In the ArgoCD install process we have set the following parameters to enable Helm support:
configs:
cm:
create: true
kustomize.enabled: true
helm.enabled: true
kustomize.buildOptions: --enable-helm
Sealed Secrets
In order for us to store secrets like passwords and other sensitive configurations in our git repo, we need to encrypt them. This is where Sealed Secrets comes in. This is a tool that uses asymmetric encryption with a public and private key to encrypt secrets. You can use the kubeseal cli to encrypt an existing secret like this:
kubeseal --cert sealing-key.pem < secret.yaml > sealed.yaml
This assumes that there is a sealing key (a public key/certificate) lying somewhere on the machine you are using kubeseal on. In our ArgoCD
repository it is provided in the applications folder. If you don’t have the key, but have kubectl access,
you can use kubeseal to either fetch the sealing key with
kubeseal --fetch-cert > cert.pem or use kubeseal directly:
kubeseal \
-f path/to/unencrypted/secret.yaml \
-w output/path.yaml \
--controller-name=sealed-secrets \
--controller-namespace=sealed-secrets \
--format=yaml
Once you apply a sealed secret to the cluster, the controller will decrypt it and make it available under the name and namespace of the original secret.
Updating / Merging a secret:
You can merge / update a sealed secret by creating a secret with the same keys as the original secret and using the --merge-into option. Suppose you have your sealed secret sealed.yaml:
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
name: game-sensitive-config
namespace: core-services
spec:
encryptedData:
DB_NAME: AgCgSNobnCxjTaOpYcDwwfUEqeCAL6loxQDqzWIIgna7B58gbTC3MWUio/...
DB_PASSWORD: AgBmAh8Yi8Dz+gqVF1GwiFnooEfv8o3xYL3UHEDUhVK2rmSd1f7BHUGVE...
DB_USER: AgBbVZ99mft7oVuWcHpSV0D+hRRvFousesknAxfVgMdOwRO1BzTYin1SmlRdf...
RABBITMQ_PASSWORD: AgB5fB3P3O/tLuJyPjg7cu3TQcebJAWJbsqoR4ucy8Z8WFhFJ9L...
and the values you want to change in secret.yaml:
apiVersion: v1
kind: Secret
metadata:
name: any-name-works
namespace: core-services
type: Opaque
stringData:
DB_NAME: "database"
DB_PASSWORD: "password"
Use the following command to update the sealed secret (optionally with the --cert flag for local use)
kubeseal -o yaml --merge-into sealed.yaml < secret.yaml
Some notes:
- Sealing keys rotate automatically every 30 days, so you should re-fetch the public sealing key every once in a while. If you keep the private keys backed up somewhere, you also need to re-fetch those, as they are rotated as well. This security feature ensures that if one of your decryption keys is compromised, only the secrets sealed in the last 30 days are affected.
- Do not commit any unsealed secrets to the repository. If you do, change all the passwords of the affected services. Don’t just delete the secret or encrypt it afterwards.
- Sealing is designed to be a one-way process. You can unseal a sealed secret if you have the private key, but this is not recommended by the developers.
Ingress & TLS
An Ingress is used to route traffic to the correct service. It does so based on host and path, in our case for
example: game.microservice-dungeon.de will route to the game service, while robot.microservice-dungeon.de
will route to the robot service. Some resources might need encryption; in our setup we use cert-manager
to issue TLS certificates from letsencrypt. You just need to include the annotation
cert-manager.io/cluster-issuer: letsencrypt-production and the tls section in your ingress resource:
tls:
- hosts:
- my.domain.com
secretName: some-secret-name
Afterwards cert-manager will issue a certificate for your ingress.
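Putting the annotation and the tls section together, a complete Ingress could look like the following sketch; host, service name, port, and secret name are placeholders, not values from our cluster:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-service
  annotations:
    # tells cert-manager to issue a certificate for the hosts listed under tls
    cert-manager.io/cluster-issuer: letsencrypt-production
spec:
  ingressClassName: nginx
  rules:
    - host: my.domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service
                port:
                  number: 8080
  tls:
    - hosts:
        - my.domain.com
      # cert-manager stores the issued certificate in this secret
      secretName: some-secret-name
```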
If you want to read more about how cert-manager works, read the Let’s Encrypt documentation on HTTP01 and DNS01 solvers. We use the hetzner-webhook to solve DNS01 challenges, so you can also issue wildcard certificates.
Storage
Storage is usually a bit tricky in Kubernetes, since the hard drives are hooked up to specific nodes.
Even though we only use a single node, we still use Longhorn to manage our storage.
It provides a storage class called longhorn-static, which you can use to create persistent volumes.
We have a total capacity of 1 TB.
Helm
Helm is a package manager for Kubernetes. It works by using templates to configure installations via a values file. Usually a template will look like this:
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "robot.fullname" . }}
labels:
{{- include "robot.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.replicas | int }}
selector:
matchLabels:
{{- include "robot.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "robot.selectorLabels" . | nindent 8 }}
spec:
serviceAccountName: {{ include "robot.serviceAccountName" . }}
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.image.registry }}/{{ .Values.image.name }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
{{- with .Values.service.my }}
- name: {{ .portName | lower }}
containerPort: {{ .port }}
protocol: {{ .protocol }}
{{- end }}
{{- if .Values.resources }}
resources:
{{- toYaml .Values.resources | nindent 10 }}
{{- end }}
envFrom:
- configMapRef:
name: {{ include "robot.fullname" . }}
{{- if .Values.env }}
env:
{{- range $key, $value := .Values.env }}
- name: {{ $key }}
value: {{ tpl $value $ | quote }}
{{- end }}
{{- end }}
They are a bit hard to read, but basically they are normal YAML files with placeholders. Everything prefaced with .Values comes from the values file. There are some built-in functions like include, toYaml, tpl, and more.
Remote Management of the HPC Node
Our compute node in the HPC is remotely managed via IPMI. To use IPMI, we have to log in via m02, another node. After that, we can use the ipmitool on the command line. The IP addresses are as follows:
- goedel-m01: 10.218.112.200
- goedel-m02: 10.218.112.201
A collection of management commands:
# Query power status
ipmitool -U ADMIN -P ag3.GWDG -H 10.218.112.200 -I lanplus power status
# Power on
ipmitool -U ADMIN -P ag3.GWDG -H 10.218.112.200 -I lanplus power on
# Power off
ipmitool -U ADMIN -P ag3.GWDG -H 10.218.112.200 -I lanplus power off
# Query sensor status (shows temperatures, fan speeds, and other sensor readings)
ipmitool -U ADMIN -P ag3.GWDG -H 10.218.112.200 -I lanplus sensor
# Power consumption
ipmitool -U ADMIN -P ag3.GWDG -H 10.218.112.200 -I lanplus dcmi power reading
7 - Reference
7.1 - Game
7.2 - Robot
7.3 - Planet
7.4 - Trading
7.5 - Dashboard
8 - Glossary
Entities
| Domain | Description |
|---|---|
| Black hole | TODO |
| Cooldown | TODO |
| Energy | TODO |
| Game | TODO |
| Life points | TODO |
| Map | TODO |
| Mine | TODO |
| Planet | TODO |
| Player | TODO |
| Resource | TODO |
| Robot | TODO |
| Space station | TODO |
| Upgrade | TODO |
| Void | TODO |
Actions
| Domain | Description |
|---|---|
| Mining | TODO |
9 - Wiki
9.1 - Event Driven Architecture
1. Patterns
1.1 Idempotent Event Consumer
Consumers must be able to process events idempotently. In general, an at least once delivery guarantee is assumed. This means that a consumer may receive the same event multiple times.
There are several reasons for this:
- Typically, consumers receive and acknowledge events in batches. If processing is interrupted, the same batch will be delivered again. If some events in the batch have already been processed, a non-idempotent consumer will process them again.
- Some brokers (including Kafka) support transactions, allowing receiving, acknowledging, and sending new events to occur atomically. However, it’s not that simple:
  - First, these transactions are very costly and lead to significantly higher latencies. In larger systems, those costs add up. These costs arise because processing no longer happens in batches, as each event is transmitted individually, along with additional calls similar to a two-phase commit.
  - Second, this approach only works as long as a single technology is used. Otherwise, a transaction manager is required — the difference between local and global transactions. Else, events may be lost or published multiple times — bringing us back to the original problem. (See atomic event publication)
Idempotent Business Logic
In most (or even all?) cases, business logic can be made idempotent. It’s important to distinguish between absolute and relative state changes.
- Absolute state changes: With an absolute state change, a value is typically replaced by a new one — such as an address. These operations are always idempotent.
- Relative state changes: A relative state change occurs when, for example, an amount is deducted from an account balance. Reprocessing the event would alter the balance again. Alternatively, an account balance can be modeled as a sequence of transactions. If the event ID is unique and part of the SQL schema, reprocessing would result in an error — this is essentially event sourcing.
Remembering Processed Events
Alternatively, the IDs of processed events can be stored in the database. Before processing, a lookup can be performed to check whether the event has already been handled.
For consistency, it’s crucial that storing the event ID happens within the same transaction as all other domain changes.
A prerequisite for a simple solution without locking is that the same event cannot be processed in parallel by different consumers. With brokers like Kafka, this is guaranteed — there can only be one consumer per partition.
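A minimal sketch of this lookup in Python with SQLite (table and event names are illustrative). The key point is that the processed-event marker and the domain change share one local transaction, so either both survive or neither does:

```python
import sqlite3

def handle_event(conn: sqlite3.Connection, event_id: str, amount: int) -> bool:
    """Apply a 'deposit' event idempotently; returns False on a duplicate delivery."""
    try:
        with conn:  # one transaction for marker + domain change
            # Fails with IntegrityError if the event was already processed.
            conn.execute("INSERT INTO processed_events(event_id) VALUES (?)", (event_id,))
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = 1", (amount,))
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate: skip silently, nothing was changed

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events(event_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE account(id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")

handle_event(conn, "evt-42", 50)  # first delivery: applied
handle_event(conn, "evt-42", 50)  # redelivery: ignored
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)  # 150, not 200
```

Because the unique constraint rejects the second insert, the balance update in the same transaction is rolled back as well.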
Transactional Event Processing
Another alternative is the use of local (or global) transactions, accepting the associated drawbacks — assuming the broker supports transactions at all.
It’s important to note that when using multiple technologies, a transaction manager is required. Local transactions only work within a single technology — in this case the broker. That’s why exactly once processing works in Kafka Streams, where Kafka supports the role of a database as well.
1.2 Atomic Event Publication
Every change in a domain leads to the publication of an event in an event-driven architecture.
For the overall system to remain consistent, it would be ideal if the change and the publication occurred atomically within the same transaction. This would provide the highest possible level of consistency.
However, this is not easily achievable as database and broker are typically separate systems. To enable a global transaction across both, a transaction manager would be needed.
💡 A local transaction refers to a transaction within a single system (or technology), such as a database. When multiple systems are involved, separate local transactions are created independently. What’s missing is a mechanism to coordinate them. That’s where the transaction manager comes in — a separate system that synchronizes local transactions into a global transaction. This is the only way to achieve exactly once semantics across systems. All other approaches relax this guarantee.
The problem is very similar to ACID and CAP considerations in distributed systems, where trade-offs are made between different properties—or guarantees.
What are these guarantees?
- Ordering Guarantees: A guarantee regarding the order of processing — whether it must match the original processing sequence or not.
- Delivery / Publication Guarantees: A guarantee regarding the delivery of events — whether events may be delivered multiple times, exactly once, or at most once.
- Read Your Writes: A guarantee about immediate consistency — i.e., whether a system can read its own changes right after writing them.
It’s important to note that some guarantees exist along a spectrum. The theory is much more detailed, so this section serves merely as an introduction. There are three approaches to solving this problem. Each fulfills different guarantees and comes with its own trade-offs.
Transaction Manager
A transaction manager coordinates local transactions across independent systems — essentially acting as the orchestrator of a two-phase commit (2PC).
exactly once in-order - A transaction manager provides by far the highest level of consistency between two systems. For this reason, it is often used in banking systems.
Disadvantages
- Performance: A transaction manager introduces significant overhead, as it is a separate service that orchestrates a 2PC between systems. In Kafka, for example, using local transactions would also mean that events could no longer be processed in batches.
- Cost: Transaction managers are often commercial products — there are few, if any, free alternatives.
- Support: The involved systems must implement certain standards to integrate with a transaction manager.
Transactional Outbox Pattern
With a transactional outbox, the problem of independent transactions is deferred by storing events in the database as part of the same local transaction. A separate process reads and publishes these events in order.
at least once in-order – The storage of events happens exactly once, while their publication is at least once. The problem of independent transactions is effectively shifted by this pattern. As long as reading and publishing do not happen concurrently, the event order is preserved.
Types of Transactional Outboxes
- Polling Outbox: The database is regularly polled for new events. This approach has higher latency and significantly increased resource consumption.
- Tailing or Subscriber Outbox: This type subscribes to a change stream of the database. Databases like Redis and MongoDB offer built-in subscriber mechanisms. For others, the write-ahead log can be used to achieve the same effect.
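A minimal sketch of the polling variant in Python with SQLite (table and event names are illustrative); a real poller would publish to a broker rather than append to a list:

```python
import sqlite3

def place_order(conn: sqlite3.Connection, order_id: str, payload: str) -> None:
    # Domain change and outbox entry in one local transaction: stored exactly once.
    with conn:
        conn.execute("INSERT INTO orders(id) VALUES (?)", (order_id,))
        conn.execute("INSERT INTO outbox(payload) VALUES (?)", (payload,))

def poll_and_publish(conn: sqlite3.Connection, publish) -> None:
    # Read pending events in insertion order, publish, then mark as sent.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(payload)  # a crash before the UPDATE causes a re-publish: at-least-once
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders(id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE outbox(id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " payload TEXT, sent INTEGER DEFAULT 0)"
)

published = []
place_order(conn, "o-1", "OrderPlaced:o-1")
place_order(conn, "o-2", "OrderPlaced:o-2")
poll_and_publish(conn, published.append)
print(published)  # ['OrderPlaced:o-1', 'OrderPlaced:o-2']
```

Note how the ordering guarantee hinges on a single poller reading `ORDER BY id`; running several pollers concurrently would break it.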
Further Considerations and Challenges
- Scaling: As throughput increases, scaling and latency can become issues. Depending on the context, sharding may help — but it’s important that the outbox sharding aligns with the broker’s sharding. Otherwise, event ordering may break.
- Single Point of Failure: A single outbox instance represents a single point of failure. Therefore, a cluster with leader election is needed to ensure high availability and low latency.
Event Persistence
Similar to the outbox pattern, events are stored in the database along with an additional status field. However, unlike the outbox, events are published by the same process followed by a status update.
An additional process periodically scans the database for unpublished events and publishes them as a fallback.
at least once out of order – Due to the nature of the publishing mechanism, maintaining the correct order is a best effort.
💡 This is by the way the mechanism Spring Modulith uses for its internal eventing. Something to keep in mind.
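A sketch of this status-field approach in Python with SQLite (table and event names are illustrative, not the actual Spring Modulith schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events(id INTEGER PRIMARY KEY, payload TEXT, published INTEGER)")
published = []

def save_and_publish(conn: sqlite3.Connection, payload: str, publish) -> None:
    # 1. Store the event as unpublished in the same transaction as the domain change.
    with conn:
        cur = conn.execute("INSERT INTO events(payload, published) VALUES (?, 0)", (payload,))
        event_id = cur.lastrowid
    # 2. Publish from the same process, then flip the status in a second transaction.
    publish(payload)
    with conn:
        conn.execute("UPDATE events SET published = 1 WHERE id = ?", (event_id,))

def republish_pending(conn: sqlite3.Connection, publish) -> None:
    # Fallback sweep: anything still unpublished is sent again (at-least-once,
    # and out of order relative to events published in the meantime).
    rows = conn.execute("SELECT id, payload FROM events WHERE published = 0").fetchall()
    for event_id, payload in rows:
        publish(payload)
        with conn:
            conn.execute("UPDATE events SET published = 1 WHERE id = ?", (event_id,))

save_and_publish(conn, "RobotMoved", published.append)
# Simulate a crash between storing an event and publishing it:
with conn:
    conn.execute("INSERT INTO events(payload, published) VALUES ('RobotSpawned', 0)")
republish_pending(conn, published.append)
print(published)  # ['RobotMoved', 'RobotSpawned']
```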
Separate Transaction
Another option is to more or less ignore the problem.
at most once in-order – Both local transactions would — if possible — be nested, so that a failure in the inner transaction leads to a rollback of the outer one. However, if the outer transaction fails after the inner one has been executed, this will inevitably lead to inconsistency.
💡 This is exactly what happens when using @Transactional in Spring-Kafka and Spring-Data without an external transaction manager.
9.2 - Databases
1. Concepts
1.1 Isolation Levels
During parallel execution of transactions, a variety of race conditions can occur. Transaction isolation levels are methods used to prevent these. Each level is characterized by the type of race condition it prevents. Moreover, each level also prevents all the race conditions addressed by the previous levels. It is important to note that each database engine implements these levels differently.
| | Dirty Read | Non Repeatable Read | Write Skew (Phantom) |
|---|---|---|---|
| Read Uncommitted | X | X | X |
| Read Committed | | X | X |
| Repeatable Read | | | X |
| Serializable | | | |
Read Committed
The lowest of the isolation levels that provides the following guarantees:
- All reads operate on data that has been fully committed. No dirty reads.
- All writes operate on data that has been fully committed. No dirty writes.
Without these guarantees, the level is referred to as Read Uncommitted.
Dirty Reads
In a dirty read, a transaction reads the uncommitted changes of ongoing transactions. This allows the reading of intermediate states. If any of these transactions then fail, data that should never have existed would have been read.
Dirty Writes
In a dirty write, an update is made to the uncommitted changes of a concurrently running transaction. This poses a problem because, depending on the timing, only parts of each transaction might be applied.
Example:
- Buying a car requires updates to two tables — the listing and the invoice. Two interested parties, A and B, try to buy the same car at the exact same time.
- Transaction A updates the listing, but then briefly pauses (this may happen due to a CPU context switch).
- Transaction B overwrites A’s update on the listing.
- Transaction B updates the invoice.
- Transaction A resumes and overwrites the invoice.
- Buyer B has purchased the car (last write operation), but buyer A receives the invoice.
Implementation
To implement Read Committed, PostgreSQL, for example, uses Multi-Version Concurrency Control (MVCC) based on timestamps. Transactions only read data that was written before their start.
Repeatable Read
The isolation level above Read Committed, which additionally prevents nonrepeatable reads. Also referred to as Snapshot Isolation because each transaction operates on its own snapshot of the database.
Nonrepeatable Reads (or Read Skew)
A nonrepeatable read (also known as read skew) occurs when an aggregate function (such as SUM) is applied to a range of rows that change during its computation — particularly entries that have already been read. This affects inserts, updates, and deletes equally. If re-executing the function yields a different result, it is considered a nonrepeatable read.
Example:
| ID | Salary |
|---|---|
| 1 | 1000 |
| 2 | 2000 |
| 3 | 3000 |
| 4 | 2500 |
| 5 | 1000 |
A session executes a SUM over the salaries. Meanwhile, the employee with ID 3 is deleted — at the moment the computation reaches ID 4. As a result, an incorrect total salary is calculated.
Nonrepeatable reads are especially dangerous for long-running processes that rely on data integrity, such as backups, analytic workloads, and integrity checks.
Serializable
The highest isolation level, where transactions are executed sequentially — at least according to the standard. In reality, databases deviate from this. For example, PostgreSQL uses monitoring to detect conflicts between concurrently running sessions. In the event of a conflict, one of the two transactions is aborted.
Sequential execution prevents lost updates, write skews, and phantoms. However, these issues can also be avoided through proper locking.
Lost Updates
A lost update is a read-modify race condition on a shared row. Here, one session reads a value as input for a calculation and then updates it. Meanwhile, a parallel session updates the same value between the read and write operations. This intermediate update is lost.
Example:
| ID | Value |
|---|---|
| 1 | 3 |
- Session A reads the value 3 and intends to increase it by 2.
- Between read and the write, a parallel session writes the value 4.
- When Session A writes the value 5, the update to 4 is lost.
Solutions include serialized execution, locking, and atomic operations, although the use of atomic operations is limited to specific cases.
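The race above can be reproduced on a single connection by interleaving the statements by hand; the atomic variant folds the read and the write into one UPDATE so no intermediate write can be lost (a sketch using SQLite):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counter(id INTEGER PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO counter VALUES (1, 3)")

# Read-modify-write: Session A reads 3 ...
value_a = conn.execute("SELECT value FROM counter WHERE id = 1").fetchone()[0]
# ... meanwhile a parallel session writes 4 ...
conn.execute("UPDATE counter SET value = 4 WHERE id = 1")
# ... and A's write, based on the stale read, silently discards that update.
conn.execute("UPDATE counter SET value = ? WHERE id = 1", (value_a + 2,))
print(conn.execute("SELECT value FROM counter WHERE id = 1").fetchone()[0])  # 5: the 4 is lost

# Atomic operation: read and write happen in one statement inside the engine.
conn.execute("UPDATE counter SET value = value + 2 WHERE id = 1")
final = conn.execute("SELECT value FROM counter WHERE id = 1").fetchone()[0]
print(final)  # 7
```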
Write Skew (Phantoms)
A write skew is a race condition involving reads on shared entries and writes on separate entries. This applies equally to inserts, updates, and deletes.
Example (1) — materialized:
- A hospital’s shift plan requires that at least one doctor is on call at all times. For one evening, two doctors (A) and (B) are scheduled. Both wish to sign off.
- (A) and (B) attempt to sign off at the same time. The system checks in parallel whether at least one doctor remains on call — which is true in both cases. Both are removed concurrently.
- As a result, no doctor remains on call.
Example (2) — unmaterialized:
- Booking a meeting room is handled through time slots; stored as entries with a start and end time, assigned to a room and a responsible person.
- Two people (A) and (B) try to book the room at the same time. The system checks if a booking exists for the requested time slot — in both cases, none is found. Two new entries are created simultaneously.
The difference between Example (1) and Example (2) is that in (1) the conflict is materialized, while in (2) it is not. This means in (1) there are existing entries that could be locked; in (2) there are no entries yet — and you cannot lock what doesn’t exist.
Solutions include serialized execution and locks. However, locks require a materialized conflict — or they must be applied to the entire table.
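The check-then-act race from Example (1) can be illustrated without a database; each "transaction" checks against its own snapshot rather than the live state (a simplified sketch):

```python
on_call = {"A", "B"}

def may_sign_off(snapshot: set) -> bool:
    # The invariant: at least one doctor must remain on call.
    return len(snapshot) > 1

# Both transactions take their snapshot at the same moment ...
snap_a = set(on_call)
snap_b = set(on_call)

# ... so both checks pass, and both removals are applied.
if may_sign_off(snap_a):
    on_call.discard("A")
if may_sign_off(snap_b):
    on_call.discard("B")

print(on_call)  # set(): no doctor remains on call, the invariant is violated
```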
Postgres Specifics
Repeatable Read
In PostgreSQL, Repeatable Read is implemented as snapshot isolation using MVCC. Each transaction sees a snapshot of the database taken at the moment it starts. The following points are important:
- Locking: Locks are applied independently of versions. Only entries visible in the transaction’s snapshot are considered — newer entries are ignored.
- Exception Handling: When executing updates, deletes, or applying locks, an exception is thrown if a newer version of the affected entries exists outside the snapshot. Read-only queries without locking are not affected.
On the application side, proper error handling must involve retrying the entire transaction in order to work with a newer snapshot.
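Such a retry loop might look like this sketch, with the exception class standing in for a driver-specific serialization failure (e.g. SQLSTATE 40001 in PostgreSQL); names and attempt count are illustrative:

```python
class SerializationError(Exception):
    """Stand-in for a driver-specific serialization/snapshot conflict error."""

def run_with_retry(transaction, attempts: int = 3):
    # Retry the whole transaction, so each attempt starts on a fresh snapshot.
    for attempt in range(attempts):
        try:
            return transaction()
        except SerializationError:
            if attempt == attempts - 1:
                raise  # give up after the last attempt

# A stub transaction that fails once, then succeeds on the fresh attempt.
calls = {"n": 0}
def flaky_transaction():
    calls["n"] += 1
    if calls["n"] == 1:
        raise SerializationError("snapshot too old")
    return "committed"

result = run_with_retry(flaky_transaction)
print(result)  # committed
```

The important detail is that the retry wraps the entire transaction body, not just the failing statement, since only a new transaction gets a new snapshot.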
Serializable
This isolation level extends the snapshot isolation of Repeatable Read by adding conflict monitoring through predicate locks. True serialization is not achieved; instead, an optimistic approach is used, resolving conflicts by aborting transactions when they are detected.
Access to entries is recorded as predicate locks — visible in pg_locks. If a conflict arises, one of the involved transactions is aborted with an error message. Predicate locks do not play a role in deadlocks!
Important points when using Serializable:
- Exception Handling and Consistency: Applications must implement error handling for serialization exceptions by retrying the entire transaction.
- Read Consistency: Reads are only considered consistent after a successful commit, because a transaction may still be aborted at any time. This does not apply to read-only transactions. Their reads are consistent from the moment a snapshot is established; these transactions must be flagged as read-only.
- Locking: Explicit locking becomes unnecessary when Serializable is used globally. In fact, for performance reasons, explicit locks should be avoided!
- Mixing Isolation Levels: Conflict monitoring only applies to transactions running under Serializable. Therefore, it is recommended to set the isolation level globally.
Important: Keep in mind that sequential scans and long-running transactions can lead to a high number of aborts as the number of concurrent users increases. Serializable works best with small, fast transactions.