Monitoring

Monitoring architecture

Our monitoring architecture looks like this:

@startuml

!theme blueprint
skinparam linetype ortho
left to right direction

node mattermost [
  **💬   Mattermost**

  Messaging platform.
]

node uptime [
  **🚦   Uptime**

  Service monitoring.
]

node grafana [
  **📊   Grafana**

  Dashboard & alert manager.
]

node prometheus [
  **🧮   Prometheus**

  Metrics collector.
]

node loki [
  **📄   Loki**

  Log receiver.
]

node alloy [
  **🧲   Alloy**

  Log & metrics collector.
]

node node_exporter [
  **🖥   Node Exporter**

  Metrics exporter.
]

uptime --> mattermost
mattermost <-- grafana

grafana ~> prometheus : connects
loki <~ grafana : connects

loki <-- alloy : sends
prometheus <-- alloy : sends

prometheus -> node_exporter : scrapes
@enduml

Monitoring alerts

Alerts from both service monitoring and infrastructure monitoring are sent to the Monitoring channel in Mattermost.
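As a rough illustration of what such an alert message can look like on the wire, here is a minimal sketch. It assumes delivery through a Mattermost incoming webhook, which this page does not confirm; the channel name, bot name, and message wording are hypothetical.

```python
import json


def build_mattermost_alert(title, status, channel="monitoring"):
    """Build the JSON body for a Mattermost incoming webhook.

    Assumption: alerts arrive via an incoming webhook. The channel and
    username values are hypothetical placeholders.
    """
    # Pick an emoji shortcode depending on whether the alert has recovered.
    icon = ":white_check_mark:" if status == "resolved" else ":rotating_light:"
    payload = {
        "channel": channel,             # hypothetical channel name
        "username": "monitoring-bot",   # hypothetical display name
        "text": f"{icon} **{title}** is {status}",
    }
    return json.dumps(payload)
```

In practice Uptime Kuma and Grafana build and send these messages themselves; the sketch only shows the shape of the payload.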

Service monitoring

Uptime Kuma

We’re using Uptime Kuma to monitor all of our services.

Important

Please note that we’re also running a public status site based on Uptime Kuma.

Note

The deployment and all documentation for Uptime Kuma can be found in the GitLab Uptime project.

Infrastructure monitoring

Grafana

We’re using Grafana OSS as our dashboard & alert manager to:

  • Access Prometheus metrics via PromQL

  • Access Loki logs via LogQL

  • Visualise metrics & logs

  • Send alert notifications to Mattermost
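To make the PromQL and LogQL access a little more concrete, here is a minimal sketch of one query in each language and of the HTTP API URLs such queries can be sent to. The base URLs and the `job="docker"` label are hypothetical; `node_cpu_seconds_total` is a standard Node Exporter metric, but whether it matches our dashboards is an assumption.

```python
from urllib.parse import urlencode

# Hypothetical base URLs; the real ones live in the deployment projects.
PROMETHEUS_URL = "http://prometheus.example:9090"
LOKI_URL = "http://loki.example:3100"

# PromQL: percentage of non-idle CPU time per instance over 5 minutes.
CPU_BUSY_PROMQL = (
    '100 - avg by (instance) '
    '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100'
)

# LogQL: log lines containing "error" from a hypothetical "docker" job.
ERROR_LOGS_LOGQL = '{job="docker"} |= "error"'


def prometheus_instant_query_url(base_url, promql):
    """Return the URL for a Prometheus instant query (/api/v1/query)."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def loki_range_query_url(base_url, logql, start_ns, end_ns, limit=100):
    """Return the URL for a Loki range query (/loki/api/v1/query_range).

    start_ns and end_ns are Unix timestamps in nanoseconds.
    """
    params = {"query": logql, "start": start_ns, "end": end_ns, "limit": limit}
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"
```

Grafana issues equivalent requests through its data sources, so the same query strings can be pasted into a panel or the Explore view.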

Note

The deployment and all documentation for Grafana can be found in the GitLab Grafana project.

Prometheus

We’re using Prometheus as our metrics collector to:

  • Scrape OS and 🐳 Docker metrics

  • Scrape application metrics

  • Persist those metrics

  • Query, aggregate, and correlate metrics via Grafana

Note

The deployment and all documentation for Prometheus can be found in the GitLab Prometheus project.

Loki

We’re using Grafana Loki as our log receiver to:

  • Receive OS and 🐳 Docker logs

  • Receive application logs

  • Persist those logs

  • Query, aggregate, and correlate logs via Grafana
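For orientation, here is a minimal sketch of the JSON body that Loki's push endpoint (`/loki/api/v1/push`) accepts. In our setup Alloy does the sending, so this only illustrates the shape of the data; the label set and log lines are hypothetical.

```python
import json
import time


def build_loki_push_body(labels, lines, now_ns=None):
    """Build a JSON body for Loki's push API.

    labels: dict of stream labels, e.g. {"job": "docker"} (hypothetical).
    lines:  list of log lines that belong to that stream.
    now_ns: base timestamp in nanoseconds; defaults to the current time.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    # Loki expects [timestamp, line] pairs, with the timestamp given as a
    # string of nanoseconds since the Unix epoch. Adding the index keeps
    # the entries of one batch in order.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return json.dumps({"streams": [{"stream": labels, "values": values}]})
```

Each distinct label set forms its own stream, which is what LogQL selectors such as `{job="docker"}` match against.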

Note

The deployment and all documentation for Loki can be found in the GitLab Loki project.

Alloy

We’re using Grafana Alloy collectors to gather logs and metrics and send them to Loki and Prometheus.

Note

The deployment and all documentation for Alloy can be found in the GitLab Alloy project.

Node Exporter

We’re using Node Exporters to expose OS metrics, which Prometheus scrapes.

Note

The deployment and all documentation for the Node Exporter can be found in the Ansible node_exporter role.
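To show what Prometheus sees when it scrapes a Node Exporter, here is a minimal sketch that parses a small sample of the Prometheus text exposition format. The sample values are made up, and the parser is deliberately simplified: it skips comment lines and any line that carries a timestamp.

```python
import re

# Made-up sample of a Node Exporter /metrics response.
SAMPLE_SCRAPE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_filesystem_avail_bytes Filesystem space available in bytes.
# TYPE node_filesystem_avail_bytes gauge
node_filesystem_avail_bytes{device="/dev/sda1",mountpoint="/"} 1.5e+10
"""

# metric_name, optional {labels}, then a single value.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
# key="value" pairs inside the braces; allows escaped characters.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # simplified: lines with timestamps are not handled
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples
```

Calling `parse_exposition(SAMPLE_SCRAPE)` yields one tuple per sample line, with the label pairs collected into a dictionary.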