Infrastructure v2 #53

Closed
opened 2024-09-11 13:25:41 +02:00 by lukas · 4 comments
Owner

This issue tracks a breaking change of libre.moe's server infrastructure. Currently, the servers behind libre.moe are arael and armisael. During the runtime and maintenance of this setup, I came to realize a few things about server administration and am looking to provision a new infrastructure, built for higher stability, better scaling, and better cost efficiency.

New Servers

Sahaquiel

This will be the "core" server, on which most others will depend and which will host only a few services to maintain high stability and performance.

It will provide a central database for all other services, as well as host authentication, a new LDAP user directory, Gitea, and the Vaultwarden and Mumble instances.

Ireul

Ireul will serve most of the other services, including Drone, Wiki, OnlyOffice, Nextcloud, Szuru, and the libre.moe website.

Leliel

As the name implies, Leliel will serve the art/waifu repo along with Seafile. Analogous to the angel Leliel, the server also symbolizes a large, unknown space in which an unlimited amount of data may reside, powered by object storage and serving the hosted services with easy scalability, as those services are likely to grow in terms of data storage.

New Policies and Configurations

Transactional email

Emails sent by all services will be routed through Scaleway's transactional email service, as the local relay on Arael will be removed in favor of a regular email service. Due to the nature of email, I want to offload the sending of non-personal mails to a third party for now, while the full email system is not yet set up.
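
For illustration, pointing a service at an external SMTP relay looks roughly like this in Gitea's app.ini; the Scaleway SMTP host and credentials shown here are assumptions/placeholders, not the actual configuration:

```ini
; Hypothetical mailer excerpt -- host and credentials are placeholders.
[mailer]
ENABLED   = true
PROTOCOL  = smtp+starttls
SMTP_ADDR = smtp.tem.scw.cloud
SMTP_PORT = 587
FROM      = "libre.moe <noreply@libre.moe>"
USER      = <project-id>
PASSWD    = <smtp-secret>
```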

Volumes

All local application data will no longer be stored on the server itself, but on a hosted volume providing triple replication (single-AZ, afaik) for the likes of databases, metadata, or Gitea repositories, protecting against the failure of an entire rack or two, which local storage is susceptible to.
Volumes will be manually formatted with Btrfs, providing active detection of errors and blocking corrupted data from being read.
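
Roughly, provisioning such a volume could look like this (the device path is a placeholder and depends on how the volume is attached):

```sh
# Format the attached volume with Btrfs (checksummed data and metadata by default).
mkfs.btrfs -L appdata /dev/disk/by-id/scsi-0HC_Volume_12345678

# Mount with transparent compression and without access-time updates.
mkdir -p /mnt/appdata
mount -o compress=zstd,noatime /dev/disk/by-id/scsi-0HC_Volume_12345678 /mnt/appdata

# Periodic scrub re-reads every block and reports checksum errors.
btrfs scrub start -B /mnt/appdata
```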

Object Storage

Object Storage provides high availability, strong resilience and cost-effective data storage, which is why it will be used for applications storing larger amounts of data.

Most services on libre.moe store data directly in a mounted file system (e.g. as described above), so Hetzner Object Storage will be mounted via JuiceFS (FUSE) so that it presents as local storage to applications, enabling cost-effective storage with local NVMe caching for frequently used files.
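
A minimal sketch of such a mount, assuming a Redis/Valkey metadata engine (still being evaluated, see the comments below) and placeholder bucket, endpoint, and cache paths:

```sh
# One-time creation of the file system on top of the S3 bucket.
juicefs format \
    --storage s3 \
    --bucket https://<bucket>.fsn1.your-objectstorage.com \
    --access-key "$HETZNER_S3_KEY" \
    --secret-key "$HETZNER_S3_SECRET" \
    redis://sahaquiel:6379/1 libre-data

# Mount as a regular directory, with a local NVMe cache (~100 GiB) for hot files.
juicefs mount -d \
    --cache-dir /var/cache/juicefs \
    --cache-size 102400 \
    redis://sahaquiel:6379/1 /mnt/jfs
```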

For applications that natively support object storage and are used primarily for personal data storage, Backblaze B2 will usually be used instead, as it provides (in my case) large amounts of included data transfer (when accessed directly from outside) at cost-effective pricing. For example, the (hopefully) soon-coming Blobfisch and Ente Photos will use it as primary storage.

Database dumps and server configurations are also mirrored daily and versioned for a few days on Backblaze B2.
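
A rough sketch of such a daily job (remote name, bucket, and paths are placeholders; version retention is assumed to be handled by the B2 bucket's lifecycle rules):

```sh
# Dump all databases, compress, and push alongside the config tree.
pg_dumpall -U postgres | zstd > "/var/backups/db/pg-$(date +%F).sql.zst"
rclone copy /var/backups/db b2:libre-moe-backups/db
rclone copy /etc b2:libre-moe-backups/config
```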

Backups

Log files, as well as user data of a highly temporary nature, may be excluded from all backups.

As all data lives on highly redundant volumes or object stores, backups are largely focused on providing geo-resilience and versioning in case of issues related to natural disasters, malware, or simply human error. Data is always backed up in its purest form, not through layers of abstraction such as overlay file systems; this protects against the failure of such a layered system and also allows the real files to be versioned.

Scaleway's Glacier storage, in an underground fallout shelter 25 m below Paris, will be used to store atomic, monthly copies of small files like database dumps or config files, and to incrementally back up the data of larger folders not suitable for atomic copies, from the likes of Szuru, Gitea, or Seafile. While servers and data stores are set up in Germany and the Netherlands, this architecture also provides multi-regional durability for all data, with the backup being in France.
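
Roughly, the two backup shapes could look like this (bucket, endpoint, and paths are placeholders, and the exact tool for the incremental part is not fixed):

```sh
# Monthly atomic copy of small artifacts into the Glacier storage class.
aws s3 cp /var/backups/db/pg-2024-10-01.sql.zst s3://libre-moe-cold/db/ \
    --storage-class GLACIER \
    --endpoint-url https://s3.fr-par.scw.cloud

# Incremental copy of a larger tree: only new or changed files are uploaded.
rclone copy /srv/gitea/repositories scw-cold:libre-moe-cold/gitea \
    --s3-storage-class GLACIER
```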

This backup is on a Glacier tier, meaning data cannot be read directly; it has to be requested beforehand and can only be accessed once it has been restored from the Glacier, which can take from 12 hours to possibly many days. This is acceptable, as the likelihood of an object store failing to the point where data has been permanently lost is very, very low. Scaleway Glacier is not only very resilient in its redundancy, but also protected from many natural disasters, being 25 m underground in a fallout shelter. Another benefit of this offer is its price: at €2/TB (before tax) it is extremely cost-effective, yet it can replace offerings from other vendors because, in this architecture, there is no need for the data to be available instantly; a few days of recovery time is just fine.
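
For reference, getting data back from the Glacier tier is roughly a two-step process, sketched here with placeholder bucket and key names:

```sh
# 1) Request that the object be restored from Glacier for a limited time.
aws s3api restore-object \
    --bucket libre-moe-cold \
    --key db/pg-2024-10-01.sql.zst \
    --restore-request Days=7 \
    --endpoint-url https://s3.fr-par.scw.cloud

# 2) Once head-object shows the restore as finished, download as usual.
aws s3api head-object --bucket libre-moe-cold --key db/pg-2024-10-01.sql.zst \
    --endpoint-url https://s3.fr-par.scw.cloud
aws s3 cp s3://libre-moe-cold/db/pg-2024-10-01.sql.zst . \
    --endpoint-url https://s3.fr-par.scw.cloud
```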

Additionally, a local HDD will be asynchronously synchronized with the production data. While there is no redundancy in this third copy, two highly redundant storage systems, one of them in a nuclear fallout shelter, should already provide jaw-dropping durability, so a single HDD is fine here.
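
Something like the following, with placeholder host names and paths:

```sh
# Pull the production data onto the local HDD; runs asynchronously from a timer.
rsync -aH --delete --partial sahaquiel:/mnt/appdata/ /mnt/backup-hdd/sahaquiel/
rsync -aH --delete --partial leliel:/mnt/jfs/ /mnt/backup-hdd/leliel/
```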

Too Long; Didn't Read

| Role | Product | Location | Redundancy |
|---------------|------------------------|-------------|------------------------|
| Production FS | Hetzner Volumes | Germany | 3x replication |
| Production S3 | Hetzner Object Storage | Germany | yet unknown redundancy |
| Production S3 | Backblaze B2 | Netherlands | Erasure Coding |
| Backup | Scaleway Glacier | France | Erasure Coding |
| Backup 2 | HDD | Germany | None, except checksums |

Notes for future archival services

In case I ever realize my idea of a YT archival service, it will probably just use Storj DCS, as it is cheap, highly resilient, and globally distributed. While every downloaded byte is billed, the archival nature would keep download demand very low. The goal is to offer a service with dirt-cheap but decently reliable storage, so it will not be integrated into the backup policies outlined here.

Tasks

  • Migrate all services to hosted transactional email
  • Announce large-scale maintenance, take down one service at a time, migrate data and DB, bring it back, repeat until done (per-service migration sketched below)
  • Complete Sahaquiel
  • Complete Ireul
  • Complete Leliel
  • Check that all services and servers are properly covered by backup jobs
  • Shut down old servers, take snapshots for emergencies and then sunset the servers
  • Implement #28
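
For reference, the per-service migration step mentioned above follows roughly this shape, run from the new server (service name, hosts, database names, and paths are placeholders, not the actual setup):

```sh
# Stop the service on the old server, move its database, then its data.
ssh arael systemctl stop gitea
pg_dump -Fc -h arael -U gitea gitea > gitea.dump
pg_restore -h localhost -U gitea -d gitea --no-owner gitea.dump
rsync -aH arael:/var/lib/gitea/ /mnt/appdata/gitea/
systemctl start gitea
```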

lukas added the Breaking, Kind/Enhancement, Domain/libre.moe and Priority/Medium labels 2024-09-11 13:25:41 +02:00
lukas self-assigned this 2024-09-11 13:25:41 +02:00
lukas added this to the Issue Board project 2024-09-11 13:25:41 +02:00
Author
Owner

Sahaquiel

  • Setup server
  • Install Postgres
  • Move Keycloak over
  • Move Vaultwarden over
  • Move Gitea over
  • Move Mumble over
  • Move coturn over
  • Setup Redis (Valkey)
  • Change & verify backups

Ireul

  • Setup server
  • Move website over
  • Move Wiki over
  • Move Szuru over
  • Move Drone over
  • Move OnlyOffice over
  • Move Nextcloud over
  • Change & verify backups

Leliel

  • Setup server
  • Create JuiceFS on Hetzner
  • Copy and verify data on JuiceFS
  • Migrate Seafile to JuiceFS-backend
  • Run seaf-fsck to verify nothing is missing and all files are intact (see the sketch after this list)
  • Terminate StorageBox
  • Move Kirei over
  • Move Seafile over
  • Change & verify backups
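
For reference, the seaf-fsck check mentioned above runs roughly like this (the install path is a placeholder):

```sh
# Run from the Seafile server installation.
cd /opt/seafile/seafile-server-latest
./seaf-fsck.sh                # check every library for missing or corrupt blocks
./seaf-fsck.sh <library-id>   # or re-check a single library
```
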
Author
Owner

Most changes have been completed, and the infrastructure is now running purely on the newly created servers, using block or object storage as backends, with backups on B2 and soon also in Scaleway's fallout shelter.

Missing actions

  • Re-launch Kirei with JuiceFS on Leliel
  • Maybe move Szuru to Leliel (because lel)
  • Research object locking on Scaleway
  • Research Redis-based backends for JuiceFS
Author
Owner
  • Delete old Azure backups, on which the object lock will expire on 30.11.2024 at 01:00.
Author
Owner
  • Fully implement the current backup strategy (https://wiki.libre.moe/en/libre-moe/backup-policy) (includes #59)
lukas closed this issue 2024-12-17 12:18:13 +01:00