
Luminus Disaster Recovery: DRP with AWS Backup
How Cloud Elemental helped a leading energy provider secure its Cloud-based trading platform with a thorough Disaster Recovery Plan, powered by AWS Backup.

The Client
Luminus is the second-largest electricity generator and energy provider in Belgium, managing power plants and wind farms while securing external energy sources to ensure a reliable power supply for customers.
Their energy trading platform, Aligne, is essential for making quick, informed trading decisions. However, managing and scaling the platform in on-premises data centers presented significant limitations.
Because downtime on the platform can lead to financial loss, Luminus needs a thorough disaster recovery plan (DRP) to strengthen security and ensure it is prepared for an incident.
The Challenge
- Aligne only supports single servers in a single Availability Zone
Aligne can only run on a single server in one location. If that server fails, the application goes down, leading to potential financial loss. This made AWS Backup critical for fast server restoration.
- Manual configuration changes in AWS posed a risk
Unintentional or malicious changes in the AWS console could cause downtime. GitLab pipelines and drift detection were introduced to identify and automatically correct any drift from the Terraform-defined state.
- No Disaster Recovery Plan to validate Platinum-level recovery
Aligne is held to the “Platinum” standard at Luminus, which requires every “Platinum” application to meet a Recovery Time Objective (RTO) of less than one hour and a Recovery Point Objective (RPO) of zero.
Aligne lacked a DRP to test whether Platinum-tier requirements could be met. A detailed DRP was developed and iterated upon, with the goal that even a new team member could follow the documentation to recover from a failure.
Our Approach
Iterative Testing Framework
- We adopted a continuous improvement approach to the DRP development process
- Each test was scoped, executed, reviewed, and refined to create the best possible DRP for Luminus's needs
Full Environment Recovery in Isolated Account
- Restored entire environment in a new AWS account
- Identified manual steps that could be automated via CI/CD
- GitLab pipelines successfully recreated resources
- Gaps found in the documentation were corrected
- Found that using a brand-new account was not practical
- Backup copy from central vault to local was slow
AZ-Level Failure Scenario
- Focused test on a single Availability Zone failure
- EC2 HA worked well but included some manual steps
- Multi-AZ RDS and FSx performed as expected
- Initiated further automation for EC2 HA
- Local backups for RDS and FSx highlighted as key test points
- Achieved RTO of 46 minutes and RPO of 0, which was Platinum-compliant
Full Local Backup Recovery
- Simulated complete service failure using only local backups
- EC2 HA automated recovery worked end-to-end
- RDS and FSx local backup restores were successful
- Central backups deprioritised for critical scenarios
- Achieved RTO of 1 hour 5 minutes and RPO of 0, narrowly missing the Platinum RTO target of under one hour
CI/CD, Drift Detection, and Central Backup Recovery
- Validated GitLab pipelines for detecting and correcting drift
- EC2 HA failed with central-to-local backup restores
- Successful recovery for EC2, RDS, FSx from central-to-local backups
- RTO of 2.5 hours, RPO of 30 minutes
- Determined local backup vaults are essential for low RTO/RPO
- Vault lock implementation identified as a priority
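The test results above can be checked against the Platinum thresholds with a few lines of Python. This is a minimal sketch for illustration, not part of the Luminus tooling: the function name `meets_platinum` and the scenario labels are assumptions, while the thresholds (RTO under one hour, RPO of zero) and the measured figures come from the tests described here.

```python
from datetime import timedelta

# Platinum thresholds as stated in the case study:
# RTO of less than one hour, RPO of zero.
PLATINUM_MAX_RTO = timedelta(hours=1)
PLATINUM_RPO = timedelta(0)


def meets_platinum(rto: timedelta, rpo: timedelta) -> bool:
    """Return True when a measured RTO/RPO pair satisfies the Platinum tier."""
    return rto < PLATINUM_MAX_RTO and rpo == PLATINUM_RPO


# Measured results from the three timed test rounds (labels are illustrative).
results = {
    "AZ-level failure": (timedelta(minutes=46), timedelta(0)),
    "Full local backup recovery": (timedelta(hours=1, minutes=5), timedelta(0)),
    "Central backup recovery": (timedelta(hours=2, minutes=30), timedelta(minutes=30)),
}

for scenario, (rto, rpo) in results.items():
    verdict = "meets Platinum" if meets_platinum(rto, rpo) else "outside Platinum"
    print(f"{scenario}: RTO {rto}, RPO {rpo} -> {verdict}")
```

Only the AZ-level failure round passes both checks, which is consistent with the conclusion that local backup vaults are needed for low RTO and RPO.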
Our Solution
AWS Backup was used to manage daily EC2 backups, point-in-time RDS backups, and hourly FSx backups, meeting both RTO and RPO requirements. Migrating RDS backups from the RDS service to AWS Backup also added an extra layer of security. Centralised backups were avoided due to known limitations.
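As a rough illustration of the backup cadence described above, the sketch below builds the kind of plan definition that AWS Backup accepts. It is an assumption-laden example rather than the configuration used at Luminus: the plan names, the vault name `aligne-local-vault`, the 02:00 UTC schedule, and the 35-day retention are all hypothetical. One plan is built per resource type because the rules in a plan apply to every resource assigned to it.

```python
def build_backup_plan(name: str, vault: str, schedule: str,
                      retention_days: int, continuous: bool = False) -> dict:
    """Build a single-rule AWS Backup plan document (no AWS call is made)."""
    return {
        "BackupPlanName": name,
        "Rules": [
            {
                "RuleName": f"{name}-rule",
                "TargetBackupVaultName": vault,
                # AWS cron format: minute hour day-of-month month day-of-week year
                "ScheduleExpression": schedule,
                "Lifecycle": {"DeleteAfterDays": retention_days},
                # Continuous backup is what enables point-in-time restore.
                "EnableContinuousBackup": continuous,
            }
        ],
    }


VAULT = "aligne-local-vault"  # hypothetical vault name

# Daily EC2 backups (02:00 UTC is an assumed time).
ec2_plan = build_backup_plan("aligne-ec2-daily", VAULT, "cron(0 2 * * ? *)", 35)
# Hourly FSx backups.
fsx_plan = build_backup_plan("aligne-fsx-hourly", VAULT, "cron(0 * * * ? *)", 35)
# Point-in-time RDS backups via continuous backup.
rds_plan = build_backup_plan("aligne-rds-pitr", VAULT, "cron(0 2 * * ? *)", 35,
                             continuous=True)

for plan in (ec2_plan, fsx_plan, rds_plan):
    rule = plan["Rules"][0]
    print(plan["BackupPlanName"], rule["ScheduleExpression"], rule["EnableContinuousBackup"])
```

Each dictionary could then be passed to `boto3.client("backup").create_backup_plan(BackupPlan=plan)`. In a Terraform-managed environment like this one, the equivalent definition would more likely live in Terraform code.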
Multi-AZ was enabled for RDS and FSx to protect against AZ-level failures. In parallel, EC2 High Availability was developed and introduced – an automated tool that restores EC2 instances in a new AZ and updates all required configurations as part of the recovery workflow.
Drift detection was built and scheduled using GitLab CI/CD, enabling alerts for any configuration changes made outside of Terraform. This helped teams quickly identify and resolve changes that could cause downtime during disaster recovery tests.
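One common way to implement a scheduled drift check is to run `terraform plan -detailed-exitcode` and act on its exit code: 0 means no changes, 2 means the live infrastructure differs from the Terraform-defined state, and 1 means the plan itself failed. The sketch below shows that idea. It is an assumption about how such a job could work, not the actual Luminus pipeline, and the names `DriftStatus`, `classify_plan_exit_code`, and `check_drift` are hypothetical.

```python
import subprocess
from enum import Enum


class DriftStatus(Enum):
    IN_SYNC = "in_sync"
    DRIFTED = "drifted"
    ERROR = "error"


def classify_plan_exit_code(code: int) -> DriftStatus:
    """Map the exit code of `terraform plan -detailed-exitcode` to a status."""
    if code == 0:
        return DriftStatus.IN_SYNC   # no changes: infrastructure matches the code
    if code == 2:
        return DriftStatus.DRIFTED   # changes present: something differs
    return DriftStatus.ERROR         # 1 (or anything else): the plan failed


def check_drift(workdir: str, run=subprocess.run) -> DriftStatus:
    """Run a plan in `workdir` and report whether drift was found.

    `run` is injectable so the function can be exercised without Terraform.
    """
    result = run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit_code(result.returncode)
```

A scheduled GitLab CI/CD job could call `check_drift(".")` and raise an alert whenever the result is `DriftStatus.DRIFTED`, leaving the decision to re-apply the Terraform state to a separate pipeline step.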
A comprehensive DRP was created, regularly tested, and refined with stakeholder input. It included detailed documentation and iterative improvements, ensuring that any team member could follow it to recover from failure scenarios.
Our Results
- Proven RTO and RPO targets across multiple failure scenarios, validated through iterative testing
- EC2 High Availability restored all EC2 instances in under 30 minutes
- Full infrastructure recovery completed in under 1 hour
- Multi-AZ RDS and FSx, combined with EC2 HA, significantly improved infrastructure resilience
- AWS Backup provided a single, secure backup location with tightly controlled access
- Terraform with CI/CD pipelines enabled secure, versioned infrastructure as code and rapid remediation of misconfigurations
- Our Disaster Recovery Plan (DRP) served as a repeatable framework for validating recovery objectives and became the standard across additional projects
Ready to transform your Cloud infrastructure?
Cloud solutions, simplified.
Let's discuss how we can help you achieve your Cloud goals with our expertise and proven methodology.


Luminus Aligne Migration: Enhancing Performance and Scalability with AWS
How Cloud Elemental helped a leading energy provider optimise its Cloud-based trading platform with scalable, secure, and automated Cloud solutions.

The Client
Luminus is the second-largest electricity generator and energy provider in Belgium, managing power plants and wind farms while securing external energy sources to ensure a reliable power supply for customers.
Their energy trading platform, Aligne, is essential for making quick, informed trading decisions. However, managing and scaling the platform in on-premises data centers presented significant limitations.
To enhance efficiency and scalability, Cloud Elemental led the migration of Aligne to Amazon Web Services (AWS).
The Challenge
Luminus’s Aligne migration project presented several challenges:
- Outdated & Unsupported Application
As a legacy system, Aligne was difficult to maintain and lacked vendor support
- No Automation
Manual, tedious processes slowed down operations and increased inefficiencies
- Limited Monitoring & Alerts
Lack of real-time insights made issue resolution reactive instead of proactive
- Scalability Constraints
On-premises infrastructure couldn’t handle increasing data and demand
Our Approach

Assessment
- Identify challenges in security, scalability, and efficiency
- Evaluate existing infrastructure

Foundations & Security
- Establish connectivity and account structure
- Implement security and compliance safeguards

Migration & Optimisation
- Move applications and data securely
- Replatform for improved performance

Scaling & Automation
- Introduce automation and monitoring for efficiency
- Ensure high availability and cost savings
Our Solution
We implemented a highly available, self-healing Cloud architecture using AWS services to ensure the Aligne platform operates with minimal downtime, improved scalability, and robust security.
Here’s how we did it:
To reduce downtime and enhance reliability, we built a bespoke self-healing mechanism that detects, diagnoses, and recovers from service failures without manual intervention.
- Monitors Aligne, which is made up of over 40 Windows services, identifying failures in real time
- Triggers automated recovery workflows, ensuring minimal disruption
- Utilises idempotent scripts and AWS infrastructure, allowing for faster issue resolution
- Reduces response times by pinpointing failures precisely and applying corrective actions instantly
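The value of idempotent recovery scripts can be shown with a small sketch: a healing pass acts only on services that are not running, so running it again straight afterwards changes nothing. This is an illustrative assumption about the general pattern, not the bespoke Cloud Elemental mechanism. The function `heal_services`, the status strings, and the service names are hypothetical, and the status and start functions are in-memory stand-ins for real Windows service calls.

```python
from typing import Callable, Iterable, List


def heal_services(
    services: Iterable[str],
    get_status: Callable[[str], str],
    start_service: Callable[[str], None],
) -> List[str]:
    """Start every service that is not running and return the names acted on.

    The pass is idempotent: healthy services are left untouched, so a second
    run immediately after a successful one performs no actions.
    """
    restarted = []
    for name in services:
        if get_status(name) != "running":
            start_service(name)
            restarted.append(name)
    return restarted


# In-memory stand-in for the real service manager (hypothetical service names).
state = {"svc-a": "running", "svc-b": "stopped", "svc-c": "running"}


def fake_status(name: str) -> str:
    return state[name]


def fake_start(name: str) -> None:
    state[name] = "running"


first = heal_services(state, fake_status, fake_start)
second = heal_services(state, fake_status, fake_start)
print(first)   # services restarted on the first pass
print(second)  # services restarted on the second pass
```

Because the function reports exactly which services it touched, the same return value can feed alerting or a recovery log.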
We implemented Elastic Load Balancing (ELB) to route traffic correctly, monitor service health and allow for servers to be swapped out easily where needed. This improves service availability and speeds up recovery from failure.
- Routes traffic across 6 EC2 servers, enhancing scalability and fault tolerance
- Alarms and automation hooks trigger recovery actions if a service or an EC2 instance fails
- Improves security by encrypting traffic in transit
- Reduces infrastructure complexity, making management and scaling easier
To replace the legacy Windows file server, we introduced Amazon FSx configured for Multi-AZ, ensuring automated failover and backup management for a more resilient storage solution.
- Automated failover, backup, and patching, eliminating manual intervention
- Multi-AZ storage configuration ensures high durability and availability
- Seamlessly integrates with on-premises systems while securing data
- Enhanced security with fine-grained access control via Microsoft Active Directory
To reduce the risk of data loss and downtime, we deployed AWS Backup for automated data protection and implemented automated disaster recovery processes across multiple environments.
- Automated the workflow to restore an EC2 instance from any failure, including a complete AZ failure
- Maintains local & remote backup vaults, ensuring disaster recovery readiness
- Reduces downtime by delivering a fully restored EC2 instance, with all application configuration and Active Directory joining complete, in under 5 minutes
We migrated from on-premises Oracle-on-VM to Amazon RDS Multi-AZ, ensuring automatic failover for continuous operations. We developed RDS automation to minimise DBA overhead and limit access to production data.
- Multi-AZ deployment improves database availability and resilience
- On-boarded the Aligne project as the first consumer of the organisational AWS Backup architecture for immutable backups
- Reduced DBA overhead by developing RDS Automation – a workflow to refresh non-production environments with production data without the need to provide any DBA access to production data
Our Results
By migrating Aligne from on-premises to AWS, Luminus has gained greater scalability, enhanced performance, and improved automation capabilities, ensuring their trading platform is more resilient and future-proof. The platform has experienced a 30% increase in performance, with higher data availability and reduced maintenance overhead.
High Availability & Fault Tolerance
- Multi-AZ architecture ensures continuous operations even during infrastructure failures
- Automated failover mechanisms minimise downtime and service disruptions
Resilience & Security
- AWS-native backup and disaster recovery solutions safeguard against unexpected failures
- Strict access control and encryption protect sensitive data and ensure compliance
Automation & Integrated Monitoring
- Automated self-healing workflows proactively resolve application issues
- Logging, security monitoring, and infrastructure automation reduce operational overhead
Scalability & Elasticity
- Load balancing & compute optimisation ensure smooth operations under varying workloads
With AWS, Luminus now operates Aligne with greater efficiency, reliability, and security, empowering their team to tackle challenges with better insights, automated controls, and future-ready infrastructure.
