Wet Fish: Building a Resilient, Code-Driven Disaster Recovery Foundation on AWS

admin
25th Mar 2026

About Wet Fish

Wet Fish is an Australian eLearning technology company with nearly three decades of experience in digital learning innovation. Its flagship platform, Scodle, supports over 200,000 users across government, corporate, and education sectors, delivering workforce training and certification for organisations including Rio Tinto, Glencore, and health workforce agencies across every Australian state and territory. More recently, Wet Fish has expanded its offering with Brian, a secure AI platform that helps organisations unlock the value of their content library by giving workers instant, accurate answers drawn exclusively from their own approved resources. With more than two million course completions on record and platform uptime exceeding 99.99%, Wet Fish operates infrastructure where availability is not a nice-to-have; it is a contractual and operational necessity.

Business Challenge

As Wet Fish’s platform scaled to serve over 200,000 users across government and enterprise, the team made a deliberate decision to mature their AWS infrastructure to match that growth. Their environment had been built rapidly to support expanding demand, and Wet Fish recognised that formalising it through infrastructure as code would give them greater control, consistency, and the foundation needed for a robust disaster recovery capability.

For a platform underpinning mandatory certification for government health workforces and large corporate clients, availability and recoverability are non-negotiable. Wet Fish set out to implement a Disaster Recovery (DR) strategy that could be formally tested, clearly documented, and reliably executed to give their clients the assurance that the platform could recover quickly and completely from any unplanned disruption.

But before any DR strategy could be designed, the entirety of Wet Fish’s existing AWS environment needed to be understood, documented, and brought under controlled management. There was no shortcut: a DR runbook built on top of an undocumented, manually provisioned environment would be unreliable by design.

Partner Solution

DNX began by conducting a comprehensive audit of Wet Fish’s existing AWS environment, cataloguing every resource across the networking, compute, data, and application layers. Rather than building a parallel DR environment alongside the existing one, DNX took a more durable approach: the entire production infrastructure was imported into Terraform, bringing it under infrastructure as code (IaC) management from the ground up. This covered the full stack, including VPCs, subnets, route tables, internet and NAT gateways, security groups, Amazon RDS instances, DynamoDB tables, ECS clusters and services, Lambda functions, and EC2 instances.
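As an illustration of how an audit inventory can feed a Terraform import, the sketch below renders Terraform `import` blocks (the declarative syntax available from Terraform 1.5) from a list of audited resources. The case study does not describe the tooling DNX actually used, so this is only one possible approach: the `AuditedResource` type, the resource names, and the identifiers are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditedResource:
    """One entry from an audit inventory (all values below are placeholders)."""
    terraform_type: str  # Terraform resource type, e.g. "aws_vpc"
    name: str            # local name chosen for the resource in the Terraform code
    cloud_id: str        # identifier of the existing resource in AWS


def render_import_block(resource: AuditedResource) -> str:
    """Render a single Terraform `import` block for an existing resource."""
    return (
        "import {\n"
        f"  to = {resource.terraform_type}.{resource.name}\n"
        f'  id = "{resource.cloud_id}"\n'
        "}\n"
    )


def render_all(resources: list[AuditedResource]) -> str:
    """Render import blocks in a stable order so the generated file diffs cleanly."""
    ordered = sorted(resources, key=lambda r: (r.terraform_type, r.name))
    return "\n".join(render_import_block(r) for r in ordered)


# Hypothetical inventory entries; real IDs would come from the audit.
inventory = [
    AuditedResource("aws_vpc", "main", "vpc-00000000000000000"),
    AuditedResource("aws_db_instance", "app", "example-db-identifier"),
]

print(render_all(inventory))
```

Generating the blocks from an inventory, rather than writing them by hand, keeps the list of imported resources reviewable in version control alongside the resource definitions they point to.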

With the production environment fully codified, DNX architected and deployed automated CI/CD pipelines built on IaC principles. These pipelines were designed to serve two distinct purposes: controlled incremental updates to the production environment, and full environment replication to a separate AWS region and a dedicated disaster recovery account. Using AWS’s Sydney and Melbourne regions, this architecture provided both geographical redundancy and account-level isolation, ensuring that a failure or compromise in the production account could not propagate to the recovery environment.

DNX then implemented a robust backup architecture using AWS Backup, managed entirely through Terraform. Automated snapshot policies were configured for EC2 instances, RDS databases, and Elastic File System (EFS) volumes, each with defined retention schedules. Critically, all backups were replicated cross-account to the designated DR account, closing the gap between having backups and having backups you can actually rely on in a recovery scenario.
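The shape of such a backup policy can be sketched as plain data. The example below builds a dictionary following the structure AWS Backup uses for a backup plan (a rule with a schedule, a retention lifecycle, and a copy action targeting a vault in another account). It is a sketch under stated assumptions, not Wet Fish's configuration: the plan name, vault names, account ID, region in the ARN, schedule, and retention periods are all invented for illustration, and the case study's actual policies were defined in Terraform rather than Python.

```python
import json


def build_backup_plan(plan_name, source_vault, dr_vault_arn,
                      schedule, retain_days, dr_retain_days):
    """Build an AWS Backup plan document with one rule and a cross-account copy."""
    return {
        "BackupPlanName": plan_name,
        "Rules": [
            {
                "RuleName": f"{plan_name}-daily",
                # Vault in the production account that receives the backup first.
                "TargetBackupVaultName": source_vault,
                # AWS cron syntax: minute hour day-of-month month day-of-week year.
                "ScheduleExpression": schedule,
                "Lifecycle": {"DeleteAfterDays": retain_days},
                # Each recovery point is also copied to the DR account's vault.
                "CopyActions": [
                    {
                        "DestinationBackupVaultArn": dr_vault_arn,
                        "Lifecycle": {"DeleteAfterDays": dr_retain_days},
                    }
                ],
            }
        ],
    }


# Every value below is a hypothetical placeholder.
plan = build_backup_plan(
    plan_name="example-platform",
    source_vault="example-prod-vault",
    dr_vault_arn="arn:aws:backup:ap-southeast-4:111122223333:backup-vault:example-dr-vault",
    schedule="cron(0 17 * * ? *)",  # once a day (assumed)
    retain_days=35,                 # assumed retention in the source vault
    dr_retain_days=35,              # assumed retention in the DR vault
)

print(json.dumps(plan, indent=2))
```

The copy action is what separates a backup that exists from one that survives the loss of the production account: the recovery point in the DR vault has its own lifecycle and does not depend on the source vault remaining intact.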

To validate the solution, DNX conducted a full end-to-end disaster recovery simulation with Wet Fish’s active participation. The exercise was run under strict conditions: the entire infrastructure was provisioned from scratch using only IaC templates and AWS Backup resources, with no reference to or dependency on the production environment. The full application stack was brought to an operational state, all services were verified against a designated test DNS endpoint, and formal recovery time objectives (RTO) and recovery point objectives (RPO) were documented. This exercise gave Wet Fish concrete evidence — not just theoretical assurance — that their DR capability worked as designed.

Under the DNX Operation Centre managed services agreement, this DR validation process is repeated annually, ensuring the runbook remains accurate as the platform evolves and that Wet Fish’s recovery capability stays current with their production environment.

Results and Benefits

The engagement gave Wet Fish something they did not have before: a fully documented, tested, and repeatable path to recovery. For the first time, their entire AWS environment exists as code, meaning any authorised engineer can understand, replicate, or restore the infrastructure from a known state. This alone represents a significant reduction in operational risk.

The DR design targets a demanding Recovery Time Objective (RTO) of 1 hour, reflecting the application’s critical nature and the need for rapid service restoration. Meeting that target requires a highly automated, well-orchestrated recovery process that uses IaC tooling to minimise manual intervention. An RTO this short points to a pilot light or warm standby recovery strategy. The Recovery Point Objective (RPO) is 24 hours, meaning the maximum acceptable data loss is one day of data. That RPO is met by a backup and replication strategy centred on daily backups.
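To make the two objectives concrete, the sketch below checks a recovery exercise against an RTO of 1 hour and an RPO of 24 hours. Only those two targets come from the case study. The function names and every timestamp are invented for illustration and do not represent measured results from the Wet Fish exercise.

```python
from datetime import datetime, timedelta, timezone

RTO = timedelta(hours=1)   # maximum acceptable time to restore service
RPO = timedelta(hours=24)  # maximum acceptable window of lost data


def data_loss_window(failure_time, recovery_points):
    """Time between the newest usable recovery point and the failure."""
    usable = [point for point in recovery_points if point <= failure_time]
    if not usable:
        raise ValueError("no recovery point exists before the failure")
    return failure_time - max(usable)


def meets_objectives(failure_time, restored_time, recovery_points):
    """Compare one recovery (or drill) against the RTO and RPO targets."""
    return {
        "rpo_met": data_loss_window(failure_time, recovery_points) <= RPO,
        "rto_met": (restored_time - failure_time) <= RTO,
    }


# Hypothetical drill: daily backups at 17:00 UTC, failure the next morning.
backups = [
    datetime(2026, 3, 1, 17, 0, tzinfo=timezone.utc),
    datetime(2026, 3, 2, 17, 0, tzinfo=timezone.utc),
]
failure = datetime(2026, 3, 3, 9, 30, tzinfo=timezone.utc)
restored = datetime(2026, 3, 3, 10, 15, tzinfo=timezone.utc)

print(data_loss_window(failure, backups))            # 16:30:00
print(meets_objectives(failure, restored, backups))  # both objectives met
```

With daily backups, the worst case is a failure just before the next backup runs, which puts the data loss window just under 24 hours. A single missed or failed backup would push it past the RPO, which is why backup success needs monitoring as well as scheduling.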

Adopting IaC improved the DR process directly. Automated resource provisioning reduced deployment time, removed the manual steps where errors tend to occur, and simplified the DR execution phase. Recovery is now an automated, repeatable, and verifiable operation managed through version-controlled code, which gives greater consistency and reliability than a traditional multi-step manual runbook.

The annual revalidation cadence, delivered as part of DNX’s ongoing managed services engagement, ensures the DR capability does not drift from the reality of the production environment over time, a common failure mode for DR programs that are tested once and then left untouched.

About the Partner

DNX is an AWS Premier Consulting Partner and systems integrator focused on enabling organisations to modernise their cloud environments and operate them with confidence. DNX works with growth-stage and enterprise businesses across Australia and globally to deliver cloud-native architecture, security and compliance by design, and ongoing managed operations that keep infrastructure secure, scalable, and cost-efficient.

Plan Your Next Move with Confidence

Ready to align your technology with your business growth strategy? Talk to DNX about modernising your platform for scalability, resilience, and faster time-to-market.
