Blue Sky eLearn (BSL), a pioneer in digital learning, needed to migrate its cornerstone PathLMS application from Heroku to a more scalable, performance-oriented, and cost-effective cloud solution. The existing platform on Heroku had limitations that hindered BSL's growth potential and operational agility. To navigate this transition, BSL enlisted Cloud303, an AWS Premier Partner with a track record of cloud infrastructure excellence.
Cloud303 proposed a migration strategy leveraging AWS services, with the innovative Qovery platform at its core, to recreate the ease of Heroku's PaaS experience on a more robust infrastructure. This move was underpinned by Amazon Elastic Kubernetes Service (EKS) to orchestrate containerized workloads and ensure seamless scalability and deployment.
Throughout the migration, Cloud303 maintained a dual focus on operational continuity and infrastructure evolution. By deploying Infrastructure as Code (IaC) using Terraform, Cloud303 created a reproducible and scalable cloud environment. This approach enabled them to mirror the existing proof-of-concept (POC) environment and incrementally build upon it, ensuring each service from the Heroku setup was re-architected to run well on AWS.
The use of Qovery provided an abstracted management layer, simplifying complex cloud operations while allowing BSL to maintain the control and flexibility AWS is known for. Moreover, CircleCI was integrated for CI/CD, enabling automated, consistent deployments, while Datadog agents ensured comprehensive monitoring and analytics.
BSL faced significant challenges with their PathLMS application on Heroku. Despite Heroku's initial ease of use, BSL encountered limitations in scalability and performance. High operational costs further compounded these issues, signaling a need for a more robust and scalable cloud solution.
Cloud303's engagements follow a streamlined five-phase lifecycle: Requirements, Design, Implementation, Testing, and Maintenance. Initially, a comprehensive assessment is conducted through a Well-Architected Review to identify client needs. This is followed by a scoping call to fine-tune the architectural design, upon which a Statement of Work (SoW) is agreed and signed.
The implementation phase kicks in next, closely adhering to the approved designs. Rigorous testing ensures that all components meet the client's specifications and industry standards. Finally, clients have the option to either manage the deployed solutions themselves or to enroll in Cloud303's Managed Services for ongoing maintenance, an option many choose due to their high satisfaction with the services provided.
Kubernetes and EKS
Amazon Elastic Kubernetes Service (EKS) was chosen as the backbone for hosting the containerized workloads of the PathLMS application. EKS provides a managed Kubernetes service that simplifies the process of running Kubernetes on AWS without needing to install or maintain your own Kubernetes control plane. This choice was strategic, as it leverages AWS's scalability and performance while allowing Cloud303 to focus on the application rather than infrastructure management.
Qovery's Role
Qovery, functioning as an infrastructure automation platform, was instrumental in creating and managing the Kubernetes clusters. Qovery's integration with Terraform allows it to invoke Helm charts, which manage the Kubernetes objects. This setup facilitates the continuous deployment and scaling of the PathLMS application, ensuring that the latest versions are always in use and can scale based on demand.
Workload Scheduling
The architecture supports two main workload types: cronjobs and containers.
Cronjob Workloads: These are scheduled to run on the Kubernetes cluster at predetermined intervals, utilizing Qovery's scheduling capabilities.
Container Workloads: These are dynamically scheduled on target worker nodes. Qovery evaluates the nodes based on their current state and their CPU and RAM usage, alongside the resource requests of the applications, and then assigns workloads to nodes accordingly.
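The node evaluation described above can be illustrated with a minimal Python sketch. It is not Qovery's or Kubernetes' actual scheduler: the `Node` and `Workload` types, their field names, and the "most headroom wins" rule are all assumptions chosen for illustration. It only shows the general idea of filtering nodes by state and free CPU/RAM against a workload's resource requests.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical view of a worker node; field names are illustrative.
    name: str
    ready: bool
    cpu_free_millicores: int
    ram_free_mib: int


@dataclass
class Workload:
    # Hypothetical workload with its CPU and RAM resource requests.
    name: str
    cpu_request_millicores: int
    ram_request_mib: int


def pick_node(workload: Workload, nodes: List[Node]) -> Optional[Node]:
    """Return the ready node with the most CPU (then RAM) left after placement.

    Returns None when no ready node can satisfy the resource requests.
    """
    candidates = [
        n for n in nodes
        if n.ready
        and n.cpu_free_millicores >= workload.cpu_request_millicores
        and n.ram_free_mib >= workload.ram_request_mib
    ]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda n: (
            n.cpu_free_millicores - workload.cpu_request_millicores,
            n.ram_free_mib - workload.ram_request_mib,
        ),
    )


nodes = [
    Node("node-a", ready=True, cpu_free_millicores=500, ram_free_mib=1024),
    Node("node-b", ready=True, cpu_free_millicores=1500, ram_free_mib=4096),
    Node("node-c", ready=False, cpu_free_millicores=4000, ram_free_mib=8192),
]
chosen = pick_node(Workload("web", 1000, 2048), nodes)
print(chosen.name if chosen else "unschedulable")
```

In this example `node-a` lacks the requested CPU and `node-c` is not ready, so only `node-b` qualifies. A real scheduler weighs many more factors (affinity, taints, topology), which this sketch deliberately omits.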
Event-Driven Autoscaling
Kubernetes Event-driven Autoscaling (KEDA) is a significant component of this solution. It polls each trigger source at a defined interval and scales pods up or down, ensuring there are enough pods available to process events efficiently.
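As a rough illustration of the scaling decision, the sketch below computes a replica count from a backlog reading. It is a simplification, not KEDA's implementation: the function name, the per-pod target, and the replica bounds are hypothetical values, and the real behavior depends on the configured scaler and the Kubernetes Horizontal Pod Autoscaler.

```python
import math


def desired_replicas(
    pending_events: int,
    target_per_pod: int,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Return how many pods are needed so each handles about target_per_pod events.

    The result is clamped to the [min_replicas, max_replicas] range.
    """
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    wanted = math.ceil(pending_events / target_per_pod)
    return max(min_replicas, min(max_replicas, wanted))


# Simulated readings from a trigger source, one per polling interval.
for pending in [0, 45, 400, 5000]:
    print(pending, desired_replicas(pending, target_per_pod=50,
                                    min_replicas=1, max_replicas=20))
```

With a target of 50 events per pod, a backlog of 400 events asks for 8 pods, while a backlog of 5000 is capped at the assumed maximum of 20.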
Cluster Management
Cluster updates and upgrades are handled by the Qovery engineering team, ensuring that the Kubernetes cluster always runs a stable and secure version.
Infrastructure as Code (IaC)
Terraform is used as the IaC tool to define the desired state of the infrastructure. It provides a clear, declarative codebase for managing the infrastructure, which can be version-controlled and peer-reviewed. Helm acts as the package manager for Kubernetes, and its charts help define, install, and upgrade even the most complex Kubernetes applications.
CI/CD Integration
CircleCI is integrated into the deployment process, automating the building, testing, and deployment phases within the Qovery project. The CI/CD pipeline is responsible for updating services and components, with automated rollback capabilities to revert to the last known good state in case of a critical failure.
Deployment and Rollback Procedures
Infrastructure Coding: Terraform scripts encapsulate the desired state and deployment configurations.
Code Review: All changes go through a rigorous peer-review process to ensure adherence to coding and security standards.
Plan and Apply Stages: Terraform's plan and apply stages provide transparency and control over infrastructure changes.
Immediate and Automated Rollback: If a deployment fails, it is stopped immediately. For critical issues, an automated rollback to the previous stable state is triggered.
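The stop-then-rollback behavior in the last step can be sketched in Python as follows. This is a hypothetical model rather than the actual CircleCI or Qovery pipeline: `Environment`, `DeploymentError`, `deploy`, and the example steps are invented names used only to show the control flow.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class DeploymentError(Exception):
    """Hypothetical failure raised by a deployment step."""

    def __init__(self, message: str, critical: bool = False):
        super().__init__(message)
        self.critical = critical


@dataclass
class Environment:
    stable_version: str   # last known good version
    running_version: str  # version currently being served
    log: List[str] = field(default_factory=list)


def deploy(env: Environment, version: str,
           steps: List[Callable[[Environment], None]]) -> bool:
    """Run steps in order; stop at the first failure, roll back if critical."""
    env.running_version = version
    for step in steps:
        try:
            step(env)
        except DeploymentError as err:
            # Stop immediately: remaining steps are never executed.
            env.log.append(f"stopped: {err}")
            if err.critical:
                env.running_version = env.stable_version
                env.log.append(f"rolled back to {env.stable_version}")
            return False
    # All steps passed, so the new version becomes the stable one.
    env.stable_version = version
    return True


def migrate(env: Environment) -> None:
    env.log.append("migrations applied")


def health_check(env: Environment) -> None:
    raise DeploymentError("health check failed", critical=True)


env = Environment(stable_version="v1", running_version="v1")
ok = deploy(env, "v2", [migrate, health_check])
print(ok, env.running_version, env.log)
```

Here the simulated health check fails critically, so the deployment stops and the environment returns to `v1`. A non-critical failure would stop the deployment without triggering the rollback.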
Version Control
GitHub serves as the Version Control System, storing and managing all Kubernetes configuration files. This centralization ensures traceability and collaboration on infrastructure changes.
The combination of AWS and Qovery enabled the migration from Heroku to be completed in record time, giving users both a UI-based application deployment experience and the power and capacity of AWS.
The implementation of AWS with EKS and the strategic use of Qovery's platform provided Blue Sky eLearn with a robust, scalable, and cost-efficient cloud environment. The integration of modern IaC practices, coupled with automated CI/CD pipelines and version control, has positioned BSL for improved operational efficiency and future scalability. This solution not only addresses the immediate needs of Blue Sky eLearn but also sets a strong foundation for continuous improvement and growth.
Post-migration, BSL is positioned for tangible improvements in the following areas:
Cost Savings: Migrating to AWS often results in a reduction of operational costs. For a company like Blue Sky eLearn, a cost reduction of 20-40% in cloud expenditures within the first year of migration would be realistic, due to more efficient resource utilization and pricing models on AWS.
Performance Improvement: After migration, a performance increase of 25-50% could be seen due to AWS’s optimized compute resources and scalability options. This could translate into faster load times and improved responsiveness of the PathLMS application.
Scalability: Post-migration, Blue Sky eLearn can handle a 70-100% increase in concurrent users, especially during peak demand times, due to the scalability features of AWS and the efficient scheduling of containers and jobs through Qovery.
Operational Efficiency: With the integration of Terraform and the automation of deployments, the time spent on infrastructure management could be reduced by up to 30%, allowing the team to allocate more time to innovation and development.
Deployment Frequency: With the CI/CD pipelines in place, more rapid iteration and quicker time-to-market for new features are enabled.
Incident Response: With Datadog integrated for monitoring, incident response could be up to 50% faster thanks to proactive alerts and detailed log analytics.