Operational Excellence in AWS
Resilient Design
- Provides reliability: assurance that the system is available when you need it.
- Automate recovery, scaling, and backups.
- Implement data recovery, auto scaling, and backups.
- Test recovery and implement automatic recovery whenever possible.
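
As a minimal sketch of the "test recovery and automate it" idea, the snippet below runs a recovery drill against a simulated service. The `Supervisor` class, its `is_healthy` and `restart` callbacks, and the `state` dictionary are all hypothetical names made up for illustration. In a real AWS workload the same role would typically be played by managed health checks and scaling or failover mechanisms rather than hand-written code.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Supervisor:
    """Restarts a service when its health check fails (illustrative only)."""

    is_healthy: Callable[[], bool]  # hypothetical health probe
    restart: Callable[[], None]     # hypothetical recovery action
    max_restarts: int = 3
    events: List[str] = field(default_factory=list)

    def check_once(self) -> bool:
        """Run one health check and attempt recovery if it fails.

        Returns True if the service is healthy when the method returns.
        """
        if self.is_healthy():
            return True
        for attempt in range(1, self.max_restarts + 1):
            self.events.append(f"restart attempt {attempt}")
            self.restart()
            if self.is_healthy():
                self.events.append("recovered")
                return True
        # Automatic recovery failed, so hand over to a human.
        self.events.append("escalate to operator")
        return False


# Recovery drill: start from an injected failure and confirm it is repaired.
state = {"up": False}
supervisor = Supervisor(
    is_healthy=lambda: state["up"],
    restart=lambda: state.update(up=True),
)
recovered = supervisor.check_once()
```

Running the drill regularly, rather than waiting for a real outage, is what gives confidence that the automatic path works.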
Performance Design
- The cloud offers scalability, which can translate into better performance.
- Performance can be improved by deploying into multiple Regions, which places the service closer to end users.
- Serverless architectures can improve performance because each process receives the capacity it requires.
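
To illustrate why a multi-Region deployment helps, here is a small sketch that routes a user to whichever Region currently shows the lowest latency. The function name `nearest_region` is an assumption for this example, and the latency figures are invented placeholders rather than measurements. In practice this decision is usually delegated to a DNS or edge routing service.

```python
from typing import Dict


def nearest_region(latency_ms: Dict[str, float]) -> str:
    """Return the Region with the lowest observed latency for a user."""
    if not latency_ms:
        raise ValueError("at least one Region is required")
    return min(latency_ms, key=latency_ms.get)


# Placeholder latencies for a hypothetical user located in Europe.
observed = {"us-east-1": 95.0, "eu-west-1": 18.0, "ap-southeast-2": 280.0}
chosen = nearest_region(observed)
```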
Security Pillar
- Implement a strong identity foundation: apply the principle of least privilege and enforce separation of duties with appropriate authorization.
- Apply security at all levels: account, VPC, subnet, and application.
- Automate security best practices, for example by enabling CloudTrail so that actions are logged.
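
One way to make least privilege checkable is to lint policy documents for wildcards before they are deployed. The sketch below is an assumption-laden illustration: the helper `overly_broad_statements`, the bucket name `example-bucket`, and the two sample policies are hypothetical. It only looks at `Allow` statements with a literal `*` in `Action` or `Resource`, which is far less than a full policy analyzer would cover.

```python
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """IAM allows a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy: Dict[str, Any]) -> List[str]:
    """Flag Allow statements that use wildcard actions or resources."""
    findings: List[str] = []
    for index, stmt in enumerate(_as_list(policy.get("Statement", []))):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        if any(a == "*" or a.endswith(":*") for a in actions):
            findings.append(f"statement {index}: wildcard action")
        if any(r == "*" for r in resources):
            findings.append(f"statement {index}: wildcard resource")
    return findings


# Hypothetical policy scoped to reading objects from one bucket.
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}

# Hypothetical policy that grants everything on everything.
broad_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
}

scoped_findings = overly_broad_statements(scoped_policy)
broad_findings = overly_broad_statements(broad_policy)
```

A check like this could run in a deployment pipeline, which fits the note's point about automating security best practices.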
3 Phases of Operational Excellence
- Prepare
  - Understand workloads and expected behaviours.
  - Considerations:
    - Operational priorities.
    - Design for operations.
    - Operational readiness.
- Operate
  - Monitor…
    - Environment health.
    - Discover business and technical insights.
  - Respond with…
    - Security.
    - Reliability.
    - Performance.
    - Cost.
  - Monitor…
- Evolve
  - Learn from experience.
  - Shared learning.
  - Improve & scale.