Post-Launch Checklist
Maintenance: Site Reliability and DevOps
Common SRE and DevOps duties after launch:
Performance
- Blocking or limiting unwanted traffic
- Increasing serving capacity
- Performance tuning and optimization
- Managing traffic to and from clusters
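As an illustration of the first item, limiting unwanted traffic is often done with a token bucket. The sketch below is one minimal way to do that, not a prescribed implementation: the class name TokenBucket, its parameters, and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum tokens the bucket can hold
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = capacity      # start full so an initial burst is allowed
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In practice one bucket would typically be kept per client or per API key, and a rejected request would be answered with an HTTP 429 response.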
Troubleshooting
- Rollback of faulty software deployments
- Debugging and troubleshooting issues
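A rollback decision can be made less ad hoc by comparing the new release's error rate against the previous one. The following is a rough sketch under assumed inputs: the function names, the (errors, total) tuples, and the thresholds are hypothetical and would need tuning for a real service.

```python
from typing import Tuple


def error_rate(errors: int, total: int) -> float:
    """Fraction of failed requests; 0.0 when there is no traffic."""
    return errors / total if total else 0.0


def should_roll_back(baseline: Tuple[int, int],
                     candidate: Tuple[int, int],
                     max_ratio: float = 2.0,
                     min_requests: int = 100,
                     floor: float = 0.001) -> bool:
    """Decide whether the candidate release looks worse than the baseline.

    `baseline` and `candidate` are (errors, total_requests) pairs.
    All thresholds are illustrative assumptions, not recommended values.
    """
    cand_errors, cand_total = candidate
    # Too little traffic to judge; keep observing instead of rolling back.
    if cand_total < min_requests:
        return False
    base_rate = error_rate(*baseline)
    cand_rate = error_rate(cand_errors, cand_total)
    # The floor keeps a near-zero baseline from triggering on a single error.
    return cand_rate > max(base_rate, floor) * max_ratio
```

For example, should_roll_back((10, 10000), (50, 1000)) flags the release, because a 5% error rate is far above twice the 0.1% baseline.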
Monitoring, Communications
- Administering and monitoring production jobs
- Using monitoring systems for alerts and visualization
- Documenting system architecture, components, and dependencies
- Coordinating and communicating with other teams and stakeholders
Reliability, Security
- Security and vulnerability management
- Managing and maintaining backups and disaster recovery plans
- Managing and updating dependencies and third-party services
- Implementing and enforcing compliance and regulatory requirements
- Managing and updating documentation for systems and processes