Optimizing CI/CD Pipeline for Student Council Homepage Backend
Overview
This post is about CI/CD optimization for the homepage-backend project.
During service operation, we ran into repeated problems in key delivery steps:
- deployment success verification
- environment-specific image management
- health check timing
- remote command stability
We already had separate pipelines for development and production, but their trigger conditions and verification logic were not fully consistent. As a result, incident analysis and recovery took too long.
I redesigned the existing GitHub Actions pipeline and stabilized its AWS integration across ECR, ECS, and SSM.
Goals
- Standardize deployment automation for dev and prod
- Detect failures early and find root causes faster
- Improve image traceability and reproducibility
- Reduce manual steps and shorten recovery time
What I Changed
1. GitHub Actions workflow improvements
I continuously improved main-cicd.yml and develop-cicd.yml.
- adjusted branch triggers and execution timing
- added concurrency to prevent duplicate deployments
- improved fetch-depth handling for safer tag/commit-based processing
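The concurrency and fetch-depth items above can be sketched as a workflow fragment. This is a minimal illustration, not the project's actual main-cicd.yml: the branch name, the group expression, and the choice of cancel-in-progress are assumptions.

```yaml
# Hypothetical fragment of a deploy workflow; names and values are assumptions.
name: main-cicd
on:
  push:
    branches: [main]

# Runs of the same workflow on the same ref share one group,
# so at most one of them deploys at a time.
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: false  # let an in-flight deployment finish; newer runs wait

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history and tags (the default of 1 fetches a single commit)
```

Setting cancel-in-progress to false favors finishing a deployment that has already started. For a build-only job, true may be the better choice, since it stops outdated runs sooner.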
2. Standardized AWS deployment path
- configured AWS OIDC authentication with configure-aws-credentials
- automated ECR login and Docker Buildx (arm64) build/push
- standardized ECS task definition update and service deployment
- added explicit remote deploy verification with SSM: send-command + wait + get-command-invocation
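The OIDC authentication and the SSM send-command, wait, get-command-invocation sequence could look roughly like the fragment below. It is a sketch under stated assumptions: the role ARN, region, instance ID, and the remote script path /opt/app/deploy.sh are placeholders, not values from the project.

```yaml
# Sketch only: role ARN, region, instance ID, and script path are placeholders.
permissions:
  id-token: write   # lets the job request an OIDC token
  contents: read

jobs:
  remote-deploy:
    runs-on: ubuntu-latest
    env:
      INSTANCE_ID: i-0123456789abcdef0
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/example-deploy-role
          aws-region: us-east-1

      - name: Run remote deploy and verify its result
        run: |
          set -euo pipefail

          # 1) send-command: start the remote script and capture the command ID
          COMMAND_ID=$(aws ssm send-command \
            --instance-ids "$INSTANCE_ID" \
            --document-name "AWS-RunShellScript" \
            --parameters 'commands=["/opt/app/deploy.sh"]' \
            --query "Command.CommandId" --output text)

          # 2) wait: block until the invocation finishes or the waiter gives up.
          #    The waiter exits non-zero on failure, so do not abort here;
          #    the status check below reports the actual outcome.
          aws ssm wait command-executed \
            --command-id "$COMMAND_ID" --instance-id "$INSTANCE_ID" || true

          # 3) get-command-invocation: read the final status
          STATUS=$(aws ssm get-command-invocation \
            --command-id "$COMMAND_ID" --instance-id "$INSTANCE_ID" \
            --query "Status" --output text)
          echo "SSM status: $STATUS"

          if [ "$STATUS" != "Success" ]; then
            # Print the remote stderr so the failure is visible in the job log
            aws ssm get-command-invocation \
              --command-id "$COMMAND_ID" --instance-id "$INSTANCE_ID" \
              --query "StandardErrorContent" --output text
            exit 1
          fi
```

Reading the status explicitly, instead of relying on the waiter's exit code alone, keeps the failure reason in the pipeline log rather than on the instance.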
3. Better health-check and validation strategy
- split health checks into clearer stages
- added retry and wait logic
- verified real service response at /health, not only command completion
- repeatedly debugged escaping/parsing/command composition issues to improve failure visibility
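A retrying check against /health, as described in the list above, might be written as a single workflow step. The URL, attempt count, and sleep interval here are illustrative assumptions, not the values the project uses.

```yaml
# Step to place under jobs.<job_id>.steps; URL and timing values are placeholders.
- name: Wait for /health to respond
  env:
    HEALTH_URL: https://example.com/health
    MAX_ATTEMPTS: "10"
    SLEEP_SECONDS: "6"
  run: |
    set -euo pipefail
    for attempt in $(seq 1 "$MAX_ATTEMPTS"); do
      # -f treats HTTP status >= 400 as failure; --max-time bounds each probe
      if curl -fsS --max-time 5 -o /dev/null "$HEALTH_URL"; then
        echo "healthy after $attempt attempt(s)"
        exit 0
      fi
      echo "attempt $attempt/$MAX_ATTEMPTS failed; retrying in ${SLEEP_SECONDS}s"
      sleep "$SLEEP_SECONDS"
    done
    echo "service did not become healthy" >&2
    exit 1
```

Passing the URL and limits through env keeps them out of the shell text, which avoids some of the quoting and escaping problems that come with composing commands inline.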
4. Image and runtime environment cleanup
- standardized image tags to dev_<short_sha> and prod_<short_sha>
- improved Dockerfile and compose usage for env-based image references
- updated workflow action versions (including ECS actions v2)
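One way to produce prod_<short_sha> tags and carry the same tag into the ECS deployment is sketched below. The repository, container, service, and cluster names, the task-definition.json path, and the 7-character SHA length are all assumptions. The steps also assume that the repository has been checked out and AWS credentials are already configured.

```yaml
# Steps to place under jobs.<job_id>.steps; all resource names are placeholders.
- name: Compute image tag
  id: meta
  run: |
    # 7-character short SHA; the develop workflow would use the dev_ prefix
    echo "tag=prod_${GITHUB_SHA::7}" >> "$GITHUB_OUTPUT"

- id: ecr
  uses: aws-actions/amazon-ecr-login@v2

- uses: docker/setup-qemu-action@v3    # emulation, only needed on non-arm64 runners
- uses: docker/setup-buildx-action@v3

- uses: docker/build-push-action@v6
  with:
    context: .
    platforms: linux/arm64
    push: true
    tags: ${{ steps.ecr.outputs.registry }}/example-repo:${{ steps.meta.outputs.tag }}

- id: render
  uses: aws-actions/amazon-ecs-render-task-definition@v1
  with:
    task-definition: task-definition.json
    container-name: app
    image: ${{ steps.ecr.outputs.registry }}/example-repo:${{ steps.meta.outputs.tag }}

- uses: aws-actions/amazon-ecs-deploy-task-definition@v2
  with:
    task-definition: ${{ steps.render.outputs.task-definition }}
    service: example-service
    cluster: example-cluster
    wait-for-service-stability: true
```

Because the tag is derived from the commit, the image running in a given task definition revision maps back to exactly one commit, which is what makes rollback targets easy to identify.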
Results
Contribution Rate
- CI/CD work period: 2025-04-07 ~ 2026-02-17
- CI/CD scope contributions (.github/workflows, Dockerfile, compose/deploy scripts): 71.1% (123 / 173 commits)
Before
- the pipeline focused more on running steps than on verifying their results
- failure points were harder to detect early
- rollback decisions were slower due to weaker image traceability
After
- the pipeline now verifies each deployment instead of only running it
- deployment failures are detected earlier inside the pipeline
- SHA-based image tagging made release history and rollback decisions easier
- clear dev/prod deployment standards improved team-level deployment trust
The most important result is operational confidence. Now we can ship faster with clearer signals and safer recovery.
This post is licensed under CC BY 4.0 by the author.