# Project Context: PetCLinic Microservices -> AWS ## Context and Scope This project will migrate the Spring PetClinic Microservices demo from its local/on-premise setup to AWS Cloud. The focus is infrastructure modernization, CI/CD automation, observability, and resilience but not application feature development. [**Pdf Download (click me)**](assets/week1/week1.pdf) ### Stakeholders | Role | Responsibility | |----|----| | Project Sponsor | Funding, final approval | | Project Manager | Scheduling, stakeholder coordination | | Cloud Architect | Architecture, service selection | | Dev Lead | App changes for cloud readiness | | DevOps Engineer | CI/CD, IaC, deployments, monitoring | | Security Team | IAM, encryption | | End Users / Demo Audience | Acceptance and usability feedback ### Expectations - No app feature development unless necessary for cloud deployment. - AWS is the only target cloud ### Objectives - Run full PetClinic microservices on AWS with CI/CD. - Observability: logs, metrics, traces for 100% of services. - Cost target: keep monthly infra cost under a defined limit . - Security: secrets encrypted, least-privilege IAM, HTTPS for all endpoints. ### Deadlines | Milestone | Date | |----|----| | Project approval | Oct 27, 2025 | | CI/CD & Automation | Nov 3, 2025 | | Infrastructure | Nov 10, 2025 | | Data | Nov 17, 2025 | | Observability | Nov 24, 2025 | | Prep: Presentation, Demo, and Pre-defense | Dec 3, 2025 | ## In Scope | Included items | Objective | |----|----| | Application | Only necessary changes (if applicable) to facilitate cloud integration | | Infrastructure | Design and deploy a reproducible, cloud-native architecture | | CI/CD Automation | Implement automated build, test, and deployment pipelines | | Containerization | Adapt existing microservices to use AWS.| | Monitoring & Logging | Centralized logs, metrics, and traces | | Security & IAM | Least-privilege IAM roles, encryption, and subnet segmentation. | | Backup & Recovery | Redundancy, failover, backup, BCP/DRP | | Documentation | Architecture diagrams, specifications, and operational runbooks. | ## Out of Scope | Excluded items | Reason | |----|----| | Application feature or UI changes | Functionality remains unchanged. | | Multi-cloud or hybrid deployment | Focus solely on AWS environment. | | Cost-optimization | Addressed in a later project if necessary | ## Requirements ### Functional requirements | Stakeholder / Role | Requirement | Description | |----|----|----| | **Developers** | Continuous Integration | Each merge must trigger automated build, test, and image creation. | | | Local to Cloud Parity | Development environment must mirror AWS setup. | | **DevOps Engineers** | Automated Deployment | CI/CD pipeline must deploy microservices to Staging and Prod environments automatically. | | | Test Automation | Integration tests must run automatically in CI/CD pipeline. | || Infrastructure as Code | All AWS resources defined through configuration files | | | Monitoring & Alerts | Centralized logging, metrics, and tracing for all microservices. Automated alerting for service downtime or threshold breaches. | | | Scalability | Services must be scalable | | **Security Team** | Access Control | Roles per service with least-privilege permissions.| | | Secrets Management | All secrets stored securely. | | **Product / Management** | Availability & Demo Readiness | System must be reliable and presentable for client or internal demos.| | **End Users (Demo Audience)** | Stable Access | Web UI and APIs must remain responsive under typical load. | ### Non-Functional Requirements | Category | Requirement | Standard | |----|----|----| | **Development** | Local to Cloud parity | Docker Compose or local ECS simulation | | **Maintainability** | IaC | Code stored in version control | | **Performance** | Pipeline execution time | < 10 minutes per merge | | | Scaling | Services can be scaled horizontally | | | API / UI response | < 200 ms under normal demo load | | **Reliability** | Deployment success rate | ≥ 99% successful deployments | | | Alert response | Alerts trigger within < 5 minutes of failure detection | | | Error tolerance | < 0.1% failed requests | | **Availability** | System uptime | ≥ 99.9% uptime | | **Observability** | Logs, Metrics, Traces | Centralized in monitoring solution | | **Security** | Least-privileged Roles | Roles restricted per service; no default full-access policies | | | Secret encryption | Secrets stored in AWS | | **Cost** | Budget target | Monthly AWS cost ≤ defined cap | ## System Components — Spring PetClinic Microservices | Component | Role / Function | Dependencies | |----|----|----| | `spring-petclinic-admin-server` | Provides admin UI and dashboards | Microservices, Config Server | | `spring-petclinic-api-gateway` | Routes external requests to microservices | Customers, Vets, Visits, GenAI services | | `spring-petclinic-config-server` | Centralized configuration | Git repo | | `spring-petclinic-customers-service` | Manages customer data | RDBMS, Config Server | | `spring-petclinic-vets-service` | Manages veterinary staff | RDBMS, Config Server | | `spring-petclinic-visits-service` | Manages pet visit records | RDBMS, Customers Service | | `spring-petclinic-genai-service` | Optional AI chat-bot | Microservices, RDBMS | | `spring-petclinic-discovery-server` | Service registry / discovery | All microservices | | RDBMS | Persistent storage | Customers, Vets, Visits | ## Architecture and Specifications ### Project - Kanban as agile methodology - Breakdown of work and phases: - Infrastructure Setup - Service Orchestration - Configuration Management - CI/CD Automation - Security - Resilience - Observability #### Assignments: | Role | Responsibilities | |----|----| | **Cloud Architect** | Design AWS target architecture, network, and IAM structure | | **DevOps Engineer** | Build CI/CD pipelines, container orchestration, monitoring setup | | **Dev Lead** | Containerize services, modify configs for cloud compatibility | | **Database Engineer** | Migrate data from local RDBMS to AWS RDS, manage schema updates | | **Security Team** | Set up access and roles for services | | **Everyone** | Validate deployments, pipeline runs, rollback testing | | **Project Lead** | Manage Asana Kanban board, ensure alignment and progress tracking | ### Source Code - Architecture Type: Microservices deployed via containers, managed by ECS, behind an AWS Application Load Balancer (ALB). - Review via pull request process: - All commits merged via PRs. - Peer review required before merging. - Vaildation: Run tests during pipeline build ## CI/CD - Goal: Automate the entire software delivery process for all PetClinic microservices via Jenkins ### Development Cycle Stages | Stage | Description | |----|----| | **01. Code & Merge** | Developer writes code and merges it with the staging branch | | **02. Build** | Compile, resolve dependencies | | **03. Unit Test** | Run service-level tests | | **04. Containerize** | Build Docker image for service | | **05. Security Scan** | Security validation via Trivy | | **06. Push to Registry** | Push validated image | | **07. Deploy to Staging** | Deploy for validation | | **08. Integration tests** | Validate service communication| | **09. Deploy to Production** | Promote validated build | | **11. Observability Check** | Validate monitoring and alerts | ### Jobs and environments - Each microservices has his own Jenkins pipeline per environment. | Environment | Purpose | Infrastructure | |----|----|----| | **Development (Local)** | Local testing, feature validation | Docker Compose | | **Staging (AWS)** | Integration and pre-prod testing | ECS/EKS (staging cluster), RDS (test DB) | | **Production (AWS)** | Live system | ECS/EKS (prod cluster), RDS (prod DB) | ## Storage | **Type** | **Service** | **Use / Description** | **IOPS / Performance** | **Volume / Size** | **Backup Strategy** | |-----|-----|-----|-----|-----|-----| | **1. Database (RDBMS)** | Amazon RDS (MySQL) | Structured data for each microservice schema | 3,000–6,000 (gp3 default) or provisioned as needed | 20 GB per schema | Automated daily snapshots (14-day retention) | | **2. Block Storage** | Amazon EBS (gp3) | EC2-hosted Jenkins & ECS servers| 3,000 baseline | / | Not necessary | | **3. Object Storage** | Amazon S3 | Logs, backups, images | Standard or Infrequent Access tiers | / | Cross-region replication or versioning enabled | ## Data ### 1. Location - Eu-central-1 region - Place database (RDS) and services in the same region and AZs. ### 2. Replication / Distribution | Data Type | Replication / Distribution Strategy | |----|----| | **RDS (Postgres/MySQL)** | Multi-AZ synchronous replication | | **S3 (images, artifacts)** | Automatic cross-AZ durability | ### 3. Links / Access | Access type | Route| |----|----| | **Internal** | Microservices access RDS via private VPC links. | | | Images in S3 accessed via IAM roles or pre-signed URLs. | | **External** | ALB routes external requests. | | | HTTPS enforced for secure data transfer. | ## Network ### Location - Eu-central-1 region - Deploy services across multiple AZs for high availability. - All microservices, databases, and supporting infrastructure live inside a single VPC. - Isolate the network from public internet by default ### Network Segmentation & Filtering - Public subnets: ALB, NAT gateway. - Private subnets: ECS, RDS. - Security groups: Service-specific firewall rules - Tweak default ACLs if necessary ### Addressing - VPC: `10.0.0.0/16` - Public subnet: `10.0.1.0/24` - Private subnet: `10.0.2.0/24` - Every service/database gets a internal IP in the private subnet - Only load balancer or NAT gateway have public IP ## Compute ### Nodes | Environment | Nodes | Notes | |----|----|----| | **Staging** | 3 ECS container instances (EC2) | Handles staging microservices, mirrors production setup | | **Production / Live** | 3 ECS container instances (EC2) | Fixed-size cluster, no autoscaling to reduce costs | | **Scalability** | N/A for autoscaling | Fixed node count to reduce cost but still allow horizontal scaling via ECS task count or manual node addition. | ### Container Management #### Container Registry: - Amazon ECR for all microservice Docker images. - Each microservice image tagged by Git commit SHA. #### Deployment Strategy: - ECS tasks run one or more containers per node. - Service definitions ensure each microservice has the desired number of tasks. - Jenkins updates ECS service definition after build. ### ECS Orchestration #### Cluster Setup: - One ECS cluster per environment (staging and production). - EC2 launch type for fixed nodes. #### Service Definitions: - Each microservice has an ECS service with a desired task count. - Service linked to ALB . ## Security | **Area** | **Implementation / Notes** | |----|----| | **1. Authentication, Authorization, Auditing (AAA)** | Spring Security | | | IAM roles restrict AWS access per service | | | Auditing: Not relevant since we don't handle sensitive data| | | CloudWatch for app/service logs | | **2. Code Security** | Static analysis via SonarQube| | | No hardcoded credentials | | | Secrets in AWS Secrets Manager | | | Dependency scanning via Dependabot | | **3. Traffic Security** | HTTPS enforced via ALB | | | Internal TLS optional for microservices | | | Security groups restrict inbound/outbound ports | | | Private subnets for internal services and databases | | **4. Instance / Container Security** | Use minimal and updated AMIs | | | Regular patching, no direct SSH (bastion-only) | | | Containers run as non-root users | | | Vulnerability scanning before deploy | | | Secrets passed via IAM roles or ECS environment vars | ## Observability | **Aspect** | **Tools** | **Notes** | |----|----|----| | **Metrics** | **Prometheus** | Collect CPU, memory, and ECS task metrics from node exporters | | | | If microservices expose `prometheus-metrics`, integrate directly. | | | **Grafana** | Dashboards for system and service health | | **Logs** | **AWS CloudWatch Logs** | ECS task logs streamed to CloudWatch| | | |Structured JSON logging for easy filtering and search.| | **Traces** | **AWS X-Ray** | Trace API calls across microservices. | | **Alerts** | **CloudWatch Alarms** | CloudWatch for infrastructure-level alerts (CPU, memory, ECS health) | | **Grafana Alerts** | Grafana alert rules for application metrics from Prometheus. | | | | Alerts via email or Slack webhook.| ## Continuity & Recovery | **Aspect** | **Approach / Tooling** | **Notes** | |----|----|----| | **Redundancy** | Multi-AZ deployment | RDS and ECS nodes deployed across multiple Availability Zones for high availability.| | | | Load balancer automatically routes traffic to healthy tasks. | | **Failover** | AWS-managed failover | RDS Multi-AZ provides automatic database failover. | | |ECS services automatically restart failed tasks on healthy nodes.| | | | Manual intervention only needed for regional failures. | | **Backup** | AWS Backup / RDS Snapshots| Automated RDS daily backups with retention policy. | | S3 Versioning | S3 bucket versioning for uploaded images and configs.| | **Business Continuity Plan** | Operate from secondary region if needed | Documented procedure to restore environment in another AWS region using IaC templates (Terraform). | | | | Prioritize restoring RDS, Config Server, and API Gateway. | | **Disaster Recovery Plan** | Cold standby in alternate region | No live duplication to save cost.| | | | Periodic replication of backups and images to secondary region. | ## Architecture Diagram