2025-10-28 08:50:13 +01:00


Project Context: PetClinic Microservices -> AWS

Context and Scope

This project will migrate the Spring PetClinic Microservices demo from its local/on-premise setup to the AWS Cloud. The focus is on infrastructure modernization, CI/CD automation, observability, and resilience, not on application feature development.


Stakeholders

| Role | Responsibility |
| --- | --- |
| Project Sponsor | Funding, final approval |
| Project Manager | Scheduling, stakeholder coordination |
| Cloud Architect | Architecture, service selection |
| Dev Lead | App changes for cloud readiness |
| DevOps Engineer | CI/CD, IaC, deployments, monitoring |
| Security Team | IAM, encryption |
| End Users / Demo Audience | Acceptance and usability feedback |

Expectations

  • No app feature development unless necessary for cloud deployment.
  • AWS is the only target cloud.

Objectives

  • Run full PetClinic microservices on AWS with CI/CD.
  • Observability: logs, metrics, traces for 100% of services.
  • Cost target: keep monthly infra cost under a defined limit.
  • Security: secrets encrypted, least-privilege IAM, HTTPS for all endpoints.

Deadlines

| Milestone | Date |
| --- | --- |
| Project approval | Oct 27, 2025 |
| CI/CD & Automation | Nov 3, 2025 |
| Infrastructure | Nov 10, 2025 |
| Data | Nov 17, 2025 |
| Observability | Nov 24, 2025 |
| Prep: Presentation, Demo, and Pre-defense | Dec 3, 2025 |

In Scope

| Included items | Objective |
| --- | --- |
| Application | Only necessary changes (if applicable) to facilitate cloud integration |
| Infrastructure | Design and deploy a reproducible, cloud-native architecture |
| CI/CD Automation | Implement automated build, test, and deployment pipelines |
| Containerization | Adapt the existing microservice containers to run on AWS (ECS, ECR) |
| Monitoring & Logging | Centralized logs, metrics, and traces |
| Security & IAM | Least-privilege IAM roles, encryption, and subnet segmentation |
| Backup & Recovery | Redundancy, failover, backup, BCP/DRP |
| Documentation | Architecture diagrams, specifications, and operational runbooks |

Out of Scope

| Excluded items | Reason |
| --- | --- |
| Application feature or UI changes | Functionality remains unchanged. |
| Multi-cloud or hybrid deployment | Focus solely on AWS environment. |
| Cost-optimization | Addressed in a later project if necessary. |

Requirements

Functional requirements

| Stakeholder / Role | Requirement | Description |
| --- | --- | --- |
| Developers | Continuous Integration | Each merge must trigger automated build, test, and image creation. |
| | Local to Cloud Parity | Development environment must mirror the AWS setup. |
| DevOps Engineers | Automated Deployment | CI/CD pipeline must deploy microservices to Staging and Prod environments automatically. |
| | Test Automation | Integration tests must run automatically in the CI/CD pipeline. |
| | Infrastructure as Code | All AWS resources are defined through configuration files. |
| | Monitoring & Alerts | Centralized logging, metrics, and tracing for all microservices. Automated alerting for service downtime or threshold breaches. |
| | Scalability | Services must be horizontally scalable. |
| Security Team | Access Control | Roles per service with least-privilege permissions. |
| | Secrets Management | All secrets stored securely. |
| Product / Management | Availability & Demo Readiness | System must be reliable and presentable for client or internal demos. |
| End Users (Demo Audience) | Stable Access | Web UI and APIs must remain responsive under typical load. |

Non-Functional Requirements

| Category | Requirement | Standard |
| --- | --- | --- |
| Development | Local to Cloud parity | Docker Compose or local ECS simulation |
| Maintainability | IaC | Code stored in version control |
| Performance | Pipeline execution time | < 10 minutes per merge |
| | Scaling | Services can be scaled horizontally |
| | API / UI response | < 200 ms under normal demo load |
| Reliability | Deployment success rate | ≥ 99% successful deployments |
| | Alert response | Alerts trigger within 5 minutes of failure detection |
| | Error tolerance | < 0.1% failed requests |
| Availability | System uptime | ≥ 99.9% uptime |
| Observability | Logs, Metrics, Traces | Centralized in monitoring solution |
| Security | Least-privileged Roles | Roles restricted per service; no default full-access policies |
| | Secret encryption | Secrets stored encrypted in AWS Secrets Manager |
| Cost | Budget target | Monthly AWS cost ≤ defined cap |
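
To make the availability and error-tolerance targets above easier to reason about, the following sketch converts them into concrete budgets. It is an illustration only: the 30-day month and the function names are assumptions, not part of the project.

```python
from datetime import timedelta


def downtime_budget(availability_pct: float, period: timedelta) -> timedelta:
    """Maximum downtime allowed within `period` at the given availability target."""
    return period * (1 - availability_pct / 100)


def within_error_tolerance(failed: int, total: int, max_failure_pct: float = 0.1) -> bool:
    """True if the share of failed requests is strictly below max_failure_pct percent."""
    if total == 0:
        return True  # no traffic, nothing violated
    return failed * 100 < max_failure_pct * total


if __name__ == "__main__":
    # Assumption: a 30-day month as the measurement window.
    print(downtime_budget(99.9, timedelta(days=30)))
    print(within_error_tolerance(failed=5, total=10_000))
```

At 99.9% uptime, a 30-day window leaves roughly 43 minutes of allowed downtime, which is worth keeping in mind given that production runs on a fixed-size cluster.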

System Components — Spring PetClinic Microservices

| Component | Role / Function | Dependencies |
| --- | --- | --- |
| spring-petclinic-admin-server | Provides admin UI and dashboards | Microservices, Config Server |
| spring-petclinic-api-gateway | Routes external requests to microservices | Customers, Vets, Visits, GenAI services |
| spring-petclinic-config-server | Centralized configuration | Git repo |
| spring-petclinic-customers-service | Manages customer data | RDBMS, Config Server |
| spring-petclinic-vets-service | Manages veterinary staff | RDBMS, Config Server |
| spring-petclinic-visits-service | Manages pet visit records | RDBMS, Customers Service |
| spring-petclinic-genai-service | Optional AI chat-bot | Microservices, RDBMS |
| spring-petclinic-discovery-server | Service registry / discovery | All microservices |
| RDBMS | Persistent storage | Customers, Vets, Visits |

Architecture and Specifications

Project

  • Kanban as agile methodology
  • Breakdown of work and phases:
    • Infrastructure Setup
    • Service Orchestration
    • Configuration Management
    • CI/CD Automation
    • Security
    • Resilience
    • Observability

Assignments:

| Role | Responsibilities |
| --- | --- |
| Cloud Architect | Design AWS target architecture, network, and IAM structure |
| DevOps Engineer | Build CI/CD pipelines, container orchestration, monitoring setup |
| Dev Lead | Containerize services, modify configs for cloud compatibility |
| Database Engineer | Migrate data from local RDBMS to AWS RDS, manage schema updates |
| Security Team | Set up access and roles for services |
| Everyone | Validate deployments, pipeline runs, rollback testing |
| Project Lead | Manage Asana Kanban board, ensure alignment and progress tracking |

Source Code

  • Architecture Type: Microservices deployed via containers, managed by ECS, behind an AWS Application Load Balancer (ALB).
  • Review via pull request process:
    • All commits merged via PRs.
    • Peer review required before merging.
  • Validation: run tests during the pipeline build.

CI/CD

  • Goal: Automate the entire software delivery process for all PetClinic microservices via Jenkins

Development Cycle Stages

| Stage | Description |
| --- | --- |
| 01. Code & Merge | Developer writes code and merges it into the staging branch |
| 02. Build | Compile, resolve dependencies |
| 03. Unit Test | Run service-level tests |
| 04. Containerize | Build Docker image for service |
| 05. Security Scan | Security validation via Trivy |
| 06. Push to Registry | Push validated image |
| 07. Deploy to Staging | Deploy for validation |
| 08. Integration tests | Validate service communication |
| 09. Deploy to Production | Promote validated build |
| 10. Observability Check | Validate monitoring and alerts |
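
The ordering of the stages matters: a failed stage should stop the run so that, for example, an image that fails the security scan is never pushed or deployed. The sketch below illustrates that fail-fast behaviour in plain Python. It is not the project's Jenkins configuration; `run_pipeline` and `demo_stages` are hypothetical helpers, and the "Code & Merge" stage is left out because it is the trigger rather than an automated step.

```python
from typing import Callable, List, Optional, Tuple

# Automated stages from the table above, in execution order.
STAGE_NAMES = [
    "Build",
    "Unit Test",
    "Containerize",
    "Security Scan",
    "Push to Registry",
    "Deploy to Staging",
    "Integration tests",
    "Deploy to Production",
    "Observability Check",
]

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, action in stages:
        if not action():
            break  # fail fast: later stages never run
        completed.append(name)
    return completed


def demo_stages(failing_stage: Optional[str] = None) -> List[Stage]:
    """Hypothetical stand-ins for real stage actions; one stage can be made to fail."""
    return [(name, (lambda n=name: n != failing_stage)) for name in STAGE_NAMES]


if __name__ == "__main__":
    # A failed scan stops the run before "Push to Registry".
    print(run_pipeline(demo_stages("Security Scan")))
```

In Jenkins the same behaviour comes from sequential pipeline stages, where a failing stage normally aborts the remaining ones.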

Jobs and environments

  • Each microservice has its own Jenkins pipeline per environment.

| Environment | Purpose | Infrastructure |
| --- | --- | --- |
| Development (Local) | Local testing, feature validation | Docker Compose |
| Staging (AWS) | Integration and pre-prod testing | ECS (staging cluster), RDS (test DB) |
| Production (AWS) | Live system | ECS (prod cluster), RDS (prod DB) |

Storage

| Type | Service | Use / Description | IOPS / Performance | Volume / Size | Backup Strategy |
| --- | --- | --- | --- | --- | --- |
| 1. Database (RDBMS) | Amazon RDS (MySQL) | Structured data for each microservice schema | 3,000-6,000 (gp3 default) or provisioned as needed | 20 GB per schema | Automated daily snapshots (14-day retention) |
| 2. Block Storage | Amazon EBS (gp3) | EC2-hosted Jenkins & ECS servers | 3,000 baseline | n/a | Not necessary |
| 3. Object Storage | Amazon S3 | Logs, backups, images | Standard or Infrequent Access tiers | n/a | Cross-region replication or versioning enabled |

Data

1. Location

  • eu-central-1 region
  • Place database (RDS) and services in the same region and AZs.

2. Replication / Distribution

| Data Type | Replication / Distribution Strategy |
| --- | --- |
| RDS (MySQL) | Multi-AZ synchronous replication |
| S3 (images, artifacts) | Automatic cross-AZ durability |

| Access type | Route |
| --- | --- |
| Internal | Microservices access RDS over private connections inside the VPC. |
| | Images in S3 accessed via IAM roles or pre-signed URLs. |
| External | ALB routes external requests. |
| | HTTPS enforced for secure data transfer. |

Network

Location

  • eu-central-1 region
  • Deploy services across multiple AZs for high availability.
  • All microservices, databases, and supporting infrastructure live inside a single VPC.
  • Isolate the network from the public internet by default.

Network Segmentation & Filtering

  • Public subnets: ALB, NAT gateway.
  • Private subnets: ECS, RDS.
  • Security groups: service-specific firewall rules.
  • Adjust the default network ACLs only if necessary.

Addressing

  • VPC: 10.0.0.0/16
  • Public subnet: 10.0.1.0/24
  • Private subnet: 10.0.2.0/24
  • Every service/database gets an internal IP in the private subnet.
  • Only the load balancer and the NAT gateway have public IPs.
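
A subnet always lives in a single Availability Zone, so the multi-AZ goal stated under Location implies at least one public/private subnet pair per AZ rather than a single pair. The sketch below shows one way the address plan could be extended while keeping 10.0.1.0/24 and 10.0.2.0/24 for the first AZ. The AZ names and the pairing scheme are assumptions for illustration, not decisions taken in this document.

```python
import ipaddress

VPC = ipaddress.ip_network("10.0.0.0/16")

# AWS reserves 5 addresses in every subnet (first four and the last one).
AWS_RESERVED_PER_SUBNET = 5


def plan_subnets(vpc, zones, prefix=24):
    """Assign one public and one private subnet per AZ from consecutive blocks.

    10.0.0.0/24 is skipped so numbering starts at 10.0.1.0/24, as in the plan above.
    """
    blocks = list(vpc.subnets(new_prefix=prefix))[1:]
    plan = {}
    for i, az in enumerate(zones):
        plan[az] = {"public": blocks[2 * i], "private": blocks[2 * i + 1]}
    return plan


def usable_hosts(subnet):
    """Number of addresses available for services in an AWS subnet."""
    return subnet.num_addresses - AWS_RESERVED_PER_SUBNET


if __name__ == "__main__":
    # Assumption: two AZs of eu-central-1 are used.
    for az, nets in plan_subnets(VPC, ["eu-central-1a", "eu-central-1b"]).items():
        print(az, nets["public"], nets["private"], usable_hosts(nets["private"]))
```

Each /24 leaves 251 usable addresses, which is comfortably more than the three nodes and handful of tasks planned per environment.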

Compute

Nodes

| Environment | Nodes | Notes |
| --- | --- | --- |
| Staging | 3 ECS container instances (EC2) | Handles staging microservices, mirrors production setup |
| Production / Live | 3 ECS container instances (EC2) | Fixed-size cluster, no autoscaling to reduce costs |
| Scalability | N/A for autoscaling | Fixed node count to reduce cost, but still allows horizontal scaling via ECS task count or manual node addition. |

Container Management

Container Registry:

  • Amazon ECR for all microservice Docker images.
  • Each microservice image tagged by Git commit SHA.
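
Tagging by commit SHA means every deployed image can be traced back to exactly one commit. As a rough sketch, a pipeline helper could assemble the full ECR image reference as follows. The account ID and repository name are placeholders, and `ecr_image_uri` is a hypothetical helper, not an existing project script.

```python
import re


def ecr_image_uri(account_id: str, region: str, repository: str, commit_sha: str) -> str:
    """Build an ECR image reference tagged with a Git commit SHA.

    Rejects mutable tags such as "latest" by requiring a short or full hex SHA.
    """
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit_sha):
        raise ValueError(f"not a Git commit SHA: {commit_sha!r}")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{commit_sha}"


if __name__ == "__main__":
    # Placeholder account ID and an assumed repository name.
    print(ecr_image_uri(
        "123456789012",
        "eu-central-1",
        "spring-petclinic-customers-service",
        "3f2a9c1",
    ))
```

Refusing non-SHA tags keeps the "one tag, one commit" property, which also makes rollbacks a matter of redeploying an earlier tag.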

Deployment Strategy:

  • Each ECS task runs one or more containers, and each node can host several tasks.
  • Service definitions ensure each microservice has the desired number of tasks.
  • Jenkins updates the ECS service after each successful build.

ECS Orchestration

Cluster Setup:

  • One ECS cluster per environment (staging and production).
  • EC2 launch type for fixed nodes.

Service Definitions:

  • Each microservice has an ECS service with a desired task count.
  • Each service is linked to the ALB.
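
To give a feel for what Jenkins would hand to ECS for each microservice, here is a minimal sketch that assembles a task definition as JSON. The field names follow the ECS task definition format; everything else (port, memory size, log group name, ARNs, the environment variable name, the non-root user ID) is an assumed placeholder rather than a value defined by this project.

```python
import json


def task_definition(service, image_uri, container_port, memory_mib,
                    execution_role_arn, db_secret_arn, region="eu-central-1"):
    """Assemble a single-container ECS task definition for the EC2 launch type."""
    return {
        "family": service,
        "requiresCompatibilities": ["EC2"],
        "networkMode": "awsvpc",  # each task gets its own private IP in the VPC
        "executionRoleArn": execution_role_arn,  # needed to pull secrets and images
        "containerDefinitions": [
            {
                "name": service,
                "image": image_uri,
                "essential": True,
                "memory": memory_mib,
                "user": "1000",  # assumed non-root UID
                "portMappings": [{"containerPort": container_port, "protocol": "tcp"}],
                # Secret value is injected at start-up instead of being hardcoded.
                "secrets": [
                    {"name": "SPRING_DATASOURCE_PASSWORD", "valueFrom": db_secret_arn}
                ],
                "logConfiguration": {
                    "logDriver": "awslogs",
                    "options": {
                        "awslogs-group": f"/ecs/{service}",  # assumed naming scheme
                        "awslogs-region": region,
                        "awslogs-stream-prefix": service,
                    },
                },
            }
        ],
    }


if __name__ == "__main__":
    definition = task_definition(
        service="spring-petclinic-customers-service",
        image_uri="123456789012.dkr.ecr.eu-central-1.amazonaws.com/"
                  "spring-petclinic-customers-service:3f2a9c1",
        container_port=8081,  # assumed port
        memory_mib=512,       # assumed size
        execution_role_arn="arn:aws:iam::123456789012:role/example-execution-role",
        db_secret_arn="arn:aws:secretsmanager:eu-central-1:123456789012:secret:example",
    )
    print(json.dumps(definition, indent=2))
```

In practice this structure would more likely live in the Terraform code listed in the solution stack; the Python form is only meant to show which pieces (image tag, secret reference, log routing) change per service and per build.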

Security

| Area | Implementation / Notes |
| --- | --- |
| 1. Authentication, Authorization, Auditing (AAA) | Spring Security |
| | IAM roles restrict AWS access per service |
| | Auditing: not relevant since we don't handle sensitive data |
| | CloudWatch for app/service logs |
| 2. Code Security | Static analysis via SonarQube |
| | No hardcoded credentials |
| | Secrets in AWS Secrets Manager |
| | Dependency scanning via Dependabot |
| 3. Traffic Security | HTTPS enforced via ALB |
| | Internal TLS optional for microservices |
| | Security groups restrict inbound/outbound ports |
| | Private subnets for internal services and databases |
| 4. Instance / Container Security | Use minimal and updated AMIs |
| | Regular patching, no direct SSH (bastion-only) |
| | Containers run as non-root users |
| | Vulnerability scanning before deploy |
| | Secrets made available at runtime via IAM roles or ECS environment variables |

Observability

| Aspect | Tools | Notes |
| --- | --- | --- |
| Metrics | Prometheus | Collect CPU, memory, and ECS task metrics from node exporters. |
| | | If microservices expose Prometheus metrics, scrape them directly. |
| | Grafana | Dashboards for system and service health. |
| Logs | AWS CloudWatch Logs | ECS task logs streamed to CloudWatch. |
| | | Structured JSON logging for easy filtering and search. |
| Traces | AWS X-Ray | Trace API calls across microservices. |
| Alerts | CloudWatch Alarms | CloudWatch for infrastructure-level alerts (CPU, memory, ECS health). |
| | Grafana Alerts | Grafana alert rules for application metrics from Prometheus. |
| | | Alerts via email or Slack webhook. |
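
Structured JSON logging means each log line is a single JSON object with stable field names, so CloudWatch queries can filter on fields instead of parsing free text. The services themselves are Spring Boot applications, so the Python sketch below only illustrates the intended log shape; the field names (`timestamp`, `level`, `logger`, `message`, `service`) are assumptions, not an agreed schema.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render every log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # "service" is attached through the LoggerAdapter below.
            "service": getattr(record, "service", "unknown"),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(service_name):
    """Return a logger that writes JSON lines to stderr and tags them with the service name."""
    logger = logging.getLogger(service_name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logging.LoggerAdapter(logger, {"service": service_name})


if __name__ == "__main__":
    log = build_logger("visits-service")
    log.info("visit created for pet %s", 42)
```

With one object per line, a filter such as "level is ERROR and service is visits-service" becomes a simple field match in the log tooling.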

Continuity & Recovery

| Aspect | Approach / Tooling | Notes |
| --- | --- | --- |
| Redundancy | Multi-AZ deployment | RDS and ECS nodes deployed across multiple Availability Zones for high availability. |
| | | Load balancer automatically routes traffic to healthy tasks. |
| Failover | AWS-managed failover | RDS Multi-AZ provides automatic database failover. |
| | | ECS services automatically restart failed tasks on healthy nodes. |
| | | Manual intervention only needed for regional failures. |
| Backup | AWS Backup / RDS Snapshots | Automated RDS daily backups with retention policy. |
| | S3 Versioning | S3 bucket versioning for uploaded images and configs. |
| Business Continuity Plan | Operate from secondary region if needed | Documented procedure to restore environment in another AWS region using IaC templates (Terraform). |
| | | Prioritize restoring RDS, Config Server, and API Gateway. |
| Disaster Recovery Plan | Cold standby in alternate region | No live duplication to save cost. |
| | | Periodic replication of backups and images to secondary region. |

Architecture Diagram

[Figure: architecture diagram]

Solutions stack

| Layer | Technologies / Services |
| --- | --- |
| Application Layer | Spring Boot microservices |
| Runtime / Platform Layer | Docker, Amazon ECS, Amazon ECR |
| CI/CD Layer | Jenkins, Gitea |
| Infrastructure Layer | Terraform, Ansible, Amazon EC2, VPC, subnets, security groups |
| Database / Storage Layer | Amazon RDS (MySQL), Amazon S3, Amazon EBS |
| Observability Layer | Prometheus, Grafana, CloudWatch, AWS X-Ray |
| Security Layer | AWS IAM, Security Groups, HTTPS via ALB, Secrets Manager |
| Continuity & Recovery Layer | RDS automated snapshots, S3 versioning/replication, multi-AZ RDS, Terraform for redeploy |
| Network & Delivery Layer | Application Load Balancer (ALB), Route 53, NAT Gateway, Internet Gateway |