Production-grade NestJS microservice for real-time loyalty reward calculation and session management
The service is organized around six core modules, each responsible for specific business concerns. Dependencies flow from concrete implementations up to the application layer.
Each module encapsulates related providers and controllers. The dependency injection system manages instantiation and sharing.
Services are stateless where possible. Database operations use repositories. External APIs use client services with error handling.
Minimal logic in controllers. All business logic delegated to services. Request/response validation via DTOs and pipes.
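As a sketch of that validation layer (field names taken from the `/identify` example payload in this document; the real service presumably uses `class-validator` DTOs behind a global `ValidationPipe`):

```typescript
// Hypothetical shape of the /identify request body; field names are taken
// from the example payloads in this document, not from the actual DTO.
interface IdentifyRequestDto {
  customerId: string;
  channelId: string;
  transactionId: string;
}

// Minimal stand-in for what a ValidationPipe + DTO would enforce:
// reject any body missing a required non-empty string field.
function validateIdentifyRequest(body: unknown): IdentifyRequestDto {
  const required = ["customerId", "channelId", "transactionId"] as const;
  const obj = body as Record<string, unknown> | null;
  for (const field of required) {
    if (typeof obj?.[field] !== "string" || obj[field] === "") {
      throw new Error(`Validation failed: '${field}' must be a non-empty string`);
    }
  }
  return obj as unknown as IdentifyRequestDto;
}
```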
Source code follows NestJS conventions with feature-based module organization.
Application entry point. Bootstraps NestJS app, configures middleware, and starts the HTTP server.
Environment validation and global configuration. Loaded early before any modules initialize.
PDI API client and auth logic. Includes circuit breaker for resilience and OAuth2 token management.
Session entity, repository, and state machine. Handles session lifecycle from creation to finalization.
Core business logic for loyalty operations. Controllers expose HTTP endpoints, services handle calculations.
Health check endpoints for Kubernetes and monitoring systems. Includes PDI endpoint status indicator.
The loyalty service communicates with PDI Point of Sale system through a secure, fault-tolerant client.
| Endpoint | Method | Purpose | Timeout |
|---|---|---|---|
| `/identify` | POST | Initiate loyalty session, validate customer | 5s |
| `/calculateRewards` | POST | Calculate earned points for transaction | 8s |
| `/finalize` | POST | Commit rewards to customer account | 10s |
| `/getSessionStatus` | GET | Query session state and earned points | 3s |
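The per-endpoint timeouts above can be enforced with a generic wrapper. This is an illustrative sketch, not the actual client code; a real HTTP client would also abort the underlying request (e.g. via `AbortController`) rather than just abandoning it:

```typescript
// Reject a promise if it does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    );
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}
```

A call such as `withTimeout(post("/identify", body), 5000, "/identify")` would then map directly onto the timeout column above (`post` is a hypothetical helper).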
If PDI does not respond within the timeout, the circuit breaker records a failure and, once the failure threshold is crossed, transitions from CLOSED to OPEN.
401/403 responses trigger an OAuth2 token refresh. If the refresh fails, the service returns 503 with a Retry-After header.
PDI may reject operations (invalid customer, insufficient points); these rejections propagate to the client as 400 errors with details.
While the circuit breaker is OPEN, the service queues operations and retries them with exponential backoff.
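The breaker's state handling can be sketched as follows (thresholds and cooldowns here are illustrative assumptions, not the service's actual configuration):

```typescript
// Minimal circuit-breaker sketch: CLOSED until the failure threshold is
// crossed, OPEN during the cooldown, then HALF_OPEN to allow a probe.
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitBreaker {
  private state: BreakerState = "CLOSED";
  private failures = 0;
  private openedAt = 0;

  constructor(
    private readonly failureThreshold = 5, // assumed value
    private readonly cooldownMs = 30_000,  // assumed value
  ) {}

  getState(now: number = Date.now()): BreakerState {
    // After the cooldown, allow a single probe request through.
    if (this.state === "OPEN" && now - this.openedAt >= this.cooldownMs) {
      this.state = "HALF_OPEN";
    }
    return this.state;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "CLOSED";
  }

  recordFailure(now: number = Date.now()): void {
    this.failures += 1;
    if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
      this.state = "OPEN";
      this.openedAt = now;
    }
  }
}
```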
`/identify` request:

```json
{
  "customerId": "CUST-12345",
  "channelId": "POS-001",
  "transactionId": "TXN-67890"
}
```

`/identify` response:

```json
{
  "sessionId": "sess_abc123",
  "status": "active",
  "customerName": "John Doe",
  "currentPoints": 5000,
  "tier": "Gold"
}
```
`/calculateRewards` request:

```json
{
  "sessionId": "sess_abc123",
  "transactionAmount": 150.00,
  "items": [
    { "sku": "SKU-001", "qty": 2 }
  ]
}
```

`/calculateRewards` response:

```json
{
  "earnedPoints": 450,
  "multiplier": 3.0,
  "bonusPoints": 150,
  "breakdown": {
    "basePoints": 300,
    "tierBonus": 150
  }
}
```
`/finalize` request:

```json
{
  "sessionId": "sess_abc123",
  "transactionId": "TXN-67890",
  "finalAmount": 150.00,
  "applyRewards": true
}
```

`/finalize` response:

```json
{
  "sessionId": "sess_abc123",
  "status": "finalized",
  "earnedPoints": 450,
  "newBalance": 5450,
  "receiptId": "RCP-2025-001"
}
```
Sessions track the state of a loyalty transaction from customer identification through reward finalization.
```typescript
export interface Session {
  id: string;                    // UUID
  customerId: string;            // PDI customer ID
  status: SessionStatus;         // State enum
  transactionId: string;         // Unique transaction ref
  earnedPoints: number;          // Total earned this session
  redeemedPoints: number;        // If redeemed
  pdiSessionId: string;          // Remote session handle
  createdAt: Date;               // Session start
  expiresAt: Date;               // TTL
  finalizedAt: Date | null;      // When committed
  metadata: Record<string, any>; // Custom data
}
```
Sessions use pessimistic locking during state transitions to prevent race conditions.
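The lifecycle guard behind those transitions can be sketched as an allowed-transitions map. The exact state set is an assumption inferred from this document ("active", "finalized", and the FAILED/expired states used by the maintenance jobs):

```typescript
// Sketch of the session state machine; state names are assumptions.
type SessionStatus = "created" | "active" | "finalized" | "failed" | "expired";

const allowedTransitions: Record<SessionStatus, SessionStatus[]> = {
  created: ["active", "failed"],
  active: ["finalized", "failed", "expired"],
  finalized: [], // terminal
  failed: [],    // terminal
  expired: [],   // terminal
};

// Validate a transition before applying it (the real service would do this
// while holding a pessimistic lock on the session row).
function transition(from: SessionStatus, to: SessionStatus): SessionStatus {
  if (!allowedTransitions[from].includes(to)) {
    throw new Error(`Illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```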
Duplicate requests (same transactionId) return cached result instead of creating duplicate sessions.
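A minimal sketch of that idempotency check, keyed by `transactionId`. The 10-minute TTL matches the cache-clearing job described in this document; in the real service this presumably lives in Redis rather than in process memory:

```typescript
// In-memory idempotency cache sketch (Redis-backed in practice, assumed).
const TTL_MS = 10 * 60 * 1000;
const cache = new Map<string, { result: unknown; storedAt: number }>();

function getOrCreate<T>(
  transactionId: string,
  create: () => T,
  now: number = Date.now(),
): T {
  const hit = cache.get(transactionId);
  if (hit && now - hit.storedAt < TTL_MS) {
    return hit.result as T; // duplicate request: return the cached result
  }
  const result = create(); // first request: do the real work
  cache.set(transactionId, { result, storedAt: now });
  return result;
}
```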
Sessions include a version field. Updates fail if version doesn't match, requiring retry.
All database operations run in serializable isolation for strong consistency.
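The version check can be sketched as optimistic concurrency control. This is illustrative only; in the real service the comparison would happen in a `WHERE version = ?` clause inside a serializable transaction:

```typescript
// Optimistic-locking sketch: an update applies only if the caller
// read the row at its current version; otherwise the caller retries.
interface Versioned { version: number; earnedPoints: number }

function applyUpdate(
  row: Versioned,
  expectedVersion: number,
  patch: Partial<Versioned>,
): Versioned {
  if (row.version !== expectedVersion) {
    throw new Error(
      `Version conflict: expected ${expectedVersion}, found ${row.version}; retry`,
    );
  }
  return { ...row, ...patch, version: row.version + 1 };
}
```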
| Job | Schedule | Action |
|---|---|---|
| Expire Sessions | Every 15 minutes | Mark expired sessions as FAILED, soft delete |
| Archive Finalized | Daily at 2 AM UTC | Move 30+ day old sessions to archive table |
| Clear Cache | Every hour | Remove idempotency entries older than 10 minutes |
| Reconciliation | Daily at 1 AM UTC | Verify finalized sessions against PDI records |
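The expiry job from the table above can be sketched as a pure function over due sessions. Field names follow the `Session` interface in this document; `deletedAt` is an assumed soft-delete column:

```typescript
// Sketch of the 15-minute expiry job: mark overdue active sessions
// FAILED and soft-delete them, keeping the rows for reconciliation.
interface ExpirableSession {
  status: string;
  expiresAt: Date;
  deletedAt: Date | null;
}

function expireSessions(sessions: ExpirableSession[], now: Date): number {
  let expired = 0;
  for (const s of sessions) {
    if (s.deletedAt === null && s.status === "active" && s.expiresAt <= now) {
      s.status = "FAILED";
      s.deletedAt = now; // soft delete: row kept, not removed
      expired += 1;
    }
  }
  return expired;
}
```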
All logs are JSON-formatted with consistent fields for easy parsing and aggregation.
Example log entry:

```json
{
  "timestamp": "2025-03-16T14:30:45.123Z",
  "level": "INFO",
  "service": "loyalty-service",
  "context": "LoyaltyController",
  "message": "Session finalized successfully",
  "sessionId": "sess_abc123",
  "customerId": "CUST-12345",
  "duration": 1250,
  "earnedPoints": 450,
  "traceId": "trace-xyz789",
  "spanId": "span-abc456"
}
```
```
GET /metrics
```

Returns Prometheus-formatted metrics. Scrape interval: every 30 seconds; typical response size: 15-20 KB.
| Endpoint | Purpose | Dependencies |
|---|---|---|
| `GET /health` | General liveness probe | Service process only |
| `GET /health/ready` | Readiness probe (ready to accept traffic) | Database, Redis, PDI |
| `GET /health/live` | Deep health check with dependency status | All critical systems |
200 OK

```json
{
  "status": "healthy",
  "timestamp": "2025-03-16T14:30:45Z",
  "uptime": 3600,
  "checks": {
    "database": { "status": "up", "latency": 5 },
    "redis": { "status": "up", "latency": 2 },
    "pdi": { "status": "degraded", "circuitBreaker": "HALF_OPEN", "latency": 4500 }
  }
}
```
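A sketch of how the per-dependency checks could roll up into the overall status. Note the sample payload above reports `"healthy"` even while PDI is degraded, which suggests only a fully-down dependency flips the overall status (an inference, not confirmed by the source):

```typescript
// Roll per-dependency check results up into one overall status.
type CheckStatus = "up" | "degraded" | "down";

function overallStatus(checks: Record<string, { status: CheckStatus }>): string {
  const statuses = Object.values(checks).map((c) => c.status);
  // A degraded dependency does not flip the overall status; only a
  // dependency that is fully down does (inferred from the sample payload).
  return statuses.includes("down") ? "unhealthy" : "healthy";
}
```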
Multi-stage build for minimal runtime image with production-grade security.
```dockerfile
# Build stage: install all dependencies (dev deps are needed for the
# build itself), compile, then prune to production dependencies.
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
RUN npm prune --omit=dev

# Runtime stage
FROM node:20-alpine
WORKDIR /app
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nestjs -u 1001
COPY --from=builder --chown=nestjs:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=nestjs:nodejs /app/dist ./dist
USER nestjs
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/health', (r) => { if (r.statusCode !== 200) throw new Error(r.statusCode) })"
CMD ["node", "dist/main.js"]
```
Local development environment with all dependencies.
```yaml
version: '3.8'
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://user:pass@postgres:5432/loyalty
      - REDIS_URL=redis://redis:6379
      - PDI_API_URL=https://pdi-api.internal
      - NODE_ENV=development
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
  postgres:
    image: postgres:16-alpine
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=loyalty
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
volumes:
  postgres_data: {}
```
| Variable | Required | Default | Description |
|---|---|---|---|
| `NODE_ENV` | Yes | production | Environment: development, test, production |
| `PORT` | No | 3000 | HTTP server port |
| `DATABASE_URL` | Yes | — | PostgreSQL connection string |
| `REDIS_URL` | Yes | — | Redis connection string for caching |
| `PDI_API_URL` | Yes | — | PDI API base URL |
| `PDI_CLIENT_ID` | Yes | — | OAuth2 client ID for PDI |
| `PDI_CLIENT_SECRET` | Yes | — | OAuth2 client secret (sensitive) |
| `LOG_LEVEL` | No | info | Minimum log level: debug, info, warn, error |
| `SESSION_TTL_HOURS` | No | 24 | Session expiry time in hours |
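The fail-fast validation described earlier ("loaded early before any modules initialize") can be sketched as below. Variable names come from the table above; the real service presumably uses `@nestjs/config` with a validation schema:

```typescript
// Startup environment validation sketch: fail fast if any required
// variable is missing, and apply documented defaults for the rest.
const REQUIRED = [
  "NODE_ENV", "DATABASE_URL", "REDIS_URL",
  "PDI_API_URL", "PDI_CLIENT_ID", "PDI_CLIENT_SECRET",
] as const;

const DEFAULTS: Record<string, string> = {
  PORT: "3000",
  LOG_LEVEL: "info",
  SESSION_TTL_HOURS: "24",
};

function validateEnv(env: Record<string, string | undefined>): Record<string, string> {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  const config: Record<string, string> = { ...DEFAULTS };
  for (const [key, value] of Object.entries(env)) {
    if (value !== undefined) config[key] = value;
  }
  return config;
}
```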
Standard Kubernetes manifests for EKS, GKE, or self-hosted clusters.
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: loyalty-service
spec:
  containers:
    - name: loyalty-service
      image: loyalty-service:latest
      ports:
        - containerPort: 3000
      livenessProbe:
        httpGet:
          path: /health/live
          port: 3000
        initialDelaySeconds: 40
        periodSeconds: 30
      readinessProbe:
        httpGet:
          path: /health/ready
          port: 3000
        initialDelaySeconds: 20
        periodSeconds: 10
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
        limits:
          cpu: "2000m"
          memory: "2Gi"
```
The service is stateless and scales horizontally behind a load balancer.
Connection pooling prevents database connection exhaustion under load.
Rate limiting protects backend systems and ensures fair resource allocation.
The load balancer distributes traffic evenly across replicas.
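The rate limiting mentioned above can be sketched as a token bucket. Capacity and refill rate here are illustrative; NestJS deployments commonly use `@nestjs/throttler` for this:

```typescript
// Token-bucket rate limiter sketch: a bucket refills continuously up to
// its capacity; each request consumes one token or is rejected (HTTP 429).
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number,
    private readonly refillPerSecond: number,
    now: number = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  tryConsume(now: number = Date.now()): boolean {
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSecond);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true; // request allowed
    }
    return false; // rate limited: caller should return 429
  }
}
```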