Distributed Systems Exercise: Smart Concert / Festival Management Platform

Course focus: distributed systems, message-oriented middleware, reliable messaging, error handling, and long-running workflows (SAGA).
Tech focus: .NET backend microservices, RabbitMQ, CloudEvents message format, REST APIs, YARP reverse proxy, SignalR notifications, databases, transactional outbox, orchestration, Docker Compose.

You will receive an empty starter repository via GitHub Classroom. Your job is to build the system in milestones.
You must commit regularly and create a Git tag at the end of each milestone:

MS1-SetUp
MS2-AddMessagingMiddleware
MS3-DefineMessageFormat
MS4-BuildBasicWebApp
MS5-AddYARPReverseProxy
MS6-BuildBusinessLogicAndMessaging
MS7-AddSignalRNotifications
MS8-AddDatabasesForStorageAPIs
MS9-AddTransactionalOutbox
MS10-AddOrchestratorSaga
MS11-AddCallbackService
MS12-ContainerizeDockerCompose
MS13-OptionalExtensions

Important: This document tells you what to build and why (goals & constraints). It does not tell you how to implement it.
You are expected to research APIs and libraries yourself.

1. Scenario: “Festivo” — a Smart Festival Platform

Your team is building Festivo, a backend-heavy system for a medium-sized music festival.

The festival has:

Ticket purchases (simple ticket types)
Entry scanning at gates
Stage schedules and artist lineup
Live capacity tracking per area/stage
Push notifications to visitors (status updates)

The system must be distributed: multiple services, each owning its data and exposing a REST API.
Services coordinate via asynchronous events using a message broker (RabbitMQ), and implement reliability patterns such as outbox and sagas.

Key qualities we want to practice

Asynchronous communication via events
Eventual consistency between services
Reliable message delivery (publish/consume, retries, dead-letter handling)
Long-running processes with SAGA orchestration
Observability (optional extension: distributed logging/tracing)

2. High-Level Domain Model (Concepts)

Entities / concepts (minimum set):

Ticket: created when purchased, can be validated and used for entry
Visitor: optional (may be anonymous), linked to ticket
Gate Scan: attempts to enter/exit (may repeat)
Venue Area/Stage: has capacity and current occupancy
Schedule Item: artist, stage, start/end time
Alert/Notification: messages to users (e.g., “Stage A crowded”)

3. Services Overview

You will implement multiple services. Each service must be a separate project in the same solution (monorepo), with its own API, data store, and background workers where needed.

3.1 API Gateway (YARP)

A reverse proxy that exposes a single entry point to the frontend and routes requests to backend services.

Responsibilities

Route frontend requests to backend REST APIs
Provide a stable base URL for the web app

3.2 Ticket Service

Manages ticket purchasing and ticket state.

Owns

Ticket records, ticket type, status

REST API (minimum)

POST /tickets/purchase — purchase a ticket (returns ticket id + a “ticket code” string)
GET /tickets/{ticketId} — read ticket state
POST /tickets/{ticketId}/refund — request refund/cancel (used by SAGA compensation)

Publishes events

TicketPurchased
TicketRefunded (or TicketCancelled)

Consumes events

(Optional) events that might affect ticket validity (e.g., EntryDenied)

3.3 Access Control Service (Gate Scanning)

Simulates scanning tickets at entry/exit gates and enforces rules like “no double entry”.

Owns

Scan log and current “inside/outside” state per ticket

REST API (minimum)

POST /gates/scan-entry — scan ticket code for entry attempt
POST /gates/scan-exit — scan ticket code for exit attempt
GET /gates/tickets/{ticketId}/status — whether ticket is inside/outside

Publishes events

EntryRequested (entry attempt)
EntryGranted / EntryDenied
ExitGranted / ExitDenied

Consumes events

TicketPurchased (so it can recognize a valid ticket code)
TicketRefunded (so it can reject entry)

Why: Gate scanning is a great source of duplicate messages and out-of-order events.
You must design idempotent handling and clear rejection reasons.
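
To make the idea of idempotent, order-tolerant handling concrete, here is a minimal language-neutral sketch in Python (your implementation will be in .NET). It assumes that every event carries a unique `id` and that the payload contains a `ticketCode` field; the class name `TicketRegistry` and the return strings are hypothetical. It is one possible approach, not a prescribed design.

```python
class TicketRegistry:
    """In-memory stand-in for the Access Control Service's view of tickets."""

    def __init__(self):
        self.processed_event_ids = set()  # ids of events already applied
        self.ticket_status = {}           # ticket code -> "valid" | "refunded"

    def handle(self, event):
        """Apply a TicketPurchased/TicketRefunded event at most once."""
        if event["id"] in self.processed_event_ids:
            return "duplicate-ignored"
        code = event["data"]["ticketCode"]
        if event["type"] == "TicketPurchased":
            # If a refund overtook the purchase, keep "refunded" instead of reviving the ticket.
            self.ticket_status.setdefault(code, "valid")
        elif event["type"] == "TicketRefunded":
            self.ticket_status[code] = "refunded"
        else:
            return "unknown-type"
        self.processed_event_ids.add(event["id"])
        return "applied"


registry = TicketRegistry()
purchased = {"id": "e1", "type": "TicketPurchased", "data": {"ticketCode": "ABC123"}}
refunded = {"id": "e2", "type": "TicketRefunded", "data": {"ticketCode": "ABC123"}}

# The refund arrives first, then the purchase, then a redelivery of the purchase.
results = [registry.handle(refunded), registry.handle(purchased), registry.handle(purchased)]
```

In this sketch the redelivered event is ignored and the out-of-order refund still wins. A persistent store for the processed ids would be needed once the service has a database.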

3.4 Schedule Service

Stores the lineup and stage schedule.

Owns

Schedule items (artist, stage, time range)

REST API (minimum)

POST /schedule/items — create schedule item
GET /schedule/stages/{stageId} — read schedule for a stage
GET /schedule/items/{itemId} — read one schedule item

Publishes events

ScheduleItemCreated (optional, can drive notifications)

Consumes events

none required

Why: This service gives you stable “reference data” and a reason for cross-service lookups later.

3.5 Crowd Monitor Service

Tracks occupancy for each stage/area and raises alerts when near/over capacity.

Owns

Occupancy counters per stage/area, plus alert state

REST API (minimum)

GET /crowd/stages/{stageId} — occupancy + capacity status
POST /crowd/stages/{stageId}/configure — set capacity thresholds (admin)

Publishes events

OccupancyUpdated
CapacityWarningIssued
CapacityCriticalIssued
CapacityBackToNormal (optional)

Consumes events

EntryGranted / ExitGranted (to update occupancy)
(Optional) schedule events (to focus on “active stage”)

Why: Occupancy is a classic eventually consistent value that is updated from other services’ events.
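
As an illustration of how occupancy could be derived from other services' events, the following Python sketch keeps a counter per stage and reports a capacity event only when the level changes. The class `StageOccupancy` and the integer thresholds `warning_at` and `critical_at` are assumptions made for this example; the event names come from the lists above.

```python
class StageOccupancy:
    """Occupancy counter for one stage with warning/critical thresholds."""

    LEVEL_EVENTS = {
        "normal": "CapacityBackToNormal",
        "warning": "CapacityWarningIssued",
        "critical": "CapacityCriticalIssued",
    }

    def __init__(self, warning_at, critical_at):
        self.warning_at = warning_at
        self.critical_at = critical_at
        self.count = 0
        self.level = "normal"

    def _current_level(self):
        if self.count >= self.critical_at:
            return "critical"
        if self.count >= self.warning_at:
            return "warning"
        return "normal"

    def apply(self, event_type):
        """Apply EntryGranted/ExitGranted; return the event names to publish."""
        if event_type == "EntryGranted":
            self.count += 1
        elif event_type == "ExitGranted":
            self.count = max(0, self.count - 1)  # clamp so the counter never goes negative
        else:
            return []
        to_publish = ["OccupancyUpdated"]
        level = self._current_level()
        if level != self.level:  # publish a level event only on transitions
            self.level = level
            to_publish.append(self.LEVEL_EVENTS[level])
        return to_publish


stage = StageOccupancy(warning_at=2, critical_at=3)
history = [stage.apply("EntryGranted") for _ in range(3)]
history.append(stage.apply("ExitGranted"))
```

Clamping at zero is a simplification: if an exit is processed before its matching entry, the counter drifts, which is exactly the kind of eventual-consistency issue you should think about.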

3.6 Notification Service

Sends status updates to clients (via SignalR) and can store a history of notifications.

Owns

Notification history (optional but recommended)

REST API (minimum)

GET /notifications/recent?limit=... — list recent notifications (for new clients)

SignalR Hub (minimum)

/hubs/notifications — pushes status events to clients

Publishes events

none required

Consumes events

TicketPurchased, EntryGranted, EntryDenied, CapacityWarningIssued, etc.

Why: Real-time updates make event-driven systems visible and debuggable.

3.7 Orchestrator Service (SAGA)

Coordinates a long-running business process and implements compensating actions.

You will implement a SAGA that handles at least one of these workflows:

Ticket Refund on Failed Entry
Example: user buys ticket, tries to enter, entry fails for a “system reason” → refund ticket automatically.
“Crowd-safe routing”
Example: a granted entry updates occupancy; if capacity becomes critical, the orchestrator triggers an alert and optionally blocks entry for a short time window.
“VIP upgrade”
Example: visitor requests upgrade, payment confirmed, ticket updated, notification sent.

Owns

Saga state instances (e.g., EntrySaga, RefundSaga)

REST API (minimum)

POST /processes/start — start a process (or used indirectly by events)

Publishes events

process commands/events to other services (you decide naming)

Consumes events

depends on your chosen saga workflow

Why: SAGA is the standard way to handle distributed transactions without a shared database transaction.

3.8 Callback Service (REST or gRPC)

A small service that enables downstream services to request info from another service at runtime.

Goal: demonstrate that event-driven systems still sometimes need request/response for missing data.

Example use cases:

Crowd Monitor needs stage capacity configuration from Schedule Service (or a dedicated Config service)
Access Control needs to check ticket validity with Ticket Service if it misses an event
Orchestrator fetches schedule information to include in user notification

Requirements

Implement at least one cross-service synchronous call pattern:
- REST: simple HTTP call
- or gRPC: typed contract

Why: Pure event-driven architectures are rare. You need to understand when and how to do synchronous calls safely.

4. System Structure Diagram

flowchart LR
  UI[Blazor Web App] -->|HTTP via Gateway| GW[YARP API Gateway]

  GW --> TS[Ticket Service]
  GW --> ACS[Access Control Service]
  GW --> SS[Schedule Service]
  GW --> CMS[Crowd Monitor Service]
  GW --> NS[Notification Service]
  GW --> ORCH[Orchestrator Service]

  subgraph Broker[RabbitMQ]
    MQ[(Exchange/Queues)]
  end

  TS <-->|events| MQ
  ACS <-->|events| MQ
  SS <-->|events| MQ
  CMS <-->|events| MQ
  NS <-->|events| MQ
  ORCH <-->|events| MQ

  NS -->|SignalR| UI

  ORCH -. sync call .-> CBS[Callback Service]
  ACS -. sync call .-> CBS
  CMS -. sync call .-> CBS
  CBS -.-> TS
  CBS -.-> SS

5. Main Data Flow (Example)

Example: Ticket purchase → entry scan → occupancy update → warning notification

sequenceDiagram
  participant UI as Blazor UI
  participant GW as YARP Gateway
  participant TS as Ticket Service
  participant MQ as RabbitMQ
  participant ACS as Access Control
  participant CMS as Crowd Monitor
  participant NS as Notification Service

  UI->>GW: POST /tickets/purchase
  GW->>TS: POST /tickets/purchase
  TS-->>UI: ticketId + ticketCode
  TS->>MQ: TicketPurchased (CloudEvent)

  UI->>GW: POST /gates/scan-entry (ticketCode)
  GW->>ACS: POST /gates/scan-entry
  ACS->>MQ: EntryRequested (CloudEvent)
  ACS->>MQ: EntryGranted OR EntryDenied (CloudEvent)

  MQ->>CMS: EntryGranted
  CMS->>MQ: OccupancyUpdated
  CMS->>MQ: CapacityWarningIssued (if threshold exceeded)

  MQ->>NS: TicketPurchased / EntryGranted / CapacityWarningIssued
  NS-->>UI: SignalR push updates

6. Messaging Standard: CloudEvents

All messages that travel through RabbitMQ must use a uniform event envelope using the CloudEvents standard.

Why:

Consistent metadata across services (event type, id, time, source)
Easier troubleshooting and filtering
Interoperable across languages and systems

What to include

id (unique per event)
type (event type name)
source (service name)
time
subject (optional: entity id)
datacontenttype
data (your event payload)

Where to read

CloudEvents specification (CNCF)
CloudEvents .NET libraries (if you choose to use them)

You decide how your event data is shaped, but it must be versionable and documented.
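
To show what such an envelope might look like, here is a small Python sketch that builds a CloudEvents 1.0 structured-mode JSON message by hand. Note that the specification also requires a `specversion` attribute in addition to the fields listed above. The type name `festivo.ticket.purchased.v1`, the source path, and the payload fields are assumptions chosen for illustration; in .NET you may prefer a CloudEvents library.

```python
import json
import uuid
from datetime import datetime, timezone


def make_cloud_event(event_type, source, data, subject=None):
    """Wrap a payload in a CloudEvents 1.0 envelope (returned as a dict)."""
    event = {
        "specversion": "1.0",  # required by the CloudEvents specification
        "id": str(uuid.uuid4()),
        "type": event_type,
        "source": source,
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }
    if subject is not None:
        event["subject"] = subject  # optional: id of the affected entity
    return event


event = make_cloud_event(
    event_type="festivo.ticket.purchased.v1",  # hypothetical naming scheme with version suffix
    source="/services/ticket-service",         # hypothetical source identifier
    data={"ticketId": "T-1001", "ticketCode": "ABC123"},
    subject="T-1001",
)
body = json.dumps(event)  # this string would be the RabbitMQ message body
```

Putting the version into the `type` attribute is only one possible versioning strategy; whichever you choose, document it.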

7. Milestones and Requirements

Each milestone must end with:

all tests/build passing (if you have tests)
working demo for the milestone’s scope
a Git tag on the commit: git tag MSx-...

MS1 — SetUp

Tag: MS1-SetUp

Goal (why): Create a clean, repeatable starting point for a multi-service system.

Requirements

Create a single .sln containing individual projects:
- ApiGateway (empty for now) Project-Type: Web API
- TicketService Project-Type: Web API
- AccessControlService Project-Type: Web API
- ScheduleService Project-Type: Web API
- CrowdMonitorService Project-Type: Web API
- NotificationService Project-Type: Web API
- OrchestratorService Project-Type: Web API
- CallbackService Project-Type: Web API
- WebApp Project-Type: Blazor WASM
- Shared (contracts/helpers; keep minimal and avoid tight coupling) Project-Type: Class Library
Each service must:
- Run as an independent process
- Expose a GET /health endpoint returning a simple OK response
Repository structure must include:
- docs/ folder (you may keep notes, event definitions, etc.)
- README.md with how to run services (basic)
Git requirements:
- At least 5 commits showing incremental work
- Tag MS1-SetUp on the final milestone commit

MS2 — AddMessagingMiddleware (RabbitMQ)

Tag: MS2-AddMessagingMiddleware

Goal (why): Introduce asynchronous communication and decouple services.

Requirements

Have a RabbitMQ Docker container running and available during development.
Add RabbitMQ connectivity to all backend services (not the WebApp).
Define a standard configuration approach (e.g., environment variables / appsettings).
Each service must be able to:
- Publish a test message on startup (or via a test endpoint)
- Consume a test message and log that it was received
Define exchanges/queues in a consistent way:
- Either one exchange with routing keys, or per-service exchanges (your choice)
- Make sure the necessary exchanges and queues are available (e.g., each service creates its own infrastructure on startup, or centralized initialization logic runs before any other service starts, or …)
Error handling requirement:
- If a consumer fails to process a message, the failure must be visible (log)
- Messages must not be silently lost: a message that cannot be processed must end up in a dead-letter queue.
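
The error-handling rule above can be pictured with a broker-free Python simulation. It only models the logic (log every failure, retry a few times, then park the message); with RabbitMQ you would typically configure dead-lettering on the queue itself, for example via the `x-dead-letter-exchange` queue argument. The retry budget `MAX_ATTEMPTS`, the message shape, and the function names are assumptions for this sketch.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry budget before a message is dead-lettered


def drain(queue, dead_letters, handler, log):
    """Process every message; log failures, retry, then dead-letter."""
    while queue:
        message = queue.popleft()
        try:
            handler(message["body"])
        except Exception as error:
            message["attempts"] += 1
            log.append(f"attempt {message['attempts']} failed for {message['id']}: {error}")
            if message["attempts"] >= MAX_ATTEMPTS:
                dead_letters.append(message)  # parked where it stays observable
            else:
                queue.append(message)         # requeue for another attempt


def handle(body):
    """Stand-in consumer that cannot process one specific payload."""
    if body == "corrupt":
        raise ValueError("cannot parse payload")


queue = deque([
    {"id": "m1", "body": "ok", "attempts": 0},
    {"id": "m2", "body": "corrupt", "attempts": 0},
])
dead_letters, log = [], []
drain(queue, dead_letters, handle, log)
```

After the run the working queue is empty, the failing message sits in `dead_letters`, and each failed attempt is in the log.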

MS3 — DefineMessageFormat (CloudEvents + serialization)

Tag: MS3-DefineMessageFormat

Goal (why): Ensure a uniform event format and consistent serialization across the system.

Requirements

All messages published to RabbitMQ must be wrapped as CloudEvents.
Define:
- How you serialize CloudEvents (e.g., JSON)
- How you handle event type mapping to .NET classes
- How you version your event payloads (at least a documented strategy)
Create a contracts documentation file in docs/:
- List each event type name
- Describe its data schema (fields + meaning)
- Define producer(s) and consumer(s)
- Provide a diagram showing emitted and consumed events for each service.
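
One possible way to handle event type mapping and versioning is an explicit registry from the CloudEvents `type` string to a payload class. The Python sketch below illustrates the idea; the type names, the two payload versions, and their fields are hypothetical, and in .NET the registry would map to your own event classes.

```python
from dataclasses import dataclass


@dataclass
class TicketPurchasedV1:
    ticket_id: str
    ticket_code: str


@dataclass
class TicketPurchasedV2:
    ticket_id: str
    ticket_code: str
    ticket_type: str  # field added in version 2


# CloudEvents "type" attribute -> payload class (names are assumptions)
EVENT_TYPES = {
    "festivo.ticket.purchased.v1": TicketPurchasedV1,
    "festivo.ticket.purchased.v2": TicketPurchasedV2,
}


def decode(event):
    """Turn the 'data' part of an envelope into a typed payload object."""
    payload_class = EVENT_TYPES.get(event["type"])
    if payload_class is None:
        raise LookupError(f"unknown event type: {event['type']}")
    return payload_class(**event["data"])


payload = decode({
    "type": "festivo.ticket.purchased.v2",
    "data": {"ticket_id": "T-1", "ticket_code": "ABC123", "ticket_type": "DayPass"},
})
```

An unknown type raises an error here, so the message can be rejected visibly instead of being dropped.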

MS4 — BuildBasicWebApp

Tag: MS4-BuildBasicWebApp

Goal (why): Provide a simple user entry point and a way to observe system behavior later.

Requirements

Create a minimal Blazor WASM Web App with pages:
- Purchase Ticket
- Scan Entry/Exit
- Live Status (placeholder for now)
The web app must call backend APIs directly (temporary) OR show placeholders.
UI requirements:
- Keep it simple; functionality > styling
- Must display returned IDs/codes clearly for testing

MS5 — AddYARPReverseProxy

Tag: MS5-AddYARPReverseProxy

Goal (why): Centralize access and avoid the frontend needing to know service URLs.

Requirements

Implement YARP gateway project.
WebApp must call only the gateway, not services directly.
Gateway must route to:
- TicketService endpoints
- AccessControlService endpoints
- ScheduleService endpoints
- CrowdMonitorService endpoints
- NotificationService endpoints (REST endpoints; SignalR later)
Gateway must provide SSL/TLS termination: frontend-to-gateway traffic is encrypted (HTTPS), while the gateway forwards requests to the services over plain HTTP.
Add documentation in README.md:
- which routes exist
- how to run gateway + services locally

MS6 — BuildBasicBusinessLogicAndMessaging

Tag: MS6-BuildBusinessLogicAndMessaging

Goal (why): Build the first real event-driven workflow with clear states and outcomes.

Functional workflow (minimum)

Purchase ticket in Ticket Service
Ticket Service publishes TicketPurchased
Access Control consumes TicketPurchased and registers the ticket code
Entry scan triggers EntryRequested and results in EntryGranted or EntryDenied
Crowd Monitor consumes EntryGranted / ExitGranted and updates occupancy

Requirements

TicketService:
- Must generate a ticket code (string) returned to client
- Must store ticket state in memory for now (DB later)
AccessControlService:
- Must reject unknown/invalid/refunded tickets
- Must enforce “no double entry” (enter twice without exit → denied)
- Must record each scan decision together with a reason
CrowdMonitorService:
- Must track occupancy per stage/area (choose a simple model)
- Must publish OccupancyUpdated when occupancy changes
All inter-service updates must happen via RabbitMQ events (not direct calls)
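
The Access Control rules above can be read as a small decision function. The following Python sketch shows one way to express them with explicit rejection reasons; the reason strings, the `known_tickets` dictionary, and the `inside` set are assumptions for illustration only.

```python
def scan_entry(ticket_code, known_tickets, inside):
    """Decide an entry attempt; returns (event_type, reason)."""
    status = known_tickets.get(ticket_code)
    if status is None:
        return ("EntryDenied", "unknown-ticket")
    if status == "refunded":
        return ("EntryDenied", "ticket-refunded")
    if ticket_code in inside:
        return ("EntryDenied", "already-inside")  # the "no double entry" rule
    inside.add(ticket_code)
    return ("EntryGranted", "ok")


def scan_exit(ticket_code, inside):
    """Decide an exit attempt; returns (event_type, reason)."""
    if ticket_code not in inside:
        return ("ExitDenied", "not-inside")
    inside.remove(ticket_code)
    return ("ExitGranted", "ok")


known_tickets = {"ABC123": "valid", "OLD999": "refunded"}
inside = set()
decisions = [
    scan_entry("ABC123", known_tickets, inside),
    scan_entry("ABC123", known_tickets, inside),  # second entry without exit
    scan_exit("ABC123", inside),
    scan_entry("OLD999", known_tickets, inside),
    scan_entry("NOPE", known_tickets, inside),
]
```

Each returned pair corresponds to one published event plus the reason that should appear in the scan log.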

Reliability requirements

Consumers must be idempotent for at least one event type (document which and how you ensure it).
Add a “poison message” strategy:
- messages that repeatedly fail must end up somewhere observable (e.g., dead-letter queue). This should already be considered in MS2.

MS7 — AddSignalRForStatusUpdateNotifications

Tag: MS7-AddSignalRNotifications

Goal (why): Make the asynchronous system visible to users and developers in real time.

Requirements

NotificationService hosts a SignalR hub at /hubs/notifications.
NotificationService consumes at least these events and broadcasts updates:
- TicketPurchased
- EntryGranted / EntryDenied
- OccupancyUpdated
- (If implemented) CapacityWarningIssued
WebApp connects to the hub and shows a live event feed:
- timestamp
- event type
- short description (human-readable)
Implement network communication in one of two ways (document what you chose and why):
- Frontend directly connects to NotificationService (not using YARP API Gateway)
- or the frontend connects through the gateway, which must then support routing for SignalR (WebSockets) to NotificationService.

MS8 — AddDatabase(s) For Storage-First APIs

Tag: MS8-AddDatabasesForStorageAPIs

Goal (why): Introduce persistence and independent data ownership per service.

Requirements

Add a database for at least:
- TicketService (tickets)
- AccessControlService (scan log / inside status)
- OrchestratorService (saga state): optional in this milestone; a database can be added later
You can add individual databases (or database containers) or use a single database instance with multiple schemas.
Each service must have its own database user that only has access to its own database instance / schema.
Define clear ownership:
- Each service has its own schema/database (no shared tables)
APIs must read/write from the database (not in-memory).
Provide a minimal migration strategy (documented).

Keep schemas small and straightforward. The focus is distributed behavior, not data modeling perfection.

MS9 — AddTransactionalOutboxPattern

Tag: MS9-AddTransactionalOutbox

Goal (why): Ensure messages are not lost when a service writes to DB but fails before publishing.

Requirements

Implement an outbox table in at least one service (TicketService strongly recommended).
When the service changes state in its DB, the outgoing message must be recorded in the outbox within the same local DB transaction.
A background publisher (worker) must read the outbox and publish messages to RabbitMQ.
Outbox messages must be marked as sent (or deleted) only after successful publish.
Document:
- how duplicates are prevented/handled
- what happens if RabbitMQ is down
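
The outbox requirements can be sketched with an in-memory SQLite database in Python. This is only an illustration of the pattern under simplifying assumptions: the table layout, the `sent` flag, and the function names are hypothetical, and the `publish` callable stands in for a real RabbitMQ publish.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tickets (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
db.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "payload TEXT NOT NULL, sent INTEGER NOT NULL DEFAULT 0)"
)


def purchase_ticket(ticket_id):
    """Write the state change and the outgoing event in one local transaction."""
    with db:  # commits on success, rolls back if an exception is raised
        db.execute("INSERT INTO tickets VALUES (?, 'purchased')", (ticket_id,))
        event = {"type": "TicketPurchased", "data": {"ticketId": ticket_id}}
        db.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def publish_pending(publish):
    """One pass of the background worker: publish unsent rows, oldest first."""
    rows = db.execute("SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id").fetchall()
    for row_id, payload in rows:
        try:
            publish(payload)
        except ConnectionError:
            return  # broker unavailable: rows stay unsent and are retried on the next pass
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))


def count_unsent():
    return db.execute("SELECT COUNT(*) FROM outbox WHERE sent = 0").fetchone()[0]


def broker_down(payload):
    raise ConnectionError("RabbitMQ unreachable")


purchase_ticket("T-1")
publish_pending(broker_down)        # simulated outage
unsent_during_outage = count_unsent()

published = []
publish_pending(published.append)   # broker is back
unsent_after_recovery = count_unsent()
```

If the worker crashed after publishing but before marking the row as sent, the message would be published again on the next pass. The pattern therefore gives at-least-once delivery, and consumers still have to tolerate duplicates.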

MS10 — AddOrchestratorServiceForSAGAImplementation

Tag: MS10-AddOrchestratorSaga

Goal (why): Coordinate a multi-step process across services with compensation for failures.

Information on SAGA: SAGA Pattern

Required SAGA workflow (choose ONE and implement fully)

Option A: Auto-refund on entry failure (recommended)

When a ticket is purchased, a saga instance is created.
The user attempts entry:
- If EntryGranted → saga completes
- If EntryDenied for a system reason (define a reason category) → orchestrator triggers refund via TicketService
TicketService publishes TicketRefunded, which completes the saga.

Option B: Capacity-based entry throttling

When CapacityCriticalIssued is received, the orchestrator sends a command to Access Control to deny further entries for a time window.

Requirements

Orchestrator must store saga state (in memory is acceptable initially, DB preferred).
Orchestrator must correlate messages to saga instances (define a correlation id strategy).
Must implement at least one compensating action (e.g., refund).
Must publish saga status events for NotificationService to show progress:
- SagaStarted, SagaStepCompleted, SagaCompensated, SagaCompleted, SagaFailed (names can vary)
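
For Option A, the saga can be thought of as a small state machine per ticket. The Python sketch below uses the ticket id as correlation id and ignores events that do not fit the current state, which also makes duplicates harmless. The state names, the `RefundTicket` command name, and the `"system"` reason category are assumptions for this example.

```python
class RefundSaga:
    """State of one auto-refund saga, correlated by ticket id."""

    def __init__(self, ticket_id):
        self.correlation_id = ticket_id
        self.state = "AwaitingEntry"
        self.commands = []  # commands the orchestrator would publish

    def on_event(self, event_type, reason=None):
        if self.state == "AwaitingEntry" and event_type == "EntryGranted":
            self.state = "Completed"
        elif self.state == "AwaitingEntry" and event_type == "EntryDenied" and reason == "system":
            self.commands.append("RefundTicket")  # compensating action
            self.state = "Refunding"
        elif self.state == "Refunding" and event_type == "TicketRefunded":
            self.state = "Compensated"
        # Any other combination (duplicate or late event) leaves the state unchanged.
        return self.state


sagas = {}  # correlation id -> saga instance


def dispatch(event):
    """Route an event to its saga instance via the correlation id."""
    ticket_id = event["ticketId"]
    if event["type"] == "TicketPurchased":
        if ticket_id not in sagas:
            sagas[ticket_id] = RefundSaga(ticket_id)
        return sagas[ticket_id].state
    saga = sagas.get(ticket_id)
    if saga is None:
        return None  # no saga known for this ticket
    return saga.on_event(event["type"], event.get("reason"))


states = [
    dispatch({"type": "TicketPurchased", "ticketId": "T-1"}),
    dispatch({"type": "EntryDenied", "ticketId": "T-1", "reason": "system"}),
    dispatch({"type": "EntryDenied", "ticketId": "T-1", "reason": "system"}),  # duplicate
    dispatch({"type": "TicketRefunded", "ticketId": "T-1"}),
]
```

Each state transition is a natural place to publish one of the saga status events listed above.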

MS11 — AddCallbackService (REST or gRPC)

Tag: MS11-AddCallbackService

Goal (why): Demonstrate safe synchronous calls between services for missing context.

Requirements

Implement CallbackService as a dedicated “facade” that performs at least one of these:
- Provide TicketService ticket validity details to AccessControlService (fallback check)
- Provide ScheduleService stage/capacity config details to CrowdMonitorService
- Provide enriched info to Orchestrator (e.g., stage name, artist name) for notifications
Must use either REST or gRPC (your choice).
Must include:
- timeouts
- failure handling (what if the call fails?)
- caching (optional): minimal caching is allowed, but must be documented
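
The three concerns above (timeout, failure handling, caching) can be combined in a small wrapper around the remote call. The Python sketch below is one possible shape: the class `CachedLookup`, the `fetch(key, timeout=...)` callable, and the default values are assumptions, and the remote call is faked so the example runs without a network.

```python
import time


class CachedLookup:
    """Wraps a remote call with a time budget, a TTL cache, and a fallback."""

    def __init__(self, fetch, timeout_seconds=0.5, ttl_seconds=30.0, clock=time.monotonic):
        self.fetch = fetch            # callable(key, timeout=...) doing the remote call
        self.timeout = timeout_seconds
        self.ttl = ttl_seconds
        self.clock = clock
        self.cache = {}               # key -> (stored_at, value)

    def get(self, key):
        """Return (value, origin) where origin says where the value came from."""
        now = self.clock()
        cached = self.cache.get(key)
        if cached is not None and now - cached[0] < self.ttl:
            return cached[1], "cache"
        try:
            value = self.fetch(key, timeout=self.timeout)
        except (TimeoutError, ConnectionError):
            if cached is not None:
                return cached[1], "stale-cache"  # degrade gracefully with old data
            return None, "unavailable"           # caller must decide what to do
        self.cache[key] = (now, value)
        return value, "remote"


calls = []


def fake_fetch(stage_id, timeout):
    """Stand-in for the call to the Callback Service; fails from the second call on."""
    calls.append(stage_id)
    if len(calls) > 1:
        raise TimeoutError("callback service did not answer in time")
    return {"stageId": stage_id, "capacity": 500}


ticks = iter([0.0, 10.0, 100.0])  # fake clock so the example is deterministic
lookup = CachedLookup(fake_fetch, clock=lambda: next(ticks))
first = lookup.get("stage-a")
second = lookup.get("stage-a")
third = lookup.get("stage-a")
```

Whether serving stale data is acceptable depends on the use case: it may be fine for a stage name, but questionable for ticket validity.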

MS12 — Containerize Whole Application (Docker + Compose)

Tag: MS12-ContainerizeDockerCompose

Goal (why): Make the system runnable the same way on any machine.

Requirements

Each service and the web app must have a Dockerfile.
Provide docker-compose.yml that starts:
- RabbitMQ
- Databases (as needed)
- All services
- Gateway
- Web app
Compose must expose:
- Web app URL
- RabbitMQ management UI (optional but helpful)
Provide README.md steps:
- how to run with Docker Compose
- how to verify it works (a small test scenario)

MS13 — Optional Extensions

Tag: MS13-OptionalExtensions

Choose at least one:

A) Distributed logging / tracing

Add structured logging with correlation ids
Optionally add OpenTelemetry tracing and a collector

B) Improved error handling

Retry policies (with backoff) for consumers
Better DLQ inspection endpoints or dashboards

C) Frontend styling

Make the UI look like a real festival app (simple but coherent)

D) Admin tools

Add admin pages to configure stage capacities or schedule items

8. Non-Functional Requirements (All Milestones)

Version control & discipline

Use Git from day one.
Commit messages must be meaningful.
Create each tag on the final commit of its milestone.

Service boundaries

No “shared database”.
Avoid sharing domain models directly between services.
Shared project should contain only:
- minimal event contracts (or event names)
- shared serialization helpers
- common small utilities (e.g., correlation id helpers)

Observability (minimum)

Each service logs:
- when it publishes an event (type + id + correlation id)
- when it consumes an event (type + id + outcome)
- when it rejects a request (why)

Reliability mindset

Assume:
- messages can be delivered more than once
- messages can arrive out of order
- a service can be down temporarily
- a publish can fail
Design behaviors that make the system stable under these conditions.

9. Definition of Done (System Demo)

At the end (MS12 or MS13), you must be able to demonstrate this scenario:

Open the WebApp
Purchase a ticket
Scan entry (granted)
Observe live notifications and occupancy updates
Trigger at least one failure path (entry denied / refund saga / capacity warning)
Show that messages are still delivered reliably (e.g., via outbox or retry/DLQ behavior)
Run the entire system via docker compose up

10. Deliverables Checklist

Working code with all milestones tagged
docs/ includes event definitions and correlation strategy
README.md includes run instructions (local + Docker Compose)
System demonstrates asynchronous messaging + reliability patterns

Quick note on naming

You may rename services, endpoints, and event names.
However, you must keep the same architectural responsibilities and milestone outcomes.
