Assignment: Backup & Restore for MySQL and MongoDB

Topic: Database backup and restore using Docker Deliverable: A written report (Protokoll) documenting your work Databases: MySQL and MongoDB (you do the full exercise for both)


1. Learning objectives

After this assignment you should be able to:

  • Run a MySQL and a MongoDB instance in Docker for development.
  • Create a full backup of each database and restore it.
  • Explain the difference between full, incremental, and differential backups.
  • Recognise why backups matter by deliberately corrupting data and recovering it.
  • Document a technical procedure cleanly so that someone else could reproduce it.

2. Background

2.1 Why back up at all?

A backup is a copy of your data from a known point in time. You need it for at least three situations: hardware/storage failure, a software or human error (the classic DELETE without a WHERE), and malicious damage (ransomware, a disgruntled user). A backup that you have never tried to restore is not a backup — it is a hope. That is why the practical part of this assignment makes you actually restore.

A useful rule of thumb is the 3-2-1 rule: keep 3 copies of your data, on 2 different kinds of media, with 1 copy off-site.

2.2 Full vs. incremental vs. differential

TypeWhat it storesBackup speed/sizeRestore complexity
FullThe complete dataset every timeSlow, largeSimplest — restore one backup
IncrementalOnly the changes since the last backup of any kindFast, smallRestore the last full + every increment since, in order
DifferentialOnly the changes since the last full backupMediumRestore the last full + the latest differential

The trade-off is always backup cost vs. restore cost. A full backup is expensive to take but trivial to restore. An incremental backup is cheap to take, but to restore you need an unbroken chain: the full backup plus all increments in sequence. If one increment is missing or corrupt, the chain breaks.

In this assignment the demonstrations below use full backups, because they map directly onto mysqldump / mongodump. Incremental backups are covered conceptually in section 5, with an optional practical part.

2.3 How each database implements backups

MySQL

  • Logical full backup: mysqldump writes the schema and data out as SQL statements (a .sql text file). Portable and human-readable, but slower to restore on large datasets.
  • Physical full backup: copying the data files directly (e.g. with Percona XtraBackup). Faster for large databases, less portable. Out of scope here.
  • Incremental: MySQL achieves this with the binary log (binlog), which records every change. A full backup plus the binlogs that followed it allow point-in-time recovery (PITR).

MongoDB

  • Logical full backup: mongodump writes a BSON dump per collection; mongorestore reads it back.
  • Physical full backup: copying the data directory or filesystem snapshots. Out of scope here.
  • Incremental: MongoDB uses the oplog (operations log), which only exists in a replica set. mongodump --oplog plus oplog replay enables point-in-time recovery.

3. Environment setup (Docker)

You already host development databases in Docker, so this should be familiar. Create a working folder for the assignment and run both containers from there.

mkdir -p ~/backup-lab/mysql ~/backup-lab/mongo
cd ~/backup-lab

3.1 MySQL container

docker run --name mysql-lab \
  -e MYSQL_ROOT_PASSWORD=secret \
  -e MYSQL_DATABASE=shop \
  -p 3306:3306 \
  -d mysql:8

Open a SQL shell inside the container:

docker exec -it mysql-lab mysql -uroot -psecret shop

3.2 MongoDB container

docker run --name mongo-lab \
  -p 27017:27017 \
  -d mongo:7

Open a Mongo shell inside the container:

docker exec -it mongo-lab mongosh shop

Tip: Keep the password (secret) the same everywhere so your commands stay copy-pasteable. This is a throwaway lab database — do not use weak passwords like this in real systems.


4. The demonstration you must produce

For each database (MySQL and MongoDB), you will walk through the following six steps and document every one. The story it tells: we take backups while everything is fine, then we break the data and prove we can get the good version back.

  1. Initialize Data — create a table/collection and insert a handful of rows/documents.
  2. Backup #1 — take a full backup of the initial state.
  3. Modify data (still correct) — make a legitimate change (e.g. add a product, update a price). Verify the data is correct.
  4. Backup #2 — take a second full backup of this newer, still-correct state.
  5. Modify data (wrong / corrupt) — deliberately damage the data (e.g. DELETE everything, set all prices to 0, drop a collection). Verify it is broken.
  6. Restore — restore from Backup #2 and verify the good data is back.

At every step, capture the data state (a SELECT / find() and its output, or a screenshot). Those outputs are the evidence in your report.


4.1 MySQL walkthrough

Step 1 — Initialize Data (run inside the MySQL shell):

CREATE TABLE products (
  id    INT PRIMARY KEY AUTO_INCREMENT,
  name  VARCHAR(100) NOT NULL,
  price DECIMAL(8,2) NOT NULL,
  stock INT NOT NULL
);
 
INSERT INTO products (name, price, stock) VALUES
  ('Keyboard', 29.90, 50),
  ('Mouse',    15.50, 80),
  ('Monitor', 199.00, 12);
 
SELECT * FROM products;   -- evidence: initial state

Step 2 — Backup #1 (run on your host, not inside the shell):

docker exec mysql-lab mysqldump -uroot -psecret shop > ~/backup-lab/mysql/backup1.sql

Step 3 — Modify (still correct):

INSERT INTO products (name, price, stock) VALUES ('Webcam', 45.00, 25);
UPDATE products SET price = 27.90 WHERE name = 'Keyboard';
SELECT * FROM products;   -- evidence: 4 rows, keyboard now 27.90

Step 4 — Backup #2:

docker exec mysql-lab mysqldump -uroot -psecret shop > ~/backup-lab/mysql/backup2.sql

Step 5 — Modify (wrong / corrupt): simulate a disaster.

DELETE FROM products;     -- oops, forgot the WHERE clause
SELECT * FROM products;   -- evidence: empty table — data lost!

Step 6 — Restore from Backup #2:

docker exec -i mysql-lab mysql -uroot -psecret shop < ~/backup-lab/mysql/backup2.sql

Then verify in the shell:

SELECT * FROM products;   -- evidence: the 4 correct rows are back

Note on restore behaviour: mysqldump files contain DROP TABLE IF EXISTS / CREATE TABLE statements, so restoring cleanly replaces the corrupted table. Explain in your report why you restored from Backup #2 and not Backup #1.


4.2 MongoDB walkthrough

Step 1 — Initialize Data (run inside mongosh):

db.products.insertMany([
  { name: "Keyboard", price: 29.90, stock: 50 },
  { name: "Mouse",    price: 15.50, stock: 80 },
  { name: "Monitor",  price: 199.00, stock: 12 }
]);
 
db.products.find();   // evidence: initial state

Step 2 — Backup #1 (run on your host). mongodump writes into the container, then we copy it out:

docker exec mongo-lab mongodump --db=shop --out=/dump/backup1
docker cp mongo-lab:/dump/backup1 ~/backup-lab/mongo/backup1

Step 3 — Modify (still correct):

db.products.insertOne({ name: "Webcam", price: 45.00, stock: 25 });
db.products.updateOne({ name: "Keyboard" }, { $set: { price: 27.90 } });
db.products.find();   // evidence: 4 documents, keyboard now 27.90

Step 4 — Backup #2:

docker exec mongo-lab mongodump --db=shop --out=/dump/backup2
docker cp mongo-lab:/dump/backup2 ~/backup-lab/mongo/backup2

Step 5 — Modify (wrong / corrupt):

db.products.deleteMany({});   // oops — wiped the whole collection
db.products.find();           // evidence: nothing left

Step 6 — Restore from Backup #2: copy the dump back in, then restore with --drop so the corrupted collection is replaced:

docker cp ~/backup-lab/mongo/backup2 mongo-lab:/dump/restore2
docker exec mongo-lab mongorestore --db=shop --drop /dump/restore2/shop

Then verify in mongosh:

db.products.find();   // evidence: the 4 correct documents are back

--drop matters: without it, mongorestore adds documents instead of replacing, which can leave you with duplicates or stale data. Explain this choice in your report.


5. Incremental backups (theory + optional practical)

You are not required to fully implement incremental backups, but you must explain them in your report. Doing the practical part earns bonus marks.

5.1 MySQL — binary log (point-in-time recovery)

The idea: take a full backup, then let MySQL log every subsequent change to the binary log. To recover to any moment, restore the full backup and replay the binlog up to that point.

To enable the binary log, start MySQL with logging on:

docker run --name mysql-lab \
  -e MYSQL_ROOT_PASSWORD=secret -e MYSQL_DATABASE=shop \
  -p 3306:3306 -d mysql:8 \
  --log-bin=mysql-bin --server-id=1

Inspect and replay logs with mysqlbinlog:

docker exec mysql-lab sh -c "ls /var/lib/mysql | grep mysql-bin"
# Replay a binlog file into the server:
docker exec mysql-lab sh -c "mysqlbinlog /var/lib/mysql/mysql-bin.000001 | mysql -uroot -psecret"

In your report, explain: restoring = full backup + all binlogs since, replayed in order. Stopping the replay just before the bad statement is how you “undo” an accident without losing later good changes.

5.2 MongoDB — oplog (replica set required)

MongoDB’s incremental mechanism, the oplog, only exists when the server runs as a replica set. With it, mongodump --oplog captures a consistent snapshot, and mongorestore --oplogReplay replays operations for point-in-time recovery.

Start a single-node replica set:

docker run --name mongo-lab -p 27017:27017 -d mongo:7 --replSet rs0
docker exec mongo-lab mongosh --eval "rs.initiate()"

In your report, explain: why the oplog needs a replica set, and how it is the conceptual equivalent of MySQL’s binary log.


6. The report (Protokoll)

Hand in one report covering both databases. Markdown or PDF is fine. It must let a classmate reproduce your work exactly.

Required structure

  1. Title page / header — your name, class, date.
  2. Environment — Docker version, the exact docker run commands you used.
  3. Theory (in your own words, ~½ page):
    • Difference between full, incremental, and differential backups.
    • How MySQL and MongoDB implement full and incremental backups.
  4. MySQL protocol — for each of the six steps:
    • The command(s) you ran.
    • The output / a screenshot showing the data state.
    • One sentence saying what the step proves.
  5. MongoDB protocol — same six-step structure.
  6. Incremental backups — your explanation (and the optional practical, if attempted).
  7. Reflection — answer the questions in section 7.

Documentation rules

  • Show commands and their output, not just commands. The output is your proof.
  • Make the corruption step and the recovery unmistakable — the reader should clearly see the data was broken and then came back.
  • State which backup you restored from and why.

7. Reflection questions (answer in the report)

  1. Why did you restore from Backup #2 instead of Backup #1? What would you have lost with each?
  2. What is the practical risk of restoring a backup over a live database?
  3. You take a full backup every Sunday and incrementals every other day. The database dies on Friday. Which files do you need to restore Friday’s state, and in what order?
  4. Why is an untested backup dangerous? How would you test backups regularly?
  5. How would you automate these backups so they run nightly without you (e.g. cron)? Sketch the idea.
  6. The lab used the password secret and exposed ports on localhost. Name two things you would change for a production system.

8. Hand-in & assessment

  • Submit: the report file. Including your backup files / a script is welcome but optional.
  • Deadline: (set by your teacher).
CriterionPoints
MySQL six-step demonstration documented with evidence30
MongoDB six-step demonstration documented with evidence30
Theory: full vs. incremental/differential, correctly explained15
Reflection questions answered15
Report clarity & reproducibility10
Bonus: incremental backup implemented (binlog or oplog)+10

9. Cleanup

When you are finished, remove the lab containers so they don’t linger:

docker rm -f mysql-lab mongo-lab

10. Troubleshooting

  • Access denied on MySQL: the password must match MYSQL_ROOT_PASSWORD. There is no space between -p and the password: -psecret.
  • MySQL “not ready” right after start: the container needs a few seconds to initialise on first run. Wait, then retry.
  • mongorestore made duplicates: you forgot --drop. Restore replaces only when you tell it to.
  • Restore points to the wrong folder: mongorestore expects the path to the database folder inside the dump (e.g. .../shop), not the dump root.
  • Empty mysqldump file: check you redirected on the host with >, and that the database name (shop) is correct.