BDA Exam 2025

Submission of TP1

Prepare your TP1 work to be submitted.

You must provide a single archive (zip or tar.gz) containing:

You may take some time to clean up your TP, but be careful: that is time you will not have for the rest of the exam.

The grade will take into account the quality of both the exam submission and the TP.

The submission must be sent by email.

Partitioning of the Two Tables

We will start again from TD1, setting up a mechanism with multiple writable databases that share the messages table.

If you have not yet run the preparatory script for this exam, do so first: download this script and run it on a fresh VM created for the occasion.

This script will:

  1. Prepare the VM and give access to the graders.
  2. Install several packages.
  3. Make three PostgreSQL instances coexist on the same machine.

The databases will coexist on ports 5432, 5433, and 5434.
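
As an illustration, each of the three instances can be described by its own connection string. The sketch below is written in Python, which is an assumption since the language of the TD1 API is not stated here. The host, database name, and user are hypothetical placeholders; only the three ports come from the text above.

```python
# Ports taken from the exam statement.
PORTS = [5432, 5433, 5434]


def dsn_for(port, host="localhost", dbname="postgres", user="postgres"):
    """Build a libpq-style connection string for one instance.

    host, dbname and user are assumed defaults, adjust them to your VM.
    """
    return f"host={host} port={port} dbname={dbname} user={user}"


# One connection string per instance, in a fixed order (index 0, 1, 2).
DSNS = [dsn_for(port) for port in PORTS]
```

Keeping the instances in a fixed, indexed order makes it possible to refer to them by number (0, 1, 2) in a distribution strategy.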

CREATE TABLE channel(
    name TEXT PRIMARY KEY, 
    description TEXT, 
    created_date TIMESTAMP default now()
);

CREATE TABLE messages(
    uuid TEXT PRIMARY KEY, 
    channel_name TEXT REFERENCES channel(name),
    pseudo TEXT, 
    message TEXT, 
    inserted TIMESTAMP default now()
);

The goal of this TD-exam is to distribute each of the two tables across three PostgreSQL instances. Since the number of instances is fixed, we do not need to implement any elasticity algorithm and can use a naive approach.

Question 1. Row Distribution for Each Table

Each table must be distributed across the three servers in a non-elastic way. We will not add any more databases later.

Describe a row distribution strategy for both tables across the three instances in your Readme file, ensuring that each row is present in exactly two PostgreSQL instances (replication factor of two).

Will your strategy evenly distribute rows across all databases?
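
One possible naive strategy, given here only as a sketch and not as the expected answer, is to hash a partitioning key modulo three and store the row on that instance and on the next one. The function names and the choice of SHA-256 are assumptions made for illustration. Using the channel name as the key for both tables would keep a message on the same instances as its channel (so the foreign key can be satisfied locally), at the cost of an uneven load when some channels are much busier than others.

```python
import hashlib
from collections import Counter

N_INSTANCES = 3          # fixed by the exam statement
REPLICATION_FACTOR = 2   # each row must live on exactly two instances


def replicas_for(key: str) -> list[int]:
    """Return the indices (0..2) of the instances that store `key`."""
    # hashlib is stable across processes, unlike the built-in hash() on str.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    primary = int.from_bytes(digest[:8], "big") % N_INSTANCES
    # The primary instance plus the following one(s), wrapping around.
    return [(primary + i) % N_INSTANCES for i in range(REPLICATION_FACTOR)]


def load_per_instance(keys):
    """Count how many rows each instance would hold for the given keys."""
    counts = Counter()
    for key in keys:
        counts.update(replicas_for(key))
    return counts
```

Calling load_per_instance on a sample of real keys gives a quick first answer to whether the rows are spread evenly.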

Question 2. Replication Factor

What are the advantages and disadvantages of a replication factor of 2 compared to other possible values?
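
To make the comparison concrete, the arithmetic behind each possible value can be tabulated. The sketch below is an illustration only: the function and field names are hypothetical, and the per-instance fraction assumes rows are spread evenly.

```python
N_INSTANCES = 3  # fixed by the exam statement


def replication_profile(rf: int, n: int = N_INSTANCES) -> dict:
    """Summarise the basic costs and guarantees of a replication factor."""
    if not 1 <= rf <= n:
        raise ValueError("replication factor must be between 1 and n")
    return {
        "replication_factor": rf,
        # Every insert has to be written rf times.
        "writes_per_insert": rf,
        # A row survives as long as at least one of its copies survives.
        "instance_failures_without_data_loss": rf - 1,
        # Assumes an even distribution of rows over the instances.
        "fraction_of_rows_per_instance": rf / n,
    }


profiles = [replication_profile(rf) for rf in (1, 2, 3)]
```

With three instances, a factor of 1 offers no redundancy, a factor of 3 stores every row everywhere, and a factor of 2 sits in between.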

Question 3. Implement Your Strategy

Adapt the API you built in TD1 to implement the strategy you proposed in Question 1.

Ensure that the data returned by your API is properly deduplicated and chronologically ordered in the function retrieving the latest messages.
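
A minimal sketch of the merge step, assuming each instance has already returned its rows as dictionaries with the uuid and inserted columns of the messages table. The function name and the newest-first ordering are assumptions. Since every row exists on two instances, the same uuid is expected to appear twice and only one copy is kept.

```python
from datetime import datetime


def latest_messages(rows_per_instance, limit=10):
    """Merge rows from several instances, drop duplicates, newest first."""
    unique = {}
    for rows in rows_per_instance:
        for row in rows:
            # Keep the first copy seen for each uuid.
            unique.setdefault(row["uuid"], row)
    # Sort by timestamp, using uuid as a tie-breaker for a stable result.
    ordered = sorted(
        unique.values(),
        key=lambda r: (r["inserted"], r["uuid"]),
        reverse=True,
    )
    return ordered[:limit]


# Hypothetical sample data: each message is present on two instances.
a = {"uuid": "a", "inserted": datetime(2025, 1, 1, 10, 0)}
b = {"uuid": "b", "inserted": datetime(2025, 1, 1, 11, 0)}
c = {"uuid": "c", "inserted": datetime(2025, 1, 1, 12, 0)}
sample = latest_messages([[a, b], [b, c], [c, a]], limit=2)
```

Note that if each instance fills in inserted with its own now(), two copies of the same message can carry slightly different timestamps; passing the timestamp explicitly from the API is one way to avoid this.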

Question 4. Available System? Consistent System?

In the current state of your implementation, and without significantly modifying it, indicate (and justify) whether your system is consistent, available, both, or neither.

Question 5. Evaluate the Efficiency of Your Strategy

In your Readme, detail a methodology to estimate the efficiency of your implementation strategy.

Implement this strategy or a simplified version of it and provide the measurement results in your Readme.
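
One simple way to obtain measurements is to time an operation repeatedly and report summary statistics. The helper below is a sketch under that assumption; its name and the chosen statistics are illustrative, and the operation passed in would typically be a call to your own API.

```python
import statistics
import time


def measure(operation, repetitions=100):
    """Run `operation` repeatedly and return latency statistics in ms."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Index of the 95th percentile, clamped to the last sample.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "count": len(samples),
        "mean_ms": statistics.fmean(samples),
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
    }
```

Comparing the results for the distributed implementation against a single-instance baseline, with the same number of repetitions, gives a first estimate of the overhead introduced by the strategy.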

Question 6. Eventual Consistency

An eventual consistency strategy allows an always-available system to become consistent again when nodes reconnect, by reconciling their state. Propose, without implementing it, an eventual consistency strategy and describe it in your Readme.

Try to implement this mechanism in your API.
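
As one possible starting point, and assuming messages are only ever inserted (never updated or deleted), reconciliation between two replicas can be reduced to exchanging the rows that the other side is missing, matched by uuid. The function names below are hypothetical and the rows are plain dictionaries standing in for query results.

```python
def missing_rows(local_rows, remote_rows):
    """Rows present on the remote replica but absent locally (by uuid)."""
    local_ids = {row["uuid"] for row in local_rows}
    return [row for row in remote_rows if row["uuid"] not in local_ids]


def reconcile(replica_a, replica_b):
    """Return (rows to insert into A, rows to insert into B)."""
    return missing_rows(replica_a, replica_b), missing_rows(replica_b, replica_a)


# Hypothetical state after a network partition: each side missed one insert.
side_a = [{"uuid": "m1"}, {"uuid": "m2"}]
side_b = [{"uuid": "m1"}, {"uuid": "m3"}]
to_a, to_b = reconcile(side_a, side_b)
```

After both lists are applied, the two replicas hold the same set of uuids. If rows could also be modified or deleted, this union-based merge would no longer be sufficient.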


Compiled on: Mon. 20 Oct. 2025 09:55:53 CEST