Database I, année 2022
Database 1: advanced databases and efficient querying.
- 24 hours, 6 lectures 6 exercises
- Education cycle: Second cycle, Master Program 1st year
- Main field of study: Computer Science and Engineering, Applied Mathematics
Lectures objective
At the end of the course, a successful student should be able to master the main techniques and algorithms that allow relational databases to handle efficiently large amount of data. These methods form the bases of systems that handle larger amounts of data. After attending this course the successful student will be able to:
- Understand the relationship between relational algebra, logic, SQL,
- Manipulate a common db-engine (PostgreSQL)
- Write queries,
- Use methods to improve the performance of a database,
- Know various structures of indexes and when to use them,
- Understand how queries are compiled, optimized and executed,
- Tune queries so as to make them more efficient.
Evaluations
- One graded homework
- One terminal evaluation on machine (on the last lecture)
Temporary plan
- Introduction to relational databases - General and historical presentations 
- Relational algebra - Codd’s relational model of data
- Set vs Bag semantic
- Joins
- Null, Three-valued logic and more
 
- Reminder about SQL - From Relations to Table
- Simple queries
- Joins syntax
- Nesting and aliasing
 
- (Simple) SQL Schemas - Types
- Constraints
- View vs Queries
 
- Lab: - SQLite minimal presentation
- PostgreSQL minimal presentation
- Starting with psql
- Playing with PostgreSQL with Python
 
 
- Data-manipulation and Transactions - Logical constraints: - Within table constraints
- Foreign-key constraints
- Generated values
 
- Transactions - Concurrency point of view
- Defferable constraints
 
- ACID - WAL and durability
- Definitions in details
- Impact on physical layout
 
- Views - Temporary table
- (Materialized) View
- Triggering
 
- Lab - Playing with transaction levels
 
 
- Simple Querying - PostgreSQL architectures - Client/Server architectures
- Processes to share the load
- Performances outcome
- Query Life Cycle
 
- Physical layouts vs Logical layout - Clusertering
- Tweaking physical layouts
- Toasted values
 
- Indexing (without details) - DataStructures to accelerate search
- Selectivity
 
- Joinfree Query cost-model - data density
- cost models
- Statistical aggregates
- A full scan cost
 
- Lab (A study of join-free querying) - Join-free queries in Python: building from scratch
- Simple QueryPlan in PostgreSQL: playing with the cost-model
 
 
- DataStructure for Databases - Variation around B-trees 
- GIN 
- bloom filters (if time) 
- Lab - TBA
 
 
One of those:
- Scalability of relational databases
- Advance querying
- Deploying Python within PostgreSQL
Compiled the: ven. 05 sept. 2025 16:27:42 CEST