Database I, année 2022
Database 1: advanced databases and efficient querying.
- 24 hours, 6 lectures 6 exercises
- Education cycle: Second cycle, Master Program 1st year
- Main field of study: Computer Science and Engineering, Applied Mathematics
Lectures objective
At the end of the course, a successful student should be able to master the main techniques and algorithms that allow relational databases to handle efficiently large amount of data. These methods form the bases of systems that handle larger amounts of data. After attending this course the successful student will be able to:
- Understand the relationship between relational algebra, logic, SQL,
- Manipulate a common db-engine (PostgreSQL)
- Write queries,
- Use methods to improve the performance of a database,
- Know various structures of indexes and when to use them,
- Understand how queries are compiled, optimized and executed,
- Tune queries so as to make them more efficient.
Evaluations
- One graded homework
- One terminal evaluation on machine (on the last lecture)
Temporary plan
Introduction to relational databases
General and historical presentations
Relational algebra
- Codd’s relational model of data
- Set vs Bag semantic
- Joins
- Null, Three-valued logic and more
Reminder about SQL
- From Relations to Table
- Simple queries
- Joins syntax
- Nesting and aliasing
(Simple) SQL Schemas
- Types
- Constraints
- View vs Queries
Lab:
- SQLite minimal presentation
- PostgreSQL minimal presentation
- Starting with psql
- Playing with PostgreSQL with Python
Data-manipulation and Transactions
Logical constraints:
- Within table constraints
- Foreign-key constraints
- Generated values
Transactions
- Concurrency point of view
- Defferable constraints
ACID
- WAL and durability
- Definitions in details
- Impact on physical layout
Views
- Temporary table
- (Materialized) View
- Triggering
Lab
- Playing with transaction levels
Simple Querying
PostgreSQL architectures
- Client/Server architectures
- Processes to share the load
- Performances outcome
- Query Life Cycle
Physical layouts vs Logical layout
- Clusertering
- Tweaking physical layouts
- Toasted values
Indexing (without details)
- DataStructures to accelerate search
- Selectivity
Joinfree Query cost-model
- data density
- cost models
- Statistical aggregates
- A full scan cost
Lab (A study of join-free querying)
- Join-free queries in Python: building from scratch
- Simple QueryPlan in PostgreSQL: playing with the cost-model
DataStructure for Databases
Variation around B-trees
GIN
bloom filters (if time)
Lab
- TBA
One of those:
- Scalability of relational databases
- Advance querying
- Deploying Python within PostgreSQL
Compiled the: mer. 04 sept. 2024 12:49:42 CEST