Database I, année 2022

Database 1: advanced databases and efficient querying.

Lectures objective

At the end of the course, a successful student should be able to master the main techniques and algorithms that allow relational databases to handle efficiently large amount of data. These methods form the bases of systems that handle larger amounts of data. After attending this course the successful student will be able to:

Evaluations

Temporary plan

  1. Introduction to relational databases

    • General and historical presentations

    • Relational algebra

      • Codd’s relational model of data
      • Set vs Bag semantic
      • Joins
      • Null, Three-valued logic and more
    • Reminder about SQL

      • From Relations to Table
      • Simple queries
      • Joins syntax
      • Nesting and aliasing
    • (Simple) SQL Schemas

      • Types
      • Constraints
      • View vs Queries
    • Lab:

      • SQLite minimal presentation
      • PostgreSQL minimal presentation
      • Starting with psql
      • Playing with PostgreSQL with Python
  2. Data-manipulation and Transactions

    • Logical constraints:

      • Within table constraints
      • Foreign-key constraints
      • Generated values
    • Transactions

      • Concurrency point of view
      • Defferable constraints
    • ACID

      • WAL and durability
      • Definitions in details
      • Impact on physical layout
    • Views

      • Temporary table
      • (Materialized) View
      • Triggering
    • Lab

      • Playing with transaction levels
  3. Simple Querying

    • PostgreSQL architectures

      • Client/Server architectures
      • Processes to share the load
      • Performances outcome
      • Query Life Cycle
    • Physical layouts vs Logical layout

      • Clusertering
      • Tweaking physical layouts
      • Toasted values
    • Indexing (without details)

      • DataStructures to accelerate search
      • Selectivity
    • Joinfree Query cost-model

      • data density
      • cost models
      • Statistical aggregates
      • A full scan cost
    • Lab (A study of join-free querying)

      • Join-free queries in Python: building from scratch
      • Simple QueryPlan in PostgreSQL: playing with the cost-model
  4. DataStructure for Databases

    • Variation around B-trees

    • GIN

    • bloom filters (if time)

    • Lab

      • TBA

One of those:

  1. Scalability of relational databases
  2. Advance querying
  3. Deploying Python within PostgreSQL

Mastodon