
CSL4030: Data Engineering

Jul-Dec, 2023



Lectures

Event Date Description References
Lecture 1 Jul 31 Who are Data Engineers? [ Slides ]
Lectures 2, 3 Aug 4, 7 What is Big Data? [ Slides ]
Lecture 4 Aug 9 What is a Data Model? [ Slides ]
Lecture 5 Aug 11 A Brief History of Data Models Part I [ Chap 2, Kleppmann ]
Lecture 6 Aug 14 A Brief History of Data Models Part II [ Chap 2, Kleppmann ]
Lecture 7 Aug 16 An Introduction to the Relational Data Model [ Slides ]
Lecture 8 Aug 18 Normalization and Denormalization of Relations [ Slides ]
Lecture 9 Aug 21 #NoSQL: The Document Data Model [ Slides ]
Lecture 10 Aug 23 #NoSQL: The Graph-like Data Model [ Slides ]
Lectures 11, 12 Aug 25, 28 Distributed Data Storage and Management Parts I, II: Introduction [ Slides ]
Lecture 13 Aug 30 Distributed Data Storage and Management Part III: Global Transactions [ Slides ]
Lecture 14 Sep 1 Distributed Data Storage and Management Part IV: Commit Protocols [ Slides ]
Minor-1 Sep 6 Minor-1 Exam [ Question Paper, Answer Outlines, Feedback ]
Lecture 15 Sep 11 Distributed Data Storage and Management Part V: Persistent Messaging [ Slides ]
Expert's Talk Sep 12 'Vitess Demystified: Navigating the World of Distributed Databases' by Manan Gupta from PlanetScale [ Video, Slides ]
Lectures 16, 17, 18 Sep 13, 15, 18 Distributed Data Storage and Management Part VI: Concurrency Control [ Slides ]
Lectures 19, 20, 21 Sep 20, 22, 27 Distributed Data Storage and Management Part VII: Ensuring Availability [ Slides ]
Lectures 22, 23 Sep 29, Oct 4 Distributed Data Storage and Management Part VIII: Heterogeneous Distributed Databases [ Slides ]
Lecture 24 Oct 6 Distributed Data Storage and Management Part IX: Directory Access Protocols [ Slides ]
Lectures 25, 26 Oct 9, 11 Hashing and Indexing [ Slides ]
Minor-2 Oct 16 Minor-2 Exam [ Question Paper, Answer Outlines ]
Lectures 27, 28, 29 Oct 13, 20, 26 Data Warehousing (OLTP, OLAP) [ Slides ]
Lecture 30 Oct 27 Distributed Query Processing [ Slides ]
Lectures 31-35 Oct 30, Nov 1, 6, 8, 10 Query Optimization [ Slides ]
Lectures 36-41 Nov 15, 17, 20, 22, 24, 26 Streaming Data Analytics [ Slides ]

Labs

Event Date Description References
Lab 1 Aug 4 Data Collection Techniques [ Slides ]
Lab 2 Aug 9 SQL queries with PostgreSQL [ Material ]
Lab 3 Aug 16 ERWin Data Modeler, Database Normalization [ Material, ERWin Manual ]
Lab 4 Aug 23 SQL Joins [ Slides ]
Lab 5 Aug 30 Database Transactions [ Slides ]
Lab 6 Sep 13 Document-oriented database management with MongoDB [ Slides ]
Lab 7 Sep 27 Data encoding with XML [ Slides ]
Lab 8 Oct 4 JSON [ Slides ]
Lab 9 Oct 11 LDAP [ Slides ]

Text and Reference Book(s)

  1. Kleppmann: M. KLEPPMANN (2017), Designing Data-Intensive Applications, O’Reilly.
  2. Korth: A. SILBERSCHATZ, H.F. KORTH, S. SUDARSHAN (2011), Database System Concepts, McGraw Hill Publications, 6th Edition.
  3. Navathe: R. ELMASRI, S.B. NAVATHE (2017), Fundamentals of Database Systems, Pearson Education, 7th Edition.
  4. Raj & Raman: P. RAJ, A. RAMAN, D. NAGARAJ, S. DUGGIRALA (2015), High-Performance Big-Data Analytics: Computing Systems and Approaches, Springer, 1st Edition.
  5. Reis & Housley: J. REIS, M. HOUSLEY (2022), Fundamentals of Data Engineering, O'Reilly Media, Inc., ISBN: 9781098108304.

Similar Courses

Data Engineering Conferences

Data Engineering Podcasts