FE512 Database Engineering
Course Catalog Description
Introduction
Campus | Fall | Spring | Summer |
---|---|---|---|
On Campus | X | ||
Web Campus |
Instructors
Professor | Office | |
---|---|---|
Ionut Florescu | ifloresc@stevens.edu | Babbio 544 |
Edward Loeser | eloeser@stevens.edu | Babbio 109 |
More Information
Course Description
The course covers basic and advanced knowledge of SQL using PostgreSQL. Python and R will be used alongside SQL to interact with data systems. Additional topics will be introduced.
Course Outcomes
At the end of this course, students will be able to: evaluate a data problem and implement a simple software stack to solve it; connect to databases, call data APIs, and manipulate data using a programming language; use SQL to write DDL and DML statement
The objective of the course is for students to:
• build fundamental skills in writing programs to manipulate data and interact with data APIs
• learn to write more complex, data-intensive programs
• understand basic database concepts
• become comfortable researching, evaluating, and implementing open-source software tools
• gain a basic understanding of the large and rapidly changing data storage technology landscape
Course Resources
Prerequisites and Textboks
There are no course prerequisites. Students will need to be comfortable with Python, SQL, and text-based user interfaces. There is no required textbook. However we present a list of reference books for the course:
• Designing Data-Intensive Applications, Martin Kleppmann
• Options, Futures, and Other Derivatives, John C. Hull
• Effective awk Programming, Arnold Robbins
• SQL in 10 Minutes, Ben Forta
• PostgreSQL Docs , The PostgreSQL Global Development Group
Grading
Grading Policies
All Stevens, graduate students promise to be fully truthful and avoid dishonesty, fraud, misrepresentation, and deceit of any type in relation to their academic work. A student's submission of work for academic credit indicates that the work is the student's own. All outside assistance must be acknowledged. Any student who violates this code or who knowingly assists another student in violating this code shall be subject to discipline. If identical submissions are founded, the students will receive 0 point for the homework.
Grading Procedures:
• Programming Assignment 1: 20%
• Programming Assignment 2: 20%
• Programming Assignment 3: 30%
• Homework Assignments: 30%
Late Policy: All assignments will be due on the date and time posted.
Lecture Outline
Topic | Due | |
---|---|---|
SQL basics and programming interfaces | ||
Week 1 | Environment Setup, Python, R, CLI Tools, Linux, Windows PowerShell, MacOS Terminal, Shell Commands/Scripting, SSH, Generating Keys, Git, ODBC, JDBC, psycopg2 | |
Week 2 | SQL Basics, SQL Structure, pgAdmin, SQL Math | HW 1 |
Week 3 | SQL Concepts, SQL DML, PostgreSQL, Postgres Ecosystem, Using R and Python to run SQL commands | HW 2 |
Databases and data wrangling | ||
Week 4 | Databases, File Storage, File Formats, Importing / Exporting Data, JSON, XML, XBRL [Machine Readable News Case Study] | HW 3 |
Week 5 | Databases, SQL DDL, Schema Design, Normalizing Data. [Machine Readable News Case Study] | HW 4 |
Week 6 | Parsing Larger Data Files. Constraints / Technology Choices. Improving performance by indexing. [Datascope Select Case Study] | HW 5 |
Week 7 | Troubleshooting Techniques, Command Line Interface (CLI) Tools: hexdump, regex, awk, etc. [Datascope Select Case Study] | PA 1, HW 6 |
Application Programming Interfaces (API's) and Extract, Transform, Load (ETL) | ||
Week 8 | APIs, File Formats, HTTP Protocol, REST APIs. [Bloomberg, Kaiko Case Studies] | HW 7 |
Week 9 | Pulling Data from APIs, OpenAI, EDGAR [Datascope Select Case Study] | HW 8 |
Week 10 | Data Engineering Overview, ETL / ELT Concepts and Ecosystem, Introduction to Airflow, Using Python and R [Machine Readable News Case Study] | HW 9 |
Containerization tools and more ETL | ||
Week 11 | DevOps Overview, Docker, Docker Compose, Kubernetes | HW 10, PA 2 |
Week 12 | Airflow Concepts, Running Airflow Locally [Machine Readable News Case Study] | HW 11 |
Week 13 | Airflow Environment, Writing Directed Acyclic Graphs (DAGs) [Machine Readable News Case Study] | HW 12 |
Week 14 | Writing and Debugging DAGs [Machine Readable News Case Study] | PA 3 |