Apr 19, 2024  
2018-2019 Undergraduate Academic Catalog [ARCHIVED CATALOG]


CS 3860 - Database Systems

3 lecture hours, 2 lab hours, 4 credits
Course Description
This course introduces the theory and practice of modern database design and application, with an emphasis on data modeling and data-driven application design. Relational and non-relational data models, cloud and distributed storage systems, database performance, writing effective database queries, and current topics in database systems are introduced. Topics include entity-relationship modeling; relational algebra; relational, dimensional, and non-relational data models; normalization techniques; SQL; data-driven application design; authentication and access control; transactions, concurrency, and performance optimization; the CAP theorem; and distributed and cloud computing environments. Lab assignments reinforce class topics. (prereq: CS 2852, MA 2310)
Course Learning Outcomes
Upon successful completion of this course, the student will:
  • Be able to design database models using entity-relationship, relational, dimensional, and non-relational data models.
  • Understand how to refine a database design using normalization techniques.
  • Understand the database properties of atomicity, consistency, isolation, and durability (ACID).
  • Understand the concepts and tradeoffs of consistency, availability, and partition tolerance in distributed database systems.
  • Be able to use database languages (e.g., SQL) for querying, manipulating, and basic management of databases.
  • Be able to use, and know when to apply, scripting languages such as Python and the supporting packages NumPy and pandas for basic data processing and analysis.
  • Be able to design and deploy scalable, distributed database and data processing applications on platforms such as Apache Hadoop and Spark.
  • Be able to describe the CAP theorem and apply it to proper database selection and design.
  • Be able to describe the purposes and typical mechanisms used to maintain data integrity relating to protecting existence, maintaining quality, and ensuring confidentiality.
  • Be aware of modern trends in the area of database systems. 

Prerequisites by Topic
  • Understand and apply data structures and algorithms
  • Use appropriate algorithms (and associated data structures) to solve complex problems
  • Be able to analyze the time complexity of algorithms, both sequential and recursive
  • Be able to use data structures in software design and implementation

Course Topics
  • Introduction to Database Systems
  • Database design and introduction to ER modeling 
  • ER modeling continued 
  • SQL
  • Schema refinement and normal forms 
  • Anomalies, integrity constraints, and normal forms 
  • Relational algebra 
  • Transactions 
  • ACID 
  • Security 
  • Storage, retrieval, and indexing 
  • Scripting languages for data processing and analysis
  • Big Data Storage in Modern Databases, CAP Theorem
  • Hadoop ecosystem and MapReduce 
  • Big Data Storage in Modern Databases, CAP Theorem
  • Hadoop storage systems: distributed, non-relational database systems; Hive
  • Dimensional Modeling 
  • Apache Spark, data processing, RDDs
  • Apache Spark, dataframes, SQL, data analysis

Laboratory Topics
  • Introduction to database systems
  • Data modeling
  • Building a relational database with SQL
  • Query evaluation, applying integrity constraints
  • Data analytics application with SQL
  • Indexing and storage
  • Transactions and integrity constraints tutorial
  • Python, NumPy, and pandas
  • MapReduce SQL
  • Apache Hive
  • Apache Spark, ETL, RDDs (resilient distributed datasets), distributed data processing
  • Apache Spark, dataframes, SQL, data analysis

Coordinator
Jay Urbain


