CS 3860 - Database Systems
3 lecture hours 2 lab hours 4 credits
Course Description
This course introduces the theory and practice of modern database design and application, with an emphasis on data modeling and data-driven application design. Relational and non-relational data models, cloud and distributed storage systems, database performance, writing effective database queries, and current topics in database systems are introduced. Topics include entity-relationship modeling; relational algebra; relational, dimensional, and non-relational data models; normalization techniques; SQL; data-driven application design, authentication and access control; transactions, concurrency and performance optimization; CAP theorem; and distributed and cloud computing environments. Lab assignments reinforce class topics. (prereq: CS 2852, MA 2310)
Course Learning Outcomes
Upon successful completion of this course, the student will be able to:
- Design database models using entity-relationship, relational, dimensional, and non-relational data models.
- Understand how to refine a database design using normalization techniques.
- Understand the database properties of atomicity, consistency, isolation, and durability (ACID).
- Understand the concepts and tradeoffs of consistency, availability, and partition tolerance in distributed database systems.
- Use database languages (e.g., SQL) for querying, manipulating, and performing basic management of databases.
- Use scripting languages such as Python, with the supporting packages NumPy and Pandas, for basic data processing and analysis, and know when to apply them.
- Design and deploy scalable, distributed database and data processing applications using platforms such as Apache Hadoop and Spark.
- Describe the CAP theorem and apply it to proper database selection and design.
- Describe the purposes and typical mechanisms used to maintain data integrity relating to protecting existence, maintaining quality, and ensuring confidentiality.
- Be aware of modern trends in the area of database systems.
Prerequisites by Topic
- Understand and apply data structures and algorithms
- Use appropriate algorithms (and associated data structures) to solve complex problems
- Be able to analyze the time complexity of algorithms, both sequential and recursive
- Be able to use data structures in software design and implementation
Course Topics
- Introduction to Database Systems
- Database design and introduction to ER modeling
- ER modeling continued
- SQL
- Schema refinement and normal forms
- Anomalies, integrity constraints, and normal forms
- Relational algebra
- Transactions
- ACID
- Security
- Storage, retrieval, and indexing
- Scripting languages for data processing and analysis
- Big Data Storage in Modern Databases, CAP Theorem
- Hadoop ecosystem and MapReduce
- Big Data Storage in Modern Databases, CAP Theorem
- Hadoop Storage Systems: Distributed, non-relational database systems; Hive
- Dimensional Modeling
- Apache Spark, data processing, RDDs
- Apache Spark, dataframes, SQL, data analysis
Laboratory Topics
- Introduction to database systems
- Data modeling
- Building a relational database with SQL
- Query evaluation, applying integrity constraints
- Data analytics application with SQL
- Indexing and Storage
- Transactions and integrity constraints tutorial
- Python, NumPy, and Pandas
- MapReduce SQL
- Apache Hive
- Apache Spark, ETL, RDDs (resilient distributed datasets), distributed data processing
- Apache Spark, dataframes, SQL, data analysis
Coordinator
Jay Urbain