Mar 14, 2026
CSC 6714 - Large Language Models

4 lecture hours, 0 lab hours, 4 credits

Course Description

This course introduces large language models (LLMs) from the ground up. Beginning with the implementation and training of a modern LLM, the course then covers applications of LLMs and a mechanism-aware treatment of prompt engineering. By the end of the course, students will have a concrete understanding of how LLMs perform computations and will be able to apply that knowledge to solve real-world tasks with LLMs.

Prereq: (MTH 2340, MTH 5810, or similar coursework) and (CSC 2621, CSC 5610, or similar coursework) or instructor consent

Note: None

Course Learning Outcomes

Upon successful completion of this course, the student will be able to:
- Describe the inference process for a token-prediction transformer model, including token generation
- Explain the structure and mechanics of the transformer architecture and its components, including recent advances
- Train a small transformer model from scratch to predict the next token in a sequence
- Compare techniques for training transformer models including pretraining, instruction fine-tuning, and reinforcement learning
- Demonstrate working applications of LLMs on real-world problems of varying difficulty
- Use advanced prompting techniques such as chain of thought and structured output to improve LLM performance on a task
- Evaluate the performance of an LLM on tasks using approaches such as perplexity and AI-as-judge
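As a rough illustration of the token-generation loop named in the first outcome, here is a minimal greedy-decoding sketch. The bigram table `TOY_MODEL` is invented purely for illustration; a real LLM scores the full context with a transformer, but the generation loop has the same shape:

```python
# Hypothetical toy bigram "model": maps the previous token to a
# probability distribution over possible next tokens.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "</s>": 0.2},
    "a": {"dog": 0.7, "</s>": 0.3},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}

def generate(model, start="<s>", max_tokens=10):
    """Greedy decoding: repeatedly pick the highest-probability next
    token and append it, stopping at the end-of-sequence token."""
    tokens = [start]
    for _ in range(max_tokens):
        dist = model[tokens[-1]]
        next_tok = max(dist, key=dist.get)
        if next_tok == "</s>":
            break
        tokens.append(next_tok)
    return tokens[1:]  # drop the start marker

print(generate(TOY_MODEL))  # ['the', 'cat']
```

Real systems replace the greedy `max` with sampling strategies (temperature, top-k, nucleus), but the autoregressive structure is unchanged.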
Prerequisites by Topic
- Python programming experience including the use of data science and machine learning libraries
- Able to perform and interpret matrix and vector arithmetic including addition, dot products, and matrix-vector multiplication
- Able to train and apply various classical machine learning models for classification and regression problems
- Able to design and execute experiments to evaluate machine learning models
- Able to interpret metrics such as accuracy, precision, and recall to evaluate model prediction performance
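The matrix and vector arithmetic assumed above can be expressed in a few lines of NumPy (the array values here are arbitrary illustrative numbers):

```python
import numpy as np

x = np.array([1.0, 0.0, 2.0, -1.0])     # a 4-dimensional vector
W = np.array([[1.0, 2.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 3.0]])    # a 3x4 matrix

y = W @ x      # matrix-vector multiplication: y == [1., 2., -3.]
dot = x @ x    # dot product of x with itself: 1 + 0 + 4 + 1 == 6.0
s = x + x      # elementwise addition: [2., 0., 4., -2.]
```

These same operations, applied at much larger scale, make up the bulk of the computation inside a transformer layer.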
Course Topics
- The modern transformer architecture including embedding components
- Process for inference and token generation from transformer architectures
- Computational efficiency of and optimizations for transformer inference
- Techniques for training LLMs as foundation and instruction-following models
- Examples of LLM applications (varies depending on instructor preference)
- Foundational and advanced prompt engineering techniques such as prompt structure, chain of thought, few-shot learning, and structured output
- Evaluation of LLMs, including the challenges posed by open-ended output and approaches for evaluating it
Coordinator

Dr. Josiah Yoder