Summary of Semester

12 December 2023

Here’s a summary of the topics for the semester:

Week 1: Introduction

Attention, Transformers, and BERT
Training LLMs, Risks and Rewards

Week 2: Alignment

Introduction to AI Alignment and Failure Cases
Red-teaming
Jail-breaking LLMs

Week 3: Prompting and Bias

Prompt Engineering
Marked Personas

Week 4: Capabilities of LLMs

LLM Capabilities
Medical Applications of LLMs

Week 5: Hallucination

Hallucination Risks
Potential Solutions

Week 6: Visit from Anton Korinek

Week 7: Generative Adversarial Networks and DeepFakes

GANs and DeepFakes
Creation and Detection of DeepFake Videos

Week 8: Machine Translation

History of Machine Translation
Neural Machine Translation

Week 9: Interpretability

Introduction to Interpretability
Mechanistic Interpretability

Week 10: Data for Training

Data Selection for Fine-tuning LLMs
Detecting Pretraining Data from Large Language Models
Impact of Data on Large Language Models
The Curse of Recursion: Training on Generated Data Makes Models Forget

Week 11: Watermarking

Watermarking LLM Outputs
Watermarking Diffusion Models

Week 12: LLM Agents

LLM Agents
Tools and Planning

Week 13: Regulating Dangerous Technologies

Analogies from Other Technologies for Regulating AI

Week 14a: Multimodal Models

Week 14b: Ethical AI