Summary of Semester

Here’s a summary of the topics for the semester:

Week 1: Introduction

  • Attention, Transformers, and BERT
  • Training LLMs, Risks and Rewards

Week 2: Alignment

  • Introduction to AI Alignment and Failure Cases
  • Red-teaming
  • Jailbreaking LLMs

Week 3: Prompting and Bias

  • Prompt Engineering
  • Marked Personas

Week 4: Capabilities of LLMs

  • LLM Capabilities
  • Medical Applications of LLMs

Week 5: Hallucination

  • Hallucination Risks
  • Potential Solutions

Week 6: Visit from Anton Korinek

Week 7: Generative Adversarial Networks and DeepFakes

  • GANs and DeepFakes
  • Creation and Detection of DeepFake Videos

Week 8: Machine Translation

  • History of Machine Translation
  • Neural Machine Translation

Week 9: Interpretability

  • Introduction to Interpretability
  • Mechanistic Interpretability

Week 10: Data for Training

  • Data Selection for Fine-tuning LLMs
  • Detecting Pretraining Data from Large Language Models
  • Impact of Data on Large Language Models
  • The Curse of Recursion: Training on Generated Data Makes Models Forget

Week 11: Watermarking

  • Watermarking LLM Outputs
  • Watermarking Diffusion Models

Week 12: LLM Agents

  • LLM Agents
  • Tools and Planning

Week 13: Regulating Dangerous Technologies

  • Analogies from Other Technologies for Regulating AI

Week 14a: Multimodal Models

Week 14b: Ethical AI