Week 5: Hallucination

(see bottom for assigned readings and questions)
Hallucination (Week 5)
Presenting Team: Liu Zhe, Peng Wang, Sikun Guo, Yinhan He, Zhepei Wei
Blogging Team: Anshuman Suri, Jacob Christopher, Kasra Lekan, Kaylee Liu, My Dinh
Wednesday, September 27th: Intro to Hallucination
People Hallucinate Too
Hallucination Definition
There are three types of hallucinations according to the “Siren's Song in the AI Ocean” paper:
  • Input-conflict: The generated content deviates from the user input.
  • Context-conflict: The model generates contradicting information within a response.

Read More…

Week 4: Capabilities of LLMs

(see bottom for assigned readings and questions)
Capabilities of LLMs (Week 4)
Presenting Team: Xindi Guo, Mengxuan Hu, Tseganesh Beyene Kebede, Zihan Guan
Blogging Team: Ajwa Shahid, Caroline Gihlstorf, Changhong Yang, Hyeongjin Kim, Sarah Boyce
Monday, September 18
Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu. Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. April 2023. https://arxiv.org/abs/2304.13712

Read More…

Week 3: Prompting and Bias

(see bottom for assigned readings and questions)
Prompt Engineering (Week 3)
Presenting Team: Haolin Liu, Xueren Ge, Ji Hyun Kim, Stephanie Schoch
Blogging Team: Aparna Kishore, Erzhen Hu, Elena Long, Jingping Wan
(Monday, 09/11/2023) Prompt Engineering
Warm-up questions
  • What is Prompt Engineering?
  • How is prompt-based learning different from traditional supervised learning?
  • In-context learning and different types of prompts
  • What is the difference between prompts and fine-tuning?
  • When is it best to use prompts vs. fine-tuning?

Read More…

Week 2: Alignment

(see bottom for assigned readings and questions)
Table of Contents
(Monday, 09/04/2023) Introduction to Alignment: Introduction to AI Alignment and Failure Cases; Discussion Questions; The Alignment Problem from a Deep Learning Perspective; Group of RL-based methods; Group of LLM-based methods; Group of Other ML methods
(Wednesday, 09/06/2023) Alignment Challenges and Solutions: Opening Discussion; Introduction to Red-Teaming; In-class Activity (5 groups); How to use Red-Teaming?; Alignment Solutions; LLM Jailbreaking - Introduction; LLM Jailbreaking - Demo; Observations; Potential Improvement Ideas; Closing Remarks (by Prof.

Read More…

Week 1: Introduction

(see bottom for assigned readings and questions)
Attention, Transformers, and BERT
Monday, 28 August
Transformers are a class of deep learning models that have revolutionized the field of natural language processing (NLP) and various other domains. The concept of transformers originated as an attempt to address the limitations of traditional recurrent neural networks (RNNs) in sequential data processing. Here’s an overview of transformers’ evolution and significance.
Background and Origin
RNNs were one of the earliest models used for sequence-based tasks in machine learning.

Read More…

GitHub Discussions

Everyone should have received an invitation to the GitHub discussions site, and should be able to see the posts there and submit their own posts and comments. If you didn’t get this invitation, it was probably blocked by the email system. Try visiting: https://github.com/orgs/llmrisks/invitation (while logged into the GitHub account you listed on your form). Once you’ve accepted the invitation, you should be able to visit https://github.com/llmrisks/discussions/discussions/2 (the now-finalized discussion post for Week 1), and contribute to the discussions there.

Read More…

Class 0: Getting Organized

I’ve updated the Schedule and Bi-Weekly Schedule based on the discussions today. The plan is below:

Two Weeks Before:
  • Lead Team: Come up with idea for the week and planned readings, send to me by 5:29pm on Tuesday (2 weeks - 1 day before)
  • Blogging Team: -
  • Everyone Else: -

Week Before:
  • Lead Team: Post plan and questions in GitHub discussions by no later than 9am Wednesday; prepare for leading meetings
  • Blogging Team: Prepare plan for blogging (how you will divide workload, collaborative tools for taking notes and writing)
  • Everyone Else: Read/do materials and respond to preparation questions in GitHub discussions (by 5:29pm Sunday)

Week of Leading Meetings:
  • Lead Team: Lead interesting, engaging, and illuminating meetings!

Read More…

Updates

Some materials have been posted on the course site:

  • Syllabus
  • Schedule (you will find out which team you are on at the first class Wednesday)
  • Readings and Topics (a start on a list of some potential readings and topics that we might want to cover)

Dall-E Prompt: "comic style drawing of a phd seminar on AI"

Welcome Survey

Please submit this welcome survey before 8:59pm on Monday, August 21:

https://forms.gle/dxhFmJH7WRs32s1ZA

Your answers won’t be shared publicly, but I will use the responses to the survey to plan the seminar, including forming teams, and may share some aggregate and anonymized results and anonymized quotes from the surveys.

Welcome to the LLM Risks Seminar

Full Transcript
Seminar Plan
The actual seminar won’t be fully planned by GPT-4, but more information on it won’t be available until later. I’m expecting a structure and format that combines aspects of this seminar on adversarial machine learning and this course on computing ethics, but with a topic focused on learning as much as we can about the potential for both good and harm from generative AI (including large language models) and things we can do (mostly technical, but including policy) to mitigate the harms.

Read More…
