Towards Principled Post-Training of Large Language Models

Tuesday, February 20, 2024 11:10am to 12:10pm

150 Western Avenue, Allston, MA 02134

Reinforcement Learning from Human Feedback (RLHF) is a pivotal technique for aligning large language models (LLMs) with human-centric values, and it has produced several leading LLMs, including GPT-4, Claude and Llama 2. The first step of RLHF is to learn human values by training a reward model on ranking data. Two problems have been observed in practice: the performance of the reward model degrades after one epoch of training, and optimizing the language model too heavily against the learned proxy reward model hurts the true objective. This talk delves into these issues, leveraging theoretical insights from statistical decision theory to design improved reward learning algorithms. We also introduce advanced prompting techniques that generate a high-quality open-source dataset for RLHF. By combining this dataset with our improved RLHF algorithms, we created the open-source language model Starling-7B, which ranks first among all 7B models according to human evaluation in Chatbot Arena.


Jess Brenn is inviting you to a scheduled Zoom meeting.

Topic: Applied Math & Kempner Institute Talks
Time: This is a recurring meeting. Meet anytime.

Join Zoom meeting
https://harvard.zoom.us/j/95576259221?pwd=RWdCUnhESkJXeE9aamNndXBRRGxvdz09

Password: 806229

Join by telephone (use any number to dial in)
        +1 305 224 1968
        +1 309 205 3325
        +1 312 626 6799
        +1 646 931 3860
        +1 929 436 2866
        +1 301 715 8592
        +1 669 900 6833
        +1 689 278 1000
        +1 719 359 4580
        +1 253 205 0468
        +1 253 215 8782
        +1 346 248 7799
        +1 360 209 5623
        +1 386 347 5053
        +1 507 473 4847
        +1 564 217 2000
        +1 669 444 9171

International numbers available: https://harvard.zoom.us/u/adktlc85Z1

One tap mobile: +13052241968,,95576259221# US
    
Join by SIP conference room system
Meeting ID: 955 7625 9221
95576259221.806229@zoomcrc.com