Long Ouyang
Researcher at OpenAI working on alignment and reinforcement learning from human feedback (RLHF). Lead author of InstructGPT, a foundational work on instruction-following language models trained with human preferences.
Sources in this wiki
- Training language models to follow instructions with human feedback (deleted)