You are about to leave Zhihu. Please keep your account and assets secure.
https://www.techtarget.com/whatis/definition/reinforcement-learning-from-human-feedback-RLHF