LLMs-from-scratch-to-HumanAlignment
Implemented LLMs from scratch, fine-tuned them with supervised fine-tuning (SFT) to follow instructions, and finally applied Direct Preference Optimization (DPO) to better align the models with human preferences.
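As a rough illustration of the DPO step, the per-example loss can be sketched in plain Python. This is a minimal sketch, not the repository's actual code: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for illustration. Each input is the summed log-probability of a full response under either the policy being trained or a frozen reference model (typically the SFT checkpoint).

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_ratio - rejected_ratio))).

    Hypothetical helper for illustration; all names and the default beta
    are assumptions, not taken from the repository.
    """
    # How much more (in log space) the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy favors the chosen response
    # more strongly than the reference model does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Policy identical to the reference: margin is 0, so the loss equals log(2).
    print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
    # Policy prefers the chosen response more than the reference does: lower loss.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

The loss falls below log(2) once the policy ranks the chosen response above the rejected one by a wider margin than the reference model, and rises above log(2) when it ranks them the other way. In a real training loop the same expression would be computed on batched tensors so that gradients flow into the policy.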