The OpenAI API is powered by GPT-3 language models, which can be prompted to perform natural language tasks when the text prompts are carefully crafted. However, these models can also generate outputs that are untruthful, toxic, or otherwise harmful, because GPT-3 is trained on a large dataset of internet text to predict the next word rather than to safely perform the task the user intends. To address this, we developed InstructGPT models, which follow instructions better than GPT-3, make up facts less often, and produce less toxic output. Human labelers prefer outputs from our 1.3B InstructGPT model over outputs from a 175B GPT-3 model, even though InstructGPT has more than 100 times fewer parameters.
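To make the prompt-driven usage above concrete, here is a minimal sketch of calling a completion model through the Python `openai` client. It assumes the legacy (pre-1.0) `openai.Completion.create` interface and an API key in the environment; the model name `text-davinci-001` is used only as an illustrative InstructGPT-class model.

```python
# Minimal sketch: prompting a completion model via the OpenAI API.
# Assumes the legacy (pre-1.0) `openai` Python package and an API key
# stored in the OPENAI_API_KEY environment variable.
import os

import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    model="text-davinci-001",  # illustrative InstructGPT-class model name
    prompt="Explain the moon landing to a 6 year old in a few sentences.",
    max_tokens=64,
    temperature=0,  # keep the demo output close to deterministic
)

# The completion text lives under choices[0].text in the legacy response format.
print(response["choices"][0]["text"].strip())
```

With a plain GPT-3 model, a prompt like this often has to be wrapped in few-shot examples to get the intended behavior; an instruction-following model is meant to handle the bare instruction directly.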