Zhou et al., 2023
paper: https://arxiv.org/pdf/2305.11206.pdf
unorganized notes
- the authors propose that the vast majority of the knowledge LLMs possess is acquired during the pretraining stage
- they further predict that a small dataset (~1,000 examples) of high-quality, curated data is sufficient to effectively align the model for assistant-style use
- this is known as the “Superficial Alignment Hypothesis”
- alignment (fine-tuning) is primarily about teaching the model output style and format, not new knowledge
- the paper discusses how/where to find high-quality data; the sources it uses are wikiHow, Stack Exchange, and Reddit
- Stack Exchange and wikiHow generally contain high-quality data, while Reddit data needs to be carefully curated
- (see the paper for the specific per-source breakdown of the 1,000-example alignment dataset; a rough sketch of the record format follows)
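for context, a minimal sketch of what a record in such a curated set could look like; the JSONL layout and field names here are hypothetical, not the paper's actual schema:

```python
import json

# hypothetical record layout for a small curated alignment dataset;
# field names are illustrative, not taken from the paper
example = {
    "source": "stackexchange",  # e.g. stackexchange / wikihow / reddit / manually written
    "prompt": "How do I repot a root-bound houseplant?",
    "response": "A long-form, well-structured answer written in assistant style...",
}

def load_curated_dataset(path: str) -> list[dict]:
    """load a JSONL file of curated prompt/response pairs"""
    with open(path) as f:
        return [json.loads(line) for line in f]

# ~1,000 examples is small enough to read every record by hand,
# which is the whole point: curation over scale
```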
- training for LIMA is done starting from the 65B-parameter LLaMa model with their 1,000-example dataset
- they fine-tune for 15 epochs with AdamW using standard fine-tuning hyperparameters
- they make use of dropout over residual connections, with the rate ramping from 0.0 at the bottom layer to 0.3 at the top (sketch below)
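a minimal sketch of that residual-dropout idea, assuming a PyTorch setup; this is my own illustration, not the paper's code, and the AdamW values in the comment are as I recall them from the paper, so verify before use:

```python
import torch
import torch.nn as nn

class ResidualDropoutBlock(nn.Module):
    """wraps a transformer sublayer, applying dropout to its output
    before adding it back onto the residual stream"""

    def __init__(self, sublayer: nn.Module, p: float):
        super().__init__()
        self.sublayer = sublayer
        self.dropout = nn.Dropout(p)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.dropout(self.sublayer(x))

def residual_dropout_rates(n_layers: int, p_top: float = 0.3) -> list[float]:
    """linearly ramp the dropout rate from 0.0 at the bottom layer
    to p_top at the top layer"""
    if n_layers < 2:
        return [p_top] * n_layers
    return [p_top * i / (n_layers - 1) for i in range(n_layers)]

# optimizer sketch (hyperparameter values are assumptions):
# optimizer = torch.optim.AdamW(
#     model.parameters(), lr=1e-5, betas=(0.9, 0.95), weight_decay=0.1
# )
```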
- the LIMA model is evaluated by comparing its responses against other popular LLM assistants such as GPT-4 and Bard
- responses are scored by human annotators in head-to-head comparisons, similar to LLM Boxing (toy win-rate computation below)
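a toy illustration of turning pairwise annotator preferences into win rates; the label format is made up for this sketch, not the paper's annotation schema:

```python
from collections import Counter

# hypothetical labels: which response the annotator preferred for each prompt
annotations = ["lima", "gpt4", "tie", "lima", "gpt4", "gpt4", "lima", "tie"]

def preference_rates(labels: list[str]) -> dict[str, float]:
    """fraction of comparisons falling in each outcome"""
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}

print(preference_rates(annotations))
# {'lima': 0.375, 'gpt4': 0.375, 'tie': 0.25}
```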
- (see the paper for the detailed evaluation results; highlights below)
- further analysis shows that LIMA provides “excellent” responses to 45% of prompts and fails to comprehensively answer the question asked only 20% of the time
- considering the training set contains only 13 safety-related examples, LIMA performs well in terms of safety, responding appropriately to 80% of sensitive prompts
- the findings of this paper illustrate that diversity and quality of examples in a fine-tuning dataset are far more important than quantity
- i.e. less is more for alignment, hence the name LIMA
- LIMA has its disadvantages (curation time, lack of robustness, etc.), but the approach taken seems promising for aligning models in the future