Zhou et al., 2023
paper: https://arxiv.org/pdf/2305.11206.pdf
unorganized notes
- the authors propose that the vast majority of the knowledge LLMs possess is acquired during the pretraining stage
- they further predict that a small dataset (~1,000 examples) of high-quality, curated data is sufficient to effectively align the model for assistant-style use
- this is known as the “Superficial Alignment Hypothesis”
- alignment (fine-tuning) is primarily about teaching the model output style and format, not new knowledge
- the paper discusses how/where to find high-quality data; the sources it uses are wikiHow, Stack Exchange, and Reddit
- Stack Exchange and wikiHow generally contain high-quality data, while Reddit data needs to be carefully curated
- (see the paper for the specific per-source breakdown of the 1,000-example alignment dataset; a rough sketch of the record format follows)
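for context, a minimal sketch of what a record in such a curated set could look like; the JSONL layout and field names here are hypothetical, not the paper's actual schema:

```python
import json

# hypothetical record layout for a small curated alignment dataset;
# field names are illustrative, not taken from the paper
example = {
    "source": "stackexchange",  # e.g. stackexchange / wikihow / reddit / manually written
    "prompt": "How do I repot a root-bound houseplant?",
    "response": "A long-form, well-structured answer written in assistant style...",
}

def load_curated_dataset(path: str) -> list[dict]:
    """load a JSONL file of curated prompt/response pairs"""
    with open(path) as f:
        return [json.loads(line) for line in f]

# ~1,000 examples is small enough to read every record by hand,
# which is the whole point: curation over scale
```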
- training for LIMA is done starting from the 65B-parameter LLaMa model with their 1,000-example dataset
- they fine-tune for 15 epochs with AdamW using standard fine-tuning hyperparameters
- they make use of dropout over residual connections, with the rate ramping from 0.0 at the bottom layer to 0.3 at the top (sketch below)
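a minimal sketch of that residual-dropout idea, assuming a PyTorch setup; this is my own illustration, not the paper's code, and the AdamW values in the comment are as I recall them from the paper, so verify before use:

```python
import torch
import torch.nn as nn

class ResidualDropoutBlock(nn.Module):
    """wraps a transformer sublayer, applying dropout to its output
    before adding it back onto the residual stream"""

    def __init__(self, sublayer: nn.Module, p: float):
        super().__init__()
        self.sublayer = sublayer
        self.dropout = nn.Dropout(p)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.dropout(self.sublayer(x))

def residual_dropout_rates(n_layers: int, p_top: float = 0.3) -> list[float]:
    """linearly ramp the dropout rate from 0.0 at the bottom layer
    to p_top at the top layer"""
    if n_layers < 2:
        return [p_top] * n_layers
    return [p_top * i / (n_layers - 1) for i in range(n_layers)]

# optimizer sketch (hyperparameter values are assumptions):
# optimizer = torch.optim.AdamW(
#     model.parameters(), lr=1e-5, betas=(0.9, 0.95), weight_decay=0.1
# )
```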
- the LIMA model is evaluated by comparing its responses against other popular LLM assistants such as GPT-4 and Bard
- responses are scored by human annotators in head-to-head comparisons, similar to LLM Boxing (toy win-rate computation below)
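a toy illustration of turning pairwise annotator preferences into win rates; the label format is made up for this sketch, not the paper's annotation schema:

```python
from collections import Counter

# hypothetical labels: which response the annotator preferred for each prompt
annotations = ["lima", "gpt4", "tie", "lima", "gpt4", "gpt4", "lima", "tie"]

def preference_rates(labels: list[str]) -> dict[str, float]:
    """fraction of comparisons falling in each outcome"""
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}

print(preference_rates(annotations))
# {'lima': 0.375, 'gpt4': 0.375, 'tie': 0.25}
```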
- (see the paper for the detailed evaluation results; highlights below)
- further analysis shows that LIMA provides “excellent” responses to 45% of prompts and fails to comprehensively answer the question asked only 20% of the time
- considering the training set contains only 13 safety-related examples, LIMA performs well in terms of safety, responding appropriately to 80% of sensitive prompts
- the findings of this paper illustrate that diversity and quality of examples in a fine-tuning dataset are far more important than quantity
- i.e. less is more for alignment, hence the name LIMA
- LIMA has its disadvantages (curation time, lack of robustness, etc.), but the approach taken seems promising for aligning models in the future