Frequently Asked Questions (FAQ)#
Install & Preparation#
Why are my LLaMA checkpoints *.bin or *.safetensors?#
There are two common formats for LLaMA checkpoints: the original format provided by Meta, and the
Hugging Face format. Checkpoints in the former format are stored in consolidated.*.pth files, whereas
checkpoints in the latter format are stored in *.bin or *.safetensors files. LLaMA2-Accessory works
with the former format (consolidated.*.pth files).
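To check quickly which format a given checkpoint file belongs to, a small helper like the one below may be useful. It is only an illustrative sketch based on the file-name patterns described above; the function name checkpoint_format is hypothetical and is not part of LLaMA2-Accessory.

```python
from fnmatch import fnmatch
from pathlib import Path


def checkpoint_format(filename: str) -> str:
    """Guess the checkpoint format from the file name alone.

    Hypothetical helper: it only inspects the name, not the file contents.
    """
    name = Path(filename).name
    # Original Meta format: consolidated.00.pth, consolidated.01.pth, ...
    if fnmatch(name, "consolidated.*.pth"):
        return "meta"
    # Hugging Face format: *.bin or *.safetensors shards
    if name.endswith((".bin", ".safetensors")):
        return "huggingface"
    return "unknown"
```

A file reported as "huggingface" would, per the answer above, not be in the format that LLaMA2-Accessory works with.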
Model#
How to set llama_config?#
In LLaMA2-Accessory, each model class has a corresponding ModelArgs class, which is defined in the same file as the model class.
The ModelArgs class contains the arguments for configuring the model. An example of ModelArgs can be found in
this file. Arguments in ModelArgs can be given default values at their definition.
To override these defaults, fill in the llama_config argument when
creating MetaModel.
llama_config is expected to be a list of strings, specifying the paths to the *.json configuration files.
The most commonly used configuration files are those defining model sizes (7B, 13B, 65B, etc.), which are officially
provided by Meta and named params.json. For example, the configuration file for the 13B LLaMA model is
provided at https://huggingface.co/meta-llama/Llama-2-13b/blob/main/params.json. So, in general, to change
the model from 7B to 13B while leaving everything else unchanged, you can simply change llama_config from
['/path/to/7B/params.json'] to ['/path/to/13B/params.json'].
Besides model size, there are other settings to configure, and these can differ from model to model. For example, the PEFT model llama_peft allows users to configure detailed PEFT settings, including the LoRA rank and whether to tune bias. llamaPeft_normBiasLora.json contains the configuration that we usually use:
{"lora_rank": 16, "bias_tuning": true}
Based on this, when instantiating a llama_peft model, we can set llama_type=llama_peft, and
llama_config = ['/path/to/7B/params.json', '/path/to/llamaPeft_normBiasLora.json'] for the 7B model, or
llama_config = ['/path/to/13B/params.json', '/path/to/llamaPeft_normBiasLora.json'] for the 13B model. Of course, you can
also merge the size and PEFT configs into a single file; the effect is the same.
Note
When multiple .json config files are assigned to llama_config, the combined configuration from all these files
will be used, with keys from later files overwriting those from earlier ones. This is especially handy for
settings that stay the same across model sizes, such as the LoRA dimension: you can keep them in one shared file
instead of producing a separate file for each size.
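The merging rule described in the note can be sketched as follows. This is a minimal illustration of "later files overwrite earlier ones", assuming each config file holds a flat JSON object; merge_configs is a hypothetical name, and the actual loading code in LLaMA2-Accessory may differ.

```python
import json
from typing import Any, Dict, List


def merge_configs(paths: List[str]) -> Dict[str, Any]:
    """Combine several JSON config files into one dictionary.

    Files are applied in list order, so a key that appears in a later
    file replaces the value from any earlier file.
    """
    merged: Dict[str, Any] = {}
    for path in paths:
        with open(path, "r", encoding="utf-8") as f:
            merged.update(json.load(f))
    return merged
```

Under this sketch, merge_configs(['/path/to/7B/params.json', '/path/to/llamaPeft_normBiasLora.json']) would yield the 7B size settings plus lora_rank and bias_tuning, and swapping only the first path would switch the size while keeping the PEFT settings.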
Note that the following arguments in ModelArgs are relatively special and their final values are not determined by
the specification in llama_config:
- max_seq_len: MetaModel.__init__ receives an argument with the same name, which directly determines the value.
- max_batch_size: currently hard-coded to 32 in MetaModel.__init__.
- vocab_size: dynamically determined by the actual vocabulary size of the tokenizer.
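The precedence for these three arguments can be pictured with the sketch below. It only mirrors the behavior described above and is not the actual MetaModel.__init__ code; the function name apply_special_args and its parameters are hypothetical.

```python
from typing import Any, Dict


def apply_special_args(
    config: Dict[str, Any],
    max_seq_len: int,
    tokenizer_vocab_size: int,
) -> Dict[str, Any]:
    """Return a copy of config with the three special arguments fixed.

    Whatever llama_config says about these keys is ignored.
    """
    final = dict(config)  # copy so the caller's dictionary is untouched
    final["max_seq_len"] = max_seq_len  # taken from the constructor argument
    final["max_batch_size"] = 32  # hard-coded value mentioned above
    final["vocab_size"] = tokenizer_vocab_size  # taken from the tokenizer
    return final
```

In other words, putting max_seq_len, max_batch_size, or vocab_size into a file listed in llama_config has no effect on the final values.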
How to set tokenizer_path?#
LLaMA2-Accessory supports both SentencePiece (spm) tokenizers (provided by Meta, generally named tokenizer.model) and Hugging Face
tokenizers (composed of tokenizer.json and tokenizer_config.json). When using spm tokenizers,
tokenizer_path should point to the tokenizer.model file; when using Hugging Face tokenizers,
tokenizer_path should point to the directory containing tokenizer.json and tokenizer_config.json.
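A quick way to sanity-check a tokenizer_path value before launching a run is sketched below. The helper tokenizer_kind is hypothetical and assumes the spm file keeps a .model suffix, as in the usual tokenizer.model name; it is not how LLaMA2-Accessory itself detects the tokenizer type.

```python
from pathlib import Path


def tokenizer_kind(tokenizer_path: str) -> str:
    """Classify tokenizer_path as 'spm' or 'huggingface'.

    Raises ValueError when the path matches neither layout.
    """
    path = Path(tokenizer_path)
    # spm tokenizer: a single file, e.g. /path/to/tokenizer.model
    if path.is_file() and path.suffix == ".model":
        return "spm"
    # Hugging Face tokenizer: a directory holding both JSON files
    if path.is_dir():
        required = ("tokenizer.json", "tokenizer_config.json")
        if all((path / name).is_file() for name in required):
            return "huggingface"
    raise ValueError(f"Unrecognized tokenizer_path: {tokenizer_path}")
```

A ValueError here usually means the path points at the wrong level, for example at the directory containing tokenizer.model rather than at the file itself.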
Tip
For the LLaMA family, the tokenizer is the same across LLaMA and LLaMA2, and across different model sizes (in most
cases the tokenizer.model file is downloaded together with LLaMA weights; you can also download it separately from
here). In contrast, CodeLLaMA uses a
different tokenizer.
Should you have any further queries, please don’t hesitate to post in the issue section. We will endeavor to respond promptly to your questions. Thank you for engaging with our project.