# Frequently Asked Questions (FAQ)
## Install & Preparation
### Why are my LLaMA checkpoints `*.bin` or `*.safetensors`?
There are two common formats for LLaMA checkpoints: the original format provided by Meta, and the Huggingface format. Checkpoints in the former format are stored in `consolidated.*.pth` files, whereas checkpoints in the latter format are stored in `*.bin` or `*.safetensors` files. LLaMA2-Accessory works with the former format (`consolidated.*.pth` files).
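If you are not sure which format a downloaded checkpoint is in, the file names are usually enough to tell. Below is a minimal sketch (not part of LLaMA2-Accessory; the directory path is a placeholder) that applies the naming conventions above:

```python
from pathlib import Path

def guess_checkpoint_format(ckpt_dir: str) -> str:
    """Guess the checkpoint format from the file names in a directory."""
    ckpt_dir = Path(ckpt_dir)
    if list(ckpt_dir.glob("consolidated.*.pth")):
        return "meta"         # original Meta format, directly usable by LLaMA2-Accessory
    if list(ckpt_dir.glob("*.safetensors")) or list(ckpt_dir.glob("*.bin")):
        return "huggingface"  # Huggingface format, not directly usable
    return "unknown"

print(guess_checkpoint_format("/path/to/llama-2-13b"))
```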
## Model
### How to set `llama_config`?
In LLaMA2-Accessory, each model class has a corresponding `ModelArgs` class, which is defined in the same file as the model class. The `ModelArgs` class contains the arguments for configuring the model. An example of `ModelArgs` can be found in this file. Arguments in `ModelArgs` can be given default values at their definition. These defaults can then be overridden by filling the `llama_config` argument when creating `MetaModel`. `llama_config` is expected to be a list of strings specifying the paths to `*.json` configuration files.
The most commonly used configuration files are those defining model sizes (7B, 13B, 65B, etc.), which are officially provided by Meta and named `params.json`. For example, the configuration file for the 13B LLaMA is provided at https://huggingface.co/meta-llama/Llama-2-13b/blob/main/params.json. So generally, when you want to change the model from 7B to 13B while leaving everything else unchanged, you can simply change `llama_config` from `['/path/to/7B/params.json']` to `['/path/to/13B/params.json']`.
Besides model size, there are other things to configure, and they can differ from model to model. For example, the PEFT model `llama_peft` allows users to configure the detailed PEFT settings, including the LoRA rank and whether to tune bias. `llamaPeft_normBiasLora.json` contains the configuration that we usually use:

`{"lora_rank": 16, "bias_tuning": true}`
Based on this, when instantiating a `llama_peft` model, we can set `llama_type=llama_peft` and `llama_config = ['/path/to/7B/params.json', '/path/to/llamaPeft_normBiasLora.json']` for the 7B model, or `llama_config = ['/path/to/13B/params.json', '/path/to/llamaPeft_normBiasLora.json']` for the 13B model. Of course, you can also merge the size and PEFT configs into a single file; the effect is the same.
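Putting the pieces together, the sketch below shows how such a model might be created. It assumes `MetaModel` accepts `llama_type`, `llama_config`, `tokenizer_path`, and `max_seq_len` as keyword arguments, as suggested by the surrounding text; the import path and file paths are placeholders, so check the `MetaModel` definition in your copy of the repository for the exact signature.

```python
from accessory.model.meta import MetaModel  # import path may differ in your install

# Assumed keyword arguments; verify against MetaModel.__init__ in the repository.
model = MetaModel(
    llama_type="llama_peft",
    llama_config=[
        "/path/to/13B/params.json",              # model-size config from Meta
        "/path/to/llamaPeft_normBiasLora.json",  # PEFT config (lora_rank, bias_tuning)
    ],
    tokenizer_path="/path/to/tokenizer.model",
    max_seq_len=2048,  # passed directly to MetaModel.__init__, see the note below
)
```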
Note
When multiple `.json` config files are assigned to `llama_config`, the combined configuration from all these files will be used, with keys from later files overwriting those from earlier ones. This is especially handy when you want to make specific model configuration adjustments, like the LoRA dimension, which is consistent across various model sizes, eliminating the need to produce individual files for each size.
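The overwrite behavior works like merging Python dictionaries in order. The snippet below only illustrates that semantics; it is not the library's actual config-loading code, and the file paths are placeholders.

```python
import json

def merge_configs(paths):
    """Merge JSON config files; keys from later files overwrite earlier ones."""
    merged = {}
    for path in paths:
        with open(path) as f:
            merged.update(json.load(f))
    return merged

# e.g. merge_configs(["/path/to/13B/params.json", "/path/to/llamaPeft_normBiasLora.json"])
```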
Note that the following arguments in `ModelArgs` are relatively special, and their final values are not determined by the specification in `llama_config`:

- `max_seq_len`: `MetaModel.__init__` receives an argument with the same name, which directly determines the value
- `max_batch_size`: currently hard-coded to 32 in `MetaModel.__init__`
- `vocab_size`: dynamically determined by the actual vocabulary size of the tokenizer
### How to set `tokenizer_path`?
LLaMA2-Accessory supports both spm tokenizers (provided by Meta, generally named `tokenizer.model`) and huggingface tokenizers (composed of `tokenizer.json` and `tokenizer_config.json`). When using spm tokenizers, `tokenizer_path` should point to the `tokenizer.model` file; when using huggingface tokenizers, `tokenizer_path` should point to the directory containing `tokenizer.json` and `tokenizer_config.json`.
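In other words, the only difference is whether the path names a file or a directory. The values below are placeholders illustrating the two cases:

```python
# spm tokenizer: point directly at the tokenizer.model file
tokenizer_path = "/path/to/llama/tokenizer.model"

# huggingface tokenizer: point at the directory that contains
# tokenizer.json and tokenizer_config.json
tokenizer_path = "/path/to/huggingface/tokenizer_dir"
```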
Tip
For the LLaMA family, the tokenizer is the same across LLaMA and LLaMA2, and across different model sizes (in most cases the `tokenizer.model` file is downloaded together with the LLaMA weights; you can also download it separately from here). In contrast, CodeLLaMA uses a different tokenizer.
Should you have any further queries, please don’t hesitate to post in the issue section. We will endeavor to respond promptly to your questions. Thank you for engaging with our project.