Issue loading the LLM locally #9
astroyouth started this conversation in General
Replies: 1 comment
-
I've just realised that this is better placed in the Issues section.
-
In code snippet 42, specifically the following:
# 4. Instantiate the model
llm_model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path=model_id,
                                                 torch_dtype=torch.float16, # datatype to use, we want float16
                                                 quantization_config=quantization_config if use_quantization_config else None,
                                                 low_cpu_mem_usage=False, # use full memory
                                                 attn_implementation=attn_implementation) # which attention version to use
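For context, this is roughly how the variables that snippet uses are set up earlier in my notebook. Treat it as a sketch of my setup rather than the verbatim notebook code; the exact model_id and the 4-bit BitsAndBytesConfig values are assumptions on my part:

import torch
from transformers import BitsAndBytesConfig
from transformers.utils import is_flash_attn_2_available

model_id = "google/gemma-7b-it"   # assumed; substitute whichever model id is being loaded
use_quantization_config = True    # load the model in 4-bit so it fits in GPU memory

# 4-bit quantization config (sketch): the compute dtype is set here
quantization_config = BitsAndBytesConfig(load_in_4bit=True,
                                         bnb_4bit_compute_dtype=torch.float16)

# pick an attention implementation based on the GPU's capability
if is_flash_attn_2_available() and (torch.cuda.get_device_capability(0)[0] >= 8):
    attn_implementation = "flash_attention_2"
else:
    attn_implementation = "sdpa"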
I get the following error, which I cannot seem to resolve.
The only suggestion I can find is to remove torch_dtype=torch.float16, but that doesn't help.
This is my error readout:
ValueError Traceback (most recent call last)
Cell In[3], line 30
27 tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_id)
29 # 4. Instantiate the model
---> 30 llm_model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path=model_id,
31 torch_dtype=torch.float16, # datatype to use, we want float16
32 quantization_config=quantization_config if use_quantization_config else None,
33 low_cpu_mem_usage=False, # use full memory
34 attn_implementation=attn_implementation) # which attention version to use
36 if not use_quantization_config: # quantization takes care of device setting automatically, so if it's not used, send model to GPU
37 llm_model.to("cuda")
File ~/Coding-stuff/simple-local-rag/venv/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py:561, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
559 elif type(config) in cls._model_mapping.keys():
560 model_class = _get_model_class(config, cls._model_mapping)
--> 561 return model_class.from_pretrained(
562 pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
563 )
564 raise ValueError(
565     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
566     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
567 )
File ~/Coding-stuff/simple-local-rag/venv/lib/python3.12/site-packages/transformers/modeling_utils.py:3558, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
...
2539 # For GPTQ models, we prevent users from casting the model to another dytpe to restrict unwanted behaviours.
2540 # the correct API should be to load the model with the desired dtype directly through from_pretrained.
2541 dtype_present_in_args = False

ValueError: `.to` is not supported for `4-bit` or `8-bit` bitsandbytes models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct `dtype`.
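For reference, my reading of the error is that a 4-bit or 8-bit bitsandbytes model must not be dtype-cast or moved with .to() after loading, because the weights are already placed and cast while from_pretrained runs. Below is a sketch of the call pattern the message seems to point toward, with the same (assumed) setup values as in the sketch above; whether this alone resolves the traceback, I'm not sure:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "google/gemma-7b-it"   # assumed model id, as above
attn_implementation = "sdpa"      # or "flash_attention_2", as above
use_quantization_config = True
quantization_config = BitsAndBytesConfig(load_in_4bit=True,
                                         bnb_4bit_compute_dtype=torch.float16)

if use_quantization_config:
    # Quantized load: leave out torch_dtype and do NOT call .to("cuda") afterwards;
    # the compute dtype is already set via bnb_4bit_compute_dtype in the config and
    # device placement is handled during loading.
    llm_model = AutoModelForCausalLM.from_pretrained(model_id,
                                                     quantization_config=quantization_config,
                                                     low_cpu_mem_usage=True,
                                                     attn_implementation=attn_implementation)
else:
    # Unquantized load: set the dtype here and move the model to the GPU manually.
    llm_model = AutoModelForCausalLM.from_pretrained(model_id,
                                                     torch_dtype=torch.float16,
                                                     low_cpu_mem_usage=False,
                                                     attn_implementation=attn_implementation)
    llm_model.to("cuda")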