SwiftModel

class swift.tuners.base.SwiftModel(model, config, extra_state_keys=None, inference_mode=False, **kwargs)[source]

The Swift wrapper model.

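Parameters:
  • model (Union[nn.Module, 'SwiftModel']) -- The model to be tuned.

  • config (Union[SwiftConfig, Dict[str, SwiftConfig]]) -- A SwiftConfig, or a dict of {adapter_name: SwiftConfig}. If a single config is passed, the adapter name will be default.

  • extra_state_keys (List[str], optional) -- A list of regexes matching the extra state keys to be saved.

  • inference_mode (bool, optional) -- Load the model in inference mode. Default: False.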

state_dict(*args, destination=None, prefix='', keep_vars=False, adapter_name=None, peft_format=False, **kwargs)[source]
Parameters:
  • destination (dict, optional) -- If provided, the state of module will be updated into the dict and the same object is returned. Otherwise, an OrderedDict will be created and returned. Default: None.

  • prefix (str, optional) -- a prefix added to parameter and buffer names to compose the keys in state_dict. Default: ''.

  • keep_vars (bool, optional) -- By default, the Tensors returned in the state dict are detached from autograd. If set to True, detaching is not performed. Default: False.

  • adapter_name (str, optional) -- The name of the adapter whose parameters are to be saved. If None, all adapters are saved.

  • peft_format (bool, optional) -- Save in peft format, which adds an extra base_model.model. prefix to the keys.

  • **kwargs --
      save_adapter (bool): Whether to save the adapters. Default: True.
      save_extra_states (bool): Whether to save the extra states. Default: True.

Returns:

The state dict to be saved.
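
The way adapter_name, peft_format, save_adapter and save_extra_states interact can be pictured with a small pure-Python sketch. This is not the library's implementation: the helper select_state, the adapter_keys mapping, and the key names are hypothetical, and using re.search to match extra_state_keys is an assumption.

```python
import re
from collections import OrderedDict


def select_state(full_state, adapter_keys, extra_state_keys=(), adapter_name=None,
                 peft_format=False, save_adapter=True, save_extra_states=True):
    """Hypothetical helper: choose which entries of `full_state` to save.

    full_state:   parameter name -> value, for every parameter of the model.
    adapter_keys: adapter name -> list of parameter names owned by that adapter
                  (an assumed bookkeeping structure, not part of the real API).
    """
    selected = OrderedDict()
    if save_adapter:
        # None means "all adapters"; otherwise only the named one.
        names = list(adapter_keys) if adapter_name is None else [adapter_name]
        for name in names:
            for key in adapter_keys[name]:
                selected[key] = full_state[key]
    if save_extra_states:
        # Keep any key matched by one of the regexes (assumed matching rule).
        for key, value in full_state.items():
            if any(re.search(pattern, key) for pattern in extra_state_keys):
                selected[key] = value
    if peft_format:
        # The peft layout described above: an extra "base_model.model." prefix.
        selected = OrderedDict(("base_model.model." + k, v) for k, v in selected.items())
    return selected


full = {"layer.lora_A.default": 1, "layer.lora_A.other": 2,
        "classifier.weight": 3, "layer.weight": 4}
owned = {"default": ["layer.lora_A.default"], "other": ["layer.lora_A.other"]}
saved = select_state(full, owned, extra_state_keys=[r"^classifier\."],
                     adapter_name="default", peft_format=True)
print(list(saved))
```

In this toy setup the frozen base weight layer.weight is never selected, which reflects the point of saving only adapter and extra-state parameters.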

static load_state_file(path, device=None)[source]

Load a state dict file by the input path.

Parameters:

path -- The local dir to load the state file.

Returns:

The state dict.

classmethod from_pretrained(model, model_id=None, adapter_name=None, inference_mode=False, revision=None, **kwargs)[source]

Load a set of tuners and corresponding weights by a model_id.

Parameters:
  • model (Union[torch.nn.Module, 'SwiftModel']) -- The model to be tuned. If the model is already a SwiftModel, it will be unwrapped and re-wrapped.

  • model_id (str) -- The model_id or a local model dir of tuners to use to tune the model.

  • adapter_name (Union[str, List[str], Dict[str, str]]) -- The names of the adapters saved in the model repo to load. Default: None, which loads all tuners saved under the model_id.

  • inference_mode (bool) -- Whether to load in inference mode.

  • revision (str) -- The model revision to use.

  • **kwargs -- extra_state_keys (List[str], optional): A list of regexes matching the extra state keys to be saved. Other parameters will be passed to the device_map.

Returns:

The SwiftModel instance.

create_or_update_model_card(output_dir)[source]

Update or create the model card.

add_weighted_adapter(adapters, weights, adapter_name, combination_type='svd', svd_rank=None, svd_clamp=None, svd_full_matrices=True, svd_driver=None, density=None, majority_sign_method='total')[source]

This method adds a new adapter by merging the given adapters with the given weights.

When using the cat combination_type, be aware that the rank of the resulting adapter equals the sum of the ranks of all merged adapters. The mixed adapter may therefore become too large and cause OOM errors.
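
Why the rank adds up under cat can be seen from a small worked example. The sketch below assumes LoRA-style adapters, each a pair (A, B) with A of shape (r, in) and B of shape (out, r); the function cat_merge and the choice to fold the weights into A are illustrative assumptions, not the library's code.

```python
def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def cat_merge(adapters, weights):
    """Hypothetical 'cat' merge of LoRA-style adapters.

    adapters: list of (A, B) pairs, A shaped (r_i, in), B shaped (out, r_i).
    Stacks the weighted A matrices row-wise and the B matrices column-wise,
    so the merged rank is the sum of the individual ranks r_i.
    """
    merged_a = [[w * v for v in row]
                for (a, _), w in zip(adapters, weights) for row in a]
    out_dim = len(adapters[0][1])
    merged_b = [[v for _, b in adapters for v in b[i]] for i in range(out_dim)]
    return merged_a, merged_b


# One rank-1 adapter and one rank-2 adapter on a 2x2 layer.
adapter_1 = ([[1.0, 0.0]], [[1.0], [2.0]])
adapter_2 = ([[0.0, 1.0], [1.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
merged_a, merged_b = cat_merge([adapter_1, adapter_2], [0.5, 2.0])

print(len(merged_a))               # merged rank: 1 + 2
print(matmul(merged_b, merged_a))  # equals 0.5 * B1 @ A1 + 2.0 * B2 @ A2
```

The merged product reproduces the weighted sum of the individual updates exactly, at the cost of a rank (and memory footprint) that grows with every adapter added.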

Parameters:
  • adapters (list) -- List of adapter names to be merged.

  • weights (list) -- List of weights for each adapter.

  • adapter_name (str) -- Name of the new adapter.

  • combination_type (str) -- The merging type. One of [svd, linear, cat, ties, ties_svd, dare_ties, dare_linear, dare_ties_svd, dare_linear_svd, magnitude_prune, magnitude_prune_svd]. With cat, the rank of the resulting adapter equals the sum of the ranks of all merged adapters, so the mixed adapter may be too large and cause OOM errors.

  • svd_rank (int, optional) -- Rank of output adapter for svd. If None provided, will use max rank of merging adapters.

  • svd_clamp (float, optional) -- A quantile threshold for clamping SVD decomposition output. If None is provided, do not perform clamping. Defaults to None.

  • svd_full_matrices (bool, optional) -- Controls whether to compute the full or reduced SVD, and consequently, the shape of the returned tensors U and Vh. Defaults to True.

  • svd_driver (str, optional) -- Name of the cuSOLVER method to be used. This keyword argument only works when merging on CUDA. Can be one of [None, gesvd, gesvdj, gesvda]. For more info please refer to torch.linalg.svd documentation. Defaults to None.

  • density (float, optional) -- Value between 0 and 1. 0 means all values are pruned and 1 means no values are pruned. Should be used with [ties, ties_svd, dare_ties, dare_linear, dare_ties_svd, dare_linear_svd, magnitude_prune, magnitude_prune_svd].

  • majority_sign_method (str) -- The method, should be one of ["total", "frequency"], to use to get the magnitude of the sign values. Should be used with [ties, ties_svd, dare_ties, dare_ties_svd]
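
The density and majority_sign_method parameters can be illustrated on plain lists. The sketch below is an assumption about their semantics, modeled on how magnitude pruning and TIES-style sign election are commonly described; the function names are hypothetical and the real method operates on tensors.

```python
def magnitude_prune(values, density):
    """Keep the `density` fraction of entries with the largest magnitude; zero the rest."""
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must be between 0 and 1")
    keep = int(density * len(values))
    # Indices of the `keep` entries with the largest absolute value.
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    top = set(order[:keep])
    return [v if i in top else 0.0 for i, v in enumerate(values)]


def majority_sign(column, method="total"):
    """Elect +1 or -1 for one parameter position across several adapters.

    "total":     sign of the summed values (large entries dominate).
    "frequency": sign of the summed signs (every adapter gets one vote).
    """
    if method == "total":
        score = sum(column)
    elif method == "frequency":
        score = sum((v > 0) - (v < 0) for v in column)
    else:
        raise ValueError('method must be "total" or "frequency"')
    return 1 if score >= 0 else -1


print(magnitude_prune([0.1, -3.0, 2.0, 0.5], density=0.5))
print(majority_sign([5.0, -1.0, -1.0], "total"))
print(majority_sign([5.0, -1.0, -1.0], "frequency"))
```

The last two calls show why the choice matters: one large positive value wins under "total", while two small negative values outvote it under "frequency".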

save_pretrained(save_directory, safe_serialization=False, adapter_name=None, **kwargs)[source]

Save the adapters to a local directory.

Parameters:
  • save_directory (str) -- The directory to save the adapters to.

  • safe_serialization (bool) -- Save the weights in the safetensors format. Default: False.

  • adapter_name (Union[str, List[str]]) -- The adapters to be saved. Default: None, which saves all adapters.

set_active_adapters(adapter_names, offload=None)[source]

Set the active adapters.

Parameters:
  • adapter_names (Union[List[str], str]) -- The adapters to be activated.

  • offload (str) -- If set, offload the deactivated adapters to the cpu or meta device.

activate_adapter(adapter_name)[source]

Activate one adapter.

Parameters:

adapter_name (str) -- The adapter to be activated.

deactivate_adapter(adapter_name, offload=None)[source]

Deactivate one adapter.

Parameters:
  • adapter_name (str) -- The adapter to be deactivated.

  • offload (str) -- If set, offload the adapter to the cpu or meta device.
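
The relationship between the three activation methods above can be sketched with a toy registry. Everything here is hypothetical: the class AdapterRegistry, its state dict, and the assumption that set_active_adapters deactivates every adapter not listed are illustrations of the documented parameters, not the library's implementation.

```python
class AdapterRegistry:
    """Hypothetical stand-in that tracks which adapters are active and where they live."""

    def __init__(self, names, device="cuda"):
        self.device = device
        self.state = {name: {"active": True, "device": device} for name in names}

    def activate_adapter(self, adapter_name):
        # Activation brings the adapter back to the working device.
        self.state[adapter_name] = {"active": True, "device": self.device}

    def deactivate_adapter(self, adapter_name, offload=None):
        if offload not in (None, "cpu", "meta"):
            raise ValueError("offload must be None, 'cpu' or 'meta'")
        self.state[adapter_name]["active"] = False
        if offload is not None:
            # Offloading moves the inactive weights off the working device.
            self.state[adapter_name]["device"] = offload

    def set_active_adapters(self, adapter_names, offload=None):
        if isinstance(adapter_names, str):
            adapter_names = [adapter_names]
        wanted = set(adapter_names)
        for name in self.state:
            if name in wanted:
                self.activate_adapter(name)
            else:
                self.deactivate_adapter(name, offload=offload)


registry = AdapterRegistry(["lora_a", "lora_b"])
registry.set_active_adapters("lora_a", offload="cpu")
print(registry.state)
```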

get_trainable_parameters()[source]

Get the content of trainable parameters in the model.