Modules package#
Optimizers module#
- class ablator.modules.optimizer.AdamConfig(*args, **kwargs)[source]#
Bases: OptimizerArgs
Configuration for an Adam optimizer. This class has an init_optimizer() method used to initialize and return an Adam optimizer.
- Attributes:
- betas : Tuple[float, float]
Coefficients for computing running averages of gradient and its square (default is (0.5, 0.9)).
- weight_decay : float
Weight decay rate (default is 0.0).
- config_class#
alias of AdamConfig
- init_optimizer(model: Module)[source]#
Creates and returns an Adam optimizer that optimizes the model’s parameters. These parameters are processed via get_optim_parameters before being used to initialize the optimizer.
- Parameters:
- model : torch.nn.Module
The model whose parameters the optimizer will optimize.
- Returns:
- Optimizer
An instance of the Adam optimizer.
Examples
>>> config = AdamConfig(lr=0.1, weight_decay=0.5, betas=(0.6, 0.9))
>>> config.init_optimizer(MyModel())
Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.6, 0.9)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: False
    lr: 0.1
    maximize: False
    weight_decay: 0.5

Parameter Group 1
    amsgrad: False
    betas: (0.6, 0.9)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: False
    lr: 0.1
    maximize: False
    weight_decay: 0.0
)
- class ablator.modules.optimizer.AdamWConfig(*args, **kwargs)[source]#
Bases: OptimizerArgs
Configuration for an AdamW optimizer. This class has an init_optimizer() method used to initialize and return an AdamW optimizer.
Examples
>>> config = AdamWConfig(lr=0.1, weight_decay=0.5, betas=(0.9,0.99))
- Attributes:
- betas : Tuple[float, float]
Coefficients for computing running averages of gradient and its square (default is (0.9, 0.999)).
- eps : float
Term added to the denominator to improve numerical stability (default is 1e-8).
- weight_decay : float
Weight decay rate (default is 0.0).
- config_class#
alias of AdamWConfig
- init_optimizer(model: Module)[source]#
Creates and returns an AdamW optimizer that optimizes the model’s parameters. These parameters are processed via get_optim_parameters before being used to initialize the optimizer.
- Parameters:
- model : torch.nn.Module
The model whose parameters the optimizer will optimize.
- Returns:
- Optimizer
An instance of the AdamW optimizer.
Examples
>>> config = AdamWConfig(lr=0.1, weight_decay=0.5, betas=(0.9, 0.99), eps=0.001)
>>> config.init_optimizer(MyModel())
AdamW (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.99)
    capturable: False
    eps: 0.001
    foreach: None
    lr: 0.1
    maximize: False
    weight_decay: 0.5

Parameter Group 1
    amsgrad: False
    betas: (0.9, 0.99)
    capturable: False
    eps: 0.001
    foreach: None
    lr: 0.1
    maximize: False
    weight_decay: 0.0
)
- class ablator.modules.optimizer.OptimizerArgs(*args, **kwargs)[source]#
Bases: ConfigBase
A base class for optimizer arguments; it defines the learning rate lr common to all optimizers.
- Attributes:
- lr : float
Learning rate of the optimizer.
- config_class#
alias of OptimizerArgs
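The concrete configurations in this module (SGDConfig, AdamConfig, AdamWConfig) extend OptimizerArgs, so they all inherit lr. A minimal sketch, assuming config fields can be passed as keyword arguments and read back as attributes, as in the examples elsewhere on this page:
>>> # lr is inherited from OptimizerArgs by every concrete optimizer config
>>> sgd_args = SGDConfig(lr=0.1, momentum=0.9)
>>> adam_args = AdamConfig(lr=0.01, betas=(0.5, 0.9))
>>> sgd_args.lr
0.1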
- class ablator.modules.optimizer.OptimizerConfig(name, arguments: dict[str, Any])[source]#
Bases: ConfigBase
Configuration for an optimizer, including the optimizer name and arguments (these arguments are specific to a certain type of optimizer, such as SGD, Adam, or AdamW).
- Attributes:
- name : str
Name of the optimizer.
- arguments : OptimizerArgs
Arguments for the optimizer, specific to a certain type of optimizer.
- __init__(name, arguments: dict[str, Any])[source]#
Initializes the optimizer configuration and applies any provided settings to the optimizer arguments.
- Parameters:
- name : str
Name of the optimizer; this can be any of ['adamw', 'adam', 'sgd'].
- arguments : dict[str, ty.Any]
Arguments for the optimizer, specific to a certain type of optimizer. A common argument is the learning rate, e.g. {'lr': 0.5}. If name is "adamw", eps can be added to arguments, e.g. {'lr': 0.5, 'eps': 0.001}.
Examples
In the following example, optim_config will initialize its arguments property with an SGDConfig instance, setting lr=0.5 as its property. We also have access to the property’s init_optimizer() method, which initializes an SGD optimizer. This method is actually called in make_optimizer().
>>> optim_config = OptimizerConfig("sgd", {"lr": 0.5})
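As noted in the parameter description above, optimizer-specific settings such as eps for AdamW can be supplied in the same dictionary:
>>> optim_config = OptimizerConfig("adamw", {"lr": 0.5, "eps": 0.001})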
- config_class#
alias of OptimizerConfig
- make_optimizer(model: Module) → Optimizer [source]#
Creates and returns an optimizer for the given model.
- Parameters:
- model : torch.nn.Module
The model to optimize.
- Returns:
- optimizer : torch.optim.Optimizer
The created optimizer.
Examples
>>> optim_config = OptimizerConfig("sgd", {"lr": 0.5, "weight_decay": 0.5})
>>> optim_config.make_optimizer(my_module)
SGD (
Parameter Group 0
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.5
    maximize: False
    momentum: 0.0
    nesterov: False
    weight_decay: 0.5

Parameter Group 1
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.5
    maximize: False
    momentum: 0.0
    nesterov: False
    weight_decay: 0.0
)
- class ablator.modules.optimizer.SGDConfig(*args, **kwargs)[source]#
Bases: OptimizerArgs
Configuration for an SGD optimizer. This class has an init_optimizer() method, which is used to initialize and return an SGD optimizer.
Examples
>>> config = SGDConfig(lr=0.1, momentum=0.9)
- Attributes:
- weight_decay : float
Weight decay rate.
- momentum : float
Momentum factor.
- init_optimizer(model: Module)[source]#
Creates and returns an SGD optimizer that optimizes the model’s parameters. These parameters are processed via get_optim_parameters before being used to initialize the optimizer.
- Parameters:
- model : torch.nn.Module
The model whose parameters the optimizer will optimize.
- Returns:
- optimizer : torch.optim.SGD
The created SGD optimizer.
Examples
>>> config = SGDConfig(lr=0.1, weight_decay=0.5, momentum=0.9)
>>> config.init_optimizer(MyModel())
SGD (
Parameter Group 0
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.1
    maximize: False
    momentum: 0.9
    nesterov: False
    weight_decay: 0.5

Parameter Group 1
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.1
    maximize: False
    momentum: 0.9
    nesterov: False
    weight_decay: 0.0
)
- ablator.modules.optimizer.get_optim_parameters(model: Module, weight_decay: float | None = None, only_requires_grad: bool = True)[source]#
Sets up the optimizer by collecting the model parameters to be optimized. If weight_decay is a float, weight decay is also applied to the parameters (except for biases and parameters of layer normalization modules).
- Parameters:
- model : torch.nn.Module
The model for which to get the parameters that will be optimized.
- weight_decay : float | None
The amount of weight decay to use, by default None.
- only_requires_grad : bool
Whether to use only the parameters that require gradients, or all parameters, by default True.
- Returns:
- dict | list
If weight_decay is None, all model parameters are returned. If weight_decay is not None, a list of parameter groups (dictionaries) with different weight decays is returned: bias parameters and parameters of layer normalization modules get a weight decay of 0.0, while all other parameters get a weight decay of weight_decay.
Notes
We provide a reasonable default that works well. If you want to use something else, you can pass a tuple in the Trainer’s init through optimizers, or override this method in a subclass.
>>> class MyModel(nn.Module):
>>>     def __init__(self, embedding_dim=10, vocab_size=10, *args, **kwargs) -> None:
>>>         super().__init__(*args, **kwargs)
>>>         self.param = nn.Parameter(torch.ones(100))
>>>         self.embedding = nn.Embedding(num_embeddings=vocab_size,
>>>             embedding_dim=embedding_dim)
>>>         self.norm_layer = nn.LayerNorm(embedding_dim)
>>>     def forward(self):
>>>         x = self.param + torch.rand_like(self.param) * 0.01
>>>         return x.sum().abs()
>>> mM = MyModel()
>>> get_optim_parameters(mM, 0.2)
[
    {'params': ['param', 'embedding.weight'], 'weight_decay': 0.2},
    {'params': ['norm_layer.weight', 'norm_layer.bias'], 'weight_decay': 0.0}
]
- ablator.modules.optimizer.get_parameter_names(model: Module, forbidden_layer_types: list[type])[source]#
Recurse into the module and return the parameter names of all submodules, excluding modules of any type listed in forbidden_layer_types.
- Parameters:
- model : torch.nn.Module
The model for which to get parameter names.
- forbidden_layer_types : list[type]
A list of types of modules inside which parameter names should not be included.
- Returns:
- list[str]
The names of the parameters, in the following format: <submodule-name>.<parameter-name>.
Examples
>>> class MyModel(nn.Module):
>>>     def __init__(self, embedding_dim=10, vocab_size=10, *args, **kwargs) -> None:
>>>         super().__init__(*args, **kwargs)
>>>         self.param = nn.Parameter(torch.ones(100))
>>>         self.embedding = nn.Embedding(num_embeddings=vocab_size,
>>>             embedding_dim=embedding_dim)
>>>         self.norm_layer = nn.LayerNorm(embedding_dim)
>>>     def forward(self):
>>>         x = self.param + torch.rand_like(self.param) * 0.01
>>>         return x.sum().abs()
>>> mM = MyModel()
>>> get_parameter_names(mM, [])
['embedding.weight', 'norm_layer.weight', 'norm_layer.bias', 'param']
>>> get_parameter_names(mM, [torch.nn.LayerNorm])
['embedding.weight', 'param']
Schedulers module#
- class ablator.modules.scheduler.OneCycleConfig(*args, **kwargs)[source]#
Bases: SchedulerArgs
Configuration class for the OneCycleLR scheduler.
- Attributes:
- max_lr : float
Upper learning rate boundary in the cycle.
- total_steps : Derived[int]
The total number of steps to run the scheduler in a cycle.
- step_when : StepType
The step type at which scheduler.step() should be invoked: 'train', 'val', or 'epoch'.
- config_class#
alias of OneCycleConfig
- init_scheduler(model: Module, optimizer: Optimizer)[source]#
Initializes the OneCycleLR scheduler: creates and returns a OneCycleLR scheduler that adjusts the optimizer’s learning rate over the cycle.
- Parameters:
- model : nn.Module
The model.
- optimizer : Optimizer
The optimizer used to update the model parameters, whose learning rate we want to adjust.
- Returns:
- OneCycleLR
The OneCycleLR scheduler, initialized with arguments defined as attributes of this class.
Examples
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler = OneCycleConfig(max_lr=0.5, total_steps=100)
>>> scheduler.init_scheduler(model, optimizer)
- class ablator.modules.scheduler.PlateuaConfig(*args, **kwargs)[source]#
Bases: SchedulerArgs
Configuration class for ReduceLROnPlateau scheduler.
- Attributes:
- patience : int
Number of epochs with no improvement after which the learning rate will be reduced.
- min_lr : float
A lower bound on the learning rate.
- mode : str
One of 'min', 'max', or 'auto', which defines the direction of optimization, so as to adjust the learning rate accordingly, i.e. when a certain metric ceases improving.
- factor : float
Factor by which the learning rate will be reduced: new_lr = lr * factor.
- threshold : float
Threshold for measuring the new optimum, to only focus on significant changes.
- verbose : bool
If True, prints a message to stdout for each update.
- step_when : StepType
The step type at which the scheduler should be invoked: 'train', 'val', or 'epoch'.
- config_class#
alias of PlateuaConfig
- init_scheduler(model: Module, optimizer: Optimizer)[source]#
Initialize the ReduceLROnPlateau scheduler.
- Parameters:
- model : nn.Module
The model being optimized.
- optimizer : Optimizer
The optimizer used to update the model parameters, whose learning rate we want to adjust.
- Returns:
- ReduceLROnPlateau
The ReduceLROnPlateau scheduler, initialized with arguments defined as attributes of this class.
Examples
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler = PlateuaConfig(min_lr=1e-7, mode='min')
>>> scheduler.init_scheduler(model, optimizer)
- class ablator.modules.scheduler.SchedulerArgs(*args, **kwargs)[source]#
Bases: ConfigBase
Abstract base class for defining arguments to initialize a learning rate scheduler.
- Attributes:
- step_when : StepType
The step type at which scheduler.step() should be invoked: 'train', 'val', or 'epoch'.
- config_class#
alias of SchedulerArgs
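Every concrete scheduler configuration on this page inherits step_when from this base class. As an illustrative sketch, assuming step_when can be set like any other config field (hypothetical usage):
>>> # step_when controls when scheduler.step() is invoked: 'train', 'val', or 'epoch'
>>> config = StepLRConfig(step_size=1, gamma=0.99, step_when='epoch')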
- class ablator.modules.scheduler.SchedulerConfig(name, arguments: dict[str, Any])[source]#
Bases: ConfigBase
Class that defines a configuration for a learning rate scheduler.
- Attributes:
- name : str
The name of the scheduler.
- arguments : SchedulerArgs
The arguments needed to initialize the scheduler.
- __init__(name, arguments: dict[str, Any])[source]#
Initializes the scheduler configuration.
- Parameters:
- name : str
The name of the scheduler; this can be any of ['None', 'step', 'cycle', 'plateau'].
- arguments : dict[str, ty.Any]
The arguments for the scheduler, specific to a certain type of scheduler.
Examples
In the following example, scheduler_config will initialize its arguments property with a StepLRConfig instance, setting step_size=1 and gamma=0.99 as its properties. We also have access to the property’s init_scheduler() method, which initializes a StepLR scheduler. This method is actually called in make_scheduler().
>>> scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})
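The other scheduler types are configured the same way by name; for example, a plateau configuration, using the PlateuaConfig fields shown above, could look like:
>>> scheduler_config = SchedulerConfig("plateau", arguments={"min_lr": 1e-7, "mode": "min"})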
- config_class#
alias of SchedulerConfig
- make_scheduler(model, optimizer) → _LRScheduler | ReduceLROnPlateau | Any [source]#
Creates a new scheduler for an optimizer, based on the configuration.
- Parameters:
- model
The model.
- optimizer
The optimizer used to update the model parameters, whose learning rate we want to adjust.
- Returns:
- Scheduler
The scheduler.
Examples
>>> scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler_config.make_scheduler(model, optimizer)
- class ablator.modules.scheduler.StepLRConfig(*args, **kwargs)[source]#
Bases: SchedulerArgs
Configuration class for StepLR scheduler.
- Parameters:
- step_size : int
Period of learning rate decay, by default 1.
- gamma : float
Multiplicative factor of learning rate decay, by default 0.99.
- step_when : StepType
The step type at which the scheduler should be invoked: 'train', 'val', or 'epoch'.
- config_class#
alias of StepLRConfig
- init_scheduler(model: Module, optimizer: Optimizer)[source]#
Initialize the StepLR scheduler for a given model and optimizer.
- Parameters:
- model : nn.Module
The model to apply the scheduler to.
- optimizer : Optimizer
The optimizer used to update the model parameters, whose learning rate we want to adjust.
- Returns:
- StepLR
The StepLR scheduler, initialized with arguments defined as attributes of this class.
Examples
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler = StepLRConfig(step_size=20, gamma=0.9)
>>> scheduler.init_scheduler(model, optimizer)