Modules package#

Subpackages#

Submodules#

Optimizers module#

class ablator.modules.optimizer.AdamConfig(*args, **kwargs)[source]#

Bases: OptimizerArgs

Configuration for an Adam optimizer. This class provides the init_optimizer() method, which initializes and returns an Adam optimizer.
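
Examples

A minimal construction sketch; lr is inherited from OptimizerArgs, and the values shown here are illustrative:

>>> config = AdamConfig(lr=0.001, betas=(0.5, 0.9), weight_decay=0.0)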

Attributes:
betas : Tuple[float, float]

Coefficients for computing running averages of gradient and its square (default is (0.5, 0.9)).

weight_decay : float

Weight decay rate (default is 0.0).

config_class#

alias of AdamConfig

init_optimizer(model: Module)[source]#

Creates and returns an Adam optimizer that optimizes the model’s parameters. These parameters are processed via get_optim_parameters before being used to initialize the optimizer.

Parameters:
model : torch.nn.Module

The model that has parameters that the optimizer will optimize.

Returns:
Optimizer

An instance of the Adam optimizer.

Examples

>>> config = AdamConfig(lr=0.1, weight_decay=0.5, betas=(0.6,0.9))
>>> config.init_optimizer(MyModel())
Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.6, 0.9)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: False
    lr: 0.1
    maximize: False
    weight_decay: 0.5
Parameter Group 1
    amsgrad: False
    betas: (0.6, 0.9)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: False
    lr: 0.1
    maximize: False
    weight_decay: 0.0
)
class ablator.modules.optimizer.AdamWConfig(*args, **kwargs)[source]#

Bases: OptimizerArgs

Configuration for an AdamW optimizer. This class provides the init_optimizer() method, which initializes and returns an AdamW optimizer.

Examples

>>> config = AdamWConfig(lr=0.1, weight_decay=0.5, betas=(0.9,0.99))
Attributes:
betas : Tuple[float, float]

Coefficients for computing running averages of gradient and its square (default is (0.9, 0.999)).

eps : float

Term added to the denominator to improve numerical stability (default is 1e-8).

weight_decay : float

Weight decay rate (default is 0.0).

config_class#

alias of AdamWConfig

init_optimizer(model: Module)[source]#

Creates and returns an AdamW optimizer that optimizes the model’s parameters. These parameters are processed via get_optim_parameters before being used to initialize the optimizer.

Parameters:
model : torch.nn.Module

The model that has parameters that the optimizer will optimize.

Returns:
Optimizer

An instance of the AdamW optimizer.

Examples

>>> config = AdamWConfig(lr=0.1, weight_decay=0.5, betas=(0.9,0.99), eps=0.001)
>>> config.init_optimizer(MyModel())
AdamW (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.99)
    capturable: False
    eps: 0.001
    foreach: None
    lr: 0.1
    maximize: False
    weight_decay: 0.5
Parameter Group 1
    amsgrad: False
    betas: (0.9, 0.99)
    capturable: False
    eps: 0.001
    foreach: None
    lr: 0.1
    maximize: False
    weight_decay: 0.0
)
class ablator.modules.optimizer.OptimizerArgs(*args, **kwargs)[source]#

Bases: ConfigBase

A base class for optimizer arguments, here we define learning rate lr.

Attributes:
lr : float

Learning rate of the optimizer

config_class#

alias of OptimizerArgs

abstract init_optimizer(model: Module)[source]#

Abstract method, implemented by derived classes, that initializes and returns the optimizer.
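
A hedged sketch of what a custom subclass might look like: RMSpropConfig is hypothetical, and any decorator or field machinery the library may require for config classes is omitted here. It reuses lr from OptimizerArgs and get_optim_parameters from this module:

>>> class RMSpropConfig(OptimizerArgs):
>>>     def init_optimizer(self, model: torch.nn.Module):
>>>         # collect the parameters to optimize, as the built-in configs do
>>>         params = get_optim_parameters(model)
>>>         return torch.optim.RMSprop(params, lr=self.lr)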

class ablator.modules.optimizer.OptimizerConfig(name, arguments: dict[str, Any])[source]#

Bases: ConfigBase

Configuration for an optimizer, including the optimizer name and its arguments (these arguments are specific to a particular optimizer type, such as SGD, Adam, or AdamW).

Attributes:
name : str

Name of the optimizer.

arguments : OptimizerArgs

Arguments for the optimizer, specific to a certain type of optimizer.

__init__(name, arguments: dict[str, Any])[source]#

Initializes the optimizer configuration and adds any provided settings to the optimizer arguments.

Parameters:
name : str

Name of the optimizer; this can be any of ['adamw', 'adam', 'sgd'].

arguments : dict[str, ty.Any]

Arguments for the optimizer, specific to a certain type of optimizer. A common argument is the learning rate, e.g. {'lr': 0.5}. If name is "adamw", eps can also be added to arguments, e.g. {'lr': 0.5, 'eps': 0.001}.

Examples

In the following example, optim_config will initialize its arguments property as an SGDConfig with lr=0.5. We also have access to the property’s init_optimizer() method, which initializes an SGD optimizer. This method is called internally by make_optimizer().

>>> optim_config = OptimizerConfig("sgd", {"lr": 0.5})
config_class#

alias of OptimizerConfig

make_optimizer(model: Module) Optimizer[source]#

Creates and returns an optimizer for the given model.

Parameters:
model : torch.nn.Module

The model to optimize.

Returns:
optimizer : torch.optim.Optimizer

The created optimizer.

Examples

>>> optim_config = OptimizerConfig("sgd", {"lr": 0.5, "weight_decay": 0.5})
>>> optim_config.make_optimizer(my_module)
SGD (
Parameter Group 0
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.5
    maximize: False
    momentum: 0.0
    nesterov: False
    weight_decay: 0.5
Parameter Group 1
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.5
    maximize: False
    momentum: 0.0
    nesterov: False
    weight_decay: 0.0
)
class ablator.modules.optimizer.SGDConfig(*args, **kwargs)[source]#

Bases: OptimizerArgs

Configuration for an SGD optimizer. This class provides the init_optimizer() method, which is used to initialize and return an SGD optimizer.

Examples

>>> config = SGDConfig(lr=0.1, momentum=0.9)
Attributes:
weight_decay : float

Weight decay rate.

momentum : float

Momentum factor.

config_class#

alias of SGDConfig

init_optimizer(model: Module)[source]#

Creates and returns an SGD optimizer that optimizes the model’s parameters. These parameters are processed via get_optim_parameters before being used to initialize the optimizer.

Parameters:
model : torch.nn.Module

The model that has parameters that the optimizer will optimize.

Returns:
optimizer : torch.optim.SGD

The created SGD optimizer.

Examples

>>> config = SGDConfig(lr=0.1, weight_decay=0.5, momentum=0.9)
>>> config.init_optimizer(MyModel())
SGD (
Parameter Group 0
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.1
    maximize: False
    momentum: 0.9
    nesterov: False
    weight_decay: 0.5
Parameter Group 1
    dampening: 0
    differentiable: False
    foreach: None
    lr: 0.1
    maximize: False
    momentum: 0.9
    nesterov: False
    weight_decay: 0.0
)
ablator.modules.optimizer.get_optim_parameters(model: Module, weight_decay: float | None = None, only_requires_grad: bool = True)[source]#

Set up the optimizer by collecting the model parameters to be optimized. If weight_decay is a float, weight decay is also applied to the parameters (except for biases and parameters of layer-normalization modules).

Parameters:
model : torch.nn.Module

The model for which to get parameters that will be optimized.

weight_decay : float | None

The amount of weight decay to use, by default None.

only_requires_grad : bool

Whether to only use parameters that require gradient or all parameters, by default True.

Returns:
dict | list
  • If weight_decay is None, return all model parameters.

  • If weight_decay is not None, return parameter groups with different weight decay values. Specifically, bias parameters and parameters of layer-normalization modules will have a weight decay of 0.0, while all other parameters will have a weight decay of weight_decay.

Notes

We provide a reasonable default that works well. If you want to use something else, you can pass a tuple in the Trainer’s init through optimizers, or override this method in a subclass.

Examples

>>> class MyModel(nn.Module):
>>>     def __init__(self, embedding_dim=10, vocab_size=10, *args, **kwargs) -> None:
>>>         super().__init__(*args, **kwargs)
>>>         self.param = nn.Parameter(torch.ones(100))
>>>         self.embedding = nn.Embedding(num_embeddings=vocab_size,
>>>                                     embedding_dim=embedding_dim)
>>>         self.norm_layer = nn.LayerNorm(embedding_dim)
>>>     def forward(self):
>>>         x = self.param + torch.rand_like(self.param) * 0.01
>>>         return x.sum().abs()
>>> mM = MyModel()
>>> get_optim_parameters(mM, 0.2)
[
    {'params': ['param', 'embedding.weight'], 'weight_decay': 0.2},
    {'params': ['norm_layer.weight', 'norm_layer.bias'], 'weight_decay': 0.0}
]
ablator.modules.optimizer.get_parameter_names(model: Module, forbidden_layer_types: list[type])[source]#

Recurse into the module and return parameter names of all submodules, excluding modules that are of any type defined in forbidden_layer_types.

Parameters:
model : torch.nn.Module

The model for which to get parameter names.

forbidden_layer_types : list[type]

A list of types of modules inside which parameter names should not be included.

Returns:
list[str]

The names of the parameters with the following format: <submodule-name>.<parameter-name>.

Examples

>>> class MyModel(nn.Module):
>>>     def __init__(self, embedding_dim=10, vocab_size=10, *args, **kwargs) -> None:
>>>         super().__init__(*args, **kwargs)
>>>         self.param = nn.Parameter(torch.ones(100))
>>>         self.embedding = nn.Embedding(num_embeddings=vocab_size,
>>>                                     embedding_dim=embedding_dim)
>>>         self.norm_layer = nn.LayerNorm(embedding_dim)
>>>     def forward(self):
>>>         x = self.param + torch.rand_like(self.param) * 0.01
>>>         return x.sum().abs()
>>> mM = MyModel()
>>> get_parameter_names(mM,[])
['embedding.weight', 'norm_layer.weight', 'norm_layer.bias', 'param']
>>> get_parameter_names(mM, [torch.nn.LayerNorm])
['embedding.weight', 'param']

Schedulers module#

class ablator.modules.scheduler.OneCycleConfig(*args, **kwargs)[source]#

Bases: SchedulerArgs

Configuration class for the OneCycleLR scheduler.

Attributes:
max_lr : float

Upper learning rate boundaries in the cycle.

total_steps : Derived[int]

The total number of steps to run the scheduler in a cycle.

step_when : StepType

The step type at which the scheduler.step() should be invoked: 'train', 'val', or 'epoch'.

config_class#

alias of OneCycleConfig

init_scheduler(model: Module, optimizer: Optimizer)[source]#

Creates and returns a OneCycleLR scheduler that adjusts the optimizer’s learning rate.

Parameters:
model : nn.Module

The model.

optimizer : Optimizer

The optimizer used to update the model parameters, whose learning rate we want to monitor.

Returns:
OneCycleLR

The OneCycleLR scheduler, initialized with arguments defined as attributes of this class.

Examples

>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler = OneCycleConfig(max_lr=0.5, total_steps=100)
>>> scheduler.init_scheduler(model, optimizer)
class ablator.modules.scheduler.PlateuaConfig(*args, **kwargs)[source]#

Bases: SchedulerArgs

Configuration class for ReduceLROnPlateau scheduler.

Attributes:
patience : int

Number of epochs with no improvement after which learning rate will be reduced.

min_lr : float

A lower bound on the learning rate.

mode : str

One of 'min', 'max', or 'auto'; defines the direction of optimization so that the learning rate is adjusted accordingly, i.e. when the monitored metric stops improving.

factor : float

Factor by which the learning rate will be reduced. new_lr = lr * factor.

threshold : float

Threshold for measuring the new optimum, to only focus on significant changes.

verbose : bool

If True, prints a message to stdout for each update.

step_when : StepType

The step type at which the scheduler should be invoked: 'train', 'val', or 'epoch'.

config_class#

alias of PlateuaConfig

init_scheduler(model: Module, optimizer: Optimizer)[source]#

Initialize the ReduceLROnPlateau scheduler.

Parameters:
model : nn.Module

The model being optimized.

optimizer : Optimizer

The optimizer used to update the model parameters, whose learning rate we want to monitor.

Returns:
ReduceLROnPlateau

The ReduceLROnPlateau scheduler, initialized with arguments defined as attributes of this class.

Examples

>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler = PlateuaConfig(min_lr=1e-7, mode='min')
>>> scheduler.init_scheduler(model, optimizer)
class ablator.modules.scheduler.SchedulerArgs(*args, **kwargs)[source]#

Bases: ConfigBase

Abstract base class for defining arguments to initialize a learning rate scheduler.

Attributes:
step_when : StepType

The step type at which the scheduler.step() should be invoked: 'train', 'val', or 'epoch'.

config_class#

alias of SchedulerArgs

abstract init_scheduler(model, optimizer)[source]#

Abstract method, implemented by derived classes, that creates and returns a scheduler object.
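
A hedged sketch of what a custom subclass might look like: ExponentialConfig is hypothetical, and any decorator or field machinery the library may require for config classes is omitted here. It simply wraps a standard PyTorch scheduler:

>>> class ExponentialConfig(SchedulerArgs):
>>>     def init_scheduler(self, model, optimizer):
>>>         # model is unused here; it is kept for the common signature
>>>         return torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.99)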

class ablator.modules.scheduler.SchedulerConfig(name, arguments: dict[str, Any])[source]#

Bases: ConfigBase

Class that defines a configuration for a learning rate scheduler.

Attributes:
name : str

The name of the scheduler.

arguments : SchedulerArgs

The arguments needed to initialize the scheduler.

__init__(name, arguments: dict[str, Any])[source]#

Initializes the scheduler configuration.

Parameters:
name : str

The name of the scheduler; this can be any of ['None', 'step', 'cycle', 'plateau'].

arguments : dict[str, ty.Any]

The arguments for the scheduler, specific to a certain type of scheduler.

Examples

In the following example, scheduler_config will initialize its arguments property as a StepLRConfig with step_size=1 and gamma=0.99. We also have access to the property’s init_scheduler() method, which initializes a StepLR scheduler. This method is called internally by make_scheduler().

>>> scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})
config_class#

alias of SchedulerConfig

make_scheduler(model, optimizer) _LRScheduler | ReduceLROnPlateau | Any[source]#

Creates a new scheduler for an optimizer, based on the configuration.

Parameters:
model

The model.

optimizer

The optimizer used to update the model parameters, whose learning rate we want to monitor.

Returns:
Scheduler

The scheduler.

Examples

>>> scheduler_config = SchedulerConfig("step", arguments={"step_size": 1, "gamma": 0.99})
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler_config.make_scheduler(model, optimizer)
class ablator.modules.scheduler.StepLRConfig(*args, **kwargs)[source]#

Bases: SchedulerArgs

Configuration class for StepLR scheduler.

Parameters:
step_size : int

Period of learning rate decay, by default 1.

gamma : float

Multiplicative factor of learning rate decay, by default 0.99.

step_when : StepType

The step type at which the scheduler should be invoked: 'train', 'val', or 'epoch'.

config_class#

alias of StepLRConfig

init_scheduler(model: Module, optimizer: Optimizer)[source]#

Initialize the StepLR scheduler for a given model and optimizer.

Parameters:
model : nn.Module

The model to which the scheduler is applied.

optimizer : Optimizer

The optimizer used to update the model parameters, whose learning rate we want to monitor.

Returns:
StepLR

The StepLR scheduler, initialized with arguments defined as attributes of this class.

Examples

>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9)
>>> scheduler = StepLRConfig(step_size=20, gamma=0.9)
>>> scheduler.init_scheduler(model, optimizer)

Module contents#