Quick start with Ablator#

Welcome to the Ablator tutorial! In this chapter, you will learn how to use Ablator from scratch. We provide a simple demo that shows what running Ablator looks like and that you can modify to try out your own ideas. You can also download this demo from Colab or GitHub.

Let’s get started!

Installing#

We assume that you have already installed Python and pip on your local machine. Please use the following command to install Ablator:

pip install ablator
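To confirm that the installation succeeded, you can ask Python which version of the package it sees. The following is a minimal sketch that uses only the standard library; the helper name installed_version is our own and is not part of Ablator.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("ablator")
print(f"ablator version: {version}" if version else "ablator is not installed")
```

If the script reports that the package is not installed, check that pip and the Python interpreter you are running belong to the same environment.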

Preparations#

To use Ablator in your own project, you need to write a small amount of code. It falls into three parts:

  • Set up configurations

  • Define a model and datasets

  • Launch Ablator

Before getting started, let’s import the necessary packages:

import shutil
import argparse
from typing import Any, Callable, Dict
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from sklearn.metrics import accuracy_score

Set up configurations#

There are multiple ways to set up configurations for Ablator. In this chapter, we define them directly in Python code by constructing configuration objects with the desired parameters. The following code shows how:

from ablator import (ModelConfig, TrainConfig, OptimizerConfig, RunConfig,
                     configclass, Literal)

@configclass
class SimpleConfig(ModelConfig):
    name: Literal["simplenet"]

@configclass
class SimpleRunConfig(RunConfig):
    model_config: SimpleConfig

run_config = SimpleRunConfig(
    experiment_dir = "/tmp/dir",
    train_config = TrainConfig(
        dataset = "mnist",
        batch_size = 64,
        epochs = 10,
        scheduler_config = None,
        rand_weights_init = False,
        optimizer_config = OptimizerConfig(
            name = "sgd",
            arguments = {
                "lr": 0.001,
                "momentum": 0.1
            }
        )
    ),
    model_config = SimpleConfig(name = "simplenet"),
    metrics_n_batches = 200,
    device= "cpu",
    amp=False
)
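The name field above is annotated with Literal["simplenet"], which restricts it to a fixed set of values. To illustrate the general idea of a typed configuration that rejects unexpected values early, here is a minimal sketch built on standard-library dataclasses. It is not how Ablator implements configclass, and ToyModelConfig is a hypothetical stand-in for SimpleConfig.

```python
from dataclasses import dataclass, fields
from typing import Literal, get_args, get_origin


@dataclass
class ToyModelConfig:
    # Hypothetical stand-in for SimpleConfig: only "simplenet" is accepted.
    name: Literal["simplenet"]

    def __post_init__(self) -> None:
        # Check every Literal-annotated field against its allowed values.
        for field in fields(self):
            if get_origin(field.type) is Literal:
                allowed = get_args(field.type)
                value = getattr(self, field.name)
                if value not in allowed:
                    raise ValueError(
                        f"{field.name}={value!r} is not one of {allowed}"
                    )


ok = ToyModelConfig(name="simplenet")  # accepted

try:
    ToyModelConfig(name="resnet")  # rejected at construction time
except ValueError as err:
    print(err)
```

Catching a misspelled or unsupported value when the configuration is built is usually cheaper than discovering it partway through a training run.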

Define a model and datasets#

The core of a single experiment in Ablator is your own model and dataset. In this demo, we use a simple LeNet-5-style model and the classic MNIST dataset to run a training experiment. The following code defines the model and the datasets:

# Define a simple CNN model using PyTorch components.
# We then wrap the CNN in a second class that defines the loss function,
# the forward pass, and the output format.

class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(120, 84)
        self.relu4 = nn.ReLU()
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = self.relu3(self.fc1(x))
        x = self.relu4(self.fc2(x))
        x = self.fc3(x)
        return x


class MyModel(nn.Module):
    def __init__(self, config: SimpleConfig) -> None:
        super().__init__()
        self.model = SimpleCNN()
        self.loss = nn.CrossEntropyLoss()
        # self.optimizer = optim.SGD(self.model.parameters(), lr=0.001, momentum=0.9)

    def forward(self, x, labels, custom_input=None):
        # custom_input is for demo purposes only, defined in the dataset wrapper
        out = self.model(x)
        # Compute the loss only when labels are available
        loss = None
        if labels is not None:
            loss = self.loss(out, labels)

        out = out.argmax(dim=-1)
        return {"y_pred": out, "y_true": labels}, loss


# Create the training & validation dataloaders from the MNIST dataset.
# Also, data preprocessing is defined here, including normalization and other transformations

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

trainset = torchvision.datasets.MNIST(root='./datasets', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True, num_workers=2)

testset = torchvision.datasets.MNIST(root='./datasets', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False, num_workers=2)


# An evaluation function is defined here so that Ablator can evaluate the model during training.

def my_accuracy(y_true, y_pred):
    return accuracy_score(y_true.flatten(), y_pred.flatten())
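The 16 * 4 * 4 input size of fc1 in SimpleCNN follows from the size of the MNIST images and the layers that precede it. The sketch below reproduces that arithmetic in plain Python using the standard output-size formula for convolution and pooling layers without dilation. The helper conv_out is our own and is not part of PyTorch or Ablator.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output side length of a square conv/pool layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


side = 28                    # MNIST images are 28x28 pixels
side = conv_out(side, 5)     # conv1, 5x5 kernel: 28 -> 24
side = conv_out(side, 2, 2)  # pool1, 2x2 window, stride 2: 24 -> 12
side = conv_out(side, 5)     # conv2, 5x5 kernel: 12 -> 8
side = conv_out(side, 2, 2)  # pool2, 2x2 window, stride 2: 8 -> 4

flat_features = 16 * side * side  # 16 output channels from conv2
print(flat_features)  # 256
```

If you change the input resolution or the kernel sizes, rerun this calculation and update both fc1 and the x.view call accordingly.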

Launch Ablator#

As a final step, we wrap up everything we have done so far and launch Ablator.

Before launching Ablator, make sure that the temporary directory used to cache the results exists and is empty. You can create it with the following command:

mkdir /tmp/dir
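If you prefer to prepare the directory from Python, for example at the top of a notebook, the following sketch creates the directory and clears any previous contents. The helper reset_dir is our own and is not part of Ablator. Note that it deletes everything under the given path, so point it only at a directory you can afford to lose.

```python
import shutil
import tempfile
from pathlib import Path


def reset_dir(path: str) -> Path:
    """Create `path` if needed and remove anything already inside it."""
    target = Path(path)
    if target.exists():
        shutil.rmtree(target)
    target.mkdir(parents=True)
    return target


# Demonstrated on a throwaway directory; for this tutorial you would
# call reset_dir("/tmp/dir") instead.
demo_dir = Path(tempfile.mkdtemp()) / "dir"
reset_dir(str(demo_dir))
(demo_dir / "stale.txt").write_text("old run")
reset_dir(str(demo_dir))
print(list(demo_dir.iterdir()))  # []
```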

Then, we can launch Ablator with the following code:

from ablator import ModelWrapper, ProtoTrainer

class MyModelWrapper(ModelWrapper):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def make_dataloader_train(self, run_config: SimpleRunConfig):  # type: ignore
        return trainloader

    def make_dataloader_val(self, run_config: SimpleRunConfig):  # type: ignore
        return testloader

    def evaluation_functions(self) -> Dict[str, Callable]:
        return {"accuracy_score": my_accuracy}


if __name__ == "__main__":
    wrapper = MyModelWrapper(model_class=MyModel)
    # run_config = SimpleRunConfig.load(config)
    # shutil.rmtree(run_config.experiment_dir)
    ablator = ProtoTrainer(
        wrapper=wrapper,
        run_config=run_config,
    )
    ablator.launch()

If you are using Jupyter Notebook, you can run the above code directly in the notebook. If you are using a Python script, save the above code in a file and run it with the following command:

python <your_script_name>.py

If Ablator launches successfully, you should see training information printed to the console!

Access the results#

The training results should have been saved in the temporary directory you specified in run_config. You can inspect them with the following commands:

cd /tmp/dir/<experiment_id>
cat results.json

You should see the training results from each epoch.
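If you do not know the experiment ID, you can search the experiment directory for result files instead. The sketch below only locates files named results.json and makes no assumption about their contents. The helper find_results is our own and is not part of Ablator.

```python
from pathlib import Path
from typing import List


def find_results(experiment_dir: str, filename: str = "results.json") -> List[Path]:
    """Return every `filename` under `experiment_dir`, newest first."""
    root = Path(experiment_dir)
    if not root.is_dir():
        return []
    return sorted(
        root.rglob(filename),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )


# Print the path of each result file found under the tutorial's directory.
for path in find_results("/tmp/dir"):
    print(path)
```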

If you are using Jupyter Notebook, you can also visualize the results with TensorBoard:

# Load the TensorBoard extension
import tensorflow as tf
%load_ext tensorboard

# Start TensorBoard
%tensorboard --logdir /tmp/dir/<experiment_id>/dashboard/tensorboard

Next steps#

Ablator offers far more than what we have shown in this tutorial. Please refer to the following chapters for more features and functionalities of Ablator!