Setting up environment#

Installation#

Install Ablator via pip:

pip install ablator

For the development version of Ablator (note the quotes, which some shells such as zsh require around extras):

pip install "ablator[dev]"

For the nightly release, install from the dev branch of the git repository:

pip install git+https://github.com/fostiropoulos/ablator.git@dev

Note: The Python version should be 3.10 or newer.
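Before installing, it can be worth confirming the interpreter meets this requirement. A minimal sketch (the helper name `python_is_supported` is illustrative, not part of Ablator):

```python
import sys


def python_is_supported(min_version=(3, 10)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= min_version


if __name__ == "__main__":
    print("Python OK" if python_is_supported() else "Upgrade to Python 3.10+")
```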

Prerequisites#

Setting up ray cluster#

This section provides steps to set up a ray cluster manually for a set of machines that share the same network. However, there are several other ways to set up a ray cluster (AWS, GCP, or Kubernetes). Refer to the Ray clusters documentation for other methods.

We assume that ablator (which includes ray) is already installed on each machine.

Start the Head node#

Choose any node to be the head node and run the following. Note that Ray will choose port 6379 by default:

ray start --head

The command will print out the Ray cluster address, which can be passed to ray start on other machines to start the worker nodes (see below). If you receive a ConnectionError, check your firewall settings and network configuration.

Start Worker nodes#

On each of the other nodes, run the following command to connect to the head node you just created:

ray start --address=<head-node-address:port>

head-node-address:port should be the value printed by the command on the head node (it should look something like 123.45.67.89:6379).
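Once the workers have joined, a short driver script run on any node inside the cluster can confirm that the cluster is reachable. This is a sketch, not Ablator-specific code: it assumes ray (which ships with ablator) is installed and that a cluster was started with ray start; the import guard and broad exception handling exist only so the script degrades gracefully when no cluster is running:

```python
# Sketch: verify a running ray cluster from a node inside it.
try:
    import ray
except ImportError:
    ray = None


def cluster_summary():
    """Attach to the running cluster and return its aggregate resources,
    or None when ray is unavailable or no cluster can be reached."""
    if ray is None:
        return None
    try:
        # address="auto" attaches to the existing cluster; it does not start one.
        ray.init(address="auto")
        resources = ray.cluster_resources()  # e.g. {'CPU': 16.0, ...}
    except Exception:
        return None
    finally:
        if ray.is_initialized():
            ray.shutdown()
    return resources


if __name__ == "__main__":
    print(cluster_summary())
```

If the output lists the combined CPUs (and GPUs, if any) of all machines, the worker nodes connected successfully.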

Note

This tutorial for setting up a ray cluster is adapted from the Ray documentation, where you can also find instructions for launching the cluster with the cluster launcher, which sets up all nodes at once instead of configuring each node manually.

GPU-Accelerated Computing (Optional)#

If you have GPUs available, you can run your experiments with GPU acceleration, which can significantly speed up the training process. To run CUDA Python, you’ll need the CUDA Toolkit installed on a system with CUDA-capable GPUs.

After setting up CUDA, you can install the CUDA-enabled torch package. Refer to the PyTorch installation instructions for details. A sample command to install the CUDA-enabled torch packages can be found below:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu117 --force-reinstall
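After installing, you can check whether PyTorch actually sees a GPU. A minimal sketch (the helper name `cuda_status` is illustrative; the import is guarded so the script degrades gracefully when torch is absent):

```python
def cuda_status():
    """Report whether a CUDA-capable GPU is visible to PyTorch.

    Returns (available, device_count); (False, 0) when torch is not
    installed or no GPU is visible.
    """
    try:
        import torch
    except ImportError:
        return False, 0
    if not torch.cuda.is_available():
        return False, 0
    return True, torch.cuda.device_count()


if __name__ == "__main__":
    available, count = cuda_status()
    print(f"CUDA available: {available}, devices: {count}")
```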