Analysis package#

Subpackages#

Submodules#

Analysis module module#

class ablator.analysis.main.Analysis(results: DataFrame, categorical_attributes: list[str], numerical_attributes: list[str], optim_metrics: dict[str, ablator.main.configs.Optim], save_dir: str | None = None, cache=False)[source]#

Bases: object

A class for analyzing experimental results.

Attributes:
optim_metrics: dict[str, Optim]

A dictionary mapping metric names to optimization directions.

save_dir: str | None

The directory to save analysis results to.

cache: Memory | None

A joblib memory cache for saving results.

categorical_attributes: list[str]

The list of all categorical hyperparameter names.

numerical_attributes: list[str]

The list of all numerical hyperparameter names.

experiment_attributes: list[str]

The list of all hyperparameter names.

results: pd.DataFrame

The dataframe extracted from the results file based on the given metric names and hyperparameter names.

__init__(results: DataFrame, categorical_attributes: list[str], numerical_attributes: list[str], optim_metrics: dict[str, ablator.main.configs.Optim], save_dir: str | None = None, cache=False) None[source]#

Initialize the Analysis class.

Parameters:
results: pd.DataFrame

The result dataframe.

categorical_attributes: list[str]

The list of all categorical hyperparameter names.

numerical_attributes: list[str]

The list of all numerical hyperparameter names.

optim_metrics: dict[str, Optim]

A dictionary mapping metric names to optimization directions.

save_dir: str | None

The directory to save analysis results to.

cache: bool

Whether to cache results.
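
To illustrate what a mapping from metric names to optimization directions conveys, here is a minimal sketch in plain Python. `Direction`, `best_trial`, and the sample rows are hypothetical stand-ins and not ablator's `Optim` or `Analysis` API; the sketch only assumes that each metric is either minimized or maximized.

```python
from enum import Enum


class Direction(Enum):
    """Hypothetical stand-in for an optimization direction (not ablator's Optim)."""

    MIN = "min"
    MAX = "max"


def best_trial(rows, metric, direction):
    """Return the row with the best value of `metric` for the given direction."""
    # Skip rows where the metric was never recorded.
    scored = [r for r in rows if r.get(metric) is not None]
    if not scored:
        return None
    pick = min if direction is Direction.MIN else max
    return pick(scored, key=lambda r: r[metric])


# Made-up results: one row per trial, hyperparameters and metrics side by side.
rows = [
    {"lr": 0.1, "optimizer": "sgd", "val_loss": 0.42, "val_acc": 0.81},
    {"lr": 0.01, "optimizer": "adam", "val_loss": 0.35, "val_acc": 0.85},
    {"lr": 0.001, "optimizer": "adam", "val_loss": 0.39, "val_acc": 0.87},
]
optim_metrics = {"val_loss": Direction.MIN, "val_acc": Direction.MAX}

# One "best" row per metric, chosen according to that metric's direction.
best = {name: best_trial(rows, name, d) for name, d in optim_metrics.items()}
```

The same trial is not necessarily best for every metric, which is why the direction is stored per metric name rather than once for the whole analysis.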

Analysis results module#

class ablator.analysis.results.Results(config: type[ablator.main.configs.ParallelConfig], experiment_dir: str | Path, cache: bool = False, use_ray: bool = False)[source]#

Bases: object

Class for processing experiment results.

Parameters:
config: type[ParallelConfig]

The configuration class used.

experiment_dir: str | Path

The path to the experiment directory.

cache: bool, optional

Whether to cache the results. Defaults to False.

use_ray: bool, optional

Whether to use ray for parallel processing. Defaults to False.

Attributes:
experiment_dir: Path

The path to the experiment directory.

config: type[ParallelConfig]

The configuration class used.

metric_map: dict[str, Optim]

A dictionary mapping optimization metric names to their optimization directions.

data: pd.DataFrame

The processed results of the experiment. Refer to read_results for more details.

config_attrs: list[str]

The list of all optimizable hyperparameter names.

search_space: dict[str, ty.Any]

The full search space of the experiment.

numerical_attributes: list[str]

The list of all numerical hyperparameter names.

categorical_attributes: list[str]

The list of all categorical hyperparameter names.

__init__(config: type[ablator.main.configs.ParallelConfig], experiment_dir: str | Path, cache: bool = False, use_ray: bool = False) None[source]#

property metric_names: list[str]#

Get the list of all optimization metric names.

Returns:
list[str]

The list of optimization metric names.

classmethod read_results(config_type: type[ablator.config.main.ConfigBase], experiment_dir: Path, num_cpus=None) DataFrame[source]#

Read all results from the experiment directory, using ray to process them in parallel.

This function calls read_result repeatedly; refer to read_result for more details.

Parameters:
config_type: type[ConfigBase]

The configuration class.

experiment_dir: Path

The experiment directory.

num_cpus: int, optional

The number of CPUs to use for ray processing. Defaults to None.

Returns:
pd.DataFrame

A dataframe of all the results
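
The fan-out pattern described above (read each result file independently, then combine the rows) can be sketched with the standard library. This is only an illustration: it swaps in `concurrent.futures.ThreadPoolExecutor` for ray, returns a list of dicts rather than a DataFrame, and assumes a hypothetical layout of one `results.json` (holding a JSON list) per trial subdirectory. `load_one` and `load_all` are made-up helper names, not part of ablator.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_one(json_path):
    """Load one trial's rows and tag each with the directory it came from."""
    rows = json.loads(json_path.read_text())
    return [{**row, "path": json_path.parent.name} for row in rows]


def load_all(experiment_dir, max_workers=None):
    """Load every trial under `experiment_dir` concurrently and flatten the rows."""
    # Assumed layout: <experiment_dir>/<trial>/results.json
    paths = sorted(Path(experiment_dir).glob("*/results.json"))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `paths`.
        per_trial = list(pool.map(load_one, paths))
    return [row for rows in per_trial for row in rows]


# Build a throwaway experiment directory with two trials and read it back.
with tempfile.TemporaryDirectory() as tmp:
    for name, acc in [("trial_a", 0.85), ("trial_b", 0.87)]:
        trial_dir = Path(tmp) / name
        trial_dir.mkdir()
        (trial_dir / "results.json").write_text(json.dumps([{"accuracy": acc}]))
    all_rows = load_all(tmp, max_workers=2)
```

Because each trial is read independently, the per-file work parallelizes cleanly, and the only shared step is the final concatenation.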

ablator.analysis.results.process_row(row: str, **aux_info) dict[str, Any] | None[source]#

Process a given row to make it conform to the JSON format, loading it as a JSON object, and updating it with auxiliary information.

Parameters:
row: str

The input row to be processed, expected to be a JSON-like string.

**aux_info: dict

Additional key-value pairs to be added to the resulting dictionary.

Returns:
dict[str, ty.Any] | None

A dictionary resulting from the combination of the processed row and auxiliary information, or None if the input row cannot be parsed as a JSON object.

Raises:
AssertionError

If there are overlapping column names between the auxiliary information and the input row.

Examples

>>> row = '"name": "John Doe", "age": 30'
>>> aux_info = {"city": "San Francisco"}
>>> process_row(row, **aux_info)
{'name': 'John Doe', 'age': 30, 'city': 'San Francisco'}
>>> row = '{"name": "John Doe", "age": 30}'
>>> aux_info = {"age": 25, "city": "San Francisco"}
>>> process_row(row, **aux_info)
AssertionError: Overlapping column names between auxiliary dictionary and run results.
aux_info: {'age': 25, 'city': 'San Francisco'}
row: {"name": "John Doe", "age": 30}
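
The behaviour documented above can be approximated in a few lines. The following is an illustrative re-creation based only on this description, not the library's source; `process_row_sketch` is a hypothetical name, and the brace-wrapping step is an assumption about what "make it conform to the JSON format" involves.

```python
import json


def process_row_sketch(row, **aux_info):
    """Illustrative re-creation of the documented behaviour, not the library source."""
    text = row.strip()
    # Add the enclosing braces if the row is missing them (assumed repair step).
    if not text.startswith("{"):
        text = "{" + text
    if not text.endswith("}"):
        text = text + "}"
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None  # the row cannot be parsed as a JSON object
    # Auxiliary keys must not collide with keys already present in the row.
    overlap = set(parsed) & set(aux_info)
    assert not overlap, f"Overlapping column names: {sorted(overlap)}"
    parsed.update(aux_info)
    return parsed


merged = process_row_sketch('"name": "John Doe", "age": 30', city="San Francisco")
unparsable = process_row_sketch("not json at all")
```

Returning None for an unparsable row lets a caller drop damaged lines, while the assertion fails loudly when auxiliary columns would silently overwrite recorded results.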

ablator.analysis.results.read_result(config_type: type[ablator.config.main.ConfigBase], json_path: Path) DataFrame[source]#

Read the results of an experiment and return them as a pandas DataFrame.

The function reads the data from a JSON file, processes each row, and appends experiment attributes from a YAML configuration file. The resulting DataFrame is indexed and returned.

Parameters:
config_type: type[ConfigBase]

The type of the configuration class that is used to load the experiment configuration from a YAML file.

json_path: Path

The path to the JSON file containing the results of the experiment.

Returns:
pd.DataFrame

A pandas DataFrame containing the processed experiment results.

Raises:
Exception

If there is an error in processing the JSON file or loading the experiment configuration, the exception will be caught and the traceback will be printed.

Examples

>>> result json file:
{
"run_id": "run_1",
"accuracy": 0.85,
"loss": 0.35
}
{
"run_id": "run_2",
"accuracy": 0.87,
"loss": 0.32
}
>>> config file
experiment_name: "My Experiment"
batch_size: 64
>>> return value
   run_id  accuracy  loss experiment_name  batch_size                path
0   run_1      0.85  0.35   My Experiment          64  path/to/experiment
1   run_2      0.87  0.32   My Experiment          64  path/to/experiment
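
As a rough illustration of the example above, the sketch below parses JSON objects written back to back and appends the configuration attributes to every row. It uses only the standard library and produces a list of dicts instead of a DataFrame; `iter_json_objects` is a hypothetical helper, and the on-disk format is assumed from the example rather than taken from the library.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Results as in the example: two objects, one after the other.
results_text = """
{"run_id": "run_1", "accuracy": 0.85, "loss": 0.35}
{"run_id": "run_2", "accuracy": 0.87, "loss": 0.32}
"""
# Attributes that would come from the experiment's YAML configuration.
config_attrs = {"experiment_name": "My Experiment", "batch_size": 64}

# Every result row receives the same configuration columns plus its path.
rows = [
    {**row, **config_attrs, "path": "path/to/experiment"}
    for row in iter_json_objects(results_text)
]
```

Repeating the configuration values on every row is what allows metrics to be grouped or filtered by hyperparameter later.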

Module contents#