Analysis package#
Subpackages#
Submodules#
Analysis module#
- class ablator.analysis.main.Analysis(results: DataFrame, categorical_attributes: list[str], numerical_attributes: list[str], optim_metrics: dict[str, ablator.main.configs.Optim], save_dir: str | None = None, cache=False)[source]#
Bases: object
A class for analyzing experimental results.
- Attributes:
- optim_metrics: dict[str, Optim]
A dictionary mapping metric names to optimization directions.
- save_dir: str | None
The directory to save analysis results to.
- cache: Memory | None
A joblib memory cache for saving results.
- categorical_attributes: list[str]
The list of all the categorical hyperparameter names.
- numerical_attributes: list[str]
The list of all the numerical hyperparameter names.
- experiment_attributes: list[str]
The list of all the hyperparameter names.
- results: pd.DataFrame
The dataframe extracted from the results file based on the given metric names and hyperparameter names.
- __init__(results: DataFrame, categorical_attributes: list[str], numerical_attributes: list[str], optim_metrics: dict[str, ablator.main.configs.Optim], save_dir: str | None = None, cache=False) None [source]#
Initialize the Analysis class.
- Parameters:
- results: pd.DataFrame
The result dataframe.
- categorical_attributes: list[str]
The list of all the categorical hyperparameter names.
- numerical_attributes: list[str]
The list of all the numerical hyperparameter names.
- optim_metrics: dict[str, Optim]
A dictionary mapping metric names to optimization directions.
- save_dir: str | None
The directory to save analysis results to.
- cache: bool
Whether to cache results.
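To illustrate how a mapping from metric names to optimization directions can drive an analysis, here is a minimal sketch. The Direction enum and the best_run helper are hypothetical stand-ins written for this example; they are not part of ablator, and the real Optim type may differ.

```python
from enum import Enum


class Direction(Enum):
    """Hypothetical stand-in for an optimization-direction enum."""

    MIN = "min"
    MAX = "max"


def best_run(rows, metric, direction):
    """Return the row whose `metric` is best under `direction`, or None."""
    # Skip rows that never reported the metric.
    scored = [r for r in rows if r.get(metric) is not None]
    if not scored:
        return None
    pick = min if direction is Direction.MIN else max
    return pick(scored, key=lambda r: r[metric])


# Assumed metric names, chosen only for illustration.
optim_metrics = {"val_loss": Direction.MIN, "val_acc": Direction.MAX}
rows = [
    {"run_id": "run_1", "val_loss": 0.35, "val_acc": 0.85},
    {"run_id": "run_2", "val_loss": 0.32, "val_acc": 0.87},
    {"run_id": "run_3", "val_loss": 0.40, "val_acc": 0.90},
]
best = {m: best_run(rows, m, d)["run_id"] for m, d in optim_metrics.items()}
print(best)
```

Each metric is ranked independently, so the best run can differ from one metric to the next.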
Analysis results module#
- class ablator.analysis.results.Results(config: type[ablator.main.configs.ParallelConfig], experiment_dir: str | Path, cache: bool = False, use_ray: bool = False)[source]#
Bases: object
Class for processing experiment results.
- Parameters:
- config: type[ParallelConfig]
The configuration class used.
- experiment_dir: str | Path
The path to the experiment directory.
- cache: bool, optional
Whether to cache the results, by default False.
- use_ray: bool, optional
Whether to use ray for parallel processing, by default False.
- Attributes:
- experiment_dir: Path
The path to the experiment directory.
- config: type[ParallelConfig]
The configuration class used.
- metric_map: dict[str, Optim]
A dictionary mapping optimize metric names to their optimization direction.
- data: pd.DataFrame
The processed results of the experiment. Refer to read_results for more details.
- config_attrs: list[str]
The list of all the optimizable hyperparameter names.
- search_space: dict[str, ty.Any]
The full search space of the experiment.
- numerical_attributes: list[str]
The list of all the numerical hyperparameter names.
- categorical_attributes: list[str]
The list of all the categorical hyperparameter names.
- __init__(config: type[ablator.main.configs.ParallelConfig], experiment_dir: str | Path, cache: bool = False, use_ray: bool = False) None [source]#
- property metric_names: list[str]#
Get the list of all optimize metric names.
- Returns:
- list[str]
list of optimize metric names
- classmethod read_results(config_type: type[ablator.config.main.ConfigBase], experiment_dir: Path, num_cpus=None) DataFrame [source]#
Read multiple results from the experiment directory, using ray to enable parallel processing.
This function calls read_result many times; refer to read_result for more details.
- Parameters:
- config_type: type[ConfigBase]
The configuration class.
- experiment_dir: Path
The experiment directory.
- num_cpus: int, optional
Number of CPUs to use for ray processing, by default None.
- Returns:
- pd.DataFrame
A dataframe of all the results
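The fan-out pattern described above (one reader call per results file, executed in parallel, then collected) can be sketched with the standard library. This example swaps in concurrent.futures for ray so that it runs without extra dependencies. The read_one and read_all helpers, and the assumption that every trial directory holds a file named results.json, are illustrative and are not taken from ablator.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_one(json_path):
    """Read one results file; return None instead of raising on bad input."""
    try:
        return json.loads(Path(json_path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None


def read_all(experiment_dir, max_workers=None):
    """Read every results file under `experiment_dir` in parallel."""
    # Assumption: each trial directory holds a file named results.json.
    paths = sorted(Path(experiment_dir).rglob("results.json"))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(read_one, paths))
    # Drop files that could not be read or parsed.
    return [r for r in results if r is not None]


with tempfile.TemporaryDirectory() as tmp:
    contents = {
        "trial_a": '{"run_id": "run_1", "loss": 0.35}',
        "trial_b": '{"run_id": "run_2", "loss": 0.32}',
        "trial_c": "not valid json",
    }
    for name, text in contents.items():
        trial_dir = Path(tmp) / name
        trial_dir.mkdir()
        (trial_dir / "results.json").write_text(text, encoding="utf-8")
    all_results = read_all(tmp, max_workers=2)

print([r["run_id"] for r in all_results])
```

Paths are sorted before dispatch and pool.map preserves input order, so the collected results come back in a deterministic order even though the reads run concurrently.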
- ablator.analysis.results.process_row(row: str, **aux_info) dict[str, Any] | None [source]#
Process a given row to make it conform to the JSON format, loading it as a JSON object, and updating it with auxiliary information.
- Parameters:
- row: str
The input row to be processed, expected to be a JSON-like string.
- **aux_info: dict
Additional key-value pairs to be added to the resulting dictionary.
- Returns:
- dict[str, ty.Any] | None
A dictionary resulting from the combination of the processed row and auxiliary information, or None if the input row cannot be parsed as a JSON object.
- Raises:
- AssertionError
If there are overlapping column names between the auxiliary information and the input row.
Examples
>>> row = '"name": "John Doe", "age": 30'
>>> aux_info = {"city": "San Francisco"}
>>> process_row(row, **aux_info)
{'name': 'John Doe', 'age': 30, 'city': 'San Francisco'}

>>> row = '{"name": "John Doe", "age": 30}'
>>> aux_info = {"age": 25, "city": "San Francisco"}
>>> process_row(row, **aux_info)
AssertionError: Overlapping column names between auxiliary dictionary and run results.
aux_info: {'age': 25, 'city': 'San Francisco'}
row: {"name": "John Doe", "age": 30}
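The documented behaviour (normalize the row to JSON, parse it, reject overlapping keys, merge the auxiliary information) can be approximated with the standard json module. The following process_row_sketch is a hypothetical re-creation for illustration, not ablator's implementation. In particular, wrapping a brace-less row in braces is an assumption inferred from the first example above.

```python
import json


def process_row_sketch(row, **aux_info):
    """Hypothetical re-creation of the documented behaviour, not ablator's code."""
    text = row.strip()
    # Assumption: a row missing its outer braces is wrapped before parsing.
    if not text.startswith("{"):
        text = "{" + text
    if not text.endswith("}"):
        text = text + "}"
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Unparseable rows yield None rather than an exception.
        return None
    overlap = set(parsed) & set(aux_info)
    assert not overlap, f"Overlapping column names: {sorted(overlap)}"
    parsed.update(aux_info)
    return parsed


merged = process_row_sketch('"name": "John Doe", "age": 30', city="San Francisco")
print(merged)
print(process_row_sketch("not json"))
```

Returning None for unparseable rows lets a caller filter out corrupt lines without aborting the whole read, while the assertion guards against auxiliary values silently overwriting recorded results.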
- ablator.analysis.results.read_result(config_type: type[ablator.config.main.ConfigBase], json_path: Path) DataFrame [source]#
Read the results of an experiment and return them as a pandas DataFrame.
The function reads the data from a JSON file, processes each row, and appends experiment attributes from a YAML configuration file. The resulting DataFrame is indexed and returned.
- Parameters:
- config_type: type[ConfigBase]
The type of the configuration class that is used to load the experiment configuration from a YAML file.
- json_path: Path
The path to the JSON file containing the results of the experiment.
- Returns:
- pd.DataFrame
A pandas DataFrame containing the processed experiment results.
- Raises:
- Exception
If there is an error in processing the JSON file or loading the experiment configuration, the exception will be caught and the traceback will be printed.
Examples
>>> result json file:
{ "run_id": "run_1", "accuracy": 0.85, "loss": 0.35 }
{ "run_id": "run_2", "accuracy": 0.87, "loss": 0.32 }
>>> config file
experiment_name: "My Experiment"
batch_size: 64
>>> return value
  run_id  accuracy  loss experiment_name  batch_size                path
0  run_1      0.85  0.35   My Experiment          64  path/to/experiment
1  run_2      0.87  0.32   My Experiment          64  path/to/experiment
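As a rough sketch of the flow described above (read the file, process each record, append the experiment attributes and the path), the following uses only the standard library. It returns a list of plain dictionaries rather than a DataFrame, and it takes the configuration attributes as an already-loaded dict instead of reading YAML. The helper names iter_json_objects and read_result_sketch are hypothetical, and the on-disk layout (several JSON objects written one after another) is assumed from the example above.

```python
import json
import tempfile
from pathlib import Path


def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace and separating commas between objects.
        while pos < len(text) and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def read_result_sketch(json_path, config_attrs):
    """Merge every record in `json_path` with `config_attrs` and its directory."""
    json_path = Path(json_path)
    rows = []
    for obj in iter_json_objects(json_path.read_text(encoding="utf-8")):
        row = dict(obj)
        row.update(config_attrs)
        row["path"] = str(json_path.parent)
        rows.append(row)
    return rows


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "results.json"
    path.write_text(
        '{"run_id": "run_1", "accuracy": 0.85, "loss": 0.35}\n'
        '{"run_id": "run_2", "accuracy": 0.87, "loss": 0.32}\n',
        encoding="utf-8",
    )
    rows = read_result_sketch(
        path, {"experiment_name": "My Experiment", "batch_size": 64}
    )

for row in rows:
    print(row)
```

If pandas is available, passing the returned list to pd.DataFrame gives a table with one row per record and one column per key, similar in shape to the return value shown above.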