Qlib Recorder: Experiment Management

Introduction

Qlib contains an experiment management system named QlibRecorder, which is designed to help users manage experiments and analyse results efficiently.

There are three components of the system:

  • ExperimentManager
    a class that manages experiments.
  • Experiment
    a class representing an experiment; each instance is responsible for a single experiment.
  • Recorder
    a class representing a recorder; each instance is responsible for a single run.

Here is a general view of the structure of the system:

This experiment management system defines a set of interfaces and provides a concrete implementation, MLflowExpManager, which is based on the machine learning platform MLflow.

If users set the implementation of ExpManager to be MLflowExpManager, they can use the command mlflow ui to visualize and check the experiment results. For more information, please refer to the MLflow documentation.

Qlib Recorder

QlibRecorder provides a high-level API for using the experiment management system. The interfaces are wrapped in the variable R in Qlib, and users can use R directly to interact with the system. The following command shows how to import R in Python:

from qlib.workflow import R

QlibRecorder includes several common APIs for managing experiments and recorders within a workflow. For more available APIs, please refer to the following sections on Experiment Manager, Experiment and Recorder.

Here are the available interfaces of QlibRecorder:

class qlib.workflow.__init__.QlibRecorder(exp_manager: qlib.workflow.expm.ExpManager)

A global system that helps to manage the experiments.

__init__(exp_manager: qlib.workflow.expm.ExpManager)

Initialize self. See help(type(self)) for accurate signature.

start(*, experiment_id: Optional[str] = None, experiment_name: Optional[str] = None, recorder_id: Optional[str] = None, recorder_name: Optional[str] = None, uri: Optional[str] = None, resume: bool = False)

Method to start an experiment. This method can only be called within a Python with statement. Here is the example code:

# start new experiment and recorder
with R.start(experiment_name='test', recorder_name='recorder_1'):
    model.fit(dataset)
    R.log...
    ... # further operations

# resume previous experiment and recorder
# (to resume a recorder, specify exactly the same experiment and recorder names)
with R.start(experiment_name='test', recorder_name='recorder_1', resume=True):
    ... # further operations
Parameters:
  • experiment_id (str) – id of the experiment one wants to start.
  • experiment_name (str) – name of the experiment one wants to start.
  • recorder_id (str) – id of the recorder under the experiment one wants to start.
  • recorder_name (str) – name of the recorder under the experiment one wants to start.
  • uri (str) – the tracking uri of the experiment, where all the artifacts, metrics, etc. will be stored. The default uri is set in qlib.config. Note that this argument does not change the uri defined in the config file, so the next time users call this function for the same experiment they must pass the same value again; otherwise the uris may be inconsistent.
  • resume (bool) – whether to resume the specific recorder with given name under the given experiment.
start_exp(*, experiment_id=None, experiment_name=None, recorder_id=None, recorder_name=None, uri=None, resume=False)

Lower-level method for starting an experiment. When using this method, one must end the experiment manually, and the status of the recorder may not be handled properly. Here is the example code:

R.start_exp(experiment_name='test', recorder_name='recorder_1')
... # further operations
R.end_exp('FINISHED')  # or, equivalently, R.end_exp(Recorder.STATUS_FI)
Parameters:
  • experiment_id (str) – id of the experiment one wants to start.
  • experiment_name (str) – the name of the experiment to be started
  • recorder_id (str) – id of the recorder under the experiment one wants to start.
  • recorder_name (str) – name of the recorder under the experiment one wants to start.
  • uri (str) – the tracking uri of the experiment, where all the artifacts, metrics, etc. will be stored. The default uri is set in qlib.config.
  • resume (bool) – whether to resume the specific recorder with given name under the given experiment.
Returns:

An experiment instance being started.
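Because start_exp leaves the clean-up to the caller, one common way to keep the recorder status consistent is to pair it with end_exp in a try/except block. The following is only a sketch of that pattern: FakeR is a hypothetical stand-in that merely records calls so the snippet runs without Qlib, and in real code its role would be played by R.

```python
class FakeR:
    """Hypothetical stand-in for R: it only remembers the last status it was given."""

    def __init__(self):
        self.active = False
        self.last_status = None

    def start_exp(self, *, experiment_name=None, recorder_name=None):
        self.active = True

    def end_exp(self, recorder_status="FINISHED"):
        self.active = False
        self.last_status = recorder_status


def run_manually(r, work):
    """Start an experiment, run `work`, and always end with an explicit status."""
    r.start_exp(experiment_name="test", recorder_name="recorder_1")
    try:
        work()
    except Exception:
        r.end_exp("FAILED")  # mark the run as failed, then re-raise
        raise
    else:
        r.end_exp("FINISHED")


def failing_work():
    raise ValueError("training diverged")


ok, bad = FakeR(), FakeR()
run_manually(ok, lambda: None)
try:
    run_manually(bad, failing_work)
except ValueError:
    pass
print(ok.last_status, bad.last_status)
```

Using the with R.start(...) form avoids having to write this bookkeeping by hand.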

end_exp(recorder_status='FINISHED')

Method for ending an experiment manually. It will end the current active experiment, as well as its active recorder with the specified status type. Here is the example code of the method:

R.start_exp(experiment_name='test')
... # further operations
R.end_exp('FINISHED')  # or, equivalently, R.end_exp(Recorder.STATUS_FI)
Parameters:recorder_status (str) – the status of the recorder, which can be SCHEDULED, RUNNING, FINISHED or FAILED.
search_records(experiment_ids, **kwargs)

Get a pandas DataFrame of records that fit the search criteria.

The arguments of this function are not fixed; they differ between implementations of ExpManager in Qlib. Qlib currently provides an MLflow-based implementation of ExpManager, and here is example code for this method with MLflowExpManager:

R.log_metrics(m=2.50, step=0)
records = R.search_records([experiment_id], order_by=["metrics.m DESC"])
Parameters:
  • experiment_ids (list) – list of experiment IDs.
  • filter_string (str) – filter query string, defaults to searching all runs.
  • run_view_type (int) – one of enum values ACTIVE_ONLY, DELETED_ONLY, or ALL (e.g. in mlflow.entities.ViewType).
  • max_results (int) – the maximum number of runs to put in the dataframe.
  • order_by (list) – list of columns to order by (e.g., “metrics.rmse”).
Returns:

A pandas.DataFrame of records, where each metric, parameter, and tag is expanded into its own column named metrics.*, params.*, and tags.* respectively. For records that don’t have a particular metric, parameter, or tag, the value will be (NumPy) NaN, None, or None respectively.
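To make the column layout described above concrete, here is a small standard-library sketch that expands nested records into metrics.*, params.* and tags.* columns. It is an illustration only: flatten_records and the sample records are hypothetical, the result is a list of plain dicts rather than a pandas.DataFrame, and math.nan stands in for the NumPy NaN.

```python
import math


def flatten_records(records):
    """Expand nested metrics/params/tags into flat 'metrics.x'-style columns."""
    prefixes = ("metrics", "params", "tags")
    # Collect every column that appears in at least one record.
    columns = sorted(
        {f"{p}.{k}" for rec in records for p in prefixes for k in rec.get(p, {})}
    )
    rows = []
    for rec in records:
        row = {}
        for col in columns:
            prefix, key = col.split(".", 1)
            # Missing metrics become NaN; missing params and tags become None.
            missing = math.nan if prefix == "metrics" else None
            row[col] = rec.get(prefix, {}).get(key, missing)
        rows.append(row)
    return rows


records = [
    {"metrics": {"m": 2.5}, "params": {"lr": "0.01"}, "tags": {}},
    {"metrics": {}, "params": {}, "tags": {"stage": "dev"}},
]
rows = flatten_records(records)
print(sorted(rows[0]))
```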

list_experiments()

Method for listing all the existing experiments (except for those that have been deleted).

exps = R.list_experiments()
Returns:A dictionary (name -> experiment) of the stored experiment information.
list_recorders(experiment_id=None, experiment_name=None)

Method for listing all the recorders of the experiment with the given id or name.

If the user doesn’t provide the id or name of the experiment, this method will try to retrieve the default experiment and list all of its recorders. If the default experiment doesn’t exist, the method will first create the default experiment, and then create a new recorder under it. (More information about the default experiment can be found in the Experiment section.)

Here is the example code:

recorders = R.list_recorders(experiment_name='test')
Parameters:
  • experiment_id (str) – id of the experiment.
  • experiment_name (str) – name of the experiment.
Returns:

A dictionary (id -> recorder) of the stored recorder information.

get_exp(*, experiment_id=None, experiment_name=None, create: bool = True, start: bool = False) → qlib.workflow.exp.Experiment

Method for retrieving an experiment with the given id or name. When the create argument is set to True and no valid experiment is found, this method will create one for you. Otherwise, it will only retrieve the specified experiment or raise an Error.

  • If ‘create’ is True:

    • If an active experiment exists:

      • no id or name specified, return the active experiment.
      • if id or name is specified, return the specified experiment. If no such experiment is found, create a new experiment with the given id or name.
    • If no active experiment exists:

      • no id or name specified, create a default experiment, and set it to be active.
      • if id or name is specified, return the specified experiment. If no such experiment is found, create a new experiment with the given name, or the default experiment.
  • Else if ‘create’ is False:

    • If an active experiment exists:

      • no id or name specified, return the active experiment.
      • if id or name is specified, return the specified experiment. If no such experiment is found, raise Error.
    • If no active experiment exists:

      • no id or name specified: if the default experiment exists, return it; otherwise, raise Error.
      • if id or name is specified, return the specified experiment. If no such experiment is found, raise Error.

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    exp = R.get_exp()
    recorders = exp.list_recorders()

# Case 2
with R.start(experiment_name='test'):
    exp = R.get_exp(experiment_name='test1')

# Case 3
exp = R.get_exp()  # -> a default experiment

# Case 4
exp = R.get_exp(experiment_name='test')

# Case 5
exp = R.get_exp(create=False)  # -> the default experiment, if it exists
Parameters:
  • experiment_id (str) – id of the experiment.
  • experiment_name (str) – name of the experiment.
  • create (boolean) – determines whether the method will automatically create a new experiment according to the user’s specification if the experiment hasn’t been created before.
  • start (bool) – when start is True and the experiment has not been started (is not active), it will be started. This is designed so that R.log_params can start experiments automatically.
Returns:

An experiment instance with the given id or name.
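The create rules listed above can be modelled with a few lines of plain Python. The sketch below is not Qlib's implementation: ToyExpManager is a hypothetical in-memory model that looks experiments up by name only, represents each experiment as a dict, and raises ValueError where the documentation says "raise Error". The default name 'Experiment' follows the default experiment name mentioned in this document.

```python
class ToyExpManager:
    """Hypothetical in-memory model of the get_exp rules (lookup by name only)."""

    DEFAULT_NAME = "Experiment"

    def __init__(self):
        self.experiments = {}  # name -> experiment (a plain dict here)
        self.active = None     # name of the active experiment, if any

    def get_exp(self, *, experiment_name=None, create=True):
        make_active = False
        if experiment_name is None:
            if self.active is not None:
                return self.experiments[self.active]  # the active experiment wins
            experiment_name = self.DEFAULT_NAME       # otherwise use the default
            make_active = True
        if experiment_name in self.experiments:
            return self.experiments[experiment_name]
        if not create:
            raise ValueError(f"no such experiment: {experiment_name!r}")
        exp = {"name": experiment_name}
        self.experiments[experiment_name] = exp
        if make_active:
            self.active = experiment_name  # a newly created default becomes active
        return exp


manager = ToyExpManager()
try:
    manager.get_exp(create=False)  # nothing exists yet, so this raises
except ValueError as err:
    print("create=False:", err)
default = manager.get_exp()
test = manager.get_exp(experiment_name="test")
print(default["name"], test["name"], manager.active)
```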

delete_exp(experiment_id=None, experiment_name=None)

Method for deleting the experiment with the given id or name. At least one of id or name must be given; otherwise an error will occur.

Here is the example code:

R.delete_exp(experiment_name='test')
Parameters:
  • experiment_id (str) – id of the experiment.
  • experiment_name (str) – name of the experiment.
get_uri()

Method for retrieving the uri of current experiment manager.

Here is the example code:

uri = R.get_uri()
Returns:The uri of the current experiment manager.
set_uri(uri: Optional[str])

Method to reset the default uri of current experiment manager.

NOTE:

  • When the uri refers to a file path, please use the absolute path instead of strings like “~/mlruns/”. The backend doesn’t support strings like this.
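One way to follow this note is to expand the path before handing it to set_uri. The helper below is a minimal sketch using only pathlib; to_absolute_path is a hypothetical name, and the final call to R.set_uri is left as a comment so that the snippet runs without Qlib.

```python
from pathlib import Path


def to_absolute_path(path_like: str) -> str:
    """Expand '~' and relative segments so the result is an absolute path."""
    return str(Path(path_like).expanduser().resolve())


uri = to_absolute_path("~/mlruns/")
print(uri)
# R.set_uri(uri)  # assumption: pass the expanded path instead of "~/mlruns/"
```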
uri_context(uri: str)

Temporarily set the exp_manager’s default_uri to uri.

NOTE:

  • Please refer to the NOTE in set_uri.

Parameters:uri (Text) – the temporary uri.
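The idea of a temporary uri can be sketched with contextlib: remember the previous value, override it, and restore it when the block exits. ToyManager below is a hypothetical stand-in written for illustration and is not the Qlib class; the paths are made-up examples.

```python
from contextlib import contextmanager


class ToyManager:
    """Hypothetical stand-in that holds a default tracking uri."""

    def __init__(self, default_uri):
        self.default_uri = default_uri

    @contextmanager
    def uri_context(self, uri):
        previous = self.default_uri
        self.default_uri = uri
        try:
            yield
        finally:
            self.default_uri = previous  # restored even if the body raises


manager = ToyManager("/data/mlruns")
with manager.uri_context("/tmp/scratch_runs"):
    inside = manager.default_uri
after = manager.default_uri
print(inside, after)
```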
get_recorder(*, recorder_id=None, recorder_name=None, experiment_id=None, experiment_name=None) → qlib.workflow.recorder.Recorder

Method for retrieving a recorder.

  • If an active recorder exists:

    • no id or name specified, return the active recorder.
    • if id or name is specified, return the specified recorder.
  • If no active recorder exists:

    • no id or name specified, raise Error.
    • if id or name is specified, the corresponding experiment_name must also be given; the specified recorder is then returned. Otherwise, raise Error.

The recorder can be used for further processing, such as save_object, load_object, log_params, log_metrics, etc.

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    recorder = R.get_recorder()

# Case 2
with R.start(experiment_name='test'):
    recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')

# Case 3
recorder = R.get_recorder()  # -> Error

# Case 4
recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')  # -> Error

# Case 5
recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d', experiment_name='test')

Here are some things users may be concerned about:

  • Q: Which recorder will be returned if multiple recorders match the query (e.g. a query with experiment_name)?
  • A: If the MLflow backend is used, the recorder with the latest start_time will be returned, because MLflow’s search_runs function guarantees it.

Parameters:
  • recorder_id (str) – id of the recorder.
  • recorder_name (str) – name of the recorder.
  • experiment_name (str) – name of the experiment.
Returns:

A recorder instance.

delete_recorder(recorder_id=None, recorder_name=None)

Method for deleting the recorder with the given id or name. At least one of id or name must be given; otherwise an error will occur.

Here is the example code:

R.delete_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')
Parameters:
  • recorder_id (str) – id of the recorder.
  • recorder_name (str) – name of the recorder.
save_objects(local_path=None, artifact_path=None, **kwargs)

Method for saving objects as artifacts in the experiment to the uri. It supports either saving from a local file/directory or saving objects directly. Users can use valid Python keyword arguments to specify the object to be saved as well as its name (name: value).

In summary, this API is designed for saving objects to the experiment management backend path:

  1. Qlib provides two ways to specify objects:
    • Passing the object directly via **kwargs (e.g. R.save_objects(trained_model=model)).
    • Passing the local path to the object, i.e. the local_path parameter.
  2. artifact_path represents the experiment management backend path.

  • If an active recorder exists: it will save the objects through the active recorder.
  • If no active recorder exists: the system will create a default experiment and a new recorder, and save the objects under it.

Note

If one wants to save objects with a specific recorder, it is recommended to first get that recorder through the get_recorder API and then use the recorder to save the objects. The supported arguments are the same as for this method.

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    pred = model.predict(dataset)
    R.save_objects(**{"pred.pkl": pred}, artifact_path='prediction')
    rid = R.get_recorder().id
...
R.get_recorder(recorder_id=rid).load_object("prediction/pred.pkl")  #  after saving objects, you can load the previous object with this api

# Case 2
with R.start(experiment_name='test'):
    R.save_objects(local_path='results/pred.pkl', artifact_path="prediction")
    rid = R.get_recorder().id
...
R.get_recorder(recorder_id=rid).load_object("prediction/pred.pkl")  #  after saving objects, you can load the previous object with this api
Parameters:
  • local_path (str) – if provided, then save the file or directory to the artifact URI.
  • artifact_path (str) – the relative path for the artifact to be stored in the URI.
  • **kwargs (Dict[Text, Any]) – the objects to be saved. For example, {"pred.pkl": pred}
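The relationship between the keyword name, artifact_path and the name later passed to load_object can be illustrated with a local, pickle-based sketch. This is not the Qlib backend: the two functions below are hypothetical helpers that write into a temporary directory, assuming each object is stored at <artifact root>/<artifact_path>/<name>.

```python
import pickle
import tempfile
from pathlib import Path


def save_objects(root: Path, artifact_path: str = "", **objects) -> None:
    """Pickle each keyword argument to <root>/<artifact_path>/<name>."""
    target = root / artifact_path
    target.mkdir(parents=True, exist_ok=True)
    for name, obj in objects.items():
        with open(target / name, "wb") as f:
            pickle.dump(obj, f)


def load_object(root: Path, name: str):
    """Load a pickled object by its path relative to the artifact root."""
    with open(root / name, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # "pred.pkl" is not a valid identifier, so it is passed through a dict.
    save_objects(root, artifact_path="prediction", **{"pred.pkl": [0.1, 0.2]})
    restored = load_object(root, "prediction/pred.pkl")
print(restored)
```

The dict-unpacking form mirrors the R.save_objects(**{"pred.pkl": pred}, ...) call in the use cases above, which is needed whenever the desired file name contains a dot.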
load_object(name: str)

Method for loading an object from artifacts in the experiment in the uri.

log_params(**kwargs)

Method for logging parameters during an experiment. In addition to using R, one can also log to a specific recorder after getting it with get_recorder API.

  • If an active recorder exists: it will log parameters through the active recorder.
  • If no active recorder exists: the system will create a default experiment as well as a new recorder, and log parameters under it.

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    R.log_params(learning_rate=0.01)

# Case 2
R.log_params(learning_rate=0.01)
Parameters:argument (keyword) – name1=value1, name2=value2, …
log_metrics(step=None, **kwargs)

Method for logging metrics during an experiment. In addition to using R, one can also log to a specific recorder after getting it with get_recorder API.

  • If an active recorder exists: it will log metrics through the active recorder.
  • If no active recorder exists: the system will create a default experiment as well as a new recorder, and log metrics under it.

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    R.log_metrics(train_loss=0.33, step=1)

# Case 2
R.log_metrics(train_loss=0.33, step=1)
Parameters:argument (keyword) – name1=value1, name2=value2, …
log_artifact(local_path: str, artifact_path: Optional[str] = None)

Log a local file or directory as an artifact of the currently active run.

  • If an active recorder exists: it will log the artifact through the active recorder.
  • If no active recorder exists: the system will create a default experiment as well as a new recorder, and log the artifact under it.
Parameters:
  • local_path (str) – Path to the file to write.
  • artifact_path (Optional[str]) – If provided, the directory in artifact_uri to write to.
download_artifact(path: str, dst_path: Optional[str] = None) → str

Download an artifact file or directory from a run to a local directory if applicable, and return a local path for it.

Parameters:
  • path (str) – Relative source path to the desired artifact.
  • dst_path (Optional[str]) – Absolute path of the local filesystem destination directory to which to download the specified artifacts. This directory must already exist. If unspecified, the artifacts will be downloaded to a new uniquely-named directory on the local filesystem.
Returns:

Local path of desired artifact.

Return type:

str

set_tags(**kwargs)

Method for setting tags for a recorder. In addition to using R, one can also set the tag to a specific recorder after getting it with get_recorder API.

  • If an active recorder exists: it will set tags through the active recorder.
  • If no active recorder exists: the system will create a default experiment as well as a new recorder, and set the tags under it.

Here are some use cases:

# Case 1
with R.start(experiment_name='test'):
    R.set_tags(release_version="2.2.0")

# Case 2
R.set_tags(release_version="2.2.0")
Parameters:argument (keyword) – name1=value1, name2=value2, …

Experiment Manager

The ExpManager module in Qlib is responsible for managing different experiments. Most of the APIs of ExpManager are similar to those of QlibRecorder, and the most important one is the get_exp method. Users can refer to the documentation above for detailed information about how to use get_exp.

class qlib.workflow.expm.ExpManager(uri: str, default_exp_name: Optional[str])

This is the ExpManager class for managing experiments. The API is designed similar to mlflow. (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)

The ExpManager is expected to be a singleton (although there can be multiple Experiments with different uris: users can get different experiments from different uris and then compare their records). The global config (i.e. C) is also a singleton.

So we try to align them together. They share the same variable, which is called the default uri. Please refer to ExpManager.default_uri for details of the variable sharing.

When the user starts an experiment, the user may want to set the uri to a specific uri (which overrides the default uri during this period), and then unset the specific uri and fall back to the default uri. ExpManager._active_exp_uri is that specific uri.

__init__(uri: str, default_exp_name: Optional[str])

Initialize self. See help(type(self)) for accurate signature.

start_exp(*, experiment_id: Optional[str] = None, experiment_name: Optional[str] = None, recorder_id: Optional[str] = None, recorder_name: Optional[str] = None, uri: Optional[str] = None, resume: bool = False, **kwargs) → qlib.workflow.exp.Experiment

Start an experiment. This method first gets or creates an experiment, and then sets it to be active.

Maintaining _active_exp_uri is handled in start_exp; the remaining implementation should be included in _end_exp in the subclass.

Parameters:
  • experiment_id (str) – id of the active experiment.
  • experiment_name (str) – name of the active experiment.
  • recorder_id (str) – id of the recorder to be started.
  • recorder_name (str) – name of the recorder to be started.
  • uri (str) – the current tracking URI.
  • resume (boolean) – whether to resume the experiment and recorder.
Returns:

An active experiment.

end_exp(recorder_status: str = 'SCHEDULED', **kwargs)

End an active experiment.

Maintaining _active_exp_uri is handled in end_exp; the remaining implementation should be included in _end_exp in the subclass.

Parameters:
  • experiment_name (str) – name of the active experiment.
  • recorder_status (str) – the status of the active recorder of the experiment.
create_exp(experiment_name: Optional[str] = None)

Create an experiment.

Parameters:experiment_name (str) – the experiment name, which must be unique.
Returns:An experiment object.
Raises:ExpAlreadyExistError
search_records(experiment_ids=None, **kwargs)

Get a pandas DataFrame of records that fit the search criteria of the experiment. Inputs are the search criteria users want to apply.

Returns:A pandas.DataFrame of records, where each metric, parameter, and tag is expanded into its own column named metrics.*, params.*, and tags.* respectively. For records that don’t have a particular metric, parameter, or tag, the value will be (NumPy) NaN, None, or None respectively.
get_exp(*, experiment_id=None, experiment_name=None, create: bool = True, start: bool = False)

Retrieve an experiment. This method covers getting the active experiment as well as getting or creating a specific experiment.

When the user specifies an experiment id or name, the method will try to return that specific experiment. When the user does not provide an experiment id or name, the method will try to return the current active experiment. The create argument determines whether the method will automatically create a new experiment according to the user’s specification if the experiment hasn’t been created before.

  • If create is True:

    • If an active experiment exists:

      • no id or name specified, return the active experiment.
      • if id or name is specified, return the specified experiment. If no such experiment is found, create a new experiment with the given id or name. If start is set to True, the experiment is set to be active.
    • If no active experiment exists:

      • no id or name specified, create a default experiment.
      • if id or name is specified, return the specified experiment. If no such experiment is found, create a new experiment with the given id or name. If start is set to True, the experiment is set to be active.
  • Else if create is False:

    • If an active experiment exists:

      • no id or name specified, return the active experiment.
      • if id or name is specified, return the specified experiment. If no such experiment is found, raise Error.
    • If no active experiment exists:

      • no id or name specified: if the default experiment exists, return it; otherwise, raise Error.
      • if id or name is specified, return the specified experiment. If no such experiment is found, raise Error.
Parameters:
  • experiment_id (str) – id of the experiment to return.
  • experiment_name (str) – name of the experiment to return.
  • create (boolean) – create the experiment if it hasn’t been created before.
  • start (boolean) – start the new experiment if one is created.
Returns:

An experiment object.

delete_exp(experiment_id=None, experiment_name=None)

Delete an experiment.

Parameters:
  • experiment_id (str) – the experiment id.
  • experiment_name (str) – the experiment name.
default_uri

Get the default tracking URI from qlib.config.C

uri

Get the default tracking URI or current URI.

Returns:The tracking URI string.
list_experiments()

List all the existing experiments.

Returns:A dictionary (name -> experiment) of the stored experiment information.

For other interfaces such as create_exp, delete_exp, please refer to Experiment Manager API.

Experiment

The Experiment class is solely responsible for a single experiment, and it handles any operations related to that experiment. Basic methods such as starting and ending an experiment are included. In addition, methods related to recorders are available, such as get_recorder and list_recorders.

class qlib.workflow.exp.Experiment(id, name)

This is the Experiment class for each experiment being run. The API is designed similar to mlflow. (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)

__init__(id, name)

Initialize self. See help(type(self)) for accurate signature.

start(*, recorder_id=None, recorder_name=None, resume=False)

Start the experiment and set it to be active. This method will also start a new recorder.

Parameters:
  • recorder_id (str) – the id of the recorder to be created.
  • recorder_name (str) – the name of the recorder to be created.
  • resume (bool) – whether to resume the first recorder
Returns:

An active recorder.

end(recorder_status='SCHEDULED')

End the experiment.

Parameters:recorder_status (str) – the status to set on the recorder when ending (SCHEDULED, RUNNING, FINISHED, FAILED).
create_recorder(recorder_name=None)

Create a recorder for each experiment.

Parameters:recorder_name (str) – the name of the recorder to be created.
Returns:A recorder object.
search_records(**kwargs)

Get a pandas DataFrame of records that fit the search criteria of the experiment. Inputs are the search criteria users want to apply.

Returns:A pandas.DataFrame of records, where each metric, parameter, and tag is expanded into its own column named metrics.*, params.*, and tags.* respectively. For records that don’t have a particular metric, parameter, or tag, the value will be (NumPy) NaN, None, or None respectively.
delete_recorder(recorder_id)

Delete a recorder of the experiment.

Parameters:recorder_id (str) – the id of the recorder to be deleted.
get_recorder(recorder_id=None, recorder_name=None, create: bool = True, start: bool = False) → qlib.workflow.recorder.Recorder

Retrieve a Recorder for the user. When the user specifies a recorder id or name, the method will try to return that specific recorder. When the user does not provide a recorder id or name, the method will try to return the current active recorder. The create argument determines whether the method will automatically create a new recorder according to the user’s specification if the recorder hasn’t been created before.

  • If create is True:

    • If an active recorder exists:

      • no id or name specified, return the active recorder.
      • if id or name is specified, return the specified recorder. If no such recorder is found, create a new recorder with the given id or name. If start is set to True, the recorder is set to be active.
    • If no active recorder exists:

      • no id or name specified, create a new recorder.
      • if id or name is specified, return the specified recorder. If no such recorder is found, create a new recorder with the given id or name. If start is set to True, the recorder is set to be active.
  • Else if create is False:

    • If an active recorder exists:

      • no id or name specified, return the active recorder.
      • if id or name is specified, return the specified recorder. If no such recorder is found, raise Error.
    • If no active recorder exists:

      • no id or name specified, raise Error.
      • if id or name is specified, return the specified recorder. If no such recorder is found, raise Error.
Parameters:
  • recorder_id (str) – the id of the recorder to be retrieved.
  • recorder_name (str) – the name of the recorder to be retrieved.
  • create (boolean) – create the recorder if it hasn’t been created before.
  • start (boolean) – start the new recorder if one is created.
Returns:

A recorder object.

list_recorders(rtype: typing_extensions.Literal['dict', 'list'] = 'dict', **flt_kwargs) → Union[List[qlib.workflow.recorder.Recorder], Dict[str, qlib.workflow.recorder.Recorder]]

List all the existing recorders of this experiment. Please get the experiment instance first before calling this method. If users want to use the method R.list_recorders(), please refer to the related API documentation in QlibRecorder.

Parameters:flt_kwargs (dict) – filter recorders by conditions, e.g. list_recorders(status=Recorder.STATUS_FI)
Returns:
  • if rtype == "dict": a dictionary (id -> recorder) of the stored recorder information.
  • if rtype == "list": a list of Recorder instances.
Return type:The return type depends on rtype

For other interfaces such as search_records, delete_recorder, please refer to Experiment API.

Qlib also provides a default Experiment, which is created and used in certain situations when users call APIs such as log_metrics or get_exp. When the default Experiment is used, Qlib logs related information at run time. The default Experiment is named ‘Experiment’; users can change this name in Qlib’s config file or during Qlib’s initialization.

Recorder

The Recorder class is responsible for a single recorder. It handles detailed operations such as log_metrics and log_params for a single run. It is designed to help users easily track the results and other things generated during a run.

Here are some important APIs that are not included in the QlibRecorder:

class qlib.workflow.recorder.Recorder(experiment_id, name)

This is the Recorder class for logging the experiments. The API is designed similar to mlflow. (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)

The status of the recorder can be SCHEDULED, RUNNING, FINISHED, FAILED.

__init__(experiment_id, name)

Initialize self. See help(type(self)) for accurate signature.

save_objects(local_path=None, artifact_path=None, **kwargs)

Save objects such as prediction files or model checkpoints to the artifact URI. Users can save objects through keyword arguments (name: value).

Please refer to the docs of qlib.workflow:R.save_objects

Parameters:
  • local_path (str) – if provided, then save the file or directory to the artifact URI.
  • artifact_path (str) – the relative path for the artifact to be stored in the URI.
load_object(name)

Load objects such as prediction file or model checkpoints.

Parameters:name (str) – name of the file to be loaded.
Returns:The saved object.
start_run()

Start running or resuming the Recorder. The return value can be used as a context manager within a with block; otherwise, you must call end_run() to terminate the current run. (See ActiveRun class in mlflow)

Returns:An active running object (e.g. mlflow.ActiveRun object)
end_run()

End an active Recorder.

log_params(**kwargs)

Log a batch of params for the current run.

Parameters:arguments (keyword) – key, value pair to be logged as parameters.
log_metrics(step=None, **kwargs)

Log multiple metrics for the current run.

Parameters:arguments (keyword) – key, value pair to be logged as metrics.
log_artifact(local_path: str, artifact_path: Optional[str] = None)

Log a local file or directory as an artifact of the currently active run.

Parameters:
  • local_path (str) – Path to the file to write.
  • artifact_path (Optional[str]) – If provided, the directory in artifact_uri to write to.
set_tags(**kwargs)

Log a batch of tags for the current run.

Parameters:arguments (keyword) – key, value pair to be logged as tags.
delete_tags(*keys)

Delete some tags from a run.

Parameters:keys (series of str) – the names of the tags to be deleted.
list_artifacts(artifact_path: str = None)

List all the artifacts of a recorder.

Parameters:artifact_path (str) – the relative path for the artifact to be stored in the URI.
Returns:A list of the stored artifact information (name, path, etc.).
download_artifact(path: str, dst_path: Optional[str] = None) → str

Download an artifact file or directory from a run to a local directory if applicable, and return a local path for it.

Parameters:
  • path (str) – Relative source path to the desired artifact.
  • dst_path (Optional[str]) – Absolute path of the local filesystem destination directory to which to download the specified artifacts. This directory must already exist. If unspecified, the artifacts will be downloaded to a new uniquely-named directory on the local filesystem.
Returns:

Local path of desired artifact.

Return type:

str

list_metrics()

List all the metrics of a recorder.

Returns:A dictionary of the stored metrics.
list_params()

List all the params of a recorder.

Returns:A dictionary of the stored params.
list_tags()

List all the tags of a recorder.

Returns:A dictionary of the stored tags.

For other interfaces such as save_objects, load_object, please refer to Recorder API.

Record Template

The RecordTemp class enables generating experiment results such as IC and backtest results in a certain format. We provide three Record Template classes:

  • SignalRecord: This class generates the prediction results of the model.
  • SigAnaRecord: This class generates the IC, ICIR, Rank IC and Rank ICIR of the model.

Here is a simple example of what is done in SigAnaRecord, which users can refer to if they want to calculate IC, Rank IC, Long-Short Return with their own prediction and label.

from qlib.contrib.eva.alpha import calc_ic, calc_long_short_return

ic, ric = calc_ic(pred.iloc[:, 0], label.iloc[:, 0])
long_short_r, long_avg_r = calc_long_short_return(pred.iloc[:, 0], label.iloc[:, 0])
  • PortAnaRecord: This class generates the results of backtest. For detailed information about backtest and the available strategies, users can refer to Strategy and Backtest.

Here is a simple example of what is done in PortAnaRecord, which users can refer to if they want to do backtest based on their own prediction and label.

import pandas as pd

from qlib.contrib.strategy.strategy import TopkDropoutStrategy
from qlib.contrib.evaluate import (
    backtest as normal_backtest,
    risk_analysis,
)

# backtest
# (BENCHMARK and pred_score are assumed to be defined by the user beforehand)
STRATEGY_CONFIG = {
    "topk": 50,
    "n_drop": 5,
}
BACKTEST_CONFIG = {
    "limit_threshold": 0.095,
    "account": 100000000,
    "benchmark": BENCHMARK,
    "deal_price": "close",
    "open_cost": 0.0005,
    "close_cost": 0.0015,
    "min_cost": 5,
}

strategy = TopkDropoutStrategy(**STRATEGY_CONFIG)
report_normal, positions_normal = normal_backtest(pred_score, strategy=strategy, **BACKTEST_CONFIG)

# analysis
analysis = dict()
analysis["excess_return_without_cost"] = risk_analysis(report_normal["return"] - report_normal["bench"])
analysis["excess_return_with_cost"] = risk_analysis(report_normal["return"] - report_normal["bench"] - report_normal["cost"])
analysis_df = pd.concat(analysis)  # type: pd.DataFrame
print(analysis_df)
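For readers unfamiliar with the IC and Rank IC reported by SigAnaRecord, the quantities can be sketched for a single cross-section in plain Python: IC as the Pearson correlation between prediction and label, and Rank IC as the same correlation computed on ranks. This is an illustration under simplifying assumptions and not Qlib's calc_ic: the helper names and sample numbers are made up, ties are ignored, and no grouping by date is done.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """Ranks from 1 to n; ties are not handled in this sketch."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


pred = [0.9, 0.1, 0.4, 0.7]        # hypothetical model scores
label = [0.05, -0.02, 0.01, 0.02]  # hypothetical realised returns
ic = pearson(pred, label)
rank_ic = pearson(ranks(pred), ranks(label))
print(round(ic, 4), round(rank_ic, 4))
```

Here the ordering of the scores matches the ordering of the labels exactly, so the Rank IC is 1 while the IC is slightly lower.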

For more information about the APIs, please refer to Record Template API.

Known Limitations

  • Python objects are saved with pickle, which may cause issues when the environment that dumps the objects differs from the environment that loads them.
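The reason behind this limitation is that pickle stores a class by its import path rather than by its code, so the loading environment must be able to import the same class. The standard-library sketch below illustrates this; StrictUnpickler is a hypothetical helper that simulates an environment in which the class cannot be imported.

```python
import io
import pickle
from collections import OrderedDict

payload = pickle.dumps(OrderedDict(a=1))
# Only the import path of the class is stored, not its implementation.
stores_reference = b"collections" in payload and b"OrderedDict" in payload


class StrictUnpickler(pickle.Unpickler):
    """Simulates a loading environment in which the class cannot be imported."""

    def find_class(self, module, name):
        raise ModuleNotFoundError(f"No module named {module!r}")


try:
    StrictUnpickler(io.BytesIO(payload)).load()
    load_failed = False
except ModuleNotFoundError:
    load_failed = True

print(stores_reference, load_failed)
```

In practice this means that objects saved through save_objects are best loaded with the same package versions that were used to dump them.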