Analysis: Evaluation & Results Analysis¶

Introduction¶

Analysis is designed to show graphical reports of intraday trading, which help users evaluate and analyse investment portfolios visually. The following graphs are available:

analysis_position
- report_graph
- score_ic_graph
- cumulative_return_graph
- risk_analysis_graph
- rank_label_graph
analysis_model
- model_performance_graph

Graphical Reports¶

Users can run the following code to get all supported reports.

>>> import qlib.contrib.report as qcr
>>> print(qcr.GRAPH_NAME_LIST)
['analysis_position.report_graph', 'analysis_position.score_ic_graph', 'analysis_position.cumulative_return_graph', 'analysis_position.risk_analysis_graph', 'analysis_position.rank_label_graph', 'analysis_model.model_performance_graph']

Note

For more details, refer to the docstring of each function, for example help(qcr.analysis_position.report_graph).

Usage & Example¶

Usage of analysis_position.report¶

API¶

qlib.contrib.report.analysis_position.report.report_graph(report_df: pandas.core.frame.DataFrame, show_notebook: bool = True) → [<class 'list'>, <class 'tuple'>]¶

display backtest report

Example:

import qlib.contrib.report as qcr
from qlib.contrib.evaluate import backtest
from qlib.contrib.strategy import TopkDropoutStrategy

# pred_df: prediction scores produced by a model (assumed to exist already)

# backtest parameters
bparas = {}
bparas['limit_threshold'] = 0.095
bparas['account'] = 1000000000

# strategy parameters; n_drop should not exceed topk
sparas = {}
sparas['topk'] = 50
sparas['n_drop'] = 5
strategy = TopkDropoutStrategy(**sparas)

report_normal_df, _ = backtest(pred_df, strategy, **bparas)

qcr.analysis_position.report_graph(report_normal_df)

Parameters:

report_df –

df.index.name must be date, df.columns must contain return, turnover, cost, bench

            return          cost            bench           turnover
date
2017-01-04      0.003421        0.000864        0.011693        0.576325
2017-01-05      0.000508        0.000447        0.000721        0.227882
2017-01-06      -0.003321       0.000212        -0.004322       0.102765
2017-01-09      0.006753        0.000212        0.006874        0.105864
2017-01-10      -0.000416       0.000440        -0.003350       0.208396

show_notebook – whether to display graphics in notebook, the default is True

Returns:

if show_notebook is True, display in notebook; else return plotly.graph_objs.Figure list
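The layout requirement on report_df can be checked before plotting. The following is a minimal sketch using only pandas; check_report_df and REQUIRED_COLUMNS are hypothetical names introduced for illustration and are not part of Qlib.

```python
import pandas as pd

# Columns the documentation above says report_df must contain.
REQUIRED_COLUMNS = {"return", "turnover", "cost", "bench"}


def check_report_df(report_df: pd.DataFrame) -> None:
    """Raise ValueError if report_df does not match the documented layout."""
    if report_df.index.name != "date":
        raise ValueError(f"index name must be 'date', got {report_df.index.name!r}")
    missing = REQUIRED_COLUMNS - set(report_df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")


# The first two sample rows from the table above.
sample = pd.DataFrame(
    {
        "return": [0.003421, 0.000508],
        "cost": [0.000864, 0.000447],
        "bench": [0.011693, 0.000721],
        "turnover": [0.576325, 0.227882],
    },
    index=pd.DatetimeIndex(["2017-01-04", "2017-01-05"], name="date"),
)
check_report_df(sample)  # passes silently for a well-formed frame
```

Running such a check first gives a clearer error message than a failure inside the plotting code.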

Graphical Result¶

Note

Axis X: Trading day
Axis Y:
- cum bench
  
  Cumulative returns series of benchmark
- cum return wo cost
  
  Cumulative returns series of portfolio without cost
- cum return w cost
  
  Cumulative returns series of portfolio with cost
- return wo mdd
  
  Maximum drawdown series of cumulative return without cost
- return w cost mdd
  
  Maximum drawdown series of cumulative return with cost
- cum ex return wo cost
  
  The CAR (cumulative abnormal return) series of the portfolio compared to the benchmark without cost.
- cum ex return w cost
  
  The CAR (cumulative abnormal return) series of the portfolio compared to the benchmark with cost.
- turnover
  
  Turnover rate series
- cum ex return wo cost mdd
  
  Drawdown series of CAR (cumulative abnormal return) without cost
- cum ex return w cost mdd
  
  Drawdown series of CAR (cumulative abnormal return) with cost
The shaded part above: Maximum drawdown corresponding to cum return wo cost
The shaded part below: Maximum drawdown corresponding to cum ex return wo cost
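The series listed above can be approximated directly from report_df. The sketch below assumes that cumulative returns are plain running sums of the daily values (rather than compounded products) and that drawdown is the distance of a cumulative curve below its running maximum. The function name and the output column names are hypothetical, and the actual plotting code may differ in detail.

```python
import pandas as pd


def report_curves(report_df: pd.DataFrame) -> pd.DataFrame:
    """Derive cumulative and drawdown series from daily return, cost and bench."""
    curves = pd.DataFrame(index=report_df.index)
    net = report_df["return"] - report_df["cost"]  # daily return after cost

    curves["cum_bench"] = report_df["bench"].cumsum()
    curves["cum_return_wo_cost"] = report_df["return"].cumsum()
    curves["cum_return_w_cost"] = net.cumsum()

    # Drawdown: how far each cumulative curve sits below its running maximum.
    for name in ("cum_return_wo_cost", "cum_return_w_cost"):
        curves[name + "_mdd"] = curves[name] - curves[name].cummax()

    # Excess (abnormal) return relative to the benchmark.
    curves["cum_ex_return_wo_cost"] = (report_df["return"] - report_df["bench"]).cumsum()
    curves["cum_ex_return_w_cost"] = (net - report_df["bench"]).cumsum()
    return curves


# The five sample rows shown in the report_df table.
report_df = pd.DataFrame(
    {
        "return": [0.003421, 0.000508, -0.003321, 0.006753, -0.000416],
        "cost": [0.000864, 0.000447, 0.000212, 0.000212, 0.000440],
        "bench": [0.011693, 0.000721, -0.004322, 0.006874, -0.003350],
        "turnover": [0.576325, 0.227882, 0.102765, 0.105864, 0.208396],
    },
    index=pd.DatetimeIndex(
        ["2017-01-04", "2017-01-05", "2017-01-06", "2017-01-09", "2017-01-10"],
        name="date",
    ),
)
curves = report_curves(report_df)
```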

Usage of analysis_position.score_ic¶

API¶

qlib.contrib.report.analysis_position.score_ic.score_ic_graph(pred_label: pandas.core.frame.DataFrame, show_notebook: bool = True) → [<class 'list'>, <class 'tuple'>]¶

score IC

Example:

import pandas as pd
from qlib.data import D
from qlib.contrib.report import analysis_position

# pred_df: prediction scores with a 'score' column (assumed to exist already)
pred_df_dates = pred_df.index.get_level_values(level='datetime')
features_df = D.features(D.instruments('csi500'), ['Ref($close, -2)/Ref($close, -1)-1'], pred_df_dates.min(), pred_df_dates.max())
features_df.columns = ['label']
# align the scores with the labels on the label index
pred_label = pd.concat([features_df, pred_df], axis=1, sort=True).reindex(features_df.index)
analysis_position.score_ic_graph(pred_label)

Parameters:

pred_label –

The index is a pd.MultiIndex with level names [instrument, datetime]; the column names are [score, label]

instrument  datetime    score       label
SH600004    2017-12-11  -0.013502   -0.013502
            2017-12-12  -0.072367   -0.072367
            2017-12-13  -0.068605   -0.068605
            2017-12-14   0.012440    0.012440
            2017-12-15  -0.102778   -0.102778

show_notebook – whether to display graphics in notebook, the default is True

Returns:

if show_notebook is True, display in notebook; else return plotly.graph_objs.Figure list

Graphical Result¶

Note

Axis X: Trading day
Axis Y:
- ic
  
  The Pearson correlation coefficient series between label and prediction score. In the above example, the label is formulated as Ref($close, -2)/Ref($close, -1) - 1. Please refer to Data API Feature for more details.
- rank_ic
  
  The Spearman’s rank correlation coefficient series between label and prediction score.
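The two series can be reproduced with pandas alone. The sketch below computes, for each trading day, the Pearson correlation (ic) and the Spearman rank correlation (rank_ic) between label and score. Here daily_ic and the toy data are hypothetical and only illustrate the definitions; they are not Qlib's implementation.

```python
import pandas as pd


def daily_ic(pred_label: pd.DataFrame) -> pd.DataFrame:
    """Per-day Pearson (ic) and Spearman (rank_ic) correlation of label vs. score."""
    rows = {}
    for date, day in pred_label.groupby(level="datetime"):
        rows[date] = {
            "ic": day["label"].corr(day["score"]),
            # Spearman correlation equals the Pearson correlation of the ranks.
            "rank_ic": day["label"].rank().corr(day["score"].rank()),
        }
    return pd.DataFrame.from_dict(rows, orient="index")


# Toy data: four instruments over two trading days.
index = pd.MultiIndex.from_product(
    [["A", "B", "C", "D"], pd.to_datetime(["2017-12-11", "2017-12-12"])],
    names=["instrument", "datetime"],
)
toy = pd.DataFrame(
    {
        "score": [1, 1, 2, 2, 3, 3, 4, 4],
        "label": [1, 4, 4, 3, 9, 2, 16, 1],
    },
    index=index,
    dtype=float,
)
ic_df = daily_ic(toy)
```

On the first day the label grows monotonically but not linearly with the score, so rank_ic is 1 while ic is slightly below 1. On the second day the ordering is fully reversed and both values are -1.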

Usage of analysis_position.cumulative_return¶

API¶

qlib.contrib.report.analysis_position.cumulative_return.cumulative_return_graph(position: dict, report_normal: pandas.core.frame.DataFrame, label_data: pandas.core.frame.DataFrame, show_notebook=True, start_date=None, end_date=None) → Iterable[plotly.graph_objs._figure.Figure]¶

Backtest buy, sell, and holding cumulative return graph

Example:

import qlib.contrib.report as qcr
from qlib.data import D
from qlib.contrib.evaluate import backtest
from qlib.contrib.strategy import TopkDropoutStrategy

# pred_df: prediction scores produced by a model (assumed to exist already)

# backtest parameters
bparas = {}
bparas['limit_threshold'] = 0.095
bparas['account'] = 1000000000

# strategy parameters
sparas = {}
sparas['topk'] = 50
sparas['n_drop'] = 5
strategy = TopkDropoutStrategy(**sparas)

report_normal_df, positions = backtest(pred_df, strategy, **bparas)

# label: change of the close price from T to T+1
pred_df_dates = pred_df.index.get_level_values(level='datetime')
features_df = D.features(D.instruments('csi500'), ['Ref($close, -1)/$close - 1'], pred_df_dates.min(), pred_df_dates.max())
features_df.columns = ['label']

qcr.analysis_position.cumulative_return_graph(positions, report_normal_df, features_df)

Parameters:

position – position data

report_normal –

                return      cost            bench           turnover
date
2017-01-04      0.003421        0.000864        0.011693        0.576325
2017-01-05      0.000508        0.000447        0.000721        0.227882
2017-01-06      -0.003321       0.000212        -0.004322       0.102765
2017-01-09      0.006753        0.000212        0.006874        0.105864
2017-01-10      -0.000416       0.000440        -0.003350       0.208396

label_data – D.features result; the index is a pd.MultiIndex with level names [instrument, datetime]; the column names are [label].

The label at T is the change from T to T+1. Using the close price is recommended, for example: D.features(D.instruments('csi500'), ['Ref($close, -1)/$close-1'])

                                label
instrument      datetime
SH600004        2017-12-11      -0.013502
                2017-12-12      -0.072367
                2017-12-13      -0.068605
                2017-12-14      0.012440
                2017-12-15      -0.102778

show_notebook – True or False. If True, show the graph in the notebook; otherwise return the figures

start_date – start date

end_date – end date

Returns:

Graphical Result¶

Note

Axis X: Trading day
Axis Y:
- Above axis Y: (((Ref($close, -1)/$close - 1) * weight).sum() / weight.sum()).cumsum()
- Below axis Y: Daily weight sum
In the sell graph, y < 0 stands for profit; in other cases, y > 0 stands for profit.
In the buy_minus_sell graph, the y value of the weight graph at the bottom is buy_weight + sell_weight.
In each graph, the red line in the histogram on the right represents the average.
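The expression for the upper Y axis can be read as a weight-averaged label per trading day, accumulated over time. A minimal pandas sketch, assuming label and weight are Series indexed by [instrument, datetime]; the function name and the toy numbers are hypothetical.

```python
import pandas as pd


def weighted_cumulative_label(label: pd.Series, weight: pd.Series) -> pd.Series:
    """Daily weight-averaged label, accumulated over trading days."""
    weighted_sum = (label * weight).groupby(level="datetime").sum()
    weight_sum = weight.groupby(level="datetime").sum()
    return (weighted_sum / weight_sum).cumsum()


# Toy data: two instruments over two trading days.
index = pd.MultiIndex.from_product(
    [["A", "B"], pd.to_datetime(["2017-01-04", "2017-01-05"])],
    names=["instrument", "datetime"],
)
label = pd.Series([0.01, -0.02, 0.03, 0.02], index=index)
weight = pd.Series([0.5, 0.25, 0.5, 0.75], index=index)

cumulative = weighted_cumulative_label(label, weight)
daily_weight_sum = weight.groupby(level="datetime").sum()  # lower Y axis
```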

../_images/cumulative_return_buy_minus_sell.png

Usage of analysis_position.risk_analysis¶

API¶

qlib.contrib.report.analysis_position.risk_analysis.risk_analysis_graph(analysis_df: pandas.core.frame.DataFrame = None, report_normal_df: pandas.core.frame.DataFrame = None, report_long_short_df: pandas.core.frame.DataFrame = None, show_notebook: bool = True) → Iterable[plotly.graph_objs._figure.Figure]¶

Generate analysis graph and monthly analysis

Example:

import pandas as pd
from qlib.contrib.evaluate import risk_analysis, backtest, long_short_backtest
from qlib.contrib.strategy import TopkDropoutStrategy
from qlib.contrib.report import analysis_position

# pred_df: prediction scores produced by a model (assumed to exist already)

# backtest parameters
bparas = {}
bparas['limit_threshold'] = 0.095
bparas['account'] = 1000000000

# strategy parameters; n_drop should not exceed topk
sparas = {}
sparas['topk'] = 50
sparas['n_drop'] = 5
strategy = TopkDropoutStrategy(**sparas)

report_normal_df, positions = backtest(pred_df, strategy, **bparas)
# long_short_map = long_short_backtest(pred_df)
# report_long_short_df = pd.DataFrame(long_short_map)

analysis = dict()
# analysis['pred_long'] = risk_analysis(report_long_short_df['long'])
# analysis['pred_short'] = risk_analysis(report_long_short_df['short'])
# analysis['pred_long_short'] = risk_analysis(report_long_short_df['long_short'])
analysis['excess_return_without_cost'] = risk_analysis(report_normal_df['return'] - report_normal_df['bench'])
analysis['excess_return_with_cost'] = risk_analysis(report_normal_df['return'] - report_normal_df['bench'] - report_normal_df['cost'])
# stack the per-series results into one frame with a MultiIndex
analysis_df = pd.concat(analysis)

analysis_position.risk_analysis_graph(analysis_df, report_normal_df)

Parameters:

analysis_df –

analysis data; the index is a pd.MultiIndex; the column names are [risk].

                                                  risk
excess_return_without_cost mean               0.000692
                           std                0.005374
                           annualized_return  0.174495
                           information_ratio  2.045576
                           max_drawdown      -0.079103
excess_return_with_cost    mean               0.000499
                           std                0.005372
                           annualized_return  0.125625
                           information_ratio  1.473152
                           max_drawdown      -0.088263

report_normal_df –

df.index.name must be date, df.columns must contain return, turnover, cost, bench

            return          cost            bench           turnover
date
2017-01-04      0.003421        0.000864        0.011693        0.576325
2017-01-05      0.000508        0.000447        0.000721        0.227882
2017-01-06      -0.003321       0.000212        -0.004322       0.102765
2017-01-09      0.006753        0.000212        0.006874        0.105864
2017-01-10      -0.000416       0.000440        -0.003350       0.208396

report_long_short_df –

df.index.name must be date, df.columns contain long, short, long_short

            long            short           long_short
date
2017-01-04      -0.001360       0.001394        0.000034
2017-01-05      0.002456        0.000058        0.002514
2017-01-06      0.000120        0.002739        0.002859
2017-01-09      0.001436        0.001838        0.003273
2017-01-10      0.000824        -0.001944       -0.001120

show_notebook – whether to display graphics in a notebook, default True. If True, show the graph in the notebook; if False, return the graph figures

Returns:

Graphical Result¶

Note

general graphics
- std
  
  excess_return_without_cost
  
  The Standard Deviation of CAR (cumulative abnormal return) without cost.
  
  excess_return_with_cost
  
  The Standard Deviation of CAR (cumulative abnormal return) with cost.
- annualized_return
  
  excess_return_without_cost
  
  The Annualized Rate of CAR (cumulative abnormal return) without cost.
  
  excess_return_with_cost
  
  The Annualized Rate of CAR (cumulative abnormal return) with cost.
- information_ratio
  
  excess_return_without_cost
  
  The Information Ratio without cost.
  
  excess_return_with_cost
  
  The Information Ratio with cost.
  
  To know more about Information Ratio, please refer to Information Ratio – IR.
- max_drawdown
  
  excess_return_without_cost
  
  The Maximum Drawdown of CAR (cumulative abnormal return) without cost.
  
  excess_return_with_cost
  
  The Maximum Drawdown of CAR (cumulative abnormal return) with cost.
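The five indicators can be sketched from a daily excess-return series as follows. This is an illustration under stated assumptions, not Qlib's risk_analysis itself: the annualization factor of 252 trading days is inferred from the sample analysis_df table (annualized_return divided by mean is roughly 252), the cumulative curve is taken as a running sum, and the name risk_metrics is hypothetical.

```python
import math

import pandas as pd

TRADING_DAYS_PER_YEAR = 252  # assumption inferred from the sample table


def risk_metrics(daily_return: pd.Series) -> pd.Series:
    """Summary indicators of a daily (excess) return series."""
    mean = daily_return.mean()
    std = daily_return.std(ddof=1)
    cumulative = daily_return.cumsum()
    return pd.Series(
        {
            "mean": mean,
            "std": std,
            "annualized_return": mean * TRADING_DAYS_PER_YEAR,
            "information_ratio": mean / std * math.sqrt(TRADING_DAYS_PER_YEAR),
            # Largest drop of the cumulative curve below its running maximum.
            "max_drawdown": (cumulative - cumulative.cummax()).min(),
        }
    )


# Toy excess-return series (hypothetical values).
excess = pd.Series([0.01, -0.02, 0.03, 0.00])
metrics = risk_metrics(excess)
```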

Note

annualized_return/max_drawdown/information_ratio/std graphics
- Axis X: Trading days grouped by month
- Axis Y:
  
  annualized_return graphics
  
  excess_return_without_cost_annualized_return
  
  The Annualized Rate series of monthly CAR (cumulative abnormal return) without cost.
  
  excess_return_with_cost_annualized_return
  
  The Annualized Rate series of monthly CAR (cumulative abnormal return) with cost.
  
  max_drawdown graphics
  
  excess_return_without_cost_max_drawdown
  
  The Maximum Drawdown series of monthly CAR (cumulative abnormal return) without cost.
  
  excess_return_with_cost_max_drawdown
  
  The Maximum Drawdown series of monthly CAR (cumulative abnormal return) with cost.
  
  information_ratio graphics
  
  excess_return_without_cost_information_ratio
  
  The Information Ratio series of monthly CAR (cumulative abnormal return) without cost.
  
  excess_return_with_cost_information_ratio
  
  The Information Ratio series of monthly CAR (cumulative abnormal return) with cost.
  
  std graphics
  
  excess_return_without_cost_std
  
  The Standard Deviation series of monthly CAR (cumulative abnormal return) without cost.
  
  excess_return_with_cost_std
  
  The Standard Deviation series of monthly CAR (cumulative abnormal return) with cost.
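The monthly series are obtained by grouping trading days by calendar month and computing an indicator inside each group. A minimal sketch for the standard deviation, assuming a daily excess-return Series with a DatetimeIndex; monthly_std and the toy values are hypothetical.

```python
import pandas as pd


def monthly_std(daily_excess: pd.Series) -> pd.Series:
    """Standard deviation of daily excess return within each calendar month."""
    months = daily_excess.index.to_period("M")  # e.g. 2017-01, 2017-02
    return daily_excess.groupby(months).std()


# Toy data: two trading days in each of two months.
daily_excess = pd.Series(
    [0.01, 0.03, -0.01, 0.01],
    index=pd.DatetimeIndex(["2017-01-04", "2017-01-05", "2017-02-01", "2017-02-02"], name="date"),
)
std_by_month = monthly_std(daily_excess)
```

The same grouping key can be reused for the other indicators by replacing std() with a different aggregation.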

../_images/risk_analysis_annualized_return.png

../_images/risk_analysis_max_drawdown.png

../_images/risk_analysis_information_ratio.png

Usage of analysis_position.rank_label¶

API¶

qlib.contrib.report.analysis_position.rank_label.rank_label_graph(position: dict, label_data: pandas.core.frame.DataFrame, start_date=None, end_date=None, show_notebook=True) → Iterable[plotly.graph_objs._figure.Figure]¶

Ranking percentage of the stocks bought, sold, and held on each trading day: the average rank ratio (similar to sell_df['label'].rank(ascending=False) / len(sell_df)) of daily trading.

Example:

import qlib.contrib.report as qcr
from qlib.data import D
from qlib.contrib.evaluate import backtest
from qlib.contrib.strategy import TopkDropoutStrategy

# pred_df: prediction scores produced by a model (assumed to exist already)

# backtest parameters
bparas = {}
bparas['limit_threshold'] = 0.095
bparas['account'] = 1000000000

# strategy parameters; n_drop should not exceed topk
sparas = {}
sparas['topk'] = 50
sparas['n_drop'] = 5
strategy = TopkDropoutStrategy(**sparas)

_, positions = backtest(pred_df, strategy, **bparas)

# label: change of the close price from T to T+1
pred_df_dates = pred_df.index.get_level_values(level='datetime')
features_df = D.features(D.instruments('csi500'), ['Ref($close, -1)/$close-1'], pred_df_dates.min(), pred_df_dates.max())
features_df.columns = ['label']

qcr.analysis_position.rank_label_graph(positions, features_df, pred_df_dates.min(), pred_df_dates.max())

Parameters:

position – position data; the result of qlib.contrib.backtest.backtest.backtest

label_data – D.features result; the index is a pd.MultiIndex with level names [instrument, datetime]; the column names are [label].

The label at T is the change from T to T+1. Using the close price is recommended, for example: D.features(D.instruments('csi500'), ['Ref($close, -1)/$close-1'])

                                label
instrument      datetime
SH600004        2017-12-11      -0.013502
                2017-12-12      -0.072367
                2017-12-13      -0.068605
                2017-12-14      0.012440
                2017-12-15      -0.102778

start_date – start date

end_date – end date

show_notebook – True or False. If True, show the graph in the notebook; otherwise return the figures

Returns:

Graphical Result¶

Note

hold/sell/buy graphics:
- Axis X: Trading day
- Axis Y:
  
  Average ranking ratio of label for stocks that are held/sold/bought on the trading day.
  
  In the above example, the label is formulated as Ref($close, -1)/$close - 1. The ranking ratio can be formulated as follows.
  
  \[ranking\ ratio = \frac{Ascending\ Ranking\ of\ label}{Number\ of\ Stocks\ in\ the\ Portfolio}\]
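A rough pandas sketch of the ranking ratio follows. It uses the expression quoted in the function description, rank(ascending=False) divided by the number of stocks, applied within each trading day. The helper name, the toy labels, and the list of held instruments are hypothetical, and the exact ranking direction and denominator used by the plotting code are assumptions.

```python
import pandas as pd


def daily_rank_ratio(label_data: pd.DataFrame) -> pd.Series:
    """Rank of each label within its trading day, divided by that day's stock count."""
    by_day = label_data.groupby(level="datetime")["label"]
    return by_day.rank(ascending=False) / by_day.transform("count")


# Toy data: four instruments on one trading day.
index = pd.MultiIndex.from_product(
    [["A", "B", "C", "D"], pd.to_datetime(["2017-12-11"])],
    names=["instrument", "datetime"],
)
label_data = pd.DataFrame({"label": [0.03, 0.01, 0.02, -0.01]}, index=index)
ratio = daily_rank_ratio(label_data)

# Average ranking ratio of a hypothetical set of held stocks on that day.
held = ["A", "C"]
held_mask = ratio.index.get_level_values("instrument").isin(held)
held_average = ratio[held_mask].groupby(level="datetime").mean()
```

With this convention a smaller ratio means a higher label, so a low average for held or bought stocks indicates that the strategy picked stocks with comparatively high subsequent returns.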

Usage of analysis_model.analysis_model_performance¶

API¶

qlib.contrib.report.analysis_model.analysis_model_performance.ic_figure(ic_df: pandas.core.frame.DataFrame, show_nature_day=True, **kwargs) → plotly.graph_objs._figure.Figure¶

IC figure

Parameters:

ic_df – ic DataFrame

show_nature_day – whether to display non-trading days on the x-axis

Returns:

plotly.graph_objs.Figure

qlib.contrib.report.analysis_model.analysis_model_performance.model_performance_graph(pred_label: pandas.core.frame.DataFrame, lag: int = 1, N: int = 5, reverse=False, rank=False, graph_names: list = ['group_return', 'pred_ic', 'pred_autocorr'], show_notebook: bool = True, show_nature_day=True) → [<class 'list'>, <class 'tuple'>]¶

Model performance

Parameters:

pred_label –

The index is a pd.MultiIndex with level names [instrument, datetime]; the column names are [score, label]

instrument  datetime    score       label
SH600004    2017-12-11  -0.013502   -0.013502
            2017-12-12  -0.072367   -0.072367
            2017-12-13  -0.068605   -0.068605
            2017-12-14   0.012440    0.012440
            2017-12-15  -0.102778   -0.102778

lag – pred.groupby(level='instrument')['score'].shift(lag). It is only used when computing the auto-correlation.
N – group number, default 5
reverse – if True, pred['score'] *= -1
rank – if True, calculate rank ic
graph_names – graph names; default ['group_return', 'pred_ic', 'pred_autocorr']
show_notebook – whether to display graphics in notebook, the default is True
show_nature_day – whether to display the abscissa of non-trading day

Returns:

if show_notebook is True, display in notebook; else return plotly.graph_objs.Figure list

Graphical Results¶

Note

cumulative return graphics
- Group1:
  
  The Cumulative Return series of stocks group with (ranking ratio of label <= 20%)
- Group2:
  
  The Cumulative Return series of stocks group with (20% < ranking ratio of label <= 40%)
- Group3:
  
  The Cumulative Return series of stocks group with (40% < ranking ratio of label <= 60%)
- Group4:
  
  The Cumulative Return series of stocks group with (60% < ranking ratio of label <= 80%)
- Group5:
  
  The Cumulative Return series of stocks group with (80% < ranking ratio of label)
- long-short:
  
  The Difference series between Cumulative Return of Group1 and of Group5
- long-average
  
  The Difference series between Cumulative Return of Group1 and average Cumulative Return for all stocks.
The ranking ratio can be formulated as follows.

\[ranking\ ratio = \frac{Ascending\ Ranking\ of\ label}{Number\ of\ Stocks\ in\ the\ Portfolio}\]

../_images/analysis_model_cumulative_return.png

Note

long-short/long-average

The distribution of long-short/long-average returns on each trading day

../_images/analysis_model_long_short.png

Note

Information Coefficient
- The Pearson correlation coefficient series between labels and prediction scores of stocks in portfolio.
- The graphics reports can be used to evaluate the prediction scores.

Note

Monthly IC

Monthly average of the Information Coefficient

../_images/analysis_model_monthly_IC.png

Note

IC

The distribution of the Information Coefficient on each trading day.

IC Normal Dist. Q-Q

The Quantile-Quantile plot compares the distribution of the Information Coefficient on each trading day against the normal distribution.

Note

Auto Correlation
- The Pearson correlation coefficient series between the latest prediction scores and the prediction scores lag days ago of stocks in portfolio on each trading day.
- The graphics reports can be used to estimate the turnover rate.
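The auto-correlation series can be sketched with the shift expression given for the lag parameter. The code below is an illustration only: score_autocorr and the toy scores are hypothetical, and it assumes that rows are sorted by datetime within each instrument.

```python
import pandas as pd


def score_autocorr(pred_label: pd.DataFrame, lag: int = 1) -> pd.Series:
    """Per-day Pearson correlation between current scores and scores `lag` days ago."""
    pred = pred_label[["score"]].copy()
    # Score of the same instrument `lag` rows (trading days) earlier.
    pred["score_last"] = pred.groupby(level="instrument")["score"].shift(lag)
    valid = pred.dropna()  # the first `lag` days have no earlier score
    dates = valid.index.get_level_values("datetime")
    result = {}
    for date, day in valid.groupby(dates):
        result[date] = day["score"].corr(day["score_last"])
    return pd.Series(result)


# Toy data: three instruments over three trading days.
index = pd.MultiIndex.from_product(
    [["A", "B", "C"], pd.to_datetime(["2017-12-11", "2017-12-12", "2017-12-13"])],
    names=["instrument", "datetime"],
)
pred_label = pd.DataFrame(
    {"score": [1.0, 2.0, 6.0, 2.0, 4.0, 4.0, 3.0, 6.0, 2.0]},
    index=index,
)
autocorr = score_autocorr(pred_label, lag=1)
```

In the toy data the ordering of scores is unchanged from the first to the second day (correlation 1) and fully reversed on the third day (correlation -1). A value near 1 suggests a stable ranking and therefore low turnover.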

../_images/analysis_model_auto_correlation.png