API Reference¶
Here you can find all Qlib interfaces.
Data¶
Provider¶
-
class
qlib.data.data.
CalendarProvider
¶ Calendar provider base class
Provide calendar data.
-
calendar
(start_time=None, end_time=None, freq='day', future=False)¶ Get calendar of certain market in given time range.
Parameters: - start_time (str) – start of the time range
- end_time (str) – end of the time range
- freq (str) – time frequency, available: year/quarter/month/week/day
- future (bool) – whether to include future trading days
Returns: calendar list
Return type: list
-
locate_index
(start_time, end_time, freq, future)¶ Locate the start time index and end time index in a calendar under certain frequency.
Parameters: - start_time (str) – start of the time range
- end_time (str) – end of the time range
- freq (str) – time frequency, available: year/quarter/month/week/day
- future (bool) – whether to include future trading days
Returns: - pd.Timestamp – the real start time
- pd.Timestamp – the real end time
- int – the index of start time
- int – the index of end time
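As a rough illustration of what locating a time range in a calendar involves, the sketch below uses binary search on a sorted list. It is not Qlib's implementation: the standalone function, the plain ISO date strings (the documented method returns pd.Timestamp values), and the sample calendar are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def locate_index(calendar, start_time, end_time):
    """Return (real_start, real_end, start_index, end_index) for a sorted calendar."""
    # First trading day on or after start_time.
    start_index = bisect_left(calendar, start_time)
    # Last trading day on or before end_time.
    end_index = bisect_right(calendar, end_time) - 1
    if start_index >= len(calendar) or end_index < start_index:
        raise ValueError("no trading day in the requested range")
    return calendar[start_index], calendar[end_index], start_index, end_index


# ISO date strings sort chronologically, so plain strings are enough here.
calendar = ["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"]
result = locate_index(calendar, "2020-01-04", "2020-01-07")
```

Here the requested start falls on a non-trading day, so the "real" start time moves forward to the next entry in the calendar.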
-
-
class
qlib.data.data.
InstrumentProvider
¶ Instrument provider base class
Provide instrument data.
-
static
instruments
(market='all', filter_pipe=None)¶ Get the general config dictionary for a base market adding several dynamic filters.
Parameters: - market (str) – market/industry/index shortname, e.g. all/sse/szse/sse50/csi300/csi500
- filter_pipe (list) – the list of dynamic filters
Returns: dict of stockpool config {`market` => base market name, `filter_pipe` => list of filters}
example:
    {'market': 'csi500',
     'filter_pipe': [{'filter_type': 'ExpressionDFilter',
                      'rule_expression': '$open<40',
                      'filter_start_time': None,
                      'filter_end_time': None,
                      'keep': False},
                     {'filter_type': 'NameDFilter',
                      'name_rule_re': 'SH[0-9]{4}55',
                      'filter_start_time': None,
                      'filter_end_time': None}]}
Return type: dict
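The shape of the stockpool config can be sketched with a small helper. The function name make_stockpool_config is hypothetical and only mirrors the documented dictionary layout; it does not call Qlib.

```python
def make_stockpool_config(market="all", filter_pipe=None):
    """Build a dict shaped like the documented stockpool config (hypothetical helper)."""
    return {"market": market, "filter_pipe": list(filter_pipe or [])}


# One filter entry, copied from the documented example.
expression_filter = {
    "filter_type": "ExpressionDFilter",
    "rule_expression": "$open<40",
    "filter_start_time": None,
    "filter_end_time": None,
    "keep": False,
}
config = make_stockpool_config("csi500", [expression_filter])
```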
-
list_instruments
(instruments, start_time=None, end_time=None, freq='day', as_list=False)¶ List the instruments based on a certain stockpool config.
Parameters: - instruments (dict) – stockpool config
- start_time (str) – start of the time range
- end_time (str) – end of the time range
- as_list (bool) – return instruments as list or dict
Returns: instruments list or dictionary with time spans
Return type: dict or list
-
class
qlib.data.data.
FeatureProvider
¶ Feature provider class
Provide feature data.
-
feature
(instrument, field, start_time, end_time, freq)¶ Get feature data.
Parameters: - instrument (str) – a certain instrument
- field (str) – a certain field of feature
- start_time (str) – start of the time range
- end_time (str) – end of the time range
- freq (str) – time frequency, available: year/quarter/month/week/day
Returns: data of a certain feature
Return type: pd.Series
-
-
class
qlib.data.data.
ExpressionProvider
¶ Expression provider class
Provide Expression data.
-
expression
(instrument, field, start_time=None, end_time=None, freq='day')¶ Get Expression data.
Parameters: - instrument (str) – a certain instrument
- field (str) – a certain field of feature
- start_time (str) – start of the time range
- end_time (str) – end of the time range
- freq (str) – time frequency, available: year/quarter/month/week/day
Returns: data of a certain expression
Return type: pd.Series
-
-
class
qlib.data.data.
DatasetProvider
¶ Dataset provider class
Provide Dataset data.
-
dataset
(instruments, fields, start_time=None, end_time=None, freq='day')¶ Get dataset data.
Parameters: - instruments (list or dict) – list/dict of instruments or dict of stockpool config
- fields (list) – list of feature instances
- start_time (str) – start of the time range
- end_time (str) – end of the time range
- freq (str) – time frequency
Returns: a pandas dataframe with <instrument, datetime> index
Return type: pd.DataFrame
-
static
get_instruments_d
(instruments, freq)¶ Parse different types of input instruments into instruments_d. An incorrectly formatted input raises an exception.
-
static
get_column_names
(fields)¶ Get column names from input fields
-
static
dataset_processor
(instruments_d, column_names, start_time, end_time, freq)¶ Load and process the data, then return the dataset. The multi-kernel method is used by default.
-
static
expression_calculator
(inst, start_time, end_time, freq, column_names, spans=None, C=None)¶ Calculate the expressions for one instrument and return the result as a DataFrame. If an expression has been calculated before, it is loaded from the cache.
Returns: a DataFrame with index 'datetime' and the other data as columns.
-
-
class
qlib.data.data.
LocalCalendarProvider
(**kwargs)¶ Local calendar data provider class
Provide calendar data from local data source.
-
calendar
(start_time=None, end_time=None, freq='day', future=False)¶ Get calendar of certain market in given time range.
Parameters: - start_time (str) – start of the time range
- end_time (str) – end of the time range
- freq (str) – time frequency, available: year/quarter/month/week/day
- future (bool) – whether to include future trading days
Returns: calendar list
Return type: list
-
-
class
qlib.data.data.
LocalInstrumentProvider
¶ Local instrument data provider class
Provide instrument data from local data source.
-
list_instruments
(instruments, start_time=None, end_time=None, freq='day', as_list=False)¶ List the instruments based on a certain stockpool config.
Parameters: - instruments (dict) – stockpool config
- start_time (str) – start of the time range
- end_time (str) – end of the time range
- as_list (bool) – return instruments as list or dict
Returns: instruments list or dictionary with time spans
Return type: dict or list
-
-
class
qlib.data.data.
LocalFeatureProvider
(**kwargs)¶ Local feature data provider class
Provide feature data from local data source.
-
feature
(instrument, field, start_index, end_index, freq)¶ Get feature data.
Parameters: - instrument (str) – a certain instrument
- field (str) – a certain field of feature
- start_index – start index of the range in the calendar
- end_index – end index of the range in the calendar
- freq (str) – time frequency, available: year/quarter/month/week/day
Returns: data of a certain feature
Return type: pd.Series
-
-
class
qlib.data.data.
LocalExpressionProvider
¶ Local expression data provider class
Provide expression data from local data source.
-
expression
(instrument, field, start_time=None, end_time=None, freq='day')¶ Get Expression data.
Parameters: - instrument (str) – a certain instrument
- field (str) – a certain field of feature
- start_time (str) – start of the time range
- end_time (str) – end of the time range
- freq (str) – time frequency, available: year/quarter/month/week/day
Returns: data of a certain expression
Return type: pd.Series
-
-
class
qlib.data.data.
LocalDatasetProvider
¶ Local dataset data provider class
Provide dataset data from local data source.
-
dataset
(instruments, fields, start_time=None, end_time=None, freq='day')¶ Get dataset data.
Parameters: - instruments (list or dict) – list/dict of instruments or dict of stockpool config
- fields (list) – list of feature instances
- start_time (str) – start of the time range
- end_time (str) – end of the time range
- freq (str) – time frequency
Returns: a pandas dataframe with <instrument, datetime> index
Return type: pd.DataFrame
-
static
multi_cache_walker
(instruments, fields, start_time=None, end_time=None, freq='day')¶ This method is used to prepare the expression cache for the client. Then the client will load the data from expression cache by itself.
-
static
cache_walker
(inst, start_time, end_time, freq, column_names)¶ If the expressions of one instrument haven’t been calculated before, calculate it and write it into expression cache.
-
-
class
qlib.data.data.
ClientCalendarProvider
¶ Client calendar data provider class
Provide calendar data by requesting data from server as a client.
-
calendar
(start_time=None, end_time=None, freq='day', future=False)¶ Get calendar of certain market in given time range.
Parameters: - start_time (str) – start of the time range
- end_time (str) – end of the time range
- freq (str) – time frequency, available: year/quarter/month/week/day
- future (bool) – whether to include future trading days
Returns: calendar list
Return type: list
-
-
class
qlib.data.data.
ClientInstrumentProvider
¶ Client instrument data provider class
Provide instrument data by requesting data from server as a client.
-
list_instruments
(instruments, start_time=None, end_time=None, freq='day', as_list=False)¶ List the instruments based on a certain stockpool config.
Parameters: - instruments (dict) – stockpool config
- start_time (str) – start of the time range
- end_time (str) – end of the time range
- as_list (bool) – return instruments as list or dict
Returns: instruments list or dictionary with time spans
Return type: dict or list
-
-
class
qlib.data.data.
ClientDatasetProvider
¶ Client dataset data provider class
Provide dataset data by requesting data from server as a client.
-
dataset
(instruments, fields, start_time=None, end_time=None, freq='day', disk_cache=0, return_uri=False)¶ Get dataset data.
Parameters: - instruments (list or dict) – list/dict of instruments or dict of stockpool config
- fields (list) – list of feature instances
- start_time (str) – start of the time range
- end_time (str) – end of the time range
- freq (str) – time frequency
Returns: a pandas dataframe with <instrument, datetime> index
Return type: pd.DataFrame
-
-
class
qlib.data.data.
BaseProvider
¶ Local provider class
Kept for compatibility with the old Qlib provider.
-
features
(instruments, fields, start_time=None, end_time=None, freq='day', disk_cache=None)¶ Parameters: disk_cache (int) – whether to skip (0), use (1), or replace (2) the disk cache
This function first tries the cache method, which accepts the disk_cache keyword. If a TypeError is raised because the DatasetD instance is a plain provider class, it falls back to the provider method.
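The fallback described above can be sketched as follows. This is a simplified stand-in, not the library's code: the callables cached_dataset and plain_dataset and the reduced argument list are assumptions for illustration.

```python
def features(dataset_source, instruments, fields, disk_cache=None):
    """Try the cache-style call first, then fall back to the plain provider call."""
    try:
        # A cache wrapper accepts the disk_cache keyword.
        return dataset_source(instruments, fields, disk_cache=disk_cache)
    except TypeError:
        # A plain provider does not accept disk_cache, so call it without it.
        return dataset_source(instruments, fields)


def cached_dataset(instruments, fields, disk_cache=1):
    # Hypothetical cache-style source: reports which disk_cache mode it received.
    return ("cache", disk_cache)


def plain_dataset(instruments, fields):
    # Hypothetical provider-style source: has no disk_cache keyword.
    return ("provider", None)


from_cache = features(cached_dataset, ["SH600000"], ["$close"], disk_cache=2)
from_provider = features(plain_dataset, ["SH600000"], ["$close"], disk_cache=2)
```

One caveat of this pattern is that a TypeError raised inside the cache method itself also triggers the fallback, which can hide unrelated bugs.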
-
-
class
qlib.data.data.
LocalProvider
¶ -
features_uri
(instruments, fields, start_time, end_time, freq, disk_cache=1)¶ Return the uri of the generated cache of features/dataset
Parameters: - disk_cache –
- instruments –
- fields –
- start_time –
- end_time –
- freq –
-
-
class
qlib.data.data.
ClientProvider
¶ Client Provider
- Requests data from the server as a client. Supported requests:
- Calendar: responds directly with a list of calendars
- Instruments (without filter): responds directly with a list/dict of instruments
- Instruments (with filters): responds with a list/dict of instruments
- Features: responds with a cache uri
The general workflow is as follows: when the user makes a request through the client provider, the client provider connects to the server, sends the request, and waits for the response. The server responds immediately, indicating whether the cache is available. The wait ends only when the client receives a response saying that feature_available is true.
BUG: Every time we request data, we need to connect to the server, wait for the response, and disconnect from it. We cannot make a sequence of requests within one connection.
Refer to https://python-socketio.readthedocs.io/en/latest/client.html for the documentation of the python-socketIO client.
-
class
qlib.data.data.
Wrapper
¶ Data Provider Wrapper
-
qlib.data.data.
register_wrapper
(wrapper, cls_or_obj)¶ Parameters: - wrapper – A wrapper of all kinds of providers
- cls_or_obj – A class or class name or object instance in data/data.py
-
qlib.data.data.
register_all_wrappers
()¶
Filter¶
-
class
qlib.data.filter.
BaseDFilter
¶ Dynamic Instruments Filter Abstract class
Users can override this class to construct their own filter
Override __init__ to input filter regulations
Override filter_main to use the regulations to filter instruments
-
static
from_config
(config)¶ Construct an instance from config dict.
Parameters: config (dict) – dict of config parameters
-
to_config
()¶ Convert the instance to a config dict.
Returns: the dict of config parameters
Return type: dict
-
class
qlib.data.filter.
SeriesDFilter
(fstart_time=None, fend_time=None)¶ Dynamic Instruments Filter Abstract class to filter a series of certain features
Filters should provide parameters:
- filter start time
- filter end time
- filter rule
Override __init__ to assign a certain rule to filter the series.
Override _getFilterSeries to use the rule to filter the series and get a dict of {inst => series}, or override filter_main for more advanced series filter rule
-
filter_main
(instruments, start_time=None, end_time=None)¶ Implement this method to filter the instruments.
Parameters: - instruments (dict) – input instruments to be filtered
- start_time (str) – start of the time range
- end_time (str) – end of the time range
Returns: filtered instruments, same structure as input instruments
Return type: dict
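A series filter ultimately has to turn a per-date boolean result into time spans for each instrument. The sketch below shows one way to do that conversion with plain lists; the function name and the (start, end) tuple layout are assumptions, not Qlib's internal representation.

```python
def bool_series_to_spans(dates, flags):
    """Collapse consecutive True entries into (start, end) date spans."""
    spans = []
    start = None
    previous = None
    for date, flag in zip(dates, flags):
        if flag and start is None:
            start = date  # a new span opens
        elif not flag and start is not None:
            spans.append((start, previous))  # the span closed on the previous date
            start = None
        previous = date
    if start is not None:
        spans.append((start, previous))  # span still open at the end of the series
    return spans


dates = ["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07", "2020-01-08"]
flags = [True, True, False, True, False]
spans = bool_series_to_spans(dates, flags)
```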
-
class
qlib.data.filter.
NameDFilter
(name_rule_re, fstart_time=None, fend_time=None)¶ Name dynamic instrument filter
Filter the instruments based on a regulated name format.
A name rule regular expression is required.
-
static
from_config
(config)¶ Construct an instance from config dict.
Parameters: config (dict) – dict of config parameters
-
to_config
()¶ Convert the instance to a config dict.
Returns: the dict of config parameters
Return type: dict
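The idea behind a name-based filter can be sketched with the standard re module. Whether the real filter anchors the pattern with match or uses another matching mode is not stated here, so the use of re.match and the sample instrument dict are assumptions.

```python
import re


def filter_by_name(instruments, name_rule_re):
    """Keep only the instruments whose code matches the regular expression."""
    pattern = re.compile(name_rule_re)
    return {code: spans for code, spans in instruments.items() if pattern.match(code)}


# Hypothetical instruments dict: code -> list of (start, end) spans.
instruments = {
    "SH600055": [("2020-01-02", "2020-12-31")],
    "SH600000": [("2020-01-02", "2020-12-31")],
    "SZ000055": [("2020-01-02", "2020-12-31")],
}
kept = filter_by_name(instruments, "SH[0-9]{4}55")
```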
-
class
qlib.data.filter.
ExpressionDFilter
(rule_expression, fstart_time=None, fend_time=None, keep=False)¶ Expression dynamic instrument filter
Filter the instruments based on a certain expression.
An expression rule indicating a certain feature field is required.
Examples
- basic features filter : rule_expression = '$close/$open>5'
- cross-sectional features filter : rule_expression = '$rank($close)<10'
- time-sequence features filter : rule_expression = 'Ref($close, 3)>100'
-
from_config
()¶ Construct an instance from config dict.
Parameters: config (dict) – dict of config parameters
-
to_config
()¶ Convert the instance to a config dict.
Returns: the dict of config parameters
Return type: dict
Feature¶
Class¶
-
class
qlib.data.base.
Expression
¶ Expression base class
-
load
(instrument, start_index, end_index, freq)¶ load feature
Parameters: - instrument (str) – instrument code
- start_index (str) – feature start index [in calendar]
- end_index (str) – feature end index [in calendar]
- freq (str) – feature frequency
Returns: feature series: The index of the series is the calendar index
Return type: pd.Series
-
get_longest_back_rolling
()¶ Get the longest length of historical data the feature has accessed
This is designed for getting the range of data needed to calculate the features in a specific range. However, situations like Ref(Ref($close, -1), 1) cannot be handled correctly.
So this is only used for detecting the length of historical data needed.
-
get_extended_window_size
()¶ Get the extended window size.
To calculate this operator in the range [start_index, end_index], we have to get the leaf feature in the range [start_index - lft_etd, end_index + rght_etd].
Returns: lft_etd, rght_etd
Return type: (int, int)
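To make the extended window concrete, the sketch below propagates (lft_etd, rght_etd) through a small expression tree. The classes Leaf, Shift, and Rolling are hypothetical stand-ins for a raw feature, a Ref-like operator, and a rolling operator; the propagation rules are an assumption chosen to be conservative, not a copy of Qlib's code.

```python
class Leaf:
    """A raw feature such as $close; it needs no extra data."""

    def get_extended_window_size(self):
        return 0, 0


class Shift:
    """Ref-like node: n > 0 looks n periods back, n < 0 looks ahead."""

    def __init__(self, child, n):
        self.child, self.n = child, n

    def get_extended_window_size(self):
        lft_etd, rght_etd = self.child.get_extended_window_size()
        # Never shrink an extension that a child already requires.
        return max(lft_etd + self.n, lft_etd), max(rght_etd - self.n, rght_etd)


class Rolling:
    """Rolling node over a window of n >= 1 periods ending at the current one."""

    def __init__(self, child, n):
        self.child, self.n = child, n

    def get_extended_window_size(self):
        lft_etd, rght_etd = self.child.get_extended_window_size()
        # A window of n periods needs n - 1 extra periods on the left.
        return lft_etd + self.n - 1, rght_etd


rolling_of_shift = Rolling(Shift(Leaf(), 2), 5).get_extended_window_size()
nested_shift = Shift(Shift(Leaf(), -1), 1).get_extended_window_size()
```

With these rules, a 5-period rolling operator over a 2-period look-back needs 6 extra periods on the left, while the nested shift case stays conservative and asks for one extra period on each side.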
-
-
class
qlib.data.base.
Feature
(name=None)¶ Static Expression
This kind of feature will load data from provider
-
get_longest_back_rolling
()¶ Get the longest length of historical data the feature has accessed
This is designed for getting the range of data needed to calculate the features in a specific range. However, situations like Ref(Ref($close, -1), 1) cannot be handled correctly.
So this is only used for detecting the length of historical data needed.
-
get_extended_window_size
()¶ Get the extended window size.
To calculate this operator in the range [start_index, end_index], we have to get the leaf feature in the range [start_index - lft_etd, end_index + rght_etd].
Returns: lft_etd, rght_etd
Return type: (int, int)
-
-
class
qlib.data.base.
ExpressionOps
¶ Operator Expression
This kind of feature will use operator for feature construction on the fly.
Operator¶
-
class
qlib.data.ops.
Abs
(feature)¶ Feature Absolute Value
Parameters: feature (Expression) – feature instance
Returns: a feature instance with absolute output
Return type: Expression
-
class
qlib.data.ops.
Sign
(feature)¶ Feature Sign
Parameters: feature (Expression) – feature instance
Returns: a feature instance with sign
Return type: Expression
-
class
qlib.data.ops.
Log
(feature)¶ Feature Log
Parameters: feature (Expression) – feature instance
Returns: a feature instance with log
Return type: Expression
-
class
qlib.data.ops.
Power
(feature, exponent)¶ Feature Power
Parameters: - feature (Expression) – feature instance
- exponent – the exponent applied to the feature
Returns: a feature instance with power
Return type: Expression
-
class
qlib.data.ops.
Mask
(feature, instrument)¶ Feature Mask
Parameters: - feature (Expression) – feature instance
- instrument (str) – instrument mask
Returns: a feature instance with masked instrument
Return type:
-
class
qlib.data.ops.
Not
(feature)¶ Not Operator
Parameters: feature (Expression) – feature instance
Returns: feature elementwise not output
Return type:
-
class
qlib.data.ops.
Add
(feature_left, feature_right)¶ Add Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: two features’ sum
Return type:
-
class
qlib.data.ops.
Sub
(feature_left, feature_right)¶ Subtract Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: two features’ subtraction
Return type:
-
class
qlib.data.ops.
Mul
(feature_left, feature_right)¶ Multiply Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: two features’ product
Return type:
-
class
qlib.data.ops.
Div
(feature_left, feature_right)¶ Division Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: two features’ division
Return type:
-
class
qlib.data.ops.
Greater
(feature_left, feature_right)¶ Greater Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: greater elements taken from the input two features
Return type:
-
class
qlib.data.ops.
Less
(feature_left, feature_right)¶ Less Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: smaller elements taken from the input two features
Return type:
-
class
qlib.data.ops.
Gt
(feature_left, feature_right)¶ Greater Than Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: bool series indicating left > right
Return type:
-
class
qlib.data.ops.
Ge
(feature_left, feature_right)¶ Greater Than or Equal Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: bool series indicating left >= right
Return type:
-
class
qlib.data.ops.
Lt
(feature_left, feature_right)¶ Less Than Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: bool series indicating left < right
Return type:
-
class
qlib.data.ops.
Le
(feature_left, feature_right)¶ Less Than or Equal Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: bool series indicating left <= right
Return type:
-
class
qlib.data.ops.
Eq
(feature_left, feature_right)¶ Equal Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: bool series indicating left == right
Return type:
-
class
qlib.data.ops.
Ne
(feature_left, feature_right)¶ Not Equal Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: bool series indicating left != right
Return type:
-
class
qlib.data.ops.
And
(feature_left, feature_right)¶ And Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: two features’ row by row & output
Return type:
-
class
qlib.data.ops.
Or
(feature_left, feature_right)¶ Or Operator
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
Returns: two features’ row by row | outputs
Return type:
-
class
qlib.data.ops.
If
(condition, feature_left, feature_right)¶ If Operator
Parameters: - condition (Expression) – feature instance with bool values as condition
- feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
-
get_longest_back_rolling
()¶ Get the longest length of historical data the feature has accessed
This is designed for getting the range of data needed to calculate the features in a specific range. However, situations like Ref(Ref($close, -1), 1) cannot be handled correctly.
So this is only used for detecting the length of historical data needed.
-
get_extended_window_size
()¶ Get the extended window size.
To calculate this operator in the range [start_index, end_index], we have to get the leaf feature in the range [start_index - lft_etd, end_index + rght_etd].
Returns: lft_etd, rght_etd
Return type: (int, int)
-
class
qlib.data.ops.
Ref
(feature, N)¶ Feature Reference
Parameters: - feature (Expression) – feature instance
- N (int) – N = 0, retrieve the first data; N > 0, retrieve data of N periods ago; N < 0, future data
Returns: a feature instance with target reference
Return type:
-
get_longest_back_rolling
()¶ Get the longest length of historical data the feature has accessed
This is designed for getting the range of data needed to calculate the features in a specific range. However, situations like Ref(Ref($close, -1), 1) cannot be handled correctly.
So this is only used for detecting the length of historical data needed.
-
get_extended_window_size
()¶ Get the extended window size.
To calculate this operator in the range [start_index, end_index], we have to get the leaf feature in the range [start_index - lft_etd, end_index + rght_etd].
Returns: lft_etd, rght_etd
Return type: (int, int)
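The meaning of the N parameter of Ref can be illustrated on a plain list. This sketch reads "N = 0, retrieve the first data" as filling every position with the first value, which is an interpretation of the parameter description; None marks positions with no data.

```python
def ref(values, n):
    """Shift a list of values following the documented meaning of N."""
    if n == 0:
        # N = 0: every position takes the first data point (assumed reading).
        return [values[0]] * len(values) if values else []
    if n > 0:
        # N > 0: value from n periods ago; the first n positions have no data.
        return [None] * min(n, len(values)) + values[: max(len(values) - n, 0)]
    # N < 0: future value, -n periods ahead; the last -n positions have no data.
    k = -n
    return values[k:] + [None] * min(k, len(values))


close = [10, 11, 12, 13]
one_period_ago = ref(close, 1)
one_period_ahead = ref(close, -1)
first_value = ref(close, 0)
```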
-
class
qlib.data.ops.
Mean
(feature, N)¶ Rolling Mean (MA)
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling average
Return type:
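The rolling operators in this section share the same windowing pattern. As a minimal sketch, a rolling mean over a plain list could look like the following; using shorter windows at the start of the series is an assumption, and the real operator works on feature series rather than lists.

```python
def rolling_mean(values, n):
    """Mean over the last n values; shorter windows are used at the start."""
    result = []
    for i in range(len(values)):
        # Window covers positions i - n + 1 .. i, clipped at the series start.
        window = values[max(0, i - n + 1): i + 1]
        result.append(sum(window) / len(window))
    return result


means = rolling_mean([1, 2, 3, 4, 5], 3)
```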
-
class
qlib.data.ops.
Sum
(feature, N)¶ Rolling Sum
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling sum
Return type:
-
class
qlib.data.ops.
Std
(feature, N)¶ Rolling Std
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling std
Return type:
-
class
qlib.data.ops.
Var
(feature, N)¶ Rolling Variance
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling variance
Return type:
-
class
qlib.data.ops.
Skew
(feature, N)¶ Rolling Skewness
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling skewness
Return type:
-
class
qlib.data.ops.
Kurt
(feature, N)¶ Rolling Kurtosis
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling kurtosis
Return type:
-
class
qlib.data.ops.
Max
(feature, N)¶ Rolling Max
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling max
Return type:
-
class
qlib.data.ops.
IdxMax
(feature, N)¶ Rolling Max Index
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling max index
Return type:
-
class
qlib.data.ops.
Min
(feature, N)¶ Rolling Min
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling min
Return type:
-
class
qlib.data.ops.
IdxMin
(feature, N)¶ Rolling Min Index
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling min index
Return type:
-
class
qlib.data.ops.
Quantile
(feature, N, qscore)¶ Rolling Quantile
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
- qscore – the quantile to compute
Returns: a feature instance with rolling quantile
Return type:
-
class
qlib.data.ops.
Med
(feature, N)¶ Rolling Median
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling median
Return type:
-
class
qlib.data.ops.
Mad
(feature, N)¶ Rolling Mean Absolute Deviation
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling mean absolute deviation
Return type:
-
class
qlib.data.ops.
Rank
(feature, N)¶ Rolling Rank (Percentile)
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling rank
Return type:
-
class
qlib.data.ops.
Count
(feature, N)¶ Rolling Count
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling count of number of non-NaN elements
Return type:
-
class
qlib.data.ops.
Delta
(feature, N)¶ Rolling Delta
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with end minus start in rolling window
Return type:
-
class
qlib.data.ops.
Slope
(feature, N)¶ Rolling Slope
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with regression slope of given window
Return type:
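A regression slope over a window can be sketched as an ordinary least-squares fit of the window values against their positions 0, 1, ..., m-1. Regressing against the position index and returning None for windows with fewer than two points are assumptions made for this example.

```python
def rolling_slope(values, n):
    """Least-squares slope of each window's values against positions 0..m-1."""
    result = []
    for i in range(len(values)):
        window = values[max(0, i - n + 1): i + 1]
        m = len(window)
        if m < 2:
            result.append(None)  # a slope needs at least two points
            continue
        x_mean = (m - 1) / 2
        y_mean = sum(window) / m
        # slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean) ** 2)
        cov = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(window))
        var = sum((x - x_mean) ** 2 for x in range(m))
        result.append(cov / var)
    return result


slopes = rolling_slope([1, 3, 5, 7], 3)
```

For a series that rises by 2 every period, every window with at least two points has slope 2.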
-
class
qlib.data.ops.
Rsquare
(feature, N)¶ Rolling R-value Square
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with regression r-value square of given window
Return type:
-
class
qlib.data.ops.
Resi
(feature, N)¶ Rolling Regression Residuals
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with regression residuals of given window
Return type:
-
class
qlib.data.ops.
WMA
(feature, N)¶ Rolling WMA
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with weighted moving average output
Return type:
-
class
qlib.data.ops.
EMA
(feature, N)¶ Rolling Exponential Mean (EMA)
Parameters: - feature (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with exponential moving average output
Return type:
-
class
qlib.data.ops.
Corr
(feature_left, feature_right, N)¶ Rolling Correlation
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling correlation of two input features
Return type:
-
class
qlib.data.ops.
Cov
(feature_left, feature_right, N)¶ Rolling Covariance
Parameters: - feature_left (Expression) – feature instance
- feature_right (Expression) – feature instance
- N (int) – rolling window size
Returns: a feature instance with rolling covariance of two input features
Return type:
Cache¶
-
class
qlib.data.cache.
MemCacheUnit
(*args, **kwargs)¶ Memory Cache Unit.
-
class
qlib.data.cache.
MemCache
(mem_cache_size_limit=None, limit_type='length')¶ Memory cache.
-
class
qlib.data.cache.
ExpressionCache
(provider)¶ Expression cache mechanism base class.
This class is used to wrap expression provider with self-defined expression cache mechanism.
Note
Override the _uri and _expression method to create your own expression cache mechanism.
-
expression
(instrument, field, start_time, end_time, freq)¶ Get expression data.
Note
Same interface as expression method in expression provider
-
update
(cache_uri)¶ Update expression cache to latest calendar.
Override this method to define how to update the expression cache according to the user's own cache mechanism.
Parameters: cache_uri (str) – the complete uri of the expression cache file (including the directory path)
Returns: 0 (successful update) / 1 (no need to update) / 2 (update failure)
Return type: int
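When overriding update, the three integer return codes can be named to keep the logic readable. The sketch below is a hypothetical stand-in: the UpdateStatus enum, the simplified function signature, and the comparison against the last calendar date are all assumptions, and no cache file is touched.

```python
from enum import IntEnum


class UpdateStatus(IntEnum):
    SUCCESS = 0   # successful update
    NO_NEED = 1   # no need to update
    FAILURE = 2   # update failure


def update(cache_last_date, calendar):
    """Hypothetical update routine returning the documented status codes."""
    if not calendar:
        return UpdateStatus.FAILURE
    if cache_last_date >= calendar[-1]:
        return UpdateStatus.NO_NEED
    # A real implementation would append the missing dates to the cache here.
    return UpdateStatus.SUCCESS


calendar = ["2020-01-02", "2020-01-03"]
status_stale = update("2020-01-02", calendar)
status_fresh = update("2020-01-03", calendar)
```

Because IntEnum members compare equal to plain integers, callers that expect 0, 1, or 2 keep working.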
-
-
class
qlib.data.cache.
DatasetCache
(provider)¶ Dataset cache mechanism base class.
This class is used to wrap dataset provider with self-defined dataset cache mechanism.
Note
Override the _uri and _dataset method to create your own dataset cache mechanism.
-
dataset
(instruments, fields, start_time=None, end_time=None, freq='day', disk_cache=1)¶ Get feature dataset.
Note
Same interface as dataset method in dataset provider
Note
The server uses redis_lock to make sure that read-write conflicts are not triggered, but client readers are not considered.
-
update
(cache_uri)¶ Update dataset cache to latest calendar.
Override this method to define how to update the dataset cache according to the user's own cache mechanism.
Parameters: cache_uri (str) – the complete uri of the dataset cache file (including the directory path)
Returns: 0 (successful update) / 1 (no need to update) / 2 (update failure)
Return type: int
-
static
cache_to_origin_data
(data, fields)¶ Convert cache data to origin data
Parameters: - data – pd.DataFrame, cache data
- fields – feature fields
Returns: pd.DataFrame
-
static
normalize_uri_args
(instruments, fields, freq)¶ Normalize uri args
-
-
class
qlib.data.cache.
DiskExpressionCache
(provider, **kwargs)¶ Prepared cache mechanism for server.
-
gen_expression_cache
(expression_data, cache_path, instrument, field, freq, last_update)¶ Save the expression data in a bin file, in the same way as feature data.
-
update
(sid, cache_uri)¶ Update expression cache to latest calendar.
Override this method to define how to update the expression cache according to the user's own cache mechanism.
Parameters: cache_uri (str) – the complete uri of the expression cache file (including the directory path)
Returns: 0 (successful update) / 1 (no need to update) / 2 (update failure)
Return type: int
-
-
class
qlib.data.cache.
DiskDatasetCache
(provider, **kwargs)¶ Prepared cache mechanism for server.
-
classmethod
read_data_from_cache
(cache_path, start_time, end_time, fields)¶ Read data from the disk dataset cache.
Parameters: - cache_path –
- start_time –
- end_time –
- fields – The field order of the dataset cache is sorted, so the columns are rearranged to keep them consistent
Returns:
-
class
IndexManager
(cache_path)¶ The lock is not considered in the class. Please consider the lock outside the code. This class is the proxy of the disk data.
-
gen_dataset_cache
(cache_path, instruments, fields, freq)¶ Note
This function does not consider the cache read-write lock. Please acquire the lock outside this function.
The cache consists of 3 parts (each followed by a typical filename).
- index : cache/d41366901e25de3ec47297f12e2ba11d.index
The content of the file may be in the following format (pandas.Series):
                     start  end
1999-11-10 00:00:00      0    1
1999-11-11 00:00:00      1    2
1999-11-12 00:00:00      2    3
...
Note
The start is closed. The end is open!
- Each line contains two elements <timestamp, end_index>
- It indicates the end_index of the data for timestamp
- meta data : cache/d41366901e25de3ec47297f12e2ba11d.meta
- data : cache/d41366901e25de3ec47297f12e2ba11d
- This is an hdf file sorted by datetime
Parameters: - cache_path – The path to store the cache
- instruments – The instruments to store the cache
- fields – The fields to store the cache
- freq – The freq to store the cache
Returns: pd.DataFrame; the fields of the returned DataFrame are consistent with the parameters of the function
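The closed-start, open-end convention of the index file maps directly onto Python slicing. The sketch below uses a plain dict in place of the index file and a list in place of the data file; both stand-ins and the function name are assumptions for illustration.

```python
# Stand-in for the .index file: timestamp -> (start, end), start closed, end open.
index = {
    "1999-11-10 00:00:00": (0, 1),
    "1999-11-11 00:00:00": (1, 2),
    "1999-11-12 00:00:00": (2, 3),
}
# Stand-in for the data file, sorted by datetime.
rows = ["row-0", "row-1", "row-2"]


def rows_between(index, rows, start_time, end_time):
    """Return the data rows whose timestamps fall in [start_time, end_time]."""
    selected = [span for ts, span in index.items() if start_time <= ts <= end_time]
    if not selected:
        return []
    begin = selected[0][0]   # start of the first matching timestamp (closed)
    stop = selected[-1][1]   # end of the last matching timestamp (open)
    return rows[begin:stop]


picked = rows_between(index, rows, "1999-11-11 00:00:00", "1999-11-12 00:00:00")
```

Because the end is open, the stored end value can be passed to a slice unchanged.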
-
update
(cache_uri)¶ Update dataset cache to latest calendar.
Override this method to define how to update the dataset cache according to the user's own cache mechanism.
Parameters: cache_uri (str) – the complete uri of the dataset cache file (including the directory path)
Returns: 0 (successful update) / 1 (no need to update) / 2 (update failure)
Return type: int
-
Contrib¶
Data Handler¶
-
class
qlib.contrib.estimator.handler.
BaseDataHandler
(processors=[], **kwargs)¶ -
split_rolling_periods
(train_start_date, train_end_date, validate_start_date, validate_end_date, test_start_date, test_end_date, rolling_period, calendar_freq='day')¶ Calculate the rolling split periods; the periods roll on the market calendar.
Parameters: - train_start_date –
- train_end_date –
- validate_start_date –
- validate_end_date –
- test_start_date –
- test_end_date –
- rolling_period – the market period of rolling
- calendar_freq – the frequency of the market calendar
Yields: rolling split periods
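A rough sketch of rolling on a market calendar is given below. The rolling rule is an assumption (every boundary moves forward by rolling_period trading days, and rolling stops once the test end leaves the calendar); the function name, the tuple arguments and the toy calendar are hypothetical and do not reflect the actual signature.

```python
def split_rolling_periods_sketch(calendar, train, valid, test, rolling_period):
    """Yield (train, valid, test) date ranges that roll on the calendar.

    calendar is a sorted list of trading dates; train, valid and test are
    (start, end) pairs whose dates must be present in the calendar.
    Assumption: all six boundaries shift by rolling_period positions per step.
    """
    pos = [calendar.index(d) for d in (*train, *valid, *test)]
    while pos[-1] < len(calendar):
        dates = [calendar[i] for i in pos]
        yield tuple(dates[0:2]), tuple(dates[2:4]), tuple(dates[4:6])
        pos = [i + rolling_period for i in pos]


calendar = [f"2020-01-{day:02d}" for day in range(1, 11)]  # toy calendar
periods = list(split_rolling_periods_sketch(
    calendar,
    train=("2020-01-01", "2020-01-03"),
    valid=("2020-01-04", "2020-01-05"),
    test=("2020-01-06", "2020-01-07"),
    rolling_period=2,
))
```

Working on calendar positions rather than on dates keeps every period aligned with trading days, which is what rolling "on the market calendar" suggests.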
-
get_split_data
(train_start_date, train_end_date, validate_start_date, validate_end_date, test_start_date, test_end_date)¶ all return types are DataFrame
-
setup_process_data
(df_train, df_valid, df_test)¶ Process the train, valid and test data.
Returns: the processed train, valid and test data.
-
get_origin_test_label_with_date
(test_start_date, test_end_date, freq='day')¶ Get origin test label
Parameters: - test_start_date – test start date
- test_end_date – test end date
- freq – freq
Returns: pd.DataFrame
-
setup_feature
()¶ - Implement this method to load raw feature.
- the format of the feature is below
return: df_features
-
setup_label
()¶ - Implement this method to load and calculate label.
- the format of the label is below
return: df_label
-
-
class
qlib.contrib.estimator.handler.
QLibDataHandler
(start_date, end_date, *args, **kwargs)¶ -
setup_feature
()¶ Load the raw data.
Returns: df_features
-
setup_label
()¶ Build up labels in df through the user’s method.
Returns: df_labels
-
-
qlib.contrib.estimator.handler.
parse_config_to_fields
(config)¶ create factors from config
config = {
    'kbar': {},  # whether to use some hard-code kbar features
    'price': {  # whether to use raw price features
        'windows': [0, 1, 2, 3, 4],  # use price at n days ago
        'feature': ['OPEN', 'HIGH', 'LOW'],  # which price field to use
    },
    'volume': {  # whether to use raw volume features
        'windows': [0, 1, 2, 3, 4],  # use volume at n days ago
    },
    'rolling': {  # whether to use rolling operator based features
        'windows': [5, 10, 20, 30, 60],  # rolling windows size
        'include': ['ROC', 'MA', 'STD'],  # rolling operator to use; if include is None we will use default operators
        'exclude': ['RANK'],  # rolling operator not to use
    },
}
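To give a feel for how such a config could expand into fields, here is a minimal sketch covering only the 'price' and 'volume' sections. The expression templates are illustrative assumptions; the fields actually created by parse_config_to_fields are not documented here and may differ. `config_to_fields_sketch` is a hypothetical helper.

```python
def config_to_fields_sketch(config):
    """Expand a feature config into a list of expression strings.

    Assumption: one Ref(...) expression per (field, window) pair; this is
    only an illustration, not the library's expansion rule.
    """
    fields = []
    price = config.get("price", {})
    for name in price.get("feature", []):
        for window in price.get("windows", []):
            # value of the price field `window` days ago, e.g. Ref($open, 2)
            fields.append(f"Ref(${name.lower()}, {window})")
    for window in config.get("volume", {}).get("windows", []):
        fields.append(f"Ref($volume, {window})")
    return fields


fields = config_to_fields_sketch({
    "price": {"windows": [0, 1], "feature": ["OPEN", "HIGH"]},
    "volume": {"windows": [0]},
})
```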
-
class
qlib.contrib.estimator.handler.
ConfigQLibDataHandler
(start_date, end_date, processors=None, **kwargs)¶
-
class
qlib.contrib.estimator.handler.
ALPHA360
(start_date, end_date, processors=None, **kwargs)¶
-
class
qlib.contrib.estimator.handler.
QLibDataHandlerV1
(start_date, end_date, processors=None, **kwargs)¶ -
setup_label
()¶ Load the labels df.
Returns: df_labels
-
-
class
qlib.contrib.estimator.handler.
QLibDataHandlerClose
(start_date, end_date, processors=None, **kwargs)¶
Model¶
-
class
qlib.contrib.model.base.
Model
¶ Model base class
-
fit
(x_train, y_train, x_valid, y_valid, w_train=None, w_valid=None, **kwargs)¶ Fit the model on the train data with cross-validation. The model is fit when ex_config.finetune is False.
Parameters: - x_train (pd.DataFrame) – train data
- y_train (pd.DataFrame) – train label
- x_valid (pd.DataFrame) – valid data
- y_valid (pd.DataFrame) – valid label
- w_train (pd.DataFrame) – train weight
- w_valid (pd.DataFrame) – valid weight
Returns: trained model
Return type:
-
score
(x_test, y_test, w_test=None, **kwargs)¶ Evaluate the model with test data/label
Parameters: - x_test (pd.DataFrame) – test data
- y_test (pd.DataFrame) – test label
- w_test (pd.DataFrame) – test weight
Returns: evaluation score
Return type: float
-
predict
(x_test, **kwargs)¶ Predict labels for the given test data
Parameters: x_test (pd.DataFrame) – test data Returns: predicted test labels Return type: np.ndarray
-
save
(fname, **kwargs)¶ save model
Parameters: fname (str) – model filename
-
load
(buffer, **kwargs)¶ load model
Parameters: buffer (bytes) – binary data of model parameters Returns: loaded model Return type: Model
-
get_data_with_date
(date, **kwargs)¶ Called by the online module. It needs to return the data used to predict the label (score) of stocks at date.
Parameters: date (pd.Timestamp) – predict date
Returns: data: the input data used to predict the label (score) of stocks at the predict date.
-
finetune
(x_train, y_train, x_valid, y_valid, w_train=None, w_valid=None, **kwargs)¶ Finetune model
In RollingTrainer:
- if loader.model_index is None:
- If a ‘Static Model’ is provided, update based on the provided ‘Static’ model. If a ‘Rolling Model’ is provided, skip the model load and update based on the last provided model.
- if loader.model_index is not None:
- Update based on the provided model (loader.model_index).
In StaticTrainer:
- If the loaded model is a ‘static model’:
- Update based on the ‘static model’.
- If the loaded model is a ‘rolling model’:
- Update based on the provided model (loader.model_index). If loader.model_index is None, use the last model.
Parameters: - x_train (pd.DataFrame) – train data
- y_train (pd.DataFrame) – train label
- x_valid (pd.DataFrame) – valid data
- y_valid (pd.DataFrame) – valid label
- w_train (pd.DataFrame) – train weight
- w_valid (pd.DataFrame) – valid weight
Returns: finetuned model
Return type:
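The shape of the Model interface can be illustrated with a toy class. This is only a sketch under stated assumptions: `MeanModelSketch` is hypothetical, it mirrors the documented method names without importing or subclassing qlib, and it takes plain lists where the real interface takes pandas objects.

```python
import pickle


class MeanModelSketch:
    """Toy model that always predicts the mean training label."""

    def __init__(self):
        self.mean = 0.0

    def fit(self, x_train, y_train, x_valid, y_valid, w_train=None, w_valid=None, **kwargs):
        # Validation data and weights are accepted but unused in this toy model.
        self.mean = sum(y_train) / len(y_train)
        return self

    def predict(self, x_test, **kwargs):
        return [self.mean for _ in x_test]

    def score(self, x_test, y_test, w_test=None, **kwargs):
        # Negative mean squared error, so that a higher score is better.
        pred = self.predict(x_test)
        return -sum((p - y) ** 2 for p, y in zip(pred, y_test)) / len(y_test)

    def save(self, fname, **kwargs):
        with open(fname, "wb") as f:
            pickle.dump(self.mean, f)

    def load(self, buffer, **kwargs):
        # buffer holds the binary model parameters, as documented for load.
        self.mean = pickle.loads(buffer)
        return self


model = MeanModelSketch().fit([[1], [2], [3]], [0.1, 0.2, 0.3], [[4]], [0.2])
restored = MeanModelSketch().load(pickle.dumps(model.mean))
```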
-
Strategy¶
-
class
qlib.contrib.strategy.strategy.
StrategyWrapper
(inner_strategy)¶ StrategyWrapper is a wrapper of another strategy. It overrides some methods to change the behavior of the basic strategy. Cost control and risk control will be based on this class.
-
class
qlib.contrib.strategy.strategy.
AdjustTimer
¶ Responsible for timing of position adjusting
This is designed as a multiple-inheritance mechanism because:
- is_adjust may need access to the internal state of a strategy
- it can be regarded as an enhancement to the existing strategy
-
is_adjust
(trade_date)¶ Return whether the strategy can adjust positions on trade_date. This is normally used by strategies that trade at a given trade frequency.
-
-
class
qlib.contrib.strategy.strategy.
ListAdjustTimer
(adjust_dates=None)¶ -
is_adjust
(trade_date)¶ Return whether the strategy can adjust positions on trade_date. This is normally used by strategies that trade at a given trade frequency.
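A list-based adjust timer could look roughly like the sketch below. `ListAdjustTimerSketch` is a hypothetical standalone class, and treating adjust_dates=None as "adjust on every date" is an assumption, since the None case is not documented here.

```python
class ListAdjustTimerSketch:
    """Allow position adjustment only on the listed dates."""

    def __init__(self, adjust_dates=None):
        # Assumption: None means there is no restriction on adjust dates.
        self.adjust_dates = None if adjust_dates is None else set(adjust_dates)

    def is_adjust(self, trade_date):
        if self.adjust_dates is None:
            return True
        return trade_date in self.adjust_dates


timer = ListAdjustTimerSketch(adjust_dates=["2020-01-02", "2020-01-09"])
```

A strategy that mixes in such a timer would check is_adjust(trade_date) before generating orders and keep its positions unchanged on all other dates.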
-
-
class
qlib.contrib.strategy.strategy.
WeightStrategyBase
(order_generator_cls_or_obj=<class 'qlib.contrib.strategy.order_generator.OrderGenWInteract'>, *args, **kwargs)¶ -
generate_target_weight_position
(score, current, trade_date)¶ Generate the target position from the score for this date and the current position. The cash is not considered in the position.
Parameters: - score (pd.Series) – pred score for this trade date; index is stock_id, contains a ‘score’ column
- current (Position()) – current position
- trade_exchange (Exchange()) –
- trade_date – trade date
-
generate_order_list
(score_series, current, trade_exchange, pred_date, trade_date)¶
Parameters: - score_series (pd.Series) – stock_id, score
- current (Position()) – current position of the account
- trade_exchange (Exchange()) – exchange
- trade_date (pd.Timestamp) – date
-
-
class
qlib.contrib.strategy.strategy.
TopkDropoutStrategy
(topk, n_drop, method='bottom', risk_degree=0.95, thresh=1, hold_thresh=1, **kwargs)¶ -
get_risk_degree
(date)¶ Return the proportion of your total value that will be used in investment. Changing risk_degree dynamically results in market timing.
-
generate_order_list
(score_series, current, trade_exchange, pred_date, trade_date)¶ Generate the order list according to score_series at trade_date. This will not change current.
Parameters: - score_series (pd.Series) – stock_id, score
- current (Position()) – current position of the account
- trade_exchange (Exchange()) – exchange
- pred_date (pd.Timestamp) – predict date
- trade_date (pd.Timestamp) – trade date
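The idea of a top-k strategy with a per-day drop count can be sketched as follows. This is a simplified illustration and not the TopkDropoutStrategy logic: the selection rule is an assumption, `topk_dropout_sketch` is hypothetical, and order amounts, thresholds and the exchange are left out.

```python
def topk_dropout_sketch(scores, holdings, topk, n_drop):
    """Return (sell, buy) stock lists for one trading date.

    scores maps stock_id -> score; holdings is the list of held stock_ids.
    Assumption: up to n_drop of the lowest-scored holdings are sold, as long
    as unheld candidates exist, and free slots up to topk are filled with
    the highest-scored stocks that are not held.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    candidates = [s for s in ranked if s not in holdings]
    held_ranked = sorted(holdings, key=lambda s: scores.get(s, float("-inf")))
    n_sell = min(n_drop, len(held_ranked), len(candidates))
    sell = held_ranked[:n_sell]
    n_buy = topk - (len(holdings) - n_sell)
    buy = candidates[:max(n_buy, 0)]
    return sell, buy


scores = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.7, "E": 0.2}
sell, buy = topk_dropout_sketch(scores, holdings=["A", "C", "E"], topk=3, n_drop=1)
initial_sell, initial_buy = topk_dropout_sketch(scores, holdings=[], topk=3, n_drop=1)
```

Limiting the number of replaced stocks per day keeps turnover bounded, which is the usual motivation for an n_drop parameter.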
-
Evaluate¶
-
qlib.contrib.evaluate.
risk_analysis
(r, N=252)¶ Risk Analysis
Parameters: - r (pandas.Series) – daily return series
- N (int) – scaler for annualizing information_ratio (day: 250, week: 50, month: 12)
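The role of N as an annualizing scaler can be sketched as below. The formulas are assumptions (returns accumulated by simple summation, sample standard deviation, drawdown measured on the cumulative sum) and may not match the library's implementation; `risk_analysis_sketch` is a hypothetical stdlib-only helper that takes a plain list instead of a pandas.Series.

```python
import math
import statistics


def risk_analysis_sketch(r, N=252):
    """Summarise a list of per-period returns."""
    mean = statistics.mean(r)
    std = statistics.stdev(r)  # sample standard deviation
    cumulative, peak, max_drawdown = 0.0, float("-inf"), 0.0
    for x in r:
        cumulative += x
        peak = max(peak, cumulative)
        # Drawdown is the gap between the running peak and the current value.
        max_drawdown = min(max_drawdown, cumulative - peak)
    return {
        "mean": mean,
        "std": std,
        "annualized_return": mean * N,
        "information_ratio": mean / std * math.sqrt(N),
        "max_drawdown": max_drawdown,
    }


risk = risk_analysis_sketch([0.01, -0.02, 0.03, -0.01], N=252)
```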
-
qlib.contrib.evaluate.
get_strategy
(strategy=None, topk=50, margin=0.5, n_drop=5, risk_degree=0.95, str_type='amount', adjust_dates=None)¶ Parameters: - strategy (Strategy()) – strategy used in backtest
- topk (int (Default value: 50)) – top-N stocks to buy.
- margin (int or float (Default value: 0.5)) – buffer margin; in single score_mode, continue holding a stock if it is in nlargest(sell_limit). sell_limit should be no less than topk. If isinstance(margin, int), sell_limit = margin; otherwise sell_limit = pred_in_a_day.count() * margin.
- n_drop (int) – number of stocks to be replaced in each trading date
- risk_degree (float) – 0-1, 0.95 for example, use 95% money to trade
- str_type ('amount', 'weight' or 'dropout') – strategy type: TopkAmountStrategy ,TopkWeightStrategy or TopkDropoutStrategy
Returns: - class: Strategy
- an initialized strategy object
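The margin rule for get_strategy can be written out as a small function. `sell_limit_sketch` and `n_scored` are hypothetical names; `n_scored` stands in for pred_in_a_day.count(), the number of stocks scored on that day.

```python
def sell_limit_sketch(margin, n_scored):
    """Compute sell_limit from margin.

    An int margin is used as an absolute count; any other value is treated
    as a fraction of the stocks scored on that day.
    """
    if isinstance(margin, int):
        return margin
    return n_scored * margin


absolute = sell_limit_sketch(60, n_scored=500)
relative = sell_limit_sketch(0.5, n_scored=500)
```

With topk=50, both results satisfy the documented requirement that sell_limit be no less than topk.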
-
qlib.contrib.evaluate.
get_exchange
(pred, exchange=None, subscribe_fields=[], open_cost=0.0015, close_cost=0.0025, min_cost=5.0, trade_unit=None, limit_threshold=None, deal_price=None, extract_codes=False, shift=1)¶ Exchange related arguments:
Parameters: - exchange (Exchange()) –
- subscribe_fields (list) – subscribe fields
- open_cost (float) – open transaction cost
- close_cost (float) – close transaction cost
- min_cost (float) – min transaction cost
- trade_unit (int) – 100 for China A
- deal_price (str) – dealing price type: ‘close’, ‘open’, ‘vwap’
- limit_threshold (float) – limit move 0.1 (10%) for example, long and short with same limit
- extract_codes (bool) – whether to pass the codes extracted from pred to the exchange. NOTE: This will be faster with offline qlib.
Returns: - class: Exchange
- an initialized Exchange object
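How open_cost, close_cost and min_cost might combine for a single trade is sketched below. The formula is an assumption (a proportional cost with a floor of min_cost, open_cost for buys and close_cost for sells), and `transaction_cost_sketch` is a hypothetical helper, not part of the Exchange API.

```python
def transaction_cost_sketch(trade_value, is_buy, open_cost=0.0015, close_cost=0.0025, min_cost=5.0):
    """Cost of one trade: proportional cost, but never below min_cost."""
    ratio = open_cost if is_buy else close_cost
    return max(trade_value * ratio, min_cost)


small_buy = transaction_cost_sketch(1_000, is_buy=True)      # floor applies
large_sell = transaction_cost_sketch(100_000, is_buy=False)  # proportional cost applies
```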
-
qlib.contrib.evaluate.
backtest
(pred, account=1000000000.0, shift=1, benchmark='SH000905', verbose=True, **kwargs)¶ This function helps you set up a reasonable Exchange and provides default values for the strategy.
Backtest workflow related or common arguments:
Parameters: - pred (pandas.DataFrame) – predictions; should have an <instrument, datetime> index and one score column
- account (float) – initial account value
- shift (int) – whether to shift prediction by one day
- benchmark (str) – benchmark code, default is SH000905 (CSI 500)
- verbose (bool) – whether to print log
Strategy related arguments:
Parameters: - strategy (Strategy()) – strategy used in backtest
- topk (int (Default value: 50)) – top-N stocks to buy
- margin – buffer margin; in single score_mode, continue holding a stock if it is in nlargest(sell_limit). sell_limit should be no less than topk. If isinstance(margin, int), sell_limit = margin; otherwise sell_limit = pred_in_a_day.count() * margin.
- n_drop (int) – number of stocks to be replaced in each trading date
- risk_degree (float) – 0-1, 0.95 for example, use 95% money to trade
- str_type ('amount', 'weight' or 'dropout') – strategy type: TopkAmountStrategy, TopkWeightStrategy or TopkDropoutStrategy
Exchange related arguments:
Parameters: - exchange (Exchange()) – pass the exchange for speeding up.
- subscribe_fields (list) – subscribe fields
- open_cost (float) – open transaction cost. The default value is 0.002(0.2%).
- close_cost (float) – close transaction cost. The default value is 0.002(0.2%).
- min_cost (float) – min transaction cost
- trade_unit (int) – 100 for China A
- deal_price (str) – dealing price type: ‘close’, ‘open’, ‘vwap’
- limit_threshold (float) – limit move 0.1 (10%) for example, long and short with same limit
- extract_codes (bool) –
whether to pass the codes extracted from pred to the exchange.
Note
This will be faster with offline qlib.
-
qlib.contrib.evaluate.
long_short_backtest
(pred, topk=50, deal_price=None, shift=1, open_cost=0, close_cost=0, trade_unit=None, limit_threshold=None, min_cost=5, subscribe_fields=[], extract_codes=False)¶ A backtest for long-short strategy
Parameters: - pred – The trading signal produced on day T
- topk – The short topk securities and long topk securities
- deal_price – The price to deal the trading
- shift – Whether to shift prediction by one day. The trading day will be T+1 if shift==1.
- open_cost – open transaction cost
- close_cost – close transaction cost
- trade_unit – 100 for China A
- limit_threshold – limit move 0.1 (10%) for example, long and short with same limit
- min_cost – min transaction cost
- subscribe_fields – subscribe fields
- extract_codes (bool) – whether to pass the codes extracted from pred to the exchange. NOTE: This will be faster with offline qlib.
Returns: The result of the backtest, represented by a dict: {"long": long_returns(excess), "short": short_returns(excess), "long_short": long_short_returns}
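The structure of the returned dict can be illustrated for a single day. This sketch rests on assumptions: equal weights, no costs, a short leg that profits when the shorted stocks fall, and no excess-return adjustment. `long_short_returns_sketch` is hypothetical and expects topk to be at least 1.

```python
def long_short_returns_sketch(scores, next_returns, topk):
    """One-day long, short and long-short returns.

    scores and next_returns map stock_id -> value for the same stocks.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    long_leg = ranked[:topk]    # highest scores are bought
    short_leg = ranked[-topk:]  # lowest scores are sold short
    long_ret = sum(next_returns[s] for s in long_leg) / topk
    short_ret = -sum(next_returns[s] for s in short_leg) / topk
    return {"long": long_ret, "short": short_ret, "long_short": long_ret + short_ret}


day = long_short_returns_sketch(
    scores={"A": 3, "B": 2, "C": 1, "D": 0},
    next_returns={"A": 0.02, "B": 0.01, "C": -0.01, "D": -0.03},
    topk=1,
)
```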
Report¶
-
qlib.contrib.report.analysis_position.report.
report_graph
(report_df: pandas.core.frame.DataFrame, show_notebook: bool = True) → [<class 'list'>, <class 'tuple'>]¶ display backtest report
Example:
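    from qlib.contrib.evaluate import backtest
    from qlib.contrib.strategy import TopkDropoutStrategy
    from qlib.contrib.report import analysis_position

    # pred_df: prediction DataFrame prepared beforehand
    # backtest parameters
    bparas = {}
    bparas['limit_threshold'] = 0.095
    bparas['account'] = 1000000000

    sparas = {}
    sparas['topk'] = 50
    sparas['n_drop'] = 230
    strategy = TopkDropoutStrategy(**sparas)

    # pass the strategy by keyword: the second positional argument of backtest is account
    report_normal_df, _ = backtest(pred_df, strategy=strategy, **bparas)
    analysis_position.report_graph(report_normal_df)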
Parameters: - report_df –
df.index.name must be date, df.columns must contain return, turnover, cost, bench
              return      cost     bench  turnover
date
2017-01-04  0.003421  0.000864  0.011693  0.576325
2017-01-05  0.000508  0.000447  0.000721  0.227882
2017-01-06 -0.003321  0.000212 -0.004322  0.102765
2017-01-09  0.006753  0.000212  0.006874  0.105864
2017-01-10 -0.000416  0.000440 -0.003350  0.208396
- show_notebook – whether to display graphics in notebook, the default is True
Returns: if show_notebook is True, display in notebook; else return plotly.graph_objs.Figure list
-
qlib.contrib.report.analysis_position.score_ic.
score_ic_graph
(pred_label: pandas.core.frame.DataFrame, show_notebook: bool = True) → [<class 'list'>, <class 'tuple'>]¶ score IC
Example:
    import pandas as pd
    from qlib.data import D
    from qlib.contrib.report import analysis_position

    # pred_df: prediction DataFrame prepared beforehand
    pred_df_dates = pred_df.index.get_level_values(level='datetime')
    features_df = D.features(D.instruments('csi500'), ['Ref($close, -2)/Ref($close, -1)-1'], pred_df_dates.min(), pred_df_dates.max())
    features_df.columns = ['label']
    pred_label = pd.concat([features_df, pred_df], axis=1, sort=True).reindex(features_df.index)
    analysis_position.score_ic_graph(pred_label)
Parameters: - pred_label –
index is pd.MultiIndex, index names are [instrument, datetime]; column names are [score, label]
instrument  datetime        score      label
SH600004    2017-12-11  -0.013502  -0.013502
            2017-12-12  -0.072367  -0.072367
            2017-12-13  -0.068605  -0.068605
            2017-12-14   0.012440   0.012440
            2017-12-15  -0.102778  -0.102778
- show_notebook – whether to display graphics in notebook, the default is True
Returns: if show_notebook is True, display in notebook; else return plotly.graph_objs.Figure list
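A score IC is commonly understood as the per-day cross-sectional correlation between score and label. The sketch below follows that reading; it is an assumption about what score_ic_graph plots, not its implementation. The helper names and the toy rows (including the instrument codes other than SH600004) are hypothetical, and plain tuples replace the MultiIndex DataFrame.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def daily_ic_sketch(rows):
    """rows: iterable of (instrument, datetime, score, label) tuples."""
    by_date = {}
    for _, date, score, label in rows:
        scores, labels = by_date.setdefault(date, ([], []))
        scores.append(score)
        labels.append(label)
    return {date: pearson(s, l) for date, (s, l) in sorted(by_date.items())}


ic = daily_ic_sketch([
    ("SH600004", "2017-12-11", 0.1, 0.01),
    ("SH600005", "2017-12-11", 0.2, 0.02),
    ("SH600006", "2017-12-11", 0.3, 0.03),
    ("SH600004", "2017-12-12", 0.1, 0.03),
    ("SH600005", "2017-12-12", 0.2, 0.02),
    ("SH600006", "2017-12-12", 0.3, 0.01),
])
```

On the first toy day the scores order the stocks exactly like the labels, so the IC is close to 1; on the second day the order is reversed and the IC is close to -1.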
-
qlib.contrib.report.analysis_position.cumulative_return.
cumulative_return_graph
(position: dict, report_normal: pandas.core.frame.DataFrame, label_data: pandas.core.frame.DataFrame, show_notebook=True, start_date=None, end_date=None) → Iterable[plotly.graph_objs._figure.Figure]¶ Backtest buy, sell, and holding cumulative return graph
Example:
    from qlib.data import D
    from qlib.contrib.evaluate import risk_analysis, backtest, long_short_backtest
    from qlib.contrib.strategy import TopkDropoutStrategy
    from qlib.contrib.report import analysis_position

    # pred_df: prediction DataFrame prepared beforehand
    # backtest parameters
    bparas = {}
    bparas['limit_threshold'] = 0.095
    bparas['account'] = 1000000000

    sparas = {}
    sparas['topk'] = 50
    sparas['n_drop'] = 5
    strategy = TopkDropoutStrategy(**sparas)

    # pass the strategy by keyword: the second positional argument of backtest is account
    report_normal_df, positions = backtest(pred_df, strategy=strategy, **bparas)

    pred_df_dates = pred_df.index.get_level_values(level='datetime')
    features_df = D.features(D.instruments('csi500'), ['Ref($close, -1)/$close - 1'], pred_df_dates.min(), pred_df_dates.max())
    features_df.columns = ['label']

    analysis_position.cumulative_return_graph(positions, report_normal_df, features_df)
- Graph desc:
- Axis X: Trading day
- Axis Y:
- Above axis Y: (((Ref($close, -1)/$close - 1) * weight).sum() / weight.sum()).cumsum()
- Below axis Y: Daily weight sum
- In the sell graph, y < 0 stands for profit; in other cases, y > 0 stands for profit.
- In the buy_minus_sell graph, the y value of the weight graph at the bottom is buy_weight + sell_weight.
- In each graph, the red line in the histogram on the right represents the average.
Parameters: - position – position data
- report_normal –
              return      cost     bench  turnover
date
2017-01-04  0.003421  0.000864  0.011693  0.576325
2017-01-05  0.000508  0.000447  0.000721  0.227882
2017-01-06 -0.003321  0.000212 -0.004322  0.102765
2017-01-09  0.006753  0.000212  0.006874  0.105864
2017-01-10 -0.000416  0.000440 -0.003350  0.208396
- label_data – D.features result; index is pd.MultiIndex, index names are [instrument, datetime]; column names are [label].
The label at T is the change from T to T+1; it is recommended to use close, for example: D.features(D.instruments('csi500'), ['Ref($close, -1)/$close-1'])
                            label
instrument  datetime
SH600004    2017-12-11  -0.013502
            2017-12-12  -0.072367
            2017-12-13  -0.068605
            2017-12-14   0.012440
            2017-12-15  -0.102778
Parameters: - show_notebook – True or False. If True, show graph in notebook, else return figures
- start_date – start date
- end_date – end date
Returns:
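The "Above axis Y" formula of cumulative_return_graph, ((label * weight).sum() / weight.sum()).cumsum(), can be traced with plain Python. `weighted_cumulative_return_sketch` is a hypothetical helper; the input layout (one list of (label, weight) pairs per day) is an assumption made for the example, and days with a zero weight sum are not handled.

```python
def weighted_cumulative_return_sketch(days):
    """days: per-day lists of (label, weight) pairs, in date order.

    Computes (label * weight).sum() / weight.sum() for each day and
    accumulates the daily values with a running sum.
    """
    curve, total = [], 0.0
    for positions in days:
        weight_sum = sum(w for _, w in positions)
        total += sum(label * w for label, w in positions) / weight_sum
        curve.append(total)
    return curve


curve = weighted_cumulative_return_sketch([
    [(0.02, 0.5), (0.00, 0.5)],     # day 1: weighted return 0.01
    [(-0.01, 0.25), (0.03, 0.75)],  # day 2: weighted return 0.02
])
```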
-
qlib.contrib.report.analysis_position.risk_analysis.
risk_analysis_graph
(analysis_df: pandas.core.frame.DataFrame = None, report_normal_df: pandas.core.frame.DataFrame = None, report_long_short_df: pandas.core.frame.DataFrame = None, show_notebook: bool = True) → Iterable[plotly.graph_objs._figure.Figure]¶ Generate analysis graph and monthly analysis
Example:
    import pandas as pd
    from qlib.contrib.evaluate import risk_analysis, backtest, long_short_backtest
    from qlib.contrib.strategy import TopkDropoutStrategy
    from qlib.contrib.report import analysis_position

    # pred_df: prediction DataFrame prepared beforehand
    # backtest parameters
    bparas = {}
    bparas['limit_threshold'] = 0.095
    bparas['account'] = 1000000000

    sparas = {}
    sparas['topk'] = 50
    sparas['n_drop'] = 230
    strategy = TopkDropoutStrategy(**sparas)

    # pass the strategy by keyword: the second positional argument of backtest is account
    report_normal_df, positions = backtest(pred_df, strategy=strategy, **bparas)
    # long_short_map = long_short_backtest(pred_df)
    # report_long_short_df = pd.DataFrame(long_short_map)

    analysis = dict()
    # analysis['pred_long'] = risk_analysis(report_long_short_df['long'])
    # analysis['pred_short'] = risk_analysis(report_long_short_df['short'])
    # analysis['pred_long_short'] = risk_analysis(report_long_short_df['long_short'])
    analysis['excess_return_without_cost'] = risk_analysis(report_normal_df['return'] - report_normal_df['bench'])
    analysis['excess_return_with_cost'] = risk_analysis(report_normal_df['return'] - report_normal_df['bench'] - report_normal_df['cost'])
    analysis_df = pd.concat(analysis)

    analysis_position.risk_analysis_graph(analysis_df, report_normal_df)
Parameters: - analysis_df –
analysis data; index is pd.MultiIndex; column names are [risk].
                                                    risk
excess_return_without_cost  mean                0.000692
                            std                 0.005374
                            annualized_return   0.174495
                            information_ratio   2.045576
                            max_drawdown       -0.079103
excess_return_with_cost     mean                0.000499
                            std                 0.005372
                            annualized_return   0.125625
                            information_ratio   1.473152
                            max_drawdown       -0.088263
- report_normal_df –
df.index.name must be date, df.columns must contain return, turnover, cost, bench
              return      cost     bench  turnover
date
2017-01-04  0.003421  0.000864  0.011693  0.576325
2017-01-05  0.000508  0.000447  0.000721  0.227882
2017-01-06 -0.003321  0.000212 -0.004322  0.102765
2017-01-09  0.006753  0.000212  0.006874  0.105864
2017-01-10 -0.000416  0.000440 -0.003350  0.208396
- report_long_short_df –
df.index.name must be date, df.columns contain long, short, long_short
                long     short long_short
date
2017-01-04 -0.001360  0.001394   0.000034
2017-01-05  0.002456  0.000058   0.002514
2017-01-06  0.000120  0.002739   0.002859
2017-01-09  0.001436  0.001838   0.003273
2017-01-10  0.000824 -0.001944  -0.001120
- show_notebook – whether to display graphics in a notebook, default True. If True, show the graph in the notebook; if False, return the graph figure.
Returns: - analysis_df –
-
qlib.contrib.report.analysis_position.rank_label.
rank_label_graph
(position: dict, label_data: pandas.core.frame.DataFrame, start_date=None, end_date=None, show_notebook=True) → Iterable[plotly.graph_objs._figure.Figure]¶ Ranking percentage of the stocks bought, sold, and held on the trading day. Average rank-ratio (similar to sell_df['label'].rank(ascending=False) / len(sell_df)) of daily trading.
Example:
    from qlib.data import D
    from qlib.contrib.evaluate import backtest
    from qlib.contrib.strategy import TopkDropoutStrategy
    from qlib.contrib.report import analysis_position

    # pred_df: prediction DataFrame prepared beforehand
    # backtest parameters
    bparas = {}
    bparas['limit_threshold'] = 0.095
    bparas['account'] = 1000000000

    sparas = {}
    sparas['topk'] = 50
    sparas['n_drop'] = 230
    strategy = TopkDropoutStrategy(**sparas)

    # pass the strategy by keyword: the second positional argument of backtest is account
    _, positions = backtest(pred_df, strategy=strategy, **bparas)

    pred_df_dates = pred_df.index.get_level_values(level='datetime')
    features_df = D.features(D.instruments('csi500'), ['Ref($close, -1)/$close-1'], pred_df_dates.min(), pred_df_dates.max())
    features_df.columns = ['label']

    analysis_position.rank_label_graph(positions, features_df, pred_df_dates.min(), pred_df_dates.max())
Parameters: - position – position data; qlib.contrib.backtest.backtest.backtest result
- label_data – D.features result; index is pd.MultiIndex, index names are [instrument, datetime]; column names are [label].
The label at T is the change from T to T+1; it is recommended to use close, for example: D.features(D.instruments('csi500'), ['Ref($close, -1)/$close-1'])
                            label
instrument  datetime
SH600004    2017-12-11  -0.013502
            2017-12-12  -0.072367
            2017-12-13  -0.068605
            2017-12-14   0.012440
            2017-12-15  -0.102778
Parameters: - start_date – start date
- end_date – end date
- show_notebook – True or False. If True, show graph in notebook, else return figures
Returns:
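The rank-ratio expression quoted for rank_label_graph, label.rank(ascending=False) / len(label), can be reproduced without pandas for labels that contain no ties. `rank_ratio_sketch` is a hypothetical helper; pandas averages the ranks of tied values, which this sketch does not do.

```python
def rank_ratio_sketch(labels):
    """Descending rank of each label divided by the number of labels."""
    order = sorted(labels, reverse=True)
    # order.index(x) + 1 is the 1-based descending rank of x (no ties assumed).
    return [(order.index(x) + 1) / len(labels) for x in labels]


ratios = rank_ratio_sketch([0.03, -0.01, 0.02, 0.00])
```

A small ratio means the stock's label ranked near the top on that day, so a low average ratio for bought stocks indicates that purchases tended to have high subsequent returns.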
-
qlib.contrib.report.analysis_model.analysis_model_performance.
ic_figure
(ic_df: pandas.core.frame.DataFrame, show_nature_day=True, **kwargs) → plotly.graph_objs._figure.Figure¶ IC figure
Parameters: - ic_df – ic DataFrame
- show_nature_day – whether to show non-trading days on the x-axis
Returns: plotly.graph_objs.Figure
-
qlib.contrib.report.analysis_model.analysis_model_performance.
model_performance_graph
(pred_label: pandas.core.frame.DataFrame, lag: int = 1, N: int = 5, reverse=False, rank=False, graph_names: list = ['group_return', 'pred_ic', 'pred_autocorr'], show_notebook: bool = True, show_nature_day=True) → [<class 'list'>, <class 'tuple'>]¶ Model performance
Parameters: - pred_label –
index is pd.MultiIndex, index names are [instrument, datetime]; column names are [score, label]
instrument  datetime        score      label
SH600004    2017-12-11  -0.013502  -0.013502
            2017-12-12  -0.072367  -0.072367
            2017-12-13  -0.068605  -0.068605
            2017-12-14   0.012440   0.012440
            2017-12-15  -0.102778  -0.102778
- lag – pred.groupby(level=’instrument’)[‘score’].shift(lag). It will be only used in the auto-correlation computing.
- N – group number, default 5
- reverse – if True, pred[‘score’] *= -1
- rank – if True, calculate rank ic
- graph_names – graph names; default ['group_return', 'pred_ic', 'pred_autocorr']
- show_notebook – whether to display graphics in notebook, the default is True
- show_nature_day – whether to show non-trading days on the x-axis
Returns: if show_notebook is True, display in notebook; else return plotly.graph_objs.Figure list
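The effect of the lag parameter, pred.groupby(level='instrument')['score'].shift(lag), can be illustrated with plain lists. `shift_scores_sketch` is a hypothetical helper that assumes lag >= 0 and one date-ordered score list per instrument; missing values are shown as None where pandas would use NaN.

```python
def shift_scores_sketch(scores_by_instrument, lag=1):
    """Shift each instrument's date-ordered scores forward by lag steps."""
    return {
        inst: [None] * min(lag, len(scores)) + scores[:max(len(scores) - lag, 0)]
        for inst, scores in scores_by_instrument.items()
    }


shifted = shift_scores_sketch({"SH600004": [0.1, 0.2, 0.3]}, lag=1)
```

Pairing each score with its shifted value per instrument gives the inputs for an auto-correlation of the predictions, which is where the parameter description says lag is used.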