deeptables.models package

Submodules

deeptables.models.config module

class deeptables.models.config.ModelConfig[source]

Bases: deeptables.models.config.ModelConfig

first_metric_name

deeptables.models.deepmodel module

class deeptables.models.deepmodel.DeepModel(task, num_classes, config, categorical_columns, continuous_columns, model_file=None)[source]

Bases: object

Class for neural network models

apply(X, output_layers=[], concat_outputs=False, batch_size=128, verbose=0, transformer=None)[source]
evaluate(X_test, y_test, batch_size=256, verbose=0)[source]
fit(X=None, y=None, batch_size=128, epochs=1, verbose=1, callbacks=None, validation_split=0.2, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None, validation_freq=1, max_queue_size=10, workers=1, use_multiprocessing=False)[source]
predict(X, batch_size=128, verbose=0)[source]
release()[source]
save(filepath)[source]
class deeptables.models.deepmodel.ModelDesc[source]

Bases: object

add_input(name, num_columns)[source]
add_net(name, input_shape, output_shape)[source]
nets_desc()[source]
optimizer_info()[source]
set_concat_embed_dense(output_shape)[source]
set_dense(dense_dropout, use_batchnormalization)[source]
set_embeddings(input_dims, output_dims, embedding_dropout)[source]
set_output(activation, output_shape, use_bias)[source]

deeptables.models.deepnets module

deeptables.models.deepnets.afm_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

Attentional Factorization Machine (AFM), which learns the importance of each feature interaction from datasets via a neural attention network.

deeptables.models.deepnets.autoint_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks.

deeptables.models.deepnets.cin_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

Compressed Interaction Network (CIN), with the following considerations: (1) interactions are applied at the vector-wise level, not at the bit-wise level; (2) high-order feature interactions are measured explicitly; (3) the complexity of the network does not grow exponentially with the degree of interactions.

deeptables.models.deepnets.cross_dnn_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

Cross nets -> DNN -> logit_out

deeptables.models.deepnets.cross_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

The Cross network is composed of cross layers to apply explicit feature crossing in an efficient way.

deeptables.models.deepnets.custom_dnn_D_A_D_B(x, params, cellname='dnn_D_A_D_B')[source]
deeptables.models.deepnets.dcn_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

Concatenate the outputs from the Cross nets and DNN nets and feed them into a standard logits layer.

deeptables.models.deepnets.deserialize(name, custom_objects=None)[source]
deeptables.models.deepnets.dnn(x, params, cellname='dnn')[source]
deeptables.models.deepnets.dnn_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

MLP (fully-connected feed-forward neural nets)

deeptables.models.deepnets.fg_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]
Feature Generation leverages the strength of CNN to generate local patterns and recombine them to generate new features.

References

[1] Liu B, Tang R, Chen Y, et al. Feature generation by convolutional neural network for click-through rate prediction[C]//The World Wide Web Conference. 2019: 1119-1129.

deeptables.models.deepnets.fgcnn_afm_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

FGCNN with AFM as deep classifier

deeptables.models.deepnets.fgcnn_cin_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

FGCNN with CIN as deep classifier

deeptables.models.deepnets.fgcnn_dnn_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

FGCNN with DNN as deep classifier

deeptables.models.deepnets.fgcnn_fm_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

FGCNN with FM as deep classifier

deeptables.models.deepnets.fgcnn_ipnn_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

FGCNN with IPNN as deep classifier

deeptables.models.deepnets.fibi_dnn_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

FiBiNet with DNN as deep classifier

deeptables.models.deepnets.fibi_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

The SENET layer converts an embedding layer into SENET-like embedding features, which helps boost feature discriminability. The following Bilinear-Interaction layer models second-order feature interactions on the original embedding and the SENET-like embedding respectively. Subsequently, these cross features are concatenated by a combination layer, which merges the outputs of the Bilinear-Interaction layers.

deeptables.models.deepnets.fm_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

FM models pairwise (order-2) feature interactions

deeptables.models.deepnets.get(identifier)[source]

Returns a nets function.

Parameters: identifier – Function or string

Returns:
  • Function corresponding to the input string, or the input function itself.
Return type: nets function denoted by the input

For example:

>>> nets.get('dnn_nets')
<function dnn_nets at 0x1222a3d90>
deeptables.models.deepnets.get_nets(nets)[source]
deeptables.models.deepnets.ipnn_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

Inner Product-based Neural Network (InnerProduct + DNN)

deeptables.models.deepnets.linear(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

Linear (order-1) interactions

deeptables.models.deepnets.opnn_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

Outer Product-based Neural Network (OuterProduct + DNN)

deeptables.models.deepnets.pnn_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc)[source]

Concatenation of inner product and outer product + DNN

deeptables.models.deepnets.register_nets(nets_fn)[source]
deeptables.models.deepnets.serialize(nets_fn)[source]
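
A minimal sketch (not from the original docs) of registering a custom nets function, assuming register_nets() adds the callable to the same registry that get() resolves names against:

from tensorflow.keras import layers
from deeptables.models import deepnets

def my_nets(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc):
    # Same six-argument signature as the built-in nets functions above.
    out = layers.Dense(32, activation='relu')(concat_emb_dense)
    return out

deepnets.register_nets(my_nets)
fn = deepnets.get('my_nets')  # now resolvable by name, like 'dnn_nets'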

deeptables.models.deeptable module

Training and inference for tabular datasets using neural nets.

class deeptables.models.deeptable.DeepTable(config=None)[source]

Bases: object

DeepTables can be used to solve classification and regression problems on tabular datasets. It is easy to use and provides good performance out of the box; no dataset preprocessing is required.

Parameters: config (ModelConfig) –
name: str, (default='conf-1')
nets: list of str or callable object, (default=['dnn_nets'])
  • DeepFM -> ['linear', 'dnn_nets', 'fm_nets']
  • xDeepFM
  • DCN
  • PNN
  • WideDeep
  • AutoInt
  • AFM
  • FGCNN
  • FibiNet
  • 'dnn_nets'
  • 'linear'
  • 'cin_nets'
  • 'fm_nets'
  • 'afm_nets'
  • 'opnn_nets'
  • 'ipnn_nets'
  • 'pnn_nets'
  • 'cross_nets'
  • 'cross_dnn_nets'
  • 'dcn_nets'
  • 'autoint_nets'
  • 'fg_nets'
  • 'fgcnn_cin_nets'
  • 'fgcnn_fm_nets'
  • 'fgcnn_ipnn_nets'
  • 'fgcnn_dnn_nets'
  • 'fibi_nets'
  • 'fibi_dnn_nets'

>>> from deeptables.models import deepnets
>>> # preset nets
>>> conf = ModelConfig(nets=deepnets.DeepFM)
>>> # list of names of nets
>>> conf = ModelConfig(nets=['linear', 'dnn_nets', 'cin_nets', 'cross_nets'])
>>> # mixed preset nets and names
>>> conf = ModelConfig(nets=deepnets.WideDeep + ['cin_nets'])
>>> # mixed names and custom nets
>>> def custom_net(embeddings, flatten_emb_layer, dense_layer, concat_emb_dense, config, model_desc):
...     out = layers.Dense(10)(flatten_emb_layer)
...     return out
>>> conf = ModelConfig(nets=['linear', custom_net])

categorical_columns: list of strings, (default='auto')
  • 'auto'
    Get the columns of categorical type automatically. By default, columns of object, bool and category dtype are selected. If 'auto', the [auto_categorize] option no longer takes effect.
  • list of strings
    e.g. ['x1', 'x2', 'x3', ...]

exclude_columns: list of strings, (default=[])

pos_label: str or int, (default=None)
The label of the positive class, used only when task is binary.
metrics: list of string or callable object, (default=['accuracy'])
List of metrics to be evaluated by the model during training and testing. Typically you will use metrics=['accuracy'] or metrics=['AUC']. Every metric should be a built-in evaluation metric in tf.keras.metrics or a callable object like r2(y_true, y_pred). See also: https://tensorflow.google.cn/versions/r2.0/api_docs/python/tf/keras/metrics
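
For instance, a minimal sketch of a custom callable metric (rmse is a hypothetical name, not part of the library; it is written with TF ops so it can run during training):

import tensorflow as tf

def rmse(y_true, y_pred):
    # Follows the (y_true, y_pred) convention described above.
    return tf.sqrt(tf.reduce_mean(tf.square(y_true - y_pred)))

conf = ModelConfig(metrics=['AUC'])   # built-in metric by name
conf = ModelConfig(metrics=[rmse])    # custom callable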

auto_categorize: bool, (default=False)

cat_exponent: float, (default=0.5)

cat_remain_numeric: bool, (default=True)

auto_encode_label: bool, (default=True)

auto_imputation: bool, (default=True)

auto_discrete: bool, (default=False)

apply_gbm_features: bool, (default=False)

gbm_params: dict, (default={})

gbm_feature_type: str, (default='embedding')
  • embedding
  • dense

fixed_embedding_dim: bool, (default=True)

embeddings_output_dim: int, (default=4)

embeddings_initializer: str or object, (default='uniform')
Initializer for the embeddings matrix.
embeddings_regularizer: str or object, (default=None)
Regularizer function applied to the embeddings matrix.
dense_dropout: float, (default=0) between 0 and 1
Fraction of the dense input units to drop.
embedding_dropout: float, (default=0.3) between 0 and 1
Fraction of the embedding input units to drop.
stacking_op: str, (default='add')
  • add
  • concat

output_use_bias: bool, (default=True)

apply_class_weight: bool, (default=False)

optimizer: str or object, (default='auto')
  • auto
  • str
  • object

loss: str or object, (default='auto')

dnn_params: dict, (default={'dnn_units': ((128, 0, False), (64, 0, False)), 'dnn_activation': 'relu'})

autoint_params: dict, (default={'num_attention': 3, 'num_heads': 1, 'dropout_rate': 0, 'use_residual': True})

fgcnn_params: dict, (default={'fg_filters': (14, 16), 'fg_widths': (7, 7), 'fg_pool_widths': (2, 2), 'fg_new_feat_filters': (2, 2)})

fibinet_params: dict, (default={'senet_pooling_op': 'mean', 'senet_reduction_ratio': 3, 'bilinear_type': 'field_interaction'})

cross_params: dict, (default={'num_cross_layer': 4})

pnn_params: dict, (default={'outer_product_kernel_type': 'mat'})

afm_params: dict, (default={'attention_factor': 4, 'dropout_rate': 0})

cin_params: dict, (default={'cross_layer_size': (128, 128), 'activation': 'relu', 'use_residual': False, 'use_bias': False, 'direct': False, 'reduce_D': False})
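
For example, a hedged sketch of overriding one of these dicts; the dnn_units tuple layout appears to be (units, dropout_rate, use_batchnorm), inferred from the default above:

conf = ModelConfig(
    nets=['dnn_nets'],
    dnn_params={
        'dnn_units': ((256, 0.3, True), (128, 0.3, True)),
        'dnn_activation': 'relu',
    },
)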

home_dir: str, (default=None)
The home directory for saving model-related files. Each time fit(...) or fit_cross_validation(...) runs, a time-stamped subdirectory is created in this directory.

monitor_metric: str, (default=None)

earlystopping_patience: int, (default=1)

gpu_usage_strategy: str, (default='memory_growth')
  • memory_growth
  • None
distribute_strategy: tensorflow.python.distribute.distribute_lib.Strategy, (default=None)
task

Type of prediction problem. If config.task is None (the default), it will be inferred based on the values of y when calling fit(...) or fit_cross_validation(...).
  • 'binary' : binary classification task
  • 'multiclass' : multiclass classification task
  • 'regression' : regression task

Type:str
num_classes

The number of classes, used only when task is multiclass.

Type:int
pos_label

The label of positive class, used only when task is binary.

Type:str or int
output_path

Path to the directory used to save models. In addition, if a valid X_test is passed into fit_cross_validation(...), the prediction results of the test set will be saved in this path as well. The path is a time-stamped subdirectory created in the home directory. The home directory is specified through config.home_dir; if config.home_dir is None, output_path will be created in the working directory.

Type:str
preprocessor

The preprocessor performs dataset preprocessing, such as categorization, label encoding, imputation, discretization, etc., before data is fed into the neural nets.

Type:AbstractPreprocessor (default = DefaultPreprocessor)
nets

List of the network cells used to build the DeepModel

Type:list(str)
monitor

The metric used for monitoring model quality in early_stopping. If not specified, the first metric in [config.metrics] will be used (e.g. log_loss/auc_val/accuracy_val...).

Type:str
modelset

The models produced by fit(…) or fit_cross_validation(…)

Type:ModelSet
best_model

fit_cross_validation(...) produces a set of models, instead of the single model produced by fit(...). The best model is the one with the best performance on a specific metric; the first metric in [config.metrics] is used by default.

Type:Model
leaderboard

A table sorted by a specific metric, with meta information and scores. The first metric in [config.metrics] is used by default.

Type:pandas.DataFrame


Examples

>>> X_train = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
>>> X_eval = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv')
>>> y_train = X_train.pop('survived')
>>> y_eval = X_eval.pop('survived')
>>>
>>> config = ModelConfig(nets=deepnets.DeepFM, fixed_embedding_dim=True, embeddings_output_dim=4, auto_discrete=True)
>>> dt = DeepTable(config=config)
>>>
>>> model, history = dt.fit(X_train, y_train, epochs=100)
>>> preds = dt.predict(X_eval)

apply(X, output_layers, concat_outputs=False, batch_size=128, verbose=0, model_selector='current', auto_transform_data=True, transformer=None)[source]
best_model
concat_emb_dense(flatten_emb_layer, dense_layer)[source]
evaluate(X_test, y_test, batch_size=256, verbose=0, model_selector='current')[source]
fit(X=None, y=None, batch_size=128, epochs=1, verbose=1, callbacks=None, validation_split=0.2, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None, validation_freq=1, max_queue_size=10, workers=1, use_multiprocessing=False)[source]
fit_cross_validation(X, y, X_eval=None, X_test=None, num_folds=5, stratified=False, iterators=None, batch_size=None, epochs=1, verbose=1, callbacks=None, n_jobs=1, random_state=9527, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None, validation_freq=1, max_queue_size=10, workers=1, use_multiprocessing=False)[source]
get_class_weight(y)[source]
get_model(model_selector='current')[source]
leaderboard
static load(filepath)[source]
load_deepmodel(filepath)[source]
modelset
monitor
num_classes
pos_label
predict(X, encode_to_label=True, batch_size=128, verbose=0, model_selector='current', auto_transform_data=True)[source]
predict_proba(X, batch_size=128, verbose=0, model_selector='current', auto_transform_data=True)[source]
predict_proba_all(X, batch_size=128, verbose=0, auto_transform_data=True)[source]
proba2predict(proba, encode_to_label=True)[source]
restore_modelset(filepath)[source]
save(filepath)[source]
task
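
A minimal sketch of cross-validated training (assumes X_train, y_train, X_eval and y_eval as in the Examples block above):

dt = DeepTable(config=ModelConfig(nets=deepnets.DeepFM))
dt.fit_cross_validation(X_train, y_train, num_folds=5, stratified=True, epochs=10)

# fit_cross_validation(...) produces one model per fold; the leaderboard
# and best_model attributes summarize them.
print(dt.leaderboard)
result = dt.evaluate(X_eval, y_eval)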
deeptables.models.deeptable.infer_task_type(y)[source]
deeptables.models.deeptable.probe_evaluate(dt, X, y, X_test, y_test, layers, score_fn={})[source]

deeptables.models.evaluation module

deeptables.models.evaluation.calc_score(y_true, y_proba, y_preds, metrics, task, pos_label=1)[source]

deeptables.models.layers module

class deeptables.models.layers.AFM(params, **kwargs)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

Attentional Factorization Machine (AFM), which learns the importance of each feature interaction from datasets via a neural attention network.

Parameters:
  • hidden_factor – int, (default=16)
  • activation_function – str, (default='relu')
  • kernel_regularizer – str or object, (default=None)
  • dropout_rate – float, (default=0)
Call arguments:
x: A list of 3D tensors.

Input shape:
  • A list of 3D tensors with shape: (batch_size, 1, embedding_size)

Output shape:
  • 2D tensor with shape: (batch_size, 1)

References

[1] Xiao J, Ye H, He X, et al. Attentional factorization machines: Learning the weight of feature interactions via attention networks[J]. arXiv preprint arXiv:1708.04617, 2017.
[2] https://github.com/hexiangnan/attentional_factorization_machine

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters:input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
call(x, **kwargs)[source]

This is where the layer’s logic lives.

Parameters:
  • inputs – Input tensor, or list/tuple of input tensors.
  • **kwargs – Additional keyword arguments.
Returns:

A tensor or list/tuple of tensors.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:Python dictionary.
class deeptables.models.layers.BilinearInteraction(bilinear_type='field_interaction', **kwargs)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

The Bilinear-Interaction layer combines the inner product and Hadamard product to learn the feature interactions.

Parameters: bilinear_type – str, (default='field_interaction') the type of bilinear functions
  • field_interaction
  • field_all
  • field_each

Call arguments:
x: A 3D tensor.

Input shape:
  • 3D tensor with shape: (batch_size, field_size, embedding_size)

Output shape:
  • 3D tensor with shape: (batch_size, *, embedding_size)

References

[1] Huang T, Zhang Z, Zhang J. FiBiNET: combining feature importance and bilinear feature

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters:input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
call(x, **kwargs)[source]

This is where the layer’s logic lives.

Parameters:
  • inputs – Input tensor, or list/tuple of input tensors.
  • **kwargs – Additional keyword arguments.
Returns:

A tensor or list/tuple of tensors.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:Python dictionary.
class deeptables.models.layers.BinaryFocalLoss(gamma=2.0, alpha=0.25, reduction='auto', name='focal_loss')[source]

Bases: tensorflow.python.keras.losses.Loss

Binary form of focal loss.
FL(p_t) = -alpha * (1 - p_t)**gamma * log(p_t)

where p = sigmoid(x), and p_t = p or 1 - p depending on whether the label is 1 or 0, respectively.

Parameters:
  • alpha – weighing factor, the same as in balanced cross entropy (default=0.25, as mentioned in the paper)
  • gamma – focusing parameter for the modulating factor (default=2.0, as mentioned in the paper)

References

https://arxiv.org/pdf/1708.02002.pdf https://github.com/umbertogriffo/focal-loss-keras

Usage:
model.compile(loss=[BinaryFocalLoss(alpha=.25, gamma=2)], metrics=["accuracy"], optimizer='adam')
call(y_true, y_pred)[source]

Invokes the Loss instance.

Parameters:
  • y_true – Ground truth values, with the same shape as ‘y_pred’.
  • y_pred – The predicted values.
get_config()[source]
class deeptables.models.layers.CIN(params, **kwargs)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

Compressed Interaction Network (CIN), with the following considerations: (1) interactions are applied at the vector-wise level, not at the bit-wise level; (2) high-order feature interactions are measured explicitly; (3) the complexity of the network does not grow exponentially with the degree of interactions.

Parameters:
  • cross_layer_size – tuple of int, (default = (128, 128,))
  • activation – str, (default='relu')
  • use_residual – bool, (default=False)
  • use_bias – bool, (default=False)
  • direct – bool, (default=False)
  • reduce_D – bool, (default=False)
Call arguments:
x: A 3D tensor.

Input shape:
  • 3D tensor with shape: (batch_size, num_fields, embedding_size)

Output shape:
  • 2D tensor with shape: (batch_size, *)

References

[1] Lian J, Zhou X, Zhang F, et al. xDeepFM: Combining explicit and implicit feature interactions for recommender systems[C]//Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018: 1754-1763.
[2] https://github.com/Leavingseason/xDeepFM

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters:input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
call(x, **kwargs)[source]

This is where the layer’s logic lives.

Parameters:
  • inputs – Input tensor, or list/tuple of input tensors.
  • **kwargs – Additional keyword arguments.
Returns:

A tensor or list/tuple of tensors.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:Python dictionary.
class deeptables.models.layers.CategoricalFocalLoss(gamma=2.0, alpha=0.25, reduction='auto', name='focal_loss')[source]

Bases: tensorflow.python.keras.losses.Loss

Softmax version of focal loss.

FL = sum_{c=1}^{m} -alpha * (1 - p_{o,c})**gamma * y_{o,c} * log(p_{o,c})

where m = number of classes, c = class index, and o = observation.

Parameters:
  • alpha – weighing factor, the same as in balanced cross entropy (default=0.25, as mentioned in the paper)
  • gamma – focusing parameter for the modulating factor (default=2.0, as mentioned in the paper)

References

Official paper: https://arxiv.org/pdf/1708.02002.pdf https://github.com/umbertogriffo/focal-loss-keras

Usage:
model.compile(loss=[CategoricalFocalLoss(alpha=.25, gamma=2)], metrics=["accuracy"], optimizer='adam')
call(y_true, y_pred)[source]

Invokes the Loss instance.

Parameters:
  • y_true – Ground truth values, with the same shape as ‘y_pred’.
  • y_pred – The predicted values.
get_config()[source]
class deeptables.models.layers.Cross(params, **kwargs)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

The cross network is composed of cross layers to apply explicit feature crossing in an efficient way.

Parameters: num_cross_layer – int, (default=2) the number of cross layers

Call arguments:
x: A 2D tensor.

Input shape:
  • 2D tensor with shape: (batch_size, field_size)

Output shape:
  • 2D tensor with shape: (batch_size, field_size)

References

[1] Wang R, Fu B, Fu G, et al. Deep & cross network for ad click predictions[M]//Proceedings of the ADKDD'17. 2017: 1-7.
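
For intuition, a hedged sketch of a single cross layer as described in the reference above (not the library's exact implementation): x_{l+1} = x_0 * (x_l . w_l) + b_l + x_l, which crosses features explicitly with only O(d) parameters per layer.

import tensorflow as tf

def cross_layer(x0, xl, w, b):
    # x0, xl: (batch_size, field_size); w, b: (field_size,)
    xl_w = tf.reduce_sum(xl * w, axis=1, keepdims=True)  # (batch_size, 1)
    return x0 * xl_w + b + xl                            # (batch_size, field_size)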

build(input)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters:input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
call(x, **kwargs)[source]

This is where the layer’s logic lives.

Parameters:
  • inputs – Input tensor, or list/tuple of input tensors.
  • **kwargs – Additional keyword arguments.
Returns:

A tensor or list/tuple of tensors.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:Python dictionary.
class deeptables.models.layers.FGCNN(filters, kernel_height, new_filters, pool_height, activation='tanh', **kwargs)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

Feature Generation nets leverages the strength of CNN to generate local patterns and recombine them to generate new features.

Arguments:
filters: int
the number of filters of the convolutional layer
kernel_height: int
the height of the kernel_size of the convolutional layer
new_filters: int
the number of new feature maps in the recombination layer
pool_height: int
the height of the pool_size of the pooling layer

activation: str, (default='tanh')

Call arguments:
x: A 4D tensor.

Input shape:
  • 4D tensor with shape: (batch_size, field_size, embedding_size, 1)

Output shape:
  • pooling_output – 4D tensor
  • new_features – 3D tensor with shape: (batch_size, field_size*new_filters, embedding_size)

References

[1] Liu B, Tang R, Chen Y, et al. Feature generation by convolutional neural network for click-through rate prediction[C]//The World Wide Web Conference. 2019: 1119-1129.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters:input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
call(x, **kwargs)[source]

This is where the layer’s logic lives.

Parameters:
  • inputs – Input tensor, or list/tuple of input tensors.
  • **kwargs – Additional keyword arguments.
Returns:

A tensor or list/tuple of tensors.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:Python dictionary.
class deeptables.models.layers.FM(**kwargs)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

Factorization Machine to model order-2 feature interactions.

Call arguments:
x: A 3D tensor.

Input shape:
  • 3D tensor with shape: (batch_size, field_size, embedding_size)

Output shape:
  • 2D tensor with shape: (batch_size, 1)

References

[1] Rendle S. Factorization machines[C]//2010 IEEE International Conference on Data Mining. IEEE, 2010: 995-1000.
[2] Guo H, Tang R, Ye Y, et al. DeepFM: An end-to-end wide & deep learning framework for CTR prediction[J]. arXiv preprint arXiv:1804.04950, 2018.
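
For intuition, a hedged sketch of the order-2 identity FM layers are typically built on (not necessarily this layer's exact code): the sum over all pairs <v_i, v_j> equals 0.5 * (square-of-sum minus sum-of-squares) along the field axis, matching the documented (batch_size, 1) output.

import tensorflow as tf

def fm_pairwise(x):
    # x: (batch_size, field_size, embedding_size)
    square_of_sum = tf.square(tf.reduce_sum(x, axis=1))   # (batch, emb)
    sum_of_square = tf.reduce_sum(tf.square(x), axis=1)   # (batch, emb)
    return 0.5 * tf.reduce_sum(square_of_sum - sum_of_square,
                               axis=1, keepdims=True)     # (batch, 1)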
call(x, **kwargs)[source]

This is where the layer’s logic lives.

Parameters:
  • inputs – Input tensor, or list/tuple of input tensors.
  • **kwargs – Additional keyword arguments.
Returns:

A tensor or list/tuple of tensors.

class deeptables.models.layers.GHMCLoss(bins=10, momentum=0.75)[source]

Bases: object

calc(input, target, mask=None, is_mask=False)[source]

Parameters:
  • input – [batch_num, class_num], the direct prediction of the classification fc layer.
  • target – [batch_num, class_num], binary target (0 or 1) for each sample and each class. The value is -1 when the sample is ignored.
  • mask – [batch_num, class_num]

get_acc_sum(bins)[source]
get_edges(bins)[source]
class deeptables.models.layers.InnerProduct(**kwargs)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

Inner-Product layer

Call arguments:
x: A list of 3D tensors.

Input shape:
  • A list of 3D tensors with shape: (batch_size, 1, embedding_size)

Output shape:
  • 2D tensor with shape: (batch_size, num_fields*(num_fields-1)/2)

References

[1] Qu Y, Cai H, Ren K, et al. Product-based neural networks for user response prediction[C]//2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, 2016: 1149-1154.
[2] Qu Y, Fang B, Zhang W, et al. Product-based neural networks for user response prediction over multi-field categorical datasets[J]. ACM Transactions on Information Systems (TOIS), 2018, 37(1): 1-35.
[3] https://github.com/Atomu2014/product-nets
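
A hedged sketch of the pairwise inner-product interaction described above (not the library's exact code): one dot product per field pair, giving the documented output shape (batch_size, num_fields*(num_fields-1)/2).

import tensorflow as tf
from itertools import combinations

def inner_products(fields):
    # fields: a list of (batch_size, 1, embedding_size) tensors
    pairs = [tf.reduce_sum(a * b, axis=-1)        # (batch_size, 1)
             for a, b in combinations(fields, 2)]
    return tf.concat(pairs, axis=-1)              # (batch_size, n*(n-1)/2)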

call(x, **kwargs)[source]

This is where the layer’s logic lives.

Parameters:
  • inputs – Input tensor, or list/tuple of input tensors.
  • **kwargs – Additional keyword arguments.
Returns:

A tensor or list/tuple of tensors.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:Python dictionary.
class deeptables.models.layers.MultiColumnEmbedding(input_dims, output_dims, dropout_rate=0.0, embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, **kwargs)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

This class is adapted from TensorFlow's implementation of Embedding. The code is modified to make it suitable for multiple variables from different columns in one input.
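
A hypothetical construction (the values are illustrative only, assuming input_dims and output_dims are per-column lists): three categorical columns with vocabulary sizes 10, 50 and 8, each embedded into 4 dimensions from a single integer input of shape (batch_size, 3).

emb = MultiColumnEmbedding(input_dims=[10, 50, 8],
                           output_dims=[4, 4, 4],
                           dropout_rate=0.3)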

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters:input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
call(inputs)[source]

This is where the layer’s logic lives.

Parameters:
  • inputs – Input tensor, or list/tuple of input tensors.
  • **kwargs – Additional keyword arguments.
Returns:

A tensor or list/tuple of tensors.

compute_mask(inputs, mask=None)[source]

Computes an output mask tensor.

Parameters:
  • inputs – Tensor or list of tensors.
  • mask – Tensor or list of tensors.
Returns:

None or a tensor (or list of tensors, one per output tensor of the layer).

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:Python dictionary.
class deeptables.models.layers.MultiheadAttention(params, **kwargs)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

A multihead self-attentive network with residual connections that explicitly models feature interactions.

Parameters: params – dict
  • num_head: int, (default=1)
  • dropout_rate: float, (default=0)
  • use_residual: bool, (default=True)

Call arguments:
x: A 3D tensor.

Input shape:
  • 3D tensor with shape: (batch_size, field_size, embedding_size)

Output shape:
  • 3D tensor with shape: (batch_size, field_size, embedding_size*num_head)

References

[1] Song W, Shi C, Xiao Z, et al. AutoInt: Automatic feature interaction learning via self-attentive neural networks[C]//Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 2019: 1161-1170.
[2] https://github.com/shichence/AutoInt

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters:input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
call(x, **kwargs)[source]

This is where the layer’s logic lives.

Parameters:
  • inputs – Input tensor, or list/tuple of input tensors.
  • **kwargs – Additional keyword arguments.
Returns:

A tensor or list/tuple of tensors.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:Python dictionary.
class deeptables.models.layers.OuterProduct(params, **kwargs)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

Outer-Product layer

Parameters: outer_product_kernel_type – str, (default='mat') the type of outer product kernel
  • mat
  • vec
  • num

Call arguments:
x: A list of 3D tensors.

Input shape:
  • A list of 3D tensors with shape: (batch_size, 1, embedding_size)

Output shape:
  • 2D tensor with shape: (batch_size, num_fields*(num_fields-1)/2)

References

[1] Qu Y, Cai H, Ren K, et al. Product-based neural networks for user response prediction[C]//2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, 2016: 1149-1154.
[2] Qu Y, Fang B, Zhang W, et al. Product-based neural networks for user response prediction over multi-field categorical datasets[J]. ACM Transactions on Information Systems (TOIS), 2018, 37(1): 1-35.
[3] https://github.com/Atomu2014/product-nets

build(input)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters:input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
call(x, **kwargs)[source]

This is where the layer’s logic lives.

Parameters:
  • inputs – Input tensor, or list/tuple of input tensors.
  • **kwargs – Additional keyword arguments.
Returns:

A tensor or list/tuple of tensors.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:Python dictionary.
class deeptables.models.layers.SENET(pooling_op='mean', reduction_ratio=3, **kwargs)[source]

Bases: tensorflow.python.keras.engine.base_layer.Layer

SENET layer can dynamically increase the weights of important features and decrease the weights of uninformative features to let the model pay more attention to more important features.

Arguments:
pooling_op: str, (default='mean')
pooling method used to squeeze the original embedding E into a statistic vector Z
  • mean
  • max
reduction_ratio: float, (default=3)
hyper-parameter for dimensionality reduction

Call arguments:
x: A 3D tensor.

Input shape:
  • 3D tensor with shape: (batch_size, field_size, embedding_size)

Output shape:
  • 3D tensor with shape: (batch_size, field_size, embedding_size)

References

[1] Huang T, Zhang Z, Zhang J. FiBiNET: combining feature importance and bilinear feature interaction for click-through rate prediction[C]//Proceedings of the 13th ACM Conference on Recommender Systems. 2019: 169-177.
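
For intuition, a hedged sketch of the squeeze-and-excitation idea behind this layer (not the library's exact code): squeeze each field embedding to a scalar, pass the result through a bottleneck MLP, then re-weight the fields.

import tensorflow as tf

def senet(x, w1, w2):
    # x: (batch, field_size, emb); w1: (field_size, field_size // r);
    # w2: (field_size // r, field_size)
    z = tf.reduce_mean(x, axis=2)            # squeeze: (batch, field_size)
    a = tf.nn.relu(tf.matmul(z, w1))         # reduction
    a = tf.nn.relu(tf.matmul(a, w2))         # restore: (batch, field_size)
    return x * tf.expand_dims(a, axis=2)     # excite: re-weight fields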

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters:input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).
call(x, training=None, **kwargs)[source]

This is where the layer’s logic lives.

Parameters:
  • inputs – Input tensor, or list/tuple of input tensors.
  • **kwargs – Additional keyword arguments.
Returns:

A tensor or list/tuple of tensors.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:Python dictionary.
deeptables.models.layers.register_custom_objects(objs_dict: dict)[source]

deeptables.models.metainfo module

class deeptables.models.metainfo.CategoricalColumn[source]

Bases: deeptables.models.metainfo.CategoricalColumn

class deeptables.models.metainfo.ContinuousColumn[source]

Bases: deeptables.models.metainfo.ContinuousColumn

deeptables.models.modelset module

class deeptables.models.modelset.ModelInfo(type, name, model, score, **meta)[source]

Bases: object

dict_lower_keys(dict)[source]
get_score(metric_name)[source]
class deeptables.models.modelset.ModelSet(metric='AUC', best_mode='max')[source]

Bases: object

best_model()[source]
clear()[source]
get_modelinfo(name)[source]
get_modelinfos(type=None)[source]
get_models(type=None)[source]
leaderboard(top=0, type=None)[source]
push(modelinfo)[source]
top_n(top=0, type=None)[source]

deeptables.models.preprocessor module

class deeptables.models.preprocessor.AbstractPreprocessor(config: deeptables.models.config.ModelConfig)[source]

Bases: object

fit_transform(X, y, copy_data=True)[source]
get_categorical_columns()[source]
get_continuous_columns()[source]
inverse_transform_y(y_indicator)[source]
labels
static load(filepath)[source]
pos_label
save(filepath)[source]
task
transform(X, y, copy_data=True)[source]
transform_X(X, copy_data=True)[source]
transform_y(y, copy_data=True)[source]
class deeptables.models.preprocessor.DefaultPreprocessor(config: deeptables.models.config.ModelConfig)[source]

Bases: deeptables.models.preprocessor.AbstractPreprocessor

fit_transform(X, y, copy_data=True)[source]
fit_transform_y(y)[source]
get_categorical_columns()[source]
get_continuous_columns()[source]
inverse_transform_y(y_indicator)[source]
prepare_X(X)[source]
reset()[source]
transform(X, y, copy_data=True)[source]
transform_X(X, copy_data=True)[source]
transform_y(y, copy_data=True)[source]
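
A minimal sketch (assumes config, X_train and y_train from the earlier examples, and that fit_transform returns the transformed X and y): DeepTable drives the preprocessor internally, but it can also be used standalone to inspect the transformed inputs.

pre = DefaultPreprocessor(config)
X_t, y_t = pre.fit_transform(X_train, y_train, copy_data=True)
cat_cols = pre.get_categorical_columns()
cont_cols = pre.get_continuous_columns()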

Module contents