Layers

FM

Factorization Machine layer that models order-2 feature interactions.

Call arguments:

  • x: A 3D tensor.

Input shape:

  • 3D tensor with shape: (batch_size, field_size, embedding_size)

Output shape:

  • 2D tensor with shape: (batch_size, 1)

References:

  • [1] Rendle S. Factorization machines[C]//2010 IEEE International Conference on Data Mining. IEEE, 2010: 995-1000.
  • [2] Guo H, Tang R, Ye Y, et al. DeepFM: An end-to-end wide & deep learning framework for CTR prediction[J]. arXiv preprint arXiv:1804.04950, 2018.
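
For illustration, here is a minimal sketch of the order-2 interaction this layer computes, written in plain TensorFlow (tensor sizes are arbitrary; this is not the library's implementation):

```python
import tensorflow as tf

# FM pairwise interactions via the sum-then-square minus square-then-sum
# identity from Rendle (2010), summed over the embedding dimension.
x = tf.random.normal([32, 10, 8])                    # (batch_size, field_size, embedding_size)
square_of_sum = tf.square(tf.reduce_sum(x, axis=1))  # (batch_size, embedding_size)
sum_of_square = tf.reduce_sum(tf.square(x), axis=1)  # (batch_size, embedding_size)
fm_out = 0.5 * tf.reduce_sum(square_of_sum - sum_of_square,
                             axis=1, keepdims=True)
print(fm_out.shape)                                  # (32, 1)
```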

AFM

Attentional Factorization Machine (AFM), which learns the importance of each feature interaction from data via a neural attention network.

Arguments:

  • hidden_factor: int, (default=16)
  • activation_function: str, (default='relu')
  • kernel_regularizer: str or object, (default=None)
  • dropout_rate: float, (default=0)

Call arguments:

  • x: A list of 3D tensors.

Input shape:

  • A list of 3D tensors, each with shape: (batch_size, 1, embedding_size)

Output shape:

  • 2D tensor with shape: (batch_size, 1)

References:

  • [1] Xiao J, Ye H, He X, et al. Attentional factorization machines: Learning the weight of feature interactions via attention networks[J]. arXiv preprint arXiv:1708.04617, 2017.
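
A minimal sketch of the AFM computation in plain TensorFlow, assuming four fields; the attention network below is an illustrative assumption, not the library's internals:

```python
import itertools
import tensorflow as tf

batch, emb, hidden_factor = 32, 8, 16
embeddings = [tf.random.normal([batch, 1, emb]) for _ in range(4)]  # one per field

# Hadamard (element-wise) product for every field pair.
pairs = [tf.squeeze(vi * vj, axis=1)                                # (batch, emb)
         for vi, vj in itertools.combinations(embeddings, 2)]
p = tf.stack(pairs, axis=1)                                         # (batch, num_pairs, emb)

# Attention network: a single hidden layer scores each interaction.
att_hidden = tf.keras.layers.Dense(hidden_factor, activation='relu')(p)
att_score = tf.nn.softmax(tf.keras.layers.Dense(1)(att_hidden), axis=1)

# Attention-weighted sum of interactions, projected to one logit.
afm_out = tf.keras.layers.Dense(1)(tf.reduce_sum(att_score * p, axis=1))
print(afm_out.shape)                                                # (32, 1)
```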

CIN

Compressed Interaction Network (CIN), with the following considerations: (1) interactions are applied at the vector-wise level, not the bit-wise level; (2) high-order feature interactions are measured explicitly; (3) the complexity of the network does not grow exponentially with the degree of interactions.

Arguments:

  • cross_layer_size: tuple of int, (default=(128, 128))
  • activation: str, (default='relu')
  • use_residual: bool, (default=False)
  • use_bias: bool, (default=False)
  • direct: bool, (default=False)
  • reduce_D: bool, (default=False)

Call arguments:

  • x: A 3D tensor.

Input shape:

  • A 3D tensor with shape: (batch_size, num_fields, embedding_size)

Output shape:

  • 2D tensor with shape: (batch_size, *)

References:

  • [1] Lian J, Zhou X, Zhang F, et al. xDeepFM: Combining explicit and implicit feature interactions for recommender systems[C]//Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018: 1754-1763.
  • [2] https://github.com/Leavingseason/xDeepFM
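
A simplified sketch of the CIN computation (the non-direct variant, without residuals or bias) in plain TensorFlow; the weight shapes follow the paper, but names and initialization are illustrative:

```python
import tensorflow as tf

batch, m, D = 32, 10, 8
cross_layer_size = (128, 128)
x0 = tf.random.normal([batch, m, D])                  # (batch_size, num_fields, embedding_size)

hidden, outputs = x0, []
for h_k in cross_layer_size:
    # Vector-wise interaction: outer product of x0 and the previous hidden
    # state along each embedding dimension, then a learned compression.
    z = tf.einsum('bid,bjd->bijd', x0, hidden)        # (batch, m, h_prev, D)
    z = tf.reshape(z, [batch, -1, D])                 # flatten the field pairs
    w = tf.Variable(tf.random.normal([z.shape[1], h_k]) * 0.01)
    hidden = tf.nn.relu(tf.einsum('bpd,ph->bhd', z, w))  # (batch, h_k, D)
    outputs.append(tf.reduce_sum(hidden, axis=2))     # sum-pool over embeddings

cin_out = tf.concat(outputs, axis=1)                  # (batch_size, sum(cross_layer_size))
print(cin_out.shape)                                  # (32, 256)
```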

MultiheadAttention

A multi-head self-attention network with residual connections that explicitly models feature interactions.

Arguments:

  • num_head: int, (default=1)
  • dropout_rate: float, (default=0)
  • use_residual: bool, (default=True)

Call arguments:

  • x: A 3D tensor.

Input shape:

  • 3D tensor with shape: (batch_size, field_size, embedding_size)

Output shape:

  • 3D tensor with shape: (batch_size, field_size, embedding_size*num_head)

References:

  • [1] Song W, Shi C, Xiao Z, et al. AutoInt: Automatic feature interaction learning via self-attentive neural networks[C]//Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 2019: 1161-1170.
  • [2] https://github.com/shichence/AutoInt
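
A minimal sketch of multi-head self-attention over fields with a residual connection, in plain TensorFlow, matching the documented shapes; the projection layers are illustrative assumptions:

```python
import tensorflow as tf

batch, fields, emb, num_head = 32, 10, 8, 2
x = tf.random.normal([batch, fields, emb])

head_outputs = []
for _ in range(num_head):
    # Per-head projections of queries, keys and values.
    q = tf.keras.layers.Dense(emb)(x)
    k = tf.keras.layers.Dense(emb)(x)
    v = tf.keras.layers.Dense(emb)(x)
    # Scaled dot-product self-attention across the field axis.
    att = tf.nn.softmax(tf.matmul(q, k, transpose_b=True) / emb ** 0.5, axis=-1)
    head_outputs.append(tf.matmul(att, v))            # (batch, fields, emb)

out = tf.concat(head_outputs, axis=-1)                # (batch, fields, emb*num_head)
# Residual connection: project x to the concatenated width before adding.
out = tf.nn.relu(out + tf.keras.layers.Dense(emb * num_head)(x))
print(out.shape)                                      # (32, 10, 16)
```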

FGCNN

The Feature Generation layer leverages the strength of CNNs to capture local patterns and recombines them to generate new features.

Arguments:

  • filters: int, the number of filters of the convolutional layer
  • kernel_height: int, the height of the kernel_size of the convolutional layer
  • new_filters: int, the number of new feature maps in the recombination layer
  • pool_height: int, the height of the pool_size of the pooling layer
  • activation: str, (default='tanh')

Call arguments:

  • x: A 4D tensor.

Input shape:

  • 4D tensor with shape: (batch_size, field_size, embedding_size, 1)

Output shape:

  • pooling_output: 4D tensor
  • new_features: 3D tensor with shape: (batch_size, field_size*new_filters, embedding_size)

References:

  • [1] Liu B, Tang R, Chen Y, et al. Feature generation by convolutional neural network for click-through rate prediction[C]//The World Wide Web Conference. 2019: 1119-1129.
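
A minimal sketch of the generate-and-recombine steps in plain TensorFlow/Keras; the hyper-parameter values are arbitrary, and the exact wiring is an assumption based on reference [1]:

```python
import tensorflow as tf

batch, fields, emb = 32, 12, 8
filters, kernel_height, new_filters, pool_height = 14, 7, 3, 2
x = tf.random.normal([batch, fields, emb, 1])

# Convolution over the field axis captures local feature patterns.
h = tf.keras.layers.Conv2D(filters, (kernel_height, 1), padding='same',
                           activation='tanh')(x)          # (batch, fields, emb, filters)
pooling_output = tf.keras.layers.MaxPool2D((pool_height, 1))(h)

# Recombination: a dense layer mixes the pooled patterns into new features.
pooled_fields = pooling_output.shape[1]                   # fields / pool_height
flat = tf.keras.layers.Flatten()(pooling_output)
new = tf.keras.layers.Dense(pooled_fields * new_filters * emb,
                            activation='tanh')(flat)
new_features = tf.reshape(new, [-1, pooled_fields * new_filters, emb])
print(pooling_output.shape, new_features.shape)           # (32, 6, 8, 14) (32, 18, 8)
```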

SENET

The SENET layer dynamically increases the weights of important features and decreases the weights of uninformative ones, so that the model pays more attention to the informative features.

Arguments:

  • pooling_op: str, (default='mean'). The pooling method used to squeeze the original embedding E into a statistic vector Z
  • reduction_ratio: float, (default=3). Hyper-parameter controlling the dimensionality reduction in the excitation step

Call arguments:

  • x: A 3D tensor.

Input shape:

  • 3D tensor with shape: (batch_size, field_size, embedding_size)

Output shape:

  • 3D tensor with shape: (batch_size, field_size, embedding_size)

References:

  • [1] Huang T, Zhang Z, Zhang J. FiBiNET: combining feature importance and bilinear feature interaction for click-through rate prediction[C]//Proceedings of the 13th ACM Conference on Recommender Systems. 2019: 169-177.
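
A minimal sketch of the squeeze, excitation, and re-weight steps with mean pooling, in plain TensorFlow; the two dense layers are illustrative assumptions:

```python
import tensorflow as tf

batch, fields, emb, reduction_ratio = 32, 10, 8, 3
x = tf.random.normal([batch, fields, emb])

# Squeeze: mean-pool each field embedding into one summary statistic.
z = tf.reduce_mean(x, axis=2)                              # (batch, fields)
# Excitation: two dense layers learn per-field importance weights.
a = tf.keras.layers.Dense(max(1, fields // reduction_ratio),
                          activation='relu')(z)
a = tf.keras.layers.Dense(fields, activation='relu')(a)    # (batch, fields)
# Re-weight: scale each field embedding by its learned importance.
out = x * tf.expand_dims(a, axis=2)                        # (batch, fields, emb)
print(out.shape)                                           # (32, 10, 8)
```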

BilinearInteraction

The Bilinear-Interaction layer combines the inner product and Hadamard product to learn the feature interactions.

Arguments:

  • bilinear_type: str, (default='field_interaction'). The type of bilinear function:
    • field_interaction
    • field_all
    • field_each

Call arguments:

  • x: A 3D tensor.

Input shape:

  • 3D tensor with shape: (batch_size, field_size, embedding_size)

Output shape:

  • 3D tensor with shape: (batch_size, *, embedding_size)

References:

  • [1] Huang T, Zhang Z, Zhang J. FiBiNET: combining feature importance and bilinear feature interaction for click-through rate prediction[C]//Proceedings of the 13th ACM Conference on Recommender Systems. 2019: 169-177.
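
A minimal sketch of the 'field_interaction' variant in plain TensorFlow, where each field pair gets its own weight matrix; this is an illustration, not the library's implementation:

```python
import itertools
import tensorflow as tf

batch, fields, emb = 32, 4, 8
x = tf.random.normal([batch, fields, emb])
vs = tf.unstack(x, axis=1)                       # one (batch, emb) tensor per field

# For each field pair (i, j): inner product v_i . W_ij, then a
# Hadamard product with v_j.
outputs = []
for i, j in itertools.combinations(range(fields), 2):
    w = tf.Variable(tf.random.normal([emb, emb]) * 0.01)
    outputs.append(tf.matmul(vs[i], w) * vs[j])  # (batch, emb)

out = tf.stack(outputs, axis=1)                  # (batch, num_pairs, emb)
print(out.shape)                                 # (32, 6, 8)
```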

Cross

The cross network is composed of cross layers that apply explicit feature crossing in an efficient way.

Arguments:

  • num_cross_layer: int, (default=2). The number of cross layers

Call arguments:

  • x: A 2D tensor.

Input shape:

  • 2D tensor with shape: (batch_size, field_size)

Output shape:

  • 2D tensor with shape: (batch_size, field_size)

References:

  • [1] Wang R, Fu B, Fu G, et al. Deep & cross network for ad click predictions[C]//Proceedings of the ADKDD'17. 2017: 1-7.
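
A minimal sketch of the cross-layer recurrence x_{l+1} = x_0 * (x_l . w_l) + b_l + x_l in plain TensorFlow; variable names and initialization are illustrative:

```python
import tensorflow as tf

batch, dim, num_cross_layer = 32, 10, 2
x0 = tf.random.normal([batch, dim])              # (batch_size, field_size)

xl = x0
for _ in range(num_cross_layer):
    w = tf.Variable(tf.random.normal([dim, 1]) * 0.01)
    b = tf.Variable(tf.zeros([dim]))
    # Each layer adds an explicit feature crossing of one degree higher,
    # while the +xl term keeps all lower-degree terms.
    xl = x0 * tf.matmul(xl, w) + b + xl          # (batch_size, field_size)

print(xl.shape)                                  # (32, 10)
```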

InnerProduct

The InnerProduct layer used in PNN (Product-based Neural Network).

Call arguments:

  • x: A list of 3D tensors.

Input shape:

  • A list of 3D tensors, each with shape: (batch_size, 1, embedding_size)

Output shape:

  • 2D tensor with shape: (batch_size, num_fields*(num_fields-1)/2)

References:

  • [1] Qu Y, Cai H, Ren K, et al. Product-based neural networks for user response prediction[C]//2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, 2016: 1149-1154.
  • [2] Qu Y, Fang B, Zhang W, et al. Product-based neural networks for user response prediction over multi-field categorical data[J]. ACM Transactions on Information Systems (TOIS), 2018, 37(1): 1-35.
  • [3] https://github.com/Atomu2014/product-nets
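
A minimal sketch of the pairwise inner products in plain TensorFlow; the number of fields is arbitrary:

```python
import itertools
import tensorflow as tf

batch, emb, num_fields = 32, 8, 4
embeddings = [tf.random.normal([batch, 1, emb]) for _ in range(num_fields)]

# Inner product of every pair of field embeddings gives one scalar per pair.
products = [tf.reduce_sum(vi * vj, axis=[1, 2])  # (batch,)
            for vi, vj in itertools.combinations(embeddings, 2)]
out = tf.stack(products, axis=1)                 # (batch, num_fields*(num_fields-1)/2)
print(out.shape)                                 # (32, 6)
```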

OuterProduct

The OuterProduct layer used in PNN (Product-based Neural Network).

Arguments:

  • outer_product_kernel_type: str, (default='mat'). The type of outer product kernel:
    • mat
    • vec
    • num

Call arguments:

  • x: A list of 3D tensors.

Input shape:

  • A list of 3D tensors, each with shape: (batch_size, 1, embedding_size)

Output shape:

  • 2D tensor with shape: (batch_size, num_fields*(num_fields-1)/2)

References:

  • [1] Qu Y, Cai H, Ren K, et al. Product-based neural networks for user response prediction[C]//2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, 2016: 1149-1154.
  • [2] Qu Y, Fang B, Zhang W, et al. Product-based neural networks for user response prediction over multi-field categorical data[J]. ACM Transactions on Information Systems (TOIS), 2018, 37(1): 1-35.
  • [3] https://github.com/Atomu2014/product-nets
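
A minimal sketch of the 'mat' kernel variant in plain TensorFlow, assuming one emb x emb kernel per field pair; this is an illustration, not the library's implementation:

```python
import itertools
import tensorflow as tf

batch, emb, num_fields = 32, 8, 4
embeddings = [tf.squeeze(tf.random.normal([batch, 1, emb]), axis=1)
              for _ in range(num_fields)]        # (batch, emb) per field

# 'mat' kernel: each field pair (i, j) gets an emb x emb kernel K, and the
# pair's output is the bilinear form v_i^T K v_j.
outputs = []
for vi, vj in itertools.combinations(embeddings, 2):
    k = tf.Variable(tf.random.normal([emb, emb]) * 0.01)
    outputs.append(tf.reduce_sum(tf.matmul(vi, k) * vj, axis=1))  # (batch,)

out = tf.stack(outputs, axis=1)                  # (batch, num_fields*(num_fields-1)/2)
print(out.shape)                                 # (32, 6)
```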