Models#
Below are the models that are currently supported in Cornac.
Recommender (Generic Class)#
- class cornac.models.recommender.ANNMixin[source]#
Mixin class for Approximate Nearest Neighbor Search.
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Raises:
NotImplementedError – This base method is not implemented; classes using the mixin are expected to override it.
- class cornac.models.recommender.NextBasketRecommender(name, trainable=True, verbose=False)[source]#
Generic class for a next basket recommender model. All next basket recommendation models should inherit from this class.
- Parameters:
name (str, required) – Name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trainable.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
- total_users#
Number of users in training, validation, and test data. In other words, this includes unknown/unseen users.
- Type:
int
- total_items#
Number of items in training, validation, and test data. In other words, this includes unknown/unseen items.
- Type:
int
- score(user_idx, history_baskets, **kwargs)[source]#
Predict the scores for all items based on input history baskets
- Parameters:
history_baskets (list of lists) – The list of history baskets in sequential manner for next-basket prediction.
- Returns:
res – Relative scores of all known items
- Return type:
a Numpy array
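To make the input and output shapes of score() concrete, the sketch below scores every known item by how often it appears in the history baskets. It is a toy, frequency-based stand-in written in plain Python, not a Cornac model: the function name score_by_frequency and its total_items argument are hypothetical, and it returns a list where a real model would return a NumPy array.

```python
from collections import Counter


def score_by_frequency(history_baskets, total_items):
    """Toy scorer: an item's score is its count across all history baskets.

    history_baskets: list of lists of item indices, oldest basket first.
    total_items: number of known items; the result holds one score per item.
    """
    counts = Counter(item for basket in history_baskets for item in basket)
    return [float(counts.get(idx, 0)) for idx in range(total_items)]


# Three past baskets of one user, over a catalogue of five items.
history = [[0, 2], [2, 3], [2]]
scores = score_by_frequency(history, total_items=5)
print(scores)  # [1.0, 0.0, 3.0, 1.0, 0.0]
```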
- class cornac.models.recommender.NextItemRecommender(name, trainable=True, verbose=False)[source]#
Generic class for a next item recommender model. All next item recommendation models should inherit from this class.
- Parameters:
name (str, required) – Name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trainable.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
- total_users#
Number of users in training, validation, and test data. In other words, this includes unknown/unseen users.
- Type:
int
- total_items#
Number of items in training, validation, and test data. In other words, this includes unknown/unseen items.
- Type:
int
- score(user_idx, history_items, **kwargs)[source]#
Predict the scores for all items based on input history items
- Parameters:
history_items (list of lists) – The list of history items in sequential manner for next-item prediction.
- Returns:
res – Relative scores of all known items
- Return type:
a Numpy array
- class cornac.models.recommender.Recommender(name, trainable=True, verbose=False)[source]#
Generic class for a recommender model. All recommendation models should inherit from this class.
- Parameters:
name (str, required) – Name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trainable.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
- total_users#
Number of users in training, validation, and test data. In other words, this includes unknown/unseen users.
- Type:
int
- total_items#
Number of items in training, validation, and test data. In other words, this includes unknown/unseen items.
- Type:
int
- clone(new_params=None)[source]#
Clone an instance of the model object.
- Parameters:
new_params (dict, optional, default: None) – New parameters for the cloned instance.
- Returns:
object
- Return type:
cornac.models.Recommender
- default_score()[source]#
Override this function if your algorithm has special treatment for the cold-start problem.
- early_stop(train_set, val_set, min_delta=0.0, patience=0)[source]#
Check if training should be stopped when validation loss has stopped improving.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
min_delta (float, optional, default: 0.) – The minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.
patience (int, optional, default: 0) – Number of epochs with no improvement after which training should be stopped.
- Returns:
res – Return True if model training should be stopped (no improvement on validation set), otherwise return False.
- Return type:
bool
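The interplay of min_delta and patience can be sketched as follows. This is a minimal illustration of the behaviour described above, assuming a monitored value where higher is better; the EarlyStopper class and its attribute names are hypothetical and are not Cornac's implementation.

```python
class EarlyStopper:
    """Tracks a monitored validation value and signals when to stop."""

    def __init__(self, min_delta=0.0, patience=0):
        self.min_delta = min_delta
        self.patience = patience
        self.best_value = float("-inf")
        self.wait = 0  # consecutive epochs without improvement

    def should_stop(self, current_value):
        # An increase smaller than min_delta counts as no improvement.
        if current_value - self.best_value >= self.min_delta:
            self.best_value = current_value
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


stopper = EarlyStopper(min_delta=0.01, patience=2)
val_values = [0.50, 0.55, 0.551, 0.552, 0.60]  # one monitored value per epoch
stopped_at = None
for epoch, value in enumerate(val_values):
    if stopper.should_stop(value):
        stopped_at = epoch
        break
print(stopped_at)  # 3
```

Epochs 2 and 3 each improve by less than min_delta, so the second of them exhausts the patience of 2 and training stops before the later jump to 0.60 is ever seen.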
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- is_unknown_item(item_idx)[source]#
Return whether the item, identified by its index, is unknown to the model. Inverse of the knows_item() function, for better readability in some cases.
- is_unknown_user(user_idx)[source]#
Return whether the user, identified by its index, is unknown to the model. Inverse of the knows_user() function, for better readability in some cases.
- property item_ids#
Return the list of raw item IDs
- static load(model_path, trainable=False)[source]#
Load a recommender model from the filesystem.
- Parameters:
model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.
trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.
- Returns:
self
- Return type:
- monitor_value(train_set, val_set)[source]#
Calculate the monitored value used for early stopping on the validation set (val_set). This function is called by the early_stop() function. Note: val_set may be None, so it needs to be checked before use.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Raises:
NotImplementedError – This base method is not implemented; models that support early stopping are expected to override it.
- rank(user_idx, item_indices=None, k=-1, **kwargs)[source]#
Rank all test items for a given user.
- Parameters:
user_idx (int, required) – The index of the user for whom to perform item ranking.
item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked for the user. If None, list of ranked known item indices and their scores will be returned.
k (int, optional, default: -1) – Cut-off length for recommendations; k=-1 will return the ranked list of all items. The limit matters most for ANN search, which uses it to avoid exhaustive ranking.
- Returns:
(ranked_items, item_scores) – ranked_items contains item indices being ranked by their scores. item_scores contains scores of items corresponding to index in item_indices input.
- Return type:
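The relationship between item_indices, k, and the two returned values can be sketched in plain Python. This is an illustrative reading of the description above, not Cornac's code: the helper rank_items and its all_scores argument are hypothetical, and lists stand in for NumPy arrays.

```python
def rank_items(all_scores, item_indices=None, k=-1):
    """Rank candidate items by descending score.

    all_scores: one score per known item, positioned by item index.
    item_indices: optional candidate item indices; None means all known items.
    k: cut-off length; -1 keeps the full ranked list.
    """
    if item_indices is None:
        candidates = list(range(len(all_scores)))
    else:
        candidates = list(item_indices)
    # Scores are reported in the same order as the candidate list.
    item_scores = [all_scores[i] for i in candidates]
    ranked_items = sorted(candidates, key=lambda i: all_scores[i], reverse=True)
    if k != -1:
        ranked_items = ranked_items[:k]
    return ranked_items, item_scores


all_scores = [0.1, 0.9, 0.4, 0.7]
top2, scores_all = rank_items(all_scores, k=2)
print(top2)  # [1, 3]
ranked_subset, scores_subset = rank_items(all_scores, item_indices=[2, 0, 3])
print(ranked_subset)  # [3, 2, 0]
print(scores_subset)  # [0.4, 0.1, 0.7]
```

Note that item_scores follows the order of the candidate list, while ranked_items follows the order of the scores.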
- rate(user_idx, item_idx, clipping=True)[source]#
Give a rating score for a given pair of user and item.
- Parameters:
- Returns:
A rating score of the user for the item
- Return type:
A scalar
- recommend(user_id, k=-1, remove_seen=False, train_set=None)[source]#
Generate top-K item recommendations for a given user. The key difference between this function and the rank() function is that rank() works with mapped user/item indices, while this function works with original user/item IDs. This hides the ID-index mapping and makes model usage and deployment cleaner.
- Parameters:
user_id (str, required) – The original ID of the user.
k (int, optional, default: -1) – Cut-off length for recommendations, k=-1 will return ranked list of all items.
remove_seen (bool, optional, default: False) – Remove seen/known items during training and validation from output recommendations.
train_set (cornac.data.Dataset, optional, default: None) – Training dataset needs to be provided in order to remove seen items.
- Returns:
recommendations – Recommended items in the form of their original IDs.
- Return type:
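The ID-to-index round trip that recommend() hides can be sketched as below. This is a simplified illustration under stated assumptions, not the library's implementation: recommend_ids, uid_map, score_fn, and seen_items are hypothetical names, and the seen items are passed in directly instead of being looked up in a training set.

```python
def recommend_ids(user_id, uid_map, item_ids, score_fn, k=-1, seen_items=None):
    """Map a raw user ID to an index, rank item indices, map back to raw item IDs.

    uid_map: dict from raw user ID to user index.
    item_ids: list of raw item IDs, positioned by item index.
    score_fn: callable returning one score per item for a user index.
    seen_items: optional set of item indices to exclude from the output.
    """
    if user_id not in uid_map:
        raise ValueError(f"{user_id} is unknown to the model.")
    user_idx = uid_map[user_id]
    scores = score_fn(user_idx)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if seen_items:
        ranked = [i for i in ranked if i not in seen_items]
    if k != -1:
        ranked = ranked[:k]
    return [item_ids[i] for i in ranked]


uid_map = {"alice": 0, "bob": 1}
item_ids = ["book", "pen", "lamp"]
score_table = {0: [0.2, 0.9, 0.5], 1: [0.8, 0.1, 0.3]}

top = recommend_ids("alice", uid_map, item_ids, score_table.__getitem__, k=2)
print(top)  # ['pen', 'lamp']
unseen = recommend_ids("alice", uid_map, item_ids, score_table.__getitem__,
                       k=2, seen_items={1})
print(unseen)  # ['lamp', 'book']
```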
- save(save_dir=None, save_trainset=False, metadata=None)[source]#
Save a recommender model to the filesystem.
- Parameters:
save_dir (str, default: None) – Path to a directory for the model to be stored.
save_trainset (bool, default: False) – Save train_set together with the model. This is useful if we want to deploy model later because train_set is required for certain evaluation steps.
metadata (dict, default: None) – Metadata to be saved with the model. This is useful to store model details.
- Returns:
model_file – Path to the model file stored on the filesystem.
- Return type:
- score(user_idx, item_idx=None)[source]#
Predict the scores/ratings of a user for an item.
- Parameters:
- Returns:
res – Relative scores that the user gives to the item or to all known items
- Return type:
A scalar or a Numpy array
- property total_items#
Total number of items, including items in the test and validation sets if they exist.
- property total_users#
Total number of users, including users in the test and validation sets if they exist.
- transform(test_set)[source]#
Transform the test set into cached results that accelerate the score function. This function is meant to be called in cornac.eval_methods.BaseMethod before the evaluation step. Implementing it is optional.
- Parameters:
test_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- property user_ids#
Return the list of raw user IDs
Comparative Aspects and Opinions Ranking for Recommendation Explanations (Companion)#
- class cornac.models.companion.recom_companion.Companion(name='Companion', rating_scale=5.0, n_user_factors=8, n_item_factors=8, n_aspect_factors=8, n_opinion_factors=8, n_bpr_samples=1000, n_aspect_ranking_samples=1000, n_opinion_ranking_samples=1000, n_element_samples=50, n_top_aspects=100, alpha=0.5, min_user_freq=2, min_pair_freq=1, min_common_freq=1, use_item_aspect_popularity=True, enum_window=None, lambda_reg=0.1, lambda_p=10, lambda_a=10, lambda_y=10, lambda_z=10, lambda_bpr=10, max_iter=200000, lr=0.1, n_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#
Comparative Aspects and Opinions Ranking for Recommendation Explanations
- Parameters:
name (string, optional, default: 'Companion') – The name of the recommender model.
rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.
n_user_factors (int, optional, default: 8) – The dimension of the user latent factors.
n_item_factors (int, optional, default: 8) – The dimension of the item latent factors.
n_aspect_factors (int, optional, default: 8) – The dimension of the aspect latent factors.
n_opinion_factors (int, optional, default: 8) – The dimension of the opinion latent factors.
n_bpr_samples (int, optional, default: 1000) – The number of samples from all BPR pairs.
n_element_samples (int, optional, default: 50) – The number of samples from all ratings in each iteration.
n_top_aspects (int, optional, default: 100) – The number of top scored aspects for each (user, item) pair to construct ranking score.
alpha (float, optional, default: 0.5) – Trade-off factor for constructing ranking score.
lambda_reg (float, optional, default: 0.1) – The regularization parameter.
lambda_bpr (float, optional, default: 10.0) – The regularization parameter for BPR.
lambda_p (float, optional, default: 10.0) – The regularization parameter for aspect ranking on item.
lambda_a (float, optional, default: 10.0) – The regularization parameter for item ranking by aspect.
lambda_y (float, optional, default: 10.0) – The regularization parameter for positive opinion ranking.
lambda_z (float, optional, default: 10.0) – The regularization parameter for negative opinion ranking.
max_iter (int, optional, default: 200000) – Maximum number of iterations for training.
lr (float, optional, default: 0.1) – The learning rate for optimization
n_threads (int, optional, default: 0) – Number of parallel threads for training. If n_threads=0, all CPU cores will be utilized. If seed is not None, n_threads=1 to remove randomness from parallelization.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, I, A, O, G1, G2, and G3 are not None).
verbose (boolean, optional, default: False) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) – List of initial parameters, e.g., init_params = {‘U’:U, ‘I’:I, ‘A’:A, ‘O’:O, ‘G1’:G1, ‘G2’:G2, ‘G3’:G3}
seed (int, optional, default: None) – Random seed for parameters initialization.
References
Trung-Hoang Le and Hady W. Lauw. 2024. Learning to Rank Aspects and Opinions for Comparative Explanations. Machine Learning (Special Issue for ACML 2024). https://lthoang.com/assets/publications/mlj24.pdf
- fit(train_set, val_set=None)#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- rank(user_idx, item_indices=None, k=-1)#
Rank all test items for a given user.
- Parameters:
user_idx (int, required) – The index of the user for whom to perform item ranking.
item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked for the user. If None, list of ranked known item indices and their scores will be returned.
k (int, optional, default: -1) – Cut-off length for recommendations; k=-1 will return the ranked list of all items. The limit matters most for ANN search, which uses it to avoid exhaustive ranking.
- Returns:
(ranked_items, item_scores) – ranked_items contains item indices being ranked by their scores. item_scores contains scores of items corresponding to index in item_indices input.
- Return type:
- score(u_idx, i_idx=None)#
Predict the scores/ratings of a user for an item.
- Parameters:
- Returns:
res – Relative scores that the user gives to the item or to all known items
- Return type:
A scalar or a Numpy array
Disentangled Multimodal Representation Learning for Recommendation (DMRL)#
- class cornac.models.dmrl.recom_dmrl.DMRL(name: str = 'DMRL', batch_size: int = 32, learning_rate: float = 0.0001, decay_c: float = 1, decay_r: float = 0.01, epochs: int = 10, embedding_dim: int = 100, bert_text_dim: int = 384, image_dim: int = None, dropout: float = 0, num_neg: int = 4, num_factors: int = 4, trainable: bool = True, verbose: bool = False, log_metrics: bool = False)[source]#
Disentangled multimodal representation learning
- Parameters:
name (string, default: 'DMRL') – The name of the recommender model.
batch_size (int, optional, default: 32) – The number of samples per batch to load.
learning_rate (float, optional, default: 1e-4) – The learning rate for the optimizer.
decay_c (float, optional, default: 1) – The decay for the disentangled loss term in the loss function.
decay_r (float, optional, default: 0.01) – The decay for the regularization term in the loss function.
epochs (int, optional, default: 10) – The number of epochs to train the model.
embedding_dim (int, optional, default: 100) – The dimension of the embeddings.
bert_text_dim (int, optional, default: 384) – The dimension of the BERT text embeddings produced by the Hugging Face transformer model.
image_dim (int, optional, default: None) – The dimension of the image embeddings.
num_neg (int, optional, default: 4) – The number of negative samples to use in the training per user per batch (1 positive and num_neg negatives are used)
num_factors (int, optional, default: 4) – The number of factors to use in the model.
trainable (bool, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already trained.
verbose (bool, optional, default: False) – When True, the model prints out more information during training.
modalities_pre_built (bool, optional, default: True) – When True, the model assumes that the modalities are already built and does not build them.
log_metrics (bool, optional, default: False) – When True, the model logs metrics to tensorboard.
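The role of num_neg can be illustrated with a small sampling sketch: each training row pairs one positive item with num_neg sampled negatives. The row layout used here (user index, positive item, then negatives) and the helper build_training_row are assumptions for illustration only, not DMRL's actual sampler; negatives are drawn uniformly with replacement.

```python
import random


def build_training_row(user_idx, pos_item, user_items, total_items, num_neg=4, rng=None):
    """Return [user, positive item, num_neg sampled negative items].

    user_items: set of item indices the user has interacted with.
    total_items: size of the item catalogue to sample negatives from.
    """
    if len(set(user_items)) >= total_items:
        raise ValueError("No negative items are available for this user.")
    rng = rng or random.Random()
    negatives = []
    while len(negatives) < num_neg:
        candidate = rng.randrange(total_items)
        # A negative is any item the user has not interacted with.
        if candidate not in user_items:
            negatives.append(candidate)
    return [user_idx, pos_item] + negatives


row = build_training_row(user_idx=0, pos_item=3, user_items={1, 3},
                         total_items=10, num_neg=4, rng=random.Random(0))
print(len(row))  # 6: one user index, one positive, four negatives
```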
References
- Fan Liu, Huilin Chen, Zhiyong Cheng, Anan Liu, Liqiang Nie, Mohan Kankanhalli. DMRL: Disentangled Multimodal Representation Learning for Recommendation. https://arxiv.org/pdf/2203.05406.pdf.
- eval_train_set_performance() Tuple[float, float] [source]#
Evaluate the model's training set performance using the Recall 300 metric.
- fit(train_set: Dataset, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- get_item_image_embedding(batch)[source]#
Get the item image embeddings from the image modality. Expects the image modality to be pre-encoded and available as a numpy array.
- Parameters:
batch (param) – and all other columns are negative item indices
- get_item_text_embeddings(batch)[source]#
Get the item text embeddings from the BERT model. Either by encoding the text on the fly or by using the preencoded text.
- Parameters:
batch (param) – and all other columns are negative item indices
- get_modality_embeddings(batch)[source]#
Get the modality embeddings for both text and image from the respective modality instances.
- Parameters:
batch (param)
second (indices in) – and all other columns are negative item indices
- initialize_and_build_modalities(trainset: Dataset)[source]#
Initializes text and image modalities for the model. Either takes in raw text or images and pre-encodes them with the transformer models in TransformerTextModality and TransformerVisionModality, or, if pre-encoded features are given, uses those instead and simply wraps them into a general FeatureModality instance, as no further encoding model is required.
- score(user_index: int, item_indices=None)[source]#
Scores a user-item pair. If item_indices is None, scores all known items.
- Parameters:
user_index (int, required) – The index of the user for whom to perform score prediction.
item_indices (torch.Tensor, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.
Bilateral VAE for Collaborative Filtering (BiVAECF)#
- class cornac.models.bivaecf.recom_bivaecf.BiVAECF(name='BiVAECF', k=10, encoder_structure=[20], act_fn='tanh', likelihood='pois', n_epochs=100, batch_size=100, learning_rate=0.001, beta_kl=1.0, cap_priors={'item': False, 'user': False}, trainable=True, verbose=False, seed=None, use_gpu=True)[source]#
Bilateral Variational AutoEncoder for Collaborative Filtering.
- Parameters:
k (int, optional, default: 10) – The dimension of the stochastic user “theta” and item “beta” factors.
encoder_structure (list, default: [20]) – The number of neurons per layer of the user and item encoders for BiVAE. For example, encoder_structure = [20], the user (item) encoder structure will be [num_items, 20, k] ([num_users, 20, k]).
act_fn (str, default: 'tanh') – Name of the activation function used between hidden layers of the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’]
likelihood (str, default: 'pois') –
The likelihood function used for modeling the observations. Supported choices:
- bern: Bernoulli likelihood
- gaus: Gaussian likelihood
- pois: Poisson likelihood
n_epochs (int, optional, default: 100) – The number of epochs for SGD.
batch_size (int, optional, default: 100) – The batch size.
learning_rate (float, optional, default: 0.001) – The learning rate for Adam.
beta_kl (float, optional, default: 1.0) – The weight of the KL terms as in beta-VAE.
cap_priors (dict, optional, default: {"user":False, "item":False}) – When {“user”:True, “item”:True}, CAP priors are used (see BiVAE paper for details), otherwise the standard Normal is used as a Prior over the user and item latent variables.
name (string, optional, default: 'BiVAECF') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
seed (int, optional, default: None) – Random seed for parameters initialization.
use_gpu (boolean, optional, default: True) – If True and your system supports CUDA then training is performed on GPUs.
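As a quick check of how encoder_structure and k combine, the sketch below spells out the layer sizes given in the encoder_structure description above. The helper encoder_layer_sizes is hypothetical and only does the bookkeeping; it builds no network, and the user and item counts are arbitrary example values.

```python
def encoder_layer_sizes(num_users, num_items, encoder_structure, k):
    """Return (user_encoder, item_encoder) layer sizes.

    Following the description above, the user encoder takes an input of size
    num_items and the item encoder an input of size num_users; both end in k.
    """
    user_encoder = [num_items] + list(encoder_structure) + [k]
    item_encoder = [num_users] + list(encoder_structure) + [k]
    return user_encoder, item_encoder


user_enc, item_enc = encoder_layer_sizes(
    num_users=943, num_items=1682, encoder_structure=[20], k=10
)
print(user_enc)  # [1682, 20, 10]
print(item_enc)  # [943, 20, 10]
```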
References
Quoc-Tuan Truong, Aghiles Salah, Hady W. Lauw. “Bilateral Variational Autoencoder for Collaborative Filtering.” ACM International Conference on Web Search and Data Mining (WSDM). 2021.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
- static load(model_path, trainable=False)[source]#
Load model from the filesystem.
- Parameters:
model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.
trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.
- Returns:
self
- Return type:
- save(save_dir=None, save_trainset=True)[source]#
Save model to the filesystem.
- Parameters:
- Returns:
model_file – Path to the model file stored on the filesystem.
- Return type:
Causal Inference for Visual Debiasing in Visually-Aware Recommendation (CausalRec)#
- class cornac.models.causalrec.recom_causalrec.CausalRec(name='CausalRec', k=10, k2=10, n_epochs=50, batch_size=100, learning_rate=0.005, lambda_w=0.01, lambda_b=0.01, lambda_e=0.0, mean_feat=None, tanh=0, lambda_2=0.8, use_gpu=False, trainable=True, verbose=True, init_params=None, seed=None)[source]#
CausalRec: Causal Inference for Visual Debiasing in Visually-Aware Recommendation
- Parameters:
k (int, optional, default: 10) – The dimension of the gamma latent factors.
k2 (int, optional, default: 10) – The dimension of the theta latent factors.
n_epochs (int, optional, default: 50) – Maximum number of epochs for SGD.
batch_size (int, optional, default: 100) – The batch size for SGD.
learning_rate (float, optional, default: 0.005) – The learning rate for SGD.
lambda_w (float, optional, default: 0.01) – The regularization hyper-parameter for latent factor weights.
lambda_b (float, optional, default: 0.01) – The regularization hyper-parameter for biases.
lambda_e (float, optional, default: 0.0) – The regularization hyper-parameter for embedding matrix E and beta prime vector.
mean_feat (torch.tensor, required, default: None) – The mean feature of all item embeddings serving as the no-treatment during causal inference.
tanh (int, optional, default: 0) – The number of tanh layers on the visual feature transformation.
lambda_2 (float, optional, default: 0.8) – The coefficient controlling the elimination of the visual bias in Eq. (28).
use_gpu (boolean, optional, default: False) – Whether or not to use GPU to speed up training.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
verbose (boolean, optional, default: True) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘Bi’: beta_item, ‘Gu’: gamma_user, ‘Gi’: gamma_item, ‘Tu’: theta_user, ‘E’: emb_matrix, ‘Bp’: beta_prime}
seed (int, optional, default: None) – Random seed for weight initialization.
References
Qiu R., Wang S., Chen Z., Yin H., Huang Z. (2021). CausalRec: Causal Inference for Visual Debiasing in Visually-Aware Recommendation.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
Explainable Recommendation with Comparative Constraints on Product Aspects (ComparER)#
- class cornac.models.comparer.recom_comparer_sub.ComparERSub(name='ComparERSub', rating_scale=5.0, n_user_factors=8, n_item_factors=8, n_aspect_factors=8, n_opinion_factors=8, n_pair_samples=1000, n_bpr_samples=1000, n_element_samples=50, n_top_aspects=100, alpha=0.5, min_user_freq=2, min_pair_freq=1, min_common_freq=1, use_item_aspect_popularity=True, enum_window=None, lambda_reg=0.1, lambda_bpr=10, lambda_d=0.01, max_iter=200000, lr=0.5, n_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#
Explainable Recommendation with Comparative Constraints on Subjective Aspect-Level Quality
- Parameters:
name (string, optional, default: 'ComparERSub') – The name of the recommender model.
rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.
n_user_factors (int, optional, default: 8) – The dimension of the user latent factors.
n_item_factors (int, optional, default: 8) – The dimension of the item latent factors.
n_aspect_factors (int, optional, default: 8) – The dimension of the aspect latent factors.
n_opinion_factors (int, optional, default: 8) – The dimension of the opinion latent factors.
n_bpr_samples (int, optional, default: 1000) – The number of samples from all BPR pairs.
n_element_samples (int, optional, default: 50) – The number of samples from all ratings in each iteration.
n_top_aspects (int, optional, default: 100) – The number of top scored aspects for each (user, item) pair to construct ranking score.
alpha (float, optional, default: 0.5) – Trade-off factor for constructing ranking score.
lambda_reg (float, optional, default: 0.1) – The regularization parameter.
lambda_bpr (float, optional, default: 10.0) – The regularization parameter for BPR.
max_iter (int, optional, default: 200000) – Maximum number of iterations for training.
lr (float, optional, default: 0.5) – The learning rate for optimization.
n_threads (int, optional, default: 0) – Number of parallel threads for training. If n_threads=0, all CPU cores will be utilized. If seed is not None, n_threads=1 to remove randomness from parallelization.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, I, A, O, G1, G2, and G3 are not None).
verbose (boolean, optional, default: False) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘I’:I, ‘A’:A, ‘O’:O, ‘G1’:G1, ‘G2’:G2, ‘G3’:G3}
- U: ndarray, shape (n_users, n_user_factors)
The user latent factors, optional initialization via init_params
- I: ndarray, shape (n_items, n_item_factors)
The item latent factors, optional initialization via init_params
- A: ndarray, shape (num_aspects+1, n_aspect_factors)
The aspect latent factors, optional initialization via init_params
- O: ndarray, shape (num_opinions, n_opinion_factors)
The opinion latent factors, optional initialization via init_params
- G1: ndarray, shape (n_user_factors, n_item_factors, n_aspect_factors)
The core tensor for user, item, and aspect factors, optional initialization via init_params
- G2: ndarray, shape (n_user_factors, n_aspect_factors, n_opinion_factors)
The core tensor for user, aspect, and opinion factors, optional initialization via init_params
- G3: ndarray, shape (n_item_factors, n_aspect_factors, n_opinion_factors)
The core tensor for item, aspect, and opinion factors, optional initialization via init_params
seed (int, optional, default: None) – Random seed for parameters initialization.
References
Trung-Hoang Le and Hady W. Lauw. “Explainable Recommendation with Comparative Constraints on Product Aspects.”
ACM International Conference on Web Search and Data Mining (WSDM). 2021.
- fit(train_set, val_set=None)#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- rank(user_idx, item_indices=None, k=-1)#
Rank all test items for a given user.
- Parameters:
user_idx (int, required) – The index of the user for whom to perform item ranking.
item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked for the user. If None, list of ranked known item indices and their scores will be returned.
k (int, optional, default: -1) – Cut-off length for recommendations; k=-1 will return the ranked list of all items. The limit matters most for ANN search, which uses it to avoid exhaustive ranking.
- Returns:
(ranked_items, item_scores) – ranked_items contains item indices being ranked by their scores. item_scores contains scores of items corresponding to index in item_indices input.
- Return type:
- class cornac.models.comparer.recom_comparer_obj.ComparERObj(name='ComparERObj', model_type='Finer', num_explicit_factors=128, num_latent_factors=128, num_most_cared_aspects=100, rating_scale=5.0, alpha=0.9, lambda_x=1, lambda_y=1, lambda_u=0.01, lambda_h=0.01, lambda_v=0.01, lambda_d=0.01, use_item_aspect_popularity=True, min_user_freq=2, min_pair_freq=1, max_pair_freq=1000000000.0, min_common_freq=1, enum_window=None, use_item_pair_popularity=True, max_iter=1000, num_threads=0, early_stopping=None, trainable=True, verbose=False, init_params=None, seed=None)#
Explainable Recommendation with Comparative Constraints on Objective Aspect-Level Quality
- Parameters:
num_explicit_factors (int, optional, default: 128) – The dimension of the explicit factors.
num_latent_factors (int, optional, default: 128) – The dimension of the latent factors.
num_most_cared_aspects (int, optional, default: 100) – The number of most cared aspects for each user.
rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.
alpha (float, optional, default: 0.9) – Trade-off factor for constructing ranking score.
lambda_x (float, optional, default: 1) – The regularization parameter for user aspect attentions.
lambda_y (float, optional, default: 1) – The regularization parameter for item aspect qualities.
lambda_u (float, optional, default: 0.01) – The regularization parameter for user and item explicit factors.
lambda_h (float, optional, default: 0.01) – The regularization parameter for user and item latent factors.
lambda_v (float, optional, default: 0.01) – The regularization parameter for V.
use_item_aspect_popularity (boolean, optional, default: True) – When False, item aspect frequency is omitted from the item aspect quality computation formula. Specifically, \(Y_{ij} = 1 + \frac{N - 1}{1 + e^{-s_{ij}}}\) if \(p_i\) is reviewed on feature \(F_j\).
min_user_freq (int, optional, default: 2) – Apply constraint for user with minimum number of ratings, where min_user_freq = 2 means only apply constraints on users with at least 2 ratings.
min_pair_freq (int, optional, default: 1) – Apply constraint for the purchased pairs (earlier-later bought) with minimum number of pairs, where min_pair_freq = 2 means only apply constraints on pairs that appear at least twice.
max_pair_freq (int, optional, default: 1e9) – Apply constraint for the purchased pairs with frequency at most max_pair_freq, where max_pair_freq = 2 means only apply constraints on pairs that appear at most twice.
max_iter (int, optional, default: 1000) – Maximum number of iterations or the number of epochs.
name (string, optional, default: 'ComparERObj') – The name of the recommender model.
num_threads (int, optional, default: 0) – Number of parallel threads for training. If 0, all CPU cores will be utilized.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U1, U2, V, H1, and H2 are not None).
verbose (boolean, optional, default: False) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {'U1': U1, 'U2': U2, 'V': V, 'H1': H1, 'H2': H2}
- U1: ndarray, shape (n_users, n_explicit_factors)
The user explicit factors, optional initialization via init_params.
- U2: ndarray, shape (n_ratings, n_explicit_factors)
The item explicit factors, optional initialization via init_params.
- V: ndarray, shape (n_aspects, n_explicit_factors)
The aspect factors, optional initialization via init_params.
- H1: ndarray, shape (n_users, n_latent_factors)
The user latent factors, optional initialization via init_params.
- H2: ndarray, shape (n_ratings, n_latent_factors)
The item latent factors, optional initialization via init_params.
seed (int, optional, default: None) – Random seed for weight initialization.
References
Trung-Hoang Le and Hady W. Lauw. “Explainable Recommendation with Comparative Constraints on Product Aspects.”
ACM International Conference on Web Search and Data Mining (WSDM). 2021.
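The item aspect quality formula quoted under use_item_aspect_popularity can be evaluated directly. The sketch below is illustrative only: it assumes that \(N\) corresponds to rating_scale and that \(s_{ij}\) is an aggregated sentiment value for item i on aspect j; the function name item_aspect_quality is hypothetical and is not part of Cornac.

```python
import math


def item_aspect_quality(sentiment_sum, rating_scale=5.0):
    """Evaluate Y_ij = 1 + (N - 1) / (1 + exp(-s_ij)).

    Assumption: N is the rating scale and s_ij is the aggregated sentiment
    of item i on aspect j. The result lies strictly between 1 and N.
    """
    return 1.0 + (rating_scale - 1.0) / (1.0 + math.exp(-sentiment_sum))


# Negative, neutral, and positive aggregated sentiment.
for s in (-3.0, 0.0, 3.0):
    print(s, round(item_aspect_quality(s), 3))
```

A neutral sentiment of 0 lands at the midpoint of the scale, and stronger positive or negative sentiment pushes the quality toward N or 1 respectively.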
- fit(train_set, val_set=None)#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_params()#
Get model parameters as a dictionary containing the matrices U1, U2, V, H1, and H2.
- monitor_value()#
Calculating monitored value used for early stopping on validation set (val_set). This function will be called by early_stop() function.
- Returns:
res – Monitored value on validation set. Return None if val_set is None.
- Return type:
- rank(user_idx, item_indices=None, k=-1)#
Rank all test items for a given user.
- Parameters:
user_idx (int, required) – The index of the user for whom to perform item ranking.
item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked by the user. If None, the list of ranked known item indices and their scores will be returned.
k (int, optional, default: -1) – Cut-off length for recommendations; k=-1 returns the ranked list of all items. The cut-off matters most for ANN search, which uses it to avoid exhaustive ranking.
- Returns:
(ranked_items, item_scores) – ranked_items contains item indices being ranked by their scores. item_scores contains scores of items corresponding to index in item_indices input.
- Return type:
- score(user_id, item_id=None)#
Predict the scores/ratings of a user for an item.
- Parameters:
- Returns:
res – Relative scores that the user gives to the item or to all known items
- Return type:
A scalar or a Numpy array
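The contract of rank() described above (candidate filtering via item_indices, cut-off via k, and the two returned lists) can be pictured with a small stand-alone sketch. This is not Cornac's implementation: rank_items is a hypothetical helper that works on a plain Python list of scores and only mirrors the documented behaviour.

```python
def rank_items(scores, item_indices=None, k=-1):
    """Return (ranked_items, item_scores) for one user's list of item scores.

    Hypothetical helper mirroring the documented rank() contract:
    - item_indices=None ranks all known items;
    - k=-1 keeps the full ranking, otherwise only the top k items;
    - item_scores follows the order of the candidate indices, not the ranking.
    """
    candidates = list(range(len(scores))) if item_indices is None else list(item_indices)
    item_scores = [scores[i] for i in candidates]
    ranked_items = sorted(candidates, key=lambda i: scores[i], reverse=True)
    if k != -1:
        ranked_items = ranked_items[:k]
    return ranked_items, item_scores


scores = [0.1, 0.9, 0.5]
print(rank_items(scores))                       # all items, full ranking
print(rank_items(scores, k=2))                  # top-2 only
print(rank_items(scores, item_indices=[0, 2]))  # restricted candidate set
```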
Adversarial Training Towards Robust Multimedia Recommender System (AMR)#
- class cornac.models.amr.recom_amr.AMR(name='AMR', k=10, k2=10, n_epochs=50, batch_size=100, learning_rate=0.005, lambda_w=0.01, lambda_b=0.01, lambda_e=0.0, lambda_adv=1.0, use_gpu=False, trainable=True, verbose=True, init_params=None, seed=None)[source]#
Adversarial Training Towards Robust Multimedia Recommender System.
- Parameters:
k (int, optional, default: 10) – The dimension of the gamma latent factors.
k2 (int, optional, default: 10) – The dimension of the theta latent factors.
n_epochs (int, optional, default: 50) – Maximum number of epochs for SGD.
batch_size (int, optional, default: 100) – The batch size for SGD.
learning_rate (float, optional, default: 0.005) – The learning rate for SGD.
lambda_w (float, optional, default: 0.01) – The regularization hyper-parameter for latent factor weights.
lambda_b (float, optional, default: 0.01) – The regularization hyper-parameter for biases.
lambda_e (float, optional, default: 0.0) – The regularization hyper-parameter for embedding matrix E and beta prime vector.
lambda_adv (float, optional, default: 1.0) – The regularization hyper-parameter in Eq. (8) and (10) for the adversarial sample loss.
use_gpu (boolean, optional, default: False) – Whether or not to use GPU to speed up training.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
verbose (boolean, optional, default: True) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {'Bi': beta_item, 'Gu': gamma_user, 'Gi': gamma_item, 'Tu': theta_user, 'E': emb_matrix, 'Bp': beta_prime}
seed (int, optional, default: None) – Random seed for weight initialization.
References
Tang, J., Du, X., He, X., Yuan, F., Tian, Q., and Chua, T. (2020). Adversarial Training Towards Robust Multimedia Recommender System.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
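The three methods above expose what an ANN index needs: item vectors to index, user vectors to query with, and the measure (here the dot product) used to compare them. As a rough illustration of what such an index approximates, the sketch below performs the exact, exhaustive version of that search in plain Python. The helper names dot and top_k_by_dot are hypothetical and not part of Cornac.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k_by_dot(user_vector, item_vectors, k):
    """Exhaustively score every item by dot product and return the top-k indices.

    An ANN index built from the item vectors would approximate this result
    without scoring every item.
    """
    scores = [dot(user_vector, item) for item in item_vectors]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k], scores


user = [1.0, 0.5]                              # one row of the user-vector matrix
items = [[0.2, 0.1], [1.0, 0.0], [0.0, 3.0]]   # rows of the item-vector matrix
top_items, all_scores = top_k_by_dot(user, items, k=2)
print(top_items, all_scores)
```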
Hybrid neural recommendation with joint deep representation learning of ratings and reviews (HRDR)#
- class cornac.models.hrdr.recom_hrdr.HRDR(name='HRDR', embedding_size=100, id_embedding_size=32, n_factors=32, attention_size=16, kernel_sizes=[3], n_filters=64, n_user_mlp_factors=128, n_item_mlp_factors=128, dropout_rate=0.5, max_text_length=50, max_num_review=32, batch_size=64, max_iter=20, optimizer='adam', learning_rate=0.001, model_selection='last', user_based=True, trainable=True, verbose=True, init_params=None, seed=None)[source]#
- Parameters:
name (string, default: 'HRDR') – The name of the recommender model.
embedding_size (int, default: 100) – Word embedding size
n_factors (int, default: 32) – The dimension of the user/item’s latent factors.
attention_size (int, default: 16) – Attention size
kernel_sizes (list, default: [3]) – List of kernel sizes of conv2d
n_filters (int, default: 64) – Number of filters
n_user_mlp_factors (int, default: 128) – Number of latent dimensions of the first layer of a 3-layer MLP, followed by batch normalization, on the user net to represent user rating.
n_item_mlp_factors (int, default: 128) – Number of latent dimensions of the first layer of a 3-layer MLP, followed by batch normalization, on the item net to represent item rating.
dropout_rate (float, default: 0.5) – Dropout rate of neural network dense layers
max_text_length (int, default: 50) – Maximum number of tokens in a review instance
max_num_review (int, default: 32) – Maximum number of reviews that you want to feed into training. By default, the model will be trained with all reviews.
batch_size (int, default: 64) – Batch size
max_iter (int, default: 20) – Max number of training epochs
optimizer (string, optional, default: 'adam') – Optimizer for training is either ‘adam’ or ‘rmsprop’.
learning_rate (float, optional, default: 0.001) – Initial value of learning rate for the optimizer.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.
verbose (boolean, optional, default: True) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters; pretrained_word_embeddings can be initialized here, e.g., init_params={'pretrained_word_embeddings': pretrained_word_embeddings}
seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
Liu, H., Wang, Y., Peng, Q., Wu, F., Gan, L., Pan, L., & Jiao, P. (2020). Hybrid neural recommendation with joint deep representation learning of ratings and reviews. Neurocomputing, 374, 77-85.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
- static load(model_path, trainable=False)[source]#
Load a recommender model from the filesystem.
- Parameters:
model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.
trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.
- Returns:
self
- Return type:
- save(save_dir=None)[source]#
Save a recommender model to the filesystem.
- Parameters:
save_dir (str, default: None) – Path to a directory for the model to be stored.
Hypergraphs with Attention on Reviews for Explainable Recommendation#
- class cornac.models.hypar.recom_hypar.HypAR(name='HypAR', use_cuda=False, stemming=True, batch_size=128, num_workers=0, num_epochs=10, early_stopping=10, eval_interval=1, learning_rate=0.1, weight_decay=0, node_dim=64, num_heads=3, fanout=5, non_linear=True, model_selection='best', objective='ranking', review_aggregator='narre', predictor='narre', preference_module='lightgcn', combiner='add', graph_type='aos', num_neg_samples=50, layer_dropout=None, attention_dropout=0.2, user_based=True, verbose=True, index=0, out_path=None, learn_explainability=False, learn_method='transr', learn_weight=1.0, embedding_type='ao_embeddings', debug=False)[source]#
HypAR: Hypergraph with Attention on Review. This model is from the paper “Hypergraph with Attention on Reviews for explainable recommendation”, by Theis E. Jendal, Trung-Hoang Le, Hady W. Lauw, Matteo Lissandrini, Peter Dolog, and Katja Hose. ECIR 2024: https://doi.org/10.1007/978-3-031-56027-9_14
- Parameters:
name (str, default: 'HypAR') – Name of the model.
use_cuda (bool, default: False) – Whether to use cuda.
stemming (bool, default: True) – Whether to use stemming.
batch_size (int, default: 128) – Batch size.
num_workers (int, default: 0) – Number of workers for dataloader.
num_epochs (int, default: 10) – Number of epochs.
early_stopping (int, default: 10) – Early stopping.
eval_interval (int, default: 1) – Evaluation interval, i.e., how often to evaluate on the validation set.
learning_rate (float, default: 0.1) – Learning rate.
weight_decay (float, default: 0) – Weight decay.
node_dim (int, default: 64) – Dimension of learned and hidden layers.
num_heads (int, default: 3) – Number of attention heads.
fanout (int, default: 5) – Fanout for sampling.
non_linear (bool, default: True) – Whether to use non-linear activation function.
model_selection (str, default: 'best') – Model selection method, i.e., whether to use the best model or the last model.
objective (str, default: 'ranking') – Objective, i.e., whether to use ranking or rating.
review_aggregator (str, default: 'narre') – Review aggregator, i.e., how to aggregate reviews.
predictor (str, default: 'narre') – Predictor, i.e., how to predict ratings.
preference_module (str, default: 'lightgcn') – Preference module, i.e., how to model preferences.
combiner (str, default: 'add') – Combiner, i.e., how to combine embeddings.
graph_type (str, default: 'aos') – Graph type, i.e., which nodes to include in hypergraph. Aspects, opinions and sentiment.
num_neg_samples (int, default: 50) – Number of negative samples to use for ranking.
layer_dropout (float, default: None) – Dropout for node and review embeddings.
attention_dropout (float, default: .2) – Dropout for attention.
user_based (bool, default: True) – Whether to use user-based or item-based.
verbose (bool, default: True) – Whether to print information.
index (int, default: 0) – Index for saving results, e.g., when running hyperparameter tuning.
out_path (str, default: None) – Path to save graphs, embeddings and similar.
learn_explainability (bool, default: False) – Whether to learn explainability.
learn_method (str, default: 'transr') – Learning method, i.e., which method to use for explainability learning.
learn_weight (float, default: 1.) – Weight for explainability learning loss.
embedding_type (str, default: 'ao_embeddings') – Type of embeddings to use, i.e., whether to use prelearned embeddings or not.
debug (bool, default: False) – Whether to use debug mode as errors might be thrown by dataloaders when debugging.
- fit(train_set: Dataset, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- load(model_path, trainable=False)[source]#
Load a recommender model from the filesystem.
- Parameters:
model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.
trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.
- Returns:
self
- Return type:
- monitor_value(train_set, val_set=None)[source]#
Calculating monitored value used for early stopping on validation set (val_set). This function will be called by early_stop() function. Note: val_set could be None thus it needs to be checked before usage.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Return type:
raise NotImplementedError
- save(save_dir=None, save_trainset=False)[source]#
Save a recommender model to the filesystem.
- Parameters:
save_dir (str, default: None) – Path to a directory for the model to be stored.
save_trainset (bool, default: False) – Save train_set together with the model. This is useful if we want to deploy model later because train_set is required for certain evaluation steps.
metadata (dict, default: None) – Metadata to be saved with the model. This is useful to store model details.
- Returns:
model_file – Path to the model file stored on the filesystem.
- Return type:
Simplifying and Powering Graph Convolution Network for Recommendation (LightGCN)#
- class cornac.models.lightgcn.recom_lightgcn.LightGCN(name='LightGCN', emb_size=64, num_epochs=1000, learning_rate=0.001, batch_size=1024, num_layers=3, early_stopping=None, lambda_reg=0.0001, trainable=True, verbose=False, seed=2020)[source]#
- Parameters:
name (string, default: 'LightGCN') – The name of the recommender model.
emb_size (int, default: 64) – Size of the node embeddings.
num_epochs (int, default: 1000) – Maximum number of iterations or the number of epochs.
learning_rate (float, default: 0.001) – The learning rate that determines the step size at each iteration
batch_size (int, default: 1024) – Mini-batch size used for train set
num_layers (int, default: 3) – Number of LightGCN Layers
early_stopping ({min_delta: float, patience: int}, optional, default: None) –
If None, no early stopping. Meaning of the arguments:
- min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.
- patience: number of epochs with no improvement after which training should be stopped.
lambda_reg (float, default: 1e-4) – Weight decay for the L2 regularization.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
seed (int, optional, default: 2020) – Random seed for parameters initialization.
References
He, X., Deng, K., Wang, X., Li, Y., Zhang, Y., & Wang, M. (2020). LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation.
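The early_stopping rule described in the parameters (min_delta and patience) can be traced on a list of per-epoch monitored values. The sketch below follows only the wording of that description; it is an assumption about the behaviour, not a copy of Cornac's early_stop(), and the function name early_stop_epoch is hypothetical.

```python
def early_stop_epoch(monitored_values, min_delta=0.0, patience=0):
    """Return the 1-based epoch at which training would stop, or None.

    An epoch counts as an improvement when the monitored value exceeds the
    best value so far by at least min_delta. Training stops once `patience`
    consecutive epochs pass without improvement.
    """
    best = float("-inf")
    wait = 0
    for epoch, value in enumerate(monitored_values, start=1):
        if value - best >= min_delta:
            best = value  # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


values = [0.10, 0.20, 0.205, 0.204, 0.21]
print(early_stop_epoch(values, min_delta=0.01, patience=2))
```

In this trace the third and fourth epochs improve on the best value (0.20) by less than min_delta, so with patience=2 the run stops before the fifth epoch is reached.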
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
- monitor_value(train_set, val_set)[source]#
Calculating monitored value used for early stopping on validation set (val_set). This function will be called by early_stop() function.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
res – Monitored value on validation set. Return None if val_set is None.
- Return type:
New Variational Autoencoder for Top-N Recommendations with Implicit Feedback (RecVAE)#
- class cornac.models.recvae.recom_recvae.RecVAE(name='RecVae', hidden_dim=600, latent_dim=200, batch_size=500, beta=None, gamma=0.005, lr=0.0005, n_epochs=50, n_enc_epochs=3, n_dec_epochs=1, not_alternating=False, trainable=True, verbose=False, seed=None, use_gpu=True)[source]#
RecVAE, a recommender system based on a Variational Autoencoder.
- Parameters:
name (str, optional, default: 'RecVae') – Name of the recommender model.
hidden_dim (int, optional, default: 600) – Dimension of the hidden layer in the VAE architecture.
latent_dim (int, optional, default: 200) – Dimension of the latent layer in the VAE architecture.
batch_size (int, optional, default: 500) – Size of the batches used during training.
beta (float, optional, default: None) – Weighting factor for the KL divergence term in the VAE loss function.
gamma (float, optional, default: 0.005) – Weighting factor for the regularization term in the loss function.
lr (float, optional, default: 5e-4) – Learning rate for the optimizer.
n_epochs (int, optional, default: 50) – Number of epochs to train the model.
n_enc_epochs (int, optional, default: 3) – Number of epochs to train the encoder part of VAE.
n_dec_epochs (int, optional, default: 1) – Number of epochs to train the decoder part of VAE.
not_alternating (boolean, optional, default: False) – If True, the model training will not alternate between encoder and decoder.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
seed (int, optional, default: None) – Random seed for weight initialization and training reproducibility.
use_gpu (boolean, optional, default: True) – When True, training utilizes GPU if available.
References
RecVAE GitHub Repository: ilya-shenbin/RecVAE
Paper Link: https://arxiv.org/abs/1912.11160
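The interplay of n_epochs, n_enc_epochs, n_dec_epochs, and not_alternating can be read as a training schedule. The sketch below is inferred from the parameter descriptions alone and is an assumption about the ordering (encoder passes first, then decoder passes, within each epoch); the function training_schedule and the step labels are hypothetical.

```python
def training_schedule(n_epochs, n_enc_epochs=3, n_dec_epochs=1, not_alternating=False):
    """List the update passes implied by the RecVAE training parameters.

    Assumption: each outer epoch runs n_enc_epochs encoder passes followed by
    n_dec_epochs decoder passes, unless not_alternating is True, in which case
    encoder and decoder are updated together in a single joint pass.
    """
    steps = []
    for _ in range(n_epochs):
        if not_alternating:
            steps.append("joint")
        else:
            steps.extend(["encoder"] * n_enc_epochs)
            steps.extend(["decoder"] * n_dec_epochs)
    return steps


print(training_schedule(2))
print(training_schedule(2, not_alternating=True))
```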
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
Predicting Temporal Sets with Deep Neural Networks (DNNTSP)#
- class cornac.models.dnntsp.recom_dnntsp.DNNTSP(name='DNNTSP', emb_dim=32, loss_type='bpr', optimizer='adam', lr=0.001, weight_decay=0, n_epochs=100, batch_size=64, device='cpu', trainable=True, verbose=False, seed=None)[source]#
Deep Neural Network for Temporal Sets Prediction (DNNTSP).
- Parameters:
name (string, default: 'DNNTSP') – The name of the recommender model.
emb_dim (int, optional, default: 32) – Number of hidden factors
loss_type (string, optional, default: "bpr") – Loss type. Options: "bpr" (BPRLoss), "mse" (MSELoss), "weight_mse" (WeightMSELoss), "multi_label_soft_margin" (MultiLabelSoftMarginLoss).
optimizer (string, optional, default: "adam") – Optimizer
lr (float, optional, default: 0.001) – Learning rate
weight_decay (float, optional, default: 0) – Weight decay for adaptive optimizer
n_epochs (int, optional, default: 100) – Number of epochs
batch_size (int, optional, default: 64) – Batch size
device (string, optional, default: "cpu") – Device for learning and evaluation. Defaults to CPU; use "cuda:0" to run on a GPU.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
seed (int, optional, default: None) – Random seed
References
Le Yu, Leilei Sun, Bowen Du, Chuanren Liu, Hui Xiong, and Weifeng Lv. 2020. Predicting Temporal Sets with Deep Neural Networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ‘20). Association for Computing Machinery, New York, NY, USA, 1083–1091. https://doi.org/10.1145/3394486.3403152
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- score(user_idx, history_baskets, **kwargs)[source]#
Predict the scores for all items based on input history baskets
- Parameters:
history_baskets (list of lists) – The list of history baskets in sequential manner for next-basket prediction.
- Returns:
res – Relative scores of all known items
- Return type:
a Numpy array
Recency Aware Collaborative Filtering for Next Basket Recommendation (UPCF)#
- class cornac.models.upcf.recom_upcf.UPCF(name='UPCF', recency=1, locality=1, asymmetry=0.25, verbose=False)[source]#
User Popularity-based CF (UPCF)
- Parameters:
name (string, default: 'UPCF') – The name of the recommender model.
recency (int, optional, default: 1) – The size of recency window. If 0, all baskets will be used.
locality (int, optional, default: 1) – The strength with which similarity between two items within a basket is enforced.
asymmetry (float, optional, default: 0.25) – Trade-off parameter that balances the importance of the probability of having item i given j against the probability of having item j given i. This value will be computed via similaripy.asymmetric_cosine.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
References
Guglielmo Faggioli, Mirko Polato, and Fabio Aiolli. 2020. Recency Aware Collaborative Filtering for Next Basket Recommendation. In Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization (UMAP ‘20). Association for Computing Machinery, New York, NY, USA, 80–87. https://doi.org/10.1145/3340631.3394850
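The asymmetry parameter can be pictured through the asymmetric cosine, written here as a weighted geometric mean of the two conditional probabilities mentioned in its description. This is a set-based sketch under stated assumptions: which of the two probabilities receives the exponent alpha is a convention chosen for illustration, and the function below is hypothetical rather than the similaripy routine.

```python
def asymmetric_cosine(owners_i, owners_j, alpha=0.25):
    """Asymmetric cosine between two items, given the sets that contain each.

    Computes P(i | j) ** alpha * P(j | i) ** (1 - alpha), where
    P(i | j) = |owners_i & owners_j| / |owners_j|.
    Assumption: alpha weights P(i | j); the opposite convention also exists.
    """
    common = len(owners_i & owners_j)
    if common == 0:
        return 0.0
    p_i_given_j = common / len(owners_j)
    p_j_given_i = common / len(owners_i)
    return (p_i_given_j ** alpha) * (p_j_given_i ** (1.0 - alpha))


item_i = {1, 2, 3, 4}  # e.g. identifiers of the baskets or users containing item i
item_j = {1, 2}
for a in (0.0, 0.25, 0.5, 1.0):
    print(a, asymmetric_cosine(item_i, item_j, alpha=a))
```

With alpha = 0.5 the expression reduces to the ordinary (symmetric) cosine similarity between the two sets; moving alpha toward 0 or 1 shifts the weight to one conditional probability.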
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- score(user_idx, history_baskets, **kwargs)[source]#
Predict the scores for all items based on input history baskets
- Parameters:
history_baskets (list of lists) – The list of history baskets in sequential manner for next-basket prediction.
- Returns:
res – Relative scores of all known items
- Return type:
a Numpy array
Temporal-Item-Frequency-based User-KNN (TIFUKNN)#
- class cornac.models.tifuknn.recom_tifuknn.TIFUKNN(name='TIFUKNN', n_neighbors=300, within_decay_rate=0.9, group_decay_rate=0.7, alpha=0.7, n_groups=7, verbose=False)[source]#
Temporal-Item-Frequency-based User-KNN (TIFUKNN)
- Parameters:
name (string, default: 'TIFUKNN') – The name of the recommender model.
n_neighbors (int, optional, default: 300) – The number of neighbors for KNN
within_decay_rate (float, optional, default: 0.9) – Within-basket time-decayed ratio in range [0, 1]
group_decay_rate (float, optional, default: 0.7) – Group time-decayed ratio in range [0, 1]
alpha (float, optional, default: 0.7) – The trade-off between current user vector and neighbors vectors to compute final item scores
n_groups (int, optional, default: 7) – The historical baskets will be partitioned into n_groups groups of equal size.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
References
Haoji Hu, Xiangnan He, Jinyang Gao, and Zhi-Li Zhang. 2020. Modeling Personalized Item Frequency Information for Next-basket Recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ‘20). Association for Computing Machinery, New York, NY, USA, 1071–1080. https://doi.org/10.1145/3397271.3401066
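The roles of a decay rate and of alpha can be illustrated with a simplified sketch. It applies a single level of time decay over a user's baskets and then blends the result with a neighbor vector; the grouping step controlled by n_groups and group_decay_rate is omitted. Both function names are hypothetical, and the exact weighting is an assumption rather than Cornac's code.

```python
def decayed_item_vector(baskets, n_items, decay_rate):
    """Time-decayed item frequency vector over a chronological list of baskets.

    The most recent basket has weight 1, the one before it decay_rate,
    then decay_rate ** 2, and so on; the sum is averaged over the baskets.
    """
    k = len(baskets)
    vector = [0.0] * n_items
    if k == 0:
        return vector
    for position, basket in enumerate(baskets):
        weight = decay_rate ** (k - 1 - position)
        for item in basket:
            vector[item] += weight
    return [v / k for v in vector]


def blend(user_vector, neighbor_vector, alpha=0.7):
    """Convex combination of the user's own vector and the neighbors' vector."""
    return [alpha * u + (1.0 - alpha) * n for u, n in zip(user_vector, neighbor_vector)]


history = [[0], [1], [0, 1]]  # oldest basket first, items given as indices
user_vec = decayed_item_vector(history, n_items=2, decay_rate=0.5)
print(user_vec)
print(blend(user_vec, [1.0, 0.0], alpha=0.7))
```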
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- score(user_idx, history_baskets, **kwargs)[source]#
Predict the scores for all items based on input history baskets
- Parameters:
history_baskets (list of lists) – The list of history baskets in sequential manner for next-basket prediction.
- Returns:
res – Relative scores of all known items
- Return type:
a Numpy array
Correlation-Sensitive Next-Basket Recommendation (Beacon)#
- class cornac.models.beacon.recom_beacon.Beacon(name='Beacon', emb_dim=2, rnn_unit=4, alpha=0.5, rnn_cell_type='LSTM', dropout_rate=0.5, nb_hop=1, max_seq_length=None, n_epochs=15, batch_size=32, lr=0.001, trainable=True, verbose=False, seed=None)[source]#
Correlation-Sensitive Next-Basket Recommendation
- Parameters:
name (string, default: 'Beacon') – The name of the recommender model.
emb_dim (int, optional, default: 2) – Embedding dimension
rnn_unit (int, optional, default: 4) – Number of dimensions in an RNN unit.
alpha (float, optional, default: 0.5) – Hyperparameter to control the balance between correlative and sequential associations.
rnn_cell_type (str, optional, default: 'LSTM') – RNN cell type; options are ['LSTM', 'GRU', None]. If None, BasicRNNCell will be used.
dropout_rate (float, optional, default: 0.5) – Dropout rate of neural network dense layers
nb_hop (int, optional, default: 1) – Number of hops for constructing correlation matrix. If 0, a zero matrix will be used.
max_seq_length (int, optional, default: None) – Maximum basket sequence length. If None, it is the maximum number of baskets in the training sequences.
n_epochs (int, optional, default: 15) – Number of training epochs
batch_size (int, optional, default: 32) – Batch size
lr (float, optional, default: 0.001) – Initial value of learning rate for the optimizer.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
seed (int, optional, default: None) – Random seed
References
LE, Duc Trong, Hady Wirawan LAUW, and Yuan Fang. Correlation-sensitive next-basket recommendation. International Joint Conferences on Artificial Intelligence, 2019.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- score(user_idx, history_baskets, **kwargs)[source]#
Predict the scores for all items based on input history baskets
- Parameters:
history_baskets (list of lists) – The list of history baskets in sequential manner for next-basket prediction.
- Returns:
res – Relative scores of all known items
- Return type:
a Numpy array
Embarrassingly Shallow Autoencoders for Sparse Data (EASEᴿ)#
- class cornac.models.ease.recom_ease.EASE(name='EASEᴿ', lamb=500, posB=True, trainable=True, verbose=True, seed=None, B=None, U=None)[source]#
Embarrassingly Shallow Autoencoders for Sparse Data.
- Parameters:
name (string, optional, default: 'EASEᴿ') – The name of the recommender model.
lamb (float, optional, default: 500) – L2-norm regularization-parameter λ ∈ R+.
posB (boolean, optional, default: True) – When True, negative weights are removed.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already trained.
verbose (boolean, optional, default: True) – When True, some running logs are displayed.
seed (int, optional, default: None) – Random seed for parameters initialization.
References
Steck, H. (2019, May). “Embarrassingly shallow autoencoders for sparse data.” In The World Wide Web Conference (pp. 3251-3257).
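The closed-form solution from the referenced paper shows how lamb and posB enter the model. The sketch below is a minimal pure-Python version for tiny matrices, assuming that posB clamps negative weights to zero; the helper names invert and ease_weights are hypothetical, and a practical implementation would rely on a numerical library instead of the hand-written inversion used here.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ease_weights(X, lamb=500.0, pos_b=True):
    """Item-item weights B with zero diagonal from a user-item matrix X.

    Steps: P = (X^T X + lamb * I)^-1, then B[i][j] = -P[i][j] / P[j][j] for
    i != j and B[j][j] = 0. Assumption: pos_b sets negative weights to zero.
    """
    n_items = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) for j in range(n_items)]
            for i in range(n_items)]
    for i in range(n_items):
        gram[i][i] += lamb  # L2 regularization on the diagonal
    P = invert(gram)
    B = [[0.0 if i == j else -P[i][j] / P[j][j] for j in range(n_items)]
         for i in range(n_items)]
    if pos_b:
        B = [[max(v, 0.0) for v in row] for row in B]
    return B


X = [[1, 1], [1, 0]]  # two users, two items (implicit feedback)
print(ease_weights(X, lamb=1.0))
```

Scores for a user are then the product of that user's row of X with B, so a larger lamb shrinks all item-item weights toward zero.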
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
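The three methods above expose item vectors, user vectors, and a dot-product measure. The exact search that an ANN index approximates under that measure can be sketched as a brute-force top-k lookup. The function name top_k_by_dot and the toy vectors are hypothetical; in practice the matrices would come from get_item_vectors() and get_user_vectors().

```python
import numpy as np

def top_k_by_dot(user_vector, item_vectors, k=2):
    """Exact top-k items by inner product; an ANN index approximates this."""
    scores = item_vectors @ user_vector
    top = np.argsort(-scores)[:k]  # indices of the k largest scores
    return top.tolist(), scores[top].tolist()

# Hypothetical stand-ins for the model's item and user vectors.
item_vectors = np.array([[1.0, 0.0],
                         [0.0, 1.0],
                         [2.0, 2.0]])
user_vector = np.array([1.0, 3.0])
top_items, top_scores = top_k_by_dot(user_vector, item_vectors, k=2)
```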
Neural Graph Collaborative Filtering (NGCF)#
- class cornac.models.ngcf.recom_ngcf.NGCF(name='NGCF', emb_size=64, layer_sizes=[64, 64, 64], dropout_rates=[0.1, 0.1, 0.1], num_epochs=1000, learning_rate=0.001, batch_size=1024, early_stopping=None, lambda_reg=0.0001, trainable=True, verbose=False, seed=2020)[source]#
Neural Graph Collaborative Filtering
- Parameters:
name (string, default: 'NGCF') – The name of the recommender model.
emb_size (int, default: 64) – Size of the node embeddings.
layer_sizes (list, default: [64, 64, 64]) – Size of the output of convolution layers.
dropout_rates (list, default: [0.1, 0.1, 0.1]) – Dropout rate for each of the convolution layers. The number of values should be the same as in ‘layer_sizes’.
num_epochs (int, default: 1000) – Maximum number of iterations or the number of epochs.
learning_rate (float, default: 0.001) – The learning rate that determines the step size at each iteration.
batch_size (int, default: 1024) – Mini-batch size used for training.
early_stopping ({min_delta: float, patience: int}, optional, default: None) –
If None, no early stopping. Meaning of the arguments:
- min_delta: the minimum increase in the monitored value on the validation set to be considered an improvement, i.e. an increment of less than min_delta will count as no improvement.
- patience: number of epochs with no improvement after which training should be stopped.
lambda_reg (float, default: 1e-4) – Weight decay for the L2 regularization.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
seed (int, optional, default: 2020) – Random seed for parameters initialization.
References
Wang, Xiang, et al. “Neural graph collaborative filtering.” Proceedings of the 42nd international ACM SIGIR conference on Research and development in Information Retrieval. 2019.
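The interaction between min_delta and patience in early_stopping can be sketched in plain Python. This is an illustration of the rule as described in the parameter list, not Cornac's own early_stop() code. The function name early_stop_epoch is hypothetical, and treating a gain of exactly min_delta as no improvement is an assumption.

```python
def early_stop_epoch(monitored_values, min_delta=0.0, patience=0):
    """Return the epoch index at which training would stop, or None."""
    best = float("-inf")
    wait = 0
    for epoch, value in enumerate(monitored_values):
        if value - best > min_delta:
            best, wait = value, 0   # improvement: remember it and reset the counter
        else:
            wait += 1               # no improvement in this epoch
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation values, one per epoch (higher is better).
history = [0.10, 0.20, 0.205, 0.204, 0.50]
stopped_at = early_stop_epoch(history, min_delta=0.01, patience=2)
never_stops = early_stop_epoch([0.1, 0.2, 0.3, 0.4], min_delta=0.01, patience=2)
```

In the first run, epochs 2 and 3 both gain less than min_delta over the best value of 0.20, so training stops at epoch 3 and never sees the later jump to 0.50. This is the trade-off that a larger patience is meant to soften.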
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
- monitor_value(train_set, val_set)[source]#
Calculating monitored value used for early stopping on validation set (val_set). This function will be called by early_stop() function.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
res – Monitored value on validation set. Return None if val_set is None.
- Return type:
Collaborative Context Poisson Factorization (C2PF)#
- class cornac.models.c2pf.recom_c2pf.C2PF(k=100, max_iter=100, variant='c2pf', name=None, trainable=True, verbose=False, init_params=None)[source]#
Collaborative Context Poisson Factorization.
- Parameters:
k (int, optional, default: 100) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations for variational C2PF.
variant (string, optional, default: 'c2pf') – The C2PF variant: ‘c2pf’, ‘tc2pf’ (tied-c2pf) or ‘rc2pf’ (reduced-c2pf). Please refer to the original paper for details.
name (string, optional, default: None) – The name of the recommender model. If None, then “variant” is used as the default name of the model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (Theta, Beta and Xi are not None).
Item_context – See "cornac/examples/c2pf_example.py" in the GitHub repo for an example of how to use Cornac's graph modality to load and provide "item context" for C2PF.
init_params (dict, optional, default: None) –
Dictionary of initial parameters, e.g., init_params = {‘G_s’:G_s, ‘G_r’:G_r, ‘L_s’:L_s, ‘L_r’:L_r, ‘L2_s’:L2_s, ‘L2_r’:L2_r, ‘L3_s’:L3_s, ‘L3_r’: L3_r}
- Theta: ndarray, shape (n_users, k)
The expected user latent factors.
- Beta: ndarray, shape (n_items, k)
The expected item latent factors.
- Xi: ndarray, shape (n_items, k)
The expected context item latent factors multiplied by context effects Kappa.
- G_s: ndarray, shape (n_users, k)
Represent the “shape” parameters of Gamma distribution over Theta.
- G_r: ndarray, shape (n_users, k)
Represent the “rate” parameters of Gamma distribution over Theta.
- L_s: ndarray, shape (n_items, k)
Represent the “shape” parameters of Gamma distribution over Beta.
- L_r: ndarray, shape (n_items, k)
Represent the “rate” parameters of Gamma distribution over Beta.
- L2_s: ndarray, shape (n_items, k)
Represent the “shape” parameters of Gamma distribution over Xi.
- L2_r: ndarray, shape (n_items, k)
Represent the “rate” parameters of Gamma distribution over Xi.
- L3_s: ndarray
Represent the “shape” parameters of Gamma distribution over Kappa.
- L3_r: ndarray
Represent the “rate” parameters of Gamma distribution over Kappa.
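As a rough illustration of the shapes listed above, the following sketch builds an init_params dictionary from random positive arrays and derives expected factors from them. The mean of a Gamma distribution with a given shape and rate is shape / rate, which is a standard fact. That Cornac derives Theta and Beta in exactly this way is an assumption. The sizes and the helper positive_array are hypothetical, and L3_s / L3_r are omitted because their shapes are not given here.

```python
import numpy as np

rng = np.random.default_rng(42)
n_users, n_items, k = 4, 6, 3  # hypothetical sizes

def positive_array(shape):
    """Strictly positive values, as Gamma shape/rate parameters must be."""
    return rng.uniform(0.5, 1.5, size=shape)

init_params = {
    "G_s": positive_array((n_users, k)), "G_r": positive_array((n_users, k)),
    "L_s": positive_array((n_items, k)), "L_r": positive_array((n_items, k)),
    "L2_s": positive_array((n_items, k)), "L2_r": positive_array((n_items, k)),
}

# Expected value of Gamma(shape, rate) is shape / rate.
theta_mean = init_params["G_s"] / init_params["G_r"]  # expected user factors
beta_mean = init_params["L_s"] / init_params["L_r"]   # expected item factors
```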
References
Salah, Aghiles, and Hady W. Lauw. A Bayesian Latent Variable Model of User Preferences with Item Context. In IJCAI, pp. 2667-2674. 2018.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
Graph Convolutional Matrix Completion (GCMC)#
Main class for GCMC recommender model
- class cornac.models.gcmc.recom_gcmc.GCMC(name='GCMC', max_iter=2000, learning_rate=0.01, optimizer='adam', activation_func='leaky_relu', gcn_agg_units=500, gcn_out_units=75, gcn_dropout=0.7, gcn_agg_accum='stack', share_param=False, gen_r_num_basis_func=2, train_grad_clip=1.0, train_valid_interval=1, train_early_stopping_patience=100, train_min_learning_rate=0.001, train_decay_patience=50, train_lr_decay_factor=0.5, trainable=True, verbose=False, seed=None)[source]#
Graph Convolutional Matrix Completion (GCMC)
- Parameters:
name (string, default: 'GCMC') – The name of the recommender model.
max_iter (int, default: 2000) – Maximum number of iterations or the number of epochs for SGD
learning_rate (float, default: 0.01) – The learning rate for SGD
optimizer (string, default: 'adam') – The optimization method used for training. Supported values: [‘adam’, ‘sgd’]
activation_func (string, default: 'leaky_relu') – The activation function used in the GCMC model. Supported values: [‘leaky’, ‘linear’, ‘sigmoid’, ‘relu’, ‘tanh’]
gcn_agg_units (int, default: 500) – The number of units in the graph convolutional layers
gcn_out_units (int, default: 75) – The number of units in the output layer
gcn_dropout (float, default: 0.7) – The dropout rate for the graph convolutional layers
gcn_agg_accum (string, default:'stack') – The graph convolutional layer aggregation type. Supported values: [‘stack’, ‘sum’]
share_param (bool, default: False) – Whether to share the parameters in the graph convolutional layers
gen_r_num_basis_func (int, default: 2) – The number of basis functions used in the generating rating function
train_grad_clip (float, default: 1.0) – The gradient clipping value for training
train_valid_interval (int, default: 1) – The validation interval for training
train_early_stopping_patience (int, default: 100) – The patience for early stopping
train_min_learning_rate (float, default: 0.001) – The minimum learning rate for SGD
train_decay_patience (int, default: 50) – The patience for learning rate decay
train_lr_decay_factor (float, default: 0.5) – The learning rate decay factor
trainable (boolean, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
verbose (boolean, default: False) – When True, some running logs are displayed
seed (int, default: None) – Random seed for parameters initialization
References
van den Berg, R., Kipf, T. N., & Welling, M. (2018). Graph Convolutional Matrix Completion.
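One plausible reading of how train_lr_decay_factor and train_min_learning_rate interact is a multiplicative decay with a floor, applied each time train_decay_patience runs out. The sketch below shows only that arithmetic. The function name decayed_learning_rate is hypothetical, and clamping at the floor is an assumption: how Cornac reacts once the minimum is reached is not stated in the parameter list.

```python
def decayed_learning_rate(lr, decay_factor=0.5, min_lr=0.001):
    """Multiply the learning rate by decay_factor, never going below min_lr."""
    return max(lr * decay_factor, min_lr)

# Apply five consecutive decays starting from the default learning_rate.
lr = 0.01
schedule = []
for _ in range(5):
    lr = decayed_learning_rate(lr)
    schedule.append(lr)
```

With the default values, the rate halves three times and then stays at the floor of 0.001 under this assumption.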
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
Multi-Task Explainable Recommendation (MTER)#
- class cornac.models.mter.recom_mter.MTER(name='MTER', rating_scale=5.0, n_user_factors=15, n_item_factors=15, n_aspect_factors=12, n_opinion_factors=12, n_bpr_samples=1000, n_element_samples=50, lambda_reg=0.1, lambda_bpr=10, max_iter=200000, lr=0.1, n_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#
Multi-Task Explainable Recommendation
- Parameters:
name (string, optional, default: 'MTER') – The name of the recommender model.
rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.
n_user_factors (int, optional, default: 15) – The dimension of the user latent factors.
n_item_factors (int, optional, default: 15) – The dimension of the item latent factors.
n_aspect_factors (int, optional, default: 12) – The dimension of the aspect latent factors.
n_opinion_factors (int, optional, default: 12) – The dimension of the opinion latent factors.
n_bpr_samples (int, optional, default: 1000) – The number of samples from all BPR pairs.
n_element_samples (int, optional, default: 50) – The number of samples from all ratings in each iteration.
lambda_reg (float, optional, default: 0.1) – The regularization parameter.
lambda_bpr (float, optional, default: 10.0) – The regularization parameter for BPR.
max_iter (int, optional, default: 200000) – Maximum number of iterations for training.
lr (float, optional, default: 0.1) – The learning rate for optimization
n_threads (int, optional, default: 0) – Number of parallel threads for training. If n_threads=0, all CPU cores will be utilized. If seed is not None, n_threads=1 to remove randomness from parallelization.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, I, A, O, G1, G2, and G3 are not None).
verbose (boolean, optional, default: False) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘I’:I, ‘A’:A, ‘O’:O, ‘G1’:G1, ‘G2’:G2, ‘G3’:G3}
- U: ndarray, shape (n_users, n_user_factors)
The user latent factors, optional initialization via init_params
- I: ndarray, shape (n_items, n_item_factors)
The item latent factors, optional initialization via init_params
- A: ndarray, shape (num_aspects+1, n_aspect_factors)
The aspect latent factors, optional initialization via init_params
- O: ndarray, shape (num_opinions, n_opinion_factors)
The opinion latent factors, optional initialization via init_params
- G1: ndarray, shape (n_user_factors, n_item_factors, n_aspect_factors)
The core tensor for user, item, and aspect factors, optional initialization via init_params
- G2: ndarray, shape (n_user_factors, n_aspect_factors, n_opinion_factors)
The core tensor for user, aspect, and opinion factors, optional initialization via init_params
- G3: ndarray, shape (n_item_factors, n_aspect_factors, n_opinion_factors)
The core tensor for item, aspect, and opinion factors, optional initialization via init_params
seed (int, optional, default: None) – Random seed for parameters initialization.
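The shape of G1, (n_user_factors, n_item_factors, n_aspect_factors), suggests a Tucker-style contraction of the core tensor with one user, one item, and one aspect vector. The sketch below shows that contraction with NumPy. It is an illustration of the tensor arithmetic only. The function name tucker_score and the toy values are hypothetical, and whether Cornac's score() performs exactly this contraction is an assumption.

```python
import numpy as np

def tucker_score(core, user_factors, item_factors, aspect_factors):
    """Contract a 3-way core tensor with one vector per mode."""
    return float(np.einsum("abc,a,b,c->", core,
                           user_factors, item_factors, aspect_factors))

# Hypothetical toy factors with 2 dimensions per mode.
G1 = np.ones((2, 2, 2))
u = np.array([1.0, 2.0])
i = np.array([1.0, 1.0])
a = np.array([0.0, 1.0])
score = tucker_score(G1, u, i, a)  # (1 + 2) * (1 + 1) * (0 + 1) = 6.0
```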
References
Nan Wang, Hongning Wang, Yiling Jia, and Yue Yin. 2018. Explainable Recommendation via Multi-Task Learning in Opinionated Text Data. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR ‘18). ACM, New York, NY, USA, 165-174. DOI: https://doi.org/10.1145/3209978.3210010
- fit(train_set, val_set=None)#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- score(u_idx, i_idx=None)#
Predict the scores/ratings of a user for an item.
- Parameters:
- Returns:
res – Relative scores that the user gives to the item or to all known items
- Return type:
A scalar or a Numpy array
Neural Attention Rating Regression with Review-level Explanations (NARRE)#
- class cornac.models.narre.recom_narre.NARRE(name='NARRE', embedding_size=100, id_embedding_size=32, n_factors=32, attention_size=16, kernel_sizes=[3], n_filters=64, dropout_rate=0.5, max_text_length=50, max_num_review=32, batch_size=64, max_iter=10, optimizer='adam', learning_rate=0.001, model_selection='last', user_based=True, trainable=True, verbose=True, init_params=None, seed=None)[source]#
Neural Attentional Rating Regression with Review-level Explanations
- Parameters:
name (string, default: 'NARRE') – The name of the recommender model.
embedding_size (int, default: 100) – Word embedding size
id_embedding_size (int, default: 32) – User/item review id embedding size
n_factors (int, default: 32) – The dimension of the user/item’s latent factors.
attention_size (int, default: 16) – Attention size
kernel_sizes (list, default: [3]) – List of kernel sizes of conv2d
n_filters (int, default: 64) – Number of filters
dropout_rate (float, default: 0.5) – Dropout rate of neural network dense layers
max_text_length (int, default: 50) – Maximum number of tokens in a review instance
max_num_review (int, default: 32) – Maximum number of reviews that you want to feed into training. By default, the model will be trained with 32 reviews.
batch_size (int, default: 64) – Batch size
max_iter (int, default: 10) – Max number of training epochs
optimizer (string, optional, default: 'adam') – Optimizer for training is either ‘adam’ or ‘rmsprop’.
learning_rate (float, optional, default: 0.001) – Initial value of learning rate for the optimizer.
model_selection (str, optional, default: 'last') – Model selection strategy is either ‘best’ or ‘last’.
user_based (boolean, optional, default: True) – Evaluation strategy for model selection. By default (user_based=True), the metric is computed for every user and then averaged. Set user_based=False to measure per rating instead.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.
verbose (boolean, optional, default: True) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters, pretrained_word_embeddings could be initialized here, e.g., init_params={‘pretrained_word_embeddings’: pretrained_word_embeddings}
seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
Chen, C., Zhang, M., Liu, Y., & Ma, S. (2018, April). Neural attentional rating regression with review-level explanations. In Proceedings of the 2018 World Wide Web Conference (pp. 1583-1592).
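The effect of max_text_length and max_num_review can be pictured as clipping and padding token-id lists to a fixed grid. The sketch below is a plain-Python illustration under assumptions: the function name pad_reviews, the padding id 0, and padding on the right are all hypothetical choices and are not taken from Cornac's code.

```python
def pad_reviews(reviews, max_num_review=32, max_text_length=50, pad_id=0):
    """Clip/pad token-id lists to a max_num_review x max_text_length grid."""
    padded = []
    for tokens in reviews[:max_num_review]:          # keep at most max_num_review
        clipped = list(tokens[:max_text_length])     # truncate long reviews
        clipped += [pad_id] * (max_text_length - len(clipped))  # pad short ones
        padded.append(clipped)
    while len(padded) < max_num_review:              # add empty reviews if needed
        padded.append([pad_id] * max_text_length)
    return padded

# Two hypothetical tokenized reviews, squeezed into a 3 x 4 grid.
batch = pad_reviews([[5, 8, 2, 9, 4], [7]], max_num_review=3, max_text_length=4)
```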
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
- static load(model_path, trainable=False)[source]#
Load a recommender model from the filesystem.
- Parameters:
model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.
trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.
- Returns:
self
- Return type:
- save(save_dir=None)[source]#
Save a recommender model to the filesystem.
- Parameters:
save_dir (str, default: None) – Path to a directory for the model to be stored.
Probabilistic Collaborative Representation Learning (PCRL)#
- class cornac.models.pcrl.recom_pcrl.PCRL(k=100, z_dims=[300], max_iter=300, batch_size=300, learning_rate=0.001, name='PCRL', trainable=True, verbose=False, w_determinist=True, init_params=None)[source]#
Probabilistic Collaborative Representation Learning.
- Parameters:
k (int, optional, default: 100) – The dimension of the latent factors.
z_dims (Numpy 1d array, optional, default: [300]) – The dimensions of the hidden intermediate layers ‘z’ in the order [dim(z_L), …,dim(z_1)]. Please refer to Figure 1 in the original paper for more details.
max_iter (int, optional, default: 300) – Maximum number of iterations (number of epochs) for variational PCRL.
batch_size (int, optional, default: 300) – The batch size for SGD.
learning_rate (float, optional, default: 0.001) – The learning rate for SGD.
aux_info – See "cornac/examples/pcrl_example.py" in the GitHub repo for an example of how to use Cornac's graph modality to provide item auxiliary data (e.g., context, text, etc.) for PCRL.
name (string, optional, default: 'PCRL') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (Theta, Beta and Xi are not None).
w_determinist (boolean, optional, default: True) – When True, deterministic weights “W” are used for the generator network; otherwise “W” is stochastic as in the original paper.
init_params (dictionary, optional, default: None) –
Dictionary of initial parameters, e.g., init_params = {‘G_s’:G_s, ‘G_r’:G_r, ‘L_s’:L_s, ‘L_r’:L_r}.
- Theta: ndarray, shape (n_users, k)
The expected user latent factors.
- Beta: ndarray, shape (n_items, k)
The expected item latent factors.
- G_s: ndarray, shape (n_users, k)
Represent the “shape” parameters of Gamma distribution over Theta.
- G_r: ndarray, shape (n_users, k)
Represent the “rate” parameters of Gamma distribution over Theta.
- L_s: ndarray, shape (n_items, k)
Represent the “shape” parameters of Gamma distribution over Beta.
- L_r: ndarray, shape (n_items, k)
Represent the “rate” parameters of Gamma distribution over Beta.
References
Salah, Aghiles, and Hady W. Lauw. Probabilistic Collaborative Representation Learning for Personalized Item Recommendation. In UAI 2018.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
VAE for Collaborative Filtering (VAECF)#
- class cornac.models.vaecf.recom_vaecf.VAECF(name='VAECF', k=10, autoencoder_structure=[20], act_fn='tanh', likelihood='mult', n_epochs=100, batch_size=100, learning_rate=0.001, beta=1.0, trainable=True, verbose=False, seed=None, use_gpu=False)[source]#
Variational Autoencoder for Collaborative Filtering.
- Parameters:
k (int, optional, default: 10) – The dimension of the stochastic user factors “z”.
autoencoder_structure (list, default: [20]) – The number of neurons of encoder/decoder layer for VAE. For example, autoencoder_structure = [200], the VAE structure will be [num_items, 200, k, 200, num_items].
act_fn (str, default: 'tanh') – Name of the activation function used between hidden layers of the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’]
likelihood (str, default: 'mult') –
Name of the likelihood function used for modeling the observations. Supported choices:
- mult: Multinomial likelihood
- bern: Bernoulli likelihood
- gaus: Gaussian likelihood
- pois: Poisson likelihood
n_epochs (int, optional, default: 100) – The number of epochs for SGD.
batch_size (int, optional, default: 100) – The batch size.
learning_rate (float, optional, default: 0.001) – The learning rate for Adam.
beta (float, optional, default: 1.0) – The weight of the KL term as in beta-VAE.
name (string, optional, default: 'VAECF') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
seed (int, optional, default: None) – Random seed for parameters initialization.
use_gpu (boolean, optional, default: False) – If True and your system supports CUDA then training is performed on GPUs.
References
Liang, Dawen, Rahul G. Krishnan, Matthew D. Hoffman, and Tony Jebara. “Variational autoencoders for collaborative filtering.” In Proceedings of the 2018 World Wide Web Conference on World Wide Web, pp. 689-698.
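The way autoencoder_structure and k combine into a symmetric layout, as described for autoencoder_structure above, can be sketched as a small helper. The function name vae_layer_sizes is hypothetical; it only reproduces the layer-size list from the parameter description and builds no network.

```python
def vae_layer_sizes(num_items, autoencoder_structure, k):
    """Layer widths of the full VAE: encoder, bottleneck k, mirrored decoder."""
    hidden = list(autoencoder_structure)
    return [num_items] + hidden + [k] + hidden[::-1] + [num_items]

# The example from the parameter description, with a hypothetical 1000 items.
sizes = vae_layer_sizes(num_items=1000, autoencoder_structure=[200], k=10)
deeper = vae_layer_sizes(num_items=1000, autoencoder_structure=[400, 200], k=10)
```

For a deeper structure such as [400, 200], the decoder mirrors the encoder, giving [1000, 400, 200, 10, 200, 400, 1000].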
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
Collaborative Variational Autoencoder (CVAE)#
- class cornac.models.cvae.recom_cvae.CVAE(name='CVAE', z_dim=50, n_epochs=100, lambda_u=0.0001, lambda_v=0.001, lambda_r=10, lambda_w=0.0001, lr=0.001, a=1, b=0.01, input_dim=8000, vae_layers=[200, 100], act_fn='sigmoid', loss_type='cross-entropy', batch_size=128, init_params=None, trainable=True, seed=None, verbose=True)[source]#
Collaborative Variational Autoencoder
- Parameters:
z_dim (int, optional, default: 50) – The dimension of the user and item latent factors.
n_epochs (int, optional, default: 100) – Maximum number of epochs for training.
lambda_u (float, optional, default: 1e-4) – The regularization hyper-parameter for user latent factor.
lambda_v (float, optional, default: 0.001) – The regularization hyper-parameter for item latent factor.
lambda_r (float, optional, default: 10.0) – Parameter that balances the focus on content or ratings.
lambda_w (float, optional, default: 1e-4) – The regularization for VAE weights
lr (float, optional, default: 0.001) – Learning rate in the auto-encoder training
a (float, optional, default: 1) – The confidence of observed ratings.
b (float, optional, default: 0.01) – The confidence of unseen ratings.
input_dim (int, optional, default: 8000) – The size of input vector
vae_layers (list, optional, default: [200, 100]) – The list containing the size of each layer in the neural network structure.
act_fn (str, default: 'sigmoid') – Name of the activation function used for the variational auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’, ‘leaky_relu’, ‘identity’]
loss_type (string, optional, default: "cross-entropy") – The type of loss function in the last layer, either “cross-entropy” or “rmse”.
batch_size (int, optional, default: 128) – The batch size for SGD.
init_params (dict, optional, default: {'U':None, 'V':None}) – Initial U and V latent matrix
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
References
Li, X., & She, J. (2017). Collaborative Variational Autoencoder for Recommender Systems. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
http://eelxpeng.github.io/assets/paper/Collaborative_Variational_Autoencoder.pdf
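The parameters a and b assign different confidence to observed and unseen ratings. A minimal sketch of how such a confidence matrix could be built is shown below. The function name confidence_matrix is hypothetical, and treating every non-zero entry as "observed" is an assumption.

```python
import numpy as np

def confidence_matrix(ratings, a=1.0, b=0.01):
    """Confidence a for observed (non-zero) ratings, b for unseen entries."""
    return np.where(ratings > 0, a, b)

# Hypothetical rating matrix: 2 users, 3 items, 0 means unseen.
R = np.array([[5.0, 0.0, 3.0],
              [0.0, 0.0, 1.0]])
C = confidence_matrix(R)
```

A small b keeps unseen entries in the objective without letting them dominate the far fewer observed ratings.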
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
Conditional VAE for Collaborative Filtering (CVAECF)#
- class cornac.models.cvaecf.recom_cvaecf.CVAECF(name='CVAECF', z_dim=20, h_dim=20, autoencoder_structure=[20], act_fn='tanh', likelihood='mult', n_epochs=100, batch_size=128, learning_rate=0.001, beta=1.0, alpha_1=1.0, alpha_2=1.0, trainable=True, verbose=False, seed=None, use_gpu=False)[source]#
Conditional Variational Autoencoder for Collaborative Filtering.
- Parameters:
z_dim (int, optional, default: 20) – The dimension of the stochastic user factors “z” representing the preference information.
h_dim (int, optional, default: 20) – The dimension of the stochastic user factors “h” representing the auxiliary data.
autoencoder_structure (list, default: [20]) – The number of neurons of encoder/decoder hidden layer for CVAE. For example, when autoencoder_structure = [20], the CVAE encoder structures will be [y_dim, 20, z_dim] and [x_dim, 20, h_dim], the decoder structure will be [z_dim + h_dim, 20, y_dim], where y and x are respectively the preference and auxiliary data.
act_fn (str, default: 'tanh') – Name of the activation function used between hidden layers of the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’]
likelihood (str, default: 'mult') –
Name of the likelihood function used for modeling user preferences. Supported choices:
- mult: Multinomial likelihood
- bern: Bernoulli likelihood
- gaus: Gaussian likelihood
- pois: Poisson likelihood
n_epochs (int, optional, default: 100) – The number of epochs for SGD.
batch_size (int, optional, default: 128) – The batch size.
learning_rate (float, optional, default: 0.001) – The learning rate for Adam.
beta (float, optional, default: 1.0) – The weight of the KL term KL(q(z|y)||p(z)) as in beta-VAE.
alpha_1 (float, optional, default: 1.0) – The weight of the KL term KL(q(h|x)||p(h|x)).
alpha_2 (float, optional, default: 1.0) – The weight of the KL term KL(q(h|x)||q(h|y)).
name (string, optional, default: 'CVAECF') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained, and Cornac assumes that the model is already pre-trained.
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
seed (int, optional, default: None) – Random seed for parameters initialization.
use_gpu (boolean, optional, default: False) – If True and your system supports CUDA then training is performed on GPUs.
data (user auxiliary)
References
Lee, Wonsung, Kyungwoo Song, and Il-Chul Moon. “Augmented variational autoencoders for collaborative filtering with auxiliary information.” Proceedings of ACM CIKM. 2017.
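The three layer layouts described for autoencoder_structure (two encoders and one decoder) can be written out explicitly. The helper below is a hypothetical sketch that only computes the layer widths from the parameter description. The function name cvae_layer_sizes and the example dimensions are assumptions, as is mirroring the hidden layers in the decoder when more than one is given.

```python
def cvae_layer_sizes(y_dim, x_dim, autoencoder_structure, z_dim, h_dim):
    """Layer widths for the preference encoder, auxiliary encoder, and decoder."""
    hidden = list(autoencoder_structure)
    return {
        "encoder_y": [y_dim] + hidden + [z_dim],              # preference data y -> z
        "encoder_x": [x_dim] + hidden + [h_dim],              # auxiliary data x -> h
        "decoder": [z_dim + h_dim] + hidden[::-1] + [y_dim],  # (z, h) -> y
    }

# Hypothetical sizes: 1000 items in y, 300 auxiliary features in x.
layout = cvae_layer_sizes(y_dim=1000, x_dim=300,
                          autoencoder_structure=[20], z_dim=20, h_dim=20)
```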
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
Generalized Matrix Factorization (GMF)#
- class cornac.models.ncf.recom_gmf.GMF(name='GMF', num_factors=8, reg=0.0, num_epochs=20, batch_size=256, num_neg=4, lr=0.001, learner='adam', backend='tensorflow', early_stopping=None, trainable=True, verbose=True, seed=None)[source]#
Generalized Matrix Factorization.
- Parameters:
num_factors (int, optional, default: 8) – Embedding size of MF model.
reg (float, optional, default: 0.) – Regularization (weight_decay).
num_epochs (int, optional, default: 20) – Number of epochs.
batch_size (int, optional, default: 256) – Batch size.
num_neg (int, optional, default: 4) – Number of negative instances to pair with a positive instance.
lr (float, optional, default: 0.001) – Learning rate.
learner (str, optional, default: 'adam') – Specify an optimizer: adagrad, adam, rmsprop, sgd
backend (str, optional, default: 'tensorflow') – Backend used for model training: tensorflow, pytorch
early_stopping ({min_delta: float, patience: int}, optional, default: None) –
If None, no early stopping. Meaning of the arguments:
min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.
patience: number of epochs with no improvement after which training should be stopped.
name (string, optional, default: 'GMF') – Name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
verbose (boolean, optional, default: True) – When True, some running logs are displayed.
seed (int, optional, default: None) – Random seed for parameters initialization.
References
He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).
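The early_stopping rule described above can be illustrated with a minimal sketch. It assumes a monitored validation value where higher is better and follows the wording of the parameter description only; the function name stopping_epoch is hypothetical, and Cornac's internal implementation may differ in details.

```python
def stopping_epoch(history, min_delta=0.0, patience=0):
    """Return the 0-based epoch at which training would stop, or None.

    history: monitored validation values, one per epoch (higher is better).
    An increase smaller than min_delta counts as no improvement.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, value in enumerate(history):
        if value - best >= min_delta:
            # Large enough increase: record it and reset the counter.
            best = value
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical validation scores for five epochs.
stop_at = stopping_epoch([0.10, 0.20, 0.205, 0.206, 0.207], min_delta=0.01, patience=2)
```

With these illustrative values, the last three epochs each improve on the best value by less than min_delta, so the second such epoch triggers the stop.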
Indexable Bayesian Personalized Ranking (IBPR)#
- class cornac.models.ibpr.recom_ibpr.IBPR(k=20, max_iter=100, learning_rate=0.05, lamda=0.001, batch_size=100, name='IBPR', trainable=True, verbose=False, init_params=None)[source]#
Indexable Bayesian Personalized Ranking.
- Parameters:
k (int, optional, default: 20) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.05) – The learning rate for SGD.
lamda (float, optional, default: 0.001) – The regularization parameter.
batch_size (int, optional, default: 100) – The batch size for SGD.
name (string, optional, default: 'IBPR') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’: U, ‘V’: V}. See the definitions of U and V below.
- U: csc_matrix, shape (n_users,k)
The user latent factors, optional initialization via init_params.
- V: csc_matrix, shape (n_items,k)
The item latent factors, optional initialization via init_params.
References
Le, D. D., & Lauw, H. W. (2017, November). Indexable Bayesian personalized ranking for efficient top-k recommendation. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 1389-1398). ACM.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
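As a rough illustration of how user and item vectors combine under the dot-product measure, the sketch below ranks items for one user by inner product. Plain Python lists stand in for the matrices returned by get_user_vectors() and get_item_vectors(); the helper top_k_by_dot and the sample values are hypothetical, and an actual ANN index would approximate this exhaustive search.

```python
def top_k_by_dot(user_vector, item_vectors, k):
    """Return indices of the k items with the largest inner product."""
    scores = [
        sum(u * v for u, v in zip(user_vector, item_vector))
        for item_vector in item_vectors
    ]
    # Sort item indices by descending score and keep the first k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Illustrative 2-dimensional factors for one user and three items.
user = [1.0, 0.0]
items = [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]]
recommended = top_k_by_dot(user, items, k=2)
```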
Matrix Co-Factorization (MCF)#
- class cornac.models.mcf.recom_mcf.MCF(k=5, max_iter=100, learning_rate=0.001, gamma=0.9, lamda=0.001, name='MCF', trainable=True, verbose=False, init_params=None, seed=None)[source]#
Matrix Co-Factorization.
- Parameters:
k (int, optional, default: 5) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.
gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.
lamda (float, optional, default: 0.001) – The regularization parameter.
name (string, optional, default: 'MCF') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
network (item-affinity)
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’: U, ‘V’: V, ‘Z’: Z}.
- U: ndarray, shape (n_users, k)
User latent factors.
- V: ndarray, shape (n_items, k)
Item latent factors.
- Z: ndarray, shape (n_items, k)
The “Also-Viewed” item latent factors.
seed (int, optional, default: None) – Random seed for parameters initialization.
References
Park, Chanyoung, Donghyun Kim, Jinoh Oh, and Hwanjo Yu. “Do Also-Viewed Products Help User Rating Prediction?.” In Proceedings of WWW, pp. 1113-1122. 2017.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
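The roles of learning_rate and gamma can be illustrated with the textbook RMSProp update for a single scalar parameter. This is a generic sketch, not the MCF training code: the function name rmsprop_step and the eps term are assumptions, and the exact variant used inside Cornac may differ.

```python
import math


def rmsprop_step(param, grad, cache, learning_rate=0.001, gamma=0.9, eps=1e-8):
    """One textbook RMSProp update for a scalar parameter.

    gamma weights the previous running average of squared gradients;
    (1 - gamma) weights the current squared gradient.
    """
    cache = gamma * cache + (1.0 - gamma) * grad * grad
    param = param - learning_rate * grad / (math.sqrt(cache) + eps)
    return param, cache


# Illustrative values: one update starting from an empty running average.
new_param, new_cache = rmsprop_step(param=1.0, grad=2.0, cache=0.0)
```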
Multi-Layer Perceptron (MLP)#
- class cornac.models.ncf.recom_mlp.MLP(name='MLP', layers=(64, 32, 16, 8), act_fn='relu', reg=0.0, num_epochs=20, batch_size=256, num_neg=4, lr=0.001, learner='adam', backend='tensorflow', early_stopping=None, trainable=True, verbose=True, seed=None)[source]#
Multi-Layer Perceptron.
- Parameters:
layers (list, optional, default: [64, 32, 16, 8]) – MLP layers. Note that the first layer is the concatenation of user and item embeddings. So layers[0]/2 is the embedding size.
act_fn (str, default: 'relu') – Name of the activation function used for the MLP layers. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘selu’, ‘relu6’, ‘leaky_relu’]
reg (float, optional, default: 0.) – Regularization (weight_decay).
num_epochs (int, optional, default: 20) – Number of epochs.
batch_size (int, optional, default: 256) – Batch size.
num_neg (int, optional, default: 4) – Number of negative instances to pair with a positive instance.
lr (float, optional, default: 0.001) – Learning rate.
learner (str, optional, default: 'adam') – Specify an optimizer: adagrad, adam, rmsprop, sgd
backend (str, optional, default: 'tensorflow') – Backend used for model training: tensorflow, pytorch
early_stopping ({min_delta: float, patience: int}, optional, default: None) –
If None, no early stopping. Meaning of the arguments:
min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.
patience: number of epochs with no improvement after which training should be stopped.
name (string, optional, default: 'MLP') – Name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
verbose (boolean, optional, default: True) – When True, some running logs are displayed.
seed (int, optional, default: None) – Random seed for parameters initialization.
References
He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).
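The relationship between layers and the embedding size can be made concrete with a small sketch. It assumes, as the layers description states, that layers[0] is the width of the concatenated user and item embeddings, and that each subsequent entry is the output width of one dense layer; the helper mlp_shapes is hypothetical.

```python
def mlp_shapes(layers):
    """Derive the per-side embedding size and dense-layer shapes from layers."""
    if layers[0] % 2 != 0:
        raise ValueError("layers[0] must be even: it holds user and item embeddings")
    embedding_size = layers[0] // 2
    # Each consecutive pair (input width, output width) is one dense layer.
    dense_shapes = list(zip(layers[:-1], layers[1:]))
    return embedding_size, dense_shapes


embedding_size, dense_shapes = mlp_shapes((64, 32, 16, 8))
```

With the default layers, each of the user and item embeddings has 32 dimensions.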
Neural Matrix Factorization (NeuMF/NCF)#
- class cornac.models.ncf.recom_neumf.NeuMF(name='NeuMF', num_factors=8, layers=(64, 32, 16, 8), act_fn='relu', reg=0.0, num_epochs=20, batch_size=256, num_neg=4, lr=0.001, learner='adam', backend='tensorflow', early_stopping=None, trainable=True, verbose=True, seed=None)[source]#
Neural Matrix Factorization.
- Parameters:
num_factors (int, optional, default: 8) – Embedding size of MF model.
layers (list, optional, default: [64, 32, 16, 8]) – MLP layers. Note that the first layer is the concatenation of user and item embeddings. So layers[0]/2 is the embedding size.
act_fn (str, default: 'relu') – Name of the activation function used for the MLP layers. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘selu’, ‘relu6’, ‘leaky_relu’]
reg (float, optional, default: 0.) – Regularization (weight_decay).
reg_layers (list, optional, default: [0., 0., 0., 0.]) – Regularization for each MLP layer, reg_layers[0] is the regularization for embeddings.
num_epochs (int, optional, default: 20) – Number of epochs.
batch_size (int, optional, default: 256) – Batch size.
num_neg (int, optional, default: 4) – Number of negative instances to pair with a positive instance.
lr (float, optional, default: 0.001) – Learning rate.
learner (str, optional, default: 'adam') – Specify an optimizer: adagrad, adam, rmsprop, sgd
backend (str, optional, default: 'tensorflow') – Backend used for model training: tensorflow, pytorch
early_stopping ({min_delta: float, patience: int}, optional, default: None) –
If None, no early stopping. Meaning of the arguments:
min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.
patience: number of epochs with no improvement after which training should be stopped.
name (string, optional, default: 'NeuMF') – Name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
verbose (boolean, optional, default: True) – When True, some running logs are displayed.
seed (int, optional, default: None) – Random seed for parameters initialization.
References
He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).
- from_pretrained(pretrained_gmf, pretrained_mlp, alpha=0.5)[source]#
Provide pre-trained GMF and MLP models, as described in Section 3.4.1 of the paper.
- Parameters:
pretrained_gmf (object of type GMF, required) – Reference to trained/fitted GMF model.
pretrained_mlp (object of type MLP, required) – Reference to trained/fitted MLP model.
alpha (float, optional, default: 0.5) – Hyper-parameter determining the trade-off between the two pre-trained models. Details are described in the section 3.4.1 of the paper.
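The trade-off controlled by alpha can be sketched as follows. In Section 3.4.1 of the paper, the output-layer weights of the two pre-trained models are concatenated, with the GMF part scaled by alpha and the MLP part by (1 - alpha). The function name and the sample weights below are hypothetical, and how Cornac applies this internally is not shown here.

```python
def combine_output_weights(h_gmf, h_mlp, alpha=0.5):
    """Concatenate pre-trained output weights, scaled by alpha and (1 - alpha)."""
    return [alpha * w for w in h_gmf] + [(1.0 - alpha) * w for w in h_mlp]


# Illustrative output-layer weights from a GMF model and an MLP model.
h = combine_output_weights([1.0, 2.0], [4.0], alpha=0.5)
```

With alpha = 0.5 both pre-trained models contribute equally; values closer to 1 favor the GMF part.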
Online Indexable Bayesian Personalized Ranking (OIBPR)#
- class cornac.models.online_ibpr.recom_online_ibpr.OnlineIBPR(k=20, max_iter=100, learning_rate=0.05, lamda=0.001, batch_size=100, name='online_ibpr', trainable=True, verbose=False, init_params=None)[source]#
Online Indexable Bayesian Personalized Ranking.
- Parameters:
k (int, optional, default: 20) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.05) – The learning rate for SGD.
lamda (float, optional, default: 0.001) – The regularization parameter.
batch_size (int, optional, default: 100) – The batch size for SGD.
name (string, optional, default: 'online_ibpr') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’: U, ‘V’: V}. See the definitions of U and V below.
- U: csc_matrix, shape (n_users,k)
The user latent factors, optional initialization via init_params.
- V: csc_matrix, shape (n_items,k)
The item latent factors, optional initialization via init_params.
References
Le, D. D., & Lauw, H. W. (2017, November). Indexable Bayesian personalized ranking for efficient top-k recommendation. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 1389-1398). ACM.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
Visual Matrix Factorization (VMF)#
- class cornac.models.vmf.recom_vmf.VMF(name='VMF', k=10, d=10, n_epochs=100, batch_size=100, learning_rate=0.001, gamma=0.9, lambda_u=0.001, lambda_v=0.001, lambda_p=1.0, lambda_e=10.0, trainable=True, verbose=False, use_gpu=False, init_params=None, seed=None)[source]#
Visual Matrix Factorization.
- Parameters:
k (int, optional, default: 10) – The dimension of the user and item factors.
d (int, optional, default: 10) – The dimension of the user visual factors.
n_epochs (int, optional, default: 100) – The number of epochs for SGD.
learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.
gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.
lambda_u (float, optional, default: 0.001) – The regularization parameter for user factors.
lambda_v (float, optional, default: 0.001) – The regularization parameter for item factors.
lambda_p (float, optional, default: 1.0) – The regularization parameter for user visual factors.
lambda_e (float, optional, default: 10.) – The regularization parameter for the kernel embedding matrix
name (string, optional, default: 'VMF') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (The parameters of the model U, V, P, E are not None).
visual_features – See "cornac/examples/vmf_example.py" for an example of how to use Cornac's visual modality to load and provide the item visual features for VMF.
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’: U, ‘V’: V, ‘P’: P, ‘E’: E}.
- U: numpy array, shape (n_users, k)
User latent factors.
- V: numpy array, shape (n_items, k)
Item latent factors.
- P: numpy array, shape (n_users, d)
User visual latent factors.
- E: numpy array, shape (d, c)
Embedding kernel matrix.
seed (int, optional, default: None) – Random seed for parameters initialization.
References
Park, Chanyoung, Donghyun Kim, Jinoh Oh, and Hwanjo Yu. “Do Also-Viewed Products Help User Rating Prediction?.” In Proceedings of WWW, pp. 1113-1122. 2017.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
Collaborative Deep Ranking (CDR)#
- class cornac.models.cdr.recom_cdr.CDR(name='CDR', k=50, autoencoder_structure=None, act_fn='relu', lambda_u=0.1, lambda_v=100, lambda_w=0.1, lambda_n=1000, corruption_rate=0.3, learning_rate=0.001, dropout_rate=0.1, batch_size=128, max_iter=100, trainable=True, verbose=True, vocab_size=8000, init_params=None, seed=None)[source]#
Collaborative Deep Ranking.
- Parameters:
k (int, optional, default: 50) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
autoencoder_structure (list, default: None) – The number of neurons of encoder/decoder layer for SDAE. For example, autoencoder_structure = [200], the SDAE structure will be [vocab_size, 200, k, 200, vocab_size]
act_fn (str, default: 'relu') – Name of the activation function used for the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’, ‘leaky_relu’, ‘identity’]
learning_rate (float, optional, default: 0.001) – The learning rate for AdamOptimizer.
lambda_u (float, optional, default: 0.1) – The regularization parameter for users.
lambda_v (float, optional, default: 100) – The regularization parameter for items.
lambda_w (float, optional, default: 0.1) – The regularization parameter for SDAE weights.
lambda_n (float, optional, default: 1000) – The regularization parameter for SDAE output.
corruption_rate (float, optional, default: 0.3) – The corruption ratio for SDAE.
dropout_rate (float, optional, default: 0.1) – The probability that each element is removed in dropout of SDAE.
batch_size (int, optional, default: 128) – The batch size for SGD.
name (string, optional, default: 'CDR') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}
- U: ndarray, shape (n_users,k)
The user latent factors, optional initialization via init_params.
- V: ndarray, shape (n_items,k)
The item latent factors, optional initialization via init_params.
seed (int, optional, default: None) – Random seed for weight initialization.
References
Ying, H., Chen, L., Xiong, Y., & Wu, J. (2016). Collaborative Deep Ranking: A Hybrid Pair-Wise Recommendation Algorithm with Implicit Feedback.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
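The autoencoder_structure example given in the CDR parameters can be reproduced with a short sketch. It assumes that the decoder mirrors the encoder around the k-dimensional bottleneck, which matches the single-layer example in the parameter description; the helper sdae_layer_sizes is hypothetical and the behavior for multi-layer structures is an assumption.

```python
def sdae_layer_sizes(vocab_size, autoencoder_structure, k):
    """Expand an encoder specification into the full SDAE layer widths."""
    hidden = list(autoencoder_structure)
    # Encoder widths, bottleneck of size k, then the mirrored decoder widths.
    return [vocab_size] + hidden + [k] + hidden[::-1] + [vocab_size]


sizes = sdae_layer_sizes(vocab_size=8000, autoencoder_structure=[200], k=50)
```

For the defaults vocab_size=8000 and k=50 with autoencoder_structure = [200], this gives [8000, 200, 50, 200, 8000].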
Collaborative Ordinal Embedding (COE)#
- class cornac.models.coe.recom_coe.COE(k=20, max_iter=100, learning_rate=0.05, lamda=0.001, batch_size=1000, name='coe', trainable=True, verbose=False, init_params=None)[source]#
Collaborative Ordinal Embedding.
- Parameters:
k (int, optional, default: 20) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.05) – The learning rate for SGD.
lamda (float, optional, default: 0.001) – The regularization parameter.
batch_size (int, optional, default: 1000) – The batch size for SGD.
name (string, optional, default: 'coe') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}.
- U: ndarray, shape (n_users, k)
The user latent factors.
- V: ndarray, shape (n_items, k)
The item latent factors.
References
Le, D. D., & Lauw, H. W. (2016, June). Euclidean co-embedding of ordinal data for multi-type visualization. In Proceedings of the 2016 SIAM International Conference on Data Mining (pp. 396-404). Society for Industrial and Applied Mathematics.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
Convolutional Matrix Factorization (ConvMF)#
- class cornac.models.conv_mf.recom_convmf.ConvMF(name='ConvMF', k=50, n_epochs=50, cnn_epochs=5, cnn_bs=128, cnn_lr=0.001, lambda_u=1, lambda_v=100, emb_dim=200, max_len=300, filter_sizes=[3, 4, 5], num_filters=100, hidden_dim=200, dropout_rate=0.2, give_item_weight=True, trainable=True, verbose=False, init_params=None, seed=None)[source]#
- Parameters:
k (int, optional, default: 50) – The dimension of the user and item latent factors.
n_epochs (int, optional, default: 50) – Maximum number of epochs for training.
cnn_epochs (int, optional, default: 5) – Number of epochs for optimizing the CNN for each overall training epoch.
cnn_bs (int, optional, default: 128) – Batch size for optimizing CNN.
cnn_lr (float, optional, default: 0.001) – Learning rate for optimizing CNN.
lambda_u (float, optional, default: 1.0) – The regularization hyper-parameter for user latent factor.
lambda_v (float, optional, default: 100.0) – The regularization hyper-parameter for item latent factor.
emb_dim (int, optional, default: 200) – The embedding size of each word. One word corresponds with [1 x emb_dim] vector in the embedding space
max_len (int, optional, default: 300) – The maximum length of an item’s document.
filter_sizes (list, optional, default: [3, 4, 5]) – The length of filters in convolutional layer
num_filters (int, optional, default: 100) – The number of filters in convolutional layer
hidden_dim (int, optional, default: 200) – The dimension of hidden layer after the pooling of all convolutional layers
dropout_rate (float, optional, default: 0.2) – Dropout rate while training CNN
give_item_weight (boolean, optional, default: True) – When True, each item is weighted based on the number of users who have rated it.
init_params (dict, optional, default: {'U':None, 'V':None, 'W': None}) – Initial U and V matrix and initial weight for embedding layer W
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
References
Donghyun Kim, Chanyoung Park. ConvMF: Convolutional Matrix Factorization for Document Context-Aware Recommendation. In: 10th ACM Conference on Recommender Systems, pp. 233-240.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
Spherical k-means (Skmeans)#
- class cornac.models.skm.recom_skmeans.SKMeans(k=5, max_iter=100, name='Skmeans', trainable=True, tol=1e-06, verbose=True, seed=None, init_par=None)[source]#
Spherical k-means based recommender.
- Parameters:
k (int, optional, default: 5) – The number of clusters.
max_iter (int, optional, default: 100) – Maximum number of iterations.
name (string, optional, default: 'Skmeans') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already trained.
tol (float, optional, default: 1e-6) – Relative tolerance with regards to skmeans’ criterion to declare convergence.
verbose (boolean, optional, default: True) – When True, some running logs are displayed.
seed (int, optional, default: None) – Random seed for parameters initialization.
init_par (numpy 1d array, optional, default: None) – The initial object partition, a 1d array containing the cluster label (int type starting from 0) of each object (user). If init_par is None, then skmeans is initialized randomly.
centroids (csc_matrix, shape (k,n_users)) – The matrix of cluster centroids.
References
Salah, Aghiles, Nicoleta Rogovschi, and Mohamed Nadif. “A dynamic collaborative filtering system via a weighted clustering approach.” Neurocomputing 175 (2016): 206-215.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
Visual Bayesian Personalized Ranking (VBPR)#
- class cornac.models.vbpr.recom_vbpr.VBPR(name='VBPR', k=10, k2=10, n_epochs=50, batch_size=100, learning_rate=0.005, lambda_w=0.01, lambda_b=0.01, lambda_e=0.0, use_gpu=False, trainable=True, verbose=True, init_params=None, seed=None)[source]#
Visual Bayesian Personalized Ranking.
- Parameters:
k (int, optional, default: 10) – The dimension of the gamma latent factors.
k2 (int, optional, default: 10) – The dimension of the theta latent factors.
n_epochs (int, optional, default: 50) – Maximum number of epochs for SGD.
batch_size (int, optional, default: 100) – The batch size for SGD.
learning_rate (float, optional, default: 0.005) – The learning rate for SGD.
lambda_w (float, optional, default: 0.01) – The regularization hyper-parameter for latent factor weights.
lambda_b (float, optional, default: 0.01) – The regularization hyper-parameter for biases.
lambda_e (float, optional, default: 0.0) – The regularization hyper-parameter for embedding matrix E and beta prime vector.
use_gpu (boolean, optional, default: False) – Whether or not to use GPU to speed up training.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
verbose (boolean, optional, default: True) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘Bi’: beta_item, ‘Gu’: gamma_user, ‘Gi’: gamma_item, ‘Tu’: theta_user, ‘E’: emb_matrix, ‘Bp’: beta_prime}
seed (int, optional, default: None) – Random seed for weight initialization.
References
He, R., & McAuley, J. (2016). VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
Collaborative Deep Learning (CDL)#
- class cornac.models.cdl.recom_cdl.CDL(name='CDL', k=50, autoencoder_structure=None, act_fn='relu', lambda_u=0.1, lambda_v=10, lambda_w=0.1, lambda_n=1000, a=1, b=0.01, corruption_rate=0.3, learning_rate=0.001, vocab_size=8000, dropout_rate=0.1, batch_size=128, max_iter=100, trainable=True, verbose=True, init_params=None, seed=None)[source]#
Collaborative Deep Learning.
- Parameters:
name (string, default: 'CDL') – The name of the recommender model.
k (int, optional, default: 50) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
autoencoder_structure (list, default: None) – The number of neurons of encoder/decoder layer for SDAE. For example, autoencoder_structure = [200], the SDAE structure will be [vocab_size, 200, k, 200, vocab_size]
act_fn (str, default: 'relu') – Name of the activation function used for the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’, ‘leaky_relu’, ‘identity’]
learning_rate (float, optional, default: 0.001) – The learning rate for AdamOptimizer.
vocab_size (int, default: 8000) – The size of text input for the SDAE.
lambda_u (float, optional, default: 0.1) – The regularization parameter for users.
lambda_v (float, optional, default: 10) – The regularization parameter for items.
lambda_w (float, optional, default: 0.1) – The regularization parameter for SDAE weights.
lambda_n (float, optional, default: 1000) – The regularization parameter for SDAE output.
a (float, optional, default: 1) – The confidence of observed ratings.
b (float, optional, default: 0.01) – The confidence of unseen ratings.
corruption_rate (float, optional, default: 0.3) – The corruption ratio for input text of the SDAE.
dropout_rate (float, optional, default: 0.1) – The probability that each element is removed in dropout of SDAE.
batch_size (int, optional, default: 128) – The batch size for SGD.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}
- U: ndarray, shape (n_users,k)
The user latent factors, optional initialization via init_params.
- V: ndarray, shape (n_items,k)
The item latent factors, optional initialization via init_params.
seed (int, optional, default: None) – Random seed for weight initialization.
References
Hao Wang, Naiyan Wang, Dit-Yan Yeung. CDL: Collaborative Deep Learning for Recommender Systems. In: SIGKDD. 2015. p. 1235-1244.
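The autoencoder_structure example above can be sketched as a small helper. This is only an illustration of how the layer widths line up, not Cornac's code: the function name sdae_layer_sizes is hypothetical, and mirroring the decoder for more than one hidden layer is an assumption.

```python
def sdae_layer_sizes(vocab_size, k, autoencoder_structure):
    """Return the layer widths of a symmetric stacked denoising autoencoder.

    Hypothetical helper: the decoder is assumed to mirror the encoder.
    """
    hidden = list(autoencoder_structure)
    # Input layer, encoder layers, bottleneck of size k, mirrored decoder, output layer.
    return [vocab_size] + hidden + [k] + hidden[::-1] + [vocab_size]


# The documented case: autoencoder_structure = [200] with the default vocab_size and k.
layers = sdae_layer_sizes(vocab_size=8000, k=50, autoencoder_structure=[200])
print(layers)
```

With the defaults vocab_size=8000 and k=50 this reproduces the documented pattern [vocab_size, 200, k, 200, vocab_size].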
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
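To make the role of the three ANN methods concrete, the sketch below ranks items for one user by exhaustive dot product, which is the exact search that an ANN index approximates. The vectors are made-up stand-ins for what get_user_vectors() and get_item_vectors() return, and top_k_by_dot is a hypothetical helper, not part of Cornac.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k_by_dot(user_vector, item_vectors, k):
    """Exhaustively score every item and keep the k highest dot products."""
    scores = [dot(user_vector, item) for item in item_vectors]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], [scores[i] for i in ranked[:k]]


# Toy stand-ins for the matrices returned by the model (assumed values).
user_vectors = [[1.0, 0.0], [0.5, 0.5]]
item_vectors = [[0.2, 0.9], [1.0, 0.1], [0.6, 0.6]]

top_items, top_scores = top_k_by_dot(user_vectors[0], item_vectors, k=2)
print(top_items, top_scores)
```

An ANN index built on the item vectors with the dot-product measure would aim to return the same items without scoring every candidate.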
Hierarchical Poisson Factorization (HPF)#
- class cornac.models.hpf.recom_hpf.HPF(k=5, max_iter=100, name='HPF', trainable=True, verbose=False, hierarchical=True, seed=None, init_params=None)[source]#
Hierarchical Poisson Factorization.
- Parameters:
k (int, optional, default: 5) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations.
name (string, optional, default: 'HPF') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (Theta and Beta are not None).
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
hierarchical (boolean, optional, default: True) – When False, PF is used instead of HPF.
seed (int, optional, default: None) – Random seed for parameters initialization.
init_params (dict, optional, default: None) –
Initial parameters of the model.
- Theta: ndarray, shape (n_users, k)
The expected user latent factors.
- Beta: ndarray, shape (n_items, k)
The expected item latent factors.
- G_s: ndarray, shape (n_users, k)
This represents “shape” parameters of Gamma distribution over Theta.
- G_r: ndarray, shape (n_users, k)
This represents “rate” parameters of Gamma distribution over Theta.
- L_s: ndarray, shape (n_items, k)
This represents “shape” parameters of Gamma distribution over Beta.
- L_r: ndarray, shape (n_items, k)
This represents “rate” parameters of Gamma distribution over Beta.
References
Gopalan, Prem, Jake M. Hofman, and David M. Blei. Scalable Recommendation with Hierarchical Poisson Factorization. In UAI, pp. 326-335. 2015.
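As a rough illustration of how the shape and rate parameters relate to Theta and Beta, the sketch below uses the standard fact that a Gamma(shape, rate) variable has mean shape / rate, and the Poisson factorization assumption that the expected rating is the dot product of the user and item factors. The numbers are invented and the helper names are hypothetical; whether Cornac derives Theta and Beta exactly this way is an assumption.

```python
def gamma_mean(shape, rate):
    """Element-wise mean of Gamma(shape, rate) variables: shape / rate."""
    return [[s / r for s, r in zip(s_row, r_row)] for s_row, r_row in zip(shape, rate)]


def poisson_rate(theta_u, beta_i):
    """Expected rating under Poisson factorization: theta_u . beta_i."""
    return sum(t * b for t, b in zip(theta_u, beta_i))


# One user and one item with k = 2 (toy values).
G_s, G_r = [[2.0, 1.0]], [[4.0, 2.0]]  # shape and rate over Theta
L_s, L_r = [[3.0, 1.0]], [[1.0, 4.0]]  # shape and rate over Beta

Theta = gamma_mean(G_s, G_r)
Beta = gamma_mean(L_s, L_r)
rate = poisson_rate(Theta[0], Beta[0])
print(Theta, Beta, rate)
```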
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
TriRank: Review-aware Explainable Recommendation by Modeling Aspects (TriRank)#
- class cornac.models.trirank.recom_trirank.TriRank(name='TriRank', alpha=1, beta=1, gamma=1, eta_U=1, eta_P=1, eta_A=1, max_iter=100, verbose=True, init_params=None, seed=None)[source]#
TriRank: Review-aware Explainable Recommendation by Modeling Aspects.
- Parameters:
name (string, optional, default: 'TriRank') – The name of the recommender model.
alpha (float, optional, default: 1) – The weight of smoothness on user-item relation
beta (float, optional, default: 1) – The weight of smoothness on item-aspect relation
gamma (float, optional, default: 1) – The weight of smoothness on user-aspect relation
eta_U (float, optional, default: 1) – The weight of fitting constraint on users
eta_P (float, optional, default: 1) – The weight of fitting constraint on items
eta_A (float, optional, default: 1) – The weight of fitting constraint on aspects
max_iter (int, optional, default: 100) – Maximum number of iterations to stop online training. If set to max_iter=-1, the online training will stop when model parameters have converged.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (R, X, Y, p, a, u are not None).
verbose (boolean, optional, default: True) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘R’:R, ‘X’:X, ‘Y’:Y, ‘p’:p, ‘a’:a, ‘u’:u}
- R: csr_matrix, shape (n_users, n_items)
The symmetrically normalized edge weight matrix of the user-item relation, optional initialization via init_params.
- X: csr_matrix, shape (n_items, n_aspects)
The symmetrically normalized edge weight matrix of the item-aspect relation, optional initialization via init_params.
- Y: csr_matrix, shape (n_users, n_aspects)
The symmetrically normalized edge weight matrix of the user-aspect relation, optional initialization via init_params.
- p: ndarray, shape (n_items,)
Initialized item weights, optional initialization via init_params.
- a: ndarray, shape (n_aspects,)
Initialized aspect weights, optional initialization via init_params.
- u: ndarray, shape (n_users,)
Initialized user weights, optional initialization via init_params.
seed (int, optional, default: None) – Random seed for parameters initialization.
References
He, Xiangnan, Tao Chen, Min-Yen Kan, and Xiao Chen. 2015. TriRank: Review-aware Explainable Recommendation by Modeling Aspects. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management (CIKM’15). ACM, New York, NY, USA, 1661-1670. DOI: https://doi.org/10.1145/2806416.2806504
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
Explicit Factor Model (EFM)#
- class cornac.models.efm.recom_efm.EFM(name='EFM', num_explicit_factors=40, num_latent_factors=60, num_most_cared_aspects=15, rating_scale=5.0, alpha=0.85, lambda_x=1, lambda_y=1, lambda_u=0.01, lambda_h=0.01, lambda_v=0.01, use_item_aspect_popularity=True, max_iter=100, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#
Explicit Factor Model.
- Parameters:
num_explicit_factors (int, optional, default: 40) – The dimension of the explicit factors.
num_latent_factors (int, optional, default: 60) – The dimension of the latent factors.
num_most_cared_aspects (int, optional, default: 15) – The number of most cared aspects for each user.
rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.
alpha (float, optional, default: 0.85) – Trade-off factor for constructing ranking score.
lambda_x (float, optional, default: 1) – The regularization parameter for user aspect attentions.
lambda_y (float, optional, default: 1) – The regularization parameter for item aspect qualities.
lambda_u (float, optional, default: 0.01) – The regularization parameter for user and item explicit factors.
lambda_h (float, optional, default: 0.01) – The regularization parameter for user and item latent factors.
lambda_v (float, optional, default: 0.01) – The regularization parameter for V.
use_item_aspect_popularity (boolean, optional, default: True) – When False, item aspect frequency is omitted from the item aspect quality computation formula. Specifically, \(Y_{ij} = 1 + \frac{N - 1}{1 + e^{-s_{ij}}}\) if \(p_i\) is reviewed on feature \(F_j\).
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs.
name (string, optional, default: 'EFM') – The name of the recommender model.
num_threads (int, optional, default: 0) – Number of parallel threads for training. If 0, all CPU cores will be utilized.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U1, U2, V, H1, and H2 are not None).
verbose (boolean, optional, default: False) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U1’:U1, ‘U2’:U2, ‘V’:V, ‘H1’:H1, ‘H2’:H2}
- U1: ndarray, shape (n_users, n_explicit_factors)
The user explicit factors, optional initialization via init_params.
- U2: ndarray, shape (n_items, n_explicit_factors)
The item explicit factors, optional initialization via init_params.
- V: ndarray, shape (n_aspects, n_explicit_factors)
The aspect factors, optional initialization via init_params.
- H1: ndarray, shape (n_users, n_latent_factors)
The user latent factors, optional initialization via init_params.
- H2: ndarray, shape (n_items, n_latent_factors)
The item latent factors, optional initialization via init_params.
seed (int, optional, default: None) – Random seed for weight initialization.
References
Yongfeng Zhang, Guokun Lai, Min Zhang, Yi Zhang, Yiqun Liu, and Shaoping Ma. 2014. Explicit factor models for explainable recommendation based on phrase-level sentiment analysis. In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval (SIGIR ‘14). ACM, New York, NY, USA, 83-92. DOI: https://doi.org/10.1145/2600428.2609579
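The item aspect quality formula given for use_item_aspect_popularity=False can be evaluated directly. The sketch below assumes that N in the formula corresponds to rating_scale and that s_ij is a sentiment score for item i on aspect j; the function name item_aspect_quality is hypothetical.

```python
import math


def item_aspect_quality(sentiment, rating_scale=5.0):
    """Y_ij = 1 + (N - 1) / (1 + exp(-s_ij)), with N taken to be rating_scale."""
    return 1.0 + (rating_scale - 1.0) / (1.0 + math.exp(-sentiment))


# Neutral sentiment lands in the middle of the [1, N] range;
# strongly positive or negative sentiment approaches N or 1.
neutral = item_aspect_quality(0.0)
positive = item_aspect_quality(50.0)
negative = item_aspect_quality(-50.0)
print(neutral, positive, negative)
```

The sigmoid squashes any sentiment value into the interval (1, N), so the resulting qualities share the scale of the ratings.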
- fit(train_set, val_set=None)#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- rank(user_idx, item_indices=None, k=-1)#
Rank all test items for a given user.
- Parameters:
user_idx (int, required) – The index of the user for whom to perform item ranking.
item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked by the user. If None, list of ranked known item indices and their scores will be returned.
k (int, required) – Cut-off length for recommendations, k=-1 will return ranked list of all items. This is more important for ANN to know the limit to avoid exhaustive ranking.
- Returns:
(ranked_items, item_scores) – ranked_items contains item indices being ranked by their scores. item_scores contains scores of items corresponding to index in item_indices input.
- Return type:
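The contract described for rank() can be sketched on top of a plain list of scores. This is one reading of the description above and not Cornac's implementation: rank_items is a hypothetical function, scores stands in for the output of score(user_idx), and the cut-off k is assumed to apply only to the ranked list.

```python
def rank_items(scores, item_indices=None, k=-1):
    """Return (ranked_items, item_scores) for one user.

    scores: per-item scores for all known items (stand-in for score(user_idx)).
    item_indices: optional candidate items; all known items when None.
    k: cut-off length for ranked_items; -1 keeps the full ranking.
    """
    if item_indices is None:
        item_indices = list(range(len(scores)))
    # item_scores follows the order of the item_indices input.
    item_scores = [scores[i] for i in item_indices]
    order = sorted(range(len(item_indices)), key=lambda pos: item_scores[pos], reverse=True)
    ranked_items = [item_indices[pos] for pos in order]
    if k != -1:
        ranked_items = ranked_items[:k]
    return ranked_items, item_scores


all_scores = [0.1, 0.9, 0.4, 0.7]
full_ranking = rank_items(all_scores)
top_two = rank_items(all_scores, item_indices=[0, 2, 3], k=2)
print(full_ranking, top_two)
```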
- score(user_idx, item_idx=None)#
Predict the scores/ratings of a user for an item.
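- Parameters:
user_idx (int, required) – The index of the user for whom to predict scores.
item_idx (int, optional, default: None) – The index of the item to score. If None, scores for all known items are returned.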
- Returns:
res – Relative scores that the user gives to the item or to all known items
- Return type:
A scalar or a Numpy array
Hidden Factors and Hidden Topics (HFT)#
- class cornac.models.hft.recom_hft.HFT(name='HFT', k=10, max_iter=50, grad_iter=50, lambda_text=0.1, l2_reg=0.001, vocab_size=8000, init_params=None, trainable=True, verbose=True, seed=None)[source]#
Hidden Factors and Hidden Topics
- Parameters:
name (string, default: 'HFT') – The name of the recommender model.
k (int, optional, default: 10) – The dimension of the latent factors.
max_iter (int, optional, default: 50) – Maximum number of iterations for EM.
grad_iter (int, optional, default: 50) – Maximum number of iterations for L-BFGS.
lambda_text (float, default: 0.1) – Weight of corpus likelihood in objective function.
l2_reg (float, default: 0.001) – Regularization for user item latent factors.
vocab_size (int, optional, default: 8000) – Size of vocabulary for review text.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘alpha’: alpha, ‘beta_u’: beta_u, ‘beta_i’: beta_i, ‘gamma_u’: gamma_u, ‘gamma_v’: gamma_v}
- alpha: float
Model offset, optional initialization via init_params.
- beta_u: ndarray, shape (n_users, 1)
User biases, optional initialization via init_params.
- beta_i: ndarray, shape (n_items, 1)
Item biases, optional initialization via init_params.
- gamma_u: ndarray, shape (n_users,k)
The user latent factors, optional initialization via init_params.
- gamma_v: ndarray, shape (n_items,k)
The item latent factors, optional initialization via init_params.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.
verbose (boolean, optional, default: True) – When True, some running logs are displayed.
seed (int, optional, default: None) – Random seed for weight initialization.
References
Julian McAuley, Jure Leskovec. “Hidden Factors and Hidden Topics: Understanding Rating Dimensions with Review Text” RecSys ‘13 Proceedings of the 7th ACM conference on Recommender systems Pages 165-172
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
Weighted Bayesian Personalized Ranking (WBPR)#
- class cornac.models.bpr.recom_wbpr.WBPR(name='WBPR', k=10, max_iter=100, learning_rate=0.001, lambda_reg=0.01, use_bias=True, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#
Weighted Bayesian Personalized Ranking.
- Parameters:
k (int, optional, default: 10) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.001) – The learning rate for SGD.
lambda_reg (float, optional, default: 0.01) – The regularization hyper-parameter.
use_bias (boolean, optional, default: True) – When True, item bias is used.
num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bi’: item_biases}
seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
Gantner, Zeno, Lucas Drumond, Christoph Freudenthaler, and Lars Schmidt-Thieme. “Personalized ranking for non-uniformly sampled items.” In Proceedings of KDD Cup 2011, pp. 231-247. 2012.
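One way to read the "non-uniformly sampled items" idea in the cited paper is that negative items are drawn with probability proportional to their global popularity rather than uniformly. The sketch below illustrates that sampling step only; it is an assumption about the technique, not Cornac's code, and sample_negative and the toy interactions are hypothetical.

```python
import random
from collections import Counter


def sample_negative(user_items, item_popularity, rng):
    """Draw one item the user has not interacted with, weighted by popularity."""
    candidates = [item for item in item_popularity if item not in user_items]
    weights = [item_popularity[item] for item in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# (user, item) interactions; popularity is the interaction count per item.
interactions = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2), (1, 3)]
item_popularity = Counter(item for _, item in interactions)

rng = random.Random(42)  # fixed seed so the draw is repeatable
negative = sample_negative({0, 1}, item_popularity, rng)
print(negative)
```

The sampled item then plays the role of the negative item in a BPR-style pairwise update.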
- fit(train_set, val_set=None)#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
Collaborative Topic Regression (CTR)#
- class cornac.models.ctr.recom_ctr.CTR(name='CTR', k=200, lambda_u=0.01, lambda_v=0.01, eta=0.01, a=1, b=0.01, max_iter=100, trainable=True, verbose=True, init_params=None, seed=None)[source]#
Collaborative Topic Regression.
- Parameters:
name (string, default: 'CTR') – The name of the recommender model.
k (int, optional, default: 200) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
lambda_u (float, optional, default: 0.01) – The regularization parameter for users.
lambda_v (float, optional, default: 0.01) – The regularization parameter for items.
a (float, optional, default: 1) – The confidence of observed ratings.
b (float, optional, default: 0.01) – The confidence of unseen ratings.
eta (float, optional, default: 0.01) – Added value for smoothing phi.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}
- U: ndarray, shape (n_users,k)
The user latent factors, optional initialization via init_params.
- V: ndarray, shape (n_items,k)
The item latent factors, optional initialization via init_params.
seed (int, optional, default: None) – Random seed for weight initialization.
References
Wang, Chong, and David M. Blei. “Collaborative topic modeling for recommending scientific articles.” Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2011.
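The roles of a and b can be illustrated with a confidence-weighted squared error, the rating term of the objective in the cited paper: observed entries are weighted by a and unseen entries by b. The sketch is a simplified illustration with hypothetical function names and toy values; it omits the regularization and topic-model terms.

```python
def confidence(rating, a=1.0, b=0.01):
    """Confidence weight: a for observed ratings, b for unseen (zero) entries."""
    return a if rating > 0 else b


def weighted_squared_error(ratings, predictions, a=1.0, b=0.01):
    """Sum over all entries of c_ij / 2 * (r_ij - prediction_ij)^2."""
    total = 0.0
    for rating_row, prediction_row in zip(ratings, predictions):
        for r, p in zip(rating_row, prediction_row):
            total += 0.5 * confidence(r, a, b) * (r - p) ** 2
    return total


ratings = [[1.0, 0.0]]      # one observed rating, one unseen entry
predictions = [[0.5, 0.5]]  # same absolute error on both entries
loss = weighted_squared_error(ratings, predictions)
print(loss)
```

With a much larger than b, the same prediction error costs far more on an observed rating than on an unseen entry.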
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
Baseline Only#
- cornac.models.baseline_only.recom_bo#
alias of module ‘cornac.models.baseline_only.recom_bo’
Bayesian Personalized Ranking (BPR)#
- class cornac.models.bpr.recom_bpr.BPR(name='BPR', k=10, max_iter=100, learning_rate=0.001, lambda_reg=0.01, use_bias=True, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#
Bayesian Personalized Ranking.
- Parameters:
k (int, optional, default: 10) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.001) – The learning rate for SGD.
lambda_reg (float, optional, default: 0.01) – The regularization hyper-parameter.
use_bias (boolean, optional, default: True) – When True, item bias is used.
num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bi’: item_biases}
seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
Rendle, Steffen, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In UAI, pp. 452-461. 2009.
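For orientation, the pairwise objective from the cited paper can be written out for a single (user, positive item, negative item) triple. The sketch below is a minimal illustration without item biases, not Cornac's optimized implementation; bpr_pair_loss and the toy vectors are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bpr_pair_loss(user, pos_item, neg_item, lambda_reg=0.01):
    """-log(sigmoid(x_ui - x_uj)) plus L2 regularization on the three vectors."""
    x_uij = dot(user, pos_item) - dot(user, neg_item)
    # -log(sigmoid(x)) is rewritten as log(1 + exp(-x)).
    ranking_term = math.log(1.0 + math.exp(-x_uij))
    reg_term = lambda_reg * (dot(user, user) + dot(pos_item, pos_item) + dot(neg_item, neg_item))
    return ranking_term + reg_term


user = [1.0, 0.0]
liked = [1.0, 0.0]
unseen = [0.0, 1.0]

good_order = bpr_pair_loss(user, liked, unseen)
bad_order = bpr_pair_loss(user, unseen, liked)
print(good_order, bad_order)
```

The loss is lower when the item the user interacted with scores higher than the sampled negative, which is the ordering SGD pushes towards.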
- fit(train_set, val_set=None)#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
- score(user_idx, item_idx=None)#
Predict the scores/ratings of a user for an item.
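- Parameters:
user_idx (int, required) – The index of the user for whom to predict scores.
item_idx (int, optional, default: None) – The index of the item to score. If None, scores for all known items are returned.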
- Returns:
res – Relative scores that the user gives to the item or to all known items
- Return type:
A scalar or a Numpy array
Factorization Machines (FM)#
Global Average (GlobalAvg)#
- class cornac.models.global_avg.recom_global_avg.GlobalAvg(name='GlobalAvg')[source]#
Global Average baseline for rating prediction. Rating predictions are equal to the average rating of the training data (not personalized).
- Parameters:
name (string, default: 'GlobalAvg') – The name of the recommender model.
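The baseline amounts to a single number. A minimal sketch, assuming the training data is a list of (user, item, rating) triples (the variable names and values are illustrative):

```python
def global_average(ratings):
    """Mean of all training ratings, used as the prediction for every pair."""
    return sum(rating for _, _, rating in ratings) / len(ratings)


train = [("u1", "i1", 4.0), ("u1", "i2", 2.0), ("u2", "i1", 3.0)]
mu = global_average(train)

# Every (user, item) pair receives the same predicted rating.
print(mu)
```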
Item K-Nearest-Neighbors (ItemKNN)#
- class cornac.models.knn.recom_knn.ItemKNN(name='ItemKNN', k=20, similarity='cosine', mean_centered=False, weighting=None, amplify=1.0, num_threads=0, trainable=True, verbose=True, seed=None)[source]#
Item-Based Nearest Neighbor.
- Parameters:
name (string, default: 'ItemKNN') – The name of the recommender model.
k (int, optional, default: 20) – The number of nearest neighbors.
similarity (str, optional, default: 'cosine') – The similarity measurement. Supported types: [‘cosine’, ‘pearson’]
mean_centered (bool, optional, default: False) – Whether values of the user-item rating matrix will be centered by the mean of their corresponding rows (mean rating of each user).
weighting (str, optional, default: None) – The option for re-weighting the rating matrix. Supported types: [‘idf’, ‘bm25’]. If None, no weighting is applied.
amplify (float, optional, default: 1.0) – Amplifying the influence on similarity weights.
num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
seed (int, optional, default: None) – Random seed for weight initialization.
References
Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001, April). Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th international conference on World Wide Web (pp. 285-295).
Aggarwal, C. C. (2016). Recommender systems (Vol. 1). Cham: Springer International Publishing.
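The 'cosine' similarity option and the role of k can be illustrated on a tiny dense rating matrix. This sketch shows only the neighbor search, without mean centering, re-weighting, or amplification, and is not Cornac's implementation; item_neighbors and the matrix values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity; 0.0 when either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def item_neighbors(rating_matrix, item, k):
    """Return the k items most similar to `item` as (item_index, similarity)."""
    columns = list(zip(*rating_matrix))  # one rating column per item
    sims = [(cosine(columns[item], col), j) for j, col in enumerate(columns) if j != item]
    sims.sort(reverse=True)
    return [(j, sim) for sim, j in sims[:k]]


# Rows are users, columns are items; 0 means no rating.
rating_matrix = [
    [5.0, 4.0, 0.0],
    [3.0, 3.0, 0.0],
    [0.0, 0.0, 2.0],
]
neighbors = item_neighbors(rating_matrix, item=0, k=1)
print(neighbors)
```

Items 0 and 1 are rated by the same users with similar values, so item 1 is the nearest neighbor of item 0.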
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
Learn to Rank user Preferences based on Phrase-level sentiment analysis across Multiple categories (LRPPM)#
- class cornac.models.lrppm.recom_lrppm.LRPPM(name='LRPPM', rating_scale=5, n_factors=8, ld=1, reg=0.01, alpha=1, num_top_aspects=99999, n_ranking_samples=1000, n_samples=200, max_iter=200000, lr=0.1, n_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#
Learn to Rank user Preferences based on Phrase-level sentiment analysis across Multiple categories (LRPPM)
- Parameters:
name (string, optional, default: 'LRPPM') – The name of the recommender model.
rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.
n_factors (int, optional, default: 8) – The dimension of the latent factors.
ld (float, optional, default: 1.0) – The control factor for aspect ranking objective.
reg (float, optional, default: 0.01) – The regularization parameter.
num_top_aspects (int, optional, default: 99999) – The number of top scored aspects for each (user, item) pair to construct ranking score.
alpha (float, optional, default: 1) – Trade-off factor for constructing ranking score.
n_ranking_samples (int, optional, default: 1000) – The number of samples from ranking pairs.
n_samples (int, optional, default: 200) – The number of samples from all ratings in each iteration.
max_iter (int, optional, default: 200000) – Maximum number of iterations for training.
lr (float, optional, default: 0.1) – The learning rate for optimization
n_threads (int, optional, default: 0) – Number of parallel threads for training. If n_threads=0, all CPU cores will be utilized. If seed is not None, n_threads=1 to remove randomness from parallelization.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, I, UA, and IA are not None).
verbose (boolean, optional, default: False) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘I’:I, ‘UA’:UA, ‘IA’:IA}
- U: ndarray, shape (n_users, n_factors)
The user latent factors, optional initialization via init_params
- I: ndarray, shape (n_items, n_factors)
The item latent factors, optional initialization via init_params
- UA: ndarray, shape (num_aspects, n_factors)
The user-aspect latent factors, optional initialization via init_params
- IA: ndarray, shape (num_aspects, n_factors)
The item-aspect latent factors, optional initialization via init_params
seed (int, optional, default: None) – Random seed for parameters initialization.
References
Xu Chen, Zheng Qin, Yongfeng Zhang, Tao Xu. 2016. Learning to Rank Features for Recommendation over Multiple Categories. Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval (SIGIR ‘16). ACM, New York, NY, USA, 305-314. DOI: https://doi.org/10.1145/2911451.2911549
- fit(train_set, val_set=None)#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- rank(user_idx, item_indices=None, k=-1)#
Rank all test items for a given user.
- Parameters:
user_idx (int, required) – The index of the user for whom to perform item ranking.
item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked by the user. If None, list of ranked known item indices and their scores will be returned.
k (int, required) – Cut-off length for recommendations, k=-1 will return ranked list of all items. This is more important for ANN to know the limit to avoid exhaustive ranking.
- Returns:
(ranked_items, item_scores) – ranked_items contains item indices being ranked by their scores. item_scores contains scores of items corresponding to index in item_indices input.
- Return type:
- score(u_idx, i_idx=None)#
Predict the scores/ratings of a user for an item.
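- Parameters:
u_idx (int, required) – The index of the user for whom to predict scores.
i_idx (int, optional, default: None) – The index of the item to score. If None, scores for all known items are returned.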
- Returns:
res – Relative scores that the user gives to the item or to all known items
- Return type:
A scalar or a Numpy array
Matrix Factorization (MF)#
- class cornac.models.mf.recom_mf.MF(name='MF', k=10, backend='cpu', optimizer='sgd', max_iter=20, learning_rate=0.01, batch_size=256, lambda_reg=0.02, dropout=0.0, use_bias=True, early_stop=False, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)[source]#
Matrix Factorization.
- Parameters:
k (int, optional, default: 10) – The dimension of the latent factors.
backend (str, optional, default: 'cpu') – Backend used for model training: cpu, pytorch
optimizer (str, optional, default: 'sgd') – Specify an optimizer: adagrad, adam, rmsprop, sgd. (ineffective if using CPU backend)
max_iter (int, optional, default: 20) – Maximum number of iterations or the number of epochs for training.
learning_rate (float, optional, default: 0.01) – The learning rate.
batch_size (int, optional, default: 256) – Batch size (ineffective if using CPU backend).
lambda_reg (float, optional, default: 0.02) – The lambda value used for regularization.
dropout (float, optional, default: 0.0) – The dropout rate of embedding. (ineffective if using CPU backend)
use_bias (boolean, optional, default: True) – When True, user, item, and global biases are used.
early_stop (boolean, optional, default: False) – When True, delta loss will be checked after each iteration to stop learning earlier.
num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization. (Only effective if using CPU backend).
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bu’: user_biases, ‘Bi’: item_biases}
seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
Koren, Y., Bell, R., & Volinsky, C. Matrix factorization techniques for recommender systems. In Computer, (8), 30-37. 2009.
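To show how the factors and the biases enabled by use_bias combine, the sketch below follows the biased matrix factorization model from the cited article: a prediction is the global mean plus user and item biases plus the dot product of the latent factors, and one SGD step moves each parameter along the regularized error gradient. It is a plain-Python illustration with hypothetical names and toy values, not Cornac's CPU or PyTorch backend.

```python
def predict(mu, b_u, b_i, p_u, q_i):
    """Global bias + user bias + item bias + dot product of latent factors."""
    return mu + b_u + b_i + sum(p * q for p, q in zip(p_u, q_i))


def sgd_step(rating, mu, b_u, b_i, p_u, q_i, learning_rate=0.01, lambda_reg=0.02):
    """One SGD update on a single observed rating; returns the updated parameters."""
    err = rating - predict(mu, b_u, b_i, p_u, q_i)
    new_b_u = b_u + learning_rate * (err - lambda_reg * b_u)
    new_b_i = b_i + learning_rate * (err - lambda_reg * b_i)
    new_p_u = [p + learning_rate * (err * q - lambda_reg * p) for p, q in zip(p_u, q_i)]
    new_q_i = [q + learning_rate * (err * p - lambda_reg * q) for p, q in zip(p_u, q_i)]
    return new_b_u, new_b_i, new_p_u, new_q_i


mu, b_u, b_i = 3.0, 0.5, -0.5
p_u, q_i = [1.0, 2.0], [0.5, 0.25]

before = predict(mu, b_u, b_i, p_u, q_i)
b_u, b_i, p_u, q_i = sgd_step(5.0, mu, b_u, b_i, p_u, q_i)
after = predict(mu, b_u, b_i, p_u, q_i)
print(before, after)
```

After one step the prediction moves from 4.0 towards the observed rating of 5.0.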
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
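The use_bias and init_params entries above name user factors, item factors, user biases, item biases, and a global bias. A minimal sketch of how such parameters are commonly combined into a rating estimate in biased matrix factorization follows. The function name and toy values are hypothetical, and this is an illustration of the general formula rather than Cornac's implementation.

```python
def predict_rating(mu, bu, bi, u, v):
    """Biased MF estimate: global mean + user bias + item bias + <u, v>."""
    return mu + bu + bi + sum(a * b for a, b in zip(u, v))


# Hypothetical toy parameters for one user and one item (k = 2).
mu = 3.5                      # global mean rating
user_bias, item_bias = 0.25, -0.5
user_factors = [1.0, 0.5]     # one row of U
item_factors = [0.5, 1.0]     # one row of V

rating = predict_rating(mu, user_bias, item_bias, user_factors, item_factors)
print(rating)
```

With use_bias=False, the same estimate would reduce to the dot product of the two factor vectors.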
Maximum Margin Matrix Factorization (MMMF)#
- class cornac.models.mmmf.recom_mmmf.MMMF(name='MMMF', k=10, max_iter=100, learning_rate=0.001, lambda_reg=0.01, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#
Maximum Margin Matrix Factorization. This implements an MF model optimized for the soft-margin (hinge) ranking loss, trained with SGD in a manner similar to the BPR model.
- Parameters:
k (int, optional, default: 10) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.001) – The learning rate for SGD.
lambda_reg (float, optional, default: 0.01) – The regularization hyper-parameter.
num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {'U': user_factors, 'V': item_factors, 'Bi': item_biases}
seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
Weimer, M., Karatzoglou, A., & Smola, A. (2008). Improving maximum margin matrix factorization. Machine Learning, 72(3), 263-276.
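As a rough illustration of the soft-margin (hinge) ranking loss mentioned above, the sketch below scores one (positive item, negative item) pair for a user. The margin of 1.0 and the function name are assumptions made for the example, not values taken from Cornac.

```python
def hinge_ranking_loss(score_pos, score_neg, margin=1.0):
    """Penalize a pair unless the positive item outscores the negative by `margin`."""
    return max(0.0, margin - (score_pos - score_neg))


# Positive item is ahead by 1.5, so the margin is satisfied and the loss is zero.
loss_ok = hinge_ranking_loss(2.0, 0.5)

# Positive item is ahead by only 0.25, so the shortfall of 0.75 is penalized.
loss_violation = hinge_ranking_loss(0.5, 0.25)

print(loss_ok, loss_violation)
```

Unlike the smooth log-sigmoid objective used by BPR, this loss contributes no gradient once a pair is ranked correctly by at least the margin.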
Most Popular (MostPop)#
- class cornac.models.most_pop.recom_most_pop.MostPop(name='MostPop')[source]#
Most Popular. Items are recommended based on their popularity (not personalized).
- Parameters:
name (string, default: 'MostPop') – The name of the recommender model.
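To make the idea of non-personalized popularity ranking concrete, here is a small sketch that counts interactions per item and returns the most frequent ones. The data layout (a list of user-item pairs) and the helper name are hypothetical; Cornac's own class works on a cornac.data.Dataset.

```python
from collections import Counter


def most_popular(interactions, n=2):
    """Rank items by how many interactions they received, ignoring the user."""
    counts = Counter(item for _, item in interactions)
    return [item for item, _ in counts.most_common(n)]


# Hypothetical (user, item) interaction log.
interactions = [
    ("u1", "a"), ("u2", "a"), ("u3", "a"),
    ("u1", "b"), ("u2", "b"),
    ("u3", "c"),
]

top_items = most_popular(interactions)
print(top_items)
```

Every user receives the same ranking, which is why this model is mainly useful as a baseline.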
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
Non-negative Matrix Factorization (NMF)#
- class cornac.models.nmf.recom_nmf.NMF(name='NMF', k=15, max_iter=50, learning_rate=0.005, lambda_reg=0.0, lambda_u=0.06, lambda_v=0.06, lambda_bu=0.02, lambda_bi=0.02, use_bias=False, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#
Non-negative Matrix Factorization
- Parameters:
k (int, optional, default: 15) – The dimension of the latent factors.
max_iter (int, optional, default: 50) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.005) – The learning rate.
lambda_reg (float, optional, default: 0.0) – The lambda value used for regularization of all parameters.
lambda_u (float, optional, default: 0.06) – The regularization parameter for user factors U.
lambda_v (float, optional, default: 0.06) – The regularization parameter for item factors V.
lambda_bu (float, optional, default: 0.02) – The regularization parameter for user biases Bu.
lambda_bi (float, optional, default: 0.02) – The regularization parameter for item biases Bi.
use_bias (boolean, optional, default: False) – When True, user, item, and global biases are used.
num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {'U': user_factors, 'V': item_factors, 'Bu': user_biases, 'Bi': item_biases, 'mu': global_mean}
seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
Lee, D. D., & Seung, H. S. (2001). Algorithms for non-negative matrix factorization. In Advances in neural information processing systems (pp. 556-562).
Takahashi, N., Katayama, J., & Takeuchi, J. I. (2014). A generalized sufficient condition for global convergence of modified multiplicative updates for NMF. In Proceedings of 2014 International Symposium on Nonlinear Theory and its Applications (pp. 44-47).
- fit(train_set, val_set=None)#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
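The three methods above expose item vectors (the index), user vectors (the queries), and the dot product as the similarity measure. The sketch below shows the exact, brute-force search that an ANN index approximates. The function names and toy vectors are hypothetical and do not use any Cornac API.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def top_k_items(user_vector, item_vectors, k=2):
    """Return the indices of the k items with the largest dot product."""
    scored = [(dot(user_vector, v), idx) for idx, v in enumerate(item_vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Hypothetical item matrix (3 items, 2 factors) and one user query vector.
item_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
user_vector = [2.0, 1.0]

top = top_k_items(user_vector, item_vectors)
print(top)
```

An ANN library trades some of this exactness for speed when the item matrix is large.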
- score(user_idx, item_idx=None)#
Predict the scores/ratings of a user for an item.
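- Parameters:
user_idx (int, required) – The index of the user for whom to perform score prediction.
item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.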
- Returns:
res – Relative scores that the user gives to the item or to all known items
- Return type:
A scalar or a Numpy array
Probabilistic Matrix Factorization (PMF)#
- class cornac.models.pmf.recom_pmf.PMF(k=5, max_iter=100, learning_rate=0.001, gamma=0.9, lambda_reg=0.001, name='PMF', variant='non_linear', trainable=True, verbose=False, init_params=None, seed=None)[source]#
Probabilistic Matrix Factorization.
- Parameters:
k (int, optional, default: 5) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.
gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.
lambda_reg (float, optional, default: 0.001) – The regularization coefficient.
name (string, optional, default: 'PMF') – The name of the recommender model.
variant ({"linear","non_linear"}, optional, default: 'non_linear') – PMF variant. If ‘non_linear’, the Gaussian mean is the output of a Sigmoid function. If ‘linear’, the Gaussian mean is the output of the identity function.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
init_params (dict, optional, default: None) –
List of initial parameters, e.g., init_params = {'U': U, 'V': V}.
- U: ndarray, shape (n_users, k)
User latent factors.
- V: ndarray, shape (n_items, k)
Item latent factors.
seed (int, optional, default: None) – Random seed for parameters initialization.
References
Mnih, Andriy, and Ruslan R. Salakhutdinov. Probabilistic matrix factorization. In NIPS, pp. 1257-1264. 2008.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
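The variant parameter described above decides how the Gaussian mean is obtained from the user-item dot product. The following sketch contrasts the two options; the function name and toy vectors are hypothetical, and the code illustrates the description only, not Cornac's internals.

```python
import math


def gaussian_mean(u, v, variant="non_linear"):
    """Mean of the rating distribution for one user-item pair."""
    score = sum(a * b for a, b in zip(u, v))
    if variant == "linear":
        return score                            # identity function
    if variant == "non_linear":
        return 1.0 / (1.0 + math.exp(-score))   # sigmoid, bounded in (0, 1)
    raise ValueError("variant must be 'linear' or 'non_linear'")


# Hypothetical latent factors whose dot product is exactly zero.
u = [1.0, -1.0]
v = [0.5, 0.5]

mean_linear = gaussian_mean(u, v, variant="linear")
mean_non_linear = gaussian_mean(u, v, variant="non_linear")
print(mean_linear, mean_non_linear)
```

Because the sigmoid output lies strictly between 0 and 1, the non-linear variant only makes sense when the targets are on a comparable scale.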
Session Popular (SPop)#
- class cornac.models.spop.recom_spop.SPop(name='SPop', use_session_popularity=True)[source]#
Recommend most popular items of the current session.
- Parameters:
name (string, default: 'SPop') – The name of the recommender model.
use_session_popularity (boolean, optional, default: True) – When False, item frequencies from the history items in the current session are not used.
References
Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, Domonkos Tikk: Session-based Recommendations with Recurrent Neural Networks, ICLR 2016
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- score(user_idx, history_items, **kwargs)[source]#
Predict the scores for all items based on input history items
- Parameters:
history_items (list of lists) – The list of history items in sequential manner for next-item prediction.
- Returns:
res – Relative scores of all known items
- Return type:
a Numpy array
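One plausible way to combine training popularity with the frequencies of the current session is sketched below. The weighting scheme (normalized training popularity plus raw session counts) is an assumption chosen for illustration and may differ from what Cornac computes; all names are hypothetical.

```python
from collections import Counter


def spop_scores(train_item_counts, history_items, use_session_popularity=True):
    """Score items by training popularity, optionally boosted by session counts."""
    total = sum(train_item_counts.values())
    # Normalized training popularity is below 1, so it mainly acts as a tie-breaker.
    scores = {item: count / total for item, count in train_item_counts.items()}
    if use_session_popularity:
        for item, count in Counter(history_items).items():
            scores[item] = scores.get(item, 0.0) + count
    return scores


def ranking(scores):
    """Items sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


train_item_counts = {"a": 5, "b": 3, "c": 2}   # hypothetical training counts
history_items = ["c", "c", "b"]                # items seen so far in the session

with_session = ranking(spop_scores(train_item_counts, history_items))
without_session = ranking(spop_scores(train_item_counts, history_items, False))
print(with_session, without_session)
```

In this sketch, switching the flag off falls back to the global popularity order.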
Session-based Recommendations with Recurrent Neural Networks (GRU4Rec)#
- class cornac.models.gru4rec.recom_gru4rec.GRU4Rec(name='GRU4Rec', layers=[100], loss='cross-entropy', batch_size=512, dropout_p_embed=0.0, dropout_p_hidden=0.0, learning_rate=0.05, momentum=0.0, sample_alpha=0.5, n_sample=2048, embedding=0, constrained_embedding=True, n_epochs=10, bpreg=1.0, elu_param=0.5, logq=0.0, device='cpu', trainable=True, verbose=False, seed=None)[source]#
Session-based Recommendations with Recurrent Neural Networks
- Parameters:
name (string, default: 'GRU4Rec') – The name of the recommender model.
layers (list of int, optional, default: [100]) – The number of hidden units in each layer
loss (str, optional, default: 'cross-entropy') – Select the loss function.
batch_size (int, optional, default: 512) – Batch size
dropout_p_embed (float, optional, default: 0.0) – Dropout ratio for embedding layers
dropout_p_hidden (float, optional, default: 0.0) – Dropout ratio for hidden layers
learning_rate (float, optional, default: 0.05) – Learning rate for the optimizer
momentum (float, optional, default: 0.0) – Momentum for adaptive learning rate
sample_alpha (float, optional, default: 0.5) – Trade-off factor that controls the contribution of negative samples to the final loss.
n_sample (int, optional, default: 2048) – Number of negative samples
embedding (int, optional, default: 0)
constrained_embedding (bool, optional, default: True)
n_epochs (int, optional, default: 10)
bpreg (float, optional, default: 1.0) – Regularization coefficient for ‘bpr-max’ loss.
elu_param (float, optional, default: 0.5) – Elu param for ‘bpr-max’ loss
logq (float, optional, default: 0.0) – LogQ correction to offset the sampling bias affecting ‘cross-entropy’ loss.
device (str, optional, default: 'cpu') – Set to ‘cuda’ for GPU support.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
seed (int, optional, default: None) – Random seed for weight initialization.
References
Hidasi, B., Karatzoglou, A., Baltrunas, L., & Tikk, D. (2015). Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- score(user_idx, history_items, **kwargs)[source]#
Predict the scores for all items based on input history items
- Parameters:
history_items (list of lists) – The list of history items in sequential manner for next-item prediction.
- Returns:
res – Relative scores of all known items
- Return type:
a Numpy array
Singular Value Decomposition (SVD)#
- class cornac.models.svd.recom_svd.SVD(name='SVD', k=10, max_iter=20, learning_rate=0.01, lambda_reg=0.02, early_stop=False, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)[source]#
Singular Value Decomposition (SVD). The implementation is based on Matrix Factorization with biases.
- Parameters:
k (int, optional, default: 10) – The dimension of the latent factors.
max_iter (int, optional, default: 20) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.01) – The learning rate.
lambda_reg (float, optional, default: 0.02) – The lambda value used for regularization.
early_stop (boolean, optional, default: False) – When True, delta loss will be checked after each iteration to stop learning earlier.
num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.
verbose (boolean, optional, default: False) – When True, running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {'U': user_factors, 'V': item_factors, 'Bu': user_biases, 'Bi': item_biases}
seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
Koren, Y. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In SIGKDD, pp. 426-434. 2008.
Koren, Y. Factor in the neighbors: Scalable and accurate collaborative filtering. In TKDD, 2010.
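The early_stop option above is described as checking the change in loss after each iteration. A minimal sketch of such a check follows. The tolerance value, the function name, and the precomputed loss sequence are all hypothetical; Cornac's actual stopping criterion may differ.

```python
def epochs_until_stop(losses, tol=1e-3):
    """Return the epoch at which the loss change first drops below `tol`."""
    previous = None
    for epoch, loss in enumerate(losses, start=1):
        if previous is not None and abs(previous - loss) < tol:
            return epoch          # improvement too small, stop here
        previous = loss
    return len(losses)            # ran through every epoch (max_iter reached)


# Hypothetical per-epoch training losses.
losses = [1.0, 0.5, 0.4, 0.3995, 0.39]

stopped_at = epochs_until_stop(losses)
print(stopped_at)
```

Training here halts at the fourth epoch because the loss moved by only about 0.0005, even though a later epoch would have improved further. That is the usual trade-off of a delta-based criterion.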
Social Recommendation using PMF (SoRec)#
- class cornac.models.sorec.recom_sorec.SoRec(name='SoRec', k=5, max_iter=100, learning_rate=0.001, lambda_c=10, lambda_reg=0.001, gamma=0.9, weight_link=True, trainable=True, verbose=False, init_params=None, seed=None)[source]#
Social recommendation using Probabilistic Matrix Factorization.
- Parameters:
k (int, optional, default: 5) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.
gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.
lambda_c (float, optional, default: 10) – The parameter balancing the information from the user-item rating matrix and the user social network.
lambda_reg (float, optional, default: 0.001) – The regularization parameter.
weight_link (boolean, optional, default: True) – When True, the social network links are weighted according to eq. (4) in the original paper.
name (string, optional, default: 'SoRec') – The name of the recommender model.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, V and Z are not None).
verbose (boolean, optional, default: False) – When True, some running logs are displayed.
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {'U': U, 'V': V, 'Z': Z}.
- U: a ndarray of shape (n_users, k)
Containing the user latent factors.
- V: a ndarray of shape (n_items, k)
Containing the item latent factors.
- Z: a ndarray of shape (n_users, k)
Containing the social network latent factors.
seed (int, optional, default: None) – Random seed for parameters initialization.
References
H. Ma, H. Yang, M. R. Lyu, and I. King. SoRec: Social recommendation using probabilistic matrix factorization. CIKM ’08, pages 931–940, Napa Valley, USA, 2008.
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
- score(user_idx, item_idx=None)[source]#
Predict the scores/ratings of a user for an item.
- Parameters:
user_idx (int, required) – The index of the user for whom to perform score prediction.
item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.
- Returns:
res – Relative scores that the user gives to the item or to all known items
- Return type:
A scalar or a Numpy array
User K-Nearest-Neighbors (UserKNN)#
- class cornac.models.knn.recom_knn.UserKNN(name='UserKNN', k=20, similarity='cosine', mean_centered=False, weighting=None, amplify=1.0, num_threads=0, trainable=True, verbose=True, seed=None)[source]#
User-Based Nearest Neighbor.
- Parameters:
name (string, default: 'UserKNN') – The name of the recommender model.
k (int, optional, default: 20) – The number of nearest neighbors.
similarity (str, optional, default: 'cosine') – The similarity measurement. Supported types: [‘cosine’, ‘pearson’]
mean_centered (bool, optional, default: False) – Whether values of the user-item rating matrix will be centered by the mean of their corresponding rows (mean rating of each user).
weighting (str, optional, default: None) – The option for re-weighting the rating matrix. Supported types: [‘idf’, ‘bm25’]. If None, no weighting is applied.
amplify (float, optional, default: 1.0) – Amplifying the influence on similarity weights.
num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
seed (int, optional, default: None) – Random seed for weight initialization.
References
Breese, J. S., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. Microsoft Research, Microsoft Corporation, One Microsoft Way, Redmond, WA 98052.
Aggarwal, C. C. (2016). Recommender systems (Vol. 1). Cham: Springer International Publishing.
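The similarity and mean_centered options above can be illustrated with a small sketch that computes cosine similarity between two users, with and without subtracting each user's mean rating. The dictionary-based rating layout and function names are hypothetical, and the code is a simplified illustration rather than Cornac's implementation.

```python
import math


def mean_center(ratings):
    """Subtract the user's mean rating from each of their ratings."""
    mean = sum(ratings.values()) / len(ratings)
    return {item: r - mean for item, r in ratings.items()}


def cosine(x, y):
    """Cosine similarity between two sparse rating vectors (item -> rating)."""
    numerator = sum(x[i] * y[i] for i in set(x) & set(y))
    norm_x = math.sqrt(sum(r * r for r in x.values()))
    norm_y = math.sqrt(sum(r * r for r in y.values()))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return numerator / (norm_x * norm_y)


# Hypothetical users: carol prefers item "a", dave prefers item "b".
carol = {"a": 5.0, "b": 4.0}
dave = {"a": 1.0, "b": 2.0}

raw_similarity = cosine(carol, dave)
centered_similarity = cosine(mean_center(carol), mean_center(dave))
print(raw_similarity, centered_similarity)
```

On raw ratings the two users look similar because all ratings are positive. After centering, their opposite preferences show up as a similarity close to -1.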
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
Weighted Matrix Factorization (WMF)#
- class cornac.models.wmf.recom_wmf.WMF(name='WMF', k=200, lambda_u=0.01, lambda_v=0.01, a=1, b=0.01, learning_rate=0.001, batch_size=128, max_iter=100, trainable=True, verbose=True, init_params=None, seed=None)[source]#
Weighted Matrix Factorization.
- Parameters:
name (string, default: 'WMF') – The name of the recommender model.
k (int, optional, default: 200) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.001) – The learning rate for AdamOptimizer.
lambda_u (float, optional, default: 0.01) – The regularization parameter for users.
lambda_v (float, optional, default: 0.01) – The regularization parameter for items.
a (float, optional, default: 1) – The confidence of observed ratings.
b (float, optional, default: 0.01) – The confidence of unseen ratings.
batch_size (int, optional, default: 128) – The batch size for SGD.
trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {'U': U, 'V': V}
- U: ndarray, shape (n_users,k)
The user latent factors, optional initialization via init_params.
- V: ndarray, shape (n_items,k)
The item latent factors, optional initialization via init_params.
seed (int, optional, default: None) – Random seed for weight initialization.
References
Hu, Y., Koren, Y., & Volinsky, C. (2008, December). Collaborative filtering for implicit feedback datasets. In 2008 Eighth IEEE International Conference on Data Mining (pp. 263-272).
Pan, R., Zhou, Y., Cao, B., Liu, N. N., Lukose, R., Scholz, M., & Yang, Q. (2008, December). One-class collaborative filtering. In 2008 Eighth IEEE International Conference on Data Mining (pp. 502-511).
- fit(train_set, val_set=None)[source]#
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
- get_item_vectors()[source]#
Getting a matrix of item vectors used for building the index for ANN search.
- Returns:
out – Matrix of item vectors for all items available in the model.
- Return type:
numpy.array
- get_user_vectors()[source]#
Getting a matrix of user vectors serving as query for ANN search.
- Returns:
out – Matrix of user vectors for all users available in the model.
- Return type:
numpy.array
- get_vector_measure()[source]#
Getting a valid choice of vector measurement in ANNMixin._measures.
- Returns:
measure – Dot product aka. inner product
- Return type:
MEASURE_DOT
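Parameters a and b above set the confidence placed on observed and unseen ratings. The sketch below shows how such confidences typically weight a squared error term in weighted matrix factorization. Treating observed entries as target 1.0 and unseen entries as target 0.0 is an assumption made for the example, as is the function name.

```python
def weighted_squared_error(observed, prediction, a=1.0, b=0.01):
    """Squared error scaled by confidence a (observed) or b (unseen)."""
    confidence = a if observed else b
    target = 1.0 if observed else 0.0   # assumed binary implicit-feedback targets
    return confidence * (target - prediction) ** 2


# The same prediction of 0.5 is penalized very differently in the two cases.
loss_observed = weighted_squared_error(True, 0.5)
loss_unseen = weighted_squared_error(False, 0.5)
print(loss_observed, loss_unseen)
```

With the defaults a=1 and b=0.01, an error on an observed entry counts one hundred times as much as the same error on an unseen entry.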
Social Bayesian Personalized Ranking (SBPR)#
Social Bayesian Personalized Ranking.
- Parameters:
k (int, optional, default: 10) – The dimension of the latent factors.
max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
learning_rate (float, optional, default: 0.001) – The learning rate for SGD.
lambda_u (float, optional, default: 0.001) – The regularization hyper-parameter of user factors.
lambda_v (float, optional, default: 0.001) – The regularization hyper-parameter of item factors.
lambda_b (float, optional, default: 0.001) – The regularization hyper-parameter of item biases.
use_bias (boolean, optional, default: True) – When True, item bias is used.
num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.
verbose (boolean, optional, default: True) – When True, some running logs are displayed.
init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {'U': user_factors, 'V': item_factors, 'Bi': item_biases}
seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
Zhao, T., McAuley, J., & King, I. (2014, November). Leveraging social connections to improve personalized ranking for collaborative filtering. CIKM 2014 (pp. 261-270).
Fit the model to observations.
- Parameters:
train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
- Returns:
self
- Return type:
object