Models¶
Recommender (Generic Class)¶
-
class
cornac.models.recommender.
Recommender
(name, trainable=True, verbose=False)[source]¶ Generic class for a recommender model. All recommendation models should inherit from this class
Parameters: - name (str, required) – The name of the recommender model
- trainable (boolean, optional, default: True) – When False, the model is not trainable
-
clone
(new_params=None)[source]¶ Clone an instance of the model object.
Parameters: new_params (dict, optional, default: None) – New parameters for the cloned instance. Returns: object Return type: cornac.models.Recommender
-
default_score
()[source]¶ Override this function if your algorithm has special treatment for the cold-start problem.
-
early_stop
(min_delta=0.0, patience=0)[source]¶ Check if training should be stopped when validation loss has stopped improving.
Parameters: - min_delta (float, optional, default: 0.) – The minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.
- patience (int, optional, default: 0) – Number of epochs with no improvement after which training should be stopped.
Returns: res – Return True if model training should be stopped (no improvement on validation set), otherwise return False.
Return type: bool
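The stopping rule described for early_stop() can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Cornac's implementation: the EarlyStopper class and its update() method are hypothetical names, the monitored value is assumed to be "higher is better", and an epoch counts as an improvement when the value rises by at least min_delta over the best value seen so far.

```python
class EarlyStopper:
    """Hypothetical helper mirroring the min_delta / patience semantics above."""

    def __init__(self, min_delta=0.0, patience=0):
        self.min_delta = min_delta
        self.patience = patience
        self.best_value = float("-inf")  # best monitored value seen so far
        self.wait = 0                    # consecutive epochs without improvement

    def update(self, current_value):
        """Return True if training should stop after observing current_value."""
        if current_value is None:
            # No validation data available: never request a stop.
            return False
        if current_value - self.best_value >= self.min_delta:
            # Improvement of at least min_delta: record it and reset the counter.
            self.best_value = current_value
            self.wait = 0
            return False
        # An increase smaller than min_delta counts as no improvement.
        self.wait += 1
        return self.wait >= self.patience


stopper = EarlyStopper(min_delta=0.01, patience=2)
decisions = [stopper.update(v) for v in [0.50, 0.60, 0.605, 0.604]]
```

In this run the last two values improve on the best value (0.60) by less than min_delta, so the stop is requested once two such epochs have accumulated.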
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
-
static
load
(model_path, trainable=False)[source]¶ Load a recommender model from the filesystem.
Parameters: - model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.
- trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.
Returns: self
Return type:
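How load() might resolve a directory to "the latest model" can be sketched as follows. This is an illustration under assumptions, not Cornac's code: resolve_model_file is a hypothetical helper, and it assumes that saved files carry a .pkl extension and timestamp-like names whose sorted order matches their creation order.

```python
import os


def resolve_model_file(model_path, extension=".pkl"):
    """Hypothetical helper: return model_path itself if it is a file,
    otherwise the last matching file (in sorted name order) inside it."""
    if not os.path.isdir(model_path):
        return model_path
    candidates = sorted(
        name for name in os.listdir(model_path) if name.endswith(extension)
    )
    if not candidates:
        raise FileNotFoundError(f"no '{extension}' files found in {model_path}")
    return os.path.join(model_path, candidates[-1])
```

Sorting by name rather than by modification time keeps the choice stable when files are copied between machines, provided the naming assumption holds.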
-
monitor_value
()[source]¶ Calculate the monitored value used for early stopping on the validation set (val_set). This function is called by early_stop(). Note: val_set may be None, so it must be checked before use.
Raises: NotImplementedError – Raised by this base class; models that support early stopping should override the method.
-
rank
(user_idx, item_indices=None)[source]¶ Rank all test items for a given user.
Parameters: - user_idx (int, required) – The index of the user for whom to perform item ranking.
- item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked for the user. If None, the ranked list of known item indices and their scores is returned.
Returns: (item_rank, item_scores) – item_rank contains item indices ranked by their scores. item_scores contains the scores of items corresponding to their indices in the item_indices input.
Return type: tuple
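The relationship between item_rank and item_scores can be illustrated with a small pure-Python sketch. It is not the library's implementation: rank_items is a hypothetical function, and known_item_scores is assumed to hold the user's score for every known item, indexed by item index.

```python
def rank_items(known_item_scores, item_indices=None):
    """Hypothetical helper: rank items by descending score.

    known_item_scores[i] is assumed to be the user's score for item index i.
    """
    if item_indices is None:
        # Rank every known item; scores are returned in item-index order.
        item_scores = list(known_item_scores)
        item_rank = sorted(
            range(len(item_scores)), key=lambda i: item_scores[i], reverse=True
        )
    else:
        # Score only the candidates, in the order they were given.
        item_scores = [known_item_scores[i] for i in item_indices]
        item_rank = sorted(
            item_indices, key=lambda i: known_item_scores[i], reverse=True
        )
    return item_rank, item_scores
```

Note that item_rank is ordered by score, while item_scores keeps the order of the candidate list, which matches the description of the return value above.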
-
rate
(user_idx, item_idx, clipping=True)[source]¶ Give a rating score for a user-item pair.
Parameters: Returns: A rating score of the user for the item
Return type: A scalar
-
save
(save_dir=None)[source]¶ Save a recommender model to the filesystem.
Parameters: save_dir (str, default: None) – Path to a directory for the model to be stored. Returns: model_file – Path to the model file stored on the filesystem. Return type: str
-
score
(user_idx, item_idx=None)[source]¶ Predict the scores/ratings of a user for an item.
Parameters: Returns: res – Relative scores that the user gives to the item or to all known items
Return type: A scalar or a Numpy array
-
transform
(test_set)[source]¶ Transform test set into cached results accelerating the score function. This function is supposed to be called in the cornac.eval_methods.BaseMethod before evaluation step. It is optional for this function to be implemented.
Parameters: test_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
Bilateral VAE for Collaborative Filtering (BiVAECF)¶
-
class
cornac.models.bivaecf.recom_bivaecf.
BiVAECF
(name='BiVAECF', k=10, encoder_structure=[20], act_fn='tanh', likelihood='pois', n_epochs=100, batch_size=100, learning_rate=0.001, beta_kl=1.0, cap_priors={'item': False, 'user': False}, trainable=True, verbose=False, seed=None, use_gpu=True)[source]¶ Bilateral Variational AutoEncoder for Collaborative Filtering.
Parameters: - k (int, optional, default: 10) – The dimension of the stochastic user “theta” and item “beta” factors.
- encoder_structure (list, default: [20]) – The number of neurons per layer of the user and item encoders for BiVAE. For example, with encoder_structure = [20], the user (item) encoder structure will be [num_items, 20, k] ([num_users, 20, k]).
- act_fn (str, default: 'tanh') – Name of the activation function used between hidden layers of the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’]
- likelihood (str, default: 'pois') –
The likelihood function used for modeling the observations. Supported choices:
bern: Bernoulli likelihood; gaus: Gaussian likelihood; pois: Poisson likelihood.
- n_epochs (int, optional, default: 100) – The number of epochs for SGD.
- batch_size (int, optional, default: 100) – The batch size.
- learning_rate (float, optional, default: 0.001) – The learning rate for Adam.
- beta_kl (float, optional, default: 1.0) – The weight of the KL terms as in beta-VAE.
- cap_priors (dict, optional, default: {"user":False, "item":False}) – When {"user":True, "item":True}, CAP priors are used (see the BiVAE paper for details); otherwise the standard Normal is used as the prior over the user and item latent variables.
- name (string, optional, default: 'BiVAECF') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
- verbose (boolean, optional, default: False) – When True, some running logs are displayed.
- seed (int, optional, default: None) – Random seed for parameters initialization.
- use_gpu (boolean, optional, default: True) – If True and your system supports CUDA then training is performed on GPUs.
References
- Quoc-Tuan Truong, Aghiles Salah, Hady W. Lauw. “Bilateral Variational Autoencoder for Collaborative Filtering.”
ACM International Conference on Web Search and Data Mining (WSDM). 2021.
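The way encoder_structure and k determine the two encoder shapes can be sketched as below. The function name bivae_encoder_layers is hypothetical and only reproduces the layer-size rule stated for encoder_structure; it builds no network.

```python
def bivae_encoder_layers(num_users, num_items, k=10, encoder_structure=(20,)):
    """Hypothetical helper: layer sizes of the user and item encoders."""
    hidden = list(encoder_structure)
    # The user encoder takes one input per item and outputs k factors.
    user_encoder = [num_items] + hidden + [k]
    # The item encoder takes one input per user and outputs k factors.
    item_encoder = [num_users] + hidden + [k]
    return user_encoder, item_encoder
```

For example, a dataset with 1000 users and 500 items and the default settings gives a user encoder of [500, 20, 10] and an item encoder of [1000, 20, 10].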
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
Causal Inference for Visual Debiasing in Visually-Aware Recommendation (CausalRec)¶
-
class
cornac.models.causalrec.recom_causalrec.
CausalRec
(name='CausalRec', k=10, k2=10, n_epochs=50, batch_size=100, learning_rate=0.005, lambda_w=0.01, lambda_b=0.01, lambda_e=0.0, mean_feat=None, tanh=0, lambda_2=0.8, use_gpu=False, trainable=True, verbose=True, init_params=None, seed=None)[source]¶ CausalRec: Causal Inference for Visual Debiasing in Visually-Aware Recommendation
Parameters: - k (int, optional, default: 10) – The dimension of the gamma latent factors.
- k2 (int, optional, default: 10) – The dimension of the theta latent factors.
- n_epochs (int, optional, default: 50) – Maximum number of epochs for SGD.
- batch_size (int, optional, default: 100) – The batch size for SGD.
- learning_rate (float, optional, default: 0.005) – The learning rate for SGD.
- lambda_w (float, optional, default: 0.01) – The regularization hyper-parameter for latent factor weights.
- lambda_b (float, optional, default: 0.01) – The regularization hyper-parameter for biases.
- lambda_e (float, optional, default: 0.0) – The regularization hyper-parameter for embedding matrix E and beta prime vector.
- mean_feat (torch.tensor, required, default: None) – The mean feature of all item embeddings serving as the no-treatment during causal inference.
- tanh (int, optional, default: 0) – The number of tanh layers on the visual feature transformation.
- lambda_2 (float, optional, default: 0.8) – The coefficient controlling the elimination of the visual bias in Eq. (28).
- use_gpu (boolean, optional, default: False) – Whether or not to use GPU to speed up training.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- verbose (boolean, optional, default: True) – When True, running logs are displayed.
- init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘Bi’: beta_item, ‘Gu’: gamma_user, ‘Gi’: gamma_item, ‘Tu’: theta_user, ‘E’: emb_matrix, ‘Bp’: beta_prime}
- seed (int, optional, default: None) – Random seed for weight initialization.
References
- Qiu R., Wang S., Chen Z., Yin H., Huang Z. (2021). CausalRec: Causal Inference for Visual Debiasing in Visually-Aware Recommendation.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
Explainable Recommendation with Comparative Constraints on Product Aspects (ComparER)¶
-
class
cornac.models.comparer.recom_comparer_sub.
ComparERSub
¶ Explainable Recommendation with Comparative Constraints on Subjective Aspect-Level Quality
Parameters: - name (string, optional, default: 'ComparERSub') – The name of the recommender model.
- rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.
- n_user_factors (int, optional, default: 15) – The dimension of the user latent factors.
- n_item_factors (int, optional, default: 15) – The dimension of the item latent factors.
- n_aspect_factors (int, optional, default: 12) – The dimension of the aspect latent factors.
- n_opinion_factors (int, optional, default: 12) – The dimension of the opinion latent factors.
- n_bpr_samples (int, optional, default: 1000) – The number of samples from all BPR pairs.
- n_element_samples (int, optional, default: 50) – The number of samples from all ratings in each iteration.
- n_top_aspects (int, optional, default: 100) – The number of top scored aspects for each (user, item) pair to construct ranking score.
- alpha (float, optional, default: 0.5) – Trade-off factor for constructing ranking score.
- lambda_reg (float, optional, default: 0.1) – The regularization parameter.
- lambda_bpr (float, optional, default: 10.0) – The regularization parameter for BPR.
- max_iter (int, optional, default: 200000) – Maximum number of iterations for training.
- lr (float, optional, default: 0.1) – The learning rate for optimization
- n_threads (int, optional, default: 0) – Number of parallel threads for training. If n_threads=0, all CPU cores will be utilized. If seed is not None, n_threads=1 to remove randomness from parallelization.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, I, A, O, G1, G2, and G3 are not None).
- verbose (boolean, optional, default: False) – When True, running logs are displayed.
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘I’:I, ‘A’:A, ‘O’:O, ‘G1’:G1, ‘G2’:G2, ‘G3’:G3}
- U: ndarray, shape (n_users, n_user_factors)
- The user latent factors, optional initialization via init_params
- I: ndarray, shape (n_items, n_item_factors)
- The item latent factors, optional initialization via init_params
- A: ndarray, shape (num_aspects+1, n_aspect_factors)
- The aspect latent factors, optional initialization via init_params
- O: ndarray, shape (num_opinions, n_opinion_factors)
- The opinion latent factors, optional initialization via init_params
- G1: ndarray, shape (n_user_factors, n_item_factors, n_aspect_factors)
- The core tensor for user, item, and aspect factors, optional initialization via init_params
- G2: ndarray, shape (n_user_factors, n_aspect_factors, n_opinion_factors)
- The core tensor for user, aspect, and opinion factors, optional initialization via init_params
- G3: ndarray, shape (n_item_factors, n_aspect_factors, n_opinion_factors)
- The core tensor for item, aspect, and opinion factors, optional initialization via init_params
- seed (int, optional, default: None) – Random seed for parameters initialization.
References
- Trung-Hoang Le and Hady W. Lauw. “Explainable Recommendation with Comparative Constraints on Product Aspects.”
ACM International Conference on Web Search and Data Mining (WSDM). 2021.
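When supplying init_params, each array has to match the shape listed for it. The following sketch collects those shapes in one place; comparer_sub_param_shapes is a hypothetical helper, and num_aspects and num_opinions are assumed to be the aspect and opinion counts of the sentiment data.

```python
def comparer_sub_param_shapes(n_users, n_items, num_aspects, num_opinions,
                              n_user_factors=15, n_item_factors=15,
                              n_aspect_factors=12, n_opinion_factors=12):
    """Hypothetical helper: expected shape of each init_params entry."""
    return {
        "U": (n_users, n_user_factors),
        "I": (n_items, n_item_factors),
        # The aspect matrix has one extra row, as documented (num_aspects+1).
        "A": (num_aspects + 1, n_aspect_factors),
        "O": (num_opinions, n_opinion_factors),
        "G1": (n_user_factors, n_item_factors, n_aspect_factors),
        "G2": (n_user_factors, n_aspect_factors, n_opinion_factors),
        "G3": (n_item_factors, n_aspect_factors, n_opinion_factors),
    }
```

A dictionary like this can be used to validate user-provided arrays before they are passed to the model.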
-
fit
¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
-
class
cornac.models.comparer.recom_comparer_obj.
ComparERObj
¶ Explainable Recommendation with Comparative Constraints on Objective Aspect-Level Quality
Parameters: - num_explicit_factors (int, optional, default: 128) – The dimension of the explicit factors.
- num_latent_factors (int, optional, default: 128) – The dimension of the latent factors.
- num_most_cared_aspects (int, optional, default: 100) – The number of most cared aspects for each user.
- rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.
- alpha (float, optional, default: 0.9) – Trade-off factor for constructing ranking score.
- lambda_x (float, optional, default: 1) – The regularization parameter for user aspect attentions.
- lambda_y (float, optional, default: 1) – The regularization parameter for item aspect qualities.
- lambda_u (float, optional, default: 0.01) – The regularization parameter for user and item explicit factors.
- lambda_h (float, optional, default: 0.01) – The regularization parameter for user and item latent factors.
- lambda_v (float, optional, default: 0.01) – The regularization parameter for V.
- use_item_aspect_popularity (boolean, optional, default: True) – When False, item aspect frequency is omitted from the item aspect quality computation formula. Specifically, \(Y_{ij} = 1 + \frac{N - 1}{1 + e^{-s_{ij}}}\) if \(p_i\) is reviewed on feature \(F_j\).
- min_user_freq (int, optional, default: 2) – Apply constraints only to users with a minimum number of ratings, where min_user_freq = 2 means constraints are applied only to users with at least 2 ratings.
- min_pair_freq (int, optional, default: 1) – Apply constraints only to purchased pairs (earlier-later bought) with a minimum number of occurrences, where min_pair_freq = 2 means constraints are applied only to pairs that appear at least twice.
- max_pair_freq (int, optional, default: 1e9) – Apply constraints only to purchased pairs with frequency at most max_pair_freq, where max_pair_freq = 2 means constraints are applied only to pairs that appear at most twice.
- max_iter (int, optional, default: 1000) – Maximum number of iterations or the number of epochs.
- name (string, optional, default: 'ComparERObj') – The name of the recommender model.
- num_threads (int, optional, default: 0) – Number of parallel threads for training. If 0, all CPU cores will be utilized.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U1, U2, V, H1, and H2 are not None).
- verbose (boolean, optional, default: False) – When True, running logs are displayed.
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U1’:U1, ‘U2’:U2, ‘V’:V, ‘H1’:H1, ‘H2’:H2}
- U1: ndarray, shape (n_users, n_explicit_factors)
- The user explicit factors, optional initialization via init_params.
- U2: ndarray, shape (n_ratings, n_explicit_factors)
- The item explicit factors, optional initialization via init_params.
- V: ndarray, shape (n_aspects, n_explicit_factors)
- The aspect factors, optional initialization via init_params.
- H1: ndarray, shape (n_users, n_latent_factors)
- The user latent factors, optional initialization via init_params.
- H2: ndarray, shape (n_ratings, n_latent_factors)
- The item latent factors, optional initialization via init_params.
- seed (int, optional, default: None) – Random seed for weight initialization.
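The item aspect quality formula given under use_item_aspect_popularity can be evaluated directly. In this sketch the function name is hypothetical, and since N and s_ij are not defined in this section, it assumes N is the rating scale and s_ij is an aggregated sentiment score for item i on aspect j.

```python
import math


def item_aspect_quality(s_ij, n=5.0):
    """Y_ij = 1 + (N - 1) / (1 + exp(-s_ij)), with N assumed to be the rating scale."""
    return 1.0 + (n - 1.0) / (1.0 + math.exp(-s_ij))
```

The logistic term maps any sentiment score into the range (1, N): a neutral score of 0 lands at the midpoint, strongly positive scores approach N, and strongly negative scores approach 1.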
References
- Trung-Hoang Le and Hady W. Lauw. “Explainable Recommendation with Comparative Constraints on Product Aspects.”
ACM International Conference on Web Search and Data Mining (WSDM). 2021.
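The effect of min_pair_freq and max_pair_freq can be pictured as a frequency filter over purchased pairs. This is only a sketch of the documented thresholds, not the model's training code: select_constrained_pairs is a hypothetical helper, and each pair is assumed to be an (earlier_item, later_item) tuple.

```python
from collections import Counter


def select_constrained_pairs(purchased_pairs, min_pair_freq=1, max_pair_freq=10**9):
    """Hypothetical helper: keep (earlier_item, later_item) pairs whose
    occurrence count lies within [min_pair_freq, max_pair_freq]."""
    counts = Counter(purchased_pairs)
    return {
        pair: count
        for pair, count in counts.items()
        if min_pair_freq <= count <= max_pair_freq
    }
```

With the default thresholds every observed pair is kept; tightening both bounds to 2 keeps only pairs seen exactly twice.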
-
fit
¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
-
get_params
¶ Get model parameters in the form of a dictionary containing the matrices U1, U2, V, H1, and H2.
-
monitor_value
¶ Calculating monitored value used for early stopping on validation set (val_set). This function will be called by early_stop() function.
Returns: res – Monitored value on validation set. Return None if val_set is None. Return type: float
-
rank
¶ Rank all test items for a given user.
Parameters: - user_id (int, required) – The index of the user for whom to perform item ranking.
- item_ids (1d array, optional, default: None) – A list of candidate item indices to be ranked for the user. If None, the ranked list of known item indices and their scores is returned.
Returns: Tuple of item_rank and item_scores. The order of values in item_scores corresponds to the order of their ids in item_ids.
-
score
¶ Predict the scores/ratings of a user for an item.
Parameters: Returns: res – Relative scores that the user gives to the item or to all known items
Return type: A scalar or a Numpy array
Adversarial Training Towards Robust Multimedia Recommender System (AMR)¶
-
class
cornac.models.amr.recom_amr.
AMR
(name='AMR', k=10, k2=10, n_epochs=50, batch_size=100, learning_rate=0.005, lambda_w=0.01, lambda_b=0.01, lambda_e=0.0, lambda_adv=1.0, use_gpu=False, trainable=True, verbose=True, init_params=None, seed=None)[source]¶ Adversarial Training Towards Robust Multimedia Recommender System.
Parameters: - k (int, optional, default: 10) – The dimension of the gamma latent factors.
- k2 (int, optional, default: 10) – The dimension of the theta latent factors.
- n_epochs (int, optional, default: 50) – Maximum number of epochs for SGD.
- batch_size (int, optional, default: 100) – The batch size for SGD.
- learning_rate (float, optional, default: 0.005) – The learning rate for SGD.
- lambda_w (float, optional, default: 0.01) – The regularization hyper-parameter for latent factor weights.
- lambda_b (float, optional, default: 0.01) – The regularization hyper-parameter for biases.
- lambda_e (float, optional, default: 0.0) – The regularization hyper-parameter for embedding matrix E and beta prime vector.
- lambda_adv (float, optional, default: 1.0) – The regularization hyper-parameter in Eq. (8) and (10) for the adversarial sample loss.
- use_gpu (boolean, optional, default: False) – Whether or not to use GPU to speed up training.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- verbose (boolean, optional, default: True) – When True, running logs are displayed.
- init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘Bi’: beta_item, ‘Gu’: gamma_user, ‘Gi’: gamma_item, ‘Tu’: theta_user, ‘E’: emb_matrix, ‘Bp’: beta_prime}
- seed (int, optional, default: None) – Random seed for weight initialization.
References
- Tang, J., Du, X., He, X., Yuan, F., Tian, Q., and Chua, T. (2020). Adversarial Training Towards Robust Multimedia Recommender System.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
Embarrassingly Shallow Autoencoders for Sparse Data (EASEᴿ)¶
-
class
cornac.models.ease.recom_ease.
EASE
(name='EASEᴿ', lamb=500, posB=True, trainable=True, verbose=True, seed=None, B=None, U=None)[source]¶ Embarrassingly Shallow Autoencoders for Sparse Data.
Parameters: - name (string, optional, default: 'EASEᴿ') – The name of the recommender model.
- lamb (float, optional, default: 500) – L2-norm regularization-parameter λ ∈ R+.
- posB (boolean, optional, default: True) – When True, negative weights are removed.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already trained.
- verbose (boolean, optional, default: True) – When True, some running logs are displayed.
- seed (int, optional, default: None) – Random seed for parameters initialization.
References
- Steck, H. (2019, May). “Embarrassingly shallow autoencoders for sparse data.” In The World Wide Web Conference (pp. 3251-3257).
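The roles of lamb and posB can be illustrated with the closed-form solution from the cited paper: with P = (XᵀX + λI)⁻¹, the item-item weights are B_ij = −P_ij / P_jj for i ≠ j, with a zero diagonal. The sketch below is a dependency-free illustration on tiny dense matrices, not Cornac's implementation; invert and ease_weights are hypothetical helpers, and it assumes that posB simply clamps negative weights to zero.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ease_weights(X, lamb=500.0, pos_b=True):
    """Hypothetical helper: item-item weights from a user-by-item matrix X."""
    n_items = len(X[0])
    # Regularized Gram matrix G = X^T X + lamb * I.
    gram = [
        [sum(row[i] * row[j] for row in X) + (lamb if i == j else 0.0)
         for j in range(n_items)]
        for i in range(n_items)
    ]
    P = invert(gram)
    # B_ij = -P_ij / P_jj off the diagonal; the diagonal is fixed to zero.
    B = [
        [0.0 if i == j else -P[i][j] / P[j][j] for j in range(n_items)]
        for i in range(n_items)
    ]
    if pos_b:
        # Assumed meaning of posB: drop negative weights by clamping them to zero.
        B = [[max(v, 0.0) for v in row] for row in B]
    return B
```

Larger values of lamb shrink all weights toward zero, which is why the default is high for sparse data. A real implementation would use a linear algebra library rather than hand-written elimination.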
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
Collaborative Context Poisson Factorization (C2PF)¶
-
class
cornac.models.c2pf.recom_c2pf.
C2PF
(k=100, max_iter=100, variant='c2pf', name=None, trainable=True, verbose=False, init_params=None)[source]¶ Collaborative Context Poisson Factorization.
Parameters: - k (int, optional, default: 100) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations for variational C2PF.
- variant (string, optional, default: 'c2pf') – The C2PF variant: ‘c2pf’, ‘tc2pf’ (tied-c2pf), or ‘rc2pf’ (reduced-c2pf). Please refer to the original paper for details.
- name (string, optional, default: None) – The name of the recommender model. If None, then “variant” is used as the default name of the model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (Theta, Beta and Xi are not None).
- Item_context – See "cornac/examples/c2pf_example.py" in the GitHub repo for an example of how to use cornac's graph modality to load and provide "item context" for C2PF.
- init_params (dict, optional, default: None) –
List of initial parameters, e.g., init_params = {‘G_s’:G_s, ‘G_r’:G_r, ‘L_s’:L_s, ‘L_r’:L_r, ‘L2_s’:L2_s, ‘L2_r’:L2_r, ‘L3_s’:L3_s, ‘L3_r’: L3_r}
- Theta: ndarray, shape (n_users, k)
- The expected user latent factors.
- Beta: ndarray, shape (n_items, k)
- The expected item latent factors.
- Xi: ndarray, shape (n_items, k)
- The expected context item latent factors multiplied by context effects Kappa.
- G_s: ndarray, shape (n_users, k)
- Represent the “shape” parameters of Gamma distribution over Theta.
- G_r: ndarray, shape (n_users, k)
- Represent the “rate” parameters of Gamma distribution over Theta.
- L_s: ndarray, shape (n_items, k)
- Represent the “shape” parameters of Gamma distribution over Beta.
- L_r: ndarray, shape (n_items, k)
- Represent the “rate” parameters of Gamma distribution over Beta.
- L2_s: ndarray, shape (n_items, k)
- Represent the “shape” parameters of Gamma distribution over Xi.
- L2_r: ndarray, shape (n_items, k)
- Represent the “rate” parameters of Gamma distribution over Xi.
- L3_s: ndarray
- Represent the “shape” parameters of Gamma distribution over Kappa.
- L3_r: ndarray
- Represent the “rate” parameters of Gamma distribution over Kappa.
References
- Salah, Aghiles, and Hady W. Lauw. A Bayesian Latent Variable Model of User Preferences with Item Context. In IJCAI, pp. 2667-2674. 2018.
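The mean of a Gamma distribution with shape a and rate b is a / b, which links the shape and rate parameters above to the "expected" latent factors. The sketch below assumes, without confirmation from this section, that Theta is the element-wise ratio G_s / G_r (and likewise Beta from L_s and L_r); gamma_expectation is a hypothetical helper operating on nested lists.

```python
def gamma_expectation(shape, rate):
    """Element-wise mean (shape / rate) of Gamma-distributed factors."""
    return [
        [s / r for s, r in zip(shape_row, rate_row)]
        for shape_row, rate_row in zip(shape, rate)
    ]


# Two users with k = 2 factors each (illustrative values only).
G_s = [[2.0, 3.0], [1.0, 4.0]]
G_r = [[4.0, 6.0], [2.0, 8.0]]
Theta = gamma_expectation(G_s, G_r)
```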
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
Graph Convolutional Matrix Completion (GCMC)¶
Main class for GCMC recommender model
-
class
cornac.models.gcmc.recom_gcmc.
GCMC
(name='GCMC', max_iter=2000, learning_rate=0.01, optimizer='adam', activation_model='leaky', gcn_agg_units=500, gcn_out_units=75, gcn_dropout=0.7, gcn_agg_accum='stack', share_param=False, gen_r_num_basis_func=2, train_grad_clip=1.0, train_valid_interval=1, train_early_stopping_patience=100, train_min_learning_rate=0.001, train_decay_patience=50, train_lr_decay_factor=0.5, trainable=True, verbose=False, seed=None)[source]¶ Graph Convolutional Matrix Completion (GCMC)
Parameters: - name (string, default: 'GCMC') – The name of the recommender model.
- max_iter (int, default: 2000) – Maximum number of iterations or the number of epochs for SGD
- learning_rate (float, default: 0.01) – The learning rate for SGD
- optimizer (string, default: 'adam'. Supported values: 'adam','sgd'.) – The optimization method used for SGD
- activation_model (string, default: 'leaky') – The activation function used in the GCMC model. Supported values: [‘leaky’, ‘linear’, ‘sigmoid’, ‘relu’, ‘tanh’]
- gcn_agg_units (int, default: 500) – The number of units in the graph convolutional layers
- gcn_out_units (int, default: 75) – The number of units in the output layer
- gcn_dropout (float, default: 0.7) – The dropout rate for the graph convolutional layers
- gcn_agg_accum (string, default:'stack') – The graph convolutional layer aggregation type. Supported values: [‘stack’, ‘sum’]
- share_param (bool, default: False) – Whether to share the parameters in the graph convolutional layers
- gen_r_num_basis_func (int, default: 2) – The number of basis functions used in the generating rating function
- train_grad_clip (float, default: 1.0) – The gradient clipping value for training
- train_valid_interval (int, default: 1) – The validation interval for training
- train_early_stopping_patience (int, default: 100) – The patience for early stopping
- train_min_learning_rate (float, default: 0.001) – The minimum learning rate for SGD
- train_decay_patience (int, default: 50) – The patience for learning rate decay
- train_lr_decay_factor (float, default: 0.5) – The learning rate decay factor
- trainable (boolean, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
- verbose (boolean, default: False) – When True, some running logs are displayed
- seed (int, default: None) – Random seed for parameters initialization
References
- van den Berg, R., Kipf, T. N., & Welling, M. (2018). Graph Convolutional Matrix Completion.
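One plausible reading of how train_lr_decay_factor and train_min_learning_rate interact is sketched below: each decay step multiplies the learning rate by the factor, but never lets it fall below the minimum. This is an assumption about the schedule, not a description of Cornac's training loop, and both function names are hypothetical.

```python
def decay_learning_rate(current_lr, decay_factor=0.5, min_lr=0.001):
    """Hypothetical helper: shrink the learning rate but never below min_lr."""
    return max(current_lr * decay_factor, min_lr)


def lr_schedule(initial_lr, n_decays, decay_factor=0.5, min_lr=0.001):
    """Learning rates obtained after each of n_decays consecutive decay steps."""
    rates, lr = [], initial_lr
    for _ in range(n_decays):
        lr = decay_learning_rate(lr, decay_factor, min_lr)
        rates.append(lr)
    return rates
```

Under this reading, the default settings (learning_rate=0.01, factor 0.5, minimum 0.001) reach the floor on the fourth decay step, and a decay step would be triggered after train_decay_patience validations without improvement.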
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
-
score
(user_idx, item_idx=None)[source]¶ Predict the scores/ratings of a user for an item.
Parameters: Returns: res – Relative scores that the user gives to the item or to all known items
Return type: A scalar or a Numpy array
-
transform
(test_set)[source]¶ Transform the model to indexed dictionary for scoring purposes.
Parameters: test_set (cornac.data.Dataset, required) – User-Item preference data.
Multi-Task Explainable Recommendation (MTER)¶
-
class
cornac.models.mter.recom_mter.
MTER
¶ Multi-Task Explainable Recommendation
Parameters: - name (string, optional, default: 'MTER') – The name of the recommender model.
- rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.
- n_user_factors (int, optional, default: 15) – The dimension of the user latent factors.
- n_item_factors (int, optional, default: 15) – The dimension of the item latent factors.
- n_aspect_factors (int, optional, default: 12) – The dimension of the aspect latent factors.
- n_opinion_factors (int, optional, default: 12) – The dimension of the opinion latent factors.
- n_bpr_samples (int, optional, default: 1000) – The number of samples from all BPR pairs.
- n_element_samples (int, optional, default: 50) – The number of samples from all ratings in each iteration.
- lambda_reg (float, optional, default: 0.1) – The regularization parameter.
- lambda_bpr (float, optional, default: 10.0) – The regularization parameter for BPR.
- max_iter (int, optional, default: 200000) – Maximum number of iterations for training.
- lr (float, optional, default: 0.1) – The learning rate for optimization
- n_threads (int, optional, default: 0) – Number of parallel threads for training. If n_threads=0, all CPU cores will be utilized. If seed is not None, n_threads=1 to remove randomness from parallelization.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, I, A, O, G1, G2, and G3 are not None).
- verbose (boolean, optional, default: False) – When True, running logs are displayed.
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘I’:I, ‘A’:A, ‘O’:O, ‘G1’:G1, ‘G2’:G2, ‘G3’:G3}
- U: ndarray, shape (n_users, n_user_factors)
- The user latent factors, optional initialization via init_params
- I: ndarray, shape (n_items, n_item_factors)
- The item latent factors, optional initialization via init_params
- A: ndarray, shape (num_aspects+1, n_aspect_factors)
- The aspect latent factors, optional initialization via init_params
- O: ndarray, shape (num_opinions, n_opinion_factors)
- The opinion latent factors, optional initialization via init_params
- G1: ndarray, shape (n_user_factors, n_item_factors, n_aspect_factors)
- The core tensor for user, item, and aspect factors, optional initialization via init_params
- G2: ndarray, shape (n_user_factors, n_aspect_factors, n_opinion_factors)
- The core tensor for user, aspect, and opinion factors, optional initialization via init_params
- G3: ndarray, shape (n_item_factors, n_aspect_factors, n_opinion_factors)
- The core tensor for item, aspect, and opinion factors, optional initialization via init_params
- seed (int, optional, default: None) – Random seed for parameters initialization.
References
Nan Wang, Hongning Wang, Yiling Jia, and Yue Yin. 2018. Explainable Recommendation via Multi-Task Learning in Opinionated Text Data. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR ‘18). ACM, New York, NY, USA, 165-174. DOI: https://doi.org/10.1145/3209978.3210010
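The rule stated for n_threads (0 means all CPU cores, and a fixed seed forces a single thread) can be expressed as a small function. resolve_n_threads is a hypothetical name used only for illustration, and falling back to 1 when the core count cannot be determined is an assumption of this sketch.

```python
import os


def resolve_n_threads(n_threads=0, seed=None):
    """Hypothetical helper: effective thread count under the documented rule."""
    if seed is not None:
        # A fixed seed forces single-threaded training for reproducibility.
        return 1
    if n_threads == 0:
        # 0 means "use every available core"; fall back to 1 if unknown.
        return os.cpu_count() or 1
    return n_threads
```

This makes the trade-off explicit: setting a seed buys repeatable results at the cost of parallel speed-up.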
-
fit
¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
-
score
¶ Predict the scores/ratings of a user for an item.
Returns: res – Relative scores that the user gives to the item or to all known items.
Return type: A scalar or a NumPy array
Hybrid neural recommendation with joint deep representation learning of ratings and reviews (HRDR)¶
-
class
cornac.models.hrdr.recom_hrdr.
HRDR
(name='HRDR', embedding_size=100, id_embedding_size=32, n_factors=32, attention_size=16, kernel_sizes=[3], n_filters=64, n_user_mlp_factors=128, n_item_mlp_factors=128, dropout_rate=0.5, max_text_length=50, max_num_review=32, batch_size=64, max_iter=20, optimizer='adam', learning_rate=0.001, model_selection='last', user_based=True, trainable=True, verbose=True, init_params=None, seed=None)[source]¶ Parameters: - name (string, default: 'HRDR') – The name of the recommender model.
- embedding_size (int, default: 100) – Word embedding size
- n_factors (int, default: 32) – The dimension of the user/item’s latent factors.
- attention_size (int, default: 16) – Attention size
- kernel_sizes (list, default: [3]) – List of kernel sizes of conv2d
- n_filters (int, default: 64) – Number of filters
- n_user_mlp_factors (int, default: 128) – Number of latent dimensions of the first layer of a 3-layer MLP, followed by batch normalization, on the user net to represent user rating.
- n_item_mlp_factors (int, default: 128) – Number of latent dimensions of the first layer of a 3-layer MLP, followed by batch normalization, on the item net to represent item rating.
- dropout_rate (float, default: 0.5) – Dropout rate of neural network dense layers
- max_text_length (int, default: 50) – Maximum number of tokens in a review instance
- max_num_review (int, default: 32) – Maximum number of reviews that you want to feed into training. By default, the model will be trained with 32 reviews.
- batch_size (int, default: 64) – Batch size
- max_iter (int, default: 20) – Max number of training epochs
- optimizer (string, optional, default: 'adam') – Optimizer for training is either ‘adam’ or ‘rmsprop’.
- learning_rate (float, optional, default: 0.001) – Initial value of learning rate for the optimizer.
- trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.
- verbose (boolean, optional, default: True) – When True, running logs are displayed.
- init_params (dictionary, optional, default: None) – Initial parameters. Pre-trained word embeddings can be provided here, e.g., init_params={‘pretrained_word_embeddings’: pretrained_word_embeddings}
- seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because it runs single-threaded (no parallelization).
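To make the roles of max_num_review and max_text_length concrete, here is a minimal sketch of how a variable number of variable-length reviews might be brought to a fixed size. The function name, the padding id of 0, and the choice to keep the first reviews and first tokens are all assumptions made for illustration; Cornac's own preprocessing may differ.

```python
def pad_reviews(reviews, max_num_review=32, max_text_length=50, pad_id=0):
    """Hypothetical helper: clip/pad token-id lists to a fixed-size grid.

    reviews: list of reviews, each a list of integer token ids.
    Returns max_num_review rows of exactly max_text_length ids each.
    """
    grid = []
    # Keep at most max_num_review reviews (assumption: the first ones).
    for tokens in reviews[:max_num_review]:
        clipped = tokens[:max_text_length]
        grid.append(clipped + [pad_id] * (max_text_length - len(clipped)))
    # Fill the remaining rows with padding-only reviews.
    while len(grid) < max_num_review:
        grid.append([pad_id] * max_text_length)
    return grid


batch = pad_reviews([[5, 8, 2], [7] * 60], max_num_review=4, max_text_length=5)
```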
References
Liu, H., Wang, Y., Peng, Q., Wu, F., Gan, L., Pan, L., & Jiao, P. (2020). Hybrid neural recommendation with joint deep representation learning of ratings and reviews. Neurocomputing, 374, 77-85.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
-
static
load
(model_path, trainable=False)[source]¶ Load a recommender model from the filesystem.
Parameters: - model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.
- trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.
Returns: self
Return type:
-
save
(save_dir=None)[source]¶ Save a recommender model to the filesystem.
Parameters: save_dir (str, default: None) – Path to a directory for the model to be stored.
Neural Attentional Rating Regression with Review-level Explanations (NARRE)¶
-
class
cornac.models.narre.recom_narre.
NARRE
(name='NARRE', embedding_size=100, id_embedding_size=32, n_factors=32, attention_size=16, kernel_sizes=[3], n_filters=64, dropout_rate=0.5, max_text_length=50, max_num_review=32, batch_size=64, max_iter=10, optimizer='adam', learning_rate=0.001, model_selection='last', user_based=True, trainable=True, verbose=True, init_params=None, seed=None)[source]¶ Neural Attentional Rating Regression with Review-level Explanations
Parameters: - name (string, default: 'NARRE') – The name of the recommender model.
- embedding_size (int, default: 100) – Word embedding size
- id_embedding_size (int, default: 32) – User/item review id embedding size
- n_factors (int, default: 32) – The dimension of the user/item’s latent factors.
- attention_size (int, default: 16) – Attention size
- kernel_sizes (list, default: [3]) – List of kernel sizes of conv2d
- n_filters (int, default: 64) – Number of filters
- dropout_rate (float, default: 0.5) – Dropout rate of neural network dense layers
- max_text_length (int, default: 50) – Maximum number of tokens in a review instance
- max_num_review (int, default: 32) – Maximum number of reviews that you want to feed into training. By default, the model will be trained with 32 reviews.
- batch_size (int, default: 64) – Batch size
- max_iter (int, default: 10) – Max number of training epochs
- optimizer (string, optional, default: 'adam') – Optimizer for training is either ‘adam’ or ‘rmsprop’.
- learning_rate (float, optional, default: 0.001) – Initial value of learning rate for the optimizer.
- model_selection (str, optional, default: 'last') – Model selection strategy is either ‘best’ or ‘last’.
- user_based (boolean, optional, default: True) – Evaluation strategy for model selection. By default (user_based=True), the metric is computed for every user and then averaged. Set user_based=False if you want to measure per rating instead.
- trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.
- verbose (boolean, optional, default: True) – When True, running logs are displayed.
- init_params (dictionary, optional, default: None) – Initial parameters. Pre-trained word embeddings can be provided here, e.g., init_params={‘pretrained_word_embeddings’: pretrained_word_embeddings}
- seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because it runs single-threaded (no parallelization).
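The difference between the two user_based settings can be sketched as follows. The metric used here (mean squared error) and the function name are assumptions chosen for illustration; the parameter description above does not say which metric is monitored during model selection.

```python
from collections import defaultdict


def validation_mse(records, user_based=True):
    """Hypothetical helper. records: iterable of (user, true, predicted)."""
    if not user_based:
        # Per-rating: every rating instance carries the same weight.
        errors = [(t - p) ** 2 for _, t, p in records]
        return sum(errors) / len(errors)
    # Per-user: average within each user first, then across users,
    # so every user carries the same weight.
    per_user = defaultdict(list)
    for user, t, p in records:
        per_user[user].append((t - p) ** 2)
    user_means = [sum(v) / len(v) for v in per_user.values()]
    return sum(user_means) / len(user_means)


records = [("u1", 5.0, 4.0), ("u1", 3.0, 3.0), ("u1", 4.0, 4.0), ("u2", 2.0, 4.0)]
per_rating = validation_mse(records, user_based=False)  # (1 + 0 + 0 + 4) / 4
per_user = validation_mse(records, user_based=True)     # (1/3 + 4) / 2
```

With user_based=True, the single poorly predicted rating of u2 weighs as much as all three ratings of u1, which is why the two averages differ.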
References
- Chen, C., Zhang, M., Liu, Y., & Ma, S. (2018, April). Neural attentional rating regression with review-level explanations. In Proceedings of the 2018 World Wide Web Conference (pp. 1583-1592).
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
-
static
load
(model_path, trainable=False)[source]¶ Load a recommender model from the filesystem.
Parameters: - model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.
- trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.
Returns: self
Return type:
-
save
(save_dir=None)[source]¶ Save a recommender model to the filesystem.
Parameters: save_dir (str, default: None) – Path to a directory for the model to be stored.
Probabilistic Collaborative Representation Learning (PCRL)¶
-
class
cornac.models.pcrl.recom_pcrl.
PCRL
(k=100, z_dims=[300], max_iter=300, batch_size=300, learning_rate=0.001, name='PCRL', trainable=True, verbose=False, w_determinist=True, init_params=None)[source]¶ Probabilistic Collaborative Representation Learning.
Parameters: - k (int, optional, default: 100) – The dimension of the latent factors.
- z_dims (Numpy 1d array, optional, default: [300]) – The dimensions of the hidden intermediate layers ‘z’ in the order [dim(z_L), …,dim(z_1)]; please refer to Figure 1 in the original paper for more details.
- max_iter (int, optional, default: 300) – Maximum number of iterations (number of epochs) for variational PCRL.
- batch_size (int, optional, default: 300) – The batch size for SGD.
- learning_rate (float, optional, default: 0.001) – The learning rate for SGD.
- aux_info – See "cornac/examples/pcrl_example.py" in the GitHub repo for an example of how to use Cornac's graph modality to provide item auxiliary data (e.g., context, text, etc.) for PCRL.
- name (string, optional, default: 'PCRL') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (Theta, Beta and Xi are not None).
- w_determinist (boolean, optional, default: True) – When True, deterministic weights “W” are used for the generator network, otherwise “W” is stochastic as in the original paper.
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘G_s’:G_s, ‘G_r’:G_r, ‘L_s’:L_s, ‘L_r’:L_r}.
- Theta: ndarray, shape (n_users, k)
- The expected user latent factors.
- Beta: ndarray, shape (n_items, k)
- The expected item latent factors.
- G_s: ndarray, shape (n_users, k)
- Represent the “shape” parameters of Gamma distribution over Theta.
- G_r: ndarray, shape (n_users, k)
- Represent the “rate” parameters of Gamma distribution over Theta.
- L_s: ndarray, shape (n_items, k)
- Represent the “shape” parameters of Gamma distribution over Beta.
- L_r: ndarray, shape (n_items, k)
- Represent the “rate” parameters of Gamma distribution over Beta.
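The relationship between the shape/rate parameters and the expected factors can be illustrated with the mean of a Gamma distribution, which is shape / rate under the shape-rate parameterization. The sketch below assumes that "expected" refers to this mean; whether Cornac derives Theta and Beta exactly this way is not stated here, and the helper name is hypothetical.

```python
def gamma_mean(shape, rate):
    """Hypothetical helper: element-wise mean (shape / rate) of Gamma variables.

    shape, rate: nested lists with identical dimensions, e.g. (n_users, k).
    """
    return [
        [s / r for s, r in zip(shape_row, rate_row)]
        for shape_row, rate_row in zip(shape, rate)
    ]


# Toy values for 2 users and k = 2 latent factors.
G_s = [[2.0, 3.0], [1.0, 4.0]]
G_r = [[4.0, 1.5], [2.0, 8.0]]
Theta = gamma_mean(G_s, G_r)
```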
References
- Salah, Aghiles, and Hady W. Lauw. Probabilistic Collaborative Representation Learning for Personalized Item Recommendation. In UAI 2018.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
VAE for Collaborative Filtering (VAECF)¶
-
class
cornac.models.vaecf.recom_vaecf.
VAECF
(name='VAECF', k=10, autoencoder_structure=[20], act_fn='tanh', likelihood='mult', n_epochs=100, batch_size=100, learning_rate=0.001, beta=1.0, trainable=True, verbose=False, seed=None, use_gpu=False)[source]¶ Variational Autoencoder for Collaborative Filtering.
Parameters: - k (int, optional, default: 10) – The dimension of the stochastic user factors “z”.
- autoencoder_structure (list, default: [20]) – The number of neurons of encoder/decoder layer for VAE. For example, autoencoder_structure = [200], the VAE structure will be [num_items, 200, k, 200, num_items].
- act_fn (str, default: 'tanh') – Name of the activation function used between hidden layers of the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’]
- likelihood (str, default: 'mult') –
Name of the likelihood function used for modeling the observations. Supported choices:
- mult: Multinomial likelihood
- bern: Bernoulli likelihood
- gaus: Gaussian likelihood
- pois: Poisson likelihood
- n_epochs (int, optional, default: 100) – The number of epochs for SGD.
- batch_size (int, optional, default: 100) – The batch size.
- learning_rate (float, optional, default: 0.001) – The learning rate for Adam.
- beta (float, optional, default: 1.0) – The weight of the KL term as in beta-VAE.
- name (string, optional, default: 'VAECF') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
- verbose (boolean, optional, default: False) – When True, some running logs are displayed.
- seed (int, optional, default: None) – Random seed for parameters initialization.
- use_gpu (boolean, optional, default: False) – If True and your system supports CUDA then training is performed on GPUs.
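The layer layout described for autoencoder_structure can be sketched as a small function. For a single hidden size it reproduces the documented example [num_items, 200, k, 200, num_items]; for several hidden sizes it assumes the decoder mirrors the encoder in reverse order, which is an assumption rather than something stated above. The function name is hypothetical.

```python
def vae_layer_sizes(num_items, autoencoder_structure, k):
    """Hypothetical helper: full encoder + decoder layer widths of the VAE."""
    hidden = list(autoencoder_structure)
    # Encoder: num_items -> hidden layers -> k
    # Decoder (assumed mirror image): k -> reversed hidden layers -> num_items
    return [num_items] + hidden + [k] + hidden[::-1] + [num_items]


sizes = vae_layer_sizes(num_items=1000, autoencoder_structure=[200], k=10)
deeper = vae_layer_sizes(num_items=1000, autoencoder_structure=[400, 200], k=10)
```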
References
- Liang, Dawen, Rahul G. Krishnan, Matthew D. Hoffman, and Tony Jebara. “Variational autoencoders for collaborative filtering.” In Proceedings of the 2018 World Wide Web Conference on World Wide Web, pp. 689-698.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Conditional VAE for Collaborative Filtering (CVAECF)¶
-
class
cornac.models.cvaecf.recom_cvaecf.
CVAECF
(name='CVAECF', z_dim=20, h_dim=20, autoencoder_structure=[20], act_fn='tanh', likelihood='mult', n_epochs=100, batch_size=128, learning_rate=0.001, beta=1.0, alpha_1=1.0, alpha_2=1.0, trainable=True, verbose=False, seed=None, use_gpu=False)[source]¶ Conditional Variational Autoencoder for Collaborative Filtering.
Parameters: - z_dim (int, optional, default: 20) – The dimension of the stochastic user factors “z” representing the preference information.
- h_dim (int, optional, default: 20) – The dimension of the stochastic user factors “h” representing the auxiliary data.
- autoencoder_structure (list, default: [20]) – The number of neurons of encoder/decoder hidden layer for CVAE. For example, when autoencoder_structure = [20], the CVAE encoder structures will be [y_dim, 20, z_dim] and [x_dim, 20, h_dim], the decoder structure will be [z_dim + h_dim, 20, y_dim], where y and x are respectively the preference and auxiliary data.
- act_fn (str, default: 'tanh') – Name of the activation function used between hidden layers of the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’]
- likelihood (str, default: 'mult') –
Name of the likelihood function used for modeling user preferences. Supported choices:
- mult: Multinomial likelihood
- bern: Bernoulli likelihood
- gaus: Gaussian likelihood
- pois: Poisson likelihood
- n_epochs (int, optional, default: 100) – The number of epochs for SGD.
- batch_size (int, optional, default: 128) – The batch size.
- learning_rate (float, optional, default: 0.001) – The learning rate for Adam.
- beta (float, optional, default: 1.0) – The weight of the KL term KL(q(z|y)||p(z)) as in beta-VAE.
- alpha_1 (float, optional, default: 1.0) – The weight of the KL term KL(q(h|x)||p(h|x)).
- alpha_2 (float, optional, default: 1.0) – The weight of the KL term KL(q(h|x)||q(h|y)).
- name (string, optional, default: 'CVAECF') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained, and Cornac assumes that the model is already pre-trained.
- verbose (boolean, optional, default: False) – When True, some running logs are displayed.
- seed (int, optional, default: None) – Random seed for parameters initialization.
- use_gpu (boolean, optional, default: False) – If True and your system supports CUDA then training is performed on GPUs.
- auxiliary data (user) –
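How beta, alpha_1 and alpha_2 weight the three KL terms can be sketched as a weighted sum. This assumes the training objective has the form "negative log-likelihood plus weighted KL terms"; the exact objective used by Cornac is not shown here, and the function and argument names are hypothetical.

```python
def cvaecf_objective(neg_log_likelihood, kl_z, kl_h_prior, kl_h_match,
                     beta=1.0, alpha_1=1.0, alpha_2=1.0):
    """Hypothetical helper: combine loss terms with the documented weights.

    kl_z       stands for KL(q(z|y)||p(z)),   weighted by beta
    kl_h_prior stands for KL(q(h|x)||p(h|x)), weighted by alpha_1
    kl_h_match stands for KL(q(h|x)||q(h|y)), weighted by alpha_2
    """
    return (neg_log_likelihood
            + beta * kl_z
            + alpha_1 * kl_h_prior
            + alpha_2 * kl_h_match)


loss = cvaecf_objective(10.0, kl_z=2.0, kl_h_prior=1.0, kl_h_match=0.5,
                        beta=0.5, alpha_1=1.0, alpha_2=2.0)
```

Setting a weight to 0 removes the corresponding term from this sketch of the objective.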
References
- Lee, Wonsung, Kyungwoo Song, and Il-Chul Moon. “Augmented variational autoencoders for collaborative filtering with auxiliary information.” Proceedings of ACM CIKM. 2017.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Collaborative Variational Autoencoder (CVAE)¶
-
class
cornac.models.cvae.recom_cvae.
CVAE
(name='CVAE', z_dim=50, n_epochs=100, lambda_u=0.0001, lambda_v=0.001, lambda_r=10, lambda_w=0.0001, lr=0.001, a=1, b=0.01, input_dim=8000, vae_layers=[200, 100], act_fn='sigmoid', loss_type='cross-entropy', batch_size=128, init_params=None, trainable=True, seed=None, verbose=True)[source]¶ Collaborative Variational Autoencoder
Parameters: - z_dim (int, optional, default: 50) – The dimension of the user and item latent factors.
- n_epochs (int, optional, default: 100) – Maximum number of epochs for training.
- lambda_u (float, optional, default: 1e-4) – The regularization hyper-parameter for user latent factor.
- lambda_v (float, optional, default: 0.001) – The regularization hyper-parameter for item latent factor.
- lambda_r (float, optional, default: 10.0) – Parameter that balances the focus between content and ratings.
- lambda_w (float, optional, default: 1e-4) – The regularization for VAE weights
- lr (float, optional, default: 0.001) – Learning rate in the auto-encoder training
- a (float, optional, default: 1) – The confidence of observed ratings.
- b (float, optional, default: 0.01) – The confidence of unseen ratings.
- input_dim (int, optional, default: 8000) – The size of input vector
- vae_layers (list, optional, default: [200, 100]) – The list containing the size of each layer in the neural network structure.
- act_fn (str, default: 'sigmoid') – Name of the activation function used for the variational auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’, ‘leaky_relu’, ‘identity’]
- loss_type (str, optional, default: "cross-entropy") – The type of loss function in the last layer, either “cross-entropy” or “rmse”.
- batch_size (int, optional, default: 128) – The batch size for SGD.
- init_params (dict, optional, default: {'U':None, 'V':None}) – Initial U and V latent matrix
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
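The roles of a and b can be illustrated by building a confidence matrix in which observed user-item pairs receive confidence a and all other pairs receive confidence b. This is a minimal sketch under that reading of the parameter descriptions; the function name and the dense list-of-lists representation are illustrative assumptions, not Cornac's implementation.

```python
def confidence_matrix(observed, n_users, n_items, a=1.0, b=0.01):
    """Hypothetical helper: per-entry confidence weights.

    observed: set of (user_index, item_index) pairs with a known rating.
    """
    # Start from the low confidence b for every unseen pair.
    C = [[b] * n_items for _ in range(n_users)]
    # Raise the confidence to a wherever a rating was observed.
    for u, i in observed:
        C[u][i] = a
    return C


C = confidence_matrix({(0, 1), (2, 0)}, n_users=3, n_items=2)
```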
References
- Li, X., and She, J. Collaborative Variational Autoencoder for Recommender Systems. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017. http://eelxpeng.github.io/assets/paper/Collaborative_Variational_Autoencoder.pdf
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Generalized Matrix Factorization (GMF)¶
-
class
cornac.models.ncf.recom_gmf.
GMF
(name='GMF', num_factors=8, regs=(0.0, 0.0), num_epochs=20, batch_size=256, num_neg=4, lr=0.001, learner='adam', early_stopping=None, trainable=True, verbose=True, seed=None)[source]¶ Generalized Matrix Factorization.
Parameters: - num_factors (int, optional, default: 8) – Embedding size of MF model.
- regs (tuple of float, optional, default: (0., 0.)) – Regularization for user and item embeddings.
- num_epochs (int, optional, default: 20) – Number of epochs.
- batch_size (int, optional, default: 256) – Batch size.
- num_neg (int, optional, default: 4) – Number of negative instances to pair with a positive instance.
- lr (float, optional, default: 0.001) – Learning rate.
- learner (str, optional, default: 'adam') – Specify an optimizer: adagrad, adam, rmsprop, sgd
- early_stopping ({min_delta: float, patience: int}, optional, default: None) –
If None, no early stopping. Meaning of the arguments:
- min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.
- patience: number of epochs with no improvement after which training should be stopped.
- name (string, optional, default: 'GMF') – Name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
- verbose (boolean, optional, default: True) – When True, some running logs are displayed.
- seed (int, optional, default: None) – Random seed for parameters initialization.
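The meaning of min_delta and patience in early_stopping can be sketched with a small stateful checker. It treats an increase of at least min_delta over the best value so far as an improvement and stops once patience consecutive epochs pass without one. The class below is a hypothetical illustration of that rule, not Cornac's own implementation.

```python
class EarlyStopper:
    """Hypothetical helper tracking a validation value that should increase."""

    def __init__(self, min_delta=0.0, patience=0):
        self.min_delta = min_delta
        self.patience = patience
        self.best = float("-inf")
        self.wait = 0  # consecutive epochs without improvement

    def should_stop(self, value):
        # An increment of less than min_delta counts as no improvement.
        if value - self.best >= self.min_delta:
            self.best = value
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


stopper = EarlyStopper(min_delta=0.01, patience=2)
decisions = [stopper.should_stop(v) for v in [0.50, 0.60, 0.605, 0.604]]
```

In this run the third and fourth values improve on 0.60 by less than min_delta, so the second of them exhausts the patience of 2.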
References
- He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).
Indexable Bayesian Personalized Ranking (IBPR)¶
-
class
cornac.models.ibpr.recom_ibpr.
IBPR
(k=20, max_iter=100, learning_rate=0.05, lamda=0.001, batch_size=100, name='IBPR', trainable=True, verbose=False, init_params=None)[source]¶ Indexable Bayesian Personalized Ranking.
Parameters: - k (int, optional, default: 20) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.05) – The learning rate for SGD.
- lamda (float, optional, default: 0.001) – The regularization parameter.
- batch_size (int, optional, default: 100) – The batch size for SGD.
- name (string, optional, default: 'IBPR') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- verbose (boolean, optional, default: False) – When True, some running logs are displayed.
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V} please see below the definition of U and V.
- U: csc_matrix, shape (n_users,k)
- The user latent factors, optional initialization via init_params.
- V: csc_matrix, shape (n_items,k)
- The item latent factors, optional initialization via init_params.
References
- Le, D. D., & Lauw, H. W. (2017, November). Indexable Bayesian personalized ranking for efficient top-k recommendation. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 1389-1398). ACM.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Matrix Co-Factorization (MCF)¶
-
class
cornac.models.mcf.recom_mcf.
MCF
(k=5, max_iter=100, learning_rate=0.001, gamma=0.9, lamda=0.001, name='MCF', trainable=True, verbose=False, init_params=None, seed=None)[source]¶ Matrix Co-Factorization.
Parameters: - k (int, optional, default: 5) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.
- gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.
- lamda (float, optional, default: 0.001) – The regularization parameter.
- name (string, optional, default: 'MCF') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- network (item-affinity) –
- verbose (boolean, optional, default: False) – When True, some running logs are displayed.
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’: U, ‘V’: V, ‘Z’: Z}.
- U: ndarray, shape (n_users, k)
- User latent factors.
- V: ndarray, shape (n_items, k)
- Item latent factors.
- Z: ndarray, shape (n_items, k)
- The “Also-Viewed” item latent factors.
- seed (int, optional, default: None) – Random seed for parameters initialization.
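The interplay of learning_rate and gamma can be illustrated with a textbook RMSProp update on a single scalar parameter. This sketch assumes gamma weights the previous running average of squared gradients and (1 - gamma) weights the current squared gradient; the function name and the eps constant are illustrative assumptions, and Cornac's actual update code is not shown here.

```python
import math


def rmsprop_step(param, grad, cache, learning_rate=0.001, gamma=0.9, eps=1e-8):
    """Hypothetical helper: one standard RMSProp update for a scalar."""
    # Running average of squared gradients: gamma weights the past,
    # (1 - gamma) weights the current gradient.
    cache = gamma * cache + (1.0 - gamma) * grad ** 2
    # Scale the step by the root of the running average.
    param = param - learning_rate * grad / (math.sqrt(cache) + eps)
    return param, cache


p, c = rmsprop_step(param=1.0, grad=2.0, cache=0.0)
```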
References
- Park, Chanyoung, Donghyun Kim, Jinoh Oh, and Hwanjo Yu. “Do Also-Viewed Products Help User Rating Prediction?.” In Proceedings of WWW, pp. 1113-1122. 2017.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Multi-Layer Perceptron (MLP)¶
-
class
cornac.models.ncf.recom_mlp.
MLP
(name='MLP', layers=(64, 32, 16, 8), act_fn='relu', reg_layers=(0.0, 0.0, 0.0, 0.0), num_epochs=20, batch_size=256, num_neg=4, lr=0.001, learner='adam', early_stopping=None, trainable=True, verbose=True, seed=None)[source]¶ Multi-Layer Perceptron.
Parameters: - layers (list, optional, default: [64, 32, 16, 8]) – MLP layers. Note that the first layer is the concatenation of user and item embeddings. So layers[0]/2 is the embedding size.
- act_fn (str, default: 'relu') – Name of the activation function used for the MLP layers. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘selu’, ‘relu6’, ‘leaky_relu’]
- reg_layers (list, optional, default: [0., 0., 0., 0.]) – Regularization for each MLP layer, reg_layers[0] is the regularization for embeddings.
- num_epochs (int, optional, default: 20) – Number of epochs.
- batch_size (int, optional, default: 256) – Batch size.
- num_neg (int, optional, default: 4) – Number of negative instances to pair with a positive instance.
- lr (float, optional, default: 0.001) – Learning rate.
- learner (str, optional, default: 'adam') – Specify an optimizer: adagrad, adam, rmsprop, sgd
- early_stopping ({min_delta: float, patience: int}, optional, default: None) –
If None, no early stopping. Meaning of the arguments:
- min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.
- patience: number of epochs with no improvement after which training should be stopped.
- name (string, optional, default: 'MLP') – Name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
- verbose (boolean, optional, default: True) – When True, some running logs are displayed.
- seed (int, optional, default: None) – Random seed for parameters initialization.
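The relationship between layers and reg_layers can be sketched as follows: the first entry of layers is the width of the concatenated user and item embeddings, so each embedding has half that size, and reg_layers[0] applies to the embeddings. Pairing the remaining entries one-to-one, and requiring both sequences to have the same length, are assumptions made for this illustration; the helper name is hypothetical.

```python
def describe_mlp(layers=(64, 32, 16, 8), reg_layers=(0.0, 0.0, 0.0, 0.0)):
    """Hypothetical helper: interpret the layers / reg_layers arguments."""
    if len(layers) != len(reg_layers):
        raise ValueError("layers and reg_layers must have the same length")
    if layers[0] % 2 != 0:
        raise ValueError("layers[0] must be even (user + item embeddings)")
    return {
        # layers[0] is the concatenation of user and item embeddings.
        "embedding_size": layers[0] // 2,
        "embedding_reg": reg_layers[0],
        # Assumed pairing of each following layer with its regularization.
        "dense_layers": list(zip(layers[1:], reg_layers[1:])),
    }


spec = describe_mlp()
```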
References
- He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).
Neural Matrix Factorization (NeuMF/NCF)¶
-
class
cornac.models.ncf.recom_neumf.
NeuMF
(name='NeuMF', num_factors=8, layers=(64, 32, 16, 8), act_fn='relu', reg_mf=0.0, reg_layers=(0.0, 0.0, 0.0, 0.0), num_epochs=20, batch_size=256, num_neg=4, lr=0.001, learner='adam', early_stopping=None, trainable=True, verbose=True, seed=None)[source]¶ Neural Matrix Factorization.
Parameters: - num_factors (int, optional, default: 8) – Embedding size of MF model.
- layers (list, optional, default: [64, 32, 16, 8]) – MLP layers. Note that the first layer is the concatenation of user and item embeddings. So layers[0]/2 is the embedding size.
- act_fn (str, default: 'relu') – Name of the activation function used for the MLP layers. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘selu’, ‘relu6’, ‘leaky_relu’]
- reg_mf (float, optional, default: 0.) – Regularization for MF embeddings.
- reg_layers (list, optional, default: [0., 0., 0., 0.]) – Regularization for each MLP layer, reg_layers[0] is the regularization for embeddings.
- num_epochs (int, optional, default: 20) – Number of epochs.
- batch_size (int, optional, default: 256) – Batch size.
- num_neg (int, optional, default: 4) – Number of negative instances to pair with a positive instance.
- lr (float, optional, default: 0.001) – Learning rate.
- learner (str, optional, default: 'adam') – Specify an optimizer: adagrad, adam, rmsprop, sgd
- early_stopping ({min_delta: float, patience: int}, optional, default: None) –
If None, no early stopping. Meaning of the arguments:
- min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.
- patience: number of epochs with no improvement after which training should be stopped.
- name (string, optional, default: 'NeuMF') – Name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.
- verbose (boolean, optional, default: True) – When True, some running logs are displayed.
- seed (int, optional, default: None) – Random seed for parameters initialization.
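What num_neg controls can be illustrated with a simple sampler that pairs every observed (positive) interaction with num_neg items the user has not interacted with. The uniform sampling strategy, the function name, and the (user, item, label) output format are assumptions for illustration only; Cornac's sampler may work differently.

```python
import random


def sample_negatives(user_items, n_items, num_neg=4, seed=None):
    """Hypothetical helper: build (user, item, label) training triples.

    user_items: dict mapping a user index to the set of items they
    interacted with. Assumes no user has interacted with every item.
    """
    rng = random.Random(seed)
    triples = []
    for user, positives in user_items.items():
        for pos in positives:
            triples.append((user, pos, 1))
            for _ in range(num_neg):
                # Redraw until the item is not a known positive.
                neg = rng.randrange(n_items)
                while neg in positives:
                    neg = rng.randrange(n_items)
                triples.append((user, neg, 0))
    return triples


data = sample_negatives({0: {1, 3}, 1: {2}}, n_items=10, num_neg=4, seed=42)
```

With three positive interactions and num_neg=4, the sketch yields fifteen training triples in total.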
References
- He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).
-
pretrain
(gmf_model, mlp_model, alpha=0.5)[source]¶ Provide pre-trained GMF and MLP models. Section 3.4.1 of the paper.
Parameters: - gmf_model (object of type GMF, required) – Reference to trained/fitted GMF model.
- mlp_model (object of type MLP, required) – Reference to trained/fitted MLP model.
- alpha (float, optional, default: 0.5) – Hyper-parameter determining the trade-off between the two pre-trained models. Details are described in the section 3.4.1 of the paper.
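The trade-off governed by alpha can be sketched after Section 3.4.1 of the referenced paper, where the output-layer weights of the fused model are initialized by concatenating the two pre-trained weight vectors, scaled by alpha and (1 - alpha) respectively. The function below is a hypothetical illustration on plain lists; how Cornac's pretrain applies this internally is not shown here.

```python
def combine_output_weights(h_gmf, h_mlp, alpha=0.5):
    """Hypothetical helper: h = [alpha * h_gmf ; (1 - alpha) * h_mlp]."""
    return [alpha * w for w in h_gmf] + [(1.0 - alpha) * w for w in h_mlp]


# alpha = 0.25 puts less weight on GMF and more on MLP.
h = combine_output_weights([1.0, 2.0], [4.0, 8.0], alpha=0.25)
```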
Online Indexable Bayesian Personalized Ranking (OIBPR)¶
-
class
cornac.models.online_ibpr.recom_online_ibpr.
OnlineIBPR
(k=20, max_iter=100, learning_rate=0.05, lamda=0.001, batch_size=100, name='online_ibpr', trainable=True, verbose=False, init_params=None)[source]¶ Online Indexable Bayesian Personalized Ranking.
Parameters: - k (int, optional, default: 20) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.05) – The learning rate for SGD.
- lamda (float, optional, default: 0.001) – The regularization parameter.
- batch_size (int, optional, default: 100) – The batch size for SGD.
- name (string, optional, default: 'online_ibpr') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- verbose (boolean, optional, default: False) – When True, some running logs are displayed.
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V} please see below the definition of U and V.
- U: csc_matrix, shape (n_users,k)
- The user latent factors, optional initialization via init_params.
- V: csc_matrix, shape (n_items,k)
- The item latent factors, optional initialization via init_params.
References
- Le, D. D., & Lauw, H. W. (2017, November). Indexable Bayesian personalized ranking for efficient top-k recommendation. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 1389-1398). ACM.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Visual Matrix Factorization (VMF)¶
-
class
cornac.models.vmf.recom_vmf.
VMF
(name='VMF', k=10, d=10, n_epochs=100, batch_size=100, learning_rate=0.001, gamma=0.9, lambda_u=0.001, lambda_v=0.001, lambda_p=1.0, lambda_e=10.0, trainable=True, verbose=False, use_gpu=False, init_params=None, seed=None)[source]¶ Visual Matrix Factorization.
Parameters: - k (int, optional, default: 10) – The dimension of the user and item factors.
- d (int, optional, default: 10) – The dimension of the user visual factors.
- n_epochs (int, optional, default: 100) – The number of epochs for SGD.
- learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.
- gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.
- lambda_u (float, optional, default: 0.001) – The regularization parameter for user factors.
- lambda_v (float, optional, default: 0.001) – The regularization parameter for item factors.
- lambda_p (float, optional, default: 1.0) – The regularization parameter for user visual factors.
- lambda_e (float, optional, default: 10.0) – The regularization parameter for the kernel embedding matrix.
- name (string, optional, default: 'VMF') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (The parameters of the model U, V, P, E are not None).
- visual_features – The item visual features. See "cornac/examples/vmf_example.py" for an example of how to use Cornac's visual modality to load and provide them to VMF.
- verbose (boolean, optional, default: False) – When True, some running logs are displayed.
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V, ‘P’:P, ‘E’:E}.
- U: numpy array, shape (n_users,k)
- The user latent factors.
- V: numpy array, shape (n_items,k)
- The item latent factors.
- P: numpy array, shape (n_users,d)
- The user visual latent factors.
- E: numpy array, shape (d,c)
- The embedding kernel matrix.
- seed (int, optional, default: None) – Random seed for parameters initialization.
References
- Park, Chanyoung, Donghyun Kim, Jinoh Oh, and Hwanjo Yu. “Do Also-Viewed Products Help User Rating Prediction?.” In Proceedings of WWW, pp. 1113-1122. 2017.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
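The learning_rate and gamma parameters of VMF refer to an RMSProp-style update. As a minimal sketch, assuming the textbook RMSProp rule rather than Cornac's exact implementation, a single scalar update could look like this. The function name rmsprop_step and the eps constant are illustrative assumptions.

```python
import math


def rmsprop_step(param, grad, cache, learning_rate=0.001, gamma=0.9, eps=1e-8):
    """One textbook RMSProp update for a single scalar parameter (illustrative only)."""
    # gamma weights the previous running average; (1 - gamma) weights the current squared gradient.
    cache = gamma * cache + (1.0 - gamma) * grad * grad
    # The step is scaled by the root of the running average of squared gradients.
    param = param - learning_rate * grad / (math.sqrt(cache) + eps)
    return param, cache


param, cache = rmsprop_step(param=1.0, grad=2.0, cache=0.0)
print(param, cache)
```

A larger gamma makes the running average react more slowly to the current gradient.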
Collaborative Deep Ranking (CDR)¶
-
class
cornac.models.cdr.recom_cdr.
CDR
(name='CDR', k=50, autoencoder_structure=None, act_fn='relu', lambda_u=0.1, lambda_v=100, lambda_w=0.1, lambda_n=1000, corruption_rate=0.3, learning_rate=0.001, dropout_rate=0.1, batch_size=128, max_iter=100, trainable=True, verbose=True, vocab_size=8000, init_params=None, seed=None)[source]¶ Collaborative Deep Ranking.
Parameters: - k (int, optional, default: 50) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- autoencoder_structure (list, default: None) – The number of neurons of encoder/decoder layer for SDAE. For example, autoencoder_structure = [200], the SDAE structure will be [vocab_size, 200, k, 200, vocab_size]
- act_fn (str, default: 'relu') – Name of the activation function used for the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’, ‘leaky_relu’, ‘identity’]
- learning_rate (float, optional, default: 0.001) – The learning rate for AdamOptimizer.
- lambda_u (float, optional, default: 0.1) – The regularization parameter for users.
- lambda_v (float, optional, default: 100) – The regularization parameter for items.
- lambda_w (float, optional, default: 0.1) – The regularization parameter for SDAE weights.
- lambda_n (float, optional, default: 1000) – The regularization parameter for SDAE output.
- corruption_rate (float, optional, default: 0.3) – The corruption ratio for SDAE.
- dropout_rate (float, optional, default: 0.1) – The probability that each element is removed in dropout of SDAE.
- batch_size (int, optional, default: 128) – The batch size for SGD.
- name (string, optional, default: 'CDR') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}
- U: ndarray, shape (n_users,k)
- The user latent factors, optional initialization via init_params.
- V: ndarray, shape (n_items,k)
- The item latent factors, optional initialization via init_params.
- seed (int, optional, default: None) – Random seed for weight initialization.
References
- Ying, H., Chen, L., Xiong, Y., & Wu, J. (2016). Collaborative Deep Ranking: A Hybrid Pair-Wise Recommendation Algorithm with Implicit Feedback.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
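The autoencoder_structure parameter lists only the hidden encoder widths. The full SDAE layout is built around the k-dimensional bottleneck. The sketch below reproduces the documented example ([200] gives [vocab_size, 200, k, 200, vocab_size]). The helper name sdae_layer_sizes is hypothetical, and mirroring the hidden widths for structures with more than one layer is an assumption of this sketch, not something the documentation states.

```python
def sdae_layer_sizes(vocab_size, k, autoencoder_structure):
    """Expand the hidden encoder widths into the full list of SDAE layer widths."""
    hidden = list(autoencoder_structure)
    # Input layer, encoder widths, k-dimensional bottleneck,
    # mirrored decoder widths (assumption), output layer.
    return [vocab_size] + hidden + [k] + hidden[::-1] + [vocab_size]


print(sdae_layer_sizes(vocab_size=8000, k=50, autoencoder_structure=[200]))
# [8000, 200, 50, 200, 8000]
```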
Collaborative Ordinal Embedding (COE)¶
-
class
cornac.models.coe.recom_coe.
COE
(k=20, max_iter=100, learning_rate=0.05, lamda=0.001, batch_size=1000, name='coe', trainable=True, verbose=False, init_params=None)[source]¶ Collaborative Ordinal Embedding.
Parameters: - k (int, optional, default: 20) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.05) – The learning rate for SGD.
- lamda (float, optional, default: 0.001) – The regularization parameter.
- batch_size (int, optional, default: 1000) – The batch size for SGD.
- name (string, optional, default: 'coe') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- verbose (boolean, optional, default: False) – When True, some running logs are displayed.
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}.
- U: ndarray, shape (n_users, k)
- The user latent factors.
- V: ndarray, shape (n_items, k)
- The item latent factors.
References
- Le, D. D., & Lauw, H. W. (2016, June). Euclidean co-embedding of ordinal data for multi-type visualization. In Proceedings of the 2016 SIAM International Conference on Data Mining (pp. 396-404). Society for Industrial and Applied Mathematics.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
Convolutional Matrix Factorization (ConvMF)¶
-
class
cornac.models.conv_mf.recom_convmf.
ConvMF
(name='ConvMF', k=50, n_epochs=50, cnn_epochs=5, cnn_bs=128, cnn_lr=0.001, lambda_u=1, lambda_v=100, emb_dim=200, max_len=300, filter_sizes=[3, 4, 5], num_filters=100, hidden_dim=200, dropout_rate=0.2, give_item_weight=True, trainable=True, verbose=False, init_params=None, seed=None)[source]¶ Parameters: - k (int, optional, default: 50) – The dimension of the user and item latent factors.
- n_epochs (int, optional, default: 50) – Maximum number of epochs for training.
- cnn_epochs (int, optional, default: 5) – Number of epochs for optimizing the CNN for each overall training epoch.
- cnn_bs (int, optional, default: 128) – Batch size for optimizing CNN.
- cnn_lr (float, optional, default: 0.001) – Learning rate for optimizing CNN.
- lambda_u (float, optional, default: 1.0) – The regularization hyper-parameter for user latent factor.
- lambda_v (float, optional, default: 100.0) – The regularization hyper-parameter for item latent factor.
- emb_dim (int, optional, default: 200) – The embedding size of each word. One word corresponds with [1 x emb_dim] vector in the embedding space
- max_len (int, optional, default: 300) – The maximum length of an item’s document.
- filter_sizes (list, optional, default: [3, 4, 5]) – The length of filters in convolutional layer
- num_filters (int, optional, default: 100) – The number of filters in convolutional layer
- hidden_dim (int, optional, default: 200) – The dimension of hidden layer after the pooling of all convolutional layers
- dropout_rate (float, optional, default: 0.2) – Dropout rate while training CNN
- give_item_weight (boolean, optional, default: True) – When True, each item is weighted based on the number of users who have rated it.
- init_params (dict, optional, default: None) – Initial U and V matrices and initial weights for the embedding layer W, e.g., init_params = {‘U’:U, ‘V’:V, ‘W’:W}.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
References
- Donghyun Kim, Chanyoung Park. ConvMF: Convolutional Matrix Factorization for Document Context-Aware Recommendation. In: 10th ACM Conference on Recommender Systems, pp. 233-240.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
Spherical k-means (Skmeans)¶
-
class
cornac.models.skm.recom_skmeans.
SKMeans
(k=5, max_iter=100, name='Skmeans', trainable=True, tol=1e-06, verbose=True, seed=None, init_par=None)[source]¶ Spherical k-means based recommender.
Parameters: - k (int, optional, default: 5) – The number of clusters.
- max_iter (int, optional, default: 100) – Maximum number of iterations.
- name (string, optional, default: 'Skmeans') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already trained.
- tol (float, optional, default: 1e-6) – Relative tolerance with regards to skmeans’ criterion to declare convergence.
- verbose (boolean, optional, default: True) – When True, some running logs are displayed.
- seed (int, optional, default: None) – Random seed for parameters initialization.
- init_par (numpy 1d array, optional, default: None) – The initial object partition, a 1d array containing the cluster label (int type starting from 0) of each object (user). If init_par = None, then skmeans is initialized randomly.
- centroids (csc_matrix, shape (k,n_users)) – The matrix of cluster centroids.
References
- Salah, Aghiles, Nicoleta Rogovschi, and Mohamed Nadif. “A dynamic collaborative filtering system via a weighted clustering approach.” Neurocomputing 175 (2016): 206-215.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
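Spherical k-means groups objects by direction rather than by magnitude: vectors are scaled to unit length and each object joins the centroid with the highest cosine similarity. The following is a minimal sketch of one assignment step in plain Python, not Cornac's implementation. The helper names and the toy data, with one centroid weight per item, are assumptions made for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def assign_clusters(rows, centroids):
    """Assign each row to the centroid with the highest cosine similarity."""
    unit_centroids = [normalize(c) for c in centroids]
    labels = []
    for row in rows:
        unit_row = normalize(row)
        # On unit vectors, the dot product equals the cosine similarity.
        sims = [sum(a * b for a, b in zip(unit_row, c)) for c in unit_centroids]
        labels.append(max(range(len(sims)), key=sims.__getitem__))
    return labels


users = [[5, 4, 0], [4, 5, 0], [0, 1, 5]]  # toy user rating vectors
centroids = [[1, 1, 0], [0, 0, 1]]         # k = 2 toy centroids
print(assign_clusters(users, centroids))   # [0, 0, 1]
```

The returned labels use the same convention as init_par: one integer cluster label per user, starting from 0.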
Visual Bayesian Personalized Ranking (VBPR)¶
-
class
cornac.models.vbpr.recom_vbpr.
VBPR
(name='VBPR', k=10, k2=10, n_epochs=50, batch_size=100, learning_rate=0.005, lambda_w=0.01, lambda_b=0.01, lambda_e=0.0, use_gpu=False, trainable=True, verbose=True, init_params=None, seed=None)[source]¶ Visual Bayesian Personalized Ranking.
Parameters: - k (int, optional, default: 10) – The dimension of the gamma latent factors.
- k2 (int, optional, default: 10) – The dimension of the theta latent factors.
- n_epochs (int, optional, default: 50) – Maximum number of epochs for SGD.
- batch_size (int, optional, default: 100) – The batch size for SGD.
- learning_rate (float, optional, default: 0.005) – The learning rate for SGD.
- lambda_w (float, optional, default: 0.01) – The regularization hyper-parameter for latent factor weights.
- lambda_b (float, optional, default: 0.01) – The regularization hyper-parameter for biases.
- lambda_e (float, optional, default: 0.0) – The regularization hyper-parameter for embedding matrix E and beta prime vector.
- use_gpu (boolean, optional, default: False) – Whether or not to use GPU to speed up training.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- verbose (boolean, optional, default: True) – When True, running logs are displayed.
- init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘Bi’: beta_item, ‘Gu’: gamma_user, ‘Gi’: gamma_item, ‘Tu’: theta_user, ‘E’: emb_matrix, ‘Bp’: beta_prime}
- seed (int, optional, default: None) – Random seed for weight initialization.
References
- He, R., & McAuley, J. (2016). VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
Collaborative Deep Learning (CDL)¶
-
class
cornac.models.cdl.recom_cdl.
CDL
(name='CDL', k=50, autoencoder_structure=None, act_fn='relu', lambda_u=0.1, lambda_v=10, lambda_w=0.1, lambda_n=1000, a=1, b=0.01, corruption_rate=0.3, learning_rate=0.001, vocab_size=8000, dropout_rate=0.1, batch_size=128, max_iter=100, trainable=True, verbose=True, init_params=None, seed=None)[source]¶ Collaborative Deep Learning.
Parameters: - name (string, default: 'CDL') – The name of the recommender model.
- k (int, optional, default: 50) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- autoencoder_structure (list, default: None) – The number of neurons of encoder/decoder layer for SDAE. For example, autoencoder_structure = [200], the SDAE structure will be [vocab_size, 200, k, 200, vocab_size]
- act_fn (str, default: 'relu') – Name of the activation function used for the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’, ‘leaky_relu’, ‘identity’]
- learning_rate (float, optional, default: 0.001) – The learning rate for AdamOptimizer.
- vocab_size (int, default: 8000) – The size of text input for the SDAE.
- lambda_u (float, optional, default: 0.1) – The regularization parameter for users.
- lambda_v (float, optional, default: 10) – The regularization parameter for items.
- lambda_w (float, optional, default: 0.1) – The regularization parameter for SDAE weights.
- lambda_n (float, optional, default: 1000) – The regularization parameter for SDAE output.
- a (float, optional, default: 1) – The confidence of observed ratings.
- b (float, optional, default: 0.01) – The confidence of unseen ratings.
- corruption_rate (float, optional, default: 0.3) – The corruption ratio for input text of the SDAE.
- dropout_rate (float, optional, default: 0.1) – The probability that each element is removed in dropout of SDAE.
- batch_size (int, optional, default: 128) – The batch size for SGD.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}
- U: ndarray, shape (n_users,k)
- The user latent factors, optional initialization via init_params.
- V: ndarray, shape (n_items,k)
- The item latent factors, optional initialization via init_params.
- seed (int, optional, default: None) – Random seed for weight initialization.
References
- Hao Wang, Naiyan Wang, Dit-Yan Yeung. CDL: Collaborative Deep Learning for Recommender Systems. In: SIGKDD. 2015. p. 1235-1244.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
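The a and b parameters set how much the model trusts observed versus unseen ratings. A minimal sketch of the resulting confidence weights follows, assuming a dense toy matrix in which 0 marks an unseen rating. The helper name confidence_matrix and the 0-means-unseen convention are assumptions of this sketch.

```python
def confidence_matrix(ratings, a=1.0, b=0.01):
    """Return confidence a for observed entries and b for unseen (zero) entries."""
    return [[a if r > 0 else b for r in row] for row in ratings]


ratings = [
    [5, 0],  # user 0 rated item 0 only
    [0, 3],  # user 1 rated item 1 only
]
print(confidence_matrix(ratings))  # [[1.0, 0.01], [0.01, 1.0]]
```

With a much larger than b, errors on observed ratings dominate the loss, while unseen entries still contribute a weak signal.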
Hierarchical Poisson Factorization (HPF)¶
-
class
cornac.models.hpf.recom_hpf.
HPF
(k=5, max_iter=100, name='HPF', trainable=True, verbose=False, hierarchical=True, seed=None, init_params=None)[source]¶ Hierarchical Poisson Factorization.
Parameters: - k (int, optional, default: 5) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations.
- name (string, optional, default: 'HPF') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (Theta and Beta are not None).
- verbose (boolean, optional, default: False) – When True, some running logs are displayed.
- hierarchical (boolean, optional, default: True) – When False, PF is used instead of HPF.
- seed (int, optional, default: None) – Random seed for parameters initialization.
- init_params (dict, optional, default: None) –
Initial parameters of the model.
- Theta: ndarray, shape (n_users, k)
- The expected user latent factors.
- Beta: ndarray, shape (n_items, k)
- The expected item latent factors.
- G_s: ndarray, shape (n_users, k)
- This represents “shape” parameters of Gamma distribution over Theta.
- G_r: ndarray, shape (n_users, k)
- This represents “rate” parameters of Gamma distribution over Theta.
- L_s: ndarray, shape (n_items, k)
- This represents “shape” parameters of Gamma distribution over Beta.
- L_r: ndarray, shape (n_items, k)
- This represents “rate” parameters of Gamma distribution over Beta.
References
- Gopalan, Prem, Jake M. Hofman, and David M. Blei. Scalable Recommendation with Hierarchical Poisson Factorization. In UAI, pp. 326-335. 2015.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
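The shape and rate arrays in init_params relate to the expected factors through the mean of a Gamma distribution, which is shape divided by rate. The sketch below shows that relationship on toy lists. The helper names are hypothetical, and scoring a user-item pair by the dot product of expected factors is the usual Poisson factorization convention, assumed here rather than taken from this page.

```python
def gamma_mean(shape, rate):
    """Element-wise mean of Gamma(shape, rate) variables: shape / rate."""
    return [[s / r for s, r in zip(s_row, r_row)] for s_row, r_row in zip(shape, rate)]


def expected_preference(theta_u, beta_i):
    """Expected Poisson rate for one user-item pair (dot product of factors)."""
    return sum(t * b for t, b in zip(theta_u, beta_i))


G_s, G_r = [[2.0, 3.0]], [[4.0, 6.0]]  # toy user shape and rate parameters
L_s, L_r = [[1.0, 8.0]], [[2.0, 4.0]]  # toy item shape and rate parameters
Theta = gamma_mean(G_s, G_r)           # [[0.5, 0.5]]
Beta = gamma_mean(L_s, L_r)            # [[0.5, 2.0]]
print(expected_preference(Theta[0], Beta[0]))  # 1.25
```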
Explicit Factor Model (EFM)¶
-
class
cornac.models.efm.recom_efm.
EFM
¶ Explicit Factor Model.
Parameters: - num_explicit_factors (int, optional, default: 40) – The dimension of the explicit factors.
- num_latent_factors (int, optional, default: 60) – The dimension of the latent factors.
- num_most_cared_aspects (int, optional, default: 15) – The number of most cared aspects for each user.
- rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.
- alpha (float, optional, default: 0.85) – Trade-off factor for constructing ranking score.
- lambda_x (float, optional, default: 1) – The regularization parameter for user aspect attentions.
- lambda_y (float, optional, default: 1) – The regularization parameter for item aspect qualities.
- lambda_u (float, optional, default: 0.01) – The regularization parameter for user and item explicit factors.
- lambda_h (float, optional, default: 0.01) – The regularization parameter for user and item latent factors.
- lambda_v (float, optional, default: 0.01) – The regularization parameter for V.
- use_item_aspect_popularity (boolean, optional, default: True) – When False, item aspect frequency is omitted from the item aspect quality computation formula. Specifically, \(Y_{ij} = 1 + \frac{N - 1}{1 + e^{-s_{ij}}}\) if \(p_i\) is reviewed on feature \(F_j\).
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs.
- name (string, optional, default: 'EFM') – The name of the recommender model.
- num_threads (int, optional, default: 0) – Number of parallel threads for training. If 0, all CPU cores will be utilized.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U1, U2, V, H1, and H2 are not None).
- verbose (boolean, optional, default: False) – When True, running logs are displayed.
- init_params (dictionary, optional, default: {}) –
List of initial parameters, e.g., init_params = {‘U1’:U1, ‘U2’:U2, ‘V’:V, ‘H1’:H1, ‘H2’:H2}
- U1: ndarray, shape (n_users, n_explicit_factors)
- The user explicit factors, optional initialization via init_params.
- U2: ndarray, shape (n_items, n_explicit_factors)
- The item explicit factors, optional initialization via init_params.
- V: ndarray, shape (n_aspects, n_explicit_factors)
- The aspect factors, optional initialization via init_params.
- H1: ndarray, shape (n_users, n_latent_factors)
- The user latent factors, optional initialization via init_params.
- H2: ndarray, shape (n_items, n_latent_factors)
- The item latent factors, optional initialization via init_params.
- seed (int, optional, default: None) – Random seed for weight initialization.
References
Yongfeng Zhang, Guokun Lai, Min Zhang, Yi Zhang, Yiqun Liu, and Shaoping Ma. 2014. Explicit factor models for explainable recommendation based on phrase-level sentiment analysis. In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval (SIGIR ‘14). ACM, New York, NY, USA, 83-92. DOI: https://doi.org/10.1145/2600428.2609579
-
fit
¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
-
rank
¶ Rank all test items for a given user.
Parameters: - user_idx (int, required) – The index of the user for whom to perform item ranking.
- item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked for the user. If None, the list of ranked known item indices and their scores will be returned.
Returns: Tuple of item_rank and item_scores. The order of values in item_scores corresponds to the order of their ids in item_indices.
-
score
¶ Predict the scores/ratings of a user for an item.
Parameters: Returns: res – Relative scores that the user gives to the item or to all known items
Return type: A scalar or a Numpy array
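The item aspect quality formula given for use_item_aspect_popularity=False maps a sentiment score onto the rating range with a logistic curve. The following sketch evaluates it directly. Treating N as the rating_scale parameter is an assumption of this sketch, and the function name is hypothetical.

```python
import math


def item_aspect_quality(sentiment, rating_scale=5.0):
    """Y_ij = 1 + (N - 1) / (1 + exp(-s_ij)), with N taken to be rating_scale."""
    return 1.0 + (rating_scale - 1.0) / (1.0 + math.exp(-sentiment))


print(item_aspect_quality(0.0))  # 3.0, the midpoint of the range [1, 5]
for s in (-3.0, 0.0, 3.0):
    print(s, round(item_aspect_quality(s), 3))
```

Strongly negative sentiment pushes the value toward 1, and strongly positive sentiment pushes it toward N.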
Hidden Factors and Hidden Topics (HFT)¶
-
class
cornac.models.hft.recom_hft.
HFT
(name='HFT', k=10, max_iter=50, grad_iter=50, lambda_text=0.1, l2_reg=0.001, vocab_size=8000, init_params=None, trainable=True, verbose=True, seed=None)[source]¶ Hidden Factors and Hidden Topics
Parameters: - name (string, default: 'HFT') – The name of the recommender model.
- k (int, optional, default: 10) – The dimension of the latent factors.
- max_iter (int, optional, default: 50) – Maximum number of iterations for EM.
- grad_iter (int, optional, default: 50) – Maximum number of iterations for L-BFGS.
- lambda_text (float, default: 0.1) – Weight of corpus likelihood in objective function.
- l2_reg (float, default: 0.001) – Regularization for user and item latent factors.
- vocab_size (int, optional, default: 8000) – Size of vocabulary for review text.
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘alpha’: alpha, ‘beta_u’: beta_u, ‘beta_i’: beta_i, ‘gamma_u’: gamma_u, ‘gamma_v’: gamma_v}
- alpha: float
- Model offset, optional initialization via init_params.
- beta_u: ndarray, shape (n_users, 1)
- User biases, optional initialization via init_params.
- beta_i: ndarray, shape (n_items, 1)
- Item biases, optional initialization via init_params.
- gamma_u: ndarray, shape (n_users,k)
- The user latent factors, optional initialization via init_params.
- gamma_v: ndarray, shape (n_items,k)
- The item latent factors, optional initialization via init_params.
- trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required.
- verbose (boolean, optional, default: True) – When True, some running logs are displayed.
- seed (int, optional, default: None) – Random seed for weight initialization.
References
Julian McAuley, Jure Leskovec. “Hidden Factors and Hidden Topics: Understanding Rating Dimensions with Review Text” RecSys ‘13 Proceedings of the 7th ACM conference on Recommender systems Pages 165-172
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
Weighted Bayesian Personalized Ranking (WBPR)¶
-
class
cornac.models.bpr.recom_wbpr.
WBPR
¶ Weighted Bayesian Personalized Ranking.
Parameters: - k (int, optional, default: 10) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.001) – The learning rate for SGD.
- lambda_reg (float, optional, default: 0.001) – The regularization hyper-parameter.
- use_bias (boolean, optional, default: True) – When True, item bias is used.
- num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
- trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required.
- verbose (boolean, optional, default: True) – When True, some running logs are displayed.
- init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bi’: item_biases}
- seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
- Gantner, Zeno, Lucas Drumond, Christoph Freudenthaler, and Lars Schmidt-Thieme. “Personalized ranking for non-uniformly sampled items.” In Proceedings of KDD Cup 2011, pp. 231-247. 2012.
-
fit
¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
Collaborative Topic Regression (CTR)¶
-
class
cornac.models.ctr.recom_ctr.
CTR
(name='CTR', k=200, lambda_u=0.01, lambda_v=0.01, eta=0.01, a=1, b=0.01, max_iter=100, trainable=True, verbose=True, init_params=None, seed=None)[source]¶ Collaborative Topic Regression.
Parameters: - name (string, default: 'CTR') – The name of the recommender model.
- k (int, optional, default: 200) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- lambda_u (float, optional, default: 0.01) – The regularization parameter for users.
- lambda_v (float, optional, default: 0.01) – The regularization parameter for items.
- a (float, optional, default: 1) – The confidence of observed ratings.
- b (float, optional, default: 0.01) – The confidence of unseen ratings.
- eta (float, optional, default: 0.01) – Added value for smoothing phi.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}
- U: ndarray, shape (n_users,k)
- The user latent factors, optional initialization via init_params.
- V: ndarray, shape (n_items,k)
- The item latent factors, optional initialization via init_params.
- seed (int, optional, default: None) – Random seed for weight initialization.
References
Wang, Chong, and David M. Blei. “Collaborative topic modeling for recommending scientific articles.” Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2011.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
Baseline Only¶
-
cornac.models.baseline_only.
recom_bo
¶ alias of
cornac.models.baseline_only.recom_bo
Bayesian Personalized Ranking (BPR)¶
-
class
cornac.models.bpr.recom_bpr.
BPR
¶ Bayesian Personalized Ranking.
Parameters: - k (int, optional, default: 10) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.001) – The learning rate for SGD.
- lambda_reg (float, optional, default: 0.001) – The regularization hyper-parameter.
- use_bias (boolean, optional, default: True) – When True, item bias is used.
- num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
- trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required.
- verbose (boolean, optional, default: True) – When True, some running logs are displayed.
- init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bi’: item_biases}
- seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
- Rendle, Steffen, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In UAI, pp. 452-461. 2009.
-
fit
¶ Fit the model to observations.
Parameters: - train_set (
cornac.data.Dataset
, required) – User-Item preference data as well as additional modalities. - val_set (
cornac.data.Dataset
, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
-
score
¶ Predict the scores/ratings of a user for an item.
Parameters: Returns: res – Relative scores that the user gives to the item or to all known items
Return type: A scalar or a Numpy array
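With the three init_params entries (‘U’, ‘V’, ‘Bi’), a BPR-style relative score is commonly the dot product of user and item factors plus the item bias. The sketch below shows that scoring rule on toy lists. It illustrates the general technique and is not Cornac's compiled implementation. The function name and the item_idx argument are assumptions made for illustration.

```python
def bpr_score(U, V, Bi, user_idx, item_idx=None):
    """Relative score of one item, or of all known items when item_idx is None."""
    def score_one(i):
        # Dot product of user and item factors, plus the item bias.
        return sum(u * v for u, v in zip(U[user_idx], V[i])) + Bi[i]

    if item_idx is None:
        return [score_one(i) for i in range(len(V))]
    return score_one(item_idx)


U = [[1.0, 0.0], [0.5, 0.5]]   # user_factors, shape (n_users, k)
V = [[2.0, 1.0], [0.0, 4.0]]   # item_factors, shape (n_items, k)
Bi = [0.1, -0.1]               # item_biases, shape (n_items,)
print(bpr_score(U, V, Bi, user_idx=0))              # scores for all items
print(bpr_score(U, V, Bi, user_idx=1, item_idx=1))  # score for one item
```

Only the ordering of the scores matters for ranking, so they should not be read as predicted ratings.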
Factorization Machines (FM)¶
Global Average (GlobalAvg)¶
-
class
cornac.models.global_avg.recom_global_avg.
GlobalAvg
(name='GlobalAvg')[source]¶ Global Average baseline for rating prediction. Rating predictions are equal to the average rating of the training data (not personalized).
Parameters: name (string, default: 'GlobalAvg') – The name of the recommender model.
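The baseline amounts to a single number computed from the training ratings. A minimal sketch follows, assuming the training data is available as (user, item, rating) triples. The helper name and the toy triples are illustrative.

```python
def global_average(triples):
    """Mean rating over all (user, item, rating) training triples."""
    ratings = [rating for _, _, rating in triples]
    return sum(ratings) / len(ratings)


train = [("u1", "i1", 4.0), ("u1", "i2", 2.0), ("u2", "i1", 3.0)]
mu = global_average(train)
print(mu)  # 3.0, predicted for every user-item pair
```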
Item K-Nearest-Neighbors (ItemKNN)¶
-
class
cornac.models.knn.recom_knn.
ItemKNN
(name='ItemKNN', k=20, similarity='cosine', mean_centered=False, weighting=None, amplify=1.0, num_threads=0, trainable=True, verbose=True, seed=None)[source]¶ Item-Based Nearest Neighbor.
Parameters: - name (string, default: 'ItemKNN') – The name of the recommender model.
- k (int, optional, default: 20) – The number of nearest neighbors.
- similarity (str, optional, default: 'cosine') – The similarity measurement. Supported types: [‘cosine’, ‘pearson’]
- mean_centered (bool, optional, default: False) – Whether values of the user-item rating matrix will be centered by the mean of their corresponding rows (mean rating of each user).
- weighting (str, optional, default: None) – The option for re-weighting the rating matrix. Supported types: [‘idf’, ‘bm25’]. If None, no weighting is applied.
- amplify (float, optional, default: 1.0) – Amplifying the influence on similarity weights.
- num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
- seed (int, optional, default: None) – Random seed for weight initialization.
References
- Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001, April). Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th international conference on World Wide Web (pp. 285-295).
- Aggarwal, C. C. (2016). Recommender systems (Vol. 1). Cham: Springer International Publishing.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
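As a rough illustration of item-based nearest neighbors with the 'cosine' similarity option, the sketch below compares item columns of a small dense rating matrix and keeps the k most similar items. It assumes a dense list-of-lists matrix where 0 means unobserved; the matrix and the helpers `column`, `cosine`, and `nearest_items` are hypothetical and do not reflect how Cornac stores data or computes similarities internally.

```python
import math

# Hypothetical dense user-item matrix (rows: users, columns: items); 0 = unobserved.
R = [
    [5.0, 4.0, 0.0],
    [4.0, 5.0, 1.0],
    [0.0, 1.0, 5.0],
]

def column(matrix, j):
    """Return the rating vector of item j across all users."""
    return [row[j] for row in matrix]

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def nearest_items(matrix, item, k):
    """Return the k items most similar to `item` as (similarity, index) pairs."""
    target = column(matrix, item)
    sims = [(cosine(target, column(matrix, j)), j)
            for j in range(len(matrix[0])) if j != item]
    return sorted(sims, reverse=True)[:k]

neighbors = nearest_items(R, item=0, k=1)
print(neighbors)
```

A rating for a target item would then typically be predicted as a similarity-weighted average of the user's ratings on these neighbors.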
Matrix Factorization (MF)¶
-
class
cornac.models.mf.recom_mf.
MF
¶ Matrix Factorization.
Parameters: - k (int, optional, default: 10) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.01) – The learning rate.
- lambda_reg (float, optional, default: 0.001) – The lambda value used for regularization.
- use_bias (boolean, optional, default: True) – When True, user, item, and global biases are used.
- early_stop (boolean, optional, default: False) – When True, delta loss will be checked after each iteration to stop learning earlier.
- num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
- trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required.
- verbose (boolean, optional, default: True) – When True, running logs are displayed.
- init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bu’: user_biases, ‘Bi’: item_biases}
- seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
- Koren, Y., Bell, R., & Volinsky, C. Matrix factorization techniques for recommender systems. In Computer, (8), 30-37. 2009.
-
fit
¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
-
score
¶ Predict the scores/ratings of a user for an item.
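Parameters: - user_idx (int, required) – The index of the user for whom to perform score prediction.
- item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.
Returns: res – Relative scores that the user gives to the item or to all known items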
Return type: A scalar or a Numpy array
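To make the roles of `learning_rate`, `lambda_reg`, and `use_bias` more concrete, here is a minimal pure-Python sketch of a biased matrix factorization prediction and a single SGD update on one observed rating. It follows the textbook formulation from the Koren et al. reference and is an assumption about the general technique, not a copy of Cornac's Cython code; the function names and toy values are hypothetical.

```python
def predict(mu, bu, bi, pu, qi):
    """Biased MF prediction: global mean + user bias + item bias + dot product."""
    return mu + bu + bi + sum(p * q for p, q in zip(pu, qi))

def sgd_step(r, mu, bu, bi, pu, qi, learning_rate=0.01, lambda_reg=0.001):
    """One SGD update on a single observed rating r; returns updated parameters."""
    err = r - predict(mu, bu, bi, pu, qi)
    new_bu = bu + learning_rate * (err - lambda_reg * bu)
    new_bi = bi + learning_rate * (err - lambda_reg * bi)
    new_pu = [p + learning_rate * (err * q - lambda_reg * p) for p, q in zip(pu, qi)]
    new_qi = [q + learning_rate * (err * p - lambda_reg * q) for p, q in zip(pu, qi)]
    return new_bu, new_bi, new_pu, new_qi

# Toy setup: one user, one item, one observed rating (all values hypothetical).
mu, bu, bi = 3.0, 0.0, 0.0
pu, qi = [0.1, 0.1], [0.1, 0.1]
r = 5.0

error_before = abs(r - predict(mu, bu, bi, pu, qi))
for _ in range(100):
    bu, bi, pu, qi = sgd_step(r, mu, bu, bi, pu, qi)
error_after = abs(r - predict(mu, bu, bi, pu, qi))
print(error_before, error_after)
```

In this sketch the global mean `mu` is held fixed and only the biases and latent factors are updated, which is a common simplification.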
Maximum Margin Matrix Factorization (MMMF)¶
-
class
cornac.models.mmmf.recom_mmmf.
MMMF
¶ Maximum Margin Matrix Factorization. This implements an MF model optimized for the Soft Margin (Hinge) Ranking Loss using SGD, similar to the BPR model.
Parameters: - k (int, optional, default: 10) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.001) – The learning rate for SGD.
- lambda_reg (float, optional, default: 0.001) – The regularization hyper-parameter.
- num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
- trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required.
- verbose (boolean, optional, default: True) – When True, some running logs are displayed.
- init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bi’: item_biases}
- seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
- Weimer, M., Karatzoglou, A., & Smola, A. (2008). Improving maximum margin matrix factorization. Machine Learning, 72(3), 263-276.
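The soft-margin (hinge) ranking loss mentioned above can be illustrated for a single (user, positive item, negative item) triple. This is a sketch of the standard pairwise hinge loss with a margin of 1, which is an assumption; item biases and regularization are omitted, and the helper names are hypothetical.

```python
def score(pu, qi):
    """Dot product between a user factor vector and an item factor vector."""
    return sum(p * q for p, q in zip(pu, qi))

def hinge_ranking_loss(pu, q_pos, q_neg):
    """Pairwise hinge loss: zero once the positive item outscores the negative by >= 1."""
    margin = score(pu, q_pos) - score(pu, q_neg)
    return max(0.0, 1.0 - margin)

user = [1.0, 0.0]
print(hinge_ranking_loss(user, [2.0, 0.0], [0.5, 0.0]))  # well separated pair
print(hinge_ranking_loss(user, [0.5, 0.0], [0.5, 0.0]))  # tied pair
```

Unlike the logistic loss used by BPR, the hinge loss stops producing gradient for pairs that are already separated by the margin.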
Most Popular (MostPop)¶
-
class
cornac.models.most_pop.recom_most_pop.
MostPop
(name='MostPop')[source]¶ Most Popular. Items are recommended based on their popularity (not personalized).
Parameters: name (string, default: 'MostPop') – The name of the recommender model.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
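A popularity baseline amounts to counting interactions per item and ranking by that count. The sketch below shows the idea with the standard library only; the interaction pairs and the helpers `fit_popularity` and `recommend` are hypothetical and are not part of Cornac's API.

```python
from collections import Counter

# Hypothetical (user, item) interaction pairs, invented for illustration.
interactions = [
    ("u1", "i1"), ("u2", "i1"), ("u3", "i2"),
    ("u1", "i3"), ("u3", "i1"), ("u2", "i2"),
]

def fit_popularity(pairs):
    """Count how many interactions each item received."""
    return Counter(item for _, item in pairs)

def recommend(popularity, n):
    """Return the n most frequently interacted items, identical for every user."""
    return [item for item, _ in popularity.most_common(n)]

popularity = fit_popularity(interactions)
top2 = recommend(popularity, 2)
print(top2)
```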
-
Non-negative Matrix Factorization (NMF)¶
-
class
cornac.models.nmf.recom_nmf.
NMF
¶ Non-negative Matrix Factorization
Parameters: - k (int, optional, default: 15) – The dimension of the latent factors.
- max_iter (int, optional, default: 50) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.005) – The learning rate.
- lambda_reg (float, optional, default: 0.0) – The lambda value used for regularization of all parameters.
- lambda_u (float, optional, default: 0.06) – The regularization parameter for user factors U.
- lambda_v (float, optional, default: 0.06) – The regularization parameter for item factors V.
- lambda_bu (float, optional, default: 0.02) – The regularization parameter for user biases Bu.
- lambda_bi (float, optional, default: 0.02) – The regularization parameter for item biases Bi.
- use_bias (boolean, optional, default: False) – When True, user, item, and global biases are used.
- num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
- trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required.
- verbose (boolean, optional, default: True) – When True, running logs are displayed.
- init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bu’: user_biases, ‘Bi’: item_biases, ‘mu’: global_mean}
- seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
- Lee, D. D., & Seung, H. S. (2001). Algorithms for non-negative matrix factorization. In Advances in neural information processing systems (pp. 556-562).
- Takahashi, N., Katayama, J., & Takeuchi, J. I. (2014). A generalized sufficient condition for global convergence of modified multiplicative updates for NMF. In Proceedings of 2014 International Symposium on Nonlinear Theory and its Applications (pp. 44-47).
-
fit
¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
-
score
¶ Predict the scores/ratings of a user for an item.
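Parameters: - user_idx (int, required) – The index of the user for whom to perform score prediction.
- item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.
Returns: res – Relative scores that the user gives to the item or to all known items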
Return type: A scalar or a Numpy array
Probabilistic Matrix Factorization (PMF)¶
-
class
cornac.models.pmf.recom_pmf.
PMF
(k=5, max_iter=100, learning_rate=0.001, gamma=0.9, lambda_reg=0.001, name='PMF', variant='non_linear', trainable=True, verbose=False, init_params=None, seed=None)[source]¶ Probabilistic Matrix Factorization.
Parameters: - k (int, optional, default: 5) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.
- gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.
- lambda_reg (float, optional, default: 0.001) – The regularization coefficient.
- name (string, optional, default: 'PMF') – The name of the recommender model.
- variant ({"linear","non_linear"}, optional, default: 'non_linear') – PMF variant. If ‘non_linear’, the Gaussian mean is the output of a Sigmoid function. If ‘linear’, the Gaussian mean is the output of the identity function.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- verbose (boolean, optional, default: False) – When True, some running logs are displayed.
- init_params (dict, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}.
- U: ndarray, shape (n_users, k)
- User latent factors.
- V: ndarray, shape (n_items, k)
- Item latent factors.
- seed (int, optional, default: None) – Random seed for parameters initialization.
References
- Mnih, Andriy, and Ruslan R. Salakhutdinov. Probabilistic matrix factorization. In NIPS, pp. 1257-1264. 2008.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
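The difference between the two `variant` options can be sketched as follows: both start from the dot product of the user and item factors, and the 'non_linear' variant passes it through a sigmoid. This is an illustrative reading of the parameter description above, not Cornac's implementation; `gaussian_mean` is a hypothetical helper name.

```python
import math

def gaussian_mean(pu, qi, variant="non_linear"):
    """Mean of the Gaussian over a rating, given user factors pu and item factors qi."""
    dot = sum(p * q for p, q in zip(pu, qi))
    if variant == "non_linear":
        # Sigmoid squashes the dot product into the open interval (0, 1).
        return 1.0 / (1.0 + math.exp(-dot))
    if variant == "linear":
        # Identity function: the raw dot product is the mean.
        return dot
    raise ValueError("variant must be 'linear' or 'non_linear'")

print(gaussian_mean([1.0, 2.0], [3.0, 4.0], variant="linear"))
print(gaussian_mean([0.0, 0.0], [1.0, 1.0], variant="non_linear"))
```

Since a sigmoid output lies in (0, 1), a non-linear variant generally implies that ratings are mapped to that range before training; how Cornac performs this mapping is not stated in this section.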
Singular Value Decomposition (SVD)¶
-
class
cornac.models.svd.recom_svd.
SVD
(name='SVD', k=10, max_iter=20, learning_rate=0.01, lambda_reg=0.02, early_stop=False, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)[source]¶ Singular Value Decomposition (SVD). The implementation is based on Matrix Factorization with biases.
Parameters: - k (int, optional, default: 10) – The dimension of the latent factors.
- max_iter (int, optional, default: 20) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.01) – The learning rate.
- lambda_reg (float, optional, default: 0.02) – The lambda value used for regularization.
- early_stop (boolean, optional, default: False) – When True, delta loss will be checked after each iteration to stop learning earlier.
- num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
- trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required.
- verbose (boolean, optional, default: False) – When True, running logs are displayed.
- init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bu’: user_biases, ‘Bi’: item_biases}
- seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
References
- Koren, Y. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In SIGKDD, pp. 426-434. 2008.
- Koren, Y. Factor in the neighbors: Scalable and accurate collaborative filtering. In TKDD, 2010.
Social Recommendation using PMF (SoRec)¶
-
class
cornac.models.sorec.recom_sorec.
SoRec
(name='SoRec', k=5, max_iter=100, learning_rate=0.001, lambda_c=10, lambda_reg=0.001, gamma=0.9, weight_link=True, trainable=True, verbose=False, init_params=None, seed=None)[source]¶ Social recommendation using Probabilistic Matrix Factorization.
Parameters: - k (int, optional, default: 5) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.
- gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.
- lambda_c (float, optional, default: 10) – The parameter balancing the information from the user-item rating matrix and the user social network.
- lambda_reg (float, optional, default: 0.001) – The regularization parameter.
- weight_link (boolean, optional, default: True) – When True, the social network links are weighted according to eq. (4) in the original paper.
- name (string, optional, default: 'SoRec') – The name of the recommender model.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, V and Z are not None).
- verbose (boolean, optional, default: False) – When True, some running logs are displayed.
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V, ‘Z’:Z}.
- U: a ndarray of shape (n_users, k)
- Containing the user latent factors.
- V: a ndarray of shape (n_items, k)
- Containing the item latent factors.
- Z: a ndarray of shape (n_users, k)
- Containing the social network latent factors.
- seed (int, optional, default: None) – Random seed for parameters initialization.
References
- H. Ma, H. Yang, M. R. Lyu, and I. King. SoRec: Social recommendation using probabilistic matrix factorization. CIKM ’08, pages 931–940, Napa Valley, USA, 2008.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
-
score
(user_idx, item_idx=None)[source]¶ Predict the scores/ratings of a user for an item.
Parameters: - user_idx (int, required) – The index of the user for whom to perform score prediction.
- item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.
Returns: res – Relative scores that the user gives to the item or to all known items
Return type: A scalar or a Numpy array
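The two calling modes of `score` (one item versus all known items) can be mimicked with a small stand-alone function. This sketch assumes a plain dot product between user and item factors and uses Python lists instead of NumPy arrays; the factor values are hypothetical, and the actual SoRec scoring function may apply further transformations.

```python
# Hypothetical latent factors: 2 users and 3 items with k = 2.
U = [[0.1, 0.2], [0.3, 0.4]]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def score(user_idx, item_idx=None):
    """Return one score if item_idx is given, otherwise scores for all known items."""
    if item_idx is None:
        return [dot(U[user_idx], v) for v in V]
    return dot(U[user_idx], V[item_idx])

print(score(1, 2))  # a single scalar
print(score(0))     # one score per known item
```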
User K-Nearest-Neighbors (UserKNN)¶
-
class
cornac.models.knn.recom_knn.
UserKNN
(name='UserKNN', k=20, similarity='cosine', mean_centered=False, weighting=None, amplify=1.0, num_threads=0, trainable=True, verbose=True, seed=None)[source]¶ User-Based Nearest Neighbor.
Parameters: - name (string, default: 'UserKNN') – The name of the recommender model.
- k (int, optional, default: 20) – The number of nearest neighbors.
- similarity (str, optional, default: 'cosine') – The similarity measurement. Supported types: [‘cosine’, ‘pearson’]
- mean_centered (bool, optional, default: False) – Whether values of the user-item rating matrix will be centered by the mean of their corresponding rows (mean rating of each user).
- weighting (str, optional, default: None) – The option for re-weighting the rating matrix. Supported types: [‘idf’, ‘bm25’]. If None, no weighting is applied.
- amplify (float, optional, default: 1.0) – Amplifying the influence on similarity weights.
- num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.
- seed (int, optional, default: None) – Random seed for weight initialization.
References
- Breese, J. S., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. Microsoft Research, Microsoft Corporation, One Microsoft Way, Redmond, WA 98052.
- Aggarwal, C. C. (2016). Recommender systems (Vol. 1). Cham: Springer International Publishing.
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
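The effect of `mean_centered=True` can be pictured as subtracting each user's mean rating from that user's observed ratings before similarities are computed. The sketch below is a hypothetical illustration using `None` for unobserved entries; it does not reproduce Cornac's sparse-matrix code, and `mean_center_rows` is an invented helper name.

```python
# Hypothetical rating matrix (rows: users, columns: items); None = unobserved.
R = [
    [5.0, 3.0, None],
    [1.0, None, 1.0],
]

def mean_center_rows(matrix):
    """Subtract each user's mean observed rating from that user's observed ratings."""
    centered = []
    for row in matrix:
        observed = [r for r in row if r is not None]
        mean = sum(observed) / len(observed) if observed else 0.0
        centered.append([None if r is None else r - mean for r in row])
    return centered

print(mean_center_rows(R))
```

Centering removes per-user rating offsets, so a user who rates everything high and a user who rates everything low can still be recognized as having similar tastes.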
Weighted Matrix Factorization (WMF)¶
-
class
cornac.models.wmf.recom_wmf.
WMF
(name='WMF', k=200, lambda_u=0.01, lambda_v=0.01, a=1, b=0.01, learning_rate=0.001, batch_size=128, max_iter=100, trainable=True, verbose=True, init_params=None, seed=None)[source]¶ Weighted Matrix Factorization.
Parameters: - name (string, default: 'WMF') – The name of the recommender model.
- k (int, optional, default: 200) – The dimension of the latent factors.
- max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.
- learning_rate (float, optional, default: 0.001) – The learning rate for AdamOptimizer.
- lambda_u (float, optional, default: 0.01) – The regularization parameter for users.
- lambda_v (float, optional, default: 0.01) – The regularization parameter for items.
- a (float, optional, default: 1) – The confidence of observed ratings.
- b (float, optional, default: 0.01) – The confidence of unseen ratings.
- batch_size (int, optional, default: 128) – The batch size for SGD.
- trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
- init_params (dictionary, optional, default: None) –
List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}
- U: ndarray, shape (n_users,k)
- The user latent factors, optional initialization via init_params.
- V: ndarray, shape (n_items,k)
- The item latent factors, optional initialization via init_params.
- seed (int, optional, default: None) – Random seed for weight initialization.
References
- Hu, Y., Koren, Y., & Volinsky, C. (2008, December). Collaborative filtering for implicit feedback datasets. In 2008 Eighth IEEE International Conference on Data Mining (pp. 263-272).
- Pan, R., Zhou, Y., Cao, B., Liu, N. N., Lukose, R., Scholz, M., & Yang, Q. (2008, December). One-class collaborative filtering. In 2008 Eighth IEEE International Conference on Data Mining (pp. 502-511).
-
fit
(train_set, val_set=None)[source]¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object
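The confidence parameters `a` and `b` can be read as per-entry weights on the squared reconstruction error: observed ratings are weighted by `a` and unseen ones by `b`. The following sketch computes that weighted error on a tiny dense matrix. It is an assumption-based illustration that omits the `lambda_u` and `lambda_v` regularization terms and treats any entry greater than 0 as observed; the names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_squared_error(R, U, V, a=1.0, b=0.01):
    """Sum of confidence-weighted squared errors over every user-item cell."""
    loss = 0.0
    for u, row in enumerate(R):
        for i, r in enumerate(row):
            confidence = a if r > 0 else b  # observed vs. unseen entry
            loss += confidence * (r - dot(U[u], V[i])) ** 2
    return loss

# One user, two items: the first is observed, the second is unseen.
R = [[1.0, 0.0]]
U = [[1.0]]
V = [[0.5], [0.5]]
print(weighted_squared_error(R, U, V))
```

With the default `b=0.01`, errors on unseen entries contribute 100 times less than errors on observed ones, which keeps the many unobserved cells from dominating the objective.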
Social Bayesian Personalized Ranking (SBPR)¶
cornac.models.sbpr.recom_sbpr.
SBPR
¶ Social Bayesian Personalized Ranking.
References
fit
¶ Fit the model to observations.
Parameters: - train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.
- val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).
Returns: self
Return type: object