Models#

Below are the models that are currently supported in Cornac.

Recommender (Generic Class)#

class cornac.models.recommender.ANNMixin[source]#

Mixin class for Approximate Nearest Neighbor Search.

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Return type:

raise NotImplementedError

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Return type:

raise NotImplementedError

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Return type:

raise NotImplementedError
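The three methods above supply item vectors, user query vectors, and a similarity measure to an ANN index. As a rough illustration of what such an index approximates, the sketch below runs an exact (brute-force) top-k search under the inner-product measure. It uses plain Python lists, and the names `dot` and `exact_top_k` are hypothetical helpers, not part of Cornac.

```python
# Illustrative only: plain-Python stand-ins, not Cornac classes or functions.
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def exact_top_k(user_vector, item_vectors, k):
    """Return the indices of the k items with the largest inner product."""
    scores = [dot(user_vector, item) for item in item_vectors]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical 2-dimensional factors for one user and three items.
user_vector = [1.0, 0.5]
item_vectors = [[0.2, 0.1], [0.9, 0.8], [0.5, -1.0]]
top_items = exact_top_k(user_vector, item_vectors, k=2)
print(top_items)
```

An ANN index trades some of this exactness for speed by avoiding the full scan over all item vectors.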

class cornac.models.recommender.NextBasketRecommender(name, trainable=True, verbose=False)[source]#

Generic class for a next basket recommender model. All next basket recommendation models should inherit from this class.

Parameters:
  • name (str, required) – Name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trainable.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

num_users#

Number of users in training data.

Type:

int

num_items#

Number of items in training data.

Type:

int

total_users#

Number of users in training, validation, and test data. In other words, this includes unknown/unseen users.

Type:

int

total_items#

Number of items in training, validation, and test data. In other words, this includes unknown/unseen items.

Type:

int

uid_map#

Global mapping of user ID-index.

Type:

dict

iid_map#

Global mapping of item ID-index.

Type:

dict

score(user_idx, history_baskets, **kwargs)[source]#

Predict the scores for all items based on input history baskets

Parameters:

history_baskets (list of lists) – The list of history baskets in sequential manner for next-basket prediction.

Returns:

res – Relative scores of all known items

Return type:

a Numpy array
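To make the score() contract concrete, here is a minimal sketch of a frequency-based scorer: it takes the history baskets of one user and returns one relative score per known item. This is an assumed toy baseline, not a Cornac model; the function name `score_by_frequency` is hypothetical, and a plain list stands in for the Numpy array.

```python
from collections import Counter


def score_by_frequency(history_baskets, num_items):
    """Score each item index by how often it appears in the history baskets."""
    counts = Counter(item for basket in history_baskets for item in basket)
    # One score per known item index; items never bought get 0.0.
    return [float(counts.get(i, 0)) for i in range(num_items)]


# Three past baskets of item indices, in chronological order.
history_baskets = [[0, 2], [2, 3], [2]]
scores = score_by_frequency(history_baskets, num_items=5)
print(scores)
```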

class cornac.models.recommender.NextItemRecommender(name, trainable=True, verbose=False)[source]#

Generic class for a next item recommender model. All next item recommendation models should inherit from this class.

Parameters:
  • name (str, required) – Name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trainable.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

num_users#

Number of users in training data.

Type:

int

num_items#

Number of items in training data.

Type:

int

total_users#

Number of users in training, validation, and test data. In other words, this includes unknown/unseen users.

Type:

int

total_items#

Number of items in training, validation, and test data. In other words, this includes unknown/unseen items.

Type:

int

uid_map#

Global mapping of user ID-index.

Type:

dict

iid_map#

Global mapping of item ID-index.

Type:

dict

score(user_idx, history_items, **kwargs)[source]#

Predict the scores for all items based on input history items

Parameters:

history_items (list of lists) – The list of history items in sequential manner for next-item prediction.

Returns:

res – Relative scores of all known items

Return type:

a Numpy array

class cornac.models.recommender.Recommender(name, trainable=True, verbose=False)[source]#

Generic class for a recommender model. All recommendation models should inherit from this class.

Parameters:
  • name (str, required) – Name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trainable.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

num_users#

Number of users in training data.

Type:

int

num_items#

Number of items in training data.

Type:

int

total_users#

Number of users in training, validation, and test data. In other words, this includes unknown/unseen users.

Type:

int

total_items#

Number of items in training, validation, and test data. In other words, this includes unknown/unseen items.

Type:

int

uid_map#

Global mapping of user ID-index.

Type:

dict

iid_map#

Global mapping of item ID-index.

Type:

dict

max_rating#

Maximum value among the rating observations.

Type:

float

min_rating#

Minimum value among the rating observations.

Type:

float

global_mean#

Average value over the rating observations.

Type:

float

clone(new_params=None)[source]#

Clone an instance of the model object.

Parameters:

new_params (dict, optional, default: None) – New parameters for the cloned instance.

Returns:

object

Return type:

cornac.models.Recommender
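The idea behind clone() can be sketched with a stand-in class: copy the constructor parameters, override some of them with new_params, and build a fresh instance. `ToyModel` and its `params` attribute are hypothetical; how Cornac stores its constructor arguments internally is not shown in this reference.

```python
import copy


class ToyModel:
    """Hypothetical stand-in for a recommender; not a Cornac class."""

    def __init__(self, name="Toy", k=10, learning_rate=0.01):
        # Keep the constructor arguments so a clone can be rebuilt from them.
        self.params = {"name": name, "k": k, "learning_rate": learning_rate}

    def clone(self, new_params=None):
        params = copy.deepcopy(self.params)
        params.update(new_params or {})
        return self.__class__(**params)


base = ToyModel()
variant = base.clone({"k": 50})
print(base.params["k"], variant.params["k"])
```

This pattern is handy for hyper-parameter search, where each trial needs an untrained copy with one or two settings changed.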

default_score()[source]#

Override this method if your algorithm has special treatment for the cold-start problem.

early_stop(train_set, val_set, min_delta=0.0, patience=0)[source]#

Check if training should be stopped when validation loss has stopped improving.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

  • min_delta (float, optional, default: 0.) – The minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.

  • patience (int, optional, default: 0) – Number of epochs with no improvement after which training should be stopped.

Returns:

res – Return True if model training should be stopped (no improvement on validation set), otherwise return False.

Return type:

bool
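The interplay of min_delta and patience can be illustrated with a small stand-alone helper. This is a sketch of the semantics described above under the assumption that a larger monitored value is better; `EarlyStopper` is a hypothetical class and may differ in detail from Cornac's own bookkeeping.

```python
class EarlyStopper:
    """Hypothetical helper illustrating min_delta/patience semantics."""

    def __init__(self, min_delta=0.0, patience=0):
        self.min_delta = min_delta
        self.patience = patience
        self.best_value = float("-inf")
        self.wait = 0  # consecutive epochs without sufficient improvement

    def should_stop(self, current_value):
        if current_value - self.best_value >= self.min_delta:
            # Improvement of at least min_delta: remember it and reset the counter.
            self.best_value = current_value
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


stopper = EarlyStopper(min_delta=0.01, patience=2)
validation_values = [0.50, 0.55, 0.551, 0.552, 0.60]
stopped_at = None
for epoch, value in enumerate(validation_values):
    if stopper.should_stop(value):
        stopped_at = epoch
        break
print(stopped_at)
```

Here the third and fourth values improve on 0.55 by less than min_delta, so training stops before the later jump to 0.60 is ever seen; a larger patience would tolerate such plateaus.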

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

is_unknown_item(item_idx)[source]#

Return whether the item is unknown to the model, given its index. This is the negation of knows_item(), provided for better readability in some cases.

Parameters:

item_idx (int, required) – The index of the item (not the original item ID).

Returns:

res – True if the item is unknown to the model (not seen in training data), False otherwise.

Return type:

bool

is_unknown_user(user_idx)[source]#

Return whether the user is unknown to the model, given its index. This is the negation of knows_user(), provided for better readability in some cases.

Parameters:

user_idx (int, required) – The index of the user (not the original user ID).

Returns:

res – True if the user is unknown to the model (not seen in training data), False otherwise.

Return type:

bool

property item_ids#

Return the list of raw item IDs

knows_item(item_idx)[source]#

Return whether the model knows item by its index

Parameters:

item_idx (int, required) – The index of the item (not the original item ID).

Returns:

res – True if the model knows the item from training data, False otherwise.

Return type:

bool

knows_user(user_idx)[source]#

Return whether the model knows user by its index

Parameters:

user_idx (int, required) – The index of the user (not the original user ID).

Returns:

res – True if the model knows the user from training data, False otherwise.

Return type:

bool
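One way to picture the known/unknown distinction, assuming that users seen in training occupy indices 0 to num_users - 1 and that users appearing only in validation or test data receive higher indices up to total_users - 1. The helper `knows_index` is hypothetical and only illustrates that assumption.

```python
def knows_index(idx, num_known):
    """True if idx falls in the range assumed to be covered by training data."""
    return idx is not None and 0 <= idx < num_known


# Assumed toy sizes: 3 users in training, 5 users overall.
num_users, total_users = 3, 5
known = [knows_index(u, num_users) for u in range(total_users)]
unknown = [not flag for flag in known]
print(known)
```

The same reasoning applies to items with num_items and total_items.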

static load(model_path, trainable=False)[source]#

Load a recommender model from the filesystem.

Parameters:
  • model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.

  • trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.

Returns:

self

Return type:

object

monitor_value(train_set, val_set)[source]#

Calculating monitored value used for early stopping on validation set (val_set). This function will be called by early_stop() function. Note: val_set could be None thus it needs to be checked before usage.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Return type:

raise NotImplementedError

rank(user_idx, item_indices=None, k=-1, **kwargs)[source]#

Rank all test items for a given user.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform item ranking.

  • item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked by the user. If None, list of ranked known item indices and their scores will be returned.

  • k (int, optional, default: -1) – Cut-off length for recommendations; k=-1 returns the ranked list of all items. The cut-off matters most for ANN search, which uses it as a limit to avoid exhaustive ranking.

Returns:

(ranked_items, item_scores) – ranked_items contains item indices ranked by their scores. item_scores contains the scores of the items corresponding to the indices in the item_indices input.

Return type:

tuple
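The shape of the returned tuple can be sketched in plain Python. The function `rank_items` below is a hypothetical stand-in that takes precomputed scores (one per item index) instead of a trained model, and it uses lists rather than Numpy arrays.

```python
def rank_items(scores, item_indices=None, k=-1):
    """Return (ranked_items, item_scores) for precomputed per-item scores."""
    if item_indices is None:
        candidates = list(range(len(scores)))
    else:
        candidates = list(item_indices)
    # Highest score first.
    ranked = sorted(candidates, key=lambda i: scores[i], reverse=True)
    if k >= 0:
        ranked = ranked[:k]
    # Scores stay aligned with the candidate order, not the ranked order.
    item_scores = [scores[i] for i in candidates]
    return ranked, item_scores


scores = [0.1, 0.9, 0.4, 0.7]
top_two, all_scores = rank_items(scores, k=2)
ranked_subset, subset_scores = rank_items(scores, item_indices=[2, 0, 3])
print(top_two, ranked_subset, subset_scores)
```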

rate(user_idx, item_idx, clipping=True)[source]#

Predict the rating score of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to predict the rating.

  • item_idx (int, required) – The index of the item to be rated by the user.

  • clipping (bool, default: True) – Whether to clip the predicted rating value.

Returns:

A rating score of the user for the item

Return type:

A scalar
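A plausible reading of the clipping option is that the predicted value is bounded by the min_rating and max_rating attributes. The sketch below shows that behavior with a hypothetical `clip_rating` helper; treat the exact bounds as an assumption rather than a statement about Cornac's internals.

```python
def clip_rating(raw_score, min_rating, max_rating, clipping=True):
    """Bound a raw predicted rating to [min_rating, max_rating] when clipping is on."""
    if not clipping:
        return raw_score
    return max(min_rating, min(max_rating, raw_score))


# Toy rating scale from 1.0 to 5.0.
print(clip_rating(5.7, 1.0, 5.0))
print(clip_rating(0.2, 1.0, 5.0))
print(clip_rating(5.7, 1.0, 5.0, clipping=False))
```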

recommend(user_id, k=-1, remove_seen=False, train_set=None)[source]#

Generate top-K item recommendations for a given user. The key difference between this function and rank() is that rank() works with mapped user/item indices, while this function works with original user/item IDs. This hides the ID-index mapping from the caller and makes model usage and deployment cleaner.

Parameters:
  • user_id (str, required) – The original ID of the user.

  • k (int, optional, default: -1) – Cut-off length for recommendations, k=-1 will return ranked list of all items.

  • remove_seen (bool, optional, default: False) – Remove seen/known items during training and validation from output recommendations.

  • train_set (cornac.data.Dataset, optional, default: None) – Training dataset needs to be provided in order to remove seen items.

Returns:

recommendations – Recommended items in the form of their original IDs.

Return type:

list
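The ID-to-index round trip that recommend() hides can be sketched as follows. Everything here is a toy stand-in: `recommend_ids`, the dictionaries, and the score table are hypothetical, and seen items are passed in directly rather than looked up in a training set.

```python
def recommend_ids(user_id, uid_map, iid_map, score_fn, k=-1, seen_item_ids=()):
    """Map a raw user ID to an index, rank items, and map indices back to raw IDs."""
    user_idx = uid_map[user_id]
    idx_to_iid = {idx: iid for iid, idx in iid_map.items()}
    scores = score_fn(user_idx)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Drop items the user has already seen, then apply the cut-off.
    recs = [idx_to_iid[i] for i in ranked if idx_to_iid[i] not in seen_item_ids]
    return recs if k < 0 else recs[:k]


uid_map = {"alice": 0, "bob": 1}
iid_map = {"book": 0, "film": 1, "song": 2}
toy_scores = {0: [0.3, 0.9, 0.5], 1: [0.8, 0.1, 0.2]}
recs = recommend_ids("alice", uid_map, iid_map, toy_scores.__getitem__,
                     k=2, seen_item_ids={"film"})
print(recs)
```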

save(save_dir=None, save_trainset=False, metadata=None)[source]#

Save a recommender model to the filesystem.

Parameters:
  • save_dir (str, default: None) – Path to a directory for the model to be stored.

  • save_trainset (bool, default: False) – Save train_set together with the model. This is useful if we want to deploy model later because train_set is required for certain evaluation steps.

  • metadata (dict, default: None) – Metadata to be saved with the model. This is useful to store model details.

Returns:

model_file – Path to the model file stored on the filesystem.

Return type:

str

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

property total_items#

Total number of items, including items in test and validation data if present.

property total_users#

Total number of users, including users in test and validation data if present.

transform(test_set)[source]#

Transform test set into cached results accelerating the score function. This function is supposed to be called in the cornac.eval_methods.BaseMethod before evaluation step. It is optional for this function to be implemented.

Parameters:

test_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

property user_ids#

Return the list of raw user IDs

cornac.models.recommender.is_ann_supported(recom)[source]#

Return True if the given recommender model supports ANN search.

Parameters:

recom (recommender model) – Recommender object to test.

Returns:

out – True if recom supports ANN search and False otherwise.

Return type:

bool

Disentangled Multimodal Representation Learning for Recommendation (DMRL)#

class cornac.models.dmrl.recom_dmrl.DMRL(name: str = 'DMRL', batch_size: int = 32, learning_rate: float = 0.0001, decay_c: float = 1, decay_r: float = 0.01, epochs: int = 10, embedding_dim: int = 100, bert_text_dim: int = 384, image_dim: int = None, dropout: float = 0, num_neg: int = 4, num_factors: int = 4, trainable: bool = True, verbose: bool = False, log_metrics: bool = False)[source]#

Disentangled multimodal representation learning

Parameters:
  • name (string, default: 'DMRL') – The name of the recommender model.

  • batch_size (int, optional, default: 32) – The number of samples per batch to load.

  • learning_rate (float, optional, default: 1e-4) – The learning rate for the optimizer.

  • decay_c (float, optional, default: 1) – The decay for the disentangled loss term in the loss function.

  • decay_r (float, optional, default: 0.01) – The decay for the regularization term in the loss function.

  • epochs (int, optional, default: 10) – The number of epochs to train the model.

  • embedding_dim (int, optional, default: 100) – The dimension of the embeddings.

  • bert_text_dim (int, optional, default: 384) – The dimension of the bert text embeddings coming from the huggingface transformer model

  • image_dim (int, optional, default: None) – The dimension of the image embeddings.

  • num_neg (int, optional, default: 4) – The number of negative samples to use in the training per user per batch (1 positive and num_neg negatives are used)

  • num_factors (int, optional, default: 4) – The number of factors to use in the model.

  • trainable (bool, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already trained.

  • verbose (bool, optional, default: False) – When True, the model prints out more information during training.

  • modalities_pre_built (bool, optional, default: True) – When True, the model assumes that the modalities are already built and does not build them.

  • log_metrics (bool, optional, default: False) – When True, the model logs metrics to tensorboard.
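To illustrate the num_neg setting (1 positive and num_neg negatives per training row), here is a sketch that builds one row as [user, positive item, negatives...]. The uniform sampling over items the user has not interacted with, the row layout, and the helper name `build_training_row` are all assumptions made for illustration, not DMRL's actual sampler.

```python
import random


def build_training_row(user_idx, pos_item, user_items, num_items, num_neg, rng):
    """Return [user, positive, neg_1, ..., neg_num_neg] with distinct negatives."""
    # Candidate negatives: every item the user has not interacted with.
    candidates = [i for i in range(num_items) if i not in user_items]
    negatives = rng.sample(candidates, num_neg)
    return [user_idx, pos_item] + negatives


rng = random.Random(0)  # fixed seed so the sketch is reproducible
row = build_training_row(user_idx=7, pos_item=2, user_items={2, 5},
                         num_items=10, num_neg=4, rng=rng)
print(row)
```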

References

  • Fan Liu, Huilin Chen, Zhiyong Cheng, Anan Liu, Liqiang Nie, Mohan Kankanhalli. DMRL: Disentangled Multimodal Representation Learning for Recommendation. https://arxiv.org/pdf/2203.05406.pdf.

eval_train_set_performance() → Tuple[float, float][source]#

Evaluate the model's training set performance using the Recall 300 metric.

fit(train_set: Dataset, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

get_item_image_embedding(batch)[source]#

Get the item image embeddings from the image modality. Expects the image modality to be pre-encoded and available as a numpy array.

Parameters:

batch (param) – and all other columns are negative item indices

get_item_text_embeddings(batch)[source]#

Get the item text embeddings from the BERT model, either by encoding the text on the fly or by using the pre-encoded text.

Parameters:

batch (param) – and all other columns are negative item indices

get_modality_embeddings(batch)[source]#

Get the modality embeddings for both text and image from the respective modality instances.

Parameters:
  • batch (param)

  • second (indices in) – and all other columns are negative item indices

initialize_and_build_modalities(trainset: Dataset)[source]#

Initializes text and image modalities for the model. Either takes in raw text or images and performs pre-encoding with the transformer models in TransformerTextModality and TransformerVisionModality, or, if pre-encoded features are given, uses those instead and simply wraps them into a general FeatureModality instance, as no further encoding model is required.

score(user_index: int, item_indices=None)[source]#

Scores a user-item pair. If item_indices is None, scores for all known items are returned.

Parameters:
  • user_index (int) – The index of the user for whom to perform score prediction.

  • item_indices (torch.Tensor, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Bilateral VAE for Collaborative Filtering (BiVAECF)#

class cornac.models.bivaecf.recom_bivaecf.BiVAECF(name='BiVAECF', k=10, encoder_structure=[20], act_fn='tanh', likelihood='pois', n_epochs=100, batch_size=100, learning_rate=0.001, beta_kl=1.0, cap_priors={'item': False, 'user': False}, trainable=True, verbose=False, seed=None, use_gpu=True)[source]#

Bilateral Variational AutoEncoder for Collaborative Filtering.

Parameters:
  • k (int, optional, default: 10) – The dimension of the stochastic user "theta" and item "beta" factors.

  • encoder_structure (list, default: [20]) – The number of neurons per layer of the user and item encoders for BiVAE. For example, encoder_structure = [20], the user (item) encoder structure will be [num_items, 20, k] ([num_users, 20, k]).

  • act_fn (str, default: 'tanh') – Name of the activation function used between hidden layers of the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’]

  • likelihood (str, default: 'pois') –

    The likelihood function used for modeling the observations. Supported choices:

    bern: Bernoulli likelihood

    gaus: Gaussian likelihood

    pois: Poisson likelihood

  • n_epochs (int, optional, default: 100) – The number of epochs for SGD.

  • batch_size (int, optional, default: 100) – The batch size.

  • learning_rate (float, optional, default: 0.001) – The learning rate for Adam.

  • beta_kl (float, optional, default: 1.0) – The weight of the KL terms as in beta-VAE.

  • cap_priors (dict, optional, default: {"user":False, "item":False}) – When {“user”:True, “item”:True}, CAP priors are used (see BiVAE paper for details), otherwise the standard Normal is used as a Prior over the user and item latent variables.

  • name (string, optional, default: 'BiVAECF') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • seed (int, optional, default: None) – Random seed for parameters initialization.

  • use_gpu (boolean, optional, default: True) – If True and your system supports CUDA then training is performed on GPUs.
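The encoder_structure example above can be spelled out as a small computation: the user encoder takes a vector over items as input and the item encoder takes a vector over users, with the hidden sizes in between and k as the output size. `encoder_layer_sizes` is a hypothetical helper and the dataset sizes are made up.

```python
def encoder_layer_sizes(input_dim, encoder_structure, k):
    """Full list of layer widths: input, hidden layers, then latent dimension k."""
    return [input_dim] + list(encoder_structure) + [k]


# Assumed toy dataset sizes.
num_users, num_items = 1000, 500
user_encoder = encoder_layer_sizes(num_items, [20], k=10)
item_encoder = encoder_layer_sizes(num_users, [20], k=10)
print(user_encoder, item_encoder)
```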

References

  • Quoc-Tuan Truong, Aghiles Salah, Hady W. Lauw. "Bilateral Variational Autoencoder for Collaborative Filtering." ACM International Conference on Web Search and Data Mining (WSDM). 2021.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

static load(model_path, trainable=False)[source]#

Load model from the filesystem.

Parameters:
  • model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.

  • trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.

Returns:

self

Return type:

object

save(save_dir=None, save_trainset=True)[source]#

Save model to the filesystem.

Parameters:
  • save_dir (str, default: None) – Path to a directory for the model to be stored.

  • save_trainset (bool, default: True) – Save train_set together with the model. This is useful if we want to deploy model later because train_set is required for certain evaluation steps.

Returns:

model_file – Path to the model file stored on the filesystem.

Return type:

str

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Causal Inference for Visual Debiasing in Visually-Aware Recommendation (CausalRec)#

class cornac.models.causalrec.recom_causalrec.CausalRec(name='CausalRec', k=10, k2=10, n_epochs=50, batch_size=100, learning_rate=0.005, lambda_w=0.01, lambda_b=0.01, lambda_e=0.0, mean_feat=None, tanh=0, lambda_2=0.8, use_gpu=False, trainable=True, verbose=True, init_params=None, seed=None)[source]#

CausalRec: Causal Inference for Visual Debiasing in Visually-Aware Recommendation

Parameters:
  • k (int, optional, default: 10) – The dimension of the gamma latent factors.

  • k2 (int, optional, default: 10) – The dimension of the theta latent factors.

  • n_epochs (int, optional, default: 50) – Maximum number of epochs for SGD.

  • batch_size (int, optional, default: 100) – The batch size for SGD.

  • learning_rate (float, optional, default: 0.005) – The learning rate for SGD.

  • lambda_w (float, optional, default: 0.01) – The regularization hyper-parameter for latent factor weights.

  • lambda_b (float, optional, default: 0.01) – The regularization hyper-parameter for biases.

  • lambda_e (float, optional, default: 0.0) – The regularization hyper-parameter for embedding matrix E and beta prime vector.

  • mean_feat (torch.tensor, required, default: None) – The mean feature of all item embeddings serving as the no-treatment during causal inference.

  • tanh (int, optional, default: 0) – The number of tanh layers on the visual feature transformation.

  • lambda_2 (float, optional, default: 0.8) – The coefficient controlling the elimination of the visual bias in Eq. (28).

  • use_gpu (boolean, optional, default: False) – Whether or not to use GPU to speed up training.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • verbose (boolean, optional, default: True) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘Bi’: beta_item, ‘Gu’: gamma_user, ‘Gi’: gamma_item, ‘Tu’: theta_user, ‘E’: emb_matrix, ‘Bp’: beta_prime}

  • seed (int, optional, default: None) – Random seed for weight initialization.

References

  • Qiu R., Wang S., Chen Z., Yin H., Huang Z. (2021). CausalRec: Causal Inference for Visual Debiasing in Visually-Aware Recommendation.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, item_idx=None)[source]#

Predict the debiased scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Explainable Recommendation with Comparative Constraints on Product Aspects (ComparER)#

class cornac.models.comparer.recom_comparer_sub.ComparERSub(name='ComparERSub', rating_scale=5.0, n_user_factors=8, n_item_factors=8, n_aspect_factors=8, n_opinion_factors=8, n_pair_samples=1000, n_bpr_samples=1000, n_element_samples=50, n_top_aspects=100, alpha=0.5, min_user_freq=2, min_pair_freq=1, min_common_freq=1, use_item_aspect_popularity=True, enum_window=None, lambda_reg=0.1, lambda_bpr=10, lambda_d=0.01, max_iter=200000, lr=0.5, n_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#

Explainable Recommendation with Comparative Constraints on Subjective Aspect-Level Quality

Parameters:
  • name (string, optional, default: 'ComparERSub') – The name of the recommender model.

  • rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.

  • n_user_factors (int, optional, default: 8) – The dimension of the user latent factors.

  • n_item_factors (int, optional, default: 8) – The dimension of the item latent factors.

  • n_aspect_factors (int, optional, default: 8) – The dimension of the aspect latent factors.

  • n_opinion_factors (int, optional, default: 8) – The dimension of the opinion latent factors.

  • n_bpr_samples (int, optional, default: 1000) – The number of samples from all BPR pairs.

  • n_element_samples (int, optional, default: 50) – The number of samples from all ratings in each iteration.

  • n_top_aspects (int, optional, default: 100) – The number of top scored aspects for each (user, item) pair to construct ranking score.

  • alpha (float, optional, default: 0.5) – Trade-off factor for constructing ranking score.

  • lambda_reg (float, optional, default: 0.1) – The regularization parameter.

  • lambda_bpr (float, optional, default: 10.0) – The regularization parameter for BPR.

  • max_iter (int, optional, default: 200000) – Maximum number of iterations for training.

  • lr (float, optional, default: 0.5) – The learning rate for optimization.

  • n_threads (int, optional, default: 0) – Number of parallel threads for training. If n_threads=0, all CPU cores will be utilized. If seed is not None, n_threads=1 to remove randomness from parallelization.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, I, A, O, G1, G2, and G3 are not None).

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {‘U’:U, ‘I’:I, ‘A’:A, ‘O’:O, ‘G1’:G1, ‘G2’:G2, ‘G3’:G3}

    U: ndarray, shape (n_users, n_user_factors)

    The user latent factors, optional initialization via init_params

    I: ndarray, shape (n_items, n_item_factors)

    The item latent factors, optional initialization via init_params

    A: ndarray, shape (num_aspects+1, n_aspect_factors)

    The aspect latent factors, optional initialization via init_params

    O: ndarray, shape (num_opinions, n_opinion_factors)

    The opinion latent factors, optional initialization via init_params

    G1: ndarray, shape (n_user_factors, n_item_factors, n_aspect_factors)

    The core tensor for user, item, and aspect factors, optional initialization via init_params

    G2: ndarray, shape (n_user_factors, n_aspect_factors, n_opinion_factors)

    The core tensor for user, aspect, and opinion factors, optional initialization via init_params

    G3: ndarray, shape (n_item_factors, n_aspect_factors, n_opinion_factors)

    The core tensor for item, aspect, and opinion factors, optional initialization via init_params

  • seed (int, optional, default: None) – Random seed for parameters initialization.

References

  • Trung-Hoang Le and Hady W. Lauw. "Explainable Recommendation with Comparative Constraints on Product Aspects." ACM International Conference on Web Search and Data Mining (WSDM). 2021.

fit(train_set, val_set=None)#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

rank(user_idx, item_indices=None, k=-1)#

Rank all test items for a given user.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform item ranking.

  • item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked by the user. If None, list of ranked known item indices and their scores will be returned.

  • k (int, optional, default: -1) – Cut-off length for recommendations; k=-1 returns the ranked list of all items. The cut-off matters most for ANN search, which uses it as a limit to avoid exhaustive ranking.

Returns:

(ranked_items, item_scores) – ranked_items contains item indices ranked by their scores. item_scores contains the scores of the items corresponding to the indices in the item_indices input.

Return type:

tuple

class cornac.models.comparer.recom_comparer_obj.ComparERObj(name='ComparERObj', model_type='Finer', num_explicit_factors=128, num_latent_factors=128, num_most_cared_aspects=100, rating_scale=5.0, alpha=0.9, lambda_x=1, lambda_y=1, lambda_u=0.01, lambda_h=0.01, lambda_v=0.01, lambda_d=0.01, use_item_aspect_popularity=True, min_user_freq=2, min_pair_freq=1, max_pair_freq=1000000000.0, min_common_freq=1, enum_window=None, use_item_pair_popularity=True, max_iter=1000, num_threads=0, early_stopping=None, trainable=True, verbose=False, init_params=None, seed=None)#

Explainable Recommendation with Comparative Constraints on Objective Aspect-Level Quality

Parameters:
  • num_explicit_factors (int, optional, default: 128) – The dimension of the explicit factors.

  • num_latent_factors (int, optional, default: 128) – The dimension of the latent factors.

  • num_most_cared_aspects (int, optional, default: 100) – The number of most cared aspects for each user.

  • rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.

  • alpha (float, optional, default: 0.9) – Trade-off factor for constructing ranking score.

  • lambda_x (float, optional, default: 1) – The regularization parameter for user aspect attentions.

  • lambda_y (float, optional, default: 1) – The regularization parameter for item aspect qualities.

  • lambda_u (float, optional, default: 0.01) – The regularization parameter for user and item explicit factors.

  • lambda_h (float, optional, default: 0.01) – The regularization parameter for user and item latent factors.

  • lambda_v (float, optional, default: 0.01) – The regularization parameter for V.

  • use_item_aspect_popularity (boolean, optional, default: True) – When False, item aspect frequency is omitted from the item aspect quality computation formula. Specifically, \(Y_{ij} = 1 + \frac{N - 1}{1 + e^{-s_{ij}}}\) if \(p_i\) is reviewed on feature \(F_j\)

  • min_user_freq (int, optional, default: 2) – Apply constraint for user with minimum number of ratings, where min_user_freq = 2 means only apply constraints on users with at least 2 ratings.

  • min_pair_freq (int, optional, default: 1) – Apply constraint for the purchased pairs (earlier-later bought) with minimum number of pairs, where min_pair_freq = 2 means only apply constraints on pairs that appear at least twice.

  • max_pair_freq (int, optional, default: 1e9) – Apply constraint for the purchased pairs with frequency at most max_pair_freq, where max_pair_freq = 2 means only apply constraints on pairs that appear at most twice.

  • max_iter (int, optional, default: 1000) – Maximum number of iterations or the number of epochs.

  • name (string, optional, default: 'ComparERObj') – The name of the recommender model.

  • num_threads (int, optional, default: 0) – Number of parallel threads for training. If 0, all CPU cores will be utilized.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U1, U2, V, H1, and H2 are not None).

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {‘U1’: U1, ‘U2’: U2, ‘V’: V, ‘H1’: H1, ‘H2’: H2}

    U1: ndarray, shape (n_users, n_explicit_factors)

    The user explicit factors, optional initialization via init_params.

    U2: ndarray, shape (n_items, n_explicit_factors)

    The item explicit factors, optional initialization via init_params.

    V: ndarray, shape (n_aspects, n_explicit_factors)

    The aspect factors, optional initialization via init_params.

    H1: ndarray, shape (n_users, n_latent_factors)

    The user latent factors, optional initialization via init_params.

    H2: ndarray, shape (n_items, n_latent_factors)

    The item latent factors, optional initialization via init_params.

  • seed (int, optional, default: None) – Random seed for weight initialization.
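The item aspect quality formula quoted under use_item_aspect_popularity can be evaluated directly. The snippet below is a minimal sketch, assuming that N is the rating scale (rating_scale) and that s_ij is an aggregated sentiment value for item i on aspect j; the function name item_aspect_quality is hypothetical and is not part of Cornac.

```python
import math


def item_aspect_quality(s_ij, rating_scale=5.0):
    """Y_ij = 1 + (N - 1) / (1 + exp(-s_ij)), with N assumed to be rating_scale."""
    return 1.0 + (rating_scale - 1.0) / (1.0 + math.exp(-s_ij))


# Zero sentiment lands on the midpoint of the [1, N] range.
neutral = item_aspect_quality(0.0)
# Strongly positive sentiment approaches N, strongly negative approaches 1.
positive = item_aspect_quality(4.0)
negative = item_aspect_quality(-4.0)
print(neutral, positive, negative)
```

The sigmoid keeps every quality value strictly between 1 and N, so the values stay on the same scale as the ratings.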

References

  • Trung-Hoang Le and Hady W. Lauw. “Explainable Recommendation with Comparative Constraints on Product Aspects.” ACM International Conference on Web Search and Data Mining (WSDM). 2021.

fit(train_set, val_set=None)#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_params()#

Get model parameters in the form of dictionary including matrices: U1, U2, V, H1, H2

monitor_value()#

Calculating monitored value used for early stopping on validation set (val_set). This function will be called by early_stop() function.

Returns:

res – Monitored value on validation set. Return None if val_set is None.

Return type:

float

rank(user_idx, item_indices=None, k=-1)#

Rank all test items for a given user.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform item ranking.

  • item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked by the user. If None, list of ranked known item indices and their scores will be returned

  • k (int, optional, default: -1) – Cut-off length for recommendations; k=-1 returns the ranked list of all items. The cut-off matters most for ANN search, which needs the limit to avoid exhaustive ranking.

Returns:

(ranked_items, item_scores) – ranked_items contains item indices ranked by their scores. item_scores contains the scores of the items corresponding to the indices in the item_indices input.

Return type:

tuple
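The return contract of rank() can be illustrated with a small pure-Python sketch. It is not Cornac's implementation: the helper name rank_items and the plain list of scores are assumptions made for illustration, and only the semantics described above (descending order by score, optional candidate list, cut-off k) are reproduced.

```python
def rank_items(scores, item_indices=None, k=-1):
    """Return (ranked_items, item_scores) for one user's score list."""
    # Sort all item indices by descending score.
    all_ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if item_indices is None:
        ranked_items = all_ranked
        item_scores = list(scores)
    else:
        candidates = set(item_indices)
        ranked_items = [i for i in all_ranked if i in candidates]
        # Scores follow the order of the item_indices input, not the ranking.
        item_scores = [scores[i] for i in item_indices]
    if k != -1:
        ranked_items = ranked_items[:k]
    return ranked_items, item_scores


scores = [0.1, 0.9, 0.5, 0.7]
print(rank_items(scores))
print(rank_items(scores, item_indices=[0, 2, 3], k=2))
```

Note that item_scores is aligned with the candidate list rather than with ranked_items, which is why the two returned sequences can differ in length when k is set.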

score(user_id, item_id=None)#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_id (int, required) – The index of the user for whom to perform score prediction.

  • item_id (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Adversarial Training Towards Robust Multimedia Recommender System (AMR)#

class cornac.models.amr.recom_amr.AMR(name='AMR', k=10, k2=10, n_epochs=50, batch_size=100, learning_rate=0.005, lambda_w=0.01, lambda_b=0.01, lambda_e=0.0, lambda_adv=1.0, use_gpu=False, trainable=True, verbose=True, init_params=None, seed=None)[source]#

Adversarial Training Towards Robust Multimedia Recommender System.

Parameters:
  • k (int, optional, default: 10) – The dimension of the gamma latent factors.

  • k2 (int, optional, default: 10) – The dimension of the theta latent factors.

  • n_epochs (int, optional, default: 50) – Maximum number of epochs for SGD.

  • batch_size (int, optional, default: 100) – The batch size for SGD.

  • learning_rate (float, optional, default: 0.005) – The learning rate for SGD.

  • lambda_w (float, optional, default: 0.01) – The regularization hyper-parameter for latent factor weights.

  • lambda_b (float, optional, default: 0.01) – The regularization hyper-parameter for biases.

  • lambda_e (float, optional, default: 0.0) – The regularization hyper-parameter for embedding matrix E and beta prime vector.

  • lambda_adv (float, optional, default: 1.0) – The regularization hyper-parameter in Eq. (8) and (10) for the adversarial sample loss.

  • use_gpu (boolean, optional, default: False) – Whether or not to use GPU to speed up training.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • verbose (boolean, optional, default: True) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘Bi’: beta_item, ‘Gu’: gamma_user, ‘Gi’: gamma_item, ‘Tu’: theta_user, ‘E’: emb_matrix, ‘Bp’: beta_prime}

  • seed (int, optional, default: None) – Random seed for weight initialization.

References

  • Tang, J., Du, X., He, X., Yuan, F., Tian, Q., and Chua, T. (2020). Adversarial Training Towards Robust Multimedia Recommender System.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT
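With a dot-product measure, an ANN index approximates the exact search below: score every item vector against the user vector by inner product and keep the top k. This is a brute-force sketch for intuition only; the function name top_k_by_dot and the toy vectors are hypothetical and do not come from Cornac.

```python
def top_k_by_dot(user_vector, item_vectors, k):
    """Exact inner-product search that an ANN index would approximate."""
    scores = [
        sum(u * v for u, v in zip(user_vector, item_vector))
        for item_vector in item_vectors
    ]
    # Rank item positions by descending inner product.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k], scores


user_vector = [1.0, 0.0]
item_vectors = [[0.5, 2.0], [2.0, 0.1], [-1.0, 3.0]]
top_items, scores = top_k_by_dot(user_vector, item_vectors, k=2)
print(top_items, scores)
```

In this picture, the rows returned by get_item_vectors() play the role of item_vectors and a row of get_user_vectors() plays the role of user_vector.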

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Hybrid neural recommendation with joint deep representation learning of ratings and reviews (HRDR)#

class cornac.models.hrdr.recom_hrdr.HRDR(name='HRDR', embedding_size=100, id_embedding_size=32, n_factors=32, attention_size=16, kernel_sizes=[3], n_filters=64, n_user_mlp_factors=128, n_item_mlp_factors=128, dropout_rate=0.5, max_text_length=50, max_num_review=32, batch_size=64, max_iter=20, optimizer='adam', learning_rate=0.001, model_selection='last', user_based=True, trainable=True, verbose=True, init_params=None, seed=None)[source]#
Parameters:
  • name (string, default: 'HRDR') – The name of the recommender model.

  • embedding_size (int, default: 100) – Word embedding size

  • n_factors (int, default: 32) – The dimension of the user/item’s latent factors.

  • attention_size (int, default: 16) – Attention size

  • kernel_sizes (list, default: [3]) – List of kernel sizes of conv2d

  • n_filters (int, default: 64) – Number of filters

  • n_user_mlp_factors (int, default: 128) – Number of latent dimensions of the first layer of a 3-layer MLP followed by batch normalization on user net to represent user rating.

  • n_item_mlp_factors (int, default: 128) – Number of latent dimensions of the first layer of a 3-layer MLP followed by batch normalization on item net to represent item rating.

  • dropout_rate (float, default: 0.5) – Dropout rate of neural network dense layers

  • max_text_length (int, default: 50) – Maximum number of tokens in a review instance

  • max_num_review (int, default: 32) – Maximum number of reviews that you want to feed into training. By default, the model will be trained with all reviews.

  • batch_size (int, default: 64) – Batch size

  • max_iter (int, default: 20) – Max number of training epochs

  • optimizer (string, optional, default: 'adam') – Optimizer for training is either ‘adam’ or ‘rmsprop’.

  • learning_rate (float, optional, default: 0.001) – Initial value of learning rate for the optimizer.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.

  • verbose (boolean, optional, default: True) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, pretrained_word_embeddings could be initialized here, e.g., init_params={‘pretrained_word_embeddings’: pretrained_word_embeddings}

  • seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).

References

Liu, H., Wang, Y., Peng, Q., Wu, F., Gan, L., Pan, L., & Jiao, P. (2020). Hybrid neural recommendation with joint deep representation learning of ratings and reviews. Neurocomputing, 374, 77-85.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

static load(model_path, trainable=False)[source]#

Load a recommender model from the filesystem.

Parameters:
  • model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.

  • trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.

Returns:

self

Return type:

object

save(save_dir=None)[source]#

Save a recommender model to the filesystem.

Parameters:

save_dir (str, default: None) – Path to a directory for the model to be stored.

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Hypergraphs with Attention on Reviews for Explainable Recommendation#

class cornac.models.hypar.recom_hypar.HypAR(name='HypAR', use_cuda=False, stemming=True, batch_size=128, num_workers=0, num_epochs=10, early_stopping=10, eval_interval=1, learning_rate=0.1, weight_decay=0, node_dim=64, num_heads=3, fanout=5, non_linear=True, model_selection='best', objective='ranking', review_aggregator='narre', predictor='narre', preference_module='lightgcn', combiner='add', graph_type='aos', num_neg_samples=50, layer_dropout=None, attention_dropout=0.2, user_based=True, verbose=True, index=0, out_path=None, learn_explainability=False, learn_method='transr', learn_weight=1.0, embedding_type='ao_embeddings', debug=False)[source]#

HypAR: Hypergraph with Attention on Review. This model is from the paper “Hypergraph with Attention on Reviews for explainable recommendation”, by Theis E. Jendal, Trung-Hoang Le, Hady W. Lauw, Matteo Lissandrini, Peter Dolog, and Katja Hose. ECIR 2024: https://doi.org/10.1007/978-3-031-56027-9_14

Parameters:
  • name (str, default: 'HypAR') – Name of the model.

  • use_cuda (bool, default: False) – Whether to use cuda.

  • stemming (bool, default: True) – Whether to use stemming.

  • batch_size (int, default: 128) – Batch size.

  • num_workers (int, default: 0) – Number of workers for dataloader.

  • num_epochs (int, default: 10) – Number of epochs.

  • early_stopping (int, default: 10) – Early stopping.

  • eval_interval (int, default: 1) – Evaluation interval, i.e., how often to evaluate on the validation set.

  • learning_rate (float, default: 0.1) – Learning rate.

  • weight_decay (float, default: 0) – Weight decay.

  • node_dim (int, default: 64) – Dimension of learned and hidden layers.

  • num_heads (int, default: 3) – Number of attention heads.

  • fanout (int, default: 5) – Fanout for sampling.

  • non_linear (bool, default: True) – Whether to use non-linear activation function.

  • model_selection (str, default: 'best') – Model selection method, i.e., whether to use the best model or the last model.

  • objective (str, default: 'ranking') – Objective, i.e., whether to use ranking or rating.

  • review_aggregator (str, default: 'narre') – Review aggregator, i.e., how to aggregate reviews.

  • predictor (str, default: 'narre') – Predictor, i.e., how to predict ratings.

  • preference_module (str, default: 'lightgcn') – Preference module, i.e., how to model preferences.

  • combiner (str, default: 'add') – Combiner, i.e., how to combine embeddings.

  • graph_type (str, default: 'aos') – Graph type, i.e., which nodes to include in hypergraph. Aspects, opinions and sentiment.

  • num_neg_samples (int, default: 50) – Number of negative samples to use for ranking.

  • layer_dropout (float, default: None) – Dropout for node and review embeddings.

  • attention_dropout (float, default: 0.2) – Dropout for attention.

  • user_based (bool, default: True) – Whether to use user-based or item-based.

  • verbose (bool, default: True) – Whether to print information.

  • index (int, default: 0) – Index for saving results, e.g., when doing hyperparameter tuning.

  • out_path (str, default: None) – Path to save graphs, embeddings and similar.

  • learn_explainability (bool, default: False) – Whether to learn explainability.

  • learn_method (str, default: 'transr') – Learning method, i.e., which method to use for explainability learning.

  • learn_weight (float, default: 1.0) – Weight for explainability learning loss.

  • embedding_type (str, default: 'ao_embeddings') – Type of embeddings to use, i.e., whether to use prelearned embeddings or not.

  • debug (bool, default: False) – Whether to use debug mode as errors might be thrown by dataloaders when debugging.

fit(train_set: Dataset, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

load(model_path, trainable=False)[source]#

Load a recommender model from the filesystem.

Parameters:
  • model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.

  • trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.

Returns:

self

Return type:

object

monitor_value(train_set, val_set=None)[source]#

Calculating monitored value used for early stopping on validation set (val_set). This function will be called by early_stop() function. Note: val_set could be None thus it needs to be checked before usage.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Return type:

raise NotImplementedError

save(save_dir=None, save_trainset=False)[source]#

Save a recommender model to the filesystem.

Parameters:
  • save_dir (str, default: None) – Path to a directory for the model to be stored.

  • save_trainset (bool, default: False) – Save train_set together with the model. This is useful if we want to deploy model later because train_set is required for certain evaluation steps.

  • metadata (dict, default: None) – Metadata to be saved with the model. This is useful to store model details.

Returns:

model_file – Path to the model file stored on the filesystem.

Return type:

str

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Simplifying and Powering Graph Convolution Network for Recommendation (LightGCN)#

class cornac.models.lightgcn.recom_lightgcn.LightGCN(name='LightGCN', emb_size=64, num_epochs=1000, learning_rate=0.001, batch_size=1024, num_layers=3, early_stopping=None, lambda_reg=0.0001, trainable=True, verbose=False, seed=2020)[source]#
Parameters:
  • name (string, default: 'LightGCN') – The name of the recommender model.

  • emb_size (int, default: 64) – Size of the node embeddings.

  • num_epochs (int, default: 1000) – Maximum number of iterations or the number of epochs.

  • learning_rate (float, default: 0.001) – The learning rate that determines the step size at each iteration

  • batch_size (int, default: 1024) – Mini-batch size used for train set

  • num_layers (int, default: 3) – Number of LightGCN Layers

  • early_stopping ({min_delta: float, patience: int}, optional, default: None) –

    If None, no early stopping. Meaning of the arguments:

    • min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.

    • patience: number of epochs with no improvement after which training should be stopped.

  • lambda_reg (float, default: 1e-4) – Weight decay for the L2 regularization

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • seed (int, optional, default: 2020) – Random seed for parameters initialization.

References

  • He, X., Deng, K., Wang, X., Li, Y., Zhang, Y., & Wang, M. (2020). LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation.
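The min_delta and patience semantics of early_stopping can be sketched as a small loop over monitored validation values. This follows the parameter descriptions above and is an assumption about the behaviour, not a copy of Cornac's early_stop(); the function name stopping_epoch is hypothetical.

```python
def stopping_epoch(monitored_values, min_delta=0.0, patience=0):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("-inf")
    wait = 0  # consecutive epochs without sufficient improvement
    for epoch, value in enumerate(monitored_values, start=1):
        if value - best >= min_delta:
            # An increase of at least min_delta counts as improvement.
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


history = [0.10, 0.20, 0.205, 0.207, 0.206, 0.30]
print(stopping_epoch(history, min_delta=0.01, patience=3))
```

In this example the small gains after the second epoch never reach min_delta, so the counter runs out before the later jump to 0.30 is ever seen; a larger patience trades training time for the chance to catch such late improvements.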

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

monitor_value(train_set, val_set)[source]#

Calculating monitored value used for early stopping on validation set (val_set). This function will be called by early_stop() function.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

res – Monitored value on validation set. Return None if val_set is None.

Return type:

float

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

New Variational Autoencoder for Top-N Recommendations with Implicit Feedback (RecVAE)#

class cornac.models.recvae.recom_recvae.RecVAE(name='RecVae', hidden_dim=600, latent_dim=200, batch_size=500, beta=None, gamma=0.005, lr=0.0005, n_epochs=50, n_enc_epochs=3, n_dec_epochs=1, not_alternating=False, trainable=True, verbose=False, seed=None, use_gpu=True)[source]#

RecVAE, a recommender system based on a Variational Autoencoder.

Parameters:
  • name (str, optional, default: 'RecVae') – Name of the recommender model.

  • hidden_dim (int, optional, default: 600) – Dimension of the hidden layer in the VAE architecture.

  • latent_dim (int, optional, default: 200) – Dimension of the latent layer in the VAE architecture.

  • batch_size (int, optional, default: 500) – Size of the batches used during training.

  • beta (float, optional) – Weighting factor for the KL divergence term in the VAE loss function.

  • gamma (float, optional, default: 0.005) – Weighting factor for the regularization term in the loss function.

  • lr (float, optional, default: 5e-4) – Learning rate for the optimizer.

  • n_epochs (int, optional, default: 50) – Number of epochs to train the model.

  • n_enc_epochs (int, optional, default: 3) – Number of epochs to train the encoder part of VAE.

  • n_dec_epochs (int, optional, default: 1) – Number of epochs to train the decoder part of VAE.

  • not_alternating (boolean, optional, default: False) – If True, the model training will not alternate between encoder and decoder.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • seed (int, optional) – Random seed for weight initialization and training reproducibility.

  • use_gpu (boolean, optional, default: True) – When True, training utilizes GPU if available.
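One way to read n_epochs, n_enc_epochs, n_dec_epochs, and not_alternating together is as a schedule of training passes per epoch. The sketch below only enumerates that schedule under this reading of the parameter descriptions; the function name training_schedule and the pass labels are hypothetical, and the actual training loop may differ.

```python
def training_schedule(n_epochs=50, n_enc_epochs=3, n_dec_epochs=1, not_alternating=False):
    """List the passes run over the data, in order, for the given settings."""
    schedule = []
    for _ in range(n_epochs):
        if not_alternating:
            # Encoder and decoder are updated together in a single pass.
            schedule.append("joint")
        else:
            # Encoder passes first, then decoder passes, within each epoch.
            schedule.extend(["encoder"] * n_enc_epochs)
            schedule.extend(["decoder"] * n_dec_epochs)
    return schedule


print(training_schedule(n_epochs=2))
print(training_schedule(n_epochs=2, not_alternating=True))
```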

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Predicting Temporal Sets with Deep Neural Networks (DNNTSP)#

class cornac.models.dnntsp.recom_dnntsp.DNNTSP(name='DNNTSP', emb_dim=32, loss_type='bpr', optimizer='adam', lr=0.001, weight_decay=0, n_epochs=100, batch_size=64, device='cpu', trainable=True, verbose=False, seed=None)[source]#

Deep Neural Network for Temporal Sets Prediction (DNNTSP).

Parameters:
  • name (string, default: 'DNNTSP') – The name of the recommender model.

  • emb_dim (int, optional, default: 32) – Number of hidden factors

  • loss_type (string, optional, default: "bpr") – Loss type. Options: “bpr” (BPRLoss), “mse” (MSELoss), “weight_mse” (WeightMSELoss), “multi_label_soft_margin” (MultiLabelSoftMarginLoss).

  • optimizer (string, optional, default: "adam") – Optimizer

  • lr (float, optional, default: 0.001) – Learning rate

  • weight_decay (float, optional, default: 0) – Weight decay for adaptive optimizer

  • n_epochs (int, optional, default: 100) – Number of epochs

  • batch_size (int, optional, default: 64) – Batch size

  • device (string, optional, default: "cpu") – Device for learning and evaluation. Using cpu as default. Use “cuda:0” for using gpu.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • seed (int, optional, default: None) – Random seed

References

Le Yu, Leilei Sun, Bowen Du, Chuanren Liu, Hui Xiong, and Weifeng Lv. 2020. Predicting Temporal Sets with Deep Neural Networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ‘20). Association for Computing Machinery, New York, NY, USA, 1083–1091. https://doi.org/10.1145/3394486.3403152

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, history_baskets, **kwargs)[source]#

Predict the scores for all items based on input history baskets

Parameters:

history_baskets (list of lists) – The list of history baskets in sequential manner for next-basket prediction.

Returns:

res – Relative scores of all known items

Return type:

a Numpy array

Recency Aware Collaborative Filtering for Next Basket Recommendation (UPCF)#

class cornac.models.upcf.recom_upcf.UPCF(name='UPCF', recency=1, locality=1, asymmetry=0.25, verbose=False)[source]#

User Popularity-based CF (UPCF)

Parameters:
  • name (string, default: 'UPCF') – The name of the recommender model.

  • recency (int, optional, default: 1) – The size of recency window. If 0, all baskets will be used.

  • locality (int, optional, default: 1) – The strength with which the similarity between two items within a basket is enforced.

  • asymmetry (float, optional, default: 0.25) – Trade-off parameter that balances the importance of the probability of having item i given j against the probability of having item j given i. The similarity is computed via similaripy.asymmetric_cosine.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

References

Guglielmo Faggioli, Mirko Polato, and Fabio Aiolli. 2020. Recency Aware Collaborative Filtering for Next Basket Recommendation. In Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization (UMAP ‘20). Association for Computing Machinery, New York, NY, USA, 80–87. https://doi.org/10.1145/3340631.3394850
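The role of the asymmetry parameter can be seen in a set-based sketch of asymmetric cosine similarity, written here as a weighted product of the two conditional probabilities. This is an illustrative formulation under that assumption; it does not call similaripy, its argument convention may differ from the library's, and the function name asymmetric_cosine below is a local helper.

```python
def asymmetric_cosine(baskets_with_i, baskets_with_j, asymmetry=0.25):
    """P(i | j) ** asymmetry * P(j | i) ** (1 - asymmetry) over basket-id sets."""
    common = len(baskets_with_i & baskets_with_j)
    if common == 0:
        return 0.0
    p_i_given_j = common / len(baskets_with_j)
    p_j_given_i = common / len(baskets_with_i)
    return (p_i_given_j ** asymmetry) * (p_j_given_i ** (1.0 - asymmetry))


# Item i appears in baskets 1-4, item j only in baskets 3 and 4.
baskets_with_i = {1, 2, 3, 4}
baskets_with_j = {3, 4}
for asymmetry in (0.0, 0.25, 0.5, 1.0):
    print(asymmetry, asymmetric_cosine(baskets_with_i, baskets_with_j, asymmetry))
```

With asymmetry = 0.5 the measure reduces to the ordinary, symmetric cosine over the two sets; moving it toward 0 or 1 lets one conditional probability dominate.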

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, history_baskets, **kwargs)[source]#

Predict the scores for all items based on input history baskets

Parameters:

history_baskets (list of lists) – The list of history baskets in sequential manner for next-basket prediction.

Returns:

res – Relative scores of all known items

Return type:

a Numpy array
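The shape of history_baskets and the effect of the recency window can be shown with a simplified scorer that counts how often each item appears in the user's most recent baskets. This is a sketch of the recency idea only, assuming item indices in the range [0, total_items); it omits the collaborative part of UPCF, and the function name recent_popularity is hypothetical.

```python
def recent_popularity(history_baskets, total_items, recency=1):
    """Fraction of the last `recency` baskets containing each item (0 = all baskets)."""
    window = history_baskets[-recency:] if recency > 0 else history_baskets
    scores = [0.0] * total_items
    if not window:
        return scores
    for basket in window:
        # Count each item at most once per basket.
        for item_idx in set(basket):
            scores[item_idx] += 1.0 / len(window)
    return scores


history_baskets = [[0, 1], [1, 2], [2, 3]]  # oldest basket first
print(recent_popularity(history_baskets, total_items=4, recency=2))
print(recent_popularity(history_baskets, total_items=4, recency=0))
```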

Temporal-Item-Frequency-based User-KNN (TIFUKNN)#

class cornac.models.tifuknn.recom_tifuknn.TIFUKNN(name='TIFUKNN', n_neighbors=300, within_decay_rate=0.9, group_decay_rate=0.7, alpha=0.7, n_groups=7, verbose=False)[source]#

Temporal-Item-Frequency-based User-KNN (TIFUKNN)

Parameters:
  • name (string, default: 'TIFUKNN') – The name of the recommender model.

  • n_neighbors (int, optional, default: 300) – The number of neighbors for KNN

  • within_decay_rate (float, optional, default: 0.9) – Within-basket time-decayed ratio in range [0, 1]

  • group_decay_rate (float, optional, default: 0.7) – Group time-decayed ratio in range [0, 1]

  • alpha (float, optional, default: 0.7) – The trade-off between current user vector and neighbors vectors to compute final item scores

  • n_groups (int, optional, default: 7) – The historical baskets will be partitioned equally into n_groups groups.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

References

Haoji Hu, Xiangnan He, Jinyang Gao, and Zhi-Li Zhang. 2020. Modeling Personalized Item Frequency Information for Next-basket Recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ‘20). Association for Computing Machinery, New York, NY, USA, 1071–1080. https://doi.org/10.1145/3397271.3401066
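The alpha trade-off can be sketched as a convex combination of the target user's vector and the mean of the neighbours' vectors. This assumes a simple alpha-weighted blend for illustration; the function name combine_scores and the toy vectors are hypothetical, and the time-decay steps governed by within_decay_rate and group_decay_rate are not shown.

```python
def combine_scores(user_vector, neighbor_vectors, alpha=0.7):
    """alpha * user_vector + (1 - alpha) * mean(neighbor_vectors), per item."""
    n_neighbors = len(neighbor_vectors)
    # Column-wise mean over the neighbours' item-frequency vectors.
    neighbor_mean = [sum(column) / n_neighbors for column in zip(*neighbor_vectors)]
    return [
        alpha * own + (1.0 - alpha) * shared
        for own, shared in zip(user_vector, neighbor_mean)
    ]


user_vector = [1.0, 0.0]
neighbor_vectors = [[0.0, 1.0], [0.0, 1.0]]
print(combine_scores(user_vector, neighbor_vectors, alpha=0.5))
```

Setting alpha close to 1 favours the user's own purchase history, while values close to 0 lean on what similar users bought.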

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, history_baskets, **kwargs)[source]#

Predict the scores for all items based on input history baskets

Parameters:

history_baskets (list of lists) – The list of history baskets in sequential manner for next-basket prediction.

Returns:

res – Relative scores of all known items

Return type:

a Numpy array

Correlation-Sensitive Next-Basket Recommendation (Beacon)#

class cornac.models.beacon.recom_beacon.Beacon(name='Beacon', emb_dim=2, rnn_unit=4, alpha=0.5, rnn_cell_type='LSTM', dropout_rate=0.5, nb_hop=1, max_seq_length=None, n_epochs=15, batch_size=32, lr=0.001, trainable=True, verbose=False, seed=None)[source]#

Correlation-Sensitive Next-Basket Recommendation

Parameters:
  • name (string, default: 'Beacon') – The name of the recommender model.

  • emb_dim (int, optional, default: 2) – Embedding dimension

  • rnn_unit (int, optional, default: 4) – Number of dimensions in an RNN unit.

  • alpha (float, optional, default: 0.5) – Hyperparameter to control the balance between correlative and sequential associations.

  • rnn_cell_type (str, optional, default: 'LSTM') – RNN cell type; options are [‘LSTM’, ‘GRU’, None]. If None, BasicRNNCell will be used.

  • dropout_rate (float, optional, default: 0.5) – Dropout rate of neural network dense layers

  • nb_hop (int, optional, default: 1) – Number of hops for constructing correlation matrix. If 0, zeros matrix will be used.

  • max_seq_length (int, optional, default: None) – Maximum basket sequence length. If None, it is the maximum number of baskets in the training sequences.

  • n_epochs (int, optional, default: 15) – Number of training epochs

  • batch_size (int, optional, default: 32) – Batch size

  • lr (float, optional, default: 0.001) – Initial value of learning rate for the optimizer.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • seed (int, optional, default: None) – Random seed

References

LE, Duc Trong, Hady Wirawan LAUW, and Yuan Fang. Correlation-sensitive next-basket recommendation. International Joint Conferences on Artificial Intelligence, 2019.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, history_baskets, **kwargs)[source]#

Predict the scores for all items based on input history baskets

Parameters:

history_baskets (list of lists) – The list of history baskets in sequential manner for next-basket prediction.

Returns:

res – Relative scores of all known items

Return type:

a Numpy array

Embarrassingly Shallow Autoencoders for Sparse Data (EASEᴿ)#

class cornac.models.ease.recom_ease.EASE(name='EASEᴿ', lamb=500, posB=True, trainable=True, verbose=True, seed=None, B=None, U=None)[source]#

Embarrassingly Shallow Autoencoders for Sparse Data.

Parameters:
  • name (string, optional, default: 'EASEᴿ') – The name of the recommender model.

  • lamb (float, optional, default: 500) – L2-norm regularization-parameter λ ∈ R+.

  • posB (boolean, optional, default: True) – Remove negative weights.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already trained.

  • verbose (boolean, optional, default: True) – When True, some running logs are displayed.

  • seed (int, optional, default: None) – Random seed for parameters initialization.

References

  • Steck, H. (2019, May). “Embarrassingly shallow autoencoders for sparse data.” In The World Wide Web Conference (pp. 3251-3257).
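As a rough illustration of how lamb and posB enter the model, the closed-form solution from the cited paper can be written out for a tiny dense matrix. This is a sketch, not Cornac's code: the helper names invert and ease_weights are hypothetical, the Gauss-Jordan inverse stands in for a proper linear-algebra routine, and the reading of posB as "clip negative weights to zero" is an assumption based on the parameter description.

```python
def invert(matrix):
    """Invert a small square matrix with Gauss-Jordan elimination and partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ease_weights(X, lamb=500.0, posB=True):
    """Item-item weights B with zero diagonal: B_ij = -P_ij / P_jj, P = (X^T X + lamb*I)^-1."""
    n_items = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) + (lamb if i == j else 0.0)
             for j in range(n_items)] for i in range(n_items)]
    P = invert(gram)
    B = [[0.0 if i == j else -P[i][j] / P[j][j] for j in range(n_items)]
         for i in range(n_items)]
    if posB:  # assumed meaning: drop negative weights by clipping at zero
        B = [[max(v, 0.0) for v in row] for row in B]
    return B
```

Scores for a user are then the product of that user's interaction row with B; a larger lamb shrinks all weights toward zero.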

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT
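To make the role of these three methods concrete, here is a brute-force version of the search that an ANN index approximates when the measure is the dot product. It is a sketch only: top_k_by_dot is a hypothetical helper, and the plain lists stand in for the matrices returned by get_user_vectors() and get_item_vectors().

```python
import heapq


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k_by_dot(user_vector, item_vectors, k=10):
    """Exact inner-product search: indices of the k highest-scoring items, best first."""
    scores = [dot(user_vector, item) for item in item_vectors]
    return heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
```

An ANN library trades a small loss in ranking accuracy for avoiding the full scan over item_vectors.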

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Neural Graph Collaborative Filtering (NGCF)#

class cornac.models.ngcf.recom_ngcf.NGCF(name='NGCF', emb_size=64, layer_sizes=[64, 64, 64], dropout_rates=[0.1, 0.1, 0.1], num_epochs=1000, learning_rate=0.001, batch_size=1024, early_stopping=None, lambda_reg=0.0001, trainable=True, verbose=False, seed=2020)[source]#

Neural Graph Collaborative Filtering

Parameters:
  • name (string, default: 'NGCF') – The name of the recommender model.

  • emb_size (int, default: 64) – Size of the node embeddings.

  • layer_sizes (list, default: [64, 64, 64]) – Size of the output of convolution layers.

  • dropout_rates (list, default: [0.1, 0.1, 0.1]) – Dropout rate for each of the convolution layers. The number of values should be the same as in ‘layer_sizes’.

  • num_epochs (int, default: 1000) – Maximum number of iterations or the number of epochs.

  • learning_rate (float, default: 0.001) – The learning rate that determines the step size at each iteration.

  • batch_size (int, default: 1024) – Mini-batch size used for training.

  • early_stopping ({min_delta: float, patience: int}, optional, default: None) –

    If None, no early stopping. Meaning of the arguments:

    • min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.

    • patience: number of epochs with no improvement after which training should be stopped.

  • lambda_reg (float, default: 1e-4) – Weight decay for the L2 regularization.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • seed (int, optional, default: 2020) – Random seed for parameters initialization.
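The early_stopping arguments described above can be read as the following bookkeeping. This is a minimal sketch under stated assumptions, not Cornac's internal early_stop() logic: the EarlyStopper class is hypothetical, and it assumes that a larger monitored value is better and that a gain of exactly min_delta counts as an improvement.

```python
class EarlyStopper:
    """Tracks a validation metric where larger is better."""

    def __init__(self, min_delta=0.0, patience=0):
        self.min_delta = min_delta
        self.patience = patience
        self.best_value = float("-inf")
        self.wait = 0  # consecutive epochs without sufficient improvement

    def should_stop(self, value):
        """Record a new monitored value and report whether training should stop."""
        if value - self.best_value >= self.min_delta:
            self.best_value = value
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience
```

A call such as NGCF(early_stopping={'min_delta': 0.01, 'patience': 2}) would then correspond to EarlyStopper(0.01, 2) being consulted once per epoch with the value from monitor_value().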

References

  • Wang, Xiang, et al. “Neural graph collaborative filtering.” Proceedings of the 42nd international ACM SIGIR conference on Research and development in Information Retrieval. 2019.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

monitor_value(train_set, val_set)[source]#

Calculating monitored value used for early stopping on validation set (val_set). This function will be called by early_stop() function.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

res – Monitored value on validation set. Return None if val_set is None.

Return type:

float

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Collaborative Context Poisson Factorization (C2PF)#

class cornac.models.c2pf.recom_c2pf.C2PF(k=100, max_iter=100, variant='c2pf', name=None, trainable=True, verbose=False, init_params=None)[source]#

Collaborative Context Poisson Factorization.

Parameters:
  • k (int, optional, default: 100) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations for variational C2PF.

  • variant (string, optional, default: 'c2pf') – The C2PF variant: ‘c2pf’, ‘tc2pf’ (tied-c2pf), or ‘rc2pf’ (reduced-c2pf). Please refer to the original paper for details.

  • name (string, optional, default: None) – The name of the recommender model. If None, then “variant” is used as the default name of the model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (Theta, Beta and Xi are not None).

  • Item_context – See "cornac/examples/c2pf_example.py" in the GitHub repo for an example of how to use Cornac's graph modality to load and provide the "item context" for C2PF.

  • init_params (dict, optional, default: None) –

    List of initial parameters, e.g., init_params = {‘G_s’:G_s, ‘G_r’:G_r, ‘L_s’:L_s, ‘L_r’:L_r, ‘L2_s’:L2_s, ‘L2_r’:L2_r, ‘L3_s’:L3_s, ‘L3_r’: L3_r}

    Theta: ndarray, shape (n_users, k)

    The expected user latent factors.

    Beta: ndarray, shape (n_items, k)

    The expected item latent factors.

    Xi: ndarray, shape (n_items, k)

    The expected context item latent factors multiplied by context effects Kappa.

    G_s: ndarray, shape (n_users, k)

    Represent the “shape” parameters of Gamma distribution over Theta.

    G_r: ndarray, shape (n_users, k)

    Represent the “rate” parameters of Gamma distribution over Theta.

    L_s: ndarray, shape (n_items, k)

    Represent the “shape” parameters of Gamma distribution over Beta.

    L_r: ndarray, shape (n_items, k)

    Represent the “rate” parameters of Gamma distribution over Beta.

    L2_s: ndarray, shape (n_items, k)

    Represent the “shape” parameters of Gamma distribution over Xi.

    L2_r: ndarray, shape (n_items, k)

    Represent the “rate” parameters of Gamma distribution over Xi.

    L3_s: ndarray

    Represent the “shape” parameters of Gamma distribution over Kappa.

    L3_r: ndarray

    Represent the “rate” parameters of Gamma distribution over Kappa.
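The relationship between the shape/rate pairs and the expected factors follows from the mean of a Gamma distribution, which is shape divided by rate. The sketch below shows that computation on nested lists; gamma_mean is a hypothetical helper, and whether Cornac derives Theta, Beta, and Xi in exactly this way is an assumption.

```python
def gamma_mean(shape, rate):
    """Element-wise mean of Gamma(shape, rate) variables: E[x] = shape / rate."""
    return [[s / r for s, r in zip(shape_row, rate_row)]
            for shape_row, rate_row in zip(shape, rate)]


# Hypothetical variational parameters for 1 user and k = 2 latent factors.
G_s = [[2.0, 3.0]]
G_r = [[4.0, 6.0]]
theta = gamma_mean(G_s, G_r)  # expected user factors, shape (n_users, k)
```

The same rule would apply to (L_s, L_r) for Beta and (L2_s, L2_r) for Xi.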

References

  • Salah, Aghiles, and Hady W. Lauw. A Bayesian Latent Variable Model of User Preferences with Item Context. In IJCAI, pp. 2667-2674. 2018.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Graph Convolutional Matrix Completion (GCMC)#

Main class for GCMC recommender model

class cornac.models.gcmc.recom_gcmc.GCMC(name='GCMC', max_iter=2000, learning_rate=0.01, optimizer='adam', activation_func='leaky_relu', gcn_agg_units=500, gcn_out_units=75, gcn_dropout=0.7, gcn_agg_accum='stack', share_param=False, gen_r_num_basis_func=2, train_grad_clip=1.0, train_valid_interval=1, train_early_stopping_patience=100, train_min_learning_rate=0.001, train_decay_patience=50, train_lr_decay_factor=0.5, trainable=True, verbose=False, seed=None)[source]#

Graph Convolutional Matrix Completion (GCMC)

Parameters:
  • name (string, default: 'GCMC') – The name of the recommender model.

  • max_iter (int, default: 2000) – Maximum number of iterations or the number of epochs for SGD

  • learning_rate (float, default: 0.01) – The learning rate for SGD

  • optimizer (string, default: 'adam') – The optimization method used for training. Supported values: ['adam', 'sgd'].

  • activation_func (string, default: 'leaky_relu') – The activation function used in the GCMC model. Supported values: [‘leaky’, ‘linear’, ‘sigmoid’, ‘relu’, ‘tanh’]

  • gcn_agg_units (int, default: 500) – The number of units in the graph convolutional layers

  • gcn_out_units (int, default: 75) – The number of units in the output layer

  • gcn_dropout (float, default: 0.7) – The dropout rate for the graph convolutional layers

  • gcn_agg_accum (string, default:'stack') – The graph convolutional layer aggregation type. Supported values: [‘stack’, ‘sum’]

  • share_param (bool, default: False) – Whether to share the parameters in the graph convolutional layers

  • gen_r_num_basis_func (int, default: 2) – The number of basis functions used in the generating rating function

  • train_grad_clip (float, default: 1.0) – The gradient clipping value for training

  • train_valid_interval (int, default: 1) – The validation interval for training

  • train_early_stopping_patience (int, default: 100) – The patience for early stopping

  • train_min_learning_rate (float, default: 0.001) – The minimum learning rate for SGD

  • train_decay_patience (int, default: 50) – The patience for learning rate decay

  • train_lr_decay_factor (float, default: 0.5) – The learning rate decay factor

  • trainable (boolean, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.

  • verbose (boolean, default: False) – When True, some running logs are displayed

  • seed (int, default: None) – Random seed for parameters initialization
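One plausible reading of train_decay_patience, train_lr_decay_factor, and train_min_learning_rate is a plateau-based schedule: once validation has failed to improve for more than the patience, the learning rate is multiplied by the decay factor but never drops below the minimum. The sketch below encodes that reading; decay_learning_rate is a hypothetical helper, and the strict "greater than" comparison is an assumption rather than Cornac's documented behavior.

```python
def decay_learning_rate(lr, checks_without_improvement,
                        decay_patience=50, decay_factor=0.5, min_lr=0.001):
    """Return the learning rate to use next under a plateau-based decay rule."""
    if checks_without_improvement > decay_patience:
        # Shrink the step size, but keep it at or above the configured floor.
        return max(lr * decay_factor, min_lr)
    return lr
```

Under the same reading, train_early_stopping_patience would end training once the number of stalled validation checks exceeds it.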

References

  • van den Berg, R., Kipf, T. N., & Welling, M. (2018). Graph Convolutional Matrix Completion.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

transform(test_set)[source]#

Transform the model to indexed dictionary for scoring purposes.

Parameters:

test_set (cornac.data.Dataset, required) – User-Item preference data.

Multi-Task Explainable Recommendation (MTER)#

class cornac.models.mter.recom_mter.MTER(name='MTER', rating_scale=5.0, n_user_factors=15, n_item_factors=15, n_aspect_factors=12, n_opinion_factors=12, n_bpr_samples=1000, n_element_samples=50, lambda_reg=0.1, lambda_bpr=10, max_iter=200000, lr=0.1, n_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#

Multi-Task Explainable Recommendation

Parameters:
  • name (string, optional, default: 'MTER') – The name of the recommender model.

  • rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.

  • n_user_factors (int, optional, default: 15) – The dimension of the user latent factors.

  • n_item_factors (int, optional, default: 15) – The dimension of the item latent factors.

  • n_aspect_factors (int, optional, default: 12) – The dimension of the aspect latent factors.

  • n_opinion_factors (int, optional, default: 12) – The dimension of the opinion latent factors.

  • n_bpr_samples (int, optional, default: 1000) – The number of samples from all BPR pairs.

  • n_element_samples (int, optional, default: 50) – The number of samples from all ratings in each iteration.

  • lambda_reg (float, optional, default: 0.1) – The regularization parameter.

  • lambda_bpr (float, optional, default: 10.0) – The regularization parameter for BPR.

  • max_iter (int, optional, default: 200000) – Maximum number of iterations for training.

  • lr (float, optional, default: 0.1) – The learning rate for optimization

  • n_threads (int, optional, default: 0) – Number of parallel threads for training. If n_threads=0, all CPU cores will be utilized. If seed is not None, n_threads=1 to remove randomness from parallelization.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, I, A, O, G1, G2, and G3 are not None).

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {‘U’:U, ‘I’:I, ‘A’:A, ‘O’:O, ‘G1’:G1, ‘G2’:G2, ‘G3’:G3}

    U: ndarray, shape (n_users, n_user_factors)

    The user latent factors, optional initialization via init_params

    I: ndarray, shape (n_items, n_item_factors)

    The item latent factors, optional initialization via init_params

    A: ndarray, shape (num_aspects+1, n_aspect_factors)

    The aspect latent factors, optional initialization via init_params

    O: ndarray, shape (num_opinions, n_opinion_factors)

    The opinion latent factors, optional initialization via init_params

    G1: ndarray, shape (n_user_factors, n_item_factors, n_aspect_factors)

    The core tensor for user, item, and aspect factors, optional initialization via init_params

    G2: ndarray, shape (n_user_factors, n_aspect_factors, n_opinion_factors)

    The core tensor for user, aspect, and opinion factors, optional initialization via init_params

    G3: ndarray, shape (n_item_factors, n_aspect_factors, n_opinion_factors)

    The core tensor for item, aspect, and opinion factors, optional initialization via init_params

  • seed (int, optional, default: None) – Random seed for parameters initialization.
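The shapes listed for init_params suggest a Tucker-style factorization, in which a score is obtained by contracting a core tensor with one row from each factor matrix. The following sketch shows that contraction for G1 with plain lists. It is illustrative only: tucker_score is a hypothetical helper, and treating the extra row of A (num_aspects+1) as the slot for the overall rating is an assumption.

```python
def tucker_score(core, user_vec, item_vec, aspect_vec):
    """Contract a 3-way core tensor with one user, one item, and one aspect vector."""
    total = 0.0
    for a, u in enumerate(user_vec):
        for b, i in enumerate(item_vec):
            for c, s in enumerate(aspect_vec):
                total += core[a][b][c] * u * i * s
    return total


# Toy example: n_user_factors=2, n_item_factors=1, n_aspect_factors=2.
G1 = [[[1.0, 0.0]], [[0.0, 1.0]]]
score = tucker_score(G1, [1.0, 2.0], [3.0], [1.0, 1.0])
```

G2 and G3 would be contracted the same way with the corresponding (user, aspect, opinion) and (item, aspect, opinion) rows.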

References

Nan Wang, Hongning Wang, Yiling Jia, and Yue Yin. 2018. Explainable Recommendation via Multi-Task Learning in Opinionated Text Data. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR ‘18). ACM, New York, NY, USA, 165-174. DOI: https://doi.org/10.1145/3209978.3210010

fit(train_set, val_set=None)#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(u_idx, i_idx=None)#

Predict the scores/ratings of a user for an item.

Parameters:
  • u_idx (int, required) – The index of the user for whom to perform score prediction.

  • i_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Neural Attentional Rating Regression with Review-level Explanations (NARRE)#

class cornac.models.narre.recom_narre.NARRE(name='NARRE', embedding_size=100, id_embedding_size=32, n_factors=32, attention_size=16, kernel_sizes=[3], n_filters=64, dropout_rate=0.5, max_text_length=50, max_num_review=32, batch_size=64, max_iter=10, optimizer='adam', learning_rate=0.001, model_selection='last', user_based=True, trainable=True, verbose=True, init_params=None, seed=None)[source]#

Neural Attentional Rating Regression with Review-level Explanations

Parameters:
  • name (string, default: 'NARRE') – The name of the recommender model.

  • embedding_size (int, default: 100) – Word embedding size

  • id_embedding_size (int, default: 32) – User/item review id embedding size

  • n_factors (int, default: 32) – The dimension of the user/item’s latent factors.

  • attention_size (int, default: 16) – Attention size

  • kernel_sizes (list, default: [3]) – List of kernel sizes of conv2d

  • n_filters (int, default: 64) – Number of filters

  • dropout_rate (float, default: 0.5) – Dropout rate of neural network dense layers

  • max_text_length (int, default: 50) – Maximum number of tokens in a review instance

  • max_num_review (int, default: 32) – Maximum number of reviews that you want to feed into training. By default, the model will be trained with 32 reviews.

  • batch_size (int, default: 64) – Batch size

  • max_iter (int, default: 10) – Max number of training epochs

  • optimizer (string, optional, default: 'adam') – Optimizer for training is either ‘adam’ or ‘rmsprop’.

  • learning_rate (float, optional, default: 0.001) – Initial value of learning rate for the optimizer.

  • model_selection (str, optional, default: 'last') – Model selection strategy is either ‘best’ or ‘last’.

  • user_based (boolean, optional, default: True) – Evaluation strategy for model selection. By default (user_based=True), the metric is computed for every user and then averaged. Set user_based=False to average over rating instances instead.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.

  • verbose (boolean, optional, default: True) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, pretrained_word_embeddings could be initialized here, e.g., init_params={‘pretrained_word_embeddings’: pretrained_word_embeddings}

  • seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because it runs in a single thread (no parallelization).
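The effect of max_text_length and max_num_review can be pictured as forcing each user's (or item's) reviews into a fixed-size grid of token ids. The sketch below is an assumption about the general technique, not NARRE's actual preprocessing: pad_reviews is a hypothetical helper, and the padding id of 0 is chosen only for illustration.

```python
def pad_reviews(reviews, max_text_length=50, max_num_review=32, pad_id=0):
    """Truncate or pad tokenized reviews to a (max_num_review, max_text_length) grid."""
    kept = reviews[:max_num_review]  # drop reviews beyond the limit
    grid = []
    for tokens in kept:
        clipped = tokens[:max_text_length]  # drop tokens beyond the limit
        grid.append(clipped + [pad_id] * (max_text_length - len(clipped)))
    # Fill missing reviews with all-padding rows.
    grid += [[pad_id] * max_text_length for _ in range(max_num_review - len(kept))]
    return grid
```

Larger limits keep more text but increase memory use per batch roughly in proportion to max_text_length × max_num_review.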

References

  • Chen, C., Zhang, M., Liu, Y., & Ma, S. (2018, April). Neural attentional rating regression with review-level explanations. In Proceedings of the 2018 World Wide Web Conference (pp. 1583-1592).

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

static load(model_path, trainable=False)[source]#

Load a recommender model from the filesystem.

Parameters:
  • model_path (str, required) – Path to a file or directory where the model is stored. If a directory is provided, the latest model will be loaded.

  • trainable (boolean, optional, default: False) – Set it to True if you would like to finetune the model. By default, the model parameters are assumed to be fixed after being loaded.

Returns:

self

Return type:

object

save(save_dir=None)[source]#

Save a recommender model to the filesystem.

Parameters:

save_dir (str, default: None) – Path to a directory for the model to be stored.

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Probabilistic Collaborative Representation Learning (PCRL)#

class cornac.models.pcrl.recom_pcrl.PCRL(k=100, z_dims=[300], max_iter=300, batch_size=300, learning_rate=0.001, name='PCRL', trainable=True, verbose=False, w_determinist=True, init_params=None)[source]#

Probabilistic Collaborative Representation Learning.

Parameters:
  • k (int, optional, default: 100) – The dimension of the latent factors.

  • z_dims (Numpy 1d array, optional, default: [300]) – The dimensions of the hidden intermediate layers ‘z’ in the order [dim(z_L), …, dim(z_1)]. Please refer to Figure 1 in the original paper for more details.

  • max_iter (int, optional, default: 300) – Maximum number of iterations (number of epochs) for variational PCRL.

  • batch_size (int, optional, default: 300) – The batch size for SGD.

  • learning_rate (float, optional, default: 0.001) – The learning rate for SGD.

  • aux_info – See "cornac/examples/pcrl_example.py" in the GitHub repo for an example of how to use Cornac's graph modality to provide item auxiliary data (e.g., context, text, etc.) for PCRL.

  • name (string, optional, default: 'PCRL') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (Theta, Beta and Xi are not None).

  • w_determinist (boolean, optional, default: True) – When True, deterministic weights “W” are used for the generator network; otherwise “W” is stochastic as in the original paper.

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {‘G_s’:G_s, ‘G_r’:G_r, ‘L_s’:L_s, ‘L_r’:L_r}.

    Theta: ndarray, shape (n_users, k)

    The expected user latent factors.

    Beta: ndarray, shape (n_items, k)

    The expected item latent factors.

    G_s: ndarray, shape (n_users, k)

    Represent the “shape” parameters of Gamma distribution over Theta.

    G_r: ndarray, shape (n_users, k)

    Represent the “rate” parameters of Gamma distribution over Theta.

    L_s: ndarray, shape (n_items, k)

    Represent the “shape” parameters of Gamma distribution over Beta.

    L_r: ndarray, shape (n_items, k)

    Represent the “rate” parameters of Gamma distribution over Beta.

References

  • Salah, Aghiles, and Hady W. Lauw. Probabilistic Collaborative Representation Learning for Personalized Item Recommendation. In UAI 2018.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for a list of items.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

VAE for Collaborative Filtering (VAECF)#

class cornac.models.vaecf.recom_vaecf.VAECF(name='VAECF', k=10, autoencoder_structure=[20], act_fn='tanh', likelihood='mult', n_epochs=100, batch_size=100, learning_rate=0.001, beta=1.0, trainable=True, verbose=False, seed=None, use_gpu=False)[source]#

Variational Autoencoder for Collaborative Filtering.

Parameters:
  • k (int, optional, default: 10) – The dimension of the stochastic user factors “z”.

  • autoencoder_structure (list, default: [20]) – The number of neurons in each encoder/decoder layer of the VAE. For example, with autoencoder_structure = [200], the VAE structure will be [num_items, 200, k, 200, num_items].

  • act_fn (str, default: 'tanh') – Name of the activation function used between hidden layers of the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’]

  • likelihood (str, default: 'mult') –

    Name of the likelihood function used for modeling the observations. Supported choices:

    • mult: Multinomial likelihood

    • bern: Bernoulli likelihood

    • gaus: Gaussian likelihood

    • pois: Poisson likelihood

  • n_epochs (int, optional, default: 100) – The number of epochs for SGD.

  • batch_size (int, optional, default: 100) – The batch size.

  • learning_rate (float, optional, default: 0.001) – The learning rate for Adam.

  • beta (float, optional, default: 1.0) – The weight of the KL term as in beta-VAE.

  • name (string, optional, default: 'VAECF') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • seed (int, optional, default: None) – Random seed for parameters initialization.

  • use_gpu (boolean, optional, default: False) – If True and your system supports CUDA then training is performed on GPUs.
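The example given for autoencoder_structure generalizes naturally to a small helper that lists the layer widths of the whole network. This is a sketch, not part of Cornac: vae_layer_sizes is a hypothetical function, and mirroring the encoder layers in reverse order for the decoder is an assumption for structures with more than one hidden layer.

```python
def vae_layer_sizes(num_items, autoencoder_structure, k):
    """Layer widths from input to reconstruction: encoder, bottleneck k, mirrored decoder."""
    hidden = list(autoencoder_structure)
    return [num_items] + hidden + [k] + hidden[::-1] + [num_items]


# The documented example: one hidden layer of 200 units around a k-dimensional bottleneck.
sizes = vae_layer_sizes(num_items=1000, autoencoder_structure=[200], k=10)
```

With the defaults (autoencoder_structure=[20], k=10) the same helper gives [num_items, 20, 10, 20, num_items].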

References

  • Liang, Dawen, Rahul G. Krishnan, Matthew D. Hoffman, and Tony Jebara. “Variational autoencoders for collaborative filtering.” In Proceedings of the 2018 World Wide Web Conference on World Wide Web, pp. 689-698.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Collaborative Variational Autoencoder (CVAE)#

class cornac.models.cvae.recom_cvae.CVAE(name='CVAE', z_dim=50, n_epochs=100, lambda_u=0.0001, lambda_v=0.001, lambda_r=10, lambda_w=0.0001, lr=0.001, a=1, b=0.01, input_dim=8000, vae_layers=[200, 100], act_fn='sigmoid', loss_type='cross-entropy', batch_size=128, init_params=None, trainable=True, seed=None, verbose=True)[source]#

Collaborative Variational Autoencoder

Parameters:
  • z_dim (int, optional, default: 50) – The dimension of the user and item latent factors.

  • n_epochs (int, optional, default: 100) – Maximum number of epochs for training.

  • lambda_u (float, optional, default: 1e-4) – The regularization hyper-parameter for user latent factor.

  • lambda_v (float, optional, default: 0.001) – The regularization hyper-parameter for item latent factor.

  • lambda_r (float, optional, default: 10.0) – Parameter that balances the focus between content and ratings.

  • lambda_w (float, optional, default: 1e-4) – The regularization for VAE weights

  • lr (float, optional, default: 0.001) – Learning rate in the auto-encoder training

  • a (float, optional, default: 1) – The confidence of observed ratings.

  • b (float, optional, default: 0.01) – The confidence of unseen ratings.

  • input_dim (int, optional, default: 8000) – The size of input vector

  • vae_layers (list, optional, default: [200, 100]) – The list containing the size of each layer in the neural network structure.

  • act_fn (str, default: 'sigmoid') – Name of the activation function used for the variational auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’, ‘leaky_relu’, ‘identity’]

  • loss_type (str, optional, default: "cross-entropy") – The type of loss function in the last layer. Either “cross-entropy” or “rmse”.

  • batch_size (int, optional, default: 128) – The batch size for SGD.

  • init_params (dict, optional, default: {'U':None, 'V':None}) – Initial U and V latent matrix

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
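The roles of a and b can be pictured as per-entry confidence weights on the squared rating error, as in confidence-weighted matrix factorization. The following is a minimal sketch under that assumption. The function names are hypothetical, the dense list-of-rows input with 0 marking an unobserved entry is chosen only for illustration, and the code is not Cornac's implementation.

```python
def confidence_matrix(ratings, a=1.0, b=0.01):
    """Return one weight per (user, item) cell: `a` if observed, `b` otherwise.

    `ratings` is a dense list of rows where 0 marks an unobserved entry
    (an assumption made for this illustration).
    """
    return [[a if r > 0 else b for r in row] for row in ratings]


def weighted_squared_error(ratings, predictions, a=1.0, b=0.01):
    """Sum of confidence-weighted squared errors over all cells."""
    conf = confidence_matrix(ratings, a, b)
    total = 0.0
    for c_row, r_row, p_row in zip(conf, ratings, predictions):
        for c, r, p in zip(c_row, r_row, p_row):
            total += c * (r - p) ** 2
    return total


R = [[1, 0], [0, 1]]              # toy ratings, 0 = unobserved
P = [[0.5, 0.5], [0.5, 0.5]]      # toy predictions
print(confidence_matrix(R))
print(weighted_squared_error(R, P))
```

Because b is much smaller than a by default, errors on unseen entries contribute far less to the total than errors on observed ratings.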

References

  • Li, X., and J. She. “Collaborative Variational Autoencoder for Recommender Systems.” ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017. http://eelxpeng.github.io/assets/paper/Collaborative_Variational_Autoencoder.pdf

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array
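Since get_vector_measure() reports the dot product, the two return shapes of score() can be pictured with a plain inner-product scorer. This is a minimal sketch, assuming user and item factors stored as lists of rows; the matrices U and V here are toy values, and the actual scoring code inside the model may differ.

```python
def score(U, V, user_idx, item_idx=None):
    """Dot-product scoring over hypothetical factor matrices U and V."""
    u = U[user_idx]
    if item_idx is None:
        # One score per known item
        return [sum(a * b for a, b in zip(u, v)) for v in V]
    # Single scalar score for the requested item
    return sum(a * b for a, b in zip(u, V[item_idx]))


U = [[1.0, 0.0], [0.5, 0.5]]               # 2 users, 2 factors
V = [[2.0, 1.0], [0.0, 4.0], [1.0, 1.0]]   # 3 items, 2 factors
print(score(U, V, user_idx=1))              # [1.5, 2.0, 1.0]
print(score(U, V, user_idx=1, item_idx=1))  # 2.0
```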

Conditional VAE for Collaborative Filtering (CVAECF)#

class cornac.models.cvaecf.recom_cvaecf.CVAECF(name='CVAECF', z_dim=20, h_dim=20, autoencoder_structure=[20], act_fn='tanh', likelihood='mult', n_epochs=100, batch_size=128, learning_rate=0.001, beta=1.0, alpha_1=1.0, alpha_2=1.0, trainable=True, verbose=False, seed=None, use_gpu=False)[source]#

Conditional Variational Autoencoder for Collaborative Filtering.

Parameters:
  • z_dim (int, optional, default: 20) – The dimension of the stochastic user factors "z" representing the preference information.

  • h_dim (int, optional, default: 20) – The dimension of the stochastic user factors "h" representing the auxiliary data.

  • autoencoder_structure (list, default: [20]) – The number of neurons of encoder/decoder hidden layer for CVAE. For example, when autoencoder_structure = [20], the CVAE encoder structures will be [y_dim, 20, z_dim] and [x_dim, 20, h_dim], the decoder structure will be [z_dim + h_dim, 20, y_dim], where y and x are respectively the preference and auxiliary data.

  • act_fn (str, default: 'tanh') – Name of the activation function used between hidden layers of the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’]

  • likelihood (str, default: 'mult') –

    Name of the likelihood function used for modeling user preferences. Supported choices:

    • mult: Multinomial likelihood

    • bern: Bernoulli likelihood

    • gaus: Gaussian likelihood

    • pois: Poisson likelihood

  • n_epochs (int, optional, default: 100) – The number of epochs for SGD.

  • batch_size (int, optional, default: 128) – The batch size.

  • learning_rate (float, optional, default: 0.001) – The learning rate for Adam.

  • beta (float, optional, default: 1.0) – The weight of the KL term KL(q(z|y)||p(z)) as in beta-VAE.

  • alpha_1 (float, optional, default: 1.0) – The weight of the KL term KL(q(h|x)||p(h|x)).

  • alpha_2 (float, optional, default: 1.0) – The weight of the KL term KL(q(h|x)||q(h|y)).

  • name (string, optional, default: 'CVAECF') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained, and Cornac assumes that the model is already pre-trained.

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • seed (int, optional, default: None) – Random seed for parameters initialization.

  • use_gpu (boolean, optional, default: False) – If True and your system supports CUDA then training is performed on GPUs.

  • data (user auxiliary)
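The way beta, alpha_1, and alpha_2 weight the three KL terms can be sketched as a weighted sum added to the reconstruction loss. This is an illustration under stated assumptions: the function names are hypothetical, the prior p(z) is taken to be a standard normal, and the exact form of the objective in Cornac's implementation is not confirmed here.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def weighted_objective(neg_log_lik, kl_z, kl_h_prior, kl_h_match,
                       beta=1.0, alpha_1=1.0, alpha_2=1.0):
    """Reconstruction loss plus the three weighted KL terms.

    kl_z       stands for KL(q(z|y) || p(z))
    kl_h_prior stands for KL(q(h|x) || p(h|x))
    kl_h_match stands for KL(q(h|x) || q(h|y))
    """
    return neg_log_lik + beta * kl_z + alpha_1 * kl_h_prior + alpha_2 * kl_h_match


kl_z = gaussian_kl_to_standard_normal(mu=[1.0], log_var=[0.0])
print(kl_z)  # 0.5
print(weighted_objective(2.0, kl_z, 0.25, 0.25, beta=0.5))
```

Setting beta below 1.0 reduces the pull of the posterior over z toward the prior, which is the usual motivation for the beta-VAE weighting.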

References

  • Lee, Wonsung, Kyungwoo Song, and Il-Chul Moon. “Augmented variational autoencoders for collaborative filtering with auxiliary information.” Proceedings of ACM CIKM. 2017.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Generalized Matrix Factorization (GMF)#

class cornac.models.ncf.recom_gmf.GMF(name='GMF', num_factors=8, reg=0.0, num_epochs=20, batch_size=256, num_neg=4, lr=0.001, learner='adam', backend='tensorflow', early_stopping=None, trainable=True, verbose=True, seed=None)[source]#

Generalized Matrix Factorization.

Parameters:
  • num_factors (int, optional, default: 8) – Embedding size of MF model.

  • reg (float, optional, default: 0.) – Regularization (weight_decay).

  • num_epochs (int, optional, default: 20) – Number of epochs.

  • batch_size (int, optional, default: 256) – Batch size.

  • num_neg (int, optional, default: 4) – Number of negative instances to pair with a positive instance.

  • lr (float, optional, default: 0.001) – Learning rate.

  • learner (str, optional, default: 'adam') – Specify an optimizer: adagrad, adam, rmsprop, sgd

  • backend (str, optional, default: 'tensorflow') – Backend used for model training: tensorflow, pytorch

  • early_stopping ({min_delta: float, patience: int}, optional, default: None) –

    If None, no early stopping. Meaning of the arguments:

    • min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.

    • patience: number of epochs with no improvement after which training should be stopped.

  • name (string, optional, default: 'GMF') – Name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.

  • verbose (boolean, optional, default: True) – When True, some running logs are displayed.

  • seed (int, optional, default: None) – Random seed for parameters initialization.
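The min_delta and patience semantics described for early_stopping can be sketched as a small loop over per-epoch validation values. The helper `epochs_run` is hypothetical and only follows the wording of the parameter description; it is not Cornac's implementation, and it assumes that a higher monitored value is better.

```python
def epochs_run(val_scores, min_delta=0.0, patience=0):
    """Return how many epochs run before early stopping triggers.

    `val_scores` holds the monitored validation value after each epoch
    (higher is better). Hypothetical helper for illustration only.
    """
    best = float("-inf")
    wait = 0
    for epoch, value in enumerate(val_scores, start=1):
        if value - best >= min_delta:
            # Increase of at least min_delta: record it and reset the counter
            best = value
            wait = 0
        else:
            # Increase smaller than min_delta counts as no improvement
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_scores)


scores = [0.10, 0.20, 0.205, 0.204, 0.30]
print(epochs_run(scores, min_delta=0.01, patience=2))  # 4
print(epochs_run(scores, min_delta=0.01, patience=3))  # 5
```

In the first call, epochs 3 and 4 both improve on the best value by less than min_delta, so training stops after epoch 4 and never reaches the later jump to 0.30.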

References

  • He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).

Indexable Bayesian Personalized Ranking (IBPR)#

class cornac.models.ibpr.recom_ibpr.IBPR(k=20, max_iter=100, learning_rate=0.05, lamda=0.001, batch_size=100, name='IBPR', trainable=True, verbose=False, init_params=None)[source]#

Indexable Bayesian Personalized Ranking.

Parameters:
  • k (int, optional, default: 20) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.05) – The learning rate for SGD.

  • lamda (float, optional, default: 0.001) – The regularization parameter.

  • batch_size (int, optional, default: 100) – The batch size for SGD.

  • name (string, optional, default: 'IBPR') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    Initial parameters, e.g., init_params = {‘U’: U, ‘V’: V}. U and V are defined below.

    U: csc_matrix, shape (n_users,k)

    The user latent factors, optional initialization via init_params.

    V: csc_matrix, shape (n_items,k)

    The item latent factors, optional initialization via init_params.

References

  • Le, D. D., & Lauw, H. W. (2017, November). Indexable Bayesian personalized ranking for efficient top-k recommendation. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 1389-1398). ACM.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT
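The three ANN-related methods fit together as follows: item vectors build the index, user vectors act as queries, and the measure states how the two are compared. The sketch below replaces the approximate index with an exact brute-force ranking by dot product. The function `top_k_by_dot` is hypothetical, and plain lists stand in for the matrices that get_user_vectors() and get_item_vectors() would return.

```python
def top_k_by_dot(user_vector, item_vectors, k=2):
    """Exact top-k item indices by dot product (stand-in for an ANN index)."""
    scores = [
        sum(a * b for a, b in zip(user_vector, item))
        for item in item_vectors
    ]
    # Rank item indices from highest to lowest score
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


user = [1.0, 0.0]                            # one query vector
items = [[0.2, 0.9], [0.9, 0.1], [0.5, 0.5]]  # three indexed item vectors
print(top_k_by_dot(user, items, k=2))  # [1, 2]
```

An approximate index trades some of this exactness for speed, which matters once the item matrix is too large to scan for every query.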

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Matrix Co-Factorization (MCF)#

class cornac.models.mcf.recom_mcf.MCF(k=5, max_iter=100, learning_rate=0.001, gamma=0.9, lamda=0.001, name='MCF', trainable=True, verbose=False, init_params=None, seed=None)[source]#

Matrix Co-Factorization.

Parameters:
  • k (int, optional, default: 5) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.

  • gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.

  • lamda (float, optional, default: 0.001) – The regularization parameter.

  • name (string, optional, default: 'MCF') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • network (item-affinity)

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {‘U’: U, ‘V’: V, ‘Z’: Z}.

    U: ndarray, shape (n_users, k)

    User latent factors.

    V: ndarray, shape (n_items, k)

    Item latent factors.

    Z: ndarray, shape (n_items, k)

    The “Also-Viewed” item latent factors.

  • seed (int, optional, default: None) – Random seed for parameters initialization.
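The interplay of learning_rate and gamma can be sketched with a single-parameter RMSProp step. This follows the textbook RMSProp update; the function name and the eps constant are assumptions for illustration, and the exact variant used in Cornac's implementation may differ.

```python
import math


def rmsprop_step(param, grad, cache, learning_rate=0.001, gamma=0.9, eps=1e-8):
    """One textbook RMSProp update for a single scalar parameter."""
    # gamma weights the running average of squared gradients,
    # (1 - gamma) weights the current squared gradient
    cache = gamma * cache + (1.0 - gamma) * grad * grad
    # Scale the step by the root of the running average
    param = param - learning_rate * grad / (math.sqrt(cache) + eps)
    return param, cache


param, cache = rmsprop_step(param=1.0, grad=2.0, cache=0.0)
print(param, cache)
```

A gamma close to 1.0 makes the running average change slowly, so a single large gradient has less influence on the step size.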

References

  • Park, Chanyoung, Donghyun Kim, Jinoh Oh, and Hwanjo Yu. “Do Also-Viewed Products Help User Rating Prediction?.” In Proceedings of WWW, pp. 1113-1122. 2017.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Multi-Layer Perceptron (MLP)#

class cornac.models.ncf.recom_mlp.MLP(name='MLP', layers=(64, 32, 16, 8), act_fn='relu', reg=0.0, num_epochs=20, batch_size=256, num_neg=4, lr=0.001, learner='adam', backend='tensorflow', early_stopping=None, trainable=True, verbose=True, seed=None)[source]#

Multi-Layer Perceptron.

Parameters:
  • layers (list, optional, default: [64, 32, 16, 8]) – MLP layers. Note that the first layer is the concatenation of user and item embeddings, so layers[0]/2 is the embedding size.

  • act_fn (str, default: 'relu') – Name of the activation function used for the MLP layers. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘selu’, ‘relu6’, ‘leaky_relu’]

  • reg (float, optional, default: 0.) – Regularization (weight_decay).

  • num_epochs (int, optional, default: 20) – Number of epochs.

  • batch_size (int, optional, default: 256) – Batch size.

  • num_neg (int, optional, default: 4) – Number of negative instances to pair with a positive instance.

  • lr (float, optional, default: 0.001) – Learning rate.

  • learner (str, optional, default: 'adam') – Specify an optimizer: adagrad, adam, rmsprop, sgd

  • backend (str, optional, default: 'tensorflow') – Backend used for model training: tensorflow, pytorch

  • early_stopping ({min_delta: float, patience: int}, optional, default: None) –

    If None, no early stopping. Meaning of the arguments:

    • min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.

    • patience: number of epochs with no improvement after which training should be stopped.

  • name (string, optional, default: 'MLP') – Name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.

  • verbose (boolean, optional, default: True) – When True, some running logs are displayed.

  • seed (int, optional, default: None) – Random seed for parameters initialization.
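The num_neg parameter can be pictured as follows: each observed (user, item) pair is emitted with label 1 and paired with num_neg unobserved items labeled 0. This is a minimal sketch, assuming uniform sampling over unobserved items; the function name is hypothetical and the sampler in Cornac may work differently.

```python
import random


def sample_training_pairs(user_items, num_items, num_neg=4, seed=None):
    """Build (user, item, label) triples with num_neg negatives per positive.

    Assumes every user has at least one unobserved item; otherwise the
    rejection loop below would never finish.
    """
    rng = random.Random(seed)
    samples = []
    for user, pos_items in user_items.items():
        for pos in pos_items:
            samples.append((user, pos, 1))
            drawn = 0
            while drawn < num_neg:
                neg = rng.randrange(num_items)
                # Reject items the user has already interacted with
                if neg not in pos_items:
                    samples.append((user, neg, 0))
                    drawn += 1
    return samples


interactions = {0: {1, 2}, 1: {0}}   # toy data: user -> observed items
pairs = sample_training_pairs(interactions, num_items=10, num_neg=4, seed=0)
print(len(pairs))  # 15
```

Three positives with four negatives each give 3 + 12 = 15 training triples.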

References

  • He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).

Neural Matrix Factorization (NeuMF/NCF)#

class cornac.models.ncf.recom_neumf.NeuMF(name='NeuMF', num_factors=8, layers=(64, 32, 16, 8), act_fn='relu', reg=0.0, num_epochs=20, batch_size=256, num_neg=4, lr=0.001, learner='adam', backend='tensorflow', early_stopping=None, trainable=True, verbose=True, seed=None)[source]#

Neural Matrix Factorization.

Parameters:
  • num_factors (int, optional, default: 8) – Embedding size of MF model.

  • layers (list, optional, default: [64, 32, 16, 8]) – MLP layers. Note that the first layer is the concatenation of user and item embeddings, so layers[0]/2 is the embedding size.

  • act_fn (str, default: 'relu') – Name of the activation function used for the MLP layers. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘selu’, ‘relu6’, ‘leaky_relu’]

  • reg (float, optional, default: 0.) – Regularization (weight_decay).

  • reg_layers (list, optional, default: [0., 0., 0., 0.]) – Regularization for each MLP layer, reg_layers[0] is the regularization for embeddings.

  • num_epochs (int, optional, default: 20) – Number of epochs.

  • batch_size (int, optional, default: 256) – Batch size.

  • num_neg (int, optional, default: 4) – Number of negative instances to pair with a positive instance.

  • lr (float, optional, default: 0.001) – Learning rate.

  • learner (str, optional, default: 'adam') – Specify an optimizer: adagrad, adam, rmsprop, sgd

  • backend (str, optional, default: 'tensorflow') – Backend used for model training: tensorflow, pytorch

  • early_stopping ({min_delta: float, patience: int}, optional, default: None) –

    If None, no early stopping. Meaning of the arguments:

    • min_delta: the minimum increase in monitored value on validation set to be considered as improvement, i.e. an increment of less than min_delta will count as no improvement.

    • patience: number of epochs with no improvement after which training should be stopped.

  • name (string, optional, default: 'NeuMF') – Name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained.

  • verbose (boolean, optional, default: True) – When True, some running logs are displayed.

  • seed (int, optional, default: None) – Random seed for parameters initialization.

References

  • He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).

from_pretrained(pretrained_gmf, pretrained_mlp, alpha=0.5)[source]#

Provide pre-trained GMF and MLP models. Section 3.4.1 of the paper.

Parameters:
  • pretrained_gmf (object of type GMF, required) – Reference to trained/fitted GMF model.

  • pretrained_mlp (object of type MLP, required) – Reference to trained/fitted MLP model.

  • alpha (float, optional, default: 0.5) – Hyper-parameter determining the trade-off between the two pre-trained models. Details are described in the section 3.4.1 of the paper.
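The trade-off controlled by alpha can be sketched from the paper's description: the output-layer weights of the two pre-trained models are concatenated, with the GMF part scaled by alpha and the MLP part scaled by 1 - alpha. The function name and the plain-list weights are hypothetical; how from_pretrained applies alpha internally is not shown here.

```python
def combine_output_weights(h_gmf, h_mlp, alpha=0.5):
    """Concatenate output-layer weights as [alpha * h_gmf ; (1 - alpha) * h_mlp]."""
    return [alpha * w for w in h_gmf] + [(1.0 - alpha) * w for w in h_mlp]


h_gmf = [1.0, 2.0]   # toy output weights of a pre-trained GMF
h_mlp = [4.0]        # toy output weights of a pre-trained MLP
print(combine_output_weights(h_gmf, h_mlp, alpha=0.25))  # [0.25, 0.5, 3.0]
```

With alpha = 0.5 both pre-trained models contribute equally to the initial NeuMF output layer.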

Online Indexable Bayesian Personalized Ranking (OIBPR)#

class cornac.models.online_ibpr.recom_online_ibpr.OnlineIBPR(k=20, max_iter=100, learning_rate=0.05, lamda=0.001, batch_size=100, name='online_ibpr', trainable=True, verbose=False, init_params=None)[source]#

Online Indexable Bayesian Personalized Ranking.

Parameters:
  • k (int, optional, default: 20) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.05) – The learning rate for SGD.

  • lamda (float, optional, default: 0.001) – The regularization parameter.

  • batch_size (int, optional, default: 100) – The batch size for SGD.

  • name (string, optional, default: 'online_ibpr') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    Initial parameters, e.g., init_params = {‘U’: U, ‘V’: V}. U and V are defined below.

    U: csc_matrix, shape (n_users,k)

    The user latent factors, optional initialization via init_params.

    V: csc_matrix, shape (n_items,k)

    The item latent factors, optional initialization via init_params.

References

  • Le, D. D., & Lauw, H. W. (2017, November). Indexable Bayesian personalized ranking for efficient top-k recommendation. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 1389-1398). ACM.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Visual Matrix Factorization (VMF)#

class cornac.models.vmf.recom_vmf.VMF(name='VMF', k=10, d=10, n_epochs=100, batch_size=100, learning_rate=0.001, gamma=0.9, lambda_u=0.001, lambda_v=0.001, lambda_p=1.0, lambda_e=10.0, trainable=True, verbose=False, use_gpu=False, init_params=None, seed=None)[source]#

Visual Matrix Factorization.

Parameters:
  • k (int, optional, default: 10) – The dimension of the user and item factors.

  • d (int, optional, default: 10) – The dimension of the user visual factors.

  • n_epochs (int, optional, default: 100) – The number of epochs for SGD.

  • learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.

  • gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.

  • lambda_u (float, optional, default: 0.001) – The regularization parameter for user factors.

  • lambda_v (float, optional, default: 0.001) – The regularization parameter for item factors.

  • lambda_p (float, optional, default: 1.0) – The regularization parameter for user visual factors.

  • lambda_e (float, optional, default: 10.) – The regularization parameter for the kernel embedding matrix

  • name (string, optional, default: 'VMF') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (The parameters of the model U, V, P, E are not None).

  • visual_features – See "cornac/examples/vmf_example.py" for an example of how to use Cornac's visual modality to load and provide the item visual features for VMF.

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V, ‘P’: P, ‘E’: E}.

    U: numpy array of shape (n_users, k), user latent factors.

    V: numpy array of shape (n_items, k), item latent factors.

    P: numpy array of shape (n_users, d), user visual latent factors.

    E: numpy array of shape (d, c), embedding kernel matrix.

  • seed (int, optional, default: None) – Random seed for parameters initialization.

References

  • Park, Chanyoung, Donghyun Kim, Jinoh Oh, and Hwanjo Yu. “Do Also-Viewed Products Help User Rating Prediction?.” In Proceedings of WWW, pp. 1113-1122. 2017.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Collaborative Deep Ranking (CDR)#

class cornac.models.cdr.recom_cdr.CDR(name='CDR', k=50, autoencoder_structure=None, act_fn='relu', lambda_u=0.1, lambda_v=100, lambda_w=0.1, lambda_n=1000, corruption_rate=0.3, learning_rate=0.001, dropout_rate=0.1, batch_size=128, max_iter=100, trainable=True, verbose=True, vocab_size=8000, init_params=None, seed=None)[source]#

Collaborative Deep Ranking.

Parameters:
  • k (int, optional, default: 50) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • autoencoder_structure (list, default: None) – The number of neurons in each encoder/decoder hidden layer of the SDAE. For example, with autoencoder_structure = [200], the SDAE structure will be [vocab_size, 200, k, 200, vocab_size].

  • act_fn (str, default: 'relu') – Name of the activation function used for the auto-encoder. Supported functions: [‘sigmoid’, ‘tanh’, ‘elu’, ‘relu’, ‘relu6’, ‘leaky_relu’, ‘identity’]

  • learning_rate (float, optional, default: 0.001) – The learning rate for AdamOptimizer.

  • lambda_u (float, optional, default: 0.1) – The regularization parameter for users.

  • lambda_v (float, optional, default: 100) – The regularization parameter for items.

  • lambda_w (float, optional, default: 0.1) – The regularization parameter for SDAE weights.

  • lambda_n (float, optional, default: 1000) – The regularization parameter for SDAE output.

  • corruption_rate (float, optional, default: 0.3) – The corruption ratio for SDAE.

  • dropout_rate (float, optional, default: 0.1) – The probability that each element is removed in dropout of SDAE.

  • batch_size (int, optional, default: 128) – The batch size for SGD.

  • name (string, optional, default: 'CDR') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}

    U: ndarray, shape (n_users,k)

    The user latent factors, optional initialization via init_params.

    V: ndarray, shape (n_items,k)

    The item latent factors, optional initialization via init_params.

  • seed (int, optional, default: None) – Random seed for weight initialization.
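The corruption_rate parameter can be pictured as the fraction of input entries hidden from the denoising autoencoder before it tries to reconstruct the clean input. The sketch below uses masking noise (zeroing entries), which is an assumption about the noise type; the function name is hypothetical and the code is not Cornac's implementation.

```python
import random


def corrupt(vector, corruption_rate=0.3, seed=None):
    """Zero out each entry independently with probability corruption_rate."""
    rng = random.Random(seed)
    return [0.0 if rng.random() < corruption_rate else x for x in vector]


bag_of_words = [1.0, 0.0, 3.0, 2.0, 1.0]   # toy item text vector
print(corrupt(bag_of_words, corruption_rate=0.3, seed=42))
print(corrupt(bag_of_words, corruption_rate=0.0))  # input is left unchanged
```

During training, the autoencoder would receive the corrupted vector as input and the original vector as the reconstruction target.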

References

  • Ying, H., Chen, L., Xiong, Y., & Wu, J. (2016). Collaborative Deep Ranking: A Hybrid Pair-Wise Recommendation Algorithm with Implicit Feedback.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array
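
The return contract of score() described above (a scalar for a single item, an array when item_idx is None) can be sketched as follows. This is a minimal illustration that assumes a plain dot-product model; the function and the factor matrices U and V are hypothetical stand-ins, not Cornac's implementation.

```python
import numpy as np

def score(U, V, user_idx, item_idx=None):
    """Hypothetical dot-product scorer mirroring the documented contract."""
    if item_idx is None:
        # Scores for all known items: one value per row of V.
        return V @ U[user_idx]
    # Score for a single item: a scalar.
    return float(V[item_idx] @ U[user_idx])

# Toy factors: 1 user, 3 items, k = 2 (values are made up for illustration).
U = np.array([[1.0, 2.0]])
V = np.array([[0.5, 0.5], [1.0, 0.0], [0.0, -1.0]])

all_scores = score(U, V, user_idx=0)
one_score = score(U, V, user_idx=0, item_idx=1)
```

The scores are relative: they are meant for ordering items for one user, not for comparison across users.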

Collaborative Ordinal Embedding (COE)#

class cornac.models.coe.recom_coe.COE(k=20, max_iter=100, learning_rate=0.05, lamda=0.001, batch_size=1000, name='coe', trainable=True, verbose=False, init_params=None)[source]#

Collaborative Ordinal Embedding.

Parameters:
  • k (int, optional, default: 20) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.05) – The learning rate for SGD.

  • lamda (float, optional, default: 0.001) – The regularization parameter.

  • batch_size (int, optional, default: 1000) – The batch size for SGD.

  • name (string, optional, default: 'coe') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {'U': U, 'V': V}.

    U: ndarray, shape (n_users, k)

    The user latent factors.

    V: ndarray, shape (n_items, k)

    The item latent factors.

References

  • Le, D. D., & Lauw, H. W. (2016, June). Euclidean co-embedding of ordinal data for multi-type visualization. In Proceedings of the 2016 SIAM International Conference on Data Mining (pp. 396-404). Society for Industrial and Applied Mathematics.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Convolutional Matrix Factorization (ConvMF)#

class cornac.models.conv_mf.recom_convmf.ConvMF(name='ConvMF', k=50, n_epochs=50, cnn_epochs=5, cnn_bs=128, cnn_lr=0.001, lambda_u=1, lambda_v=100, emb_dim=200, max_len=300, filter_sizes=[3, 4, 5], num_filters=100, hidden_dim=200, dropout_rate=0.2, give_item_weight=True, trainable=True, verbose=False, init_params=None, seed=None)[source]#
Parameters:
  • k (int, optional, default: 50) – The dimension of the user and item latent factors.

  • n_epochs (int, optional, default: 50) – Maximum number of epochs for training.

  • cnn_epochs (int, optional, default: 5) – Number of epochs for optimizing the CNN for each overall training epoch.

  • cnn_bs (int, optional, default: 128) – Batch size for optimizing CNN.

  • cnn_lr (float, optional, default: 0.001) – Learning rate for optimizing CNN.

  • lambda_u (float, optional, default: 1.0) – The regularization hyper-parameter for user latent factor.

  • lambda_v (float, optional, default: 100.0) – The regularization hyper-parameter for item latent factor.

  • emb_dim (int, optional, default: 200) – The embedding size of each word. One word corresponds with [1 x emb_dim] vector in the embedding space

  • max_len (int, optional, default: 300) – The maximum length of an item's document

  • filter_sizes (list, optional, default: [3, 4, 5]) – The length of filters in convolutional layer

  • num_filters (int, optional, default: 100) – The number of filters in convolutional layer

  • hidden_dim (int, optional, default: 200) – The dimension of hidden layer after the pooling of all convolutional layers

  • dropout_rate (float, optional, default: 0.2) – Dropout rate while training CNN

  • give_item_weight (boolean, optional, default: True) – When True, each item is weighted based on the number of users who have rated it

  • init_params (dict, optional, default: None) – Initial parameters, e.g., init_params = {'U': U, 'V': V, 'W': W}: the initial U and V matrices and the initial weights W for the embedding layer

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).
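
To see how max_len, filter_sizes, num_filters, and hidden_dim relate, the sketch below computes the feature sizes of a text CNN. It assumes valid (no padding) convolutions followed by max-over-time pooling, a common text-CNN design; this is an assumption for illustration and Cornac's actual network may differ. The helper name cnn_feature_sizes is hypothetical.

```python
def cnn_feature_sizes(max_len=300, filter_sizes=(3, 4, 5), num_filters=100):
    """Feature sizes of a text CNN under the stated assumptions."""
    # A filter of length f slides over max_len tokens: max_len - f + 1 positions.
    positions = {f: max_len - f + 1 for f in filter_sizes}
    # Max-over-time pooling keeps one value per filter; results are concatenated.
    pooled_dim = len(filter_sizes) * num_filters
    return positions, pooled_dim

positions, pooled_dim = cnn_feature_sizes()
print(positions)   # convolution positions per filter size
print(pooled_dim)  # size of the vector fed to the hidden layer
```

With the defaults, the pooled vector would then be projected to hidden_dim (200) and finally to the k-dimensional item factor space (50).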

References

  • Donghyun Kim, Chanyoung Park. ConvMF: Convolutional Matrix Factorization for Document Context-Aware Recommendation. In: 10th ACM Conference on Recommender Systems, pages 233-240.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT
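
Since the measure is the dot product, ANN search over these vectors approximates an exact top-k inner-product search. The sketch below shows that exact baseline with NumPy; the arrays stand in for the outputs of get_user_vectors() and get_item_vectors(), and the helper name is hypothetical.

```python
import numpy as np

def top_k_inner_product(user_vector, item_vectors, k):
    """Exact top-k items by inner product (the result ANN search approximates)."""
    scores = item_vectors @ user_vector
    k = min(k, len(scores))
    # argpartition finds the k best in linear time; then sort only those k.
    top = np.argpartition(-scores, k - 1)[:k]
    return top[np.argsort(-scores[top])]

# Toy vectors (made up for illustration): 4 items, 2 dimensions.
item_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
user_vector = np.array([2.0, 1.0])

top2 = top_k_inner_product(user_vector, item_vectors, k=2)
```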

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Spherical k-means (Skmeans)#

class cornac.models.skm.recom_skmeans.SKMeans(k=5, max_iter=100, name='Skmeans', trainable=True, tol=1e-06, verbose=True, seed=None, init_par=None)[source]#

Spherical k-means based recommender.

Parameters:
  • k (int, optional, default: 5) – The number of clusters.

  • max_iter (int, optional, default: 100) – Maximum number of iterations.

  • name (string, optional, default: 'Skmeans') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already trained.

  • tol (float, optional, default: 1e-6) – Relative tolerance with regard to the skmeans criterion to declare convergence.

  • verbose (boolean, optional, default: True) – When True, some running logs are displayed.

  • seed (int, optional, default: None) – Random seed for parameters initialization.

  • init_par (numpy 1d array, optional, default: None) – The initial object partition, a 1d array containing the cluster label (int type starting from 0) of each object (user). If init_par is None, skmeans is initialized randomly.

  • centroids (csc_matrix, shape (k,n_users)) – The matrix of cluster centroids.
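
As a rough illustration of how an initial partition such as init_par drives spherical k-means, the sketch below performs one centroid update and one assignment step using cosine similarity. It is a minimal dense NumPy version written for this example (the helper names and toy data are hypothetical); it is not Cornac's implementation, which works with sparse matrices.

```python
import numpy as np

def unit_rows(X):
    """Scale each row to unit L2 norm (rows are assumed to be non-zero)."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def centroids_from_partition(X, labels, k):
    """Centroid of each cluster: normalized sum of its unit-length members."""
    Xn = unit_rows(X)
    centroids = np.zeros((k, X.shape[1]))
    for c in range(k):
        members = Xn[labels == c]
        if len(members) > 0:
            total = members.sum(axis=0)
            centroids[c] = total / np.linalg.norm(total)
    return centroids

def assign(X, centroids):
    """Assign each object to the centroid with the highest cosine similarity."""
    return np.argmax(unit_rows(X) @ centroids.T, axis=1)

# Toy data: 4 objects, 2 features, and an initial partition into k = 2 clusters.
X = np.array([[1.0, 0.0], [2.0, 0.1], [0.0, 3.0], [0.1, 1.0]])
init_par = np.array([0, 0, 1, 1])

centroids = centroids_from_partition(X, init_par, k=2)
labels = assign(X, centroids)
```

Because rows are normalized first, only the direction of each vector matters, not its magnitude.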

References

  • Salah, Aghiles, Nicoleta Rogovschi, and Mohamed Nadif. “A dynamic collaborative filtering system via a weighted clustering approach.” Neurocomputing 175 (2016): 206-215.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Visual Bayesian Personalized Ranking (VBPR)#

class cornac.models.vbpr.recom_vbpr.VBPR(name='VBPR', k=10, k2=10, n_epochs=50, batch_size=100, learning_rate=0.005, lambda_w=0.01, lambda_b=0.01, lambda_e=0.0, use_gpu=False, trainable=True, verbose=True, init_params=None, seed=None)[source]#

Visual Bayesian Personalized Ranking.

Parameters:
  • k (int, optional, default: 10) – The dimension of the gamma latent factors.

  • k2 (int, optional, default: 10) – The dimension of the theta latent factors.

  • n_epochs (int, optional, default: 50) – Maximum number of epochs for SGD.

  • batch_size (int, optional, default: 100) – The batch size for SGD.

  • learning_rate (float, optional, default: 0.005) – The learning rate for SGD.

  • lambda_w (float, optional, default: 0.01) – The regularization hyper-parameter for latent factor weights.

  • lambda_b (float, optional, default: 0.01) – The regularization hyper-parameter for biases.

  • lambda_e (float, optional, default: 0.0) – The regularization hyper-parameter for embedding matrix E and beta prime vector.

  • use_gpu (boolean, optional, default: False) – Whether or not to use GPU to speed up training.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • verbose (boolean, optional, default: True) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {'Bi': beta_item, 'Gu': gamma_user, 'Gi': gamma_item, 'Tu': theta_user, 'E': emb_matrix, 'Bp': beta_prime}

  • seed (int, optional, default: None) – Random seed for weight initialization.
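
The roles of the init_params entries can be illustrated with the preference score from the VBPR paper: item bias, plus the gamma factor interaction, plus the theta interaction with the embedded visual features, plus the visual bias. The sketch below is an assumption-laden illustration: the array shapes, the feature matrix F, and the helper name are hypothetical and may not match Cornac's internal layout.

```python
import numpy as np

def vbpr_scores(u, Bi, Gu, Gi, Tu, E, Bp, F):
    """Scores of all items for user u, following the VBPR paper's formula.

    Assumed shapes: Bi (n_items,), Gu (n_users, k), Gi (n_items, k),
    Tu (n_users, k2), E (n_features, k2), Bp (n_features,), F (n_items, n_features).
    """
    visual_factors = F @ E  # project raw visual features into the k2-dim space
    return Bi + Gi @ Gu[u] + visual_factors @ Tu[u] + F @ Bp

# Toy values (made up): 1 user, 2 items, k = 2, k2 = 1, 3 visual features.
Bi = np.array([0.1, 0.2])
Gu = np.array([[1.0, 0.0]])
Gi = np.array([[1.0, 2.0], [3.0, 4.0]])
Tu = np.array([[1.0]])
E = np.array([[1.0], [2.0], [3.0]])
Bp = np.array([0.5, 0.5, 0.5])
F = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])

scores = vbpr_scores(0, Bi, Gu, Gi, Tu, E, Bp, F)
```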

References

  • He, R., & McAuley, J. (2016). VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Collaborative Deep Learning (CDL)#

class cornac.models.cdl.recom_cdl.CDL(name='CDL', k=50, autoencoder_structure=None, act_fn='relu', lambda_u=0.1, lambda_v=10, lambda_w=0.1, lambda_n=1000, a=1, b=0.01, corruption_rate=0.3, learning_rate=0.001, vocab_size=8000, dropout_rate=0.1, batch_size=128, max_iter=100, trainable=True, verbose=True, init_params=None, seed=None)[source]#

Collaborative Deep Learning.

Parameters:
  • name (string, default: 'CDL') – The name of the recommender model.

  • k (int, optional, default: 50) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • autoencoder_structure (list, default: None) – The number of neurons of encoder/decoder layer for SDAE. For example, autoencoder_structure = [200], the SDAE structure will be [vocab_size, 200, k, 200, vocab_size]

  • act_fn (str, default: 'relu') – Name of the activation function used for the auto-encoder. Supported functions: ['sigmoid', 'tanh', 'elu', 'relu', 'relu6', 'leaky_relu', 'identity']

  • learning_rate (float, optional, default: 0.001) – The learning rate for AdamOptimizer.

  • vocab_size (int, default: 8000) – The size of text input for the SDAE.

  • lambda_u (float, optional, default: 0.1) – The regularization parameter for users.

  • lambda_v (float, optional, default: 10) – The regularization parameter for items.

  • lambda_w (float, optional, default: 0.1) – The regularization parameter for SDAE weights.

  • lambda_n (float, optional, default: 1000) – The regularization parameter for SDAE output.

  • a (float, optional, default: 1) – The confidence of observed ratings.

  • b (float, optional, default: 0.01) – The confidence of unseen ratings.

  • corruption_rate (float, optional, default: 0.3) – The corruption ratio for input text of the SDAE.

  • dropout_rate (float, optional, default: 0.1) – The probability that each element is removed in dropout of SDAE.

  • batch_size (int, optional, default: 128) – The batch size for SGD.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {'U': U, 'V': V}

    U: ndarray, shape (n_users,k)

    The user latent factors, optional initialization via init_params.

    V: ndarray, shape (n_items,k)

    The item latent factors, optional initialization via init_params.

  • seed (int, optional, default: None) – Random seed for weight initialization.
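
The autoencoder_structure example above ([200] giving [vocab_size, 200, k, 200, vocab_size]) can be reproduced with a small helper. This sketch assumes the decoder mirrors the encoder for multi-layer structures and that None means no extra hidden layers; both are assumptions made for illustration, and the function name is hypothetical.

```python
def sdae_layer_sizes(vocab_size, autoencoder_structure, k):
    """Layer widths of the SDAE: encoder, k-dim bottleneck, mirrored decoder."""
    hidden = list(autoencoder_structure or [])  # assumption: None -> no hidden layers
    return [vocab_size] + hidden + [k] + hidden[::-1] + [vocab_size]

default_like = sdae_layer_sizes(vocab_size=8000, autoencoder_structure=[200], k=50)
deeper = sdae_layer_sizes(vocab_size=8000, autoencoder_structure=[400, 200], k=50)
print(default_like)
print(deeper)
```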

References

  • Hao Wang, Naiyan Wang, Dit-Yan Yeung. CDL: Collaborative Deep Learning for Recommender Systems. In : SIGKDD. 2015. p. 1235-1244.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Hierarchical Poisson Factorization (HPF)#

class cornac.models.hpf.recom_hpf.HPF(k=5, max_iter=100, name='HPF', trainable=True, verbose=False, hierarchical=True, seed=None, init_params=None)[source]#

Hierarchical Poisson Factorization.

Parameters:
  • k (int, optional, default: 5) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations.

  • name (string, optional, default: 'HPF') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (Theta and Beta are not None).

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • hierarchical (boolean, optional, default: True) – When False, PF is used instead of HPF.

  • seed (int, optional, default: None) – Random seed for parameters initialization.

  • init_params (dict, optional, default: None) –

    Initial parameters of the model.

    Theta: ndarray, shape (n_users, k)

    The expected user latent factors.

    Beta: ndarray, shape (n_items, k)

    The expected item latent factors.

    G_s: ndarray, shape (n_users, k)

    This represents “shape” parameters of Gamma distribution over Theta.

    G_r: ndarray, shape (n_users, k)

    This represents “rate” parameters of Gamma distribution over Theta.

    L_s: ndarray, shape (n_items, k)

    This represents “shape” parameters of Gamma distribution over Beta.

    L_r: ndarray, shape (n_items, k)

    This represents “rate” parameters of Gamma distribution over Beta.
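
The relation between the Gamma parameters and the expected factors can be sketched numerically: the mean of a Gamma distribution with a given shape and rate is shape / rate, and in Poisson factorization the expected interaction rate is the inner product of user and item factors. The toy values below are made up, and whether Cornac derives Theta and Beta in exactly this way is an assumption.

```python
import numpy as np

# Toy variational parameters: 1 user, 2 items, k = 2.
G_s = np.array([[2.0, 4.0]])              # user "shape" parameters
G_r = np.array([[1.0, 2.0]])              # user "rate" parameters
L_s = np.array([[3.0, 1.0], [1.0, 1.0]])  # item "shape" parameters
L_r = np.array([[3.0, 1.0], [2.0, 4.0]])  # item "rate" parameters

# Mean of Gamma(shape, rate) is shape / rate, applied element-wise.
Theta = G_s / G_r
Beta = L_s / L_r

# Expected Poisson rate for every (user, item) pair.
rates = Theta @ Beta.T
```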

References

  • Gopalan, Prem, Jake M. Hofman, and David M. Blei. Scalable Recommendation with Hierarchical Poisson Factorization. In UAI, pp. 326-335. 2015.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

TriRank: Review-aware Explainable Recommendation by Modeling Aspects (TriRank)#

class cornac.models.trirank.recom_trirank.TriRank(name='TriRank', alpha=1, beta=1, gamma=1, eta_U=1, eta_P=1, eta_A=1, max_iter=100, verbose=True, init_params=None, seed=None)[source]#

TriRank: Review-aware Explainable Recommendation by Modeling Aspects.

Parameters:
  • name (string, optional, default: 'TriRank') – The name of the recommender model.

  • alpha (float, optional, default: 1) – The weight of smoothness on user-item relation

  • beta (float, optional, default: 1) – The weight of smoothness on item-aspect relation

  • gamma (float, optional, default: 1) – The weight of smoothness on user-aspect relation

  • eta_U (float, optional, default: 1) – The weight of fitting constraint on users

  • eta_P (float, optional, default: 1) – The weight of fitting constraint on items

  • eta_A (float, optional, default: 1) – The weight of fitting constraint on aspects

  • max_iter (int, optional, default: 100) – Maximum number of iterations to stop online training. If set to max_iter=-1, the online training will stop when the model parameters have converged.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (R, X, Y, p, a, u are not None).

  • verbose (boolean, optional, default: True) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {'R': R, 'X': X, 'Y': Y, 'p': p, 'a': a, 'u': u}

    R: csr_matrix, shape (n_users, n_items)

    The symmetrically normalized edge weight matrix of the user-item relation, optional initialization via init_params

    X: csr_matrix, shape (n_items, n_aspects)

    The symmetrically normalized edge weight matrix of the item-aspect relation, optional initialization via init_params

    Y: csr_matrix, shape (n_users, n_aspects)

    The symmetrically normalized edge weight matrix of the user-aspect relation, optional initialization via init_params

    p: ndarray, shape (n_items,)

    Initialized item weights, optional initialization via init_params

    a: ndarray, shape (n_aspects,)

    Initialized aspect weights, optional initialization via init_params

    u: ndarray, shape (n_users,)

    Initialized user weights, optional initialization via init_params

  • seed (int, optional, default: None) – Random seed for parameters initialization.
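
The symmetric normalization mentioned for R, X, and Y is commonly defined for a bipartite weight matrix by dividing each weight by the square root of the product of its two endpoint degrees. The sketch below shows that definition on a dense toy matrix; the helper name is hypothetical, dense arrays are used only for brevity (the parameters above are csr_matrix), and Cornac's exact preprocessing may differ.

```python
import numpy as np

def symmetric_normalize(W):
    """Return D_row^(-1/2) @ W @ D_col^(-1/2) for a bipartite weight matrix W."""
    W = np.asarray(W, dtype=float)
    row_deg = W.sum(axis=1)
    col_deg = W.sum(axis=0)
    # Guard against isolated nodes (zero degree) to avoid division by zero.
    row_scale = np.divide(1.0, np.sqrt(row_deg),
                          out=np.zeros_like(row_deg), where=row_deg > 0)
    col_scale = np.divide(1.0, np.sqrt(col_deg),
                          out=np.zeros_like(col_deg), where=col_deg > 0)
    return row_scale[:, None] * W * col_scale[None, :]

# Toy user-item weights: 2 users, 2 items.
R = np.array([[1.0, 1.0], [0.0, 2.0]])
S = symmetric_normalize(R)
```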

References

He, Xiangnan, Tao Chen, Min-Yen Kan, and Xiao Chen. 2015. TriRank: Review-aware Explainable Recommendation by Modeling Aspects. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management (CIKM '15). ACM, New York, NY, USA, 1661-1670. DOI: https://doi.org/10.1145/2806416.2806504

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Explicit Factor Model (EFM)#

class cornac.models.efm.recom_efm.EFM(name='EFM', num_explicit_factors=40, num_latent_factors=60, num_most_cared_aspects=15, rating_scale=5.0, alpha=0.85, lambda_x=1, lambda_y=1, lambda_u=0.01, lambda_h=0.01, lambda_v=0.01, use_item_aspect_popularity=True, max_iter=100, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#

Explicit Factor Model.

Parameters:
  • num_explicit_factors (int, optional, default: 40) – The dimension of the explicit factors.

  • num_latent_factors (int, optional, default: 60) – The dimension of the latent factors.

  • num_most_cared_aspects (int, optional, default: 15) – The number of most cared aspects for each user.

  • rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.

  • alpha (float, optional, default: 0.85) – Trade-off factor for constructing ranking score.

  • lambda_x (float, optional, default: 1) – The regularization parameter for user aspect attentions.

  • lambda_y (float, optional, default: 1) – The regularization parameter for item aspect qualities.

  • lambda_u (float, optional, default: 0.01) – The regularization parameter for user and item explicit factors.

  • lambda_h (float, optional, default: 0.01) – The regularization parameter for user and item latent factors.

  • lambda_v (float, optional, default: 0.01) – The regularization parameter for V.

  • use_item_aspect_popularity (boolean, optional, default: True) – When False, item aspect frequency is omitted from the item aspect quality computation formula. Specifically, \(Y_{ij} = 1 + \frac{N - 1}{1 + e^{-s_{ij}}}\) if \(p_i\) is reviewed on feature \(F_j\)

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs.

  • name (string, optional, default: 'EFM') – The name of the recommender model.

  • num_threads (int, optional, default: 0) – Number of parallel threads for training. If 0, all CPU cores will be utilized.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U1, U2, V, H1, and H2 are not None).

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {'U1': U1, 'U2': U2, 'V': V, 'H1': H1, 'H2': H2}

    U1: ndarray, shape (n_users, n_explicit_factors)

    The user explicit factors, optional initialization via init_params.

    U2: ndarray, shape (n_items, n_explicit_factors)

    The item explicit factors, optional initialization via init_params.

    V: ndarray, shape (n_aspects, n_explicit_factors)

    The aspect factors, optional initialization via init_params.

    H1: ndarray, shape (n_users, n_latent_factors)

    The user latent factors, optional initialization via init_params.

    H2: ndarray, shape (n_items, n_latent_factors)

    The item latent factors, optional initialization via init_params.

  • seed (int, optional, default: None) – Random seed for weight initialization.
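
The item aspect quality formula quoted under use_item_aspect_popularity maps a sentiment value \(s_{ij}\) into the interval (1, N). A worked sketch follows; it assumes N corresponds to rating_scale, which is an interpretation for illustration, and the function name is hypothetical.

```python
import math

def aspect_quality(s_ij, n=5.0):
    """Y_ij = 1 + (N - 1) / (1 + exp(-s_ij)), without the frequency term."""
    return 1.0 + (n - 1.0) / (1.0 + math.exp(-s_ij))

neutral = aspect_quality(0.0)     # zero sentiment lands at the midpoint of [1, N]
negative = aspect_quality(-10.0)  # strongly negative sentiment approaches 1
positive = aspect_quality(10.0)   # strongly positive sentiment approaches N
```

The logistic squashing keeps every reviewed aspect on the same scale as the ratings.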

References

Yongfeng Zhang, Guokun Lai, Min Zhang, Yi Zhang, Yiqun Liu, and Shaoping Ma. 2014. Explicit factor models for explainable recommendation based on phrase-level sentiment analysis. In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval (SIGIR '14). ACM, New York, NY, USA, 83-92. DOI: https://doi.org/10.1145/2600428.2609579

fit(train_set, val_set=None)#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

rank(user_idx, item_indices=None, k=-1)#

Rank all test items for a given user.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform item ranking.

  • item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked for the user. If None, the ranked list of all known item indices and their scores will be returned.

  • k (int, optional, default: -1) – Cut-off length for recommendations; k=-1 returns a ranked list of all items. The cut-off matters most for ANN search, which uses it as a limit to avoid exhaustive ranking.

Returns:

(ranked_items, item_scores) – ranked_items contains item indices ranked by their scores. item_scores contains the scores of the items in the order of the item_indices input.

Return type:

tuple
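
The documented behavior of rank() (candidates, cut-off k, and the two returned arrays) can be sketched as below. The function takes a precomputed score array instead of a user index, so it is a hypothetical stand-in written for illustration, not Cornac's method.

```python
import numpy as np

def rank_items(all_scores, item_indices=None, k=-1):
    """Return (ranked_items, item_scores) following the documented contract."""
    all_scores = np.asarray(all_scores, dtype=float)
    if item_indices is None:
        item_indices = np.arange(len(all_scores))  # rank every known item
    else:
        item_indices = np.asarray(item_indices)
    # Scores stay aligned with the order of item_indices.
    item_scores = all_scores[item_indices]
    order = np.argsort(-item_scores, kind="stable")  # descending by score
    if k != -1:
        order = order[:k]  # apply the cut-off
    return item_indices[order], item_scores

scores = [0.2, 0.9, 0.5, 0.7]
ranked_all, scores_all = rank_items(scores)
ranked_top2, scores_subset = rank_items(scores, item_indices=[0, 2, 3], k=2)
```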

score(user_idx, item_idx=None)#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Social Bayesian Personalized Ranking (SBPR)#

class cornac.models.sbpr.recom_sbpr.SBPR(name='SBPR', k=10, max_iter=100, learning_rate=0.001, lambda_u=0.01, lambda_v=0.01, lambda_b=0.01, use_bias=True, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#

Social Bayesian Personalized Ranking.

Parameters:
  • k (int, optional, default: 10) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.001) – The learning rate for SGD.

  • lambda_u (float, optional, default: 0.01) – The regularization hyper-parameter for user factors.

  • lambda_v (float, optional, default: 0.01) – The regularization hyper-parameter for item factors.

  • lambda_b (float, optional, default: 0.01) – The regularization hyper-parameter for item biases.

  • use_bias (boolean, optional, default: True) – When True, item bias is used.

  • num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters are required as input.

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {'U': user_factors, 'V': item_factors, 'Bi': item_biases}

  • seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).

References

  • Zhao, T., McAuley, J., & King, I. (2014, November). Leveraging social connections to improve personalized ranking for collaborative filtering. CIKM 2014 (pp. 261-270).

fit(train_set, val_set=None)#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

Hidden Factors and Hidden Topics (HFT)#

class cornac.models.hft.recom_hft.HFT(name='HFT', k=10, max_iter=50, grad_iter=50, lambda_text=0.1, l2_reg=0.001, vocab_size=8000, init_params=None, trainable=True, verbose=True, seed=None)[source]#

Hidden Factors and Hidden Topics

Parameters:
  • name (string, default: 'HFT') – The name of the recommender model.

  • k (int, optional, default: 10) – The dimension of the latent factors.

  • max_iter (int, optional, default: 50) – Maximum number of iterations for EM.

  • grad_iter (int, optional, default: 50) – Maximum number of iterations for L-BFGS.

  • lambda_text (float, default: 0.1) – Weight of corpus likelihood in objective function.

  • l2_reg (float, default: 0.001) – Regularization for user and item latent factors.

  • vocab_size (int, optional, default: 8000) – Size of vocabulary for review text.

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {‘alpha’: alpha, ‘beta_u’: beta_u, ‘beta_i’: beta_i, ‘gamma_u’: gamma_u, ‘gamma_v’: gamma_v}

    alpha: float

    Model offset, optional initialization via init_params.

    beta_u: ndarray, shape (n_users, 1)

    User biases, optional initialization via init_params.

    beta_i: ndarray, shape (n_items, 1)

    Item biases, optional initialization via init_params.

    gamma_u: ndarray, shape (n_users,k)

    The user latent factors, optional initialization via init_params.

    gamma_v: ndarray, shape (n_items,k)

    The item latent factors, optional initialization via init_params.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.

  • verbose (boolean, optional, default: True) – When True, some running logs are displayed.

  • seed (int, optional, default: None) – Random seed for weight initialization.
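As a rough sketch of how an init_params dictionary with the shapes listed above might be assembled, assuming NumPy arrays and hypothetical sizes (n_users, n_items, and the initial values are made up for illustration):

```python
import numpy as np

n_users, n_items, k = 4, 6, 10  # hypothetical sizes; k matches the documented default
rng = np.random.default_rng(123)

init_params = {
    "alpha": 3.5,                                           # model offset (float)
    "beta_u": np.zeros((n_users, 1)),                       # user biases
    "beta_i": np.zeros((n_items, 1)),                       # item biases
    "gamma_u": rng.normal(scale=0.01, size=(n_users, k)),   # user latent factors
    "gamma_v": rng.normal(scale=0.01, size=(n_items, k)),   # item latent factors
}
```

Such a dictionary would then be passed as the init_params argument of the constructor; every key is optional.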

References

Julian McAuley, Jure Leskovec. “Hidden Factors and Hidden Topics: Understanding Rating Dimensions with Review Text” RecSys ‘13 Proceedings of the 7th ACM conference on Recommender systems Pages 165-172

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT
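To make the role of the three ANN methods concrete, the sketch below performs the exact (brute-force) inner-product search that an ANN index approximates. It assumes plain NumPy arrays standing in for the outputs of get_user_vectors() and get_item_vectors(); exact_top_k is a hypothetical helper, not a Cornac function.

```python
import numpy as np


def exact_top_k(user_vector, item_vectors, k):
    """Return the indices and scores of the k items with the largest dot product."""
    scores = item_vectors @ user_vector            # one inner product per item
    top = np.argsort(-scores, kind="stable")[:k]   # highest score first
    return top, scores[top]


# Hypothetical vectors: three items and one user query in a 2-dimensional space.
item_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
user_vector = np.array([0.9, 0.2])
top_items, top_scores = exact_top_k(user_vector, item_vectors, k=2)
```

An ANN index trades some accuracy in this ranking for a search cost that does not grow linearly with the number of items.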

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Weighted Bayesian Personalized Ranking (WBPR)#

class cornac.models.bpr.recom_wbpr.WBPR(name='WBPR', k=10, max_iter=100, learning_rate=0.001, lambda_reg=0.01, use_bias=True, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#

Weighted Bayesian Personalized Ranking.

Parameters:
  • k (int, optional, default: 10) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.001) – The learning rate for SGD.

  • lambda_reg (float, optional, default: 0.01) – The regularization hyper-parameter.

  • use_bias (boolean, optional, default: True) – When True, item bias is used.

  • num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bi’: item_biases}

  • seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).

References

  • Gantner, Zeno, Lucas Drumond, Christoph Freudenthaler, and Lars Schmidt-Thieme. “Personalized ranking for non-uniformly sampled items.” In Proceedings of KDD Cup 2011, pp. 231-247. 2012.

fit(train_set, val_set=None)#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

Collaborative Topic Regression (CTR)#

class cornac.models.ctr.recom_ctr.CTR(name='CTR', k=200, lambda_u=0.01, lambda_v=0.01, eta=0.01, a=1, b=0.01, max_iter=100, trainable=True, verbose=True, init_params=None, seed=None)[source]#

Collaborative Topic Regression.

Parameters:
  • name (string, default: 'CTR') – The name of the recommender model.

  • k (int, optional, default: 200) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • lambda_u (float, optional, default: 0.01) – The regularization parameter for users.

  • lambda_v (float, optional, default: 0.01) – The regularization parameter for items.

  • a (float, optional, default: 1) – The confidence of observed ratings.

  • b (float, optional, default: 0.01) – The confidence of unseen ratings.

  • eta (float, optional, default: 0.01) – Added value for smoothing phi.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {‘U’:U, ‘V’:V}

    U: ndarray, shape (n_users,k)

    The user latent factors, optional initialization via init_params.

    V: ndarray, shape (n_items,k)

    The item latent factors, optional initialization via init_params.

  • seed (int, optional, default: None) – Random seed for weight initialization.

References

Wang, Chong, and David M. Blei. “Collaborative topic modeling for recommending scientific articles.” Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2011.
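The roles of a and b can be illustrated with a small confidence matrix: observed entries receive confidence a, all others receive b. This is a sketch under the assumption that a rating of 0 marks an unobserved entry; confidence_matrix is a hypothetical helper and does not mirror Cornac's internals.

```python
import numpy as np


def confidence_matrix(ratings, a=1.0, b=0.01):
    """Assign confidence a to observed ratings and b to unseen ones."""
    # Assumption: a zero entry means the user-item pair was not observed.
    return np.where(ratings > 0, a, b)


ratings = np.array([[5.0, 0.0],
                    [0.0, 3.0]])  # hypothetical 2x2 user-item rating matrix
C = confidence_matrix(ratings)
```

Because b is much smaller than a, errors on unseen pairs are penalized only lightly during fitting.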

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Baseline Only#

cornac.models.baseline_only.recom_bo#

alias of the compiled extension module cornac.models.baseline_only.recom_bo

Bayesian Personalized Ranking (BPR)#

class cornac.models.bpr.recom_bpr.BPR(name='BPR', k=10, max_iter=100, learning_rate=0.001, lambda_reg=0.01, use_bias=True, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#

Bayesian Personalized Ranking.

Parameters:
  • k (int, optional, default: 10) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.001) – The learning rate for SGD.

  • lambda_reg (float, optional, default: 0.01) – The regularization hyper-parameter.

  • use_bias (boolean, optional, default: True) – When True, item bias is used.

  • num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bi’: item_biases}

  • seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).

References

  • Rendle, Steffen, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In UAI, pp. 452-461. 2009.
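For intuition, one SGD step of the BPR objective from the reference above can be sketched for a single (user, positive item, negative item) triple. This follows the update rule in Rendle et al. and is not Cornac's compiled implementation; bpr_sgd_step and its argument names are hypothetical.

```python
import numpy as np


def bpr_sgd_step(u, v_i, v_j, b_i, b_j, learning_rate=0.001, lambda_reg=0.01):
    """One gradient-ascent step on ln(sigmoid(x_uij)) with L2 regularization."""
    # x_uij: how much more the user prefers positive item i over negative item j.
    x_uij = u @ (v_i - v_j) + b_i - b_j
    z = 1.0 / (1.0 + np.exp(x_uij))  # sigmoid(-x_uij), the common gradient scale
    # All updates use the old values of u, v_i, and v_j.
    new_u = u + learning_rate * (z * (v_i - v_j) - lambda_reg * u)
    new_v_i = v_i + learning_rate * (z * u - lambda_reg * v_i)
    new_v_j = v_j + learning_rate * (-z * u - lambda_reg * v_j)
    new_b_i = b_i + learning_rate * (z - lambda_reg * b_i)
    new_b_j = b_j + learning_rate * (-z - lambda_reg * b_j)
    return new_u, new_v_i, new_v_j, new_b_i, new_b_j
```

Each step nudges the score of the observed item above that of the sampled unobserved item; the item bias terms correspond to use_bias=True.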

fit(train_set, val_set=None)#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Factorization Machines (FM)#

Global Average (GlobalAvg)#

class cornac.models.global_avg.recom_global_avg.GlobalAvg(name='GlobalAvg')[source]#

Global Average baseline for rating prediction. Every predicted rating equals the average rating of the training data (not personalized).

Parameters:

name (string, default: 'GlobalAvg') – The name of the recommender model.
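The baseline is simple enough to sketch in a few lines. The snippet below is an illustration with made-up ratings, not Cornac's code; global_avg_score and num_items are hypothetical names.

```python
import numpy as np

train_ratings = np.array([4.0, 3.0, 5.0, 2.0])  # hypothetical observed training ratings
global_mean = train_ratings.mean()


def global_avg_score(user_idx, item_idx=None, num_items=3):
    """Return the same global mean for every user and item."""
    if item_idx is None:
        # Scores for all known items: one copy of the mean per item.
        return np.full(num_items, global_mean)
    return global_mean
```

Since the prediction ignores both indices, this model is useful mainly as a lower bound for rating-prediction metrics.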

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Item K-Nearest-Neighbors (ItemKNN)#

class cornac.models.knn.recom_knn.ItemKNN(name='ItemKNN', k=20, similarity='cosine', mean_centered=False, weighting=None, amplify=1.0, num_threads=0, trainable=True, verbose=True, seed=None)[source]#

Item-Based Nearest Neighbor.

Parameters:
  • name (string, default: 'ItemKNN') – The name of the recommender model.

  • k (int, optional, default: 20) – The number of nearest neighbors.

  • similarity (str, optional, default: 'cosine') – The similarity measurement. Supported types: [‘cosine’, ‘pearson’]

  • mean_centered (bool, optional, default: False) – Whether values of the user-item rating matrix will be centered by the mean of their corresponding rows (mean rating of each user).

  • weighting (str, optional, default: None) – The option for re-weighting the rating matrix. Supported types: [‘idf’, ‘bm25’]. If None, no weighting is applied.

  • amplify (float, optional, default: 1.0) – Amplifying the influence on similarity weights.

  • num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.

  • seed (int, optional, default: None) – Random seed for weight initialization.

References

  • Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001, April). Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th international conference on World Wide Web (pp. 285-295).

  • Aggarwal, C. C. (2016). Recommender systems (Vol. 1). Cham: Springer International Publishing.
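A minimal sketch of the neighborhood step with similarity='cosine' is shown below. It assumes a small dense user-item matrix and leaves out mean centering, weighting, and amplification; item_cosine_similarity and top_k_neighbors are hypothetical helpers, not Cornac functions.

```python
import numpy as np


def item_cosine_similarity(ratings):
    """Cosine similarity between the item columns of a user-item matrix."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0            # avoid division by zero for unrated items
    normalized = ratings / norms       # scale every item column to unit length
    return normalized.T @ normalized


def top_k_neighbors(sim, k):
    """Indices of the k most similar items for each item, excluding the item itself."""
    sim = sim.copy()
    np.fill_diagonal(sim, -np.inf)
    return np.argsort(-sim, axis=1)[:, :k]


# Hypothetical ratings: rows are users, columns are items, 0 means unrated.
ratings = np.array([[5.0, 4.0, 0.0],
                    [4.0, 5.0, 0.0],
                    [0.0, 1.0, 5.0]])
sim = item_cosine_similarity(ratings)
neighbors = top_k_neighbors(sim, k=1)
```

A prediction for a user and item is then typically a similarity-weighted average of that user's ratings on the item's k neighbors.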

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Learn to Rank user Preferences based on Phrase-level sentiment analysis across Multiple categories (LRPPM)#

class cornac.models.lrppm.recom_lrppm.LRPPM(name='LRPPM', rating_scale=5, n_factors=8, ld=1, reg=0.01, alpha=1, num_top_aspects=99999, n_ranking_samples=1000, n_samples=200, max_iter=200000, lr=0.1, n_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#

Learn to Rank user Preferences based on Phrase-level sentiment analysis across Multiple categories (LRPPM)

Parameters:
  • name (string, optional, default: 'LRPPM') – The name of the recommender model.

  • rating_scale (float, optional, default: 5.0) – The maximum rating score of the dataset.

  • n_factors (int, optional, default: 8) – The dimension of the latent factors.

  • ld (float, optional, default: 1.0) – The control factor for aspect ranking objective.

  • reg (float, optional, default: 0.01) – The regularization parameter.

  • num_top_aspects (int, optional, default: 99999) – The number of top scored aspects for each (user, item) pair to construct ranking score.

  • alpha (float, optional, default: 1) – Trade-off factor for constructing ranking score.

  • n_ranking_samples (int, optional, default: 1000) – The number of samples from ranking pairs.

  • n_samples (int, optional, default: 200) – The number of samples from all ratings in each iteration.

  • max_iter (int, optional, default: 200000) – Maximum number of iterations for training.

  • lr (float, optional, default: 0.1) – The learning rate for optimization

  • n_threads (int, optional, default: 0) – Number of parallel threads for training. If n_threads=0, all CPU cores will be utilized. If seed is not None, n_threads=1 to remove randomness from parallelization.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, I, UA, and IA are not None).

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    List of initial parameters, e.g., init_params = {‘U’:U, ‘I’:I, ‘UA’:UA, ‘IA’:IA}

    U: ndarray, shape (n_users, n_factors)

    The user latent factors, optional initialization via init_params

    I: ndarray, shape (n_items, n_factors)

    The item latent factors, optional initialization via init_params

    UA: ndarray, shape (num_aspects, n_factors)

    The user-aspect latent factors, optional initialization via init_params

    IA: ndarray, shape (num_aspects, n_factors)

    The item-aspect latent factors, optional initialization via init_params

  • seed (int, optional, default: None) – Random seed for parameters initialization.

References

Xu Chen, Zheng Qin, Yongfeng Zhang, Tao Xu. 2016. Learning to Rank Features for Recommendation over Multiple Categories. Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval (SIGIR ‘16). ACM, New York, NY, USA, 305-314. DOI: https://doi.org/10.1145/2911451.2911549

fit(train_set, val_set=None)#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

rank(user_idx, item_indices=None, k=-1)#

Rank all test items for a given user.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform item ranking.

  • item_indices (1d array, optional, default: None) – A list of candidate item indices to be ranked by the user. If None, list of ranked known item indices and their scores will be returned.

  • k (int, optional, default: -1) – Cut-off length for recommendations; k=-1 returns the ranked list of all items. The cut-off matters most for ANN search, which needs the limit to avoid exhaustive ranking.

Returns:

(ranked_items, item_scores) – ranked_items contains item indices ranked by their scores. item_scores contains the scores of the items corresponding to the indices in the item_indices input.

Return type:

tuple
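The contract of rank() can be sketched on top of a plain score vector. This illustration assumes the scores for all known items are already available as a NumPy array; rank_from_scores is a hypothetical stand-in, not the Cornac method.

```python
import numpy as np


def rank_from_scores(all_scores, item_indices=None, k=-1):
    """Return (ranked_items, item_scores) following the description above."""
    if item_indices is None:
        item_indices = np.arange(len(all_scores))   # rank every known item
    item_indices = np.asarray(item_indices)
    item_scores = all_scores[item_indices]          # aligned with item_indices
    order = np.argsort(-item_scores, kind="stable") # best score first
    if k != -1:
        order = order[:k]                           # apply the cut-off
    ranked_items = item_indices[order]
    return ranked_items, item_scores
```

Note that item_scores stays in the order of the candidate list, while ranked_items is sorted by score and truncated to k.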

score(u_idx, i_idx=None)#

Predict the scores/ratings of a user for an item.

Parameters:
  • u_idx (int, required) – The index of the user for whom to perform score prediction.

  • i_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Matrix Factorization (MF)#

class cornac.models.mf.recom_mf.MF(name='MF', k=10, backend='cpu', optimizer='sgd', max_iter=20, learning_rate=0.01, batch_size=256, lambda_reg=0.02, dropout=0.0, use_bias=True, early_stop=False, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)[source]#

Matrix Factorization.

Parameters:
  • k (int, optional, default: 10) – The dimension of the latent factors.

  • backend (str, optional, default: 'cpu') – Backend used for model training: cpu, pytorch

  • optimizer (str, optional, default: 'sgd') – Specify an optimizer: adagrad, adam, rmsprop, sgd. (ineffective if using CPU backend)

  • max_iter (int, optional, default: 20) – Maximum number of iterations or the number of epochs for training.

  • learning_rate (float, optional, default: 0.01) – The learning rate.

  • batch_size (int, optional, default: 256) – Batch size (ineffective if using CPU backend).

  • lambda_reg (float, optional, default: 0.02) – The lambda value used for regularization.

  • dropout (float, optional, default: 0.0) – The dropout rate of embedding. (ineffective if using CPU backend)

  • use_bias (boolean, optional, default: True) – When True, user, item, and global biases are used.

  • early_stop (boolean, optional, default: False) – When True, delta loss will be checked after each iteration to stop learning earlier.

  • num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization. (Only effective if using CPU backend).

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bu’: user_biases, ‘Bi’: item_biases}

  • seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).

References

  • Koren, Y., Bell, R., & Volinsky, C. Matrix factorization techniques for recommender systems. In Computer, (8), 30-37. 2009.
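With use_bias=True, the biased matrix factorization prediction from Koren et al. combines a global mean, a user bias, an item bias, and the dot product of the latent factors. The sketch below assumes NumPy arrays named after the init_params keys (U, V, Bu, Bi); mu and mf_predict are hypothetical names used only for illustration.

```python
import numpy as np


def mf_predict(U, V, Bu, Bi, mu, user_idx, item_idx=None):
    """Biased MF prediction: mu + Bu[u] + Bi[i] + <U[u], V[i]>."""
    if item_idx is None:
        # Scores for all known items at once.
        return mu + Bu[user_idx] + Bi + V @ U[user_idx]
    return mu + Bu[user_idx] + Bi[item_idx] + V[item_idx] @ U[user_idx]


# Hypothetical parameters for one user and two items with k=2.
U = np.array([[0.1, 0.2]])
V = np.array([[0.3, 0.4], [0.5, 0.1]])
Bu = np.array([0.2])
Bi = np.array([-0.1, 0.3])
mu = 3.5
```

Calling mf_predict(U, V, Bu, Bi, mu, 0) returns one score per item, mirroring how score() behaves when item_idx is None.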

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Maximum Margin Matrix Factorization (MMMF)#

class cornac.models.mmmf.recom_mmmf.MMMF(name='MMMF', k=10, max_iter=100, learning_rate=0.001, lambda_reg=0.01, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#

Maximum Margin Matrix Factorization. This implements an MF model optimized for the Soft Margin (Hinge) Ranking Loss, trained with SGD in the same way as the BPR model.

Parameters:
  • k (int, optional, default: 10) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.001) – The learning rate for SGD.

  • lambda_reg (float, optional, default: 0.01) – The regularization hyper-parameter.

  • num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bi’: item_biases}

  • seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).

References

  • Weimer, M., Karatzoglou, A., & Smola, A. (2008). Improving maximum margin matrix factorization. Machine Learning, 72(3), 263-276.
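The hinge ranking loss mentioned in the class description can be sketched for one pair of items. The margin of 1.0 is an assumption chosen for illustration, and hinge_ranking_loss is a hypothetical helper rather than part of Cornac.

```python
def hinge_ranking_loss(score_pos, score_neg, margin=1.0):
    """Penalize a positive item that does not beat a negative item by the margin."""
    return max(0.0, margin - (score_pos - score_neg))
```

Unlike the BPR log-sigmoid loss, this loss is exactly zero once the positive item leads by at least the margin, so well-separated pairs stop contributing gradients.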

Most Popular (MostPop)#

class cornac.models.most_pop.recom_most_pop.MostPop(name='MostPop')[source]#

Most Popular. Items are recommended based on their popularity (not personalized).

Parameters:

name (string, default: 'MostPop') – The name of the recommender model.
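As an illustration of popularity-based scoring, the sketch below counts interactions per item and ranks items by that count. It assumes popularity means the number of training interactions; the interaction list and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical training interactions as (user_idx, item_idx) pairs.
interactions = [(0, 2), (1, 2), (2, 0), (0, 1), (3, 2), (1, 0)]
num_items = 3

item_counts = Counter(item for _, item in interactions)
# Popularity score per item index; items never seen get 0.
popularity = [item_counts.get(i, 0) for i in range(num_items)]
# Every user receives the same ranking, most popular item first.
ranked_items = sorted(range(num_items), key=lambda i: -popularity[i])
```

Because the scores do not depend on the user, the same ranked list is returned for every user_idx.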

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Non-negative Matrix Factorization (NMF)#

class cornac.models.nmf.recom_nmf.NMF(name='NMF', k=15, max_iter=50, learning_rate=0.005, lambda_reg=0.0, lambda_u=0.06, lambda_v=0.06, lambda_bu=0.02, lambda_bi=0.02, use_bias=False, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)#

Non-negative Matrix Factorization

Parameters:
  • k (int, optional, default: 15) – The dimension of the latent factors.

  • max_iter (int, optional, default: 50) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.005) – The learning rate.

  • lambda_reg (float, optional, default: 0.0) – The lambda value used for regularization of all parameters.

  • lambda_u (float, optional, default: 0.06) – The regularization parameter for user factors U.

  • lambda_v (float, optional, default: 0.06) – The regularization parameter for item factors V.

  • lambda_bu (float, optional, default: 0.02) – The regularization parameter for user biases Bu.

  • lambda_bi (float, optional, default: 0.02) – The regularization parameter for item biases Bi.

  • use_bias (boolean, optional, default: False) – When True, user, item, and global biases are used.

  • num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bu’: user_biases, ‘Bi’: item_biases, ‘mu’: global_mean}

  • seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).

References

  • Lee, D. D., & Seung, H. S. (2001). Algorithms for non-negative matrix factorization. In Advances in neural information processing systems (pp. 556-562).

  • Takahashi, N., Katayama, J., & Takeuchi, J. I. (2014). A generalized sufficient condition for global convergence of modified multiplicative updates for NMF. In Proceedings of 2014 International Symposium on Nonlinear Theory and its Applications (pp. 44-47).

fit(train_set, val_set=None)#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product aka. inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Probabilistic Matrix Factorization (PMF)#

class cornac.models.pmf.recom_pmf.PMF(k=5, max_iter=100, learning_rate=0.001, gamma=0.9, lambda_reg=0.001, name='PMF', variant='non_linear', trainable=True, verbose=False, init_params=None, seed=None)[source]#

Probabilistic Matrix Factorization.

Parameters:
  • k (int, optional, default: 5) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.

  • gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.

  • lambda_reg (float, optional, default: 0.001) – The regularization coefficient.

  • name (string, optional, default: 'PMF') – The name of the recommender model.

  • variant ({"linear","non_linear"}, optional, default: 'non_linear') – PMF variant. If ‘non_linear’, the Gaussian mean is the output of a sigmoid function. If ‘linear’, the Gaussian mean is the output of the identity function.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • init_params (dict, optional, default: None) –

    Dictionary of initial parameters, e.g., init_params = {'U': U, 'V': V}.

    U: ndarray, shape (n_users, k)

    User latent factors.

    V: ndarray, shape (n_items, k)

    Item latent factors.

  • seed (int, optional, default: None) – Random seed for parameters initialization.
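The two variant options differ only in how the Gaussian mean is derived from the latent factors. The following is a minimal sketch in plain Python, not Cornac's implementation; the function name pmf_mean and the example vectors are hypothetical.

```python
import math


def pmf_mean(user_factors, item_factors, variant="non_linear"):
    """Gaussian mean of a rating given k-dimensional user and item factors."""
    dot = sum(u * v for u, v in zip(user_factors, item_factors))
    if variant == "linear":
        return dot  # identity function
    if variant == "non_linear":
        return 1.0 / (1.0 + math.exp(-dot))  # sigmoid maps the dot product into (0, 1)
    raise ValueError(f"unknown variant: {variant!r}")


u = [0.5, -0.2, 0.1]  # hypothetical user factors (k=3)
v = [0.4, 0.3, -0.6]  # hypothetical item factors (k=3)
linear_mean = pmf_mean(u, v, variant="linear")
non_linear_mean = pmf_mean(u, v, variant="non_linear")
```

Because the sigmoid output lies in (0, 1), the non-linear variant presumably expects ratings to be rescaled to that range before comparison; that rescaling is not shown here.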

References

  • Mnih, Andriy, and Ruslan R. Salakhutdinov. Probabilistic matrix factorization. In NIPS, pp. 1257-1264. 2008.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product, also known as inner product

Return type:

MEASURE_DOT
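The user vectors, item vectors, and measure exposed by these three methods are meant for nearest-neighbor retrieval. As a rough illustration, assuming plain Python lists in place of NumPy arrays, an exact (brute-force) search under the dot-product measure could look like the sketch below; an ANN index approximates this ranking. The helper names are hypothetical and are not part of Cornac.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_by_dot(user_vector, item_vectors, k):
    """Return the indices of the k items with the largest dot product."""
    scored = [(dot(user_vector, vec), idx) for idx, vec in enumerate(item_vectors)]
    # Sort by descending score; break ties by ascending item index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


item_vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.2]]
user_vector = [0.9, 0.4]
top2 = top_k_by_dot(user_vector, item_vectors, k=2)
```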

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Session Popular (SPop)#

class cornac.models.spop.recom_spop.SPop(name='SPop', use_session_popularity=True)[source]#

Recommend most popular items of the current session.

Parameters:
  • name (string, default: 'SPop') – The name of the recommender model.

  • use_session_popularity (boolean, optional, default: True) – When False, item frequencies from the history items of the current session are not used.

References

Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, Domonkos Tikk: Session-based Recommendations with Recurrent Neural Networks, ICLR 2016

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, history_items, **kwargs)[source]#

Predict the scores for all items based on input history items

Parameters:

history_items (list of lists) – The list of history items in sequential manner for next-item prediction.

Returns:

res – Relative scores of all known items

Return type:

a Numpy array
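To make the effect of use_session_popularity concrete, here is one plausible scoring scheme, written as a sketch under stated assumptions rather than as Cornac's actual code: global item counts give a baseline, and items already seen in the current session are boosted by their in-session frequency. The function spop_scores, the flat list of item indices, and the boosting rule are all hypothetical.

```python
from collections import Counter


def spop_scores(global_counts, history_items, use_session_popularity=True):
    """Return one score per item index (hypothetical helper, not Cornac's code)."""
    max_count = max(global_counts, default=0)
    # Baseline: global popularity scaled into [0, 1].
    scores = [count / max_count if max_count else 0.0 for count in global_counts]
    if use_session_popularity:
        # Assumed rule: items seen in this session outrank all others,
        # ordered by how often they occurred in the session.
        for item_idx, freq in Counter(history_items).items():
            scores[item_idx] = 1.0 + freq
    return scores


global_counts = [10, 50, 5, 20]  # training-set frequency per item index
session = [2, 0, 2]              # item indices seen so far in the session
with_session = spop_scores(global_counts, session)
without_session = spop_scores(global_counts, session, use_session_popularity=False)
```

With the flag disabled, the ranking falls back to global popularity alone.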

Session-based Recommendations with Recurrent Neural Networks (GRU4Rec)#

class cornac.models.gru4rec.recom_gru4rec.GRU4Rec(name='GRU4Rec', layers=[100], loss='cross-entropy', batch_size=512, dropout_p_embed=0.0, dropout_p_hidden=0.0, learning_rate=0.05, momentum=0.0, sample_alpha=0.5, n_sample=2048, embedding=0, constrained_embedding=True, n_epochs=10, bpreg=1.0, elu_param=0.5, logq=0.0, device='cpu', trainable=True, verbose=False, seed=None)[source]#

Session-based Recommendations with Recurrent Neural Networks

Parameters:
  • name (string, default: 'GRU4Rec') – The name of the recommender model.

  • layers (list of int, optional, default: [100]) – The number of hidden units in each layer

  • loss (str, optional, default: 'cross-entropy') – Select the loss function.

  • batch_size (int, optional, default: 512) – Batch size

  • dropout_p_embed (float, optional, default: 0.0) – Dropout ratio for embedding layers

  • dropout_p_hidden (float, optional, default: 0.0) – Dropout ratio for hidden layers

  • learning_rate (float, optional, default: 0.05) – Learning rate for the optimizer

  • momentum (float, optional, default: 0.0) – Momentum for adaptive learning rate

  • sample_alpha (float, optional, default: 0.5) – Trade-off factor controlling the contribution of negative samples to the final loss.

  • n_sample (int, optional, default: 2048) – Number of negative samples

  • embedding (int, optional, default: 0)

  • constrained_embedding (bool, optional, default: True)

  • n_epochs (int, optional, default: 10)

  • bpreg (float, optional, default: 1.0) – Regularization coefficient for ‘bpr-max’ loss.

  • elu_param (float, optional, default: 0.5) – ELU parameter for the 'bpr-max' loss.

  • logq (float, optional, default: 0.0) – LogQ correction to offset the sampling bias affecting the 'cross-entropy' loss.

  • device (str, optional, default: 'cpu') – Set to ‘cuda’ for GPU support.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • seed (int, optional, default: None) – Random seed for weight initialization.

References

Hidasi, B., Karatzoglou, A., Baltrunas, L., & Tikk, D. (2015). Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, history_items, **kwargs)[source]#

Predict the scores for all items based on input history items

Parameters:

history_items (list of lists) – The list of history items in sequential manner for next-item prediction.

Returns:

res – Relative scores of all known items

Return type:

a Numpy array

Singular Value Decomposition (SVD)#

class cornac.models.svd.recom_svd.SVD(name='SVD', k=10, max_iter=20, learning_rate=0.01, lambda_reg=0.02, early_stop=False, num_threads=0, trainable=True, verbose=False, init_params=None, seed=None)[source]#

Singular Value Decomposition (SVD). The implementation is based on Matrix Factorization with biases.

Parameters:
  • k (int, optional, default: 10) – The dimension of the latent factors.

  • max_iter (int, optional, default: 20) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.01) – The learning rate.

  • lambda_reg (float, optional, default: 0.02) – The lambda value used for regularization.

  • early_stop (boolean, optional, default: False) – When True, delta loss will be checked after each iteration to stop learning earlier.

  • num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.

  • trainable (boolean, optional, default: True) – When False, the model will not be re-trained, and pre-trained parameters must be provided.

  • verbose (boolean, optional, default: False) – When True, running logs are displayed.

  • init_params (dictionary, optional, default: None) – Initial parameters, e.g., init_params = {‘U’: user_factors, ‘V’: item_factors, ‘Bu’: user_biases, ‘Bi’: item_biases}

  • seed (int, optional, default: None) – Random seed for weight initialization. If specified, training will take longer because of single-thread (no parallelization).
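"Matrix Factorization with biases" usually means that a rating is predicted as a global mean plus a user bias, an item bias, and the dot product of the latent factors, which lines up with the U, V, Bu, and Bi entries of init_params. The sketch below follows that standard formulation (Koren, 2008) with one SGD update; whether Cornac's update matches it term by term is an assumption, and all function names are hypothetical.

```python
def predict(mu, bu, bi, pu, qi):
    """Biased MF prediction: global mean + user bias + item bias + dot product."""
    return mu + bu + bi + sum(p * q for p, q in zip(pu, qi))


def sgd_step(r, mu, bu, bi, pu, qi, learning_rate=0.01, lambda_reg=0.02):
    """One regularized SGD update for a single observed rating r."""
    err = r - predict(mu, bu, bi, pu, qi)
    new_bu = bu + learning_rate * (err - lambda_reg * bu)
    new_bi = bi + learning_rate * (err - lambda_reg * bi)
    # Both factor updates use the old values of pu and qi.
    new_pu = [p + learning_rate * (err * q - lambda_reg * p) for p, q in zip(pu, qi)]
    new_qi = [q + learning_rate * (err * p - lambda_reg * q) for p, q in zip(pu, qi)]
    return new_bu, new_bi, new_pu, new_qi


mu, bu, bi = 3.5, 0.1, -0.2
pu, qi = [0.1, 0.2], [0.3, -0.1]
rating = 4.0
before = predict(mu, bu, bi, pu, qi)
bu2, bi2, pu2, qi2 = sgd_step(rating, mu, bu, bi, pu, qi)
after = predict(mu, bu2, bi2, pu2, qi2)
```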

References

  • Koren, Y. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In SIGKDD, pp. 426-434. 2008.

  • Koren, Y. Factor in the neighbors: Scalable and accurate collaborative filtering. In TKDD, 2010.

Social Recommendation using PMF (SoRec)#

class cornac.models.sorec.recom_sorec.SoRec(name='SoRec', k=5, max_iter=100, learning_rate=0.001, lambda_c=10, lambda_reg=0.001, gamma=0.9, weight_link=True, trainable=True, verbose=False, init_params=None, seed=None)[source]#

Social recommendation using Probabilistic Matrix Factorization.

Parameters:
  • k (int, optional, default: 5) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.001) – The learning rate for SGD_RMSProp.

  • gamma (float, optional, default: 0.9) – The weight for previous/current gradient in RMSProp.

  • lambda_c (float, optional, default: 10) – The parameter balancing the information from the user-item rating matrix and the user social network.

  • lambda_reg (float, optional, default: 0.001) – The regularization parameter.

  • weight_link (boolean, optional, default: True) – When True, the social network links are weighted according to eq. (4) in the original paper.

  • name (string, optional, default: 'SoRec') – The name of the recommender model.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U, V and Z are not None).

  • verbose (boolean, optional, default: False) – When True, some running logs are displayed.

  • init_params (dictionary, optional, default: None) –

    Dictionary of initial parameters, e.g., init_params = {'U': U, 'V': V, 'Z': Z}.

    U: a ndarray of shape (n_users, k)

    Containing the user latent factors.

    V: a ndarray of shape (n_items, k)

    Containing the item latent factors.

    Z: a ndarray of shape (n_users, k)

    Containing the social network latent factors.

  • seed (int, optional, default: None) – Random seed for parameters initialization.
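The learning_rate and gamma parameters refer to SGD with RMSProp. Assuming the standard RMSProp rule (a running average of squared gradients that scales each step), a single scalar update can be sketched as follows; the function name, the eps constant, and the toy objective are illustrative and not taken from Cornac.

```python
import math


def rmsprop_step(param, grad, cache, learning_rate=0.001, gamma=0.9, eps=1e-8):
    """One RMSProp update for a scalar parameter."""
    # gamma weights the previous squared-gradient average, (1 - gamma) the current one.
    cache = gamma * cache + (1.0 - gamma) * grad * grad
    param = param - learning_rate * grad / (math.sqrt(cache) + eps)
    return param, cache


# Toy objective f(x) = x^2 with gradient 2x, starting from x = 1.
x, cache = 1.0, 0.0
for _ in range(100):
    grad = 2.0 * x
    x, cache = rmsprop_step(x, grad, cache)
```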

References

  • H. Ma, H. Yang, M. R. Lyu, and I. King. SoRec: Social recommendation using probabilistic matrix factorization. CIKM '08, pages 931–940, Napa Valley, USA, 2008.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product, also known as inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

User K-Nearest-Neighbors (UserKNN)#

class cornac.models.knn.recom_knn.UserKNN(name='UserKNN', k=20, similarity='cosine', mean_centered=False, weighting=None, amplify=1.0, num_threads=0, trainable=True, verbose=True, seed=None)[source]#

User-Based Nearest Neighbor.

Parameters:
  • name (string, default: 'UserKNN') – The name of the recommender model.

  • k (int, optional, default: 20) – The number of nearest neighbors.

  • similarity (str, optional, default: 'cosine') – The similarity measurement. Supported types: [‘cosine’, ‘pearson’]

  • mean_centered (bool, optional, default: False) – Whether values of the user-item rating matrix will be centered by the mean of their corresponding rows (mean rating of each user).

  • weighting (str, optional, default: None) – The option for re-weighting the rating matrix. Supported types: [‘idf’, ‘bm25’]. If None, no weighting is applied.

  • amplify (float, optional, default: 1.0) – Amplifying the influence on similarity weights.

  • num_threads (int, optional, default: 0) – Number of parallel threads for training. If num_threads=0, all CPU cores will be utilized. If seed is not None, num_threads=1 to remove randomness from parallelization.

  • seed (int, optional, default: None) – Random seed for weight initialization.
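As an informal illustration of how k, similarity='cosine', and mean_centered interact, the sketch below predicts a rating as the similarity-weighted average over the k most similar users who rated the item. It uses plain Python lists, treats 0 as "unrated", and ignores weighting and amplify; all of these are simplifying assumptions, and the helper names are hypothetical rather than Cornac's.

```python
import math


def cosine(a, b):
    """Cosine similarity between two rating rows."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0


def mean_center(row):
    """Subtract the user's mean rating from each rated entry (0 means unrated)."""
    rated = [r for r in row if r != 0]
    mean = sum(rated) / len(rated) if rated else 0.0
    return [r - mean if r != 0 else 0.0 for r in row]


def predict(ratings, user_idx, item_idx, k=20):
    """Similarity-weighted average over the k nearest users who rated the item."""
    sims = [
        (cosine(ratings[user_idx], row), other)
        for other, row in enumerate(ratings)
        if other != user_idx and row[item_idx] != 0
    ]
    neighbors = sorted(sims, reverse=True)[:k]
    norm = sum(abs(s) for s, _ in neighbors)
    if norm == 0:
        return 0.0
    return sum(s * ratings[other][item_idx] for s, other in neighbors) / norm


ratings = [
    [5, 3, 0, 1],
    [4, 0, 4, 1],
    [1, 1, 5, 5],
    [5, 4, 4, 0],
]
pred = predict(ratings, user_idx=0, item_idx=2, k=2)
centered = mean_center(ratings[0])
```

Here the two nearest neighbors of user 0 both rated item 2 with a 4, so the prediction is 4 regardless of their exact similarity weights.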

References

  • Breese, J. S., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. Microsoft Research, Microsoft Corporation, One Microsoft Way, Redmond, WA 98052.

  • Aggarwal, C. C. (2016). Recommender systems (Vol. 1). Cham: Springer International Publishing.

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array

Weighted Matrix Factorization (WMF)#

class cornac.models.wmf.recom_wmf.WMF(name='WMF', k=200, lambda_u=0.01, lambda_v=0.01, a=1, b=0.01, learning_rate=0.001, batch_size=128, max_iter=100, trainable=True, verbose=True, init_params=None, seed=None)[source]#

Weighted Matrix Factorization.

Parameters:
  • name (string, default: 'WMF') – The name of the recommender model.

  • k (int, optional, default: 200) – The dimension of the latent factors.

  • max_iter (int, optional, default: 100) – Maximum number of iterations or the number of epochs for SGD.

  • learning_rate (float, optional, default: 0.001) – The learning rate for AdamOptimizer.

  • lambda_u (float, optional, default: 0.01) – The regularization parameter for users.

  • lambda_v (float, optional, default: 0.01) – The regularization parameter for items.

  • a (float, optional, default: 1) – The confidence of observed ratings.

  • b (float, optional, default: 0.01) – The confidence of unseen ratings.

  • batch_size (int, optional, default: 128) – The batch size for SGD.

  • trainable (boolean, optional, default: True) – When False, the model is not trained and Cornac assumes that the model is already pre-trained (U and V are not None).

  • init_params (dictionary, optional, default: None) –

    Dictionary of initial parameters, e.g., init_params = {'U': U, 'V': V}

    U: ndarray, shape (n_users,k)

    The user latent factors, optional initialization via init_params.

    V: ndarray, shape (n_items,k)

    The item latent factors, optional initialization via init_params.

  • seed (int, optional, default: None) – Random seed for weight initialization.
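The parameters a, b, lambda_u, and lambda_v suggest a confidence-weighted squared-error objective: observed entries are weighted by a, unseen entries by b, and both factor matrices are L2-regularized. The sketch below computes such a loss on dense Python lists. The exact form used by Cornac (scaling constants, batching, optimizer) is not stated here, so treat this as an assumption; wmf_loss is a hypothetical name.

```python
def wmf_loss(R, U, V, a=1.0, b=0.01, lambda_u=0.01, lambda_v=0.01):
    """Confidence-weighted squared error plus L2 regularization."""
    loss = 0.0
    for i, row in enumerate(R):
        for j, r in enumerate(row):
            conf = a if r > 0 else b  # observed vs. unseen confidence
            pred = sum(x * y for x, y in zip(U[i], V[j]))
            loss += conf * (r - pred) ** 2
    loss += lambda_u * sum(x * x for u in U for x in u)
    loss += lambda_v * sum(x * x for v in V for x in v)
    return loss


R = [[1, 0], [0, 1]]                  # toy implicit-feedback matrix
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
perfect_fit = wmf_loss(R, identity, identity)  # only the regularization terms remain
no_users = wmf_loss(R, zeros, identity)        # every observed entry is missed
```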

References

  • Hu, Y., Koren, Y., & Volinsky, C. (2008, December). Collaborative filtering for implicit feedback datasets. In 2008 Eighth IEEE International Conference on Data Mining (pp. 263-272).

  • Pan, R., Zhou, Y., Cao, B., Liu, N. N., Lukose, R., Scholz, M., & Yang, Q. (2008, December). One-class collaborative filtering. In 2008 Eighth IEEE International Conference on Data Mining (pp. 502-511).

fit(train_set, val_set=None)[source]#

Fit the model to observations.

Parameters:
  • train_set (cornac.data.Dataset, required) – User-Item preference data as well as additional modalities.

  • val_set (cornac.data.Dataset, optional, default: None) – User-Item preference data for model selection purposes (e.g., early stopping).

Returns:

self

Return type:

object

get_item_vectors()[source]#

Getting a matrix of item vectors used for building the index for ANN search.

Returns:

out – Matrix of item vectors for all items available in the model.

Return type:

numpy.array

get_user_vectors()[source]#

Getting a matrix of user vectors serving as query for ANN search.

Returns:

out – Matrix of user vectors for all users available in the model.

Return type:

numpy.array

get_vector_measure()[source]#

Getting a valid choice of vector measurement in ANNMixin._measures.

Returns:

measure – Dot product, also known as inner product

Return type:

MEASURE_DOT

score(user_idx, item_idx=None)[source]#

Predict the scores/ratings of a user for an item.

Parameters:
  • user_idx (int, required) – The index of the user for whom to perform score prediction.

  • item_idx (int, optional, default: None) – The index of the item for which to perform score prediction. If None, scores for all known items will be returned.

Returns:

res – Relative scores that the user gives to the item or to all known items

Return type:

A scalar or a Numpy array