Built-in Datasets#
Amazon Clothing#
This data is built from the Amazon datasets provided by Julian McAuley at: http://jmcauley.ucsd.edu/data/amazon/. We make sure that all items have three types of auxiliary data: text, image, and context (items appearing together).
- cornac.datasets.amazon_clothing.load_feedback(reader: Reader = None) List [source]#
Load the user-item ratings, scale: [1,5]
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (user, item, rating).
- Return type:
array-like
- cornac.datasets.amazon_clothing.load_graph(reader: Reader = None) List [source]#
Load the item-item interactions (symmetric network), built from the Amazon Also-Viewed information
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (item, item, 1).
- Return type:
array-like
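As a minimal sketch of how the (item, item, 1) tuples described above might be consumed, the snippet below builds a neighbor lookup from a small hand-written sample. The sample edges and the helper name `build_adjacency` are assumptions made for illustration; in practice the list would come from `load_graph()`. Because the network is described as symmetric, each edge is recorded in both directions.

```python
from collections import defaultdict

# Hypothetical sample in the documented (item, item, 1) format.
edges = [("A", "B", 1), ("B", "C", 1)]


def build_adjacency(edges):
    """Map each item to the set of its neighbors, treating edges as symmetric."""
    neighbors = defaultdict(set)
    for item_i, item_j, _weight in edges:
        # Record both directions so lookups work from either endpoint.
        neighbors[item_i].add(item_j)
        neighbors[item_j].add(item_i)
    return dict(neighbors)


adjacency = build_adjacency(edges)
```

Using sets means that an edge listed in both directions in the input is stored only once per endpoint.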
Amazon Digital Music#
This data is built based on the Amazon datasets provided by Julian McAuley at: http://jmcauley.ucsd.edu/data/amazon/
- cornac.datasets.amazon_digital_music.load_feedback(reader: Reader = None) List [source]#
Load the user-item ratings, scale: [1,5]
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (user, item, rating).
- Return type:
array-like
- cornac.datasets.amazon_digital_music.load_review(reader: Reader = None) List [source]#
Load the user-item-review list
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (user, item, review).
- Return type:
array-like
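Since `load_feedback` and `load_review` both return tuples keyed by user and item, the two lists can be joined on that pair. The following is a rough sketch on hand-written sample data; the sample values and the helper name `join_feedback_reviews` are hypothetical, and it assumes at most one review per (user, item) pair.

```python
# Hypothetical samples in the documented tuple formats.
feedback = [("u1", "i1", 5.0), ("u2", "i1", 3.0), ("u3", "i2", 4.0)]
reviews = [("u1", "i1", "Great album"), ("u2", "i1", "Average")]


def join_feedback_reviews(feedback, reviews):
    """Attach the review text (or None) to each (user, item, rating) tuple."""
    review_by_pair = {(user, item): text for user, item, text in reviews}
    return [
        (user, item, rating, review_by_pair.get((user, item)))
        for user, item, rating in feedback
    ]


joined = join_feedback_reviews(feedback, reviews)
```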
Amazon Office#
This data is built based on the Amazon datasets provided by Julian McAuley at: http://jmcauley.ucsd.edu/data/amazon/
- cornac.datasets.amazon_office.load_feedback(reader: Reader = None) List [source]#
Load the user-item ratings, scale: [1,5]
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (user, item, rating).
- Return type:
array-like
- cornac.datasets.amazon_office.load_graph(reader: Reader = None) List [source]#
Load the item-item interactions (symmetric network), built from the Amazon Also-Viewed information
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (item, item, 1).
- Return type:
array-like
Amazon Toys and Games#
This data is built based on the Amazon datasets provided by Julian McAuley at: http://jmcauley.ucsd.edu/data/amazon/
- cornac.datasets.amazon_toy.load_feedback(fmt='UIR', reader: Reader = None) List [source]#
Load the user-item ratings, scale: [1,5]
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (user, item, rating).
- Return type:
array-like
- cornac.datasets.amazon_toy.load_sentiment(reader: Reader = None) List [source]#
Load the user-item-sentiments. The dataset was constructed by the method described in the reference paper.
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (user, item, [(aspect, opinion, sentiment), (aspect, opinion, sentiment), …]).
- Return type:
array-like
References
Gao, J., Wang, X., Wang, Y., & Xie, X. (2019). Explainable Recommendation Through Attentive Multi-View Learning. AAAI.
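To make the nested format returned by `load_sentiment` more concrete, here is a small sketch that counts how often each aspect is mentioned. The sample tuples, the +1/-1 sentiment encoding, and the helper name `aspect_counts` are assumptions for illustration only; the actual values in the dataset may differ.

```python
from collections import Counter

# Hypothetical sample in the documented format:
# (user, item, [(aspect, opinion, sentiment), ...])
sentiment = [
    ("u1", "i1", [("price", "cheap", 1), ("quality", "poor", -1)]),
    ("u2", "i1", [("price", "high", -1)]),
]


def aspect_counts(sentiment):
    """Count how many times each aspect appears across all tuples."""
    counts = Counter()
    for _user, _item, triples in sentiment:
        for aspect, _opinion, _polarity in triples:
            counts[aspect] += 1
    return counts


counts = aspect_counts(sentiment)
```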
CiteULike#
This dataset is mostly from the paper ‘Collaborative topic modeling for recommending scientific articles’ [Wang and Blei - KDD 2011]. It was further collected, named citeulike-a, and used in the paper ‘Collaborative Topic Regression with Social Regularization’ [Wang, Chen and Li - IJCAI 2013].
Link to the data: http://www.wanghao.in/CDL.htm
- cornac.datasets.citeulike.load_feedback(reader: Reader = None) List [source]#
Load the implicit feedback between users and items
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (user, item, 1).
- Return type:
array-like
Epinions#
Link to the dataset: http://www.trustlet.org/downloaded_epinions.html
- cornac.datasets.epinions.load_feedback(reader: Reader = None) List [source]#
Load the user-item ratings, scale: [1,5]
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (user, item, rating).
- Return type:
array-like
- cornac.datasets.epinions.load_trust(reader: Reader = None) List [source]#
Load the user trust information (undirected network)
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (source_user, target_user, trust_value).
- Return type:
array-like
FilmTrust#
Source: https://www.librec.net/datasets.html
- cornac.datasets.filmtrust.load_feedback(reader: Reader = None) List [source]#
Load the user-item ratings, scale: [0.5,4]
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (user, item, rating).
- Return type:
array-like
- cornac.datasets.filmtrust.load_trust(reader: Reader = None) List [source]#
Load the user-user trust information (undirected network)
- Parameters:
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.
- Returns:
data – Data in the form of a list of tuples (user, user, 1).
- Return type:
array-like
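The FilmTrust rating scale of [0.5, 4] differs from the [1, 5] scale of the other rating datasets on this page. If ratings need to be comparable, one option is min-max rescaling to [0, 1]. The sketch below works on hand-written sample tuples; the helper name `rescale_ratings` and the sample values are hypothetical, and this step is not something the library is stated to do.

```python
# Hypothetical sample in the documented (user, item, rating) format.
feedback = [("u1", "i1", 0.5), ("u1", "i2", 2.25), ("u2", "i1", 4.0)]


def rescale_ratings(feedback, low=0.5, high=4.0):
    """Min-max rescale ratings from [low, high] to [0, 1]."""
    span = high - low
    return [(user, item, (rating - low) / span) for user, item, rating in feedback]


rescaled = rescale_ratings(feedback)
```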
MovieLens#
Link to the data: https://grouplens.org/datasets/movielens/
- class cornac.datasets.movielens.MovieLens(url, unzip, path, sep, skip)#
- path#
Alias for field number 2
- sep#
Alias for field number 3
- skip#
Alias for field number 4
- unzip#
Alias for field number 1
- url#
Alias for field number 0
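The "Alias for field number N" wording is what Python generates for the fields of a `collections.namedtuple`, which suggests that each field can be read either by name or by position. The sketch below reproduces that behavior with a stand-in type; the name `MovieLensSpec` and all field values are hypothetical placeholders, not the library's actual configuration.

```python
from collections import namedtuple

# Stand-in type using the documented field order: url, unzip, path, sep, skip.
MovieLensSpec = namedtuple("MovieLensSpec", ["url", "unzip", "path", "sep", "skip"])

# Placeholder values for illustration only.
spec = MovieLensSpec(
    url="https://example.org/ml.zip",
    unzip=True,
    path="ml/ratings.dat",
    sep="\t",
    skip=0,
)

# Access by name and by position refer to the same field.
same_url = spec.url == spec[0]
same_skip = spec.skip == spec[4]
```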
- cornac.datasets.movielens.load_feedback(fmt='UIR', variant='100K', reader=None)[source]#
Load the user-item ratings of one of the MovieLens datasets
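- Parameters:
fmt (str, default: 'UIR') – Data format of the returned tuples.
variant (str, default: '100K') – Which MovieLens dataset to load.
reader (obj:cornac.data.Reader, default: None) – Reader object used to read the data.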
- Returns:
data – Data in the form of a list of tuples depending on the given data format.
- Return type:
array-like
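A typical first step after loading is to check basic statistics of the returned tuples. The sketch below shows one way this could look. The call in `load_movielens_100k` uses the signature documented above but is never executed here, since it needs the cornac package and may need to fetch the data; the helper names and the sample data are hypothetical.

```python
def load_movielens_100k():
    """Load MovieLens 100K via cornac (requires cornac to be installed)."""
    from cornac.datasets import movielens

    return movielens.load_feedback(fmt="UIR", variant="100K")


def summarize(data):
    """Count distinct users, distinct items, and total tuples."""
    users = {row[0] for row in data}
    items = {row[1] for row in data}
    return {"users": len(users), "items": len(items), "ratings": len(data)}


# Hypothetical sample in (user, item, rating) form.
sample = [("u1", "i1", 4.0), ("u1", "i2", 3.0), ("u2", "i1", 5.0)]
summary = summarize(sample)
```

Indexing by position (`row[0]`, `row[1]`) rather than unpacking three values keeps `summarize` usable if the chosen format returns tuples with additional fields.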
- cornac.datasets.movielens.load_plot()[source]#
Load the plots of movies provided at: http://dm.postech.ac.kr/~cartopy/ConvMF/
- Returns:
texts (List) – List of text documents, one per item.
ids (List) – List of item ids aligned with indices in texts.
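Because `texts` and `ids` are described as index-aligned lists, they can be zipped into a lookup table. This is a minimal sketch with hand-written placeholder values; the real lists would come from `load_plot()`.

```python
# Hypothetical placeholder values; texts[k] is the plot of the item ids[k].
texts = ["Toys come to life.", "A board game turns real."]
ids = ["1", "2"]

# Map each item id to its plot text.
plot_by_id = dict(zip(ids, texts))
```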
Netflix#
Link to the data: https://www.kaggle.com/netflix-inc/netflix-prize-data/
Tradesy#
Link to the data: http://jmcauley.ucsd.edu/data/tradesy/
This data is used in the VBPR paper. After cleaning the data, we have:
- Number of feedback: 394,421 (410,186 is reported but there are duplicates)
- Number of users: 19,243 (19,823 is reported due to duplicates)
- Number of items: 165,906 (166,521 is reported due to duplicates)
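The feedback counts above differ by 410,186 − 394,421 = 15,765 entries. The source does not say how duplicates were identified, so the sketch below is only one plausible reading: it assumes a duplicate is a repeated (user, item) pair and keeps the first occurrence. The helper name `drop_duplicates` and the sample data are hypothetical.

```python
# Hypothetical sample with one repeated (user, item) pair.
feedback = [("u1", "i1", 1), ("u2", "i1", 1), ("u1", "i1", 1)]


def drop_duplicates(feedback):
    """Keep the first tuple seen for each (user, item) pair, preserving order."""
    seen = set()
    unique = []
    for row in feedback:
        key = (row[0], row[1])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


cleaned = drop_duplicates(feedback)
removed = len(feedback) - len(cleaned)
```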