Mab2Rec Public API

BanditRecommender

class mab2rec.BanditRecommender(learning_policy: Union[mabwiser.mab.LearningPolicy.EpsilonGreedy, mabwiser.mab.LearningPolicy.Popularity, mabwiser.mab.LearningPolicy.Random, mabwiser.mab.LearningPolicy.Softmax, mabwiser.mab.LearningPolicy.ThompsonSampling, mabwiser.mab.LearningPolicy.UCB1, mabwiser.mab.LearningPolicy.LinGreedy, mabwiser.mab.LearningPolicy.LinTS, mabwiser.mab.LearningPolicy.LinUCB], neighborhood_policy: Union[None, mabwiser.mab.NeighborhoodPolicy.LSHNearest, mabwiser.mab.NeighborhoodPolicy.Clusters, mabwiser.mab.NeighborhoodPolicy.KNearest, mabwiser.mab.NeighborhoodPolicy.Radius, mabwiser.mab.NeighborhoodPolicy.TreeBandit] = None, top_k: int = 10, seed: int = 12345, n_jobs: int = 1, backend: Optional[str] = None)

Bases: object

Mab2Rec: Multi-Armed Bandit Recommender

Mab2Rec is a library to support prototyping and building of bandit-based recommendation algorithms. It is powered by MABWiser which supports context-free, parametric and non-parametric contextual bandit models.

learning_policy

The learning policy.

Type

MABWiser LearningPolicy

neighborhood_policy

The neighborhood policy.

Type

MABWiser NeighborhoodPolicy

top_k

The number of items to recommend.

Type

int, default=10

seed

The random seed to initialize the internal random number generator.

Type

int, default=Constants.default_seed

n_jobs

This is used to specify how many concurrent processes/threads should be used for parallelized routines. Default value is set to 1. If set to -1, all CPUs are used. If set to -2, all CPUs but one are used, and so on.

Type

int

backend

Specify a parallelization backend implementation supported in the joblib library. Supported options are:

  • “loky” used by default, can induce some communication and memory overhead when exchanging input and output data with the worker Python processes.

  • “multiprocessing” previous process-based backend based on multiprocessing.Pool. Less robust than loky.

  • “threading” is a very low-overhead backend but it suffers from the Python Global Interpreter Lock if the called function relies a lot on Python objects.

Default value is None. In this case the default backend selected by joblib will be used.

Type

str, optional

mab

The multi-armed bandit.

Type

MAB

Examples

>>> from mab2rec import BanditRecommender, LearningPolicy
>>> decisions = ['Arm1', 'Arm1', 'Arm3', 'Arm1', 'Arm2', 'Arm3']
>>> rewards = [0, 1, 1, 0, 1, 0]
>>> rec = BanditRecommender(LearningPolicy.EpsilonGreedy(epsilon=0.25), top_k=2)
>>> rec.fit(decisions, rewards)
>>> rec.recommend()
['Arm2', 'Arm1']
>>> rec.add_arm('Arm4')
>>> rec.partial_fit(['Arm4'], [1])
>>> rec.recommend()
['Arm2', 'Arm4']
>>> from mab2rec import BanditRecommender, LearningPolicy, NeighborhoodPolicy
>>> decisions = ['Arm1', 'Arm1', 'Arm3', 'Arm1', 'Arm2', 'Arm3']
>>> rewards = [0, 1, 1, 0, 1, 0]
>>> contexts = [[0, 0, 0], [1, 0, 1], [0, 1, 1], [0, 0, 0], [1, 1, 1], [0, 1, 0]]
>>> rec = BanditRecommender(LearningPolicy.EpsilonGreedy(), NeighborhoodPolicy.KNearest(k=3), top_k=2)
>>> rec.fit(decisions, rewards, contexts)
>>> rec.recommend([[1, 1, 0], [1, 1, 1], [0, 1, 0]])
[['Arm2', 'Arm3'], ['Arm3', 'Arm2'], ['Arm3', 'Arm2']]
>>> from mab2rec import BanditRecommender, LearningPolicy
>>> decisions = ['Arm1', 'Arm1', 'Arm3', 'Arm1', 'Arm2', 'Arm3']
>>> rewards = [0, 1, 1, 0, 1, 0]
>>> contexts = [[0, 0, 0], [1, 0, 1], [0, 1, 1], [0, 0, 0], [1, 1, 1], [0, 1, 0]]
>>> rec = BanditRecommender(LearningPolicy.LinGreedy(epsilon=0.1), top_k=2)
>>> rec.fit(decisions, rewards, contexts)
>>> rec.recommend([[1, 1, 0], [1, 1, 1], [0, 1, 0]])
[['Arm2', 'Arm1'], ['Arm2', 'Arm1'], ['Arm2', 'Arm3']]
>>> arm_to_features = {'Arm1': [0, 1], 'Arm2': [0, 0], 'Arm3': [0, 0], 'Arm4': [0, 1]}
>>> rec.add_arm('Arm4')
>>> rec.warm_start(arm_to_features, distance_quantile=0.75)
>>> rec.recommend([[1, 1, 0], [1, 1, 1], [0, 1, 0]])
[['Arm2', 'Arm4'], ['Arm2', 'Arm4'], ['Arm2', 'Arm3']]
add_arm(arm: Arm, binarizer=None) None

Adds an _arm_ to the list of arms.

Incorporates the arm into the learning and neighborhood policies with no training data.

Parameters
  • arm (Arm) – The new arm to be added.

  • binarizer (Callable, default=None) – The new binarizer function for Thompson Sampling.

Return type

Returns nothing.

fit(decisions: Union[List[Arm], numpy.ndarray, pandas.core.series.Series], rewards: Union[List[Union[int, float]], numpy.ndarray, pandas.core.series.Series], contexts: Union[None, List[List[Union[int, float]]], numpy.ndarray, pandas.core.series.Series, pandas.core.frame.DataFrame] = None) None

Fits the recommender to the given decisions, their corresponding rewards and contexts, if any. If the recommender arms have not been initialized using set_arms, the recommender arms will be set to the list of arms in decisions.

Validates arguments and raises exceptions in case there are violations.

This function makes the following assumptions:
  • each decision corresponds to an arm of the bandit.

  • there are no None, NaN, or Infinity values in the contexts.

Parameters
  • decisions (Union[List[Arm], np.ndarray, pd.Series]) – The decisions that are made.

  • rewards (Union[List[Num], np.ndarray, pd.Series]) – The rewards that are received corresponding to the decisions.

  • contexts (Union[None, List[List[Num]], np.ndarray, pd.Series, pd.DataFrame], default=None) – The context under which each decision is made.

Return type

Returns nothing.

partial_fit(decisions: Union[List[Arm], numpy.ndarray, pandas.core.series.Series], rewards: Union[List[Union[int, float]], numpy.ndarray, pandas.core.series.Series], contexts: Union[None, List[List[Union[int, float]]], numpy.ndarray, pandas.core.series.Series, pandas.core.frame.DataFrame] = None) None

Updates the recommender with the given decisions, their corresponding rewards and contexts, if any.

Validates arguments and raises exceptions in case there are violations.

This function makes the following assumptions:
  • each decision corresponds to an arm of the bandit.

  • there are no None, NaN, or Infinity values in the contexts.

Parameters
  • decisions (Union[List[Arm], np.ndarray, pd.Series]) – The decisions that are made.

  • rewards (Union[List[Num], np.ndarray, pd.Series]) – The rewards that are received corresponding to the decisions.

  • contexts (Union[None, List[List[Num]], np.ndarray, pd.Series, pd.DataFrame], default=None) – The context under which each decision is made.

Return type

Returns nothing.

predict(contexts: Union[None, List[List[Union[int, float]]], numpy.ndarray, pandas.core.series.Series, pandas.core.frame.DataFrame] = None) Union[Arm, List[Arm]]

Returns the “best” arm (or arms list if multiple contexts are given) based on the expected reward.

The definition of the best depends on the specified learning policy. Contextual learning policies and neighborhood policies require contexts data in training. In testing, they return the best arm given new context(s).

Parameters

contexts (Union[None, List[List[Num]], np.ndarray, pd.Series, pd.DataFrame], default=None) – The context under which each decision is made. If contexts is not None for context-free bandits, the predictions returned will be a list of the same length as contexts.

Return type

The recommended arm or recommended arms list.

predict_expectations(contexts: Union[None, List[List[Union[int, float]]], numpy.ndarray, pandas.core.series.Series, pandas.core.frame.DataFrame] = None) Union[Dict[Arm, Union[int, float]], List[Dict[Arm, Union[int, float]]]]

Returns a dictionary of arms (key) to their expected rewards (value).

Contextual learning policies and neighborhood policies require contexts data for expected rewards.

Parameters

contexts (Union[None, List[Num], List[List[Num]], np.ndarray, pd.Series, pd.DataFrame], default=None) – The context for the expected rewards. If contexts is not None for context-free bandits, the predicted expectations returned will be a list of the same length as contexts.

Return type

The dictionary of arms (key) to their expected rewards (value), or a list of such dictionaries.

recommend(contexts: Union[None, List[List[Union[int, float]]], numpy.ndarray, pandas.core.series.Series, pandas.core.frame.DataFrame] = None, excluded_arms: Optional[List[List[Arm]]] = None, return_scores: bool = False) Union[List[Arm], Tuple[List[Arm], List[Union[int, float]]], List[List[Arm]], Tuple[List[List[Arm]], List[List[Union[int, float]]]]]

Generate _top-k_ recommendations based on the expected reward.

Recommend up to k arms with the highest predicted expectations. For contextual bandits, only items not included in the excluded arms can be recommended.

Parameters
  • contexts (np.ndarray, default=None) – The context under which each decision is made. If contexts is not None for context-free bandits, the recommendations returned will be a list of the same length as contexts.

  • excluded_arms (List[List[Arm]], default=None) – List of list of arms to exclude from recommended arms.

  • return_scores (bool, default=False) – Return score for each recommended item.

Return type

List of tuples of the form ([arm_1, arm_2, …, arm_k], [score_1, score_2, …, score_k])
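The top-k selection described above can be illustrated with a minimal sketch in plain Python. It is not the library's implementation: the helper name `top_k_arms` and the example scores are hypothetical, and it only mirrors the documented behavior of ranking arms by expected reward while skipping excluded arms.

```python
from typing import Dict, List, Optional, Tuple


def top_k_arms(expectations: Dict[str, float],
               k: int,
               excluded: Optional[List[str]] = None) -> Tuple[List[str], List[float]]:
    """Return up to k arms with the highest expected reward, skipping excluded arms."""
    blocked = set(excluded or [])
    # Keep only eligible arms, then sort them by score in descending order.
    eligible = [(arm, score) for arm, score in expectations.items() if arm not in blocked]
    eligible.sort(key=lambda pair: pair[1], reverse=True)
    top = eligible[:k]
    return [arm for arm, _ in top], [score for _, score in top]


# Hypothetical expected rewards for a single user.
expectations = {"Arm1": 0.2, "Arm2": 0.9, "Arm3": 0.5}
arms, scores = top_k_arms(expectations, k=2, excluded=["Arm2"])
```

Fewer than k arms are returned when exclusions leave fewer than k eligible arms, which matches the "up to k" wording above.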

remove_arm(arm: Arm) None

Removes an _arm_ from the list of arms.

Parameters

arm (Arm) – The existing arm to be removed.

Return type

Returns nothing.

set_arms(arms: List[Arm], binarizer=None) None

Initializes the recommender and sets the recommender with given list of arms. Existing arms not in the given list of arms are removed and new arms are incorporated into the learning and neighborhood policies with no training data. If the recommender has already been initialized it will not be re-initialized.

Parameters
  • arms (List[Arm]) – The new list of arms.

  • binarizer (Callable, default=None) – The new binarizer function for Thompson Sampling.

Return type

Returns nothing.

warm_start(arm_to_features: Dict[Arm, List[Union[int, float]]], distance_quantile: Optional[float] = None) None

Warm-start untrained (cold) arms of the multi-armed bandit.

Validates arguments and raises exceptions in case there are violations.

Parameters
  • arm_to_features (Dict[Arm, List[Num]]) – Numeric representation for each arm.

  • distance_quantile (float, default=None) – Value between 0 and 1 used to determine if an item can be warm started or not using closest item. All cold items will be warm started if 1 and none will be warm started if 0.

Return type

Returns nothing.

LearningPolicy

class mab2rec.LearningPolicy
class EpsilonGreedy(epsilon: Union[int, float] = 0.1)

Epsilon Greedy Learning Policy.

This policy selects the arm with the highest expected reward with probability 1 - \(\epsilon\), and with probability \(\epsilon\) it selects an arm at random for exploration.
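As a rough illustration of this selection rule, the sketch below implements epsilon-greedy over a dictionary of mean rewards using only the standard library. The function name `epsilon_greedy_select` is hypothetical, and the sketch does not reproduce MABWiser's internal random number handling.

```python
import random
from typing import Dict


def epsilon_greedy_select(mean_rewards: Dict[str, float],
                          epsilon: float,
                          rng: random.Random) -> str:
    """Explore with probability epsilon, otherwise exploit the best mean reward."""
    if rng.random() < epsilon:
        # Exploration: pick any arm uniformly at random.
        return rng.choice(sorted(mean_rewards))
    # Exploitation: pick the arm with the highest mean reward.
    return max(mean_rewards, key=mean_rewards.get)


# Mean rewards computed from decisions ['Arm1', 'Arm1', 'Arm2', 'Arm1']
# and rewards [20, 17, 25, 9].
mean_rewards = {"Arm1": (20 + 17 + 9) / 3, "Arm2": 25.0}
rng = random.Random(0)
greedy_pick = epsilon_greedy_select(mean_rewards, epsilon=0.0, rng=rng)
```

With `epsilon=0.0` the rule is purely greedy; with `epsilon=1.0` every call explores.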

epsilon

The probability of selecting a random arm for exploration. Integer or float. Must be between 0 and 1. Default value is 0.1.

Type

Num

Example

>>> from mabwiser.mab import MAB, LearningPolicy
>>> arms = ['Arm1', 'Arm2']
>>> decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
>>> rewards = [20, 17, 25, 9]
>>> mab = MAB(arms, LearningPolicy.EpsilonGreedy(epsilon=0.25), seed=123456)
>>> mab.fit(decisions, rewards)
>>> mab.predict()
'Arm1'
epsilon: Union[int, float]

Alias for field number 0

class LinGreedy(epsilon: Union[int, float] = 0.1, l2_lambda: Union[int, float] = 1.0, scale: bool = False)

LinGreedy Learning Policy.

This policy trains a ridge regression for each arm. Then, given a context, it predicts a regression value. This policy selects the arm with the highest regression value with probability 1 - \(\epsilon\), and with probability \(\epsilon\) it selects an arm at random for exploration.
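The following sketch shows the idea for contexts with exactly two features, so that the ridge solution can be written in closed form without a linear algebra library. It assumes a ridge regression without an intercept, \(\beta = (X^{T}X + \lambda I)^{-1}X^{T}y\); the helper names and the toy data are hypothetical and the code is not taken from MABWiser.

```python
import random
from typing import Dict, Sequence, Tuple


def ridge_fit_2d(contexts: Sequence[Tuple[float, float]],
                 rewards: Sequence[float],
                 l2_lambda: float = 1.0) -> Tuple[float, float]:
    """Closed-form ridge coefficients for two features (no intercept, an assumption)."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x1, x2), y in zip(contexts, rewards):
        # Accumulate X^T X and X^T y.
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    # Add the regularization term lambda * I.
    a11 += l2_lambda
    a22 += l2_lambda
    det = a11 * a22 - a12 * a12
    # Solve the 2x2 system with the explicit inverse.
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


def lin_greedy_select(betas: Dict[str, Tuple[float, float]],
                      context: Tuple[float, float],
                      epsilon: float,
                      rng: random.Random) -> str:
    """Pick a random arm with probability epsilon, else the highest prediction."""
    if rng.random() < epsilon:
        return rng.choice(sorted(betas))
    predictions = {arm: b[0] * context[0] + b[1] * context[1] for arm, b in betas.items()}
    return max(predictions, key=predictions.get)


# One ridge model per arm, each trained on that arm's own (toy) observations.
betas = {
    "Arm1": ridge_fit_2d([(1, 0), (0, 1)], [2, 4]),
    "Arm2": ridge_fit_2d([(1, 0), (0, 1)], [4, 2]),
}
choice = lin_greedy_select(betas, (0, 1), epsilon=0.0, rng=random.Random(0))
```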

epsilon

The probability of selecting a random arm for exploration. Integer or float. Must be between 0 and 1. Default value is 0.1.

Type

Num

l2_lambda

The regularization strength. Integer or float. Cannot be negative. Default value is 1.0.

Type

Num

scale

Whether to scale features to have zero mean and unit variance. Uses StandardScaler in sklearn.preprocessing. Default value is False.

Type

bool

Example

>>> from mabwiser.mab import MAB, LearningPolicy
>>> list_of_arms = ['Arm1', 'Arm2']
>>> decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
>>> rewards = [20, 17, 25, 9]
>>> contexts = [[0, 1, 2, 3], [1, 2, 3, 0], [2, 3, 1, 0], [3, 2, 1, 0]]
>>> mab = MAB(list_of_arms, LearningPolicy.LinGreedy(epsilon=0.5))
>>> mab.fit(decisions, rewards, contexts)
>>> mab.predict([[3, 2, 0, 1]])
'Arm2'
epsilon: Union[int, float]

Alias for field number 0

l2_lambda: Union[int, float]

Alias for field number 1

scale: bool

Alias for field number 2

class LinTS(alpha: Union[int, float] = 1.0, l2_lambda: Union[int, float] = 1.0, scale: bool = False)

LinTS Learning Policy

For each arm LinTS trains a ridge regression and creates a multivariate normal distribution for the coefficients using the calculated coefficients as the mean and the covariance as:

\[\alpha^{2} (x_i^{T}x_i + \lambda * I_d)^{-1}\]

The normal distribution is randomly sampled to obtain expected coefficients for the ridge regression for each prediction.

\(\alpha\) is a factor used to adjust how conservative the estimate is. Higher \(\alpha\) values promote more exploration.

The multivariate normal distribution uses Cholesky decomposition to guarantee deterministic behavior. This method requires that the covariance is a positive definite matrix. To ensure this is the case, alpha and l2_lambda are required to be greater than zero.
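A minimal sketch of this sampling step for a single arm with two-feature contexts is shown below. It assumes a ridge regression without an intercept and writes out the 2x2 inverse and Cholesky factor by hand; the function name `lints_sample_2d` and the toy data are hypothetical, and this is not MABWiser's code.

```python
import math
import random
from typing import Sequence, Tuple


def lints_sample_2d(contexts: Sequence[Tuple[float, float]],
                    rewards: Sequence[float],
                    alpha: float,
                    l2_lambda: float,
                    rng: random.Random):
    """Sample ridge coefficients from N(mean, alpha^2 * (X^T X + lambda * I)^-1)."""
    a11 = a22 = float(l2_lambda)  # start from lambda * I
    a12 = b1 = b2 = 0.0
    for (x1, x2), y in zip(contexts, rewards):
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    det = a11 * a22 - a12 * a12
    # Explicit inverse of the symmetric 2x2 matrix A = X^T X + lambda * I.
    i11, i12, i22 = a22 / det, -a12 / det, a11 / det
    mean = (i11 * b1 + i12 * b2, i12 * b1 + i22 * b2)
    # Covariance alpha^2 * A^-1 and its lower-triangular Cholesky factor.
    c11, c12, c22 = alpha ** 2 * i11, alpha ** 2 * i12, alpha ** 2 * i22
    l11 = math.sqrt(c11)
    l21 = c12 / l11
    l22 = math.sqrt(c22 - l21 * l21)
    # Transform two independent standard normal draws: sample = mean + L z.
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    sample = (mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2)
    return sample, mean


sample, mean = lints_sample_2d([(1, 0), (0, 1)], [2, 4],
                               alpha=0.25, l2_lambda=1.0, rng=random.Random(0))
```

The square roots in the Cholesky step are only defined when both alpha and l2_lambda are positive, which is why the policy requires them to be greater than zero.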

alpha

The multiplier to determine the degree of exploration. Integer or float. Must be greater than zero. Default value is 1.0.

Type

Num

l2_lambda

The regularization strength. Integer or float. Must be greater than zero. Default value is 1.0.

Type

Num

scale

Whether to scale features to have zero mean and unit variance. Uses StandardScaler in sklearn.preprocessing. Default value is False.

Type

bool

Example

>>> from mabwiser.mab import MAB, LearningPolicy
>>> list_of_arms = ['Arm1', 'Arm2']
>>> decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
>>> rewards = [20, 17, 25, 9]
>>> contexts = [[0, 1, 2, 3], [1, 2, 3, 0], [2, 3, 1, 0], [3, 2, 1, 0]]
>>> mab = MAB(list_of_arms, LearningPolicy.LinTS(alpha=0.25))
>>> mab.fit(decisions, rewards, contexts)
>>> mab.predict([[3, 2, 0, 1]])
'Arm2'
alpha: Union[int, float]

Alias for field number 0

l2_lambda: Union[int, float]

Alias for field number 1

scale: bool

Alias for field number 2

class LinUCB(alpha: Union[int, float] = 1.0, l2_lambda: Union[int, float] = 1.0, scale: bool = False)

LinUCB Learning Policy.

This policy trains a ridge regression for each arm. Then, given a context, it predicts a regression value and calculates the upper confidence bound of that prediction. The arm with the highest upper bound is selected.

The UCB for each arm is calculated as:

\[UCB = x_i \beta + \alpha \sqrt{(x_i^{T}x_i + \lambda * I_d)^{-1}x_i}\]

Where \(\beta\) is the matrix of the ridge regression coefficients, \(\lambda\) is the regularization strength, and \(I_d\) is a \(d \times d\) identity matrix where \(d\) is the number of features in the context data.

\(\alpha\) is a factor used to adjust how conservative the estimate is. Higher \(\alpha\) values promote more exploration.

alpha

The parameter to control the exploration. Integer or float. Cannot be negative. Default value is 1.0.

Type

Num

l2_lambda

The regularization strength. Integer or float. Cannot be negative. Default value is 1.0.

Type

Num

scale

Whether to scale features to have zero mean and unit variance. Uses StandardScaler in sklearn.preprocessing. Default value is False.

Type

bool

Example

>>> from mabwiser.mab import MAB, LearningPolicy
>>> list_of_arms = ['Arm1', 'Arm2']
>>> decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
>>> rewards = [20, 17, 25, 9]
>>> contexts = [[0, 1, 2, 3], [1, 2, 3, 0], [2, 3, 1, 0], [3, 2, 1, 0]]
>>> mab = MAB(list_of_arms, LearningPolicy.LinUCB(alpha=1.25))
>>> mab.fit(decisions, rewards, contexts)
>>> mab.predict([[3, 2, 0, 1]])
'Arm2'
alpha: Union[int, float]

Alias for field number 0

l2_lambda: Union[int, float]

Alias for field number 1

scale: bool

Alias for field number 2

class Popularity

Randomized Popularity Learning Policy.

Returns a randomized popular arm for each prediction. The probability of selection for each arm is weighted by their mean reward. It assumes that the rewards are non-negative.

The probability of selection is calculated as:

\[P(arm) = \frac{ \mu_i } { \Sigma{ \mu } }\]

where \(\mu_i\) is the mean reward for that arm.
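The selection probabilities from this formula can be computed directly, as in the sketch below. The helper name `popularity_probabilities` is hypothetical, and the sketch assumes non-negative rewards with at least one positive mean so that the normalizer is non-zero.

```python
from typing import Dict, Hashable, Sequence


def popularity_probabilities(decisions: Sequence[Hashable],
                             rewards: Sequence[float]) -> Dict[Hashable, float]:
    """P(arm) = mean reward of the arm divided by the sum of all mean rewards."""
    totals: Dict[Hashable, float] = {}
    counts: Dict[Hashable, int] = {}
    for arm, reward in zip(decisions, rewards):
        totals[arm] = totals.get(arm, 0.0) + reward
        counts[arm] = counts.get(arm, 0) + 1
    means = {arm: totals[arm] / counts[arm] for arm in totals}
    normalizer = sum(means.values())  # assumed to be positive
    return {arm: mean / normalizer for arm, mean in means.items()}


probs = popularity_probabilities(['Arm1', 'Arm1', 'Arm2', 'Arm1'], [20, 17, 25, 9])
```

For this data Arm2 has the higher mean reward (25 versus about 15.3), so it receives the larger selection probability, but Arm1 can still be drawn.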

Example

>>> from mabwiser.mab import MAB, LearningPolicy
>>> list_of_arms = ['Arm1', 'Arm2']
>>> decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
>>> rewards = [20, 17, 25, 9]
>>> mab = MAB(list_of_arms, LearningPolicy.Popularity())
>>> mab.fit(decisions, rewards)
>>> mab.predict()
'Arm1'
class Random

Random Learning Policy.

Returns a random arm for each prediction. Each arm is selected uniformly at random.

Example

>>> from mabwiser.mab import MAB, LearningPolicy
>>> list_of_arms = ['Arm1', 'Arm2']
>>> decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
>>> rewards = [20, 17, 25, 9]
>>> mab = MAB(list_of_arms, LearningPolicy.Random())
>>> mab.fit(decisions, rewards)
>>> mab.predict()
'Arm2'
class Softmax(tau: Union[int, float] = 1)

Softmax Learning Policy.

This policy selects each arm with a probability that increases with its average reward. The probabilities are obtained by applying a softmax function to the average rewards:

\[P(arm) = \frac{ e ^ \frac{\mu_i - \max{\mu}}{ \tau } } { \Sigma{e ^ \frac{\mu - \max{\mu}}{ \tau }} }\]

where \(\mu_i\) is the mean reward for that arm and \(\tau\) is the “temperature” to determine the degree of exploration.
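The formula can be evaluated with a few lines of standard-library Python, as sketched below. The function name `softmax_probabilities` and the toy mean rewards are hypothetical; subtracting the maximum mean matches the formula above and keeps the exponentials from overflowing.

```python
import math
from typing import Dict


def softmax_probabilities(mean_rewards: Dict[str, float], tau: float = 1.0) -> Dict[str, float]:
    """Softmax over mean rewards with temperature tau (tau must be positive)."""
    max_mean = max(mean_rewards.values())
    # Shift by the maximum so the largest exponent is exactly zero.
    weights = {arm: math.exp((mu - max_mean) / tau) for arm, mu in mean_rewards.items()}
    total = sum(weights.values())
    return {arm: weight / total for arm, weight in weights.items()}


probs = softmax_probabilities({"Arm1": 1.0, "Arm2": 2.0}, tau=1.0)
flat_probs = softmax_probabilities({"Arm1": 1.0, "Arm2": 2.0}, tau=1000.0)
```

A large temperature pushes the probabilities toward uniform (more exploration), while a small temperature concentrates probability on the arm with the highest mean.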

tau

The temperature to control the exploration. Integer or float. Must be greater than zero. Default value is 1.

Type

Num

Example

>>> from mabwiser.mab import MAB, LearningPolicy
>>> list_of_arms = ['Arm1', 'Arm2']
>>> decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
>>> rewards = [20, 17, 25, 9]
>>> mab = MAB(list_of_arms, LearningPolicy.Softmax(tau=1))
>>> mab.fit(decisions, rewards)
>>> mab.predict()
'Arm2'
tau: Union[int, float]

Alias for field number 0

class ThompsonSampling(binarizer: Optional[Callable] = None)

Thompson Sampling Learning Policy.

This policy creates a beta distribution for each arm and then randomly samples from these distributions. The arm with the highest sample value is selected.

Notice that rewards must be binary to create beta distributions. If rewards are not binary, see the binarizer function.

binarizer

If rewards are not binary, a binarizer function is required. Given an arm decision and its corresponding reward, the binarizer function returns True/False or 0/1 to denote whether the decision counts as a success, i.e., True/1 based on the reward or False/0 otherwise.

The function signature of the binarizer is:

binarize(arm: Arm, reward: Num) -> True/False or 0/1

Type

Callable
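The sketch below shows how a binarizer could feed a beta-distribution sampler. It is an illustration rather than the library's implementation: the function name `thompson_select` is hypothetical, and the choice of Beta(successes + 1, failures + 1), i.e. a uniform prior, is an assumption.

```python
import random
from typing import Callable, Dict, Sequence, Tuple

arm_to_threshold = {'Arm1': 10, 'Arm2': 10}


def binarize(arm: str, reward: float) -> bool:
    """A reward counts as a success when it exceeds the arm's threshold."""
    return reward > arm_to_threshold[arm]


def thompson_select(decisions: Sequence[str],
                    rewards: Sequence[float],
                    binarizer: Callable[[str, float], bool],
                    rng: random.Random) -> Tuple[str, Dict[str, int], Dict[str, int]]:
    """Count binarized successes and failures, then sample one value per arm."""
    successes: Dict[str, int] = {}
    failures: Dict[str, int] = {}
    for arm, reward in zip(decisions, rewards):
        successes.setdefault(arm, 0)
        failures.setdefault(arm, 0)
        if binarizer(arm, reward):
            successes[arm] += 1
        else:
            failures[arm] += 1
    # Assumed prior: Beta(1, 1), so the posterior is Beta(successes + 1, failures + 1).
    samples = {arm: rng.betavariate(successes[arm] + 1, failures[arm] + 1) for arm in successes}
    return max(samples, key=samples.get), successes, failures


choice, successes, failures = thompson_select(
    ['Arm1', 'Arm1', 'Arm2', 'Arm1'], [10, 20, 15, 7], binarize, random.Random(0))
```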

Example

>>> from mabwiser.mab import MAB, LearningPolicy
>>> list_of_arms = ['Arm1', 'Arm2']
>>> decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
>>> rewards = [1, 1, 1, 0]
>>> mab = MAB(list_of_arms, LearningPolicy.ThompsonSampling())
>>> mab.fit(decisions, rewards)
>>> mab.predict()
'Arm2'
>>> from mabwiser.mab import MAB, LearningPolicy
>>> list_of_arms = ['Arm1', 'Arm2']
>>> arm_to_threshold = {'Arm1':10, 'Arm2':10}
>>> decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
>>> rewards = [10, 20, 15, 7]
>>> def binarize(arm, reward): return reward > arm_to_threshold[arm]
>>> mab = MAB(list_of_arms, LearningPolicy.ThompsonSampling(binarizer=binarize))
>>> mab.fit(decisions, rewards)
>>> mab.predict()
'Arm2'
binarizer: Callable

Alias for field number 0

class UCB1(alpha: Union[int, float] = 1)

Upper Confidence Bound1 Learning Policy.

This policy calculates an upper confidence bound for the mean reward of each arm. It greedily selects the arm with the highest upper confidence bound.

The UCB for each arm is calculated as:

\[UCB = \mu_i + \alpha \times \sqrt[]{\frac{2 \times log(N)}{n_i}}\]

Where \(\mu_i\) is the mean for that arm, \(N\) is the total number of trials, and \(n_i\) is the number of times the arm has been selected.

\(\alpha\) is a factor used to adjust how conservative the estimate is. Higher \(\alpha\) values promote more exploration.
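To make the formula concrete, the sketch below computes the upper confidence bound of each arm from a list of decisions and rewards. The function name `ucb1_scores` is hypothetical, and the natural logarithm is assumed for \(log(N)\).

```python
import math
from typing import Dict, Hashable, Sequence


def ucb1_scores(decisions: Sequence[Hashable],
                rewards: Sequence[float],
                alpha: float = 1.0) -> Dict[Hashable, float]:
    """UCB = mean_i + alpha * sqrt(2 * log(N) / n_i) for every observed arm."""
    totals: Dict[Hashable, float] = {}
    counts: Dict[Hashable, int] = {}
    for arm, reward in zip(decisions, rewards):
        totals[arm] = totals.get(arm, 0.0) + reward
        counts[arm] = counts.get(arm, 0) + 1
    n_total = len(decisions)  # N: total number of trials
    return {
        arm: totals[arm] / counts[arm]
        + alpha * math.sqrt(2 * math.log(n_total) / counts[arm])
        for arm in totals
    }


scores = ucb1_scores(['Arm1', 'Arm1', 'Arm2', 'Arm1'], [20, 17, 25, 9], alpha=1.25)
best_arm = max(scores, key=scores.get)
```

Arm2 has both the higher mean and the wider confidence interval here because it was tried only once, so it gets the highest bound.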

alpha

The parameter to control the exploration. Integer or float. Cannot be negative. Default value is 1.

Type

Num

Example

>>> from mabwiser.mab import MAB, LearningPolicy
>>> list_of_arms = ['Arm1', 'Arm2']
>>> decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
>>> rewards = [20, 17, 25, 9]
>>> mab = MAB(list_of_arms, LearningPolicy.UCB1(alpha=1.25))
>>> mab.fit(decisions, rewards)
>>> mab.predict()
'Arm2'
alpha: Union[int, float]

Alias for field number 0

NeighborhoodPolicy

class mab2rec.NeighborhoodPolicy
class Clusters(n_clusters: Union[int, float] = 2, is_minibatch: bool = False)

Clusters Neighborhood Policy.

Clusters is a k-means clustering approach that uses the observations from the closest cluster with a learning policy. Supports KMeans and MiniBatchKMeans.

n_clusters

The number of clusters. Integer. Must be at least 2. Default value is 2.

Type

Num

is_minibatch

Boolean flag to use MiniBatchKMeans or not. Default value is False.

Type

bool

Example

>>> from mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy
>>> list_of_arms = [1, 2, 3, 4]
>>> decisions = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]
>>> rewards = [0, 1, 1, 0, 0, 0, 0, 1, 1, 1]
>>> contexts = [[0, 1, 2, 3, 5], [1, 1, 1, 1, 1], [0, 0, 1, 0, 0], [0, 2, 2, 3, 5], [1, 3, 1, 1, 1], [0, 0, 0, 0, 0], [0, 1, 4, 3, 5], [0, 1, 2, 4, 5], [1, 2, 1, 1, 3], [0, 2, 1, 0, 0]]
>>> mab = MAB(list_of_arms, LearningPolicy.EpsilonGreedy(epsilon=0), NeighborhoodPolicy.Clusters(3))
>>> mab.fit(decisions, rewards, contexts)
>>> mab.predict([[0, 1, 2, 3, 5], [1, 1, 1, 1, 1]])
[3, 1]
is_minibatch: bool

Alias for field number 1

n_clusters: Union[int, float]

Alias for field number 0

class KNearest(k: int = 1, metric: str = 'euclidean')

KNearest Neighborhood Policy.

KNearest is a nearest neighbors approach that selects the k-nearest observations to be used with a learning policy.
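The neighbor selection step can be sketched as follows, using Euclidean distance from the standard library. The helper name `k_nearest_indices` is hypothetical, and breaking distance ties by the lower row index is an assumption of this sketch rather than documented behavior.

```python
import math
from typing import List, Sequence


def k_nearest_indices(contexts: Sequence[Sequence[float]],
                      query: Sequence[float],
                      k: int) -> List[int]:
    """Return the row indices of the k training contexts closest to the query."""
    distances = [(math.dist(row, query), idx) for idx, row in enumerate(contexts)]
    # Sorting tuples orders by distance first and by row index on ties.
    distances.sort()
    return [idx for _, idx in distances[:k]]


decisions = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]
rewards = [0, 1, 1, 0, 0, 0, 0, 1, 1, 1]
contexts = [[0, 1, 2, 3, 5], [1, 1, 1, 1, 1], [0, 0, 1, 0, 0], [0, 2, 2, 3, 5],
            [1, 3, 1, 1, 1], [0, 0, 0, 0, 0], [0, 1, 4, 3, 5], [0, 1, 2, 4, 5],
            [1, 2, 1, 1, 3], [0, 2, 1, 0, 0]]

neighbors = k_nearest_indices(contexts, [0, 1, 2, 3, 5], k=2)
# Only the neighbors' (decision, reward) pairs are handed to the learning policy.
neighborhood = [(decisions[i], rewards[i]) for i in neighbors]
```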

k

The number of neighbors to select. Integer value. Must be greater than zero. Default value is 1.

Type

int

metric

The metric used to calculate distance. Accepts any of the metrics supported by scipy.spatial.distance.cdist. Default value is Euclidean distance.

Type

str

Example

>>> from mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy
>>> list_of_arms = [1, 2, 3, 4]
>>> decisions = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]
>>> rewards = [0, 1, 1, 0, 0, 0, 0, 1, 1, 1]
>>> contexts = [[0, 1, 2, 3, 5], [1, 1, 1, 1, 1], [0, 0, 1, 0, 0], [0, 2, 2, 3, 5], [1, 3, 1, 1, 1], [0, 0, 0, 0, 0], [0, 1, 4, 3, 5], [0, 1, 2, 4, 5], [1, 2, 1, 1, 3], [0, 2, 1, 0, 0]]
>>> mab = MAB(list_of_arms, LearningPolicy.EpsilonGreedy(epsilon=0), NeighborhoodPolicy.KNearest(2, "euclidean"))
>>> mab.fit(decisions, rewards, contexts)
>>> mab.predict([[0, 1, 2, 3, 5], [1, 1, 1, 1, 1]])
[1, 1]
k: int

Alias for field number 0

metric: str

Alias for field number 1

class LSHNearest(n_dimensions: int = 5, n_tables: int = 3, no_nhood_prob_of_arm: Optional[List] = None)

Locality-Sensitive Hashing Approximate Nearest Neighbors Policy.

LSHNearest is a nearest neighbors approach that uses locality sensitive hashing with a simhash to select observations to be used with a learning policy.

For the simhash, contexts are projected onto a hyperplane of n_context_cols x n_dimensions and each column of the hyperplane is evaluated for its sign, giving an ordered array of binary values. This is converted to a base 10 integer used as the hash code to assign the context to a hash table. This process is repeated for a specified number of hash tables, where each has a unique, randomly-generated hyperplane. To select the neighbors for a context, the hash code is calculated for each hash table and any contexts with the same hashes are selected as the neighbors.

As with the radius or k value for other nearest neighbors algorithms, selecting the best number of dimensions and tables requires tuning. For the dimensions, a good starting point is to use the base-2 log of the square root of the number of rows in the training data. This will give you sqrt(n_rows) possible hash codes.

The number of dimensions and the number of tables have opposite effects on the number of empty neighborhoods and the average neighborhood size. Increasing the dimensionality decreases the number of collisions, which increases the precision of the approximate neighborhood but also potentially increases the number of empty neighborhoods. Increasing the number of hash tables increases the likelihood of capturing neighbors the other random hyperplanes miss and increases the average neighborhood size. It should be noted that the fit operation is O(2**n_dimensions).
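A small sketch of the simhash step for one hash table is given below. It is illustrative only: the helper names are hypothetical, drawing the hyperplane entries from a standard normal distribution is an assumption, and `suggested_dimensions` simply evaluates the base-2 log of the square root of the number of rows as a starting point for tuning.

```python
import math
import random
from typing import List, Sequence


def make_hyperplane(n_context_cols: int, n_dimensions: int, rng: random.Random) -> List[List[float]]:
    """Random projection matrix of shape n_context_cols x n_dimensions (Gaussian, assumed)."""
    return [[rng.gauss(0, 1) for _ in range(n_dimensions)] for _ in range(n_context_cols)]


def simhash(context: Sequence[float], hyperplane: List[List[float]]) -> int:
    """Project the context, keep the sign of each column, read the bits as an integer."""
    code = 0
    for j in range(len(hyperplane[0])):
        projection = sum(x * hyperplane[i][j] for i, x in enumerate(context))
        bit = 1 if projection > 0 else 0
        code = (code << 1) | bit  # append the bit to the binary code
    return code


def suggested_dimensions(n_rows: int) -> int:
    """Starting point for n_dimensions: log2(sqrt(n_rows)), at least 1."""
    return max(1, round(math.log2(math.sqrt(n_rows))))


rng = random.Random(0)
hyperplane = make_hyperplane(n_context_cols=5, n_dimensions=5, rng=rng)
context = [0, 1, 2, 3, 5]
code = simhash(context, hyperplane)
```

Contexts that produce the same code in a table fall into the same bucket, and repeating the procedure with several independent hyperplanes yields the multiple tables described above.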

n_dimensions

The number of dimensions to use for the hyperplane. Integer value. Must be greater than zero. Default value is 5.

Type

int

n_tables

The number of hash tables. Integer value. Must be greater than zero. Default value is 3.

Type

int

no_nhood_prob_of_arm

The probabilities associated with each arm. Used to select random arm if context has no neighbors. If not given, a uniform random distribution over all arms is assumed. The probabilities should sum up to 1.

Type

None or List

Example

>>> from mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy
>>> list_of_arms = [1, 2, 3, 4]
>>> decisions = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]
>>> rewards = [0, 1, 1, 0, 0, 0, 0, 1, 1, 1]
>>> contexts = [[0, 1, 2, 3, 5], [1, 1, 1, 1, 1], [0, 0, 1, 0, 0], [0, 2, 2, 3, 5], [1, 3, 1, 1, 1], [0, 0, 0, 0, 0], [0, 1, 4, 3, 5], [0, 1, 2, 4, 5], [1, 2, 1, 1, 3], [0, 2, 1, 0, 0]]
>>> mab = MAB(list_of_arms, LearningPolicy.EpsilonGreedy(epsilon=0), NeighborhoodPolicy.LSHNearest(5, 3))
>>> mab.fit(decisions, rewards, contexts)
>>> mab.predict([[0, 1, 2, 3, 5], [1, 1, 1, 1, 1]])
[3, 1]
n_dimensions: int

Alias for field number 0

n_tables: int

Alias for field number 1

no_nhood_prob_of_arm: Optional[List]

Alias for field number 2

class Radius(radius: Union[int, float] = 0.05, metric: str = 'euclidean', no_nhood_prob_of_arm: Optional[List] = None)

Radius Neighborhood Policy.

Radius is a nearest neighborhood approach that selects the observations within a given radius to be used with a learning policy.
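The sketch below illustrates the radius-based selection together with the fallback used when a context has no neighbors. The helper names are hypothetical, and treating observations at exactly the radius as neighbors is an assumption of this sketch.

```python
import math
import random
from typing import List, Optional, Sequence


def radius_neighbors(contexts: Sequence[Sequence[float]],
                     query: Sequence[float],
                     radius: float) -> List[int]:
    """Indices of training contexts within the given Euclidean radius of the query."""
    return [idx for idx, row in enumerate(contexts) if math.dist(row, query) <= radius]


def fallback_arm(arms: Sequence[int],
                 probabilities: Optional[Sequence[float]],
                 rng: random.Random) -> int:
    """Pick a random arm for an empty neighborhood (uniform when no weights are given)."""
    if probabilities is None:
        probabilities = [1 / len(arms)] * len(arms)
    return rng.choices(list(arms), weights=list(probabilities), k=1)[0]


contexts = [[0, 1, 2, 3, 5], [1, 1, 1, 1, 1], [0, 0, 1, 0, 0], [0, 2, 2, 3, 5],
            [1, 3, 1, 1, 1], [0, 0, 0, 0, 0], [0, 1, 4, 3, 5], [0, 1, 2, 4, 5],
            [1, 2, 1, 1, 3], [0, 2, 1, 0, 0]]

neighbors = radius_neighbors(contexts, [0, 1, 2, 3, 5], radius=2)
far_neighbors = radius_neighbors(contexts, [10, 10, 10, 10, 10], radius=2)
# With no neighbors, fall back to the supplied per-arm probabilities.
arm = fallback_arm([1, 2, 3, 4], [0.0, 0.0, 1.0, 0.0], random.Random(0))
```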

radius

The maximum distance within which to select observations. Integer or float. Must be greater than zero. Default value is 0.05.

Type

Num

metric

The metric used to calculate distance. Accepts any of the metrics supported by scipy.spatial.distance.cdist. Default value is Euclidean distance.

Type

str

no_nhood_prob_of_arm

The probabilities associated with each arm. Used to select random arm if context has no neighbors. If not given, a uniform random distribution over all arms is assumed. The probabilities should sum up to 1.

Type

None or List

Example

>>> from mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy
>>> list_of_arms = [1, 2, 3, 4]
>>> decisions = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]
>>> rewards = [0, 1, 1, 0, 0, 0, 0, 1, 1, 1]
>>> contexts = [[0, 1, 2, 3, 5], [1, 1, 1, 1, 1], [0, 0, 1, 0, 0], [0, 2, 2, 3, 5], [1, 3, 1, 1, 1], [0, 0, 0, 0, 0], [0, 1, 4, 3, 5], [0, 1, 2, 4, 5], [1, 2, 1, 1, 3], [0, 2, 1, 0, 0]]
>>> mab = MAB(list_of_arms, LearningPolicy.EpsilonGreedy(epsilon=0), NeighborhoodPolicy.Radius(2, "euclidean"))
>>> mab.fit(decisions, rewards, contexts)
>>> mab.predict([[0, 1, 2, 3, 5], [1, 1, 1, 1, 1]])
[3, 1]
metric: str

Alias for field number 1

no_nhood_prob_of_arm: Optional[List]

Alias for field number 2

radius: Union[int, float]

Alias for field number 0

class TreeBandit(tree_parameters: Dict = {})

TreeBandit Neighborhood Policy.

This policy fits a decision tree for each arm using context history. It uses the leaves of these trees to partition the context space into regions and keeps a list of rewards for each leaf. To predict, it receives a context vector and goes to the corresponding leaf at each arm’s tree and applies the given context-free MAB learning policy to predict expectations and choose an arm.

The TreeBandit neighborhood policy is compatible with the following context-free learning policies only: EpsilonGreedy, ThompsonSampling and UCB1.

The TreeBandit neighborhood policy is a modified version of the TreeHeuristic algorithm presented in: Adam N. Elmachtoub, Ryan McNellis, Sechan Oh, and Marek Petrik, “A Practical Method for Solving Contextual Bandit Problems Using Decision Trees”, UAI 2017.

tree_parameters

Parameters of the decision tree. The keys must match the parameters of sklearn.tree.DecisionTreeRegressor. When a parameter is not given, the default parameters from sklearn.tree.DecisionTreeRegressor will be chosen. Default value is an empty dictionary.

Type

Dict, **kwarg

Example

>>> from mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy
>>> list_of_arms = ['Arm1', 'Arm2']
>>> decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
>>> rewards = [20, 17, 25, 9]
>>> contexts = [[0, 1, 2, 3], [1, 2, 3, 0], [2, 3, 1, 0], [3, 2, 1, 0]]
>>> mab = MAB(list_of_arms, LearningPolicy.EpsilonGreedy(epsilon=0), NeighborhoodPolicy.TreeBandit())
>>> mab.fit(decisions, rewards, contexts)
>>> mab.predict([[3, 2, 0, 1]])
'Arm2'
tree_parameters: Dict

Alias for field number 0

Pipeline

mab2rec.pipeline.benchmark(recommenders: Dict[str, mab2rec.rec.BanditRecommender], metrics: List[Union[jurity.recommenders.BinaryRecoMetrics, jurity.recommenders.RankingRecoMetrics]], train_data: Union[str, pandas.core.frame.DataFrame], test_data: Optional[Union[str, pandas.core.frame.DataFrame]] = None, cv: Optional[int] = None, user_features: Optional[Union[str, pandas.core.frame.DataFrame]] = None, user_features_list: Optional[Union[str, List[str]]] = None, user_features_dtypes: Optional[Union[str, Dict]] = None, item_features: Optional[Union[str, pandas.core.frame.DataFrame]] = None, item_list: Optional[Union[str, List[Arm]]] = None, item_eligibility: Optional[Union[str, pandas.core.frame.DataFrame]] = None, warm_start: bool = False, warm_start_distance: Optional[float] = None, user_id_col: str = 'user_id', item_id_col: str = 'item_id', response_col: str = 'response', batch_size: int = 100000, verbose: bool = False) Union[Tuple[Dict[str, pandas.core.frame.DataFrame], Dict[str, Dict[str, float]]], Tuple[List[Dict[str, pandas.core.frame.DataFrame]], List[Dict[str, Dict[str, float]]]]]

Benchmark Recommenders.

Benchmark a given set of recommender algorithms by training, scoring and evaluating each algorithm. If using cross-validation (cv), it benchmarks the algorithms on cv-many folds from the train data; otherwise, it trains on the train data and evaluates on the test data.
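The cv parameter is documented as using a grouped K-fold iterator so that the same user never appears in different folds. The sketch below illustrates that property with a simple round-robin assignment of users to folds. The function name `grouped_fold_ids` is hypothetical and the assignment scheme is an assumption chosen for brevity; the pipeline's actual fold assignment may differ.

```python
from typing import Dict, Hashable, List, Sequence


def grouped_fold_ids(user_ids: Sequence[Hashable], n_folds: int) -> List[int]:
    """Assign every row to a fold so that all rows of a user share the same fold."""
    fold_of_user: Dict[Hashable, int] = {}
    for user in user_ids:
        if user not in fold_of_user:
            # Round-robin over users in order of first appearance (illustrative only).
            fold_of_user[user] = len(fold_of_user) % n_folds
    return [fold_of_user[user] for user in user_ids]


# One entry per interaction row in the training data.
user_ids = ["u1", "u2", "u1", "u3", "u2", "u4"]
folds = grouped_fold_ids(user_ids, n_folds=2)

# Rows of fold f form the held-out set; all other rows form the training set.
held_out_rows = [i for i, fold in enumerate(folds) if fold == 0]
```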

Parameters
  • recommenders (Dict[str, BanditRecommender]) – The recommender algorithms to be benchmarked. Dictionary with names (key) and recommender algorithms (value).

  • metrics (List[Union[BinaryRecoMetrics, RankingRecoMetrics]]) – List of metrics used to evaluate recommendations.

  • train_data (Union[str, pd.DataFrame]) – Training data used to train recommenders. Data should have a row for each training sample (user_id, item_id, response). Column names should be consistent with user_id_col, item_id_col and response_col arguments. CSV format with file header or Data Frame.

  • test_data (Union[str, pd.DataFrame]) – Test data used to generate recommendations. Data should have a row for each test sample (user_id, item_id, response). Column names should be consistent with user_id_col, item_id_col and response_col arguments. CSV format with file header or Data Frame.

  • cv (int, default=None) – Number of folds in the train data to use for cross-fold validation. A grouped K-fold iterator is used to ensure that the same user is not contained in different folds. Test data must be None when using cv.

  • user_features (Union[str, pd.DataFrame], default=None) – User features containing features for each user_id. Each row should include user_id and list of features (user_id, u_1, u_2, …, u_p). CSV format with file header or Data Frame.

  • user_features_list (Union[str, List[str]], default=None) – List of user features to use. Must be a subset of features in (u_1, u_2, …, u_p). If None, all the features in user_features are used. CSV format with file header or List.

  • user_features_dtypes (Union[str, Dict], default=None) – Data type for each user feature. Maps each user feature name to a valid data type. If None, no data type casting is done upon load and data types are inferred by the pandas library. JSON format or Dictionary.

  • item_features (Union[str, pd.DataFrame], default=None) – Item features file containing features for each item_id. Each row should include item_id and list of features (item_id, i_1, i_2, …, i_q). CSV format with file header or Data Frame.

  • item_list (Union[str, List[Arm]], default=None) – List of items to train. If None, all the items in data are used. CSV format with file header or List.

  • item_eligibility (Union[str, pd.DataFrame], default=None) – Items each user is eligible for. Used to generate excluded_arms lists. If None, all the items can be evaluated for recommendation for each user. CSV format with file header or Data Frame.

  • warm_start (bool, default=False) – Whether to warm start untrained (cold) arms after training or not.

  • warm_start_distance (float, default=None) – Warm start distance quantile. Value between 0 and 1 used to determine if an item can be warm started or not using closest item. All cold items will be warm started if 1 and none will be warm started if 0. Must be specified if warm_start=True.

  • user_id_col (str, default=Constants.user_id) – User id column name.

  • item_id_col (str, default=Constants.item_id) – Item id column name.

  • response_col (str, default=Constants.response) – Response column name.

  • batch_size (int, default=100000) – Batch size used for chunking data.

  • verbose (bool, default=False) – Whether to print progress status or not.
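The grouped K-fold behavior described for cv can be sketched in plain Python. This is only an illustration of the idea that all rows of a user land in the same fold; the helper name grouped_k_fold and the greedy balancing strategy are assumptions, not Mab2Rec's implementation.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def grouped_k_fold(user_ids: Sequence[Hashable], n_folds: int) -> List[List[int]]:
    """Assign row indices to folds so that all rows of a user share one fold.

    Hypothetical helper for illustration; not Mab2Rec's implementation.
    """
    # Collect the row indices that belong to each user.
    rows_by_user: Dict[Hashable, List[int]] = defaultdict(list)
    for row_index, user_id in enumerate(user_ids):
        rows_by_user[user_id].append(row_index)

    # Greedily place the users with the most rows first into the smallest fold.
    folds: List[List[int]] = [[] for _ in range(n_folds)]
    for user_id in sorted(rows_by_user, key=lambda u: -len(rows_by_user[u])):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(rows_by_user[user_id])
    return folds


user_ids = ["u1", "u1", "u2", "u3", "u3", "u3", "u4"]
folds = grouped_k_fold(user_ids, n_folds=2)
for fold_number, test_rows in enumerate(folds):
    print(fold_number, sorted({user_ids[i] for i in test_rows}))
```

Each fold serves once as the held-out set while the remaining folds are used for training, so no user contributes rows to both sides of a split.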

Returns

Tuple with recommendations and evaluation metrics for each algorithm. The tuple values are lists of dictionaries if cross-validation is used, representing the results on each fold, and individual dictionaries otherwise.
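Because the metric results are either one nested dictionary or a list with one nested dictionary per fold, downstream code may want to handle both shapes uniformly. The sketch below averages metric values per algorithm; the function name average_metrics, the algorithm names, the metric key "AUC(score)@5" and the numbers are hypothetical placeholders.

```python
from typing import Dict, List, Union

MetricResults = Dict[str, Dict[str, float]]


def average_metrics(results: Union[MetricResults, List[MetricResults]]) -> MetricResults:
    """Average metric values per algorithm across folds.

    Accepts a single dictionary (train/test run) or a list with one
    dictionary per fold (cross-validation run).
    """
    folds = results if isinstance(results, list) else [results]
    averaged: MetricResults = {}
    for fold in folds:
        for algorithm, metrics in fold.items():
            for name, value in metrics.items():
                algorithm_metrics = averaged.setdefault(algorithm, {})
                algorithm_metrics[name] = algorithm_metrics.get(name, 0.0) + value / len(folds)
    return averaged


# Hypothetical per-fold results, shaped like the second element of the tuple.
fold_results = [
    {"Random": {"AUC(score)@5": 0.50}, "LinUCB": {"AUC(score)@5": 0.70}},
    {"Random": {"AUC(score)@5": 0.52}, "LinUCB": {"AUC(score)@5": 0.74}},
]
print(average_metrics(fold_results))
```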

mab2rec.pipeline.score(recommender: Union[str, mab2rec.rec.BanditRecommender], data: Union[str, pandas.core.frame.DataFrame], user_features: Optional[Union[str, pandas.core.frame.DataFrame]] = None, user_features_list: Optional[Union[str, List[str]]] = None, user_features_dtypes: Optional[Union[str, Dict]] = None, item_features: Optional[Union[str, pandas.core.frame.DataFrame]] = None, item_list: Optional[Union[str, List[Arm]]] = None, item_eligibility: Optional[Union[str, pandas.core.frame.DataFrame]] = None, warm_start: bool = False, warm_start_distance: Optional[float] = None, user_id_col: str = 'user_id', item_id_col: str = 'item_id', response_col: str = 'response', batch_size: int = 100000, save_file: Optional[Union[str, bool]] = None) pandas.core.frame.DataFrame

Score Recommender.

Generates top-k recommendations for users in given data.

Parameters
  • recommender (Union[str, BanditRecommender]) – The recommender algorithm to be scored. Could be an instantiated BanditRecommender or file path of serialized recommender in pickle file.

  • data (Union[str, pd.DataFrame]) – Data to score. Data should have a row for each sample (user_id, item_id, response). Column names should be consistent with user_id_col, item_id_col and response_col arguments. CSV format with file header or Data Frame.

  • user_features (Union[str, pd.DataFrame], default=None) – User features containing features for each user_id. Each row should include user_id and list of features (user_id, u_1, u_2, …, u_p). CSV format with file header or Data Frame.

  • user_features_list (Union[str, List[str]], default=None) – List of user features to use. Must be a subset of features in (u_1, u_2, …, u_p). If None, all the features in user_features are used. CSV format with file header or List.

  • user_features_dtypes (Union[str, Dict], default=None) – Data type for each user feature. Maps each user feature name to a valid data type. If None, no data type casting is done upon load and data types are inferred by the pandas library. JSON format or Dictionary.

  • item_features (Union[str, pd.DataFrame], default=None) – Item features file containing features for each item_id. Each row should include item_id and list of features (item_id, i_1, i_2, …, i_q). CSV format with file header or Data Frame.

  • item_list (Union[str, List[Arm]], default=None) – List of items to train. If None, all the items in data are used. CSV format with file header or List.

  • item_eligibility (Union[str, pd.DataFrame], default=None) – Items each user is eligible for. Used to generate excluded_arms lists. If None, all the items can be evaluated for recommendation for each user. CSV format with file header or Data Frame.

  • warm_start (bool, default=False) – Whether to warm start untrained (cold) arms after training or not.

  • warm_start_distance (float, default=None) – Warm start distance quantile. Value between 0 and 1 used to determine if an item can be warm started or not using closest item. All cold items will be warm started if 1 and none will be warm started if 0. Must be specified if warm_start=True.

  • user_id_col (str, default=Constants.user_id) – User id column name.

  • item_id_col (str, default=Constants.item_id) – Item id column name.

  • response_col (str, default=Constants.response) – Response column name.

  • batch_size (int, default=100000) – Batch size used for chunking data.

  • save_file (Union[str, bool], default=None) – File name to save recommender pickle. If None, recommender is not saved to file.

Returns

Scored recommendations.

Return type

pandas.DataFrame
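The interaction between item_eligibility, excluded arms and top-k selection can be sketched for a single user as follows. The helper top_k_eligible, the item names and the scores are hypothetical; the sketch only illustrates filtering out ineligible items before ranking and does not reproduce the library's scoring code.

```python
from typing import Dict, Hashable, List, Set


def top_k_eligible(scores: Dict[Hashable, float],
                   eligible: Set[Hashable],
                   k: int) -> List[Hashable]:
    """Return the k highest-scoring items among those the user is eligible for."""
    # Items the user is not eligible for form the excluded arms for this user.
    excluded_arms = {item for item in scores if item not in eligible}
    candidates = {item: s for item, s in scores.items() if item not in excluded_arms}
    # Rank the remaining items by score, best first, and keep the top k.
    return sorted(candidates, key=candidates.get, reverse=True)[:k]


item_scores = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.1}
print(top_k_eligible(item_scores, eligible={"b", "c", "d"}, k=2))
```

With no eligibility restriction, passing every item as eligible reduces this to a plain top-k ranking.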

mab2rec.pipeline.train(recommender: mab2rec.rec.BanditRecommender, data: Union[str, pandas.core.frame.DataFrame], user_features: Optional[Union[str, pandas.core.frame.DataFrame]] = None, user_features_list: Optional[Union[str, List[str]]] = None, user_features_dtypes: Optional[Union[str, Dict]] = None, item_features: Optional[Union[str, pandas.core.frame.DataFrame]] = None, item_list: Optional[Union[str, List[Arm]]] = None, item_eligibility: Optional[Union[str, pandas.core.frame.DataFrame]] = None, warm_start: bool = False, warm_start_distance: Optional[float] = None, user_id_col: str = 'user_id', item_id_col: str = 'item_id', response_col: str = 'response', batch_size: int = 100000, save_file: Optional[Union[str, bool]] = None) None

Trains Recommender.

Parameters
  • recommender (BanditRecommender) – The recommender algorithm to be trained. The recommender object is updated in-place.

  • data (Union[str, pd.DataFrame]) – Training data. Data should have a row for each training sample (user_id, item_id, response). Column names should be consistent with user_id_col, item_id_col and response_col arguments. CSV format with file header or Data Frame.

  • user_features (Union[str, pd.DataFrame], default=None) – User features containing features for each user_id. Each row should include user_id and list of features (user_id, u_1, u_2, …, u_p). CSV format with file header or Data Frame.

  • user_features_list (Union[str, List[str]], default=None) – List of user features to use. Must be a subset of features in (u_1, u_2, …, u_p). If None, all the features in user_features are used. CSV format with file header or List.

  • user_features_dtypes (Union[str, Dict], default=None) – Data type for each user feature. Maps each user feature name to a valid data type. If None, no data type casting is done upon load and data types are inferred by the pandas library. JSON format or Dictionary.

  • item_features (Union[str, pd.DataFrame], default=None) – Item features file containing features for each item_id. Each row should include item_id and list of features (item_id, i_1, i_2, …, i_q). CSV format with file header or Data Frame.

  • item_list (Union[str, List[Arm]], default=None) – List of items to train. If None, all the items in data are used. CSV format with file header or List.

  • item_eligibility (Union[str, pd.DataFrame], default=None) – Items each user is eligible for. Not used during training. CSV format with file header or Data Frame.

  • warm_start (bool, default=False) – Whether to warm start untrained (cold) arms after training or not.

  • warm_start_distance (float, default=None) – Warm start distance quantile. Value between 0 and 1 used to determine if an item can be warm started or not using closest item. All cold items will be warm started if 1 and none will be warm started if 0. Must be specified if warm_start=True.

  • user_id_col (str, default=Constants.user_id) – User id column name.

  • item_id_col (str, default=Constants.item_id) – Item id column name.

  • response_col (str, default=Constants.response) – Response column name.

  • batch_size (int, default=100000) – Batch size used for chunking data.

  • save_file (Union[str, bool], default=None) – File name to save recommender pickle. If None, recommender is not saved to file.

Returns

Nothing. The recommender object is updated in-place.

Return type

None
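As a rough illustration of what chunking by batch_size means, the sketch below splits training rows into consecutive batches. The helper iter_batches and the sample rows are hypothetical, and how the library consumes each batch internally is not shown here.

```python
from typing import Iterator, List, Sequence, Tuple

Row = Tuple[str, str, int]  # (user_id, item_id, response)


def iter_batches(rows: Sequence[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield consecutive chunks of at most batch_size rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])


training_rows: List[Row] = [
    ("u1", "a", 1), ("u1", "b", 0), ("u2", "a", 0), ("u2", "c", 1), ("u3", "b", 1),
]
for batch in iter_batches(training_rows, batch_size=2):
    print(len(batch), batch)
```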

Visualization

mab2rec.visualization.plot_inter_diversity_at_k(recommendation_results: Union[Dict[str, pandas.core.frame.DataFrame], List[Dict[str, pandas.core.frame.DataFrame]]], k_list: List[int], user_id_col: str = 'user_id', item_id_col: str = 'item_id', score_col: str = 'score', sample_size: Optional[float] = None, seed: int = 12345, num_runs: int = 10, n_jobs: int = 1, working_memory: Optional[int] = None, **kwargs)

Plots Inter-List Diversity values (y-axis) for different values of k (x-axis) for each of the benchmark algorithms.

Parameters
  • recommendation_results (Union[Dict[str, pd.DataFrame], List[Dict[str, pd.DataFrame]]]) – Dictionary or list of dictionaries with recommendation results returned by benchmark function.

  • k_list (List[int]) – List of top-k values to evaluate.

  • user_id_col (str, default=Constants.user_id) – User id column name.

  • item_id_col (str, default=Constants.item_id) – Item id column name.

  • score_col (str, default=Constants.score) – Recommendation score column name.

  • sample_size (float, default=None) – Proportion of users to randomly sample for evaluation. If None, no sampling is performed.

  • seed (int, default=Constants.default_seed) – The seed used to create random state.

  • num_runs (int, default=10) – Number of runs used to approximate Inter-List Diversity on smaller samples of users, which speeds up evaluation. The sample size is defined by sample_size. The final result is averaged over the runs.

  • n_jobs (int, default=1) – Number of jobs to use for computation in parallel, leveraged by sklearn.metrics.pairwise_distances_chunked. -1 means using all processors.

  • working_memory (Union[int, None], default=None) – Maximum memory for temporary distance matrix chunks, leveraged by sklearn.metrics.pairwise_distances_chunked. When None, the value of sklearn.get_config()['working_memory'], i.e. 1024M, is used.

  • **kwargs – Other parameters passed to sns.catplot.

Returns

ax – The plot with metric values.

Return type

matplotlib.axes.Axes
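For intuition about the plotted quantity, the sketch below computes an inter-list diversity value at a given k, assuming it is defined as the mean pairwise cosine distance between users' top-k lists encoded as binary item vectors. That definition, the helper name and the sample data are assumptions for illustration; the sketch does no sampling and no repeated runs.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, Hashable, List


def inter_list_diversity(recommendations: Dict[Hashable, List[Hashable]], k: int) -> float:
    """Mean pairwise cosine distance between users' top-k item sets."""
    top_k = {user: set(items[:k]) for user, items in recommendations.items() if items[:k]}
    if len(top_k) < 2:
        raise ValueError("need at least two users with non-empty recommendations")
    distances = []
    for user_a, user_b in combinations(top_k, 2):
        a, b = top_k[user_a], top_k[user_b]
        # For binary vectors, cosine similarity is |A ∩ B| / sqrt(|A| * |B|).
        similarity = len(a & b) / sqrt(len(a) * len(b))
        distances.append(1.0 - similarity)
    return sum(distances) / len(distances)


sample_recommendations = {"u1": ["a", "b"], "u2": ["a", "b"], "u3": ["c", "d"]}
for k in (1, 2):
    print(k, inter_list_diversity(sample_recommendations, k))
```

Identical lists across all users give 0, and fully disjoint lists give 1.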

mab2rec.visualization.plot_intra_diversity_at_k(recommendation_results: Union[Dict[str, pandas.core.frame.DataFrame], List[Dict[str, pandas.core.frame.DataFrame]]], item_features: pandas.core.frame.DataFrame, k_list: List[int], user_id_col: str = 'user_id', item_id_col: str = 'item_id', score_col: str = 'score', sample_size: Optional[float] = None, seed: int = 12345, n_jobs: int = 1, num_runs: int = 10, **kwargs)

Plots Intra-List Diversity values (y-axis) for different values of k (x-axis) for each of the benchmark algorithms.

Parameters
  • recommendation_results (Union[Dict[str, pd.DataFrame], List[Dict[str, pd.DataFrame]]]) – Dictionary or list of dictionaries with recommendation results returned by benchmark function.

  • item_features (pd.DataFrame) – Data frame with features for each item_id.

  • k_list (List[int]) – List of top-k values to evaluate.

  • user_id_col (str, default=Constants.user_id) – User id column name.

  • item_id_col (str, default=Constants.item_id) – Item id column name.

  • score_col (str, default=Constants.score) – Recommendation score column name.

  • sample_size (float, default=None) – Proportion of users to randomly sample for evaluation. If None, no sampling is performed.

  • seed (int, default=Constants.default_seed) – The seed used to create random state.

  • num_runs (int, default=10) – Number of runs used to approximate Intra-List Diversity on smaller samples of users, which speeds up evaluation. The sample size is defined by sample_size. The final result is averaged over the runs.

  • n_jobs (int, default=1) – Number of jobs to use for computation in parallel, leveraged by sklearn.metrics.pairwise_distances. -1 means using all processors.

  • **kwargs – Other parameters passed to sns.catplot.

Returns

ax – The plot with metric values.

Return type

matplotlib.axes.Axes
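The role of item_features in this plot can be illustrated with a small sketch, assuming intra-list diversity is the mean pairwise cosine distance between the feature vectors of the items in one user's top-k list, averaged over users. The definition, the helper names and the feature values are assumptions for illustration only.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, Hashable, List, Sequence


def cosine_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine distance between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return 1.0 - dot / norm


def intra_list_diversity(recommendations: Dict[Hashable, List[Hashable]],
                         item_features: Dict[Hashable, Sequence[float]],
                         k: int) -> float:
    """Average, over users, of the mean pairwise item distance within the top-k list."""
    per_user = []
    for items in recommendations.values():
        pairs = list(combinations(items[:k], 2))
        if not pairs:  # lists with fewer than two items have no pairs
            continue
        total = sum(cosine_distance(item_features[a], item_features[b]) for a, b in pairs)
        per_user.append(total / len(pairs))
    if not per_user:
        raise ValueError("no user has at least two recommended items")
    return sum(per_user) / len(per_user)


sample_features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 0.0]}
sample_lists = {"u1": ["a", "b"], "u2": ["a", "c"]}
print(intra_list_diversity(sample_lists, sample_features, k=2))
```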

mab2rec.visualization.plot_metrics_at_k(metric_results: Union[Dict[str, Dict[str, float]], List[Dict[str, Dict[str, float]]]], **kwargs)

Plots recommendation metric values (y-axis) for different values of k (x-axis) for each of the benchmark algorithms.

Parameters
  • metric_results (Union[Dict[str, Dict[str, float]], List[Dict[str, Dict[str, float]]]]) – Nested-dictionary or list of dictionaries with evaluation results returned by benchmark function.

  • **kwargs – Other parameters passed to sns.catplot.

Returns

ax – The plot with metric values.

Return type

matplotlib.axes.Axes
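To plot metric values against k, the nested metric results first need to be flattened into one row per algorithm, metric and k. The sketch below shows one way to do that, assuming metric keys follow a "name@k" pattern such as "AUC(score)@5"; that key format, the helper name and the values are assumptions, not confirmed library behavior.

```python
import re
from typing import Dict, List, Union

MetricResults = Dict[str, Dict[str, float]]
METRIC_AT_K = re.compile(r"^(?P<metric>.+)@(?P<k>\d+)$")


def to_long_form(metric_results: Union[MetricResults, List[MetricResults]]) -> List[dict]:
    """Flatten metric results into rows with fold, algorithm, metric, k and value."""
    folds = metric_results if isinstance(metric_results, list) else [metric_results]
    rows = []
    for fold_index, fold in enumerate(folds):
        for algorithm, metrics in fold.items():
            for name, value in metrics.items():
                match = METRIC_AT_K.match(name)
                if match is None:  # skip keys that do not end in "@<k>"
                    continue
                rows.append({
                    "fold": fold_index,
                    "algorithm": algorithm,
                    "metric": match.group("metric"),
                    "k": int(match.group("k")),
                    "value": value,
                })
    return rows


# Hypothetical metric values for two algorithms.
sample_metrics = {
    "Random": {"AUC(score)@3": 0.50, "AUC(score)@5": 0.51},
    "LinUCB": {"AUC(score)@3": 0.68, "AUC(score)@5": 0.71},
}
for row in to_long_form(sample_metrics):
    print(row)
```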

mab2rec.visualization.plot_num_items_per_recommendation(recommendation_results: Union[Dict[str, pandas.core.frame.DataFrame], List[Dict[str, pandas.core.frame.DataFrame]]], actual_results: pandas.core.frame.DataFrame, normalize: bool = False, user_id_col: str = 'user_id', **kwargs)

Plots the counts, or proportions if normalize is True, of the different numbers of items per recommendation.

Parameters
  • recommendation_results (Union[Dict[str, pd.DataFrame], List[Dict[str, pd.DataFrame]]]) – Dictionary or list of dictionaries with recommendation results returned by benchmark function.

  • actual_results (pd.DataFrame) – Test data frame used to generate recommendations. Data should have a row for each sample (user_id, item_id, response).

  • normalize (bool, default=False) – Whether to normalize the number of items to be proportions such that they add to 1.

  • user_id_col (str, default=Constants.user_id) – User id column name.

  • **kwargs – Other parameters passed to sns.catplot.

Returns

ax – The plot with counts or proportions for different number of items per recommendation.

Return type

matplotlib.axes.Axes
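A minimal sketch of the underlying tally, assuming the plot counts how many items each user's recommendation contains and then counts users per list length. The helper name and the sample rows are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple


def num_items_distribution(rows: List[Tuple[Hashable, Hashable]],
                           normalize: bool = False) -> Dict[int, float]:
    """Count users by the number of items in their recommendation.

    rows holds one (user_id, item_id) pair per recommended item.
    """
    items_per_user = Counter(user for user, _ in rows)
    distribution = Counter(items_per_user.values())
    if normalize:
        # Convert counts to proportions that add to 1.
        total = sum(distribution.values())
        return {n: count / total for n, count in sorted(distribution.items())}
    return dict(sorted(distribution.items()))


recommendation_rows = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u3", "a"), ("u3", "c")]
print(num_items_distribution(recommendation_rows))
print(num_items_distribution(recommendation_rows, normalize=True))
```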

mab2rec.visualization.plot_personalization_heatmap(recommendation_results: Union[Dict[str, pandas.core.frame.DataFrame], List[Dict[str, pandas.core.frame.DataFrame]]], user_to_cluster: Dict[Union[int, str], int], k: int, user_id_col: str = 'user_id', item_id_col: str = 'item_id', figsize: Optional[Tuple[int, int]] = None, **kwargs)

Plot heatmaps to visualize level of personalization, by calculating the distribution of recommendations by item within different user clusters.

Parameters
  • recommendation_results (Union[Dict[str, pd.DataFrame], List[Dict[str, pd.DataFrame]]]) – Dictionary or list of dictionaries with recommendation results returned by benchmark function.

  • user_to_cluster (Dict[Union[int, str], int]) – Mapping from user_id to cluster. Clusters could be derived from clustering algorithm such as KMeans or defined based on specific user features (e.g. age bands)

  • k (int) – Top-k recommendations to evaluate.

  • user_id_col (str, default=Constants.user_id) – User id column name.

  • item_id_col (str, default=Constants.item_id) – Item id column name.

  • figsize (Tuple[int, int], default=None) – Figure size of heatmap set using plt.figure()

  • **kwargs – Other parameters passed to sns.catplot.

Returns

ax – The heatmaps with the distribution of recommendations by item within each user cluster.

Return type

matplotlib.axes.Axes
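The quantity behind each heatmap cell can be sketched as the share of a cluster's top-k recommendations that goes to each item. The helper name, the cluster assignment and the recommendation lists below are hypothetical, and the plotting itself is omitted.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List


def cluster_item_distribution(recommendations: Dict[Hashable, List[Hashable]],
                              user_to_cluster: Dict[Hashable, int],
                              k: int) -> Dict[int, Dict[Hashable, float]]:
    """Proportion of top-k recommendations per item within each user cluster."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for user, items in recommendations.items():
        counts[user_to_cluster[user]].update(items[:k])
    distribution = {}
    for cluster, item_counts in counts.items():
        total = sum(item_counts.values())
        distribution[cluster] = {item: n / total for item, n in item_counts.items()}
    return distribution


cluster_of_user = {"u1": 0, "u2": 0, "u3": 1}
lists_by_user = {"u1": ["a", "b"], "u2": ["a", "c"], "u3": ["c", "d"]}
print(cluster_item_distribution(lists_by_user, cluster_of_user, k=2))
```

If every cluster shows the same distribution, the recommendations are not personalized across clusters; clearly different rows indicate that clusters receive different items.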

Plots recommendation counts (y-axis) versus actual counts or average responses (x-axis) for each item.

Parameters
  • recommendation_results (Union[Dict[str, pd.DataFrame], List[Dict[str, pd.DataFrame]]]) – Dictionary or list of dictionaries with recommendation results returned by benchmark function.

  • actual_results (pd.DataFrame) – Test data frame used to generate recommendations. Data should have a row for each sample (user_id, item_id, response).

  • k (int) – Top-k recommendations to evaluate.

  • average_response (bool, default=False) – Whether to plot the average response/reward or not.

  • user_id_col (str, default=Constants.user_id) – User id column name.

  • item_id_col (str, default=Constants.item_id) – Item id column name.

  • response_col (str, default=Constants.response) – Response column name.

  • **kwargs – Other parameters passed to sns.relplot.

Returns

ax – The plot with recommended counts.

Return type

matplotlib.axes.Axes
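The per-item values compared in this plot can be sketched as follows: how often an item appears in the top-k recommendations, how often it appears in the actual data, and its average response there. The helper name and the sample data are hypothetical, and the plotting step is left out.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Optional, Tuple


def recommended_vs_actual(recommendations: Dict[Hashable, List[Hashable]],
                          actual_rows: List[Tuple[Hashable, Hashable, float]],
                          k: int) -> Dict[Hashable, Dict[str, Optional[float]]]:
    """Per-item recommended count, actual count and average actual response."""
    recommended = Counter(item for items in recommendations.values() for item in items[:k])
    actual = Counter(item for _, item, _ in actual_rows)
    response_sum: Dict[Hashable, float] = defaultdict(float)
    for _, item, response in actual_rows:
        response_sum[item] += response

    summary = {}
    for item in sorted(set(recommended) | set(actual), key=str):
        summary[item] = {
            "recommended_count": recommended[item],
            "actual_count": actual[item],
            # Average response is undefined for items absent from the actual data.
            "average_response": response_sum[item] / actual[item] if actual[item] else None,
        }
    return summary


top_lists = {"u1": ["a", "b"], "u2": ["a", "c"]}
actual_data = [("u1", "a", 1), ("u2", "a", 0), ("u1", "b", 1), ("u3", "d", 0)]
for item, values in recommended_vs_actual(top_lists, actual_data, k=2).items():
    print(item, values)
```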

Plots recommendation counts (y-axis) for different items (x-axis) for each of the benchmark algorithms. Only the top_n_items with the most recommendations for each algorithm are shown.

Parameters
  • recommendation_results (Union[Dict[str, pd.DataFrame], List[Dict[str, pd.DataFrame]]]) – Dictionary or list of dictionaries with recommendation results returned by benchmark function.

  • k (int) – Top-k recommendations to evaluate.

  • top_n_items (int, default=None) – Top-n number of items based on number of recommendations to plot.

  • normalize (bool, default=False) – Whether to normalize the counts per item to be proportions such that they add to 1.

  • user_id_col (str, default=Constants.user_id) – User id column name.

  • item_id_col (str, default=Constants.item_id) – Item id column name.

  • **kwargs – Other parameters passed to sns.catplot.

Returns

ax – The plot with recommended counts by item.

Return type

matplotlib.axes.Axes
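A minimal sketch of the counting behind this plot for one algorithm. It assumes that normalization is applied over all items before the top_n_items cut, which the description does not state; the helper name and the sample lists are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Optional, Tuple


def counts_by_item(recommendations: Dict[Hashable, List[Hashable]],
                   k: int,
                   top_n_items: Optional[int] = None,
                   normalize: bool = False) -> List[Tuple[Hashable, float]]:
    """Recommendation count (or proportion) per item, most recommended first."""
    counts = Counter(item for items in recommendations.values() for item in items[:k])
    total = sum(counts.values())
    # most_common(None) returns every item; otherwise only the top_n_items.
    top = counts.most_common(top_n_items)
    if normalize:
        # Assumption: proportions are taken over all items, not only the top n.
        return [(item, n / total) for item, n in top]
    return top


lists_per_user = {"u1": ["a", "b"], "u2": ["a", "c"], "u3": ["a", "b"]}
print(counts_by_item(lists_per_user, k=2, top_n_items=2))
print(counts_by_item(lists_per_user, k=2, top_n_items=2, normalize=True))
```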