Random MAB
- class gobrec.mabs.random_mab.RandomMAB(seed: int | None = None)
Random Multi-Armed Bandit (MAB) Algorithm.
This class implements a random MAB algorithm that generates random scores for each arm. It can be used as a baseline for comparison with more sophisticated MAB algorithms.
Examples
A simple example using RandomMAB to generate random scores for items.
>>> import numpy as np
>>> from gobrec.mabs import RandomMAB
>>> random_mab = RandomMAB(seed=2)
>>> random_mab.fit(
...     contexts=np.array([[0, 1],[0, 1], [1, 0], [1, 0]]),
...     decisions=np.array([0, 1, 0, 1]),
...     rewards=np.array([1, 0, 0, 1])
... )
>>> random_mab.predict(np.array([[0, 1],[0, 1], [1, 0], [1, 0]]))
tensor([[0.2616, 0.2985],
        [0.8142, 0.0919],
        [0.6001, 0.7286],
        [0.1879, 0.0551]], dtype=torch.float64)
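To illustrate why random scores serve as a baseline, the following sketch estimates the top-1 hit rate that purely random scoring achieves. It uses NumPy only and does not call gobrec; the ground-truth array `true_arm`, the sample count, and the number of arms are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_arms = 20_000, 5  # illustrative sizes, not taken from gobrec

# Hypothetical ground truth: the arm that would have been rewarded per sample.
true_arm = rng.integers(0, n_arms, size=n_samples)

# Random baseline: independent uniform scores for every context-arm pair.
scores = rng.random((n_samples, n_arms))

# Recommend the highest-scoring arm and measure how often it is the true one.
hit_rate = (scores.argmax(axis=1) == true_arm).mean()

print(f"random top-1 hit rate: {hit_rate:.3f} (chance level: {1 / n_arms:.3f})")
```

The measured hit rate should sit close to `1 / n_arms`, which is the level a learned bandit is expected to beat.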
Methods
fit(contexts, decisions, rewards): Fit the RandomMAB algorithm with contexts, decisions, and rewards.
predict(contexts): Predict random scores for each arm given the contexts.
reset(): Reset the RandomMAB algorithm to its initial state.
- fit(contexts: ndarray, decisions: ndarray, rewards: ndarray)
Fit the RandomMAB algorithm with contexts, decisions, and rewards.
This method only updates the label encoder and internal state. The random algorithm does not learn anything from the data.
- Parameters:
- contexts : np.ndarray
A 2D array of shape (n_samples, n_features) representing the contexts.
- decisions : np.ndarray
A 1D array of item IDs (arms or decisions) of shape (n_samples,). Elements can be strings or integers.
- rewards : np.ndarray
A 1D array of rewards (ratings) of shape (n_samples,). Values can be integers or floats.
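Because `decisions` may hold strings or integers, fitting needs a way to map raw item IDs to arm indices. The sketch below shows one common way to do this with `np.unique`. The helper `encode_decisions` is hypothetical and is not gobrec's label encoder; it only illustrates the idea.

```python
import numpy as np

def encode_decisions(decisions):
    """Hypothetical helper: map raw item IDs to contiguous arm indices."""
    # np.unique returns the sorted distinct IDs and, with return_inverse=True,
    # the position of each original element within that sorted array.
    arms, encoded = np.unique(decisions, return_inverse=True)
    return arms, encoded

arms, encoded = encode_decisions(np.array(["b", "a", "b", "c"]))

# arms holds one entry per distinct item, so len(arms) is the number of arms.
# encoded has the same length as the input and indexes into arms.
n_arms = len(arms)
```

With this mapping, `arms[encoded]` reproduces the original IDs, so predicted column indices can be translated back to item IDs.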
- predict(contexts: ndarray)
Predict random scores for each arm given the contexts.
- Parameters:
- contexts : np.ndarray
A 2D array where each row represents the context features for which predictions are to be made.
- Returns:
- scores : torch.Tensor
A 2D tensor of shape (n_samples, n_arms) where each element is a random score for the corresponding context-arm pair.
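The shape contract of the returned scores can be sketched with NumPy as a stand-in. The function `random_scores` is hypothetical, and drawing uniformly from [0, 1) is an assumption; gobrec itself returns a `torch.Tensor`, so the values below will not match the example output shown earlier.

```python
import numpy as np

def random_scores(contexts, n_arms, seed=None):
    """Hypothetical stand-in: one random score per context-arm pair."""
    rng = np.random.default_rng(seed)
    n_samples = contexts.shape[0]
    # Assumption: scores are uniform in [0, 1); the context values are ignored.
    return rng.random((n_samples, n_arms))

contexts = np.array([[0, 1], [0, 1], [1, 0], [1, 0]])
scores = random_scores(contexts, n_arms=2, seed=2)

# Row i scores every arm for context i; argmax picks the recommended arm.
best_arm = scores.argmax(axis=1)
```

Passing the same seed yields the same scores, which keeps baseline comparisons reproducible.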