gigl.src.common.models.layers.calculate_in_batch_candidate_sampling_probability#

gigl.src.common.models.layers.count_min_sketch.calculate_in_batch_candidate_sampling_probability(frequency_tensor: LongTensor, total_cnt: int, batch_size: int) → Tensor#

Calculate the in-batch negative sampling rate given the frequency tensor, total count, and batch size. Please see https://www.tensorflow.org/extras/candidate_sampling.pdf for more details. Here we estimate the negative sampling probability Q(y|x):

P(candidate in batch | x) ~= P(candidate in batch)
                           = 1 - P(candidate not in batch)
                           = 1 - P(candidate not in any position in batch)
                          ~= 1 - (1 - frequency / total_cnt) ^ batch_size
                          ~= 1 - (1 - batch_size * frequency / total_cnt)
                           = batch_size * frequency / total_cnt
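As a sanity check on the first-order step above (not part of the library, just an illustrative calculation), we can compare the exact expression 1 - (1 - p)^B against the linearized approximation B * p for a small per-position probability p = frequency / total_cnt:

```python
# Illustrative check of 1 - (1 - p)^B ~= B * p when B * p << 1.
p = 1e-4   # per-position inclusion probability, frequency / total_cnt
B = 512    # batch size

exact = 1 - (1 - p) ** B   # exact probability the candidate appears in the batch
approx = B * p             # first-order (linearized) approximation

# By Bernoulli's inequality, (1 - p)^B >= 1 - B*p, so the exact value
# is always at most the approximation; for B * p ~ 0.05 they agree closely.
print(f"exact={exact:.5f} approx={approx:.5f}")
```

When B * p approaches 1 the two diverge, which is exactly why the function caps the result, as described below.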

The last approximation only holds when frequency / total_cnt << 1, which may not be true at the very beginning of training; thus, we cap the probability at 1.0. Note that the estimate for positives and hard negatives may be less accurate than for random negatives, because the error in the approximation P(candidate in batch | x) ~= P(candidate in batch) is larger for them.
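A minimal sketch of the computation described above, assuming a simple `torch.clamp` cap; the actual implementation in `count_min_sketch.py` may differ:

```python
import torch


def calculate_in_batch_candidate_sampling_probability(
    frequency_tensor: torch.Tensor, total_cnt: int, batch_size: int
) -> torch.Tensor:
    # Q(y|x) ~= batch_size * frequency / total_cnt, per the derivation above.
    sampling_prob = batch_size * frequency_tensor.float() / total_cnt
    # Cap at 1.0: the linear approximation can exceed a valid probability
    # when frequency / total_cnt is not small (e.g. early in training).
    return torch.clamp(sampling_prob, max=1.0)
```

For example, with `total_cnt=100` and `batch_size=4`, a candidate seen 1 time gets probability 0.04, while a candidate seen 50 times would yield 2.0 and is capped to 1.0.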