Abstract
The fair Max-Min diversification problem for arbitrary metric spaces and Euclidean spaces has been previously studied by Moumoulidou et al. [ICDT 2021] and Addanki et al. [ICDT 2022]. The problem arises from the need to downsize a dataset while maintaining a diverse and representative (fair) subset. In this paper, we extend the problem to the setting of extended metric spaces. We are given $n$ points in an extended metric space $\left(\mathcal{U},d\right)$, where the triangle inequality is generalized to $c \cdot d(x,y) \leq d(x,z) + d(y,z)$ for $c\in(0,2]$, and each point belongs to a group $i\in[m]$. The task is to select $k_i$ points from each group $i$ so as to maximize the minimum distance between the selected points. Selecting $k_i$ points from each group $i$ preserves fairness, while maximizing the minimum pairwise distance guarantees diversity in the sample set. We present an algorithm for refined metric spaces (i.e., extended metric spaces with distortion factor $c \in (1,2]$) that achieves a $\frac{c^m+c-2}{(c-1)c^m}$-approximation, analogous to the $(m+1)$-approximation algorithm for regular metric spaces (i.e., $c=1$) introduced by Addanki et al. [ICDT 2022]. Furthermore, we examine both refined ($c \in (1,2]$) and relaxed ($c \in (0,1)$) metric spaces in the case $m=2$. By leveraging the 4-approximation algorithm for regular metric spaces proposed by Moumoulidou et al. [ICDT 2021], we show that a $\frac{4}{c^2}$-approximation can be achieved for extended metrics with $c\in(0,2]$. A potential application of our work lies in its ability to handle high-dimensional datasets. For regular metric spaces, Moumoulidou et al. [ICDT 2021] showed that achieving an approximation ratio better than $2$ is infeasible unless $P=NP$. Notably, when $c > 1.5$, our algorithm achieves an approximation ratio better than $2$ in such refined metric spaces.
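As a quick consistency check (our own illustration, not a claim taken from the paper body), the stated ratio $r(c,m) = \frac{c^m+c-2}{(c-1)c^m}$ behaves as expected at the boundary values of $c$: letting $c \to 1^+$ recovers the $m+1$ factor of the regular-metric algorithm, and for $c > 1.5$ the ratio falls below $2$ for every $m \geq 1$.
% Sanity check of the stated approximation ratio; illustrative only.
\begin{align*}
  r(c,m) &= \frac{c^m + c - 2}{(c-1)\,c^m}
          = \frac{1}{c-1} + \frac{c-2}{(c-1)\,c^m},\\
  \lim_{c \to 1^+} r(c,m)
         &= \lim_{c \to 1^+} \frac{m c^{m-1} + 1}{c^m + (c-1)\, m c^{m-1}}
          = m + 1
          && \text{(L'H\^{o}pital; matches the $c=1$ bound)},\\
  r(c,m) &\leq \frac{1}{c-1} < 2
          && \text{for } c \in (1.5, 2],\ m \geq 1,
             \text{ since } \tfrac{c-2}{(c-1)\,c^m} \leq 0.
\end{align*}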