Dec 3, 2019 by Thibault Debatty | 1628 views
Today we are presenting distributed k nearest neighbors (k-nn) graphs and similarity search algorithms at the ULB.
Similarity search is an essential component of machine learning algorithms. However, performing efficient similarity search can be extremely challenging, especially if the dataset is distributed between multiple computers, and even more if the similarity measure is not a metric. With the rise of Big Data processing, these challenging datasets are actually more and more common. In this presentation we show how k nearest neighbors (k-nn) graphs can be used to perform similarity search, clustering and anomaly detection.
You can find the complete presentation on slideshare :