Dec 3, 2019 by Thibault Debatty | 1469 views
Similarity search is an essential component of machine learning algorithms. However, performing efficient similarity search can be extremely challenging, especially if the dataset is distributed between multiple computers, and even more if the similarity measure is not a metric. With the rise of Big Data processing, these challenging datasets are actually more and more common. In this presentation we show how k nearest neighbors (k-nn) graphs can be used to perform similarity search, clustering and anomaly detection.
You can find the complete presentation on slideshare :