Fast Approximate Point Set Matching for Information Retrieval

Rapha?, Benjamin Sach, Fast Approximate Point Set Matching for Information Retrieval. SOFSEM 2007. ISBN 978-3-540-69506-6, pp. 212–223. January 2007. PDF, 446 Kbytes.


We investigate randomised algorithms for subset matching with spatial point sets---given two sets of d-dimensional points: a data set T consisting of n points and a pattern P consisting of m points, find the largest match for a subset of the pattern in the data set. This problem is known to be 3-SUM hard and so unlikely to be solvable exactly in subquadratic time. We present an efficient bit-parallel O(nm) time algorithm and an O(n log(m)) time solution based on correlation calculations using fast Fourier transforms. Both methods are shown experimentally to give answers within a few percent of the exact solution and provide a considerable practical speedup over existing deterministic algorithms.

Bibtex entry.

Contact details

Publication Admin