2019-12-16 Custom C++ TopK#
It started with the fact the python runtime for the AdaBoostRegressor was quite slow. I noticed three operators were quite slow even though their implementation was based on numpy: TopK, ArrayFeatureExtractor and GatherElement. I made a custom implementation of the first two.
Unexpectedly, the query “topk c++ implementation” did not returned very interesting results. Maybe this paper Efficient Top-K Query Processing on Massively Parallel Hardware but that was not what I was looking for. I ended up writing my own implementation of this operator in C++ which you can find here: topk_element_min. It was faster than the previous python implementation totally inspired from scikit-learn but I was wondering how much faster. The answer is in the notebook Fast TopK elements. Worst case, it is twice faster than numpy best case, ten times, for small values of nearest neighbors.
I started to look into scikit-learn issues to see of that kind of issues was already raised. I found a link to this Nearest-neighbor chain algorithm. However, the gain of making such change would only help the brute force algorithm which is not really used to search for neighbors.