Performance enhancement of RGB image convolution using convolution kernel clustering algorithm for ARM64 processor architecture
DOI:
https://doi.org/10.18372/2073-4751.81.20144
Keywords:
convolution operation, NEON64, ARM64, SIMD optimization, vectorization, RGB images, convolution kernel clustering, digital image processing, sparse matrices, OpenCV
Abstract
The paper presents a method for improving the performance of the RGB image convolution operation on the ARM64 platform using a convolution kernel element clustering algorithm. The proposed approach vectorizes the computations with NEON64 SIMD instructions and groups non-zero kernel elements of the same sign, so that operations on zero elements are skipped efficiently. A mathematical model of the vectorized convolution operation has been developed that takes into account the specifics of sparse convolution kernel matrices. An experimental study on the Orange Pi 5 Pro platform demonstrated significant acceleration compared to the cv::filter2D() function of the OpenCV library: a speedup of 5.0–9.7 times for medium-sized kernels (7×7 to 11×11) and of 1.7–5.5 times for large kernels (12×12 to 15×15). The proposed method is particularly effective for processing high-resolution images and can be applied in real-time systems on single-board computers with limited computational resources.
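The grouping idea described in the abstract can be illustrated with a minimal scalar sketch. This is not the authors' implementation: the names `Tap`, `ClusteredKernel`, `clusterKernel`, `convolveClustered`, and `convolveDense` are hypothetical, the kernel is assumed to hold integers, borders are not handled, and no NEON64 intrinsics are used. The sketch only shows how zero kernel elements can be dropped up front and how the remaining elements can be split into a positive and a negative group, each accumulated separately and combined with a single subtraction. Like cv::filter2D(), it computes correlation (the kernel is not flipped).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; not taken from the paper.
struct Tap {
    int dy, dx;   // offset of the kernel element inside the kernel window
    int weight;   // magnitude |k| of the kernel element
};

struct ClusteredKernel {
    std::vector<Tap> positive;  // elements with k > 0
    std::vector<Tap> negative;  // elements with k < 0 (magnitude stored)
};

// Split a dense ksize x ksize kernel into two sign groups; zeros are dropped.
ClusteredKernel clusterKernel(const std::vector<int>& kernel, int ksize) {
    ClusteredKernel ck;
    for (int y = 0; y < ksize; ++y) {
        for (int x = 0; x < ksize; ++x) {
            const int k = kernel[y * ksize + x];
            if (k > 0)      ck.positive.push_back({y, x, k});
            else if (k < 0) ck.negative.push_back({y, x, -k});
        }
    }
    return ck;
}

// "Valid" correlation over an interleaved 8-bit RGB image using the sign groups.
// Only non-zero kernel elements are visited.
std::vector<int> convolveClustered(const std::vector<std::uint8_t>& img,
                                   int width, int height,
                                   const ClusteredKernel& ck, int ksize) {
    const int outW = width - ksize + 1;
    const int outH = height - ksize + 1;
    std::vector<int> out(static_cast<std::size_t>(outW) * outH * 3, 0);
    for (int y = 0; y < outH; ++y) {
        for (int x = 0; x < outW; ++x) {
            for (int c = 0; c < 3; ++c) {
                int pos = 0, neg = 0;  // both sums stay non-negative
                for (const Tap& t : ck.positive)
                    pos += t.weight * img[((y + t.dy) * width + (x + t.dx)) * 3 + c];
                for (const Tap& t : ck.negative)
                    neg += t.weight * img[((y + t.dy) * width + (x + t.dx)) * 3 + c];
                out[(y * outW + x) * 3 + c] = pos - neg;
            }
        }
    }
    return out;
}

// Dense reference that visits every kernel element, including zeros.
std::vector<int> convolveDense(const std::vector<std::uint8_t>& img,
                               int width, int height,
                               const std::vector<int>& kernel, int ksize) {
    const int outW = width - ksize + 1;
    const int outH = height - ksize + 1;
    std::vector<int> out(static_cast<std::size_t>(outW) * outH * 3, 0);
    for (int y = 0; y < outH; ++y)
        for (int x = 0; x < outW; ++x)
            for (int c = 0; c < 3; ++c) {
                int acc = 0;
                for (int ky = 0; ky < ksize; ++ky)
                    for (int kx = 0; kx < ksize; ++kx)
                        acc += kernel[ky * ksize + kx] *
                               img[((y + ky) * width + (x + kx)) * 3 + c];
                out[(y * outW + x) * 3 + c] = acc;
            }
    return out;
}
```

For a 3×3 Laplacian-like kernel with four zero corners, `clusterKernel` yields one positive and four negative elements, so the clustered loop performs five multiply-accumulates per channel instead of nine. In a SIMD version, the accumulation over neighbouring pixels would presumably be carried out in vector registers, but that part is outside the scope of this sketch.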
References
Prystavka P. O., Shevchenko A. K. Study of the implementation of the linear convolution operator for digital images with 16-bit computations (in Ukrainian). Problems in Programming. 2016. No. 2-3. P. 207–217. DOI: 10.15421/431608.
Shevchenko A., Tymchyshyn V. A SIMD-based approach to the enhancement of convolution operation performance. International Workshop on Conflict Management in Global Information Networks (CMiGIN 2019) : proceedings, Lviv, Ukraine, November 29, 2019. P. 447–458. URL: https://ceur-ws.org/Vol-2588/paper37.pdf
Shevchenko A., Prystavka P., Tymchyshyn V. Research on Possible Convolution Operation Speed Enhancement via AArch64 SIMD. Lecture Notes on Data Engineering and Communications Technologies. Vol. 134. Advances in Computer Science for Engineering and Education / ed. by Z. Hu et al. 2022. P. 61–75. DOI: 10.1007/978-3-031-04812-8_6.
Fog A. Optimizing software in C++: An optimization guide for Windows, Linux and Mac platforms. Copenhagen : Copenhagen University College of Engineering, 2024. URL: https://www.agner.org/optimize/optimizing_cpp.pdf (access date: 26.05.2025).
Universal intrinsics / OpenCV 4.x Main Documentation. URL: https://docs.opencv.org/4.x/d6/dd1/tutorial_univ_intrin.html (access date: 26.05.2025).
HAL (Hardware Acceleration Layer) Explanations / OpenCV GSoC 2016 ideas ; GitHub. URL: https://github.com/opencv/opencv/wiki/GSoC_2016_ideas_HAL_Explanations (access date: 26.05.2025).
License
The scientific journal adheres to the principles of Open Access and provides free, immediate, and permanent access to all published materials without financial, technical, or legal barriers for readers.
All articles are published in Open Access under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Copyright
Authors who publish their works in the journal:
- retain the copyright to their publications;
- grant the journal the right of first publication of the article;
- agree to the distribution of their materials under the CC BY 4.0 license;
- have the right to reuse, archive, and distribute their works (including in institutional and subject repositories), provided that proper reference is made to the original publication in the journal.




