A METHOD FOR IMPROVING SPEECH RECOGNITION EFFICIENCY BASED ON GENETIC OPTIMIZATION OF WAVELET FUNCTIONS

Oleksandr Lavrynenko

doi:10.18372/2310-5461.70.21200

Authors

Oleksandr Lavrynenko State University "Kyiv Aviation Institute", Kyiv, Ukraine https://orcid.org/0000-0002-7738-161X

Keywords:

adaptive wavelet analysis, Akima splines, genetic algorithms, convolutional neural networks, speech recognition, parallel computing, digital signal processing

Abstract

This article addresses the problem of keeping speech recognition systems reliable under noise interference. Its scientific novelty lies in a method for synthesizing an optimal adaptive wavelet kernel for the initial layers of convolutional neural networks. Unlike existing approaches, which rely on stochastic weight initialization or on fixed analytical basis functions (such as Meyer, Daubechies, or Symlet wavelets), the proposed algorithm shapes the kernel geometry directly using Akima interpolation splines. The study centers on optimizing the wavelet shape: the objective function is the mean squared error between the amplitude-frequency response of the synthesized filter and the energy spectrum of a specific speech signal. This function is minimized over the multidimensional space of spline parameters by a modified parallel genetic algorithm. Evolutionary search helps avoid the local extrema typical of non-convex objective surfaces when searching for the optimal ordinates of the spline nodes. The article analyzes the algorithm's convergence up to the 50th generation and assesses computational efficiency as a function of the number of processor cores used. Comparative modeling results confirm the superiority of adaptive kernels over classical analytical wavelets. In particular, embedding the optimal adaptive filter in a convolutional classifier increased speech recognition accuracy by 15–22% at low signal-to-noise ratios of 5–15 dB.
It is shown that parallel computing reduces the system's adaptation time to a new speaker to 2.1 seconds, which opens up prospects for integrating the method into robust voice control for unmanned robotic systems and specialized information and communication networks.
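The optimization loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the article's implementation: the kernel length, number of spline nodes, population size, GA operators (tournament selection, uniform crossover, Gaussian mutation, elitism), the zero-mean and unit-energy normalization, and the synthetic two-tone `frame` standing in for a speech signal are all hypothetical choices. Fitness is evaluated serially here, whereas the article describes a parallel variant.

```python
import bisect
import cmath
import math
import random

N_TAPS, N_BINS, N_KNOTS = 33, 32, 7  # illustrative sizes, not taken from the article

def akima_eval(xs, ys, xq):
    """Evaluate an Akima (1970) spline through the nodes (xs, ys) at points xq."""
    n = len(xs)
    m = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(n - 1)]
    # Extrapolate two extra segment slopes on each side (Akima's end condition).
    m = [3 * m[0] - 2 * m[1], 2 * m[0] - m[1]] + m
    m += [2 * m[-1] - m[-2]]
    m += [2 * m[-1] - m[-2]]
    t = []  # node derivatives as weighted averages of neighbouring slopes
    for i in range(n):
        w1, w2 = abs(m[i + 3] - m[i + 2]), abs(m[i + 1] - m[i])
        t.append((w1 * m[i + 1] + w2 * m[i + 2]) / (w1 + w2) if w1 + w2 > 0
                 else 0.5 * (m[i + 1] + m[i + 2]))
    out = []
    for x in xq:  # cubic Hermite interpolation on the enclosing interval
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 2)
        h = xs[i + 1] - xs[i]
        s = (x - xs[i]) / h
        out.append((1 + 2 * s) * (1 - s) ** 2 * ys[i] + s * (1 - s) ** 2 * h * t[i]
                   + s * s * (3 - 2 * s) * ys[i + 1] + s * s * (s - 1) * h * t[i + 1])
    return out

def kernel_from_genes(genes):
    """Sample a zero-mean, unit-energy kernel from the free node ordinates."""
    ys = [0.0] + list(genes) + [0.0]  # assumption: kernel pinned to zero at both ends
    xs = [i / (len(ys) - 1) for i in range(len(ys))]
    taps = akima_eval(xs, ys, [k / (N_TAPS - 1) for k in range(N_TAPS)])
    mean = sum(taps) / N_TAPS
    taps = [v - mean for v in taps]
    norm = math.sqrt(sum(v * v for v in taps)) or 1.0
    return [v / norm for v in taps]

def magnitude(signal, n_bins=N_BINS):
    """Magnitude spectrum on n_bins frequencies in [0, pi), scaled to peak 1."""
    mags = [abs(sum(v * cmath.exp(-1j * math.pi * b / n_bins * k)
                    for k, v in enumerate(signal))) for b in range(n_bins)]
    peak = max(mags) or 1.0
    return [v / peak for v in mags]

def fitness(genes, target):
    """MSE between the kernel's amplitude response and the target spectrum."""
    resp = magnitude(kernel_from_genes(genes))
    return sum((r - q) ** 2 for r, q in zip(resp, target)) / len(target)

def evolve(target, pop_size=24, generations=50, seed=0):
    """Plain serial genetic algorithm over the spline node ordinates."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(N_KNOTS)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted((fitness(g, target), g) for g in pop)
        history.append(scored[0][0])
        elite = [g for _, g in scored[:2]]  # elitism keeps the two best
        children = []
        while len(children) < pop_size - len(elite):
            a = min(rng.sample(scored, 3))[1]  # tournament selection, size 3
            b = min(rng.sample(scored, 3))[1]
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child = [v + rng.gauss(0, 0.1) if rng.random() < 0.2 else v
                     for v in child]  # Gaussian mutation
            children.append(child)
        pop = elite + children
    best_err, best = min((fitness(g, target), g) for g in pop)
    history.append(best_err)
    return best, history

# Synthetic stand-in for a speech frame (assumption); its normalized energy
# spectrum serves as the target the kernel's response should match.
frame = [math.sin(0.6 * k) + 0.5 * math.sin(1.4 * k) for k in range(128)]
target = [v * v for v in magnitude(frame)]
best_genes, history = evolve(target)
print(f"best MSE: first generation {history[0]:.4f}, final {history[-1]:.4f}")
```

Because the two best individuals are carried over unchanged, the best error recorded in `history` cannot increase from one generation to the next. In a parallel variant, the per-individual `fitness` calls are the natural unit of work to distribute across cores, since they are independent of each other.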

Author Biography

Oleksandr Lavrynenko, State University "Kyiv Aviation Institute", Kyiv, Ukraine

Candidate of Technical Sciences, Associate Professor
