Comparative Analysis of LLM-based Graph Representation Construction for Domain-specific Documents
DOI:
https://doi.org/10.18372/1990-5548.88.20970
Keywords:
intellectual text analysis, natural language processing, text embeddings, graph representation, machine learning, LLM, RAG
Abstract
Recent advances in large language models have substantially improved natural language understanding and enabled their application across a wide range of domains. However, highly specialized fields such as law and medicine remain challenging because their documents often contain complex structures, domain-specific terminology, and dense logical dependencies. In such settings, large language models may produce errors when important structural information is not explicitly preserved in the document representation. To address this limitation, we propose a novel approach for document decomposition into graph-based representations that better capture the structural and semantic relationships within complex texts. We develop a method for processing raw legal documents from the Ukrainian domain using an LLM-based decomposition pipeline, transforming them into structured graph representations that can reinforce contextual retrieval and support retrieval-augmented generation. The proposed method improves document understanding by preserving key contextual dependencies and enhancing the representation of legal knowledge in downstream tasks.
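As a rough illustration of the idea summarized above, the following minimal sketch turns a toy legal text into a graph of articles linked by cross-references, then uses those links to expand a retrieved passage with the articles it depends on. It is not the authors' pipeline: the regex-based splitter stands in for the LLM decomposition step, the word-overlap scoring stands in for embedding-based retrieval, and the names SAMPLE, decompose, and retrieve, as well as the sample text, are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical toy document; real input would be a raw legal text.
SAMPLE = (
    "Article 1. A lease is a contract for temporary use of property.\n"
    "Article 2. The lessee pays rent as defined under Article 1 and Article 3.\n"
    "Article 3. Rent is paid monthly unless agreed otherwise.\n"
)

def decompose(text):
    """Assumed stand-in for the LLM step: split on 'Article N.' headings
    and link each article to the articles it mentions."""
    nodes, edges = {}, defaultdict(set)
    # With one capture group, split yields [preamble, num1, body1, num2, body2, ...]
    parts = re.split(r"^Article (\d+)\.\s*", text, flags=re.MULTILINE)
    for num, body in zip(parts[1::2], parts[2::2]):
        nodes[f"art{num}"] = body.strip()
    for node_id, body in nodes.items():
        for ref in re.findall(r"Article (\d+)", body):
            target = f"art{ref}"
            if target in nodes and target != node_id:
                edges[node_id].add(target)  # directed "references" edge
    return nodes, edges

def retrieve(query, nodes, edges):
    """Pick the node with the largest word overlap with the query,
    then append the nodes it references as extra context."""
    words = set(query.lower().split())
    best = max(
        nodes,
        key=lambda n: len(words & set(re.findall(r"\w+", nodes[n].lower()))),
    )
    return [best] + sorted(edges[best])

nodes, edges = decompose(SAMPLE)
print(retrieve("when does the lessee pay rent", nodes, edges))
```

The point of the sketch is the second half of retrieve: a flat chunk retriever would return only the best-matching article, whereas following the graph edges also pulls in the articles that the match depends on.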
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
The scientific journal “Electronics and control systems” adheres to the principles of Open Access and provides free, immediate, and permanent access to all published materials without financial, technical, or legal barriers for readers.
All articles are published in Open Access under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Copyright
Authors who publish their works in the journal “Electronics and control systems”:
- retain the copyright to their publications;
- grant the journal the right of first publication of the article;
- agree to the distribution of their materials under the CC BY 4.0 license;
- have the right to reuse, archive, and distribute their works (including in institutional and subject repositories), provided that proper reference is made to the original publication in the journal.




