Comparative Analysis of LLM-based Graph Representation Construction for Domain-specific Documents

Illia Savenko

doi:10.18372/1990-5548.88.20970

Authors

Illia Savenko National Technical University of Ukraine “Ihor Sikorsky Kyiv Polytechnic Institute”

DOI:

https://doi.org/10.18372/1990-5548.88.20970

Keywords:

intellectual text analysis, natural language processing, text embeddings, graph representation, machine learning, LLM, RAG

Abstract

Recent advances in large language models have substantially improved natural language understanding and enabled their application across a wide range of domains. However, highly specialized fields such as law and medicine remain challenging because their documents often contain complex structures, domain-specific terminology, and dense logical dependencies. In such settings, large language models may produce errors when important structural information is not explicitly preserved in the document representation. To address this limitation, we propose a novel approach for document decomposition into graph-based representations that better capture the structural and semantic relationships within complex texts. We develop a method for processing raw legal documents from the Ukrainian domain using an LLM-based decomposition pipeline, transforming them into structured graph representations that can reinforce contextual retrieval and support retrieval-augmented generation. The proposed method improves document understanding by preserving key contextual dependencies and enhancing the representation of legal knowledge in downstream tasks.

Author Biography

Illia Savenko, National Technical University of Ukraine “Ihor Sikorsky Kyiv Polytechnic Institute”

Postgraduate Student

Artificial Intelligence Department

Institute for Applied System Analysis

