We kindly invite you to read our full paper. (Preprint coming soon on arXiv.)
- Introduction
- Taxonomy of Knowledge Sources
- External Knowledge Integration
- Discussions
- Challenges and Future Directions
- Citation
Large Language Models (LLMs) exhibit remarkable abilities in understanding and generating human language. However, they remain fundamentally constrained by their parametric memory and are prone to hallucination, especially in tasks requiring up-to-date or domain-specific knowledge. To overcome these limitations, this survey investigates methods for enhancing LLM inference through the integration of external knowledge, particularly structured data such as tables and knowledge graphs (KGs).
To ground the discussion, the figure below introduces a comprehensive taxonomy that divides external knowledge into unstructured and structured data. The structured category is further divided into two primary sources: tables and knowledge graphs. Each data type is linked to its relevant reasoning methodologies—symbolic, neural, or hybrid for tables, and loosely or tightly coupled models for KGs.
The integration of tables with LLMs can be categorized based on the reasoning paradigm adopted. The three main approaches are symbolic reasoning, neural reasoning, and hybrid reasoning.
Symbolic reasoning approaches translate natural language questions into SQL queries that are executed over the input table. These methods are interpretable and precise but often struggle with semantic ambiguity or reasoning beyond basic table operations.
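A minimal sketch of this pipeline is shown below. The `nl_to_sql` stub stands in for a real text-to-SQL model call, and the table contents are invented for illustration; no specific system's API is implied.

```python
import sqlite3

def nl_to_sql(question: str) -> str:
    # Hypothetical stand-in for an LLM text-to-SQL call; a real system
    # would prompt the model with the table schema plus the question.
    return "SELECT name FROM cities WHERE population > 2000000"

# Load the input table into an in-memory SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (name TEXT, population INTEGER)")
conn.executemany("INSERT INTO cities VALUES (?, ?)",
                 [("Taipei", 2600000), ("Tainan", 1800000)])

# Translate the question to SQL, then execute it over the table.
sql = nl_to_sql("Which cities have more than two million residents?")
rows = conn.execute(sql).fetchall()
print(rows)  # [('Taipei',)]
```

Because the answer comes from deterministic SQL execution, the result is exact and auditable, but only as good as the generated query.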
In contrast, neural reasoning relies purely on the LLM to perform reasoning in the language space, using approaches like few-shot prompting or chain-of-thought (CoT). While these methods are more flexible and suitable for ambiguous or commonsense queries, they tend to suffer from hallucinations and lack interpretability.
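A sketch of few-shot CoT prompt construction follows; the helper name, table, and exemplar are illustrative assumptions, and the resulting string would be sent to an LLM whose call is omitted here.

```python
def build_cot_prompt(table_md: str, question: str,
                     examples: list[tuple[str, str]]) -> str:
    # Assemble a few-shot chain-of-thought prompt: all reasoning happens
    # in the language space, with no symbolic executor involved.
    parts = [f"Q: {q}\nA: Let's think step by step. {a}"
             for q, a in examples]
    parts.append(f"Table:\n{table_md}\n"
                 f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(parts)

prompt = build_cot_prompt(
    "| city | population |\n| Taipei | 2,600,000 |\n| Tainan | 1,800,000 |",
    "Which city has the larger population?",
    [("Is 5 greater than 3?", "5 > 3, so the answer is yes.")],
)
```

The table is serialized directly into the prompt, which is what makes this paradigm flexible but also what exposes it to hallucination: nothing constrains the model's intermediate steps.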
Hybrid methods combine the precision of symbolic execution (e.g., SQL) with the adaptability of neural inference. They typically generate SQL to filter relevant rows and then use the LLM to perform further reasoning.
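The two-stage flow can be sketched as below; `llm_reason` is a hypothetical placeholder for the neural stage, and the film table is invented for illustration.

```python
import sqlite3

def llm_reason(rows, question):
    # Hypothetical neural stage: a real system would prompt an LLM with
    # the filtered rows; a hard-coded comparison stands in here.
    return max(rows, key=lambda r: r[1])[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, year INTEGER)")
conn.executemany("INSERT INTO films VALUES (?, ?)",
                 [("Alpha", 1999), ("Beta", 2012), ("Gamma", 1987)])

# Symbolic stage: SQL narrows the table to candidate rows.
candidates = conn.execute(
    "SELECT title, year FROM films WHERE year > 1990").fetchall()

# Neural stage: the LLM reasons over the reduced context only.
answer = llm_reason(candidates, "Which of these films is most recent?")
print(answer)  # Beta
```

The design choice is that SQL handles what it is good at (exact filtering at scale), while the LLM handles what SQL cannot (fuzzy or commonsense comparison over the survivors).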
The figure below illustrates these three paradigms in terms of data flow and execution:
Knowledge graph integration strategies can be grouped into two categories depending on the degree of interaction between the LLM and the KG: loose coupling and tight coupling.
Loose coupling treats the KG as a separate retrieval module. The LLM queries the KG, retrieves relevant facts, and then processes them as part of its prompt. This method is modular and easy to implement but lacks interactive reasoning capabilities.
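A sketch of the loose-coupling flow, assuming a toy dictionary-backed KG (real systems would query a graph store, e.g. via SPARQL) and stopping short of the actual LLM call:

```python
# Toy KG keyed by head entity; triples are invented for illustration.
KG = {"Marie Curie": [("born_in", "Warsaw"), ("field", "physics")]}

def retrieve_facts(entity):
    # Retrieval module: verbalize the triples attached to the entity.
    return [f"{entity} {rel} {obj}." for rel, obj in KG.get(entity, [])]

def build_prompt(question, entity):
    # Retrieved facts are prepended to the prompt; the LLM then answers
    # in one shot, with no further interaction with the KG.
    facts = "\n".join(retrieve_facts(entity))
    return f"Facts:\n{facts}\n\nQuestion: {question}\nAnswer:"

prompt = build_prompt("Where was Marie Curie born?", "Marie Curie")
```

Because retrieval happens once, up front, the KG and the LLM can be swapped independently, which is the modularity advantage; the cost is that the model cannot ask the KG follow-up questions.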
Tight coupling methods integrate the LLM and KG more deeply. The LLM acts as an agent that iteratively explores the KG, using retrieved entities and relations as context for step-by-step reasoning.
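The iterative loop can be sketched as follows; `llm_pick_relation` is a hypothetical stand-in for the LLM's per-step decision, and the two-hop KG is invented for illustration rather than drawn from any surveyed system.

```python
# Toy KG mapping (entity, relation) -> neighbor entity.
KG = {
    ("Inception", "directed_by"): "Christopher Nolan",
    ("Christopher Nolan", "born_in"): "London",
}

def llm_pick_relation(entity, question, available):
    # Hypothetical LLM decision: choose which outgoing edge to follow
    # next given the question. A hard-coded heuristic stands in here.
    if "born" in question and "born_in" in available:
        return "born_in"
    return available[0]

def explore(start, question, max_hops=3):
    # The LLM acts as an agent, expanding the KG one hop at a time and
    # conditioning each step on what it has retrieved so far.
    entity = start
    for _ in range(max_hops):
        available = [r for (e, r) in KG if e == entity]
        if not available:
            break
        entity = KG[(entity, llm_pick_relation(entity, question, available))]
    return entity

answer = explore("Inception", "Where was the director of Inception born?")
print(answer)  # London
```

This per-hop interleaving of retrieval and reasoning is what lets tightly coupled methods handle multi-hop questions that a single up-front retrieval would miss.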
The following figure visually compares these two coupling strategies:
To assess the performance of different integration strategies, we include benchmark results from prior studies.
The figure below summarizes the performance of various table-based reasoning methods (symbolic, neural, hybrid) across two popular benchmarks: WikiTQ and TabFact. Notably, hybrid methods like H-STAR and ALTER outperform others, combining the strengths of both symbolic execution and neural reasoning.
In the context of knowledge graphs, the table below (from the survey) compares methods across three dimensions: objective, performance, and efficiency. Tight coupling methods such as ToG and GoG demonstrate superior reasoning depth, especially for multi-hop queries, while loosely coupled methods like KAPING offer simplicity and low latency.
Hybrid systems that rely on symbolic execution as an intermediate step are prone to cascading errors. If the symbolic stage produces incorrect results (e.g., due to faulty SQL), the subsequent neural reasoning may be misled.
Structured knowledge sources such as tables and knowledge graphs often exceed LLM context limits. Existing solutions typically extract a subset of relevant content, but this may exclude crucial information, harming accuracy.
While hybrid systems tend to outperform pure symbolic or neural methods, they also introduce higher latency and complexity. Future methods must balance reasoning depth with scalability.
Real-world applications demand real-time reasoning with up-to-date knowledge. Integrating multimodal data (e.g., images, audio) and streaming updates remains an open challenge. Future research may explore lightweight architectures that dynamically update external knowledge and align it with LLM inference.
If you find this work helpful, please cite:
@article{lin2025llmsurvey,
  title  = {LLM Inference Enhanced by External Knowledge: A Survey},
  author = {Yu-Hsuan Lin and Qian-Hui Chen and Yi-Jie Cheng and Jia-Ren Zhang and Yi-Hung Liu and Liang-Yu Hsia and Yun-Nung Chen},
  year   = {2025},
  note   = {Manuscript under review}
}