With more than two decades of experience in data engineering and machine learning, Nandakishor (Nanda) Koka leverages knowledge graphs and advanced AI architectures to solve complex enterprise challenges. Currently serving as Principal Technical Architect, Nanda has pioneered innovative solutions for crowd analytics, document intelligence, and agentic AI systems. In this extended interview, he shares how knowledge graphs enhance modern AI applications, his approach to building reliable RAG systems, and his vision for the future of agentic workflows.
Bridging Structured Knowledge With Generative AI
Knowledge graphs represent a fundamental shift in how AI systems access and utilize information. Rather than relying solely on probabilistic language patterns, modern applications can ground their responses in verified, structured data.
“Knowledge graphs enhance modern AI applications by providing structured, interconnected data that grounds LLMs in factual and contextual accuracy,” Nanda explains. “They mitigate LLM hallucinations by linking responses to verified knowledge, enabling more reliable outputs. KGs also enrich contextual understanding by embedding domain-specific relationships and hierarchies, which LLMs can query to improve reasoning and recall.”
The relationship between knowledge graphs and large language models in RAG applications continues to evolve toward more sophisticated architectures. Initially serving mainly as verification layers, knowledge graphs are becoming integral to the generative process itself.
“The integration of large language models (LLMs) and knowledge graphs (KGs) in Retrieval-Augmented Generation (RAG) applications is evolving towards a more tightly coupled architecture,” he notes. “In modern RAG systems, KGs act as contextual pre-filters, providing structured facts that shape the prompt itself, rather than simply verifying results.”
This evolution includes the rise of hybrid reasoning models where knowledge graphs and LLMs work together. “In this setup, KGs provide structured causal chains or entity relationships, while LLMs elaborate on each step in natural language, blending symbolic precision with generative fluency.”
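The pre-filtering pattern Nanda describes can be illustrated with a minimal sketch. Everything here is a toy assumption, not a product API: the knowledge graph is a small in-memory set of triples, and the entity names and prompt template are invented for the demo.

```python
# Hypothetical sketch: a knowledge graph as a contextual pre-filter for RAG.
# Structured facts about the queried entity shape the prompt itself, rather
# than merely verifying the model's output afterwards.

KG = {
    ("AuthService", "depends_on", "TokenStore"),
    ("AuthService", "owned_by", "Platform Team"),
    ("TokenStore", "backed_by", "Redis"),
}

def facts_for(entity):
    """Return triples mentioning the entity, rendered as plain-text facts."""
    return [f"{s} {p.replace('_', ' ')} {o}"
            for (s, p, o) in sorted(KG) if entity in (s, o)]

def build_prompt(question, entity):
    """Shape the prompt with structured KG facts before generation."""
    context = "\n".join(f"- {f}" for f in facts_for(entity))
    return f"Known facts:\n{context}\n\nQuestion: {question}"

print(build_prompt("Why might AuthService fail after a Redis outage?", "AuthService"))
```

A real system would resolve entities from the question and query a graph database, but the shape is the same: symbolic facts in, grounded prompt out.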
Solving Complex Problems With Graph-Based Approaches
During his work on the Konwinski Challenge, which involved using AI agents to automatically detect and fix software bugs, Nanda explored various approaches to efficiently retrieve relevant code snippets based on bug descriptions.
“My initial strategy was to develop a code search engine that indexed the software repository, leveraging text-based retrieval techniques to match bug reports with relevant code chunks,” he explains. “This traditional RAG (Retrieval-Augmented Generation) approach involved creating embeddings of both bug descriptions and code snippets, allowing for similarity-based search. While this method provided some degree of accuracy, it struggled with maintaining contextual coherence and often failed to capture the nuanced dependencies within the codebase.”
The solution came through building a knowledge graph representation of the software repository. “The process began by parsing the code into Abstract Syntax Trees (ASTs), enabling a structured representation of the code’s logical components. I then transformed these ASTs into a graph structure where modules, classes, functions, and variables were represented as nodes, while dependencies, inheritance, and function calls were depicted as edges connecting these nodes.”
This approach provided significantly better results. “This graph-based representation offered a much richer semantic layer compared to the flat structure of traditional RAG. For instance, when a bug description referenced a specific class method, the KG could not only locate the method itself but also trace related functions and dependencies, significantly improving the precision of retrieved snippets.”
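The AST-to-graph step can be sketched with Python's standard `ast` module. This is a minimal illustration, not Nanda's implementation: it records only classes, functions, and call edges, where a real system would also capture modules, variables, and inheritance.

```python
import ast

# Sketch: parse source code into an AST, then emit graph nodes (classes,
# functions) and edges (function calls) as plain tuples.

SOURCE = """
class Cart:
    def total(self):
        return apply_tax(self.subtotal())
    def subtotal(self):
        return 0

def apply_tax(amount):
    return amount * 1.08
"""

def build_code_graph(source):
    tree = ast.parse(source)
    nodes, edges = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            nodes.add(("class", node.name))
        elif isinstance(node, ast.FunctionDef):
            nodes.add(("function", node.name))
            # record direct calls made inside this function as "calls" edges
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    edges.add((node.name, "calls", inner.func.id))
    return nodes, edges

nodes, edges = build_code_graph(SOURCE)
```

Given a bug report mentioning `total`, a retriever can now follow the `calls` edge to `apply_tax` and surface both snippets together, which is the semantic advantage over flat chunk retrieval.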
Understanding The Challenges Of RAG Systems
Building effective RAG systems presents significant challenges across multiple dimensions. Nanda identifies three major categories of obstacles that organizations typically encounter.
“One of the foremost challenges in deploying RAG systems is organizational readiness, including securing investment in the necessary infrastructure and skilled resources,” he explains. “Implementing RAG systems often requires advanced hardware for running large language models (LLMs) efficiently, such as GPU clusters or scalable cloud environments.”
Data acquisition and preparation represent another critical hurdle. “RAG systems rely heavily on acquiring and preparing vast amounts of diverse data, which is often fragmented and stored in heterogeneous formats. Data acquisition becomes particularly challenging when dealing with disparate data sources such as relational databases, APIs, file systems, and streaming data.”
The third major challenge involves maintaining system reliability over time. “RAG systems may degrade over time due to changing data distributions or outdated information. To maintain robustness, RAG pipelines must also include monitoring and alerting mechanisms to detect failures in data ingestion, model scoring, or response generation.”
For data preprocessing specifically, Nanda outlines ten essential techniques, including text normalization, PII removal, document chunking, metadata extraction, entity and relationship extraction, language detection, deduplication, OCR preprocessing, embedding optimization, and quality filtering.
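Two of those techniques, normalization with deduplication and overlapping document chunking, can be sketched with the standard library alone. The chunk size, overlap, and hashing scheme below are arbitrary demo choices, not recommended production values.

```python
import hashlib
import re

# Sketch of two RAG preprocessing steps: normalize-and-dedupe, and
# fixed-size chunking with overlap for later embedding.

def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()

def dedupe(docs):
    """Drop documents whose normalized form was already seen; keep the first."""
    seen, unique = set(), []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

def chunk(text, size=100, overlap=20):
    """Split text into overlapping character windows for embedding."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

The overlap matters because a sentence split across two chunks would otherwise be retrievable from neither; real pipelines usually chunk on token or sentence boundaries rather than raw characters.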
Architectural Decisions For Knowledge Graph Systems
When building complex AI systems with knowledge graphs, certain architectural choices prove critical for success. Nanda emphasizes the importance of decoupled architectures and hybrid storage approaches.
“One critical decision is adopting a decoupled microservices architecture that separates the KG from LLMs and other components, allowing for independent scaling and maintenance,” he explains. “Additionally, using hybrid storage solutions that combine graph databases, vector databases, and structured data stores ensures flexibility in handling both relational and unstructured data.”
Performance optimization requires careful attention to caching and processing strategies. “To optimize performance, systems should employ graph caching and asynchronous processing to reduce query latency, especially when handling large-scale graphs. Load balancing across clusters and precomputing frequently accessed subgraphs can help maintain responsiveness.”
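The graph-caching and subgraph-precomputation ideas can be shown with a memoized lookup. The adjacency dictionary below is a stand-in for a real graph database; asynchronous processing and load balancing are out of scope for this sketch.

```python
from functools import lru_cache

# Sketch: memoize neighborhood lookups so hot subgraphs are served from
# memory instead of repeatedly hitting the backing store.

ADJACENCY = {
    "user:42": ("order:7", "order:9"),
    "order:7": ("item:sku1",),
    "order:9": ("item:sku2",),
}

@lru_cache(maxsize=4096)
def neighbors(node):
    """Cached lookup; repeated queries for popular nodes skip the 'database'."""
    return ADJACENCY.get(node, ())

@lru_cache(maxsize=1024)
def two_hop(node):
    """A precomputable subgraph: every node reachable within two edges."""
    reached = set(neighbors(node))
    for n in list(reached):
        reached.update(neighbors(n))
    return frozenset(reached)
```

In production the same idea appears as a cache layer (for example Redis) in front of the graph store, with invalidation on writes, which `lru_cache` deliberately ignores here.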
Defining Effective AI Agents
Nanda provides a comprehensive definition of what constitutes an AI agent and the components necessary for effective agentic workflows.
“An agent, in the context of Agentic AI, is essentially an autonomous program designed to make decisions, take actions, and pursue specific goals without continuous human input,” he explains. “These agents are built to operate independently, leveraging advanced computational techniques to interact with their environment, process information, and execute tasks in a way that mimics intelligent behavior.”
The architecture of effective agents involves three core components. “At its core, an AI agent typically comprises three fundamental components: access to a language model (LLM), tools, and an execution loop. The agent relies on a large language model (LLM) to understand, interpret, and plan the set of actions to take to fulfill the given objective.”
Memory plays a crucial role in agent effectiveness. “Memory is essential for building effective AI agents, enabling them to learn from past experiences, adapt to changing environments, and make informed decisions. It provides context awareness, helping agents recall previous actions and interactions, which leads to more efficient task execution and avoids redundant efforts.”
The key components for reliable agentic workflows include memory management, execution loops, access to tools and resources, guardrails and security measures, comprehensive logging and monitoring, learning and adaptation capabilities, and robust error handling.
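Those three components (model, tools, execution loop) plus memory and a simple guardrail can be sketched in a few lines. The `fake_llm` stub and the action dictionary format are assumptions invented for the demo; a real agent would call an actual LLM and parse its structured output.

```python
# Sketch of a minimal agent: a stubbed planner (standing in for an LLM),
# a tool registry, and a bounded execution loop with memory of results.

def fake_llm(goal, history):
    """Stub planner: emits one tool call, then finishes with the last result."""
    if not history:
        return {"action": "calculate", "input": "6 * 7"}
    return {"action": "finish", "input": history[-1]}

# Tool registry; eval is restricted to bare arithmetic for the demo.
TOOLS = {"calculate": lambda expr: str(eval(expr, {"__builtins__": {}}))}

def run_agent(goal, max_steps=5):
    """Execution loop: plan, invoke tools, remember results, stop on 'finish'."""
    history = []  # memory of past tool results
    for _ in range(max_steps):  # bounded loop acts as a simple guardrail
        step = fake_llm(goal, history)
        if step["action"] == "finish":
            return step["input"]
        result = TOOLS[step["action"]](step["input"])
        history.append(result)
    return None  # guardrail tripped: give up rather than loop forever

print(run_agent("What is 6 * 7?"))
```

The `max_steps` bound and the history list correspond directly to the guardrails and memory management items in the list above; production agents add logging, error handling around each tool call, and persistent memory.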
Tools And Frameworks For Implementation
For developers looking to implement knowledge graphs in their AI applications, Nanda recommends a comprehensive toolkit spanning multiple categories of tools.
“For graph databases, popular choices include Neo4j (Cypher-based and widely adopted), Amazon Neptune (supporting RDF and property graphs), TigerGraph (built for deep link analytics), and Ontotext GraphDB (excellent SPARQL support),” he explains.
He highlights an emerging tool specifically designed for knowledge graph construction. “An emerging tool in the knowledge graph space is Triplex, an open-source large language model developed by SciPhi.AI, specifically designed for constructing knowledge graphs from unstructured data. Triplex extracts subject-predicate-object triples, the foundational elements of knowledge graphs, enabling efficient transformation of raw text into structured formats.”
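The subject-predicate-object format Triplex produces can be illustrated with a toy extractor. To be clear, this naive pattern matcher is not Triplex and the relation vocabulary is invented; it only shows the shape of the extracted structure.

```python
import re

# Toy illustration of subject-predicate-object triples. A model like
# Triplex handles open-ended language; this demo recognizes only three
# hard-coded predicates.

PATTERN = re.compile(r"(\w[\w ]*?) (founded|acquired|employs) ([\w ]+)")

def extract_triples(text):
    """Return (subject, predicate, object) tuples found in the text."""
    return [(s.strip(), p, o.strip()) for s, p, o in PATTERN.findall(text)]

text = "Acme founded BetaLabs. Gamma acquired BetaLabs."
print(extract_triples(text))
```

Each tuple maps directly onto a graph edge: the subject and object become nodes, and the predicate labels the edge between them.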
For integration with LLMs and RAG workflows, he recommends frameworks like Pydantic AI, LangChain, LlamaIndex, and Neo4j’s LLM tooling. These libraries let you construct pipelines where the LLM queries the KG or is guided by it for context-aware generation.
Visualization remains crucial for development and debugging. “Finally, visualization tools like Neo4j Bloom, Gephi, and Graphistry are essential for exploring, debugging, and presenting your knowledge graph. These tools make it easier to understand graph structure, trace relationships, and validate your data.”
As knowledge graphs and large language models continue to evolve, Nanda’s expertise demonstrates how thoughtful integration of structured knowledge with generative AI creates more reliable and effective enterprise solutions. His insights point toward increasingly sophisticated architectures where symbolic reasoning and natural language understanding work seamlessly together.
Kyle Lewis is a seasoned technology journalist with over a decade of experience covering the latest innovations and trends in the tech industry. With a deep passion for all things digital, he has built a reputation for delivering insightful analysis and thought-provoking commentary on everything from cutting-edge consumer electronics to groundbreaking enterprise solutions.