Vision
My goal is to make LLM agents trustworthy in the loop: they should externalize their reasoning into tools (code, tests, search, structured checks), verify intermediate results, and recover when things go wrong.
Core themes
1) Verification-aware prompting and pipelines
- Design generator–validator workflows that detect errors early and enforce explicit checks.
- Use unit tests, static checks, and consistency constraints to reduce silent failures.
2) RL for tool-use policies
- Learn when to call tools, how to compose tool outputs, and when to backtrack.
- Optimize for reliability metrics (accuracy under distribution shift, robustness, and calibration).
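The "when to call a tool" decision can be framed, in its simplest form, as a contextual bandit. The sketch below is a stylized toy (the environment, reward values, and action names are invented for illustration): tabular values are learned epsilon-greedily, and the policy converges to calling the tool exactly when the task requires it.

```python
import random

random.seed(0)

ACTIONS = ["answer_directly", "call_calculator"]

def reward(action: str, needs_tool: bool) -> float:
    # Stylized environment: direct answers fail on tool-requiring tasks;
    # tool calls pay a small latency cost on easy ones.
    if needs_tool:
        return 1.0 if action == "call_calculator" else 0.0
    return 1.0 if action == "answer_directly" else 0.8

# Tabular action values keyed by (task_type, action), incremental-mean updates.
q = {(s, a): 0.0 for s in (True, False) for a in ACTIONS}
n = dict.fromkeys(q, 0)

for _ in range(2000):
    needs_tool = random.random() < 0.5              # task type is the "state"
    if random.random() < 0.1:                       # explore
        action = random.choice(ACTIONS)
    else:                                           # exploit
        action = max(ACTIONS, key=lambda a: q[(needs_tool, a)])
    r = reward(action, needs_tool)
    n[(needs_tool, action)] += 1
    q[(needs_tool, action)] += (r - q[(needs_tool, action)]) / n[(needs_tool, action)]

policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in (True, False)}
```

Real tool-use policies replace the bandit with sequential RL over multi-step trajectories, where backtracking after a failed tool call is itself an action.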
3) Neuro-symbolic representations for dialog + code
- Represent dialog and code with operational/axiomatic semantics to enable structured validation.
- Combine symbolic constraints with neural generation to improve interpretability and correctness.
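One concrete way to combine the two: express symbolic constraints as predicates over a program's AST and accept a neural generator's candidate only if every predicate holds. The constraints below are illustrative examples, not the ones used in the actual work.

```python
import ast

# Symbolic constraints as predicates over the candidate program's AST.
def no_bare_except(tree: ast.AST) -> bool:
    return not any(isinstance(n, ast.ExceptHandler) and n.type is None
                   for n in ast.walk(tree))

def has_return(tree: ast.AST) -> bool:
    return any(isinstance(n, ast.Return) for n in ast.walk(tree))

CONSTRAINTS = {"no bare except": no_bare_except, "has a return": has_return}

def check(code: str) -> list[str]:
    """Return names of violated constraints; an empty list means valid."""
    tree = ast.parse(code)
    return [name for name, pred in CONSTRAINTS.items() if not pred(tree)]

good = "def f(x):\n    return x + 1\n"
bad = "def f(x):\n    try:\n        return x\n    except:\n        pass\n"
```

Because each rejection names the violated constraint, the check is interpretable by construction: the generator receives a symbolic reason, not just a score.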
Selected work & experience
Graduate Research Assistant — Stony Brook University (01/2023–05/2027)
- Built LLM-based validation agents to detect errors in open-ended collaborative coding tasks.
- Implemented verification-aware generator–validator pipelines for static and real-time evaluation.
- Developed tool-augmented pipelines using knowledge graphs and multimodal NLP.
Applied Scientist Intern — Amazon AWS (09/2025–12/2025)
- Empirically evaluated the “GEPA” Automatic Prompt Optimization (APO) framework.
- Benchmarked across 10 production LLMs (1B–405B parameters), validating structured-reasoning improvements of up to 12%.
- Analyzed cross-model prompt transfer (mid-sized → larger models) for reusable optimization.
ML Research Intern — Nokia Bell Labs (06/2025–08/2025)
- Developed a multi-agent LLM pipeline using OPC UA to extract insights from industrial telemetry.
- Integrated knowledge graphs + function calling to correlate events across system layers.
Publications
2025
- Shaik, H., Villuri, G., & Doboli, A. (2025). Two Large Language Model-based Methods to Validate Open-Ended Problem Solving in Teams. (AIMS 2025)
- Shaik, H., Villuri, G., & Doboli, A. (2025). An Overview of LLMs and a Novel, LLM-Based Cognitive Architecture for Solving Open-Ended Problems.
- Shaik, H., Villuri, G., & Doboli, A. (2025). Concept Combinations with Generator and Validator Agents Prompted Using Insights from Concept Networks.
- Villuri, G., & Doboli, A. (2025). An Experimental Study on the Interpretability of Transformer Models for Dialog Understanding. (IEEE)
- Villuri, G., Shaik, H., & Doboli, A. (2025). A Stacked Multi-Layered Perceptron - LLM Model for Extracting the Relations in Textual Descriptions. (IEEE)
- Villuri, G., Doboli, A., & Pallapu, H. R. (2025). Towards Semantic Classification: An Experimental Study on Automated Understanding of the Meaning of Verbal Utterances. (IEEE Annual Computing and Communication Workshop and Conference, 2025)
2024
- Villuri, G., & Doboli, A. (2024). An Overview and Discussion of the Suitability of Existing Speech Datasets to Train Machine Learning Models for Collective Problem Solving. (arXiv:2412.18489)
- Villuri, G., & Doboli, A. (2024). Using Speech Data to Automatically Characterize Team Effectiveness to Optimize Power Distribution in Internet-of-Things Applications. (IEEE CITDS 2024)
- Pallapu, H. R., Villuri, G., Doboli, A., & Doboli, S. (2024). Automatically Understanding Human Behavior for IoT Applications with Optimized Human-in-the-Loop Control.
- Villuri, G., & Doboli, A. (2024). Towards Semantic Classification of Dialog using Contextual Prediction Networks.
Contact
Email: villurignanesh@gmail.com | Google Scholar | LinkedIn