AI Systems

Local LLM Portfolio Assistant with RAG on NVIDIA Jetson AGX Orin

A locally deployed AI assistant for this portfolio website, built on Retrieval-Augmented Generation (RAG) with on-device inference on an NVIDIA Jetson AGX Orin.

Summary

Engineering context

Category: AI Systems
Year: 2025
Status: Operational
Context: Personal robotics and AI engineering portfolio

My Role

AI Systems Engineer

Technical Stack

System Architecture

Portfolio project content and engineering information are indexed for retrieval
User questions are processed through a Retrieval-Augmented Generation pipeline
Semantic retrieval selects relevant portfolio and project context
A locally deployed LLM generates responses grounded in retrieved engineering information
The assistant is integrated directly into the portfolio website
Inference runs locally on an NVIDIA Jetson AGX Orin edge-AI platform
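
The semantic retrieval step above can be sketched as a cosine-similarity ranking over pre-computed embedding vectors. This is a minimal illustration, not the project's actual implementation: the source does not say which embedding model or vector store is used, and the `IndexedChunk` shape, the `topK` helper, and the toy 3-dimensional vectors are all assumptions made for the example.

```typescript
// Hypothetical shape of an indexed portfolio chunk (assumption, not from the project).
interface IndexedChunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors; returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank chunks by similarity to the query embedding and keep the k best.
function topK(query: number[], chunks: IndexedChunk[], k: number): IndexedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}

// Toy embeddings for illustration only; real vectors would come from an embedding model.
const demoIndex: IndexedChunk[] = [
  { id: "jetson", text: "Inference runs on a Jetson AGX Orin.", embedding: [1, 0, 0] },
  { id: "frontend", text: "The frontend is built with Next.js.", embedding: [0, 1, 0] },
  { id: "rag", text: "Answers are grounded with RAG.", embedding: [0.7, 0, 0.7] },
];

const demoHits = topK([0.9, 0, 0.1], demoIndex, 2);
console.log(demoHits.map((c) => c.id));
```

A linear scan like this is usually adequate for a portfolio-sized corpus; a dedicated vector index only becomes worthwhile at much larger scales.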

Engineering Challenges

Running conversational AI locally on embedded edge hardware
Improving conversational quality while maintaining low-latency inference
Designing RAG workflows for portfolio/project retrieval
Maintaining conversational grounding and reducing repetitive outputs
Balancing inference quality with embedded hardware constraints

Hardware / Firmware / Software

Hardware

NVIDIA Jetson AGX Orin

Firmware

Jetson Linux

Software

Ollama
Local LLM inference stack
Next.js
TypeScript
RAG pipeline
Semantic retrieval system
Portfolio assistant frontend

Sensors

Protocols

HTTP API communication
Local inference API integration
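
As a rough sketch of the local inference call, the snippet below posts a prompt to Ollama's `/api/generate` endpoint with streaming disabled and reads the `response` field of the JSON reply. The default address `http://localhost:11434` is Ollama's standard port; the model name `llama3.2` and the helper names `buildGenerateRequest` and `generateLocally` are placeholders, since the source does not state which model or wrapper code the project uses. The HTTP function is passed in so the sketch does not depend on any particular runtime typings.

```typescript
// Minimal fetch-like type so the sketch does not depend on DOM or Node typings.
type HttpPost = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface GenerateRequest {
  model: string;
  prompt: string;
  stream: boolean;
}

// Build the JSON body for Ollama's /api/generate endpoint (non-streaming).
function buildGenerateRequest(model: string, prompt: string): GenerateRequest {
  return { model, prompt, stream: false };
}

// Send the prompt to a local Ollama server and return the generated text.
async function generateLocally(
  post: HttpPost,
  prompt: string,
  model = "llama3.2", // placeholder model name (assumption)
  baseUrl = "http://localhost:11434" // Ollama's default local address
): Promise<string> {
  const res = await post(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGenerateRequest(model, prompt)),
  });
  if (!res.ok) {
    throw new Error(`Ollama request failed with status ${res.status}`);
  }
  const data = (await res.json()) as { response?: string };
  return data.response ?? "";
}
```

On Node.js 18 or later the global `fetch` can be passed as the first argument, for example `generateLocally(fetch, "Which hardware runs the assistant?")`.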

Results / Outcomes

Developed a fully local AI assistant integrated into the engineering portfolio
Enabled conversational retrieval of robotics and AI project information
Demonstrated practical embedded AI deployment on Jetson hardware
Integrated semantic retrieval into a production-style portfolio system
Created an AI-assisted engineering portfolio experience

Engineering Notes

Project Overview

This project developed a locally deployed AI assistant that is integrated directly into this engineering portfolio website.

The system uses Retrieval-Augmented Generation (RAG) and a locally deployed language model running on an NVIDIA Jetson AGX Orin edge-AI platform to provide conversational access to portfolio projects, engineering experience, robotics systems, and research work.

The goal was to create an interactive engineering portfolio experience while demonstrating practical local AI deployment on embedded hardware.

My Role

My responsibilities included:

local LLM deployment
embedded edge-AI integration
RAG system design
semantic retrieval workflow development
AI assistant integration
prompt engineering
portfolio context indexing
conversational assistant optimization

The system was designed and deployed as a fully local AI workflow without relying on cloud-hosted LLM APIs.

System Architecture

The portfolio assistant uses a Retrieval-Augmented Generation pipeline integrated with the website backend.

The workflow includes:

indexing portfolio project content
semantic retrieval of engineering context
retrieval-based prompt construction
local LLM inference
conversational response generation

The assistant retrieves relevant portfolio information before generating each response, which grounds answers in the indexed content and reduces hallucination.
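
The retrieval-based prompt construction step might look roughly like the sketch below: retrieved snippets are numbered, capped at a character budget, and placed ahead of the user question with an instruction to stay within the supplied context. The `RetrievedSnippet` shape, the instruction wording, and the `maxContextChars` budget are illustrative assumptions; the project's actual prompt template is not given in the source.

```typescript
// Hypothetical retrieved snippet: a source label plus the text to ground on.
interface RetrievedSnippet {
  source: string;
  text: string;
}

// Assemble a grounded prompt: instructions, numbered context, then the question.
// maxContextChars is an illustrative budget that keeps prompts short on edge hardware.
function buildGroundedPrompt(
  question: string,
  snippets: RetrievedSnippet[],
  maxContextChars = 2000
): string {
  const contextLines: string[] = [];
  let used = 0;
  for (const snippet of snippets) {
    const line = `[${contextLines.length + 1}] (${snippet.source}) ${snippet.text}`;
    if (used + line.length > maxContextChars) break; // stop once the budget is spent
    contextLines.push(line);
    used += line.length;
  }
  const context =
    contextLines.length > 0 ? contextLines.join("\n") : "(no context found)";
  return [
    "Answer using only the portfolio context below.",
    "If the context does not contain the answer, say so.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
    "Answer:",
  ].join("\n");
}

const demoPrompt = buildGroundedPrompt("What hardware runs the assistant?", [
  { source: "hardware", text: "Inference runs on an NVIDIA Jetson AGX Orin." },
]);
console.log(demoPrompt);
```

Telling the model to admit when the context lacks an answer is a common way to discourage unsupported claims, though it does not guarantee grounded output.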

Local Edge-AI Deployment

The system runs locally on an NVIDIA Jetson AGX Orin platform.

This deployment approach demonstrates:

embedded AI deployment
local inference workflows
edge-AI integration
robotics-compatible AI infrastructure
low-latency local AI systems

Running inference locally also improves:

privacy
system control
customization
deployment flexibility

AI Assistant Integration

The assistant is integrated directly into the portfolio website and can:

explain engineering projects
answer questions about robotics systems
discuss AI and embedded systems work
summarize technical experience
provide conversational project exploration

The objective was to transform the portfolio from a static website into an interactive AI-assisted engineering platform.

Engineering Challenges

Several practical engineering challenges were addressed during development:

running conversational AI on embedded edge hardware
balancing inference quality with compute limitations
reducing repetitive responses
improving conversational flow
designing portfolio-specific retrieval pipelines
grounding responses in engineering project data

Particular attention was given to conversational quality and maintaining technically relevant responses.
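
Two common levers for repetition and latency on constrained hardware are sampling options and a bounded conversation history. The sketch below builds a request body for Ollama's `/api/chat` endpoint using the `temperature`, `repeat_penalty`, and `num_predict` options and trims older turns before sending. The source does not say which settings or history policy the project uses, so the numeric values, the six-turn window, and the helper names are illustrative assumptions only.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  stream: boolean;
  options: {
    temperature: number;
    repeat_penalty: number;
    num_predict: number;
  };
}

// Keep all system messages plus only the most recent turns so the prompt stays short.
function trimHistory(messages: ChatMessage[], maxTurns: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const turns = messages.filter((m) => m.role !== "system");
  const recent = maxTurns > 0 ? turns.slice(-maxTurns) : [];
  return [...system, ...recent];
}

// Build an /api/chat request body; all option values are illustrative, not tuned.
function buildChatRequest(model: string, messages: ChatMessage[]): ChatRequest {
  return {
    model,
    messages: trimHistory(messages, 6), // six-turn window is an assumption
    stream: false,
    options: {
      temperature: 0.4, // lower values make output more deterministic
      repeat_penalty: 1.15, // values above 1 penalize repeated tokens
      num_predict: 256, // cap on generated tokens, bounding response time
    },
  };
}

const demoChat = buildChatRequest("llama3.2", [
  { role: "system", content: "You answer questions about this portfolio." },
  { role: "user", content: "Which board runs the assistant?" },
]);
console.log(JSON.stringify(demoChat, null, 2));
```

Suitable values depend on the model and have to be found empirically; a penalty that is too high can degrade technical answers that legitimately repeat terms.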

Engineering Impact

This project demonstrates practical deployment of:

local LLM systems
edge AI
Retrieval-Augmented Generation
semantic retrieval
embedded AI infrastructure
conversational engineering systems

The work also connects directly to broader interests in robotics, industrial AI, autonomous systems, and deployment-oriented machine learning.
