Academic 2025

Miley Chatbot

CS341 Big Data: Product Recommendation Bot

Role

AI/ML Engineer

Year

2025

Team

Group Project

Tech Stack

LLM, QWEN, Python, NLU, Hugging Face, GPU Cloud (Runpod.io)

A group project for CS341 (Big Data) that built a product recommendation chatbot powered by a Large Language Model (LLM) and an end-to-end data workflow. The system uses only open-source technologies and covers the full pipeline: understanding user messages, retrieving relevant products via vector search, and composing a natural-language response grounded in the retrieved context.

01 The Problem

Customers struggle to discover relevant products in large, diverse datasets, where traditional search falls short.
Rule-based chatbots have limited ability to interpret natural language and user intent in context.
A practical solution requires connecting language understanding to retrieval and generating responses that remain consistent with retrieved product information.

02 The Solution

Adopted Qwen (LLM) with Hugging Face tooling and leveraged GPU Cloud (Runpod.io) to experiment, tune, and evaluate models efficiently.
Designed the Qwen-based LLM module as two components: (1) an Entity/Intent Extractor that parses user messages into structured signals before vector search, and (2) a Composer that processes retrieved product lists with conversational context to produce a well-formed final answer.
Improved model effectiveness through tuning and prompt/instruction refinement to better fit recommendation scenarios and to reduce inconsistent outputs.
Built a structured data pipeline to manage data preparation, query flow across modules, and the handoff between the LLM and vector search in a traceable, maintainable manner.
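
The Extractor → Vector Search → Composer handoff described above can be sketched as follows. This is a minimal illustration, not the project's code: the keyword rule, category filter, and template stand in for the two Qwen components and the real vector search, and every name here (`Signals`, `Trace`, `run_pipeline`, the sample catalog) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Signals:
    """Structured output of the Extractor stage (field names are hypothetical)."""
    intent: str
    entities: dict


@dataclass
class Trace:
    """Records what each stage produced so a query can be audited later."""
    steps: list = field(default_factory=list)

    def log(self, stage, payload):
        self.steps.append((stage, payload))


def extract(message):
    # Stand-in for the Qwen-based Extractor: a keyword rule, NOT the real model.
    text = message.lower()
    intent = "recommend" if "recommend" in text or "looking for" in text else "other"
    entities = {}
    for category in ("laptop", "phone", "headphones"):
        if category in text:
            entities["category"] = category
    return Signals(intent=intent, entities=entities)


def vector_search(signals, catalog):
    # Stand-in for vector search: filters by category instead of using embeddings.
    wanted = signals.entities.get("category")
    return [p for p in catalog if p["category"] == wanted]


def compose(message, products):
    # Stand-in for the Qwen-based Composer: a template over retrieved items only.
    if not products:
        return "Sorry, I could not find a matching product."
    names = ", ".join(p["name"] for p in products)
    return f"Based on your request, you might like: {names}."


def run_pipeline(message, catalog):
    # Each stage logs its output, which keeps the handoff traceable.
    trace = Trace()
    signals = extract(message)
    trace.log("extractor", signals)
    products = vector_search(signals, catalog)
    trace.log("vector_search", [p["name"] for p in products])
    answer = compose(message, products)
    trace.log("composer", answer)
    return answer, trace


CATALOG = [
    {"name": "AeroBook 14", "category": "laptop"},
    {"name": "Pixelite 5", "category": "phone"},
]

answer, trace = run_pipeline("Can you recommend a laptop?", CATALOG)
print(answer)
```

Keeping each stage behind a plain function with a typed input and output makes it possible to swap a stub for the real model without touching the other stages.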

03 The Result

The system extracts entity and intent signals before retrieval, improving the structure and relevance of product candidates returned by vector search.
The Composer generates natural, context-aware responses grounded in retrieved product information, improving overall conversational quality.
Gained practical end-to-end experience with an open-source LLM stack: data pipeline design, instruction/prompting, evaluation, and iterative tuning.
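
For readers unfamiliar with the retrieval step, the sketch below shows one common way vector search ranks candidates: cosine similarity between a query vector and product vectors. The source does not name the embedding model or vector store used in the project, so the three-dimensional vectors and the `INDEX` dictionary are made-up toy data.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the k product names whose vectors are most similar to the query."""
    scored = [(cosine(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Toy embeddings (assumption): real ones would come from an embedding model.
INDEX = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.3, 0.1],
    "espresso machine": [0.0, 0.2, 0.9],
}

results = top_k([1.0, 0.2, 0.0], INDEX, k=2)
print(results)
```

In a pipeline like the one described here, the query vector would presumably be built from the Extractor's structured signals rather than from the raw message.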

Retrospective

Challenges

Designing a reliable data pipeline across Extractor → Vector Search → Composer with clear traceability and validation.
Learning effective LLM instruction/prompting and iterating over large evaluation sets to stabilize output quality.
Tuning models efficiently while managing compute constraints and experiment cycles on GPU cloud resources.
Reducing hallucination by grounding responses in retrieved product lists and enforcing response constraints.
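
One simple way to enforce the grounding constraint mentioned above is a post-generation check that flags any catalog product named in the answer but absent from the retrieved list. The function below is an assumed illustration, not the project's validator; `ungrounded_mentions` and the sample product names are hypothetical.

```python
def ungrounded_mentions(answer, retrieved_names, catalog_names):
    """Return catalog products named in the answer that were not retrieved."""
    allowed = {name.lower() for name in retrieved_names}
    text = answer.lower()
    return [
        name for name in catalog_names
        if name.lower() in text and name.lower() not in allowed
    ]


catalog_names = ["AeroBook 14", "Pixelite 5", "SoundPod Mini"]
retrieved = ["AeroBook 14"]

# Answer that stays within the retrieved list.
ok = ungrounded_mentions("I suggest the AeroBook 14.", retrieved, catalog_names)

# Answer that names a product the search did not return.
bad = ungrounded_mentions(
    "Try the AeroBook 14 or the Pixelite 5.", retrieved, catalog_names
)
print(ok, bad)
```

A substring check like this only catches known catalog names. It would miss a product the model invents outright, so it complements prompt-level constraints rather than replacing them.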

Key Learnings

Designing a two-stage LLM workflow (Extractor/Composer) to connect language understanding, retrieval, and grounded response generation.
Using Hugging Face to manage experimentation, evaluation, and tuning for task-specific performance improvements.
Working with GPU Cloud (Runpod.io) for model training/experiments and managing iteration cycles effectively.
Building Big Data pipelines that orchestrate multiple modules with consistent data flow and maintainable interfaces.
Mitigating hallucination by constraining the generation context to vector-search results and using explicit prompting.
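
Constraining the generation context can be as simple as building the Composer prompt from the retrieved products alone and stating the restriction explicitly. The prompt wording, the `build_composer_prompt` helper, and the product fields (`name`, `price`, `description`) below are assumptions for illustration; the project's actual prompts are not given in the source.

```python
def build_composer_prompt(user_message, products):
    """Build a prompt whose only product knowledge is the retrieved list."""
    lines = [
        "You are a product recommendation assistant.",
        "Recommend only products from the numbered list below.",
        "If none of them fit, say that no suitable product was found.",
        "",
        "Products:",
    ]
    for i, p in enumerate(products, start=1):
        lines.append(f"{i}. {p['name']} | {p['price']} | {p['description']}")
    lines += ["", f"Customer message: {user_message}", "Answer:"]
    return "\n".join(lines)


retrieved = [
    {"name": "AeroBook 14", "price": "$899", "description": "Lightweight 14-inch laptop"},
    {"name": "AeroBook 16", "price": "$1199", "description": "Larger screen, same chassis"},
]

prompt = build_composer_prompt("I need a light laptop for travel", retrieved)
print(prompt)
```

The resulting string would then be passed to the model. Because the full catalog never appears in the prompt, the model has less material from which to produce an ungrounded recommendation.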

Technologies

LLM, QWEN, Python, NLU, Hugging Face, GPU Cloud (Runpod.io)