ElevenLabs Worldwide Hackathon

Team

Vuen AI

Project Concept

Core Functionality: The agent is a Multimodal Voice-to-UI Assistant centered around a reactive 3D Visual Orb. It enables real-time, bidirectional voice conversations where users can interact not just verbally, but physically by dragging and dropping documents (PDF, DOCX) directly onto the orb to instantly “teach” the agent new context (RAG). The agent actively orchestrates a suite of Business Intelligence (BI) tools, dynamically generating and rendering interactive charts (Time Series, Geo Maps, KPI Cards) based on the conversation.

Working Prototype Stability: The prototype is stable, with explicit state management for the connection lifecycle (Idle, Listening, Thinking, Speaking) and error handling. It uses a resilient WebSocket connection for low-latency audio and provides UI feedback throughout (e.g., visual cues for file processing, drag states, and chart loading statuses).
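The connection lifecycle described above can be sketched as a small state machine. This is only an illustration: the four states come from the write-up, while the event names and the transition table are assumptions and not the team's actual implementation.

```typescript
// The four states are named in the write-up; the events are hypothetical.
type OrbState = "idle" | "listening" | "thinking" | "speaking";
type OrbEvent =
  | "connected"
  | "userStoppedSpeaking"
  | "agentStartedSpeaking"
  | "agentFinishedSpeaking"
  | "disconnected"
  | "error";

// Assumed transition table: which event moves the orb from one state to the next.
const transitions: Record<OrbState, Partial<Record<OrbEvent, OrbState>>> = {
  idle: { connected: "listening" },
  listening: { userStoppedSpeaking: "thinking", agentStartedSpeaking: "speaking" },
  thinking: { agentStartedSpeaking: "speaking" },
  speaking: { agentFinishedSpeaking: "listening" },
};

function nextState(state: OrbState, event: OrbEvent): OrbState {
  // A disconnect or error always returns the orb to idle.
  if (event === "disconnected" || event === "error") return "idle";
  // Events that are not valid in the current state are ignored.
  return transitions[state][event] ?? state;
}

// Example: one full conversational turn, starting from idle.
const turn: OrbEvent[] = [
  "connected",
  "userStoppedSpeaking",
  "agentStartedSpeaking",
  "agentFinishedSpeaking",
];
const stateAfterTurn = turn.reduce<OrbState>(
  (state, event) => nextState(state, event),
  "idle",
);
```

Keeping the transitions in one table makes invalid jumps (for example, Idle straight to Speaking) a no-op rather than a UI glitch.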

Technical Complexity: The solution demonstrates high complexity through Client-Side Tool Orchestration. The agent handles a multimodal pipeline:

Audio Processing: Real-time FFT frequency analysis driving the 3D orb’s vertex shaders.
Context Injection: On-the-fly parsing of dropped files (mammoth, pdfjs-dist) into markdown context injected directly into the active ElevenLabs session.
Tool Execution: The LLM “calls” client-side functions (e.g., time_series_chart, geo_chart), which the application parses to render Recharts components dynamically without page reloads.
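As a rough sketch of the Audio Processing step in the pipeline above, the function below reduces one frame of FFT magnitudes to three normalized levels that a shader could consume. It assumes the bins are byte magnitudes (0 to 255) such as those filled in by the Web Audio `AnalyserNode.getByteFrequencyData` call; the band boundaries (10% and 50% of the spectrum), the smoothing factor, and all function names are illustrative assumptions.

```typescript
interface OrbLevels {
  bass: number;
  mid: number;
  treble: number;
}

// Average a slice of byte magnitudes (0..255) and normalize the result to 0..1.
function bandAverage(bins: Uint8Array, start: number, end: number): number {
  const from = Math.max(0, start);
  const to = Math.min(end, bins.length);
  if (to <= from) return 0;
  let sum = 0;
  for (let i = from; i < to; i++) sum += bins[i];
  return sum / ((to - from) * 255);
}

// Split the spectrum into three bands; the 10% / 50% cut points are assumptions.
function computeOrbLevels(bins: Uint8Array): OrbLevels {
  const n = bins.length;
  const lowEnd = Math.floor(n * 0.1);
  const midEnd = Math.floor(n * 0.5);
  return {
    bass: bandAverage(bins, 0, lowEnd),
    mid: bandAverage(bins, lowEnd, midEnd),
    treble: bandAverage(bins, midEnd, n),
  };
}

// Exponential smoothing so the orb does not jitter from frame to frame.
function smooth(previous: number, current: number, factor = 0.8): number {
  return previous * factor + current * (1 - factor);
}
```

In a browser, the levels would typically be recomputed on every animation frame and written to the orb's shader uniforms.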
Innovation & Real-World Impact
Innovation: The “Tactile Voice Interface”—where dropping a file onto a 3D avatar instantly creates a knowledge base—is a novel interaction pattern that bridges the gap between file systems and conversational AI. Furthermore, the ability of the agent to “show, don’t just tell” by summoning complex data visualizations on command transforms the browser from a passive document viewer into an active analytical partner.

Real-World Impact: This tool democratizes complex data analysis. A non-technical user can drop a sales report and simply ask, “Show me the revenue trend by region,” and the agent will instantly visualize the data, drastically reducing the time-to-insight for decision-makers.

Theme Alignment
This project aligns with the theme by combining four web elements into one cohesive experience:

Browsers: Transformed from static pages into an immersive 3D spatial interface using React Three Fiber.
Voices: ElevenLabs powers the agent’s persona, giving it a natural, human-like presence that drives the interaction.
Clouds: High-speed cloud inference connects the user’s intent to actionable data and context processing.
Tools: The agent acts as an orchestrator, wielding 7 distinct BI tools as extensions of its capabilities to manipulate the user’s screen.
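The tool orchestration described above can be pictured as a registry that maps tool names to handlers. In this minimal sketch, `time_series_chart` and `geo_chart` are the two tool names given in the write-up; the parameter shapes, the `ChartSpec` type, and `handleToolCall` are hypothetical, and neither the registration with the ElevenLabs SDK nor the Recharts rendering is shown.

```typescript
// Hypothetical chart specs; the real tool parameter schemas are not given in the write-up.
type ChartSpec =
  | { kind: "timeSeries"; title: string; points: { date: string; value: number }[] }
  | { kind: "geo"; title: string; regions: { code: string; value: number }[] };

type ToolParams = Record<string, unknown>;
type ToolHandler = (params: ToolParams) => ChartSpec;

function asString(value: unknown, fallback: string): string {
  return typeof value === "string" ? value : fallback;
}

// A Map avoids accidental hits on inherited object properties such as "toString".
const toolHandlers = new Map<string, ToolHandler>([
  [
    "time_series_chart",
    (params) => ({
      kind: "timeSeries",
      title: asString(params["title"], "Time series"),
      points: Array.isArray(params["points"])
        ? (params["points"] as { date: string; value: number }[])
        : [],
    }),
  ],
  [
    "geo_chart",
    (params) => ({
      kind: "geo",
      title: asString(params["title"], "Geo map"),
      regions: Array.isArray(params["regions"])
        ? (params["regions"] as { code: string; value: number }[])
        : [],
    }),
  ],
]);

// Turn a tool call from the agent into a chart spec the UI can render;
// returns null for tool names that are not registered.
function handleToolCall(name: string, params: ToolParams): ChartSpec | null {
  const handler = toolHandlers.get(name);
  return handler ? handler(params) : null;
}
```

A rendering layer could then switch on `kind` to pick the matching chart component, so adding a new tool means adding one handler and one component.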
Technology Stack
Core Framework: React, Vite, TypeScript
AI & Voice: ElevenLabs React SDK (WebSockets), ElevenLabs Conversational AI
3D & Visuals: Three.js, React Three Fiber (R3F), GLSL Shaders (Custom noise/fresnel shaders), Framer Motion
Data Visualization: Recharts (Dynamic chart rendering)
Document Processing: PDF.js (PDF parsing), Mammoth (DOCX parsing)
UI/Styling: TailwindCSS, Shadcn/UI, Lucide React
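The document-processing path (a dropped PDF or DOCX turned into markdown context) might look roughly like the sketch below. It covers only the steps around the parsers: choosing a parser by file extension and wrapping already-extracted text as markdown. The actual PDF.js and Mammoth calls and the injection into the ElevenLabs session are not shown, and the function names and the 8,000-character budget are assumptions.

```typescript
type ParserKind = "pdf" | "docx";

// Choose a parser from the file extension; returns null for unsupported files.
function pickParser(fileName: string): ParserKind | null {
  const ext = fileName.toLowerCase().split(".").pop();
  if (ext === "pdf") return "pdf";
  if (ext === "docx") return "docx";
  return null;
}

// Wrap extracted text in a markdown block, truncated to a hypothetical size budget.
function toMarkdownContext(fileName: string, text: string, maxChars = 8000): string {
  // Normalize line endings and collapse runs of blank lines.
  const cleaned = text.replace(/\r\n/g, "\n").replace(/\n{3,}/g, "\n\n").trim();
  const body =
    cleaned.length > maxChars
      ? cleaned.slice(0, maxChars) + "\n\n[truncated]"
      : cleaned;
  return `## Document: ${fileName}\n\n${body}`;
}
```

Capping the context size keeps a large report from crowding out the rest of the conversation; a fuller RAG setup would chunk and retrieve instead of truncating.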

Entry

Status: Submitted

Last saved: December 11 at 9:33 PM -05

Team Roster

Enoc Silva Team Lead RSVP Approved

Founder at Vuen AI

Created the agent on the ElevenLabs platform and built the AI tools.

Enoc Silva is the founder and CEO of Vuen AI, a voice-first AI platform building the “Visual OS” for business. With a background in data intelligence, real-time architecture, and product design, Enoc leads innovations that merge natural language, vision, and automation. His work has been recognized across LATAM’s startup ecosystem, and he’s driven by a mission to make AI interaction intuitive, human, and visual.

Exploring the intersection of voice intelligence, data visualization, and AI orchestration. Open to collaborating with engineers, researchers, and product leaders working on AI agents, business intelligence, multimodal UX, and data accessibility for enterprises. Passionate about building the next generation of human-computer interfaces.

Currently leading Vuen AI Visual Agent Platform, an enterprise-ready solution that transforms voice queries into intelligent visualizations in real time. The platform integrates OpenAI’s Realtime API, WebRTC, and FastAPI to let executives speak to their data and instantly see visual insights — turning dashboards into conversations.

Cristian Mancilla David RSVP Approved

CTO at VuenAI

Connected the WebRTC frontend to ElevenLabs for testing.

Cristian David Mancilla Barajas, also known as Cristian Mancilla David, is the CTO at Vuen AI, where he is also involved as an AI Engineer focusing on Research & Development. He is currently employed and hiring contractors. Cristian holds a degree in Ingenieria Electronica from Universidad Pedagógica y Tecnológica de Colombia and has over four years of experience. His professional presence can be found on LinkedIn at https://www.linkedin.com/in/pasterdecris/.

Hiring contractors, LLMOps, AI engineering, AI agents, scalable architectures, Go, Kubernetes, Kafka

Operationalizing LLMs/AI agents, designing scalable architectures, and building inference pipelines. Integrating models into production services using LLMOps, Go, Python, Docker, Kubernetes, and Kafka tooling.

Johan Alexander Espinosa Rocuts RSVP Approved

CTO at Vuen.ai

Created the RAG system and the drag-and-drop file upload system.

CTO at Vuen.ai. I enjoy turning prototypes into real systems: LLM apps, agents, pipelines, and platforms with strong security practices. I’ve worked on security automation and cloud hardening. Here to share learnings and connect with people building serious things with AI.

Agent evaluation for LLM apps (quality, safety, regression testing); agent frameworks & tooling (routing, tool-use); advanced RAG (chunking, retrieval, rerankers); LLM observability & tracing (LLMOps); production scaling & cost optimization; AI security (prompt injection, data boundaries); connecting with founders/builders and senior backend/ML talent.

Nova Education: personalized learning platform (FastAPI + Vite) with streaming LLM chat; building an “AI tutor” experience for real users. Hermitage: local-docs assistant with a custom frontend, LM Studio/AnythingLLM, an embeddings + retrieval pipeline, and privacy-first RAG. Security automation: LLM-assisted alert triage (n8n agents), access review/reporting, and least-privilege IAM templates for AWS/GCP. AI training R&D: experimenting with training/fine-tuning smaller language models (GPT-2 class).