Utah DHHS Licensing Assistant
Get instant answers to your licensing questions
Built with industry-leading technologies
Real-Time Voice AI
OpenAI Realtime API with sub-second latency (300-500ms). Streaming voice responses with seamless text/voice switching.
AI-Powered Semantic Search
100% accuracy on FAQ queries, powered by a Qdrant vector database and an advanced RAG architecture.
Cross-Mode Memory
Perfect conversation memory across voice and text modes. No context loss when switching between modes.
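One way to picture cross-mode memory is a single per-session transcript that both voice and text turns append to. The sketch below is illustrative only (class and field names are our own, not the actual implementation):

```python
from dataclasses import dataclass, field

@dataclass
class SessionMemory:
    """One shared transcript per session; voice and text append to the same log."""
    turns: list = field(default_factory=list)

    def add(self, mode: str, role: str, content: str) -> None:
        # Tag each turn with its input mode, but keep one unified history.
        self.turns.append({"mode": mode, "role": role, "content": content})

    def history(self) -> list:
        """Full history regardless of mode, so switching loses no context."""
        return [{"role": t["role"], "content": t["content"]} for t in self.turns]
```

Because `history()` ignores the mode tag, a question asked by voice is still in context when the user switches to text, and vice versa.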
Intelligent Case Manager Routing
Automatically connects you to the right case manager based on your service type and Utah county.
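Conceptually, this routing is a lookup keyed on service type and county, with a general fallback. A minimal sketch (all contacts, service types, and the directory itself are hypothetical placeholders):

```python
# Hypothetical directory mapping (service type, county) to a contact.
CASE_MANAGER_DIRECTORY = {
    ("child care", "Salt Lake"): "case.manager.a@example.gov",
    ("child care", "Utah"): "case.manager.b@example.gov",
    ("foster care", "Salt Lake"): "case.manager.c@example.gov",
}
DEFAULT_CONTACT = "licensing@example.gov"  # fallback for unmatched requests

def route(service_type: str, county: str) -> str:
    """Return the case manager contact for a service type and Utah county."""
    key = (service_type.strip().lower(), county.strip().title())
    return CASE_MANAGER_DIRECTORY.get(key, DEFAULT_CONTACT)
```

Normalizing case before the lookup means "Child Care" in "salt lake" still resolves to the right entry, and anything unrecognized falls back to the general licensing contact rather than failing.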
Why HK Tech for DHHS?
A modern, secure, and flexible solution built with cutting-edge technology, without the corporate overhead, conflicts of interest, or red tape
Real-Time Voice AI
OpenAI Realtime API with sub-second latency (300-500ms) and seamless voice/text switching with perfect memory continuity
Advanced RAG Architecture
Semantic search over a Qdrant vector database, achieving 100% accuracy on FAQ queries, with session-based conversational memory
Complete Self-Hosting
Frontend, backend (n8n), and vector database all self-hosted for maximum control and security
Platform Agnostic
Deploy anywhere: AWS, GCP, Azure, or on-premises. Fully Dockerized for seamless portability
Cost Optimization
Self-hosting reduces ongoing costs. Open-source stack means no vendor lock-in or licensing fees
Small & Agile Team
No corporate red tape. Fast iteration, direct communication, and cutting-edge implementation
Side-by-Side Comparison
HK Tech vs. Traditional Consulting Firms (Deloitte, etc.)
| Feature | HK Tech | Large Consulting Firms |
|---|---|---|
| Self-Hosted Solution | ✓ | ✗ |
| Open Source Stack | ✓ | ✗ |
| Platform Flexibility | ✓ | ✗ |
| No Vendor Lock-In | ✓ | ✗ |
| Direct Team Access | ✓ | ✗ |
| Rapid Iteration | ✓ | ✗ |
| Corporate Red Tape | ✗ | ✓ |
| Conflicts of Interest | ✗ | ✓ |
What We've Built
Real-Time Voice Technology
OpenAI Realtime API with sub-second response times (300-500ms). WebRTC streaming with perfect memory synchronization across voice and text modes
Advanced RAG System
Semantic search with Qdrant vector database, achieving 100% accuracy on FAQ queries with conversational memory that remembers context within each session
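The core idea behind the retrieval step is nearest-neighbor search over embedding vectors. The toy sketch below illustrates the concept with plain cosine similarity; the actual system stores real embeddings in Qdrant and queries them there, and the FAQ entries and vectors shown are made up:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embeddings"; the real system stores model-generated vectors in Qdrant.
FAQ_VECTORS = {
    "How do I renew a child care license?": [0.9, 0.1, 0.0],
    "What are the staff-to-child ratios?": [0.1, 0.9, 0.1],
}

def retrieve(query_vector, top_k=1):
    """Return the top_k FAQ entries most similar to the query vector."""
    scored = sorted(FAQ_VECTORS.items(),
                    key=lambda kv: cosine(query_vector, kv[1]),
                    reverse=True)
    return [question for question, _ in scored[:top_k]]
```

In the full RAG pipeline, the retrieved FAQ text is then passed to the language model as grounding context, which is what keeps answers tied to the licensing documentation rather than the model's general knowledge.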
Complete Self-Hosting
Every component runs on your infrastructure: Next.js frontend, n8n workflows, and Qdrant vector database. You control everything
Privacy Options
Option to use Ollama with open-weight models (Llama, Mistral) for complete privacy: no data ever leaves your network
Deploy Anywhere
Fully Dockerized and cloud-agnostic. Deploy on AWS, GCP, Azure, or on-premises infrastructure. The choice is yours
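A deployment of this shape could be described in a single Compose file along these lines. This is a sketch only: the service names, ports, and build paths are assumptions, not the project's actual configuration (the `n8nio/n8n` and `qdrant/qdrant` images are the official ones):

```yaml
# Illustrative docker-compose sketch; names and paths are assumptions.
services:
  frontend:
    build: ./frontend            # Next.js app
    ports: ["3000:3000"]
  n8n:
    image: n8nio/n8n             # workflow engine / backend
    ports: ["5678:5678"]
    volumes: ["n8n_data:/home/node/.n8n"]
  qdrant:
    image: qdrant/qdrant         # vector database
    ports: ["6333:6333"]
    volumes: ["qdrant_data:/qdrant/storage"]
volumes:
  n8n_data:
  qdrant_data:
```

Because everything is expressed as containers and named volumes, the same file runs unchanged on AWS, GCP, Azure, or an on-premises Docker host.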
AI Safety & Topic Guardrails
Real-time content moderation and topic enforcement. Prevents jailbreak attempts and keeps conversations focused on DHHS licensing
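A simplified picture of topic enforcement is a pre-check that classifies each incoming message before it reaches the model. The keyword lists and patterns below are purely illustrative; a production guardrail would use a moderation model rather than hand-written rules:

```python
import re

# Hypothetical allow-list and jailbreak patterns, for illustration only.
ALLOWED_TOPICS = ("license", "licensing", "renewal", "background check",
                  "child care", "foster", "dhhs")
JAILBREAK_PATTERNS = (r"ignore (all|previous) instructions",
                      r"pretend you are")

def check_message(text: str) -> str:
    """Classify an incoming user message as 'blocked', 'off_topic', or 'ok'."""
    lowered = text.lower()
    if any(re.search(pattern, lowered) for pattern in JAILBREAK_PATTERNS):
        return "blocked"          # likely jailbreak attempt
    if not any(topic in lowered for topic in ALLOWED_TOPICS):
        return "off_topic"        # steer back to DHHS licensing
    return "ok"
```

Messages flagged `blocked` or `off_topic` never reach the model; the assistant instead returns a fixed redirect back to licensing questions.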
Built by a small, agile team using cutting-edge, proven technology with zero conflicts of interest