# Voice AI Agent

A modern voice-enabled AI agent built with Next.js that lets you have natural conversations with AI using speech recognition and text-to-speech capabilities.
## Features
- Voice Interaction: Speak to the AI and get audio responses
- Real-time Speech Recognition: Convert speech to text using browser APIs
- Text-to-Speech: AI responses are played back as audio
- Tool Integration: Powered by Composio for advanced AI capabilities
- Modern UI: Clean interface built with Tailwind CSS and Radix UI
- Responsive Design: Works seamlessly across desktop and mobile devices
## Tech Stack
- Framework: Next.js 15.3.4 with App Router
- Language: TypeScript
- Styling: Tailwind CSS
- UI Components: Radix UI + shadcn/ui
- AI: OpenAI GPT models via LangChain
- Tools: Composio Core for AI agent capabilities
- Speech: Web Speech API for recognition and synthesis (see the sketch after this list)
- State Management: Zustand
- Animations: Framer Motion
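
Speech capture and playback happen entirely in the browser via the Web Speech API. The snippet below is a minimal sketch of the two halves of that API that the project's hooks wrap; the function names (`listenOnce`, `speak`) are illustrative, not the repo's actual exports.

```typescript
// Minimal client-side sketch of the Web Speech API (runs in the browser only).
// Names are illustrative; the real hooks add debouncing and state handling.

export function listenOnce(onResult: (transcript: string) => void): void {
  // Chrome still ships the recognizer under a webkit prefix, hence the casts.
  const SpeechRecognitionImpl =
    (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;
  if (!SpeechRecognitionImpl) {
    console.warn("Speech recognition is not supported in this browser.");
    return;
  }
  const recognition = new SpeechRecognitionImpl();
  recognition.lang = "en-US";
  recognition.interimResults = false;
  recognition.onresult = (event: any) => {
    // The final transcript is the first alternative of the last result.
    const transcript = event.results[event.results.length - 1][0].transcript;
    onResult(transcript);
  };
  recognition.start();
}

export function speak(text: string): void {
  // Text-to-speech via the synthesis half of the Web Speech API.
  const utterance = new SpeechSynthesisUtterance(text);
  window.speechSynthesis.speak(utterance);
}
```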
## Project Structure

```
voice-ai-agent/
├── app/                          # Next.js app directory
│   ├── api/                      # API routes
│   │   ├── chat/                 # Chat endpoint
│   │   └── tts/                  # Text-to-speech endpoint
│   ├── globals.css               # Global styles
│   ├── layout.tsx                # Root layout
│   └── page.tsx                  # Home page
├── components/                   # React components
│   ├── ui/                       # Base UI components
│   ├── chat-header.tsx           # Chat header component
│   ├── chat-input.tsx            # Message input component
│   ├── chat-interface.tsx        # Main chat interface
│   ├── chat-messages.tsx         # Messages display
│   └── settings-modal.tsx        # Settings modal
├── hooks/                        # Custom React hooks
│   ├── use-audio.ts              # Audio playback logic
│   ├── use-chat.ts               # Chat state management
│   ├── use-mounted.ts            # Mount detection
│   └── use-speech-recognition.ts # Speech recognition
├── lib/                          # Utility libraries
│   ├── validators/               # Input validation
│   ├── alias-store.ts            # State management
│   ├── constants.ts              # App constants
│   ├── error-handler.ts          # Error handling
│   └── utils.ts                  # Helper functions
└── public/                       # Static assets
```
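
The `app/api/chat` and `app/api/tts` routes are the project's only server-side surface: chat produces the assistant's reply, and TTS turns that reply into audio. Below is a hedged sketch of how a client might call them in sequence; the payload fields (`message`, `reply`, `text`) are assumptions for illustration, not the project's actual request contract.

```typescript
// Illustrative only: the request/response shapes are assumed,
// not taken from this repo's validators.
export async function askAndSpeak(message: string): Promise<void> {
  // 1. Send the user's text to the chat endpoint.
  const chatRes = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message }),
  });
  const { reply } = await chatRes.json();

  // 2. Ask the TTS endpoint to synthesize the reply, then play it back.
  const ttsRes = await fetch("/api/tts", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: reply }),
  });
  const audioBlob = await ttsRes.blob();
  const audio = new Audio(URL.createObjectURL(audioBlob));
  await audio.play();
}
```

Routing synthesis through `/api/tts` rather than relying only on the browser's built-in voices presumably keeps the OpenAI key on the server and lets the backend choose the voice.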
## Getting Started

### Prerequisites
- Node.js 18+
- npm or yarn
- OpenAI API key
- Composio API key (optional)
### Installation

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd voice-ai-agent
   ```

2. Install dependencies:

   ```bash
   npm install
   ```

3. Set up environment variables (see the note after these steps on how the keys are consumed):

   ```bash
   cp .env.example .env.local
   ```

   Add your API keys to `.env.local`:

   ```
   OPENAI_API_KEY=your_openai_api_key_here
   COMPOSIO_API_KEY=your_composio_api_key_here
   ```

4. Run the development server:

   ```bash
   npm run dev
   ```

5. Open http://localhost:3000 in your browser.
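
Both keys are read on the server only and are never shipped to the browser. As a rough illustration (not the repo's actual handler, which wires up LangChain, Composio, and the shared error handler), a route can guard on the key like this:

```typescript
// app/api/chat/route.ts (sketch): shows only how the .env.local keys
// are consumed server-side; the real handler invokes the model via LangChain.
import { NextResponse } from "next/server";

export async function POST(req: Request) {
  if (!process.env.OPENAI_API_KEY) {
    return NextResponse.json(
      { error: "OPENAI_API_KEY is missing. Add it to .env.local and restart the dev server." },
      { status: 500 }
    );
  }
  const body = await req.json(); // validated by lib/validators in the real route
  // ... invoke the model and return its reply ...
  return NextResponse.json({ reply: "placeholder", received: body });
}
```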
## Usage
- Text Chat: Type messages in the input field and press Enter
- Voice Chat: Click the microphone button to start voice input
- Audio Responses: Toggle audio playback in the settings
- Settings: Access configuration options via the settings modal
## Development

### Available Scripts

```bash
npm run dev     # Start development server with Turbopack
npm run build   # Build for production
npm run start   # Start production server
npm run lint    # Run ESLint
```
### Key Components
- ChatInterface: Main component orchestrating the chat experience (a rough composition sketch follows this list)
- useChat: Manages chat state and API communication
- useAudio: Handles text-to-speech functionality
- useSpeechRecognition: Manages voice input with debouncing
- API Routes: Handle chat processing and TTS generation
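
Roughly, those pieces compose as sketched below. The hook names match the files under `hooks/`, but the signatures, option names, and message shape are hypothetical, shown only to illustrate the data flow.

```tsx
"use client";
// Hypothetical composition sketch: hook names match the files in hooks/,
// but the signatures and return shapes below are invented for illustration.
import { useChat } from "@/hooks/use-chat";
import { useAudio } from "@/hooks/use-audio";
import { useSpeechRecognition } from "@/hooks/use-speech-recognition";

export function ChatInterfaceSketch() {
  const { play } = useAudio(); // plays audio returned by /api/tts
  const { messages, sendMessage } = useChat({
    onAssistantReply: (text: string) => play(text), // speak each reply
  });
  const { listening, startListening } = useSpeechRecognition({
    onTranscript: (text: string) => sendMessage(text), // debounced voice input
  });

  return (
    <div>
      <ul>
        {messages.map((m: { id: string; content: string }) => (
          <li key={m.id}>{m.content}</li>
        ))}
      </ul>
      <button onClick={startListening}>
        {listening ? "Listening…" : "Speak"}
      </button>
    </div>
  );
}
```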
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request
## License
This project is licensed under the Apache License.