Investigating 'embedded Claude': the why and the where.


AI Innovation Leader & Fractional Head of AI
Transforming businesses through cutting-edge AI solutions, voice technology, and intelligent automation
Peter Michael Gits is a pioneering AI innovation leader based in Charlotte, North Carolina, specializing in transforming businesses through artificial intelligence and voice technology solutions.
With deep expertise in AI implementation, Peter serves as a Fractional Head of AI for organizations looking to leverage cutting-edge technology without the overhead of a full-time executive. His approach combines strategic vision with hands-on technical execution.
Peter's work spans from developing sophisticated speech-to-speech AI systems to creating intelligent chatbots and voice-enabled booking systems. His innovative solutions help businesses automate processes, enhance customer experiences, and unlock new revenue streams.
Based in Charlotte, Peter works with clients across North America, bringing enterprise-level AI capabilities to organizations of all sizes.

Charlotte, NC
Serving clients across North America
Best-in-class voice AI, tailored to your requirements. Transform your customer experience with RAG-aware agentic voice AI that understands, responds, and resolves.
Break my AI Admin

Outdated & Frustrating
Intelligent & Conversational
Our agents leverage Retrieval-Augmented Generation to access your knowledge base in real-time, providing accurate, contextual responses every time.
No more frustrating menu trees. Customers speak naturally and get immediate, intelligent responses that understand context and intent.
Your AI voice agent never sleeps, never takes breaks, and handles unlimited concurrent calls with consistent quality.
Reduce call center costs by up to 70% while improving customer satisfaction scores and first-call resolution rates.
Smart routing ensures complex issues reach human agents with full conversation context, eliminating repetition.
Gain insights into customer needs, sentiment trends, and operational metrics with comprehensive dashboards.
Your customer dials in and is greeted by a natural, friendly AI voice
Advanced NLP processes intent, context, and sentiment in real-time
Instantly accesses your knowledge base for accurate, relevant information
Provides answers, takes actions, or seamlessly escalates when needed
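The four steps above describe a standard RAG voice-agent loop. As a rough illustration only, the control flow could be sketched like this; every callable here (`classify_intent`, `retrieve`, `generate`, `escalate`) is a placeholder interface, not any particular vendor's API:

```python
def handle_turn(transcript, classify_intent, retrieve, generate, escalate,
                confidence_floor=0.6):
    """One conversational turn: intent -> retrieval -> answer or handoff.

    All arguments are placeholder callables standing in for real
    components (NLP classifier, vector search, LLM, human-agent queue);
    only the control flow is meaningful here.
    """
    intent, confidence = classify_intent(transcript)
    if confidence < confidence_floor:
        # Low confidence: hand off to a human with the full transcript
        # so the customer never has to repeat themselves.
        return escalate(transcript)
    context = retrieve(intent, transcript)   # RAG step: hit the knowledge base
    return generate(transcript, context)     # answer grounded in retrieved docs
```

The `confidence_floor` value is illustrative; in a real deployment the escalation threshold would be tuned per use case.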
Ready to transform your customer experience?
Schedule a Demo

Over 15 years shaping the future of enterprise communications, from IP telephony architecture to AI-powered voice solutions at one of the world's leading technology companies.
Cisco Systems
Unified Communications
Voice Technology R&D
Distinguished Engineer & Architect
"My years at Cisco taught me that great voice technology isn't just about clear audio—it's about enabling human connection at scale."
— Peter Gits
Comprehensive AI solutions tailored to your business needs
Fractional Head of AI
Strategic AI leadership without the full-time commitment. Get executive-level AI expertise to guide your organization's AI transformation, develop roadmaps, and implement best practices.
Custom AI Implementation
End-to-end AI project delivery from concept to deployment. Leverage cutting-edge machine learning, natural language processing, and automation to solve your unique business challenges.
Ready-to-Deploy Solutions
Powerful AI-driven products designed to transform your customer interactions and streamline operations with voice technology and intelligent automation.
Explore live demonstrations of our cutting-edge voice AI technology
AI Products

Revolutionary AI-powered voice booking system that replaces traditional IVR with intelligent conversational agents. Experience seamless appointment scheduling through natural voice interactions.
AI Products
Smart AI-powered browser extension that automatically detects and skips YouTube advertisements, providing an uninterrupted viewing experience.
Only 25 licenses available per week during beta
Innovative Director and Principal Architect with over 15 years of experience leading cross-functional engineering teams and pioneering cutting-edge AI solutions.

Recently spearheaded GenAI business unit launch at Hexalinks, architecting Agentic Services, ACP, and MCP protocols using CrewAI, Claude Code, and Microsoft Copilot Studio. Proven track record managing teams up to 75 engineers while securing 40+ patents in scalable infrastructure.
Expert in full-stack development, RAG systems, LLM integration, and cloud-native architectures across AWS/Azure. Passionate about transforming complex technical concepts into business value, having delivered ML models with 89%+ accuracy and reduced delivery times by 25%.
Pioneering speech-to-speech systems and voice chatbots that replace traditional IVR with intelligent, RAG-aware conversational agents.
Adept in customer-facing roles, helping diverse teams understand user interfaces and specialized technical knowledge to streamline IT operations and maintenance.
Insights and innovations from the world of AI
Ready to transform your business with AI? Let's discuss your project
Charlotte, North Carolina
Serving clients across North America
Join the Beta

Get early access to AdSkipper Pro and skip YouTube ads like never before
Claude in a box-4: showing off a 'Claude Vision' reel. Claude in a box, or on a server, can now handle vision tasks that weren't possible last week. Ask me how I built this skill and how you can use it in your own work. Happy to have a conversation; you can reach me at https://petergits.com. If you want to read the previous post in the series (Claude in a box-3): https://lnkd.in/etbwBA8D

#ClaudeAI #IntelligentAutomation #RPA #AIAgents #SelfHealingAutomation #Playwright #BrowserAutomation #AnthropicAI #AIIntegration #DeveloperCommunity #TechInnovation #FutureProofing #AutomationEngineering
# Building ChatCal.ai - Getting Calendar Booking to Actually Work

**Subject**: Building an AI calendar assistant from scratch - OAuth, Claude Sonnet 4, and Google Calendar

---

I wanted to let people book meetings with me without showing them my entire calendar. Sounds simple, right? Turns out there's a lot between "idea" and "working prototype." Then you get down to the real bit: speech-to-speech.

## What I Built

A conversational AI that handles meeting requests like "book 30 minutes with Peter tomorrow at 2pm" and creates actual Google Calendar events with Meet links. Built with Python FastAPI, Claude Sonnet 4 from Anthropic, and the Google Calendar API, all running in Docker containers.

## The Tools

I picked Claude Sonnet 4 because it's good at extracting information from messy human language. Someone says "this is Betty, call me at 630-800-9000" and Claude figures out that's a name and phone number. The Google Calendar API handles the scheduling, and FastAPI serves it all up with a simple chat interface.

## What Actually Took Time

Google OAuth2 was my first weekend. Setting up redirect URIs, configuring scopes, understanding the consent screen - the docs are thorough, but there's a lot of trial and error. I spent more time on OAuth than on building the chat interface.

Getting email invitations working with .ics calendar attachments took another evening. SMTP configuration, email templates, making sure both people get notifications - small details that matter when someone's actually using this.

## Early Challenges

Claude sometimes got creative with extracting contact info. "Book a meeting with Sarah" would work, but "Sarah wants to chat next week" might miss the name entirely. I added better prompting and examples to make it more reliable.

Google Meet link generation worked once I found the right API parameters. The Calendar API has a `conferenceData` field that creates real Meet links, not fake ones.

## Was It Worth the Free Time?
I learned way more about OAuth flows than I expected. And debugging why calendar invitations weren't showing up taught me about email headers and RFC 5545. Not glamorous, but useful.

Next: Making it fast enough for real use.

---

📅 **I'm sharing this journey in 4 parts. Should I post the rest:**
- a) Daily (bite-sized updates)
- b) Weekly (time to digest)

**Vote below!** 👇

---

**Architecture diagram**: See chatcal-ai-design-document.html for the complete flow
**Stack**: Python, FastAPI, Claude Sonnet 4 (Anthropic), Google Calendar API, Docker, SMTP
**Time invested**: ~3 weekends (OAuth setup, email integration, Claude prompting)
**Note**: Full source code freely available on GitHub (details in Phase 2 post)

#AI #Python #GoogleCalendar #OAuth #BuildingInPublic #SideProject
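As a footnote to the post above: the `conferenceData` trick for getting a real Meet link works by attaching a `createRequest` to the event body and passing `conferenceDataVersion=1` on insert. A minimal sketch (the event details are illustrative, not from the actual ChatCal.ai codebase):

```python
import uuid


def build_meet_event(summary, start_iso, end_iso, attendee_email):
    """Build a Calendar API event body that asks Google to attach a Meet link."""
    return {
        "summary": summary,
        "start": {"dateTime": start_iso},
        "end": {"dateTime": end_iso},
        "attendees": [{"email": attendee_email}],
        # conferenceData.createRequest is what yields a *real* Meet link;
        # requestId just has to be unique per create request.
        "conferenceData": {
            "createRequest": {
                "requestId": str(uuid.uuid4()),
                "conferenceSolutionKey": {"type": "hangoutsMeet"},
            }
        },
    }


# With an authorized google-api-python-client `service`, the insert call
# must pass conferenceDataVersion=1, or conferenceData is silently ignored:
#
# service.events().insert(
#     calendarId="primary",
#     body=build_meet_event("30 min with Peter",
#                           "2025-07-31T14:00:00-04:00",
#                           "2025-07-31T14:30:00-04:00",
#                           "betty@example.com"),
#     conferenceDataVersion=1,
#     sendUpdates="all",  # emails the invite to both parties
# ).execute()
```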
# Making ChatCal.ai Fast - The Groq Migration and Why Response Time Matters

**Subject**: Switched from Claude to Groq for 10x speed - learned why response time matters more than I thought

(This is post 2 of 4 in a series)

---

ChatCal.ai worked, but waiting 2-3 seconds for every response felt slow. When someone's trying to book a meeting, that delay adds up. Time to make it faster.

## The Performance Problem

Claude Sonnet 4 is great at reasoning, but each response took 2-3 seconds. For a back-and-forth conversation - name, time, confirmation - that's 10+ seconds of waiting. Not terrible, but not snappy either.

## Why I Switched to Groq

Groq's Llama-3.1-8b-instant gave me ~500ms responses. That's the difference between "thinking..." and "instant." Plus the cost dropped from $3/million tokens to $0.60/million. I'm paying for this myself, so that matters.

The trade-off? Claude's reasoning is slightly better for complex requests. But for scheduling meetings, Llama-3.1 handles it fine. I'll take the 5x speed improvement.

## The Memory Problem

Early versions forgot context immediately. You'd say "I'm Betty" and two messages later it would ask your name again. Frustrating for users, and obvious I needed conversation persistence.

I added ChatMemoryBuffer to keep conversation history. Now it remembers names, emails, and what you talked about. The system even retains info for future conversations - if Betty books again next week, it already knows her email.

## What Failed First

I tried keeping full conversation history. Hit token limits fast. A long conversation would blow through my context window and start dropping messages.

The fix: smart context windowing. Keep critical information (name, email, phone), summarize or drop the rest. Not perfect, but it works.

## Other Improvements

Added custom meeting IDs in MMDD-HHMM-DURm format (like "0731-1400-60m" for July 31, 2pm, 60 minutes). Makes it easy to reference meetings for cancellations.
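The MMDD-HHMM-DURm meeting-ID format above is simple enough to sketch in a couple of lines (this is my reconstruction of the format from the post's own example, not the actual ChatCal.ai code):

```python
from datetime import datetime


def meeting_id(start: datetime, duration_min: int) -> str:
    """Format a meeting ID as MMDD-HHMM-DURm, e.g. '0731-1400-60m'
    for July 31, 2pm, 60 minutes."""
    return f"{start:%m%d}-{start:%H%M}-{duration_min}m"
```

Encoding the date, time, and duration directly in the ID means a user (or the LLM) can reference a booking unambiguously without a database lookup.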
Built intelligent cancellation matching - say "cancel my Friday 2pm meeting" and it finds the right one without needing the ID. Works by matching user info + date/time.

Switched to HTML-formatted responses for better readability. Styled confirmations, color coding, professional layout.

## The Saturday I Lost

Debugging why context kept disappearing took an entire Saturday. Turned out I was reinitializing the memory buffer on every request. Moved it to session state and everything clicked.

## Was It Worth It?

The speed difference is noticeable. Conversations feel natural now instead of waiting for responses. And saving 80% on API costs meant I could actually afford to keep this running.

Next: Adding voice input so people don't have to type.

---

**Metrics**: 2-3s → 500ms response time, $3/M → $0.60/M tokens, conversation memory added
**Stack**: Groq Llama-3.1-8b-instant, ChatMemoryBuffer, Python FastAPI, Redis session storage

#AI #Performance #Groq #LLM #Optimization #ProductDevelopment
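A footnote on the "user info + date/time" cancellation matching described above: the core idea can be sketched as a lookup over booked meetings, matching the requester's email plus the requested date and hour. The dict shape here is illustrative, not the real ChatCal.ai data model:

```python
from datetime import datetime


def find_meeting(meetings, email, when: datetime):
    """Return the booking whose attendee and start time match the request.

    `meetings` is a list of dicts like
    {"email": ..., "start": datetime, "id": "0731-1400-60m"}
    (an assumed shape for illustration). Matching on date + hour lets
    "cancel my Friday 2pm meeting" resolve without an explicit ID.
    """
    for m in meetings:
        if (m["email"].lower() == email.lower()
                and m["start"].date() == when.date()
                and m["start"].hour == when.hour):
            return m
    return None  # no match: ask the user for more detail
```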
# Adding Voice Input - When HuggingFace Spaces Fights Back

**Subject**: Built WebRTC speech-to-text on free GPU - FastAPI + Gradio don't play nice on HuggingFace

---

Typing "book a meeting tomorrow at 2pm" is fine. Saying it should be better. Time to add voice input using free GPU resources on HF.

## The Plan vs Reality

I wanted WebRTC audio capture in the browser, streaming to Whisper STT running on HuggingFace Spaces (free H200 GPU). Seemed straightforward. It wasn't.

## Week 1: The FastAPI Trap

Tried combining FastAPI for WebSocket endpoints with Gradio for the ML interface. HuggingFace Spaces hated this. Port conflicts, mount failures, 500 errors with no useful logs. The platform is optimized for Gradio interfaces, not hybrid FastAPI apps.

## Week 2: WebSocket Dreams Die

Attempted pure WebSocket implementations for real-time audio streaming. HuggingFace Spaces only reliably supports HTTP/HTTPS traffic. Their proxy layer doesn't handle WebSocket connections in containers. Tried Socket.IO, tried different ports, tried bypassing their proxy. Nothing worked consistently. WebSockets are a no-go on HF when using containers.

## Week 3: The Breakthrough

Switched to HTTP-based chunking. The browser still captures audio via WebRTC, but sends it in chunks over regular HTTP. Not true real-time streaming, but good enough. Gradio's client expects files wrapped with the `handle_file()` function; the documentation didn't make this obvious.

## What Actually Works

The browser captures 16kHz mono audio using navigator.mediaDevices. Voice Activity Detection (energy-based filtering) identifies when someone's actually speaking. Only chunks with voice activity get sent, to avoid wasting GPU time.

Audio goes to the Gradio client wrapped in the proper format, hits the Whisper base model on an H200 GPU, and returns a transcription in 6-8 seconds. Not real-time, but usable.

## The Cost Win

Free H200 GPU vs paying $0.006/second for commercial STT APIs. For a side project, that's the difference between "affordable" and "not happening."
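The energy-based Voice Activity Detection mentioned above boils down to computing the RMS energy of each PCM chunk and only uploading chunks that clear a threshold. A minimal sketch for 16-bit little-endian mono audio (the threshold value is illustrative and would be tuned against real mic levels, not taken from the actual implementation):

```python
import math
import struct


def rms_energy(pcm_bytes: bytes) -> float:
    """Root-mean-square energy of 16-bit little-endian mono PCM."""
    n = len(pcm_bytes) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", pcm_bytes[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def has_voice(pcm_bytes: bytes, threshold: float = 500.0) -> bool:
    """Gate chunks by energy so silence never burns GPU time.

    The threshold is an assumed placeholder; in practice it's
    calibrated from background-noise levels in the browser capture.
    """
    return rms_energy(pcm_bytes) > threshold
```

Chunks that fail `has_voice` are simply dropped client-side, so only speech-bearing audio ever makes the HTTP round trip to Whisper.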
## What I Built for Testing

Trialed the MCP Voice Service: voice-like sine waves at 300-500Hz with proper energy levels.

## The Free-Time Reality

Four days of evenings debugging why FastAPI wouldn't mount, why WebSockets dropped, why file uploads failed validation. I could've just paid for a commercial STT API and been done. But I learned HuggingFace Spaces' limits, Gradio's file-handling quirks, and WebRTC. Knowledge worth having for future projects.

## Still Not Perfect

6-8 second latency is higher than I wanted. Commercial solutions hit 2 seconds. But for free GPU compute, I'll take it.

Next: Adding voice output to complete the speech-to-speech loop.

---

**Tech**: WebRTC, Whisper base (OpenAI), HuggingFace Spaces (H200 GPU), Gradio client, Voice Activity Detection
**Time**: 4 evenings debugging, ~20 hours total
**Code**: Full implementation available on GitHub: https://lnkd.in/gNvSDeBu

#WebRTC #SpeechToText #HuggingFace #AI #VoiceAI #BuildingInPublic #OpenSource