E-commerce · Ambient Audio AI Retail
Trim Pixel
AI-powered retail sales intelligence platform.
The brief
Build a product enabling retail brands to "become storytellers of their products and services" by recording, transcribing, and analyzing in-store sales conversations — surfacing coaching insights and SOP compliance scores to sales managers without manual review of audio recordings.
What we built
A full-stack, multi-brand retail conversational intelligence system branded internally as "YoYo AI" / "TrimPixel". The platform captures ambient audio from IoT recording devices worn by retail salespersons, processes recordings through an automated AWS pipeline (S3 → Lambda → SQS → ECS VAD/transcription), stores structured interaction data in PostgreSQL, and serves it via FastAPI to:
- a React analytics dashboard for brand and regional managers, showing touchpoint compliance, SOP adherence scores, and sales conversion metrics per salesperson;
- an internal annotation portal where ops teams label audio regions with touchpoints, test drives, themes, and red flags to generate training data; and
- brand-specific AI pipelines (Evoke hair analysis, profanity detection) deployed on GPU ECS Fargate.
The platform supports 10+ brands, including Spinny, BlueStone, Tanishq, Lenskart, Wakefit, Purplle, Lensfit, Pitstop, Soulflower, Evoke Hair, and Livspace. Infrastructure is SOC2-tracked via Sprinto.
Production-grade multi-brand platform live across 10+ Indian retail brands. Automated audio ingestion pipeline (Lambda+SQS+ECS) processing daily uploads. Client-facing analytics dashboards (Spinny, BlueStone, Tanishq, Lenskart, Wakefit etc.) with SOC2 compliance in progress. Internal annotation panel (V2) used by ops trainers. VAD service deployed on ECS Fargate with GPU support. Device management and salesperson onboarding APIs in use. QA still active as of June 2025.
Delivery timeline
How it was built, phase by phase.
8 workstreams across 75 weeks of operated delivery.
- build · Week 1–20
Audio Processing Pipeline & AI Transcription
End-to-end audio ingestion, silence/noise removal, trimming, stitching, peak extraction, and transcription using pydub, librosa, ffmpeg-python, and third-party AI APIs (SimpliSmart).
Automated audio processing pipeline capable of handling multi-file interactions.
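The stitching step for multi-file interactions can be sketched as follows. This is a minimal illustration only: it uses Python's standard-library `wave` module in place of pydub or ffmpeg-python, handles uncompressed WAV input only, and the function name `stitch_wavs` is hypothetical rather than taken from the project.

```python
import wave


def stitch_wavs(input_paths, output_path):
    """Concatenate WAV files that share one format; return the total frame count."""
    if not input_paths:
        raise ValueError("at least one input file is required")
    total_frames = 0
    reference = None  # (channels, sample width, frame rate) of the first file
    with wave.open(output_path, "wb") as out:
        for path in input_paths:
            with wave.open(path, "rb") as src:
                fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
                if reference is None:
                    # The first file defines the output format.
                    reference = fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif fmt != reference:
                    raise ValueError(f"{path} has format {fmt}, expected {reference}")
                out.writeframes(src.readframes(src.getnframes()))
                total_frames += src.getnframes()
    return total_frames
```

Rejecting mismatched formats up front keeps the sketch simple; a production pipeline would more likely resample or re-encode the parts before joining them.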
pydub, librosa, ffmpeg-python, AWS Lambda, AWS S3, SimpliSmart API
- build · Week 1–75
Multi-Brand Client Analytics Dashboard
Role-based sales analytics dashboard supporting multiple retail brands (Spinny, BlueStone, Tanishq, Lenskart, Wakefit, Purplle, Evoke, Pitstop, Soulflower, Livspace).
Live production dashboards used by sales managers at multiple Indian retail brands to monitor salesperson performance and customer interaction quality.
React, Redux, Ant Design, Chart.js, WaveSurfer.js, FastAPI
- build · Week 8–42
AI Training / Internal Annotation Panel
Web-based internal tool allowing operations teams to annotate audio recordings with markers, regions, touchpoints, test drives, themes, and red flags.
The ops team can annotate hundreds of sales interactions daily; the data feeds the AI model training pipeline.
React, Redux, WaveSurfer.js, FastAPI, PostgreSQL, AWS S3
- integrate · Week 8–60
Lambda-Triggered Automation & Data Ingestion
AWS Lambda functions triggered by S3 uploads via SQS queues to automatically process raw audio files, extract metadata, update the database, and trigger downstream ML pipelines.
Fully automated ingestion pipeline: audio uploaded to S3 → Lambda triggered → VAD/transcription queued → DB updated → dashboard refreshed
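The first hop of this flow (S3 upload → SQS → Lambda) can be sketched as below. The event shapes follow the standard AWS format, where each SQS record's `body` is a JSON-encoded S3 notification. The handler name, the `.wav`/`.mp3` filter, and the return payload are assumptions for illustration, not the project's actual code.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_event):
    """Return (bucket, key) pairs from an SQS event whose bodies are S3 notifications."""
    objects = []
    for sqs_record in sqs_event.get("Records", []):
        body = json.loads(sqs_record["body"])
        # S3 test events ("s3:TestEvent") carry no "Records" list, so they are skipped.
        for s3_record in body.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 notifications.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects


def handler(event, context):
    """Hypothetical Lambda entry point: collect uploaded audio files to process."""
    uploads = [
        {"bucket": bucket, "key": key}
        for bucket, key in extract_s3_objects(event)
        if key.lower().endswith((".wav", ".mp3"))  # assumed audio extensions
    ]
    # A real handler would queue VAD/transcription and update the database here.
    return {"queued": uploads}
```

Keeping the parsing in a pure function makes it testable without AWS credentials; the downstream calls (boto3, database writes) would sit behind the comment in `handler`.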
AWS Lambda, AWS SQS, AWS S3, AWS EventBridge, AWS Batch, boto3
- deploy · Week 14–75
AWS Infrastructure & CI/CD Pipeline
Full AWS infrastructure including EC2, S3, Lambda, SQS, ECS Fargate, CloudFront, RDS/PostgreSQL, CloudWatch, EventBridge, and Jenkins-based CI/CD pipelines.
Multi-environment (sandbox/pilot/prod) deployment pipeline with full observability, auto-scaling ML services, and SOC2-track compliance posture.
AWS EC2, AWS S3, AWS Lambda, AWS SQS, AWS ECS Fargate, AWS CloudFront
- deploy · Week 28–50
VAD (Voice Activity Detection) Model Integration
Custom deployment and integration of a Silero VAD model on AWS ECS with Fargate and GPU support. Includes speaker segmentation, fingerprinting, and audio classification.
Production-grade VAD service capable of segmenting salesperson vs. customer speech in retail interaction recordings
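As a rough illustration of the segmentation step, the sketch below post-processes VAD output into contiguous speech regions. It assumes timestamps arrive as dicts with `start` and `end` expressed in samples (the shape Silero VAD's timestamp helper is generally used with). The function name, the 16 kHz sample rate, and the 0.5 s gap threshold are hypothetical choices. It does not attempt speaker attribution, which the service handles separately.

```python
def merge_speech_segments(timestamps, sample_rate=16000, max_gap_s=0.5):
    """Merge speech chunks separated by short pauses; return (start_s, end_s) tuples."""
    merged = []
    for ts in sorted(timestamps, key=lambda t: t["start"]):
        start_s = ts["start"] / sample_rate
        end_s = ts["end"] / sample_rate
        if merged and start_s - merged[-1][1] <= max_gap_s:
            # Pause is short enough: extend the previous segment.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end_s))
        else:
            merged.append((start_s, end_s))
    return merged
```

For example, chunks at 0–1 s and 1.25–2 s collapse into one 0–2 s segment, while a chunk starting at 4 s opens a new one.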
Silero VAD, AWS ECS, AWS Fargate, Docker, GPU EC2, NVIDIA AMI
- build · Week 40–75
SOP Compliance & Adherence Analytics
Backend APIs and data models for tracking Standard Operating Procedure (SOP) adherence per salesperson across brands.
Sales managers can view SOP compliance scores per salesperson per day, with drill-down into individual interactions and downloadable CSV reports
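One way such a per-salesperson, per-day score and CSV export could be computed is sketched below. The scoring rule (share of required touchpoints covered, averaged over the day's interactions), the field names, and the column layout are all assumptions made for illustration. The source does not describe the actual formula or schema.

```python
import csv
import io
from collections import defaultdict


def sop_scores(interactions, required_touchpoints):
    """Average touchpoint coverage (0-100) per (salesperson, day)."""
    required = set(required_touchpoints)
    if not required:
        raise ValueError("required_touchpoints must not be empty")
    coverage = defaultdict(list)
    for item in interactions:
        covered = required & set(item["touchpoints"])
        coverage[(item["salesperson"], item["day"])].append(len(covered) / len(required))
    return {key: round(100 * sum(vals) / len(vals), 1) for key, vals in coverage.items()}


def scores_to_csv(scores):
    """Render the scores as CSV text, sorted by salesperson and day."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["salesperson", "day", "sop_score"])
    for (person, day), score in sorted(scores.items()):
        writer.writerow([person, day, score])
    return buf.getvalue()
```

In a FastAPI service the CSV text could be returned directly or written to S3 for download; both are outside this sketch.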
FastAPI, PostgreSQL, SQLAlchemy, Python, AWS S3 (CSV export)
- build · Week 40–58
Evoke AI Pipeline (Hair Analysis)
Dedicated AI processing pipeline for the 'Evoke' brand (hair/beauty sector).
End-to-end Evoke AI pipeline from device audio upload to SOP evaluation output, live in production.
FastAPI, AWS Lambda, AWS ECS Fargate, GPU EC2, Docker, boto3
