JAIPUR: Sparsh Agrawal, a 25-year-old founder based in Jaipur, has unveiled one of the first speech-to-speech foundational AI models that can sing, whisper, pause, and respond with emotional intelligence -- all developed without big-tech infrastructure or venture capital funding.

Launched under his startup Pixa AI, Luna AI directly processes audio to generate human-like speech instead of converting it to text and back, resulting in faster, more expressive, and emotionally aware conversations.

The system's architecture allows it to whisper, modulate tone, and even sing -- creating an experience that feels more human than machine, Agrawal said.

He recently met Union IT Minister Ashwini Vaishnaw and has received praise from industry leaders for the achievement.

"Where is India's AI? Every WhatsApp group, every conference hallway, every founder call asks the same question. Today, we're sharing the answer. Meet Luna, world's first speech-to-speech foundational AI model to unify audio, music and speech," Agrawal posted on X after launching the model.

Benchmark results show Luna outperforming leading global systems such as OpenAI's GPT-4 TTS and ElevenLabs, with 50 per cent lower latency and greater naturalness in speech output.

"I didn't have a research lab or a USD 100 million runway," Agrawal said.