2 14

Shukdev Datta

shukdevdatta123

AI & ML interests

None yet

Recent Activity

updated a Space about 2 hours ago

shukdevdatta123/DhakaMetroWeb

published a Space about 4 hours ago

shukdevdatta123/DhakaMetroWeb

updated a Space 1 day ago

shukdevdatta123/WaveTalk

View all activity

Organizations

None yet

shukdevdatta123's activity

posted an update about 2 months ago

Post

1254

Introducing: Multi Modal Omni Chatbot

🚀 Exciting News! 🤖💡
I just built a Multimodal Chatbot that can understand both Text and Images! 🌐🖼️

🎤 Features:
Text Input: Ask any question or provide text, and get smart answers! 🧠✨
Image Input: Upload an image and let the assistant help you with it! 🤳🧐
Customizable: Choose the level of reasoning effort you want! 🧩
Choose Your Model: Switch between image-focused and text-focused models! 🔄

🔑 How It Works:
Enter your OpenAI API Key 🔑
Type a Text Question or Upload an Image 📸
Get intelligent responses from the assistant 🗣️💬
Plus, you can clear your chat history whenever you want! 🧹✨
Perfect for all your multimodal needs, whether you're working with text or visuals! Try it out and explore how AI can be your new assistant! 💬🤖

Demo: https://shukdevdatta123-multi-modal-o1-chatbot.hf.space

#AI #OpenAI #Chatbot #MachineLearning #TechInnovation #Multimodal #Coding #Python #Gradio #AIForAll

1 reply

posted an update 3 months ago

Post

1750

Introducing Kokoro TTS Translate For All users:

shukdevdatta123/Kokoro-TTS

https://colab.research.google.com/drive/1DIpBzJSBBeTcpkyxkHcpngLumMapEWQz?usp=sharing

(colab link for GPU access)

Our Streamlit application provides a text-to-speech conversion tool using the Kokoro library, allowing users to input text, select language and voice, and adjust speech speed. The generated audio can be played or downloaded as a WAV file. Optionally, an OpenAI API key enables text translation to English, with subsequent speech generation for both the original and translated text. This functionality, along with helpful instructions and sample prompts, positions the application for various business opportunities. It can be offered as a SaaS platform with tiered subscriptions for access to features like diverse voices, languages, and translation. Target markets include content creators, language learning platforms, accessibility tools, and businesses needing automated voice responses. Further revenue streams can be generated through API integration with other applications, custom voice creation or cloning services, and affiliate marketing with related services.