Shukdev Datta

shukdevdatta123

AI & ML interests

None yet

Recent Activity

updated a Space about 2 hours ago
shukdevdatta123/DhakaMetroWeb
published a Space about 4 hours ago
shukdevdatta123/DhakaMetroWeb
updated a Space 1 day ago
shukdevdatta123/WaveTalk
View all activity

Organizations

None yet

shukdevdatta123's activity

posted an update about 2 months ago
view post
Post
1254
Introducing: Multi Modal Omni Chatbot

๐Ÿš€ Exciting News! ๐Ÿค–๐Ÿ’ก
I just built a Multimodal Chatbot that can understand both Text and Images! ๐ŸŒ๐Ÿ–ผ๏ธ

๐ŸŽค Features:
Text Input: Ask any question or provide text, and get smart answers! ๐Ÿง โœจ
Image Input: Upload an image and let the assistant help you with it! ๐Ÿคณ๐Ÿง
Customizable: Choose the level of reasoning effort you want! ๐Ÿงฉ
Choose Your Model: Switch between image-focused and text-focused models! ๐Ÿ”„

๐Ÿ”‘ How It Works:
Enter your OpenAI API Key ๐Ÿ”‘
Type a Text Question or Upload an Image ๐Ÿ“ธ
Get intelligent responses from the assistant ๐Ÿ—ฃ๏ธ๐Ÿ’ฌ
Plus, you can clear your chat history whenever you want! ๐Ÿงนโœจ
Perfect for all your multimodal needs, whether you're working with text or visuals! Try it out and explore how AI can be your new assistant! ๐Ÿ’ฌ๐Ÿค–

Demo: https://shukdevdatta123-multi-modal-o1-chatbot.hf.space

#AI #OpenAI #Chatbot #MachineLearning #TechInnovation #Multimodal #Coding #Python #Gradio #AIForAll
  • 1 reply
ยท
posted an update 3 months ago
view post
Post
1750
Introducing Kokoro TTS Translate For All users:

shukdevdatta123/Kokoro-TTS

https://colab.research.google.com/drive/1DIpBzJSBBeTcpkyxkHcpngLumMapEWQz?usp=sharing

(colab link for GPU access)

Our Streamlit application provides a text-to-speech conversion tool using the Kokoro library, allowing users to input text, select language and voice, and adjust speech speed. The generated audio can be played or downloaded as a WAV file. Optionally, an OpenAI API key enables text translation to English, with subsequent speech generation for both the original and translated text. This functionality, along with helpful instructions and sample prompts, positions the application for various business opportunities. It can be offered as a SaaS platform with tiered subscriptions for access to features like diverse voices, languages, and translation. Target markets include content creators, language learning platforms, accessibility tools, and businesses needing automated voice responses. Further revenue streams can be generated through API integration with other applications, custom voice creation or cloning services, and affiliate marketing with related services.