---
tags:
- image-captioning
- deep-learning
- pytorch
- encoder-decoder
- vision
---

# 🖼️ Image Captioning Model

This is an **image captioning model** built on a **CNN Encoder + LSTM Decoder** architecture. A Convolutional Neural Network (CNN) extracts visual features from the input image, and the LSTM decoder generates a caption from those features.
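
To make the encoder-decoder idea concrete, here is a minimal PyTorch sketch. It is not the released model: the tiny convolutional stack, the layer sizes, and the class names `EncoderCNN` and `DecoderLSTM` are all assumptions chosen for illustration (a real encoder would more likely be a pretrained backbone). It only shows how an image feature vector can be fed to an LSTM as the first step of the caption sequence.

```python
import torch
import torch.nn as nn


class EncoderCNN(nn.Module):
    """Illustrative CNN that maps an image to one feature vector (assumed design)."""

    def __init__(self, embed_size: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool -> (B, 32, 1, 1)
        )
        self.fc = nn.Linear(32, embed_size)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        features = self.conv(images).flatten(1)  # (B, 32)
        return self.fc(features)                 # (B, embed_size)


class DecoderLSTM(nn.Module):
    """Illustrative LSTM decoder trained with teacher forcing (assumed design)."""

    def __init__(self, embed_size: int, hidden_size: int, vocab_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features: torch.Tensor, captions: torch.Tensor) -> torch.Tensor:
        # The image feature is the first input step, followed by the
        # embedded caption tokens (all but the last one).
        embeddings = self.embed(captions[:, :-1])                       # (B, T-1, E)
        inputs = torch.cat([features.unsqueeze(1), embeddings], dim=1)  # (B, T, E)
        hidden, _ = self.lstm(inputs)                                   # (B, T, H)
        return self.fc(hidden)                                          # (B, T, vocab)


encoder = EncoderCNN(embed_size=64)
decoder = DecoderLSTM(embed_size=64, hidden_size=128, vocab_size=100)

images = torch.randn(2, 3, 64, 64)        # dummy batch of 2 RGB images
captions = torch.randint(0, 100, (2, 7))  # dummy token ids, 7 tokens each
logits = decoder(encoder(images), captions)
print(logits.shape)  # torch.Size([2, 7, 100])
```

Each position in `logits` holds a score for every vocabulary word, so training would typically compare it against the caption tokens with a cross-entropy loss.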

## Model Details

- **Model Type**: Image Captioning
- **Architecture**: CNN Encoder + LSTM Decoder
- **Framework**: PyTorch
- **Input**: Image (`.jpg`, `.png`, etc.)
- **Output**: Generated caption (text)
- **Vocabulary**: Pre-trained vocabulary file
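
The role of the vocabulary can be sketched in plain Python: the decoder emits token ids, and the vocabulary maps them back to words. The `Vocabulary` class below, its special tokens (`<pad>`, `<start>`, `<end>`, `<unk>`), and its method names are hypothetical. The actual vocabulary file shipped with this model may be structured differently.

```python
class Vocabulary:
    """Hypothetical word <-> index mapping; the real vocabulary file may differ."""

    def __init__(self, words):
        self.specials = ["<pad>", "<start>", "<end>", "<unk>"]
        self.idx2word = self.specials + sorted(set(words))
        self.word2idx = {w: i for i, w in enumerate(self.idx2word)}

    def encode(self, sentence):
        """Turn a sentence into token ids, wrapped in <start> ... <end>."""
        unk = self.word2idx["<unk>"]
        tokens = ["<start>"] + sentence.lower().split() + ["<end>"]
        return [self.word2idx.get(t, unk) for t in tokens]

    def decode(self, ids):
        """Turn token ids back into text, stopping at the first <end>."""
        words = []
        for i in ids:
            word = self.idx2word[i]
            if word == "<end>":
                break
            if word not in ("<start>", "<pad>"):
                words.append(word)
        return " ".join(words)


vocab = Vocabulary(["a", "dog", "runs", "on", "grass"])
ids = vocab.encode("A dog runs on sand")  # "sand" is out of vocabulary
caption = vocab.decode(ids)
print(caption)  # a dog runs on <unk>
```

Words missing from the vocabulary fall back to `<unk>`, which is why captions from a small vocabulary can contain that token.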

## How to Use

### **1️⃣ Install Dependencies**

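```bash
# pickle is part of the Python standard library; the pickle5 backport
# only targets Python 3.5-3.7, so it is not installed here.
pip install torch torchvision transformers huggingface_hub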