sureal01's picture
Create README.md
ff6048b verified
metadata
tags:
  - image-captioning
  - deep-learning
  - pytorch
  - encoder-decoder
  - vision

πŸ–ΌοΈ Image Captioning Model

This is a deep learning-based image captioning model trained using a CNN Encoder + LSTM Decoder architecture. The model generates captions for input images based on visual features extracted by a Convolutional Neural Network (CNN).

πŸ“Œ Model Details

  • Model Type: Image Captioning
  • Architecture: CNN Encoder + LSTM Decoder
  • Framework: PyTorch
  • Input: Image (.jpg, .png, etc.)
  • Output: Generated caption (text)
  • Vocabulary: Pre-trained vocabulary file

πŸš€ How to Use

1️⃣ Install Dependencies

pip install torch torchvision transformers huggingface_hub pickle5