# Drawing with LLM 🎨
A Streamlit application that converts text descriptions into SVG graphics using multiple AI models.
## Overview
This project allows users to create vector graphics (SVG) from text descriptions using three different approaches:
- ML Model - Uses Stable Diffusion to generate images and vtracer to convert them to SVG
- DL Model - Uses Stable Diffusion for initial image creation and StarVector for direct image-to-SVG conversion
- Naive Model - Uses Phi-4 LLM to directly generate SVG code from text descriptions
## Features
- Text-to-SVG generation with three different model approaches
- Adjustable parameters for each model type
- Real-time SVG preview and code display
- SVG download functionality
- GPU acceleration for faster generation
## Requirements
- Python 3.11+
- CUDA-compatible GPU (recommended)
- Dependencies listed in `requirements.txt`
## Installation

### Using Miniconda (Recommended)
```bash
# Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p $HOME/miniconda
echo 'export PATH="$HOME/miniconda/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

# Create and activate environment
conda create -n svg-app python=3.11 -y
conda activate svg-app

# Install star-vector
cd star-vector
pip install -e .
cd ..

# Install other dependencies
pip install -r requirements.txt
```
### Using Docker
```bash
# Build and run with Docker Compose
docker-compose up -d
```
## Usage

Start the Streamlit application:

```bash
streamlit run app.py
```
Or pipe `yes` into it to automatically accept any startup prompts:

```bash
yes | streamlit run app.py
```
The application will be available at `http://localhost:8501`.
## Models

### ML Model (vtracer)
Uses Stable Diffusion to generate an image from the text prompt, then applies vtracer to convert the raster image to SVG.
Configurable parameters:
- Simplify SVG
- Color Precision
- Filter Speckle
- Path Precision
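A hedged sketch of how these sliders could map onto vtracer's Python API (`convert_image_to_svg_py`); the wrapper and its defaults are illustrative, not the app's exact code, and the "Simplify SVG" toggle (likely vtracer's curve-fitting mode) is omitted here:

```python
def raster_to_svg(raster_path: str, svg_path: str,
                  color_precision: int = 6,
                  filter_speckle: int = 4,
                  path_precision: int = 8) -> None:
    """Trace a raster image (e.g. Stable Diffusion output) into an SVG file."""
    import vtracer  # installed via requirements.txt

    vtracer.convert_image_to_svg_py(
        raster_path,
        svg_path,
        colormode="color",
        color_precision=color_precision,  # color depth kept during clustering
        filter_speckle=filter_speckle,    # discard patches smaller than this (px)
        path_precision=path_precision,    # decimal digits in SVG path coordinates
    )
```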
### DL Model (starvector)
Uses Stable Diffusion for initial image creation followed by StarVector, a specialized model designed to convert images directly to SVG.
### Naive Model (phi-4)
Directly generates SVG code using the Phi-4 language model with specialized prompting.
Configurable parameters:
- Max New Tokens
## Evaluation Data and Results

### Data
The `data` directory contains synthetic evaluation data created using custom scripts:
- The first 15 examples are from the Kaggle competition "Drawing with LLM"
- `descriptions.csv` - Text descriptions for generating SVGs
- `eval.csv` - Evaluation metrics
- `gen_descriptions.py` - Script for generating synthetic descriptions
- `gen_vqa.py` - Script for generating visual question answering data
- Sample images (`gray_coat.png`, `purple_forest.png`) for reference
### Results
The `results` directory contains evaluation results comparing the models:
- Evaluation results for both the Naive (Phi-4) and ML (vtracer) models
- The DL model (StarVector) was not evaluated, as it typically fails to transform natural images, often returning blank SVGs
- Performance visualizations:
  - `category_radar.png` - Performance comparison across categories
  - `complexity_performance.png` - Performance relative to prompt complexity
  - `quality_vs_time.png` - Quality-time tradeoff analysis
  - `generation_time.png` - Comparison of generation times
  - `model_comparison.png` - Overall model performance comparison
- Generated SVGs and PNGs in respective subdirectories
- Detailed results in JSON and CSV formats
## Project Structure
```
drawing-with-llm/               # Root directory
│
├── app.py                      # Main Streamlit application
├── requirements.txt            # Python dependencies
├── Dockerfile                  # Docker container definition
├── docker-compose.yml          # Docker Compose configuration
│
├── ml.py                       # ML model implementation (vtracer approach)
├── dl.py                       # DL model implementation (StarVector approach)
├── naive.py                    # Naive model implementation (Phi-4 approach)
├── gen_image.py                # Common image generation using Stable Diffusion
│
├── eval.py                     # Evaluation script for model comparison
├── eval_analysis.py            # Analysis script for evaluation results
├── metric.py                   # Metrics implementation for evaluation
│
├── data/                       # Evaluation data directory
│   ├── descriptions.csv        # Text descriptions for evaluation
│   ├── eval.csv                # Evaluation metrics
│   ├── gen_descriptions.py     # Script for generating synthetic descriptions
│   ├── gen_vqa.py              # Script for generating VQA data
│   ├── gray_coat.png           # Sample image by GPT-4o
│   └── purple_forest.png       # Sample image by GPT-4o
│
├── results/                    # Evaluation results directory
│   ├── category_radar.png          # Performance comparison across categories
│   ├── complexity_performance.png  # Performance by prompt complexity
│   ├── quality_vs_time.png         # Quality-time tradeoff analysis
│   ├── generation_time.png         # Comparison of generation times
│   ├── model_comparison.png        # Overall model performance comparison
│   ├── summary_*.csv               # Summary metrics in CSV format
│   ├── results_*.json              # Detailed results in JSON format
│   ├── svg/                        # Generated SVG outputs
│   └── png/                        # Generated PNG outputs
│
└── star-vector/                # StarVector dependency (installed locally)
    └── starvector/             # StarVector Python package
```
## License
[Specify your license information here]
## Acknowledgments
This project utilizes several key technologies:
- Stable Diffusion for image generation
- StarVector for image-to-SVG conversion
- vtracer for raster-to-vector conversion
- Phi-4 for text-to-SVG generation
- Streamlit for the web interface