---
pipeline_tag: image-to-image
library_name: diffusers
license: mit
---
# F-ViTA: Foundation Model Guided Visible to Thermal Translation

This repository contains the model described in the paper [F-ViTA: Foundation Model Guided Visible to Thermal Translation](https://huggingface.co/papers/2504.02801).

F-ViTA leverages foundation models (SAM and Grounded DINO) to guide the visible-to-thermal image translation process using an InstructPix2Pix diffusion model. This approach improves translation accuracy and generalizes well to out-of-distribution scenarios.

Code: https://github.com/jay-jnp/F-ViTA

Pre-trained checkpoints are available for several datasets:

* **KAIST:** [huggingface.co/jay-jnp/F-ViTA\_KAIST](https://huggingface.co/jay-jnp/F-ViTA_KAIST)
* **FLIR:** [huggingface.co/jay-jnp/F-VITA\_FLIR](https://huggingface.co/jay-jnp/F-VITA_FLIR)
* **NIRSCENE:** [huggingface.co/jay-jnp/F-VITA\_NIRSCENE](https://huggingface.co/jay-jnp/F-VITA_NIRSCENE)
* **OSU:** [huggingface.co/jay-jnp/F-VITA\_OSU](https://huggingface.co/jay-jnp/F-VITA_OSU)
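
The checkpoints listed above could be fetched programmatically. The following is a minimal sketch, not the authors' documented workflow: it assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function. The helper names `checkpoint_repo` and `download_checkpoint` are hypothetical, and the repository IDs are copied from the links above (note that the KAIST repository is spelled `F-ViTA` while the others use `F-VITA`). How the downloaded files are loaded for inference is not covered here; see the code repository for that.

```python
from typing import Optional

# Repository IDs copied verbatim from the checkpoint links in this README.
CHECKPOINT_REPOS = {
    "KAIST": "jay-jnp/F-ViTA_KAIST",
    "FLIR": "jay-jnp/F-VITA_FLIR",
    "NIRSCENE": "jay-jnp/F-VITA_NIRSCENE",
    "OSU": "jay-jnp/F-VITA_OSU",
}


def checkpoint_repo(dataset: str) -> str:
    """Return the Hub repository ID for a dataset name (hypothetical helper)."""
    key = dataset.strip().upper()
    if key not in CHECKPOINT_REPOS:
        known = ", ".join(sorted(CHECKPOINT_REPOS))
        raise ValueError(f"Unknown dataset {dataset!r}; expected one of: {known}")
    return CHECKPOINT_REPOS[key]


def download_checkpoint(dataset: str, local_dir: Optional[str] = None) -> str:
    """Download every file in the checkpoint repository and return the local path.

    Assumption: the repository is public and huggingface_hub is installed.
    """
    # Imported lazily so the lookup helper works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=checkpoint_repo(dataset), local_dir=local_dir)


if __name__ == "__main__":
    # Requires network access; prints the directory holding the downloaded files.
    print(download_checkpoint("KAIST"))
```

Keeping the name-to-repository lookup separate from the download call means the mapping can be checked offline, and the network request only happens when `download_checkpoint` is invoked.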