Add model card for VGGT

This PR adds a model card to the repository. The model card includes the relevant pipeline tag, license, and links to the paper and code. It also includes a brief overview of the model.

README.md +9 -3

@@ -2,8 +2,14 @@
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
+library_name: pytorch
+pipeline_tag: image-to-3d
+license: mit
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Library: [More Information Needed]
-- Docs: [More Information Needed]
+Visual Geometry Grounded Transformer (VGGT, CVPR 2025) is a feed-forward neural network that directly infers all key 3D attributes of a scene, including extrinsic and intrinsic camera parameters, point maps, depth maps, and 3D point tracks, **from one, a few, or hundreds of its views, within seconds**.
+Paper: [VGGT: Visual Geometry Grounded Transformer](https://huggingface.co/papers/2503.11651)
+Code: https://github.com/facebookresearch/vggt
+Project Page: https://vgg-t.github.io/
+Demo: https://huggingface.co/spaces/facebook/vggt