SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
Paper: 2506.01844
pip install -U "huggingface_hub[hf_xet]"
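from huggingface_hub import InferenceClient

# Route requests through the fal-ai provider and bill usage to the named organization.
client = InferenceClient(provider="fal-ai", bill_to="my-cool-company")

# text_to_image returns a PIL.Image object.
image = client.text_to_image(
    "A majestic lion in a fantasy forest",
    model="black-forest-labs/FLUX.1-schnell",
)
image.save("lion.png")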
(Simplify option for reduced poly count)