---
license: bsd-3-clause
base_model:
- Qwen/Qwen2.5-7B-Instruct
---
# AllSparkv2: A Language-centric Progressive Omni-modal Learning Framework

[Run Shao](https://scholar.google.com/citations?user=j3fct8MAAAAJ&hl=en&oi=ao) and [Haifeng Li](https://scholar.google.com/citations?user=51p_SJAAAAAJ&hl=en)

**School of Geosciences and Info-physics, Central South University**

<a href='https://yhycsu.github.io/AllSparkv2/'><img src='https://img.shields.io/badge/Project-Page-Green'></a> <a href='https://arxiv.org/abs/2310.09478.pdf'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a> <a href='https://github.com/GeoXLab/AllSparkv2'><img src='https://img.shields.io/badge/Github-Code-blue'></a>

## Introduction

AllSparkv2 is a progressive multimodal learning framework that decouples cross-modal general knowledge from modality-specific knowledge at both the architecture and the training-strategy level. Inspired by Piaget's Theory of Cognitive Development, AllSparkv2 introduces the Modal Mixture of Experts (M-MoE) architecture: dedicated experts handle different modalities to decouple the parameter space, and a new modality's experts inherit cross-modal general knowledge by initializing from existing ones. Training follows a hierarchical modality learning strategy, with vision as the initial modality and point clouds as the successive modality: AllSparkv2 undergoes full-parameter training on vision to build strong cross-modal general knowledge, while for point clouds only the modality-specific experts are trained, preserving existing knowledge. Experimental results demonstrate that AllSparkv2 can progressively integrate new modalities while preventing catastrophic forgetting and enhancing cross-modal performance.
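
To make the M-MoE idea concrete, here is a minimal sketch (not the released implementation; the layer sizes, expert names, and routing rule are all assumptions) of modality-routed feed-forward experts, a new expert initialized from an existing one, and the hierarchical strategy of training only the new modality's expert:

```python
# Minimal M-MoE-style sketch in PyTorch. Everything here (dimensions, expert
# names, routing by a modality tag) is an illustrative assumption, not the
# actual AllSparkv2 code.
import copy
import torch
import torch.nn as nn

class ModalityExperts(nn.Module):
    """One feed-forward expert per modality; tokens are routed by a modality tag."""
    def __init__(self, d_model=64, d_ff=256, modalities=("text", "vision")):
        super().__init__()
        self.experts = nn.ModuleDict({
            m: nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for m in modalities
        })

    def add_modality(self, new: str, init_from: str):
        # A successive modality's expert inherits cross-modal general knowledge
        # by copying the weights of an existing expert before its own training.
        self.experts[new] = copy.deepcopy(self.experts[init_from])

    def forward(self, x: torch.Tensor, modality: str) -> torch.Tensor:
        return self.experts[modality](x)

layer = ModalityExperts()
layer.add_modality("point_cloud", init_from="vision")

# Hierarchical modality learning: train only the new expert and freeze the rest,
# so the knowledge held by the text/vision experts is preserved.
for name, param in layer.named_parameters():
    param.requires_grad = ".point_cloud." in name

out = layer(torch.randn(2, 8, 64), modality="point_cloud")
print(out.shape)  # torch.Size([2, 8, 64])
```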

## Note

We provide this model in four sizes: 0.5B, 1.5B, 3B, and 7B. You can find them at the following links; a minimal loading sketch follows the list:

- 0.5B model: [[Link](https://huggingface.co/ShaoRun/AllSparkv2-0.5B-V-P)]
- 1.5B model: [[Link](https://huggingface.co/ShaoRun/AllSparkv2-1.5B-V-P)]
- 3B model: [[Link](https://huggingface.co/ShaoRun/AllSparkv2-3B-V-P)]
- 7B model: [[Link](https://huggingface.co/ShaoRun/AllSparkv2-7B-V-P)]
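
The snippet below is a hedged loading sketch, not official usage: it assumes the checkpoints work with the standard `transformers` Auto classes and ship any custom code they need via `trust_remote_code`; consult the repositories above for the intended way to run the model.

```python
# Hedged sketch: assumes the checkpoint loads through the standard
# transformers Auto* classes; the repo may instead require its own code.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ShaoRun/AllSparkv2-7B-V-P"  # any of the four sizes listed above

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # spread across available GPUs, or fall back to CPU
    trust_remote_code=True,
)

inputs = tokenizer("Describe the scene.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```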

If you use AllSparkv2 in your research or applications, please cite it with the following BibTeX:

```bibtex

```

## License

This repository is released under the [BSD 3-Clause License](LICENSE.md).