ShaoRun committed (verified)
Commit 926b0d8 · 1 Parent(s): fc34cce

Update README.md

Files changed (1): README.md (+5 −1)
README.md CHANGED
@@ -1,8 +1,12 @@
+---
+base_model:
+- Qwen/Qwen2.5-0.5B-Instruct
+---
 # AllSparkv2: A Language-centric Progressive Omni-modal Learning Framework
 [Run Shao](https://scholar.google.com/citations?user=j3fct8MAAAAJ&hl=en&oi=ao), and [Haifeng Li](https://scholar.google.com/citations?user=51p_SJAAAAAJ&hl=en).
 **School of Geosciences and Info-physics, Central South University**
 
-<a href='https://yhycsu.github.io/AllSparkv2/'><img src='https://img.shields.io/badge/Project-Page-Green'></a> <a href='https://arxiv.org/abs/2310.09478.pdf'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a> <a href='https://yhycsu.github.io/AllSparkv2/'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue'></a> <a href='https://huggingface.co/ShaoRun/AllSparkv2-7B-V-P'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-blue'></a>
+<a href='https://yhycsu.github.io/AllSparkv2/'><img src='https://img.shields.io/badge/Project-Page-Green'></a> <a href='https://arxiv.org/abs/2310.09478.pdf'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a> <a href='https://github.com/GeoXLab/AllSparkv2'><img src='https://img.shields.io/badge/Github-Code-blue'></a>
 
 ## Introduction
 AllSparkv2 is a progressive multimodal learning framework that decouples cross-modal general knowledge from modality-specific knowledge at both the architecture and training strategy levels. Inspired by Piaget's Theory of Cognitive Development, AllSparkv2 introduces the Modal Mixture of Experts (M-MoE) architecture, where dedicated experts handle different modalities to decouple the parameter space, and new modality experts inherit cross-modal general knowledge by initializing from existing ones. In training, a hierarchical modality learning strategy is implemented, starting with vision as the initial modality, followed by point clouds as the successive modality. AllSparkv2 undergoes full-parameter training on vision for powerful cross-modal general knowledge, while only modality-specific experts are trained for point clouds, preserving existing knowledge. Experimental results demonstrate that AllSparkv2 can progressively integrate new modalities while preventing catastrophic forgetting and enhancing cross-modal performance.
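
For context on the metadata change: the four `+` lines at the top of the hunk add YAML front matter, which the Hugging Face Hub parses as model card metadata rather than rendering as README prose; the `base_model` field records the checkpoint this model builds on and links the two model pages on the Hub. After this commit, README.md opens with:

```yaml
---
base_model:
- Qwen/Qwen2.5-0.5B-Instruct
---
```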
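
The M-MoE mechanics described in the introduction (per-modality experts that decouple the parameter space, a new modality's expert initialized from an existing one, and only that new expert trained) can be made concrete with a toy example. This is a minimal illustrative sketch, not AllSparkv2's released code: the class, method names, and expert shapes below are hypothetical, and the actual routing and training setup may differ.

```python
import copy
import torch
import torch.nn as nn

class MMoELayer(nn.Module):
    """Toy Modal Mixture-of-Experts (M-MoE) layer: one FFN expert per modality.

    Hypothetical sketch of the idea described in the introduction, not the
    paper's implementation. Tokens are hard-routed by a modality tag, so each
    modality's parameters stay decoupled.
    """

    def __init__(self, d_model: int = 512):
        super().__init__()
        self.d_model = d_model
        self.experts = nn.ModuleDict()

    def add_modality(self, name: str, init_from: str | None = None) -> None:
        if init_from is not None:
            # A new modality expert inherits cross-modal general knowledge
            # by copying the weights of an existing expert (e.g. vision).
            self.experts[name] = copy.deepcopy(self.experts[init_from])
        else:
            d = self.d_model
            self.experts[name] = nn.Sequential(
                nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d)
            )

    def forward(self, x: torch.Tensor, modality: str) -> torch.Tensor:
        # Hard routing: all tokens in this batch carry the same modality tag.
        return self.experts[modality](x)

# Hierarchical modality learning, step by step:
layer = MMoELayer()
layer.add_modality("vision")  # 1) vision as the initial modality
# ... full-parameter training on vision data would happen here ...

layer.add_modality("point_cloud", init_from="vision")  # 2) successive modality
for name, param in layer.named_parameters():
    # Train only the point-cloud expert; everything else stays frozen,
    # preserving previously acquired knowledge.
    param.requires_grad = name.startswith("experts.point_cloud")

x = torch.randn(2, 16, 512)             # (batch, tokens, d_model)
out = layer(x, modality="point_cloud")  # routed to the point-cloud expert
```

Hard routing by modality tag keeps each modality's parameters disjoint, so freezing everything except the new expert directly targets the catastrophic forgetting the introduction mentions, while the copied initialization is what lets the point-cloud expert start from the vision expert's cross-modal knowledge.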