George-API committed on
Commit 1d4c4c4 · verified · 1 Parent(s): 24ba360

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +57 -0
  2. update_space.py +13 -12
README.md CHANGED
@@ -1,3 +1,60 @@
+---
+title: Phi-4 Unsloth Training
+emoji: 🧠
+colorFrom: blue
+colorTo: purple
+sdk: gradio
+sdk_version: 5.17.0
+app_file: app.py
+pinned: false
+license: mit
+---
+
+# Phi-4 Unsloth Optimized Training
+
+This space is dedicated to training Microsoft's Phi-4 model using Unsloth optimizations for enhanced performance and efficiency. The training process utilizes 4-bit quantization and advanced memory optimizations.
+
+## Features
+
+- 4-bit quantization using Unsloth
+- Optimized training pipeline
+- Cognitive dataset integration
+- Advanced memory management
+- Gradient checkpointing
+- Sequential data processing
+
+## Configuration Files
+
+- `transformers_config.json`: Model and training parameters
+- `hardware_config.json`: Hardware-specific optimizations
+- `dataset_config.json`: Dataset processing settings
+- `requirements.txt`: Required dependencies
+
+## Training Process
+
+The training utilizes the following optimizations:
+- Unsloth's 4-bit quantization
+- Custom chat templates for Phi-4
+- Paper-order preservation
+- Efficient memory usage
+- Gradient accumulation
+
+## Dataset
+
+Training uses the cognitive dataset with:
+- Maintained paper order
+- Proper metadata handling
+- Optimized sequence length
+- Efficient batching
+
+## Hardware Requirements
+
+- GPU: A10G or better
+- VRAM: 24GB minimum
+- RAM: 32GB recommended
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
 # Phase 1: Domain Adaptation (Unsupervised)
 
 This directory contains the code and configuration for domain adaptation of the phi-4-unsloth-bnb-4bit model to the cognitive science domain. This phase produces our domain-adapted model: [George-API/phi-4-research-assistant](https://huggingface.co/George-API/phi-4-research-assistant).
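
The block added at the top of README.md is the Space's YAML front matter, which carries settings such as the SDK, its version, and the entry file. As a rough illustration of what those fields hold, the sketch below extracts them with the standard library only. It assumes flat `key: value` pairs as in this commit; `parse_front_matter` is a hypothetical helper, not part of the repository, and a real project would more likely use a YAML library.

```python
def parse_front_matter(text: str) -> dict[str, str]:
    """Parse flat `key: value` front matter delimited by `---` lines.

    Hypothetical helper for illustration; it handles only the simple,
    unnested fields seen in this commit, not general YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # opening delimiter without a closing one


# Front matter copied from the README.md hunk above.
README_HEAD = """---
title: Phi-4 Unsloth Training
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.17.0
app_file: app.py
pinned: false
license: mit
---

# Phi-4 Unsloth Optimized Training
"""

config = parse_front_matter(README_HEAD)
print(config["sdk"], config["sdk_version"], config["app_file"])
```

All values come back as strings in this sketch, so `pinned` is the text `false` rather than a boolean.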
update_space.py CHANGED
@@ -26,6 +26,18 @@ logger = logging.getLogger(__name__)
 
 def load_env_variables():
     """Load environment variables from system or .env file."""
+    # First try to load from local .env file
+    try:
+        from dotenv import load_dotenv
+        env_path = Path(__file__).parent / ".env"
+        if env_path.exists():
+            load_dotenv(env_path)
+            logger.info(f"Loaded environment variables from {env_path}")
+        else:
+            logger.warning(f"No .env file found at {env_path}")
+    except ImportError:
+        logger.warning("python-dotenv not installed, skipping .env loading")
+
     # Check if we're running in a Hugging Face Space
     if os.environ.get("SPACE_ID"):
         logger.info("Running in Hugging Face Space")
@@ -33,23 +45,12 @@ def load_env_variables():
         username = os.environ.get("SPACE_ID").split("/")[0]
         os.environ["HF_USERNAME"] = username
         logger.info(f"Set HF_USERNAME from SPACE_ID: {username}")
-    else:
-        try:
-            from dotenv import load_dotenv
-            env_path = Path(__file__).parent.parent / ".env"
-            if env_path.exists():
-                load_dotenv(env_path)
-                logger.info(f"Loaded environment variables from {env_path}")
-            else:
-                logger.warning(f"No .env file found at {env_path}")
-        except ImportError:
-            logger.warning("python-dotenv not installed, skipping .env loading")
 
     # Verify required variables
     required_vars = {
         "HF_TOKEN": os.environ.get("HF_TOKEN"),
         "HF_USERNAME": os.environ.get("HF_USERNAME"),
-        "HF_SPACE_NAME": os.environ.get("HF_SPACE_NAME", "phi4-cognitive-training")
+        "HF_SPACE_NAME": os.environ.get("HF_SPACE_NAME", "phi4training")
     }
 
     missing_vars = [k for k, v in required_vars.items() if not v]
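
After this change the function always attempts the `.env` load first, then lets `SPACE_ID` set `HF_USERNAME`, and finally checks the required variables. The sketch below mirrors the last two steps on a plain mapping so the behavior can be tried without touching `os.environ`. `resolve_settings` and the sample `SPACE_ID` value are hypothetical, introduced only for illustration; the default `"phi4training"` is taken from the diff.

```python
from typing import Mapping


def resolve_settings(env: Mapping[str, str]) -> tuple[dict, list]:
    """Mirror the post-.env steps of load_env_variables on a plain mapping.

    Hypothetical helper for illustration; not part of update_space.py.
    """
    settings = dict(env)

    # As in the diff, the part of SPACE_ID before "/" becomes HF_USERNAME.
    space_id = settings.get("SPACE_ID")
    if space_id:
        settings["HF_USERNAME"] = space_id.split("/")[0]

    required = {
        "HF_TOKEN": settings.get("HF_TOKEN"),
        "HF_USERNAME": settings.get("HF_USERNAME"),
        "HF_SPACE_NAME": settings.get("HF_SPACE_NAME", "phi4training"),
    }
    # Any unset or empty value counts as missing.
    missing = [name for name, value in required.items() if not value]
    return required, missing


# Sample value is an assumption; only SPACE_ID is set here.
required, missing = resolve_settings({"SPACE_ID": "George-API/phi4training"})
print(missing)  # → ['HF_TOKEN']
```

Because `HF_SPACE_NAME` has a default, it is reported as missing only when it is explicitly set to an empty string. Note also that python-dotenv's `load_dotenv` does not override variables that are already set unless `override=True` is passed, so values supplied by the Space environment should still take precedence over a local `.env` file.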