derek-thomas/prompt-order-experiment

derek-thomas committed on Nov 28, 2024

Commit

b4f4f9d

1 Parent(s): ed0ad35

Adding autotrain

Files changed (1)

02-autotrain.ipynb +199 -0

02-autotrain.ipynb ADDED

	@@ -0,0 +1,199 @@

+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "21111b1f-7cce-4e8b-8337-8f0cdab5804e",
+   "metadata": {},
+   "source": [
+    "# AutoTrain"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dd09a9fd-4b90-48f3-b61c-d2349eb7f43e",
+   "metadata": {},
+   "source": [
+    "## Imports"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "52543575-f92e-4038-ad13-30967f47eb7a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "import subprocess\n",
+    "\n",
+    "import yaml"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "74987944-abfb-44f8-9331-ffbb2f7698bb",
+   "metadata": {},
+   "source": [
+    "## Config"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "97c25070-775a-4fb1-9694-4579250686a6",
+   "metadata": {},
+   "source": [
+    "### Template\n",
+    "I'm creating a template so we can iterate through each of our experiments.\n",
+    "\n",
+    "Here you can see a few design decisions:\n",
+    "- We leave `project_name` and `text_column` empty to overwrite later per experiment\n",
+    "- We log to TensorBoard. You can use wandb instead, but you will need to install it in the AutoTrain env that runs on Spaces, which gets complex\n",
+    "- I chose an `l4x1` from [these options](https://github.com/huggingface/autotrain-advanced/blob/2d787b2033414d06f1e9be2ea0caacad3097f5e8/src/autotrain/backends/base.py#L21)\n",
+    "    - This is a [well-priced](https://huggingface.co/pricing#spaces) way of training a 7B model\n",
+    "    - It's also very efficient, with 24GB of VRAM\n",
+    "- It's becoming less common to use a `valid_split`, so it is left as `None`\n",
+    "- I run 2 epochs because the loss still decreases steadily, though some say that for LoRAs you should do just 1\n",
+    "- It's a good idea to use `all-linear` when using LoRA"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "dc2a8514-51c1-404b-8cfa-6637cc810668",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Base config\n",
+    "config_template = {\n",
+    "    \"task\": \"llm-sft\",\n",
+    "    \"base_model\": \"mistralai/Mistral-7B-Instruct-v0.3\",\n",
+    "    \"project_name\": \"\",\n",
+    "    \"log\": \"tensorboard\",\n",
+    "    \"backend\": \"spaces-l4x1\",\n",
+    "    \"data\": {\n",
+    "        \"path\": \"derek-thomas/labeled-multiple-choice-explained-mistral-tokenized\",\n",
+    "        \"train_split\": \"train\",\n",
+    "        \"valid_split\": None,\n",
+    "        \"chat_template\": \"none\",\n",
+    "        \"column_mapping\": {\n",
+    "            \"text_column\": \"\"\n",
+    "            },\n",
+    "        },\n",
+    "    \"params\": {\n",
+    "        \"block_size\": 1024,\n",
+    "        \"model_max_length\": 1024,\n",
+    "        \"epochs\": 2,\n",
+    "        \"batch_size\": 1,\n",
+    "        \"lr\": 3e-5,\n",
+    "        \"peft\": True,\n",
+    "        \"quantization\": \"int4\",\n",
+    "        \"target_modules\": \"all-linear\",\n",
+    "        \"padding\": \"left\",\n",
+    "        \"optimizer\": \"adamw_torch\",\n",
+    "        \"scheduler\": \"linear\",\n",
+    "        \"gradient_accumulation\": 8,\n",
+    "        \"mixed_precision\": \"bf16\",\n",
+    "        },\n",
+    "    \"hub\": {\n",
+    "        \"username\": \"derek-thomas\",\n",
+    "        \"token\": os.getenv('HF_TOKEN'),\n",
+    "        \"push_to_hub\": True,\n",
+    "        },\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "22eb3d3a-0ab0-4f79-98c2-513a34ce1b6d",
+   "metadata": {},
+   "source": [
+    "### Experiments\n",
+    "Here we choose the `project_name` and `text_column` for each experiment."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "957eb2b7-feec-422f-ba46-b293d9a77c1b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "project_suffixes = [\"RFA-gpt3-5\", \"RFA-mistral\", \"FAR-gpt3-5\", \"FAR-mistral\", \"FA\"]\n",
+    "text_columns = [\"conversation_RFA_gpt3_5\", \"conversation_RFA_mistral\", \"conversation_FAR_gpt3_5\",\n",
+    "                \"conversation_FAR_mistral\", \"conversation_FA\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a5913085-83c9-4133-a90d-318fd13cc14e",
+   "metadata": {},
+   "source": [
+    "Directory to store generated configs"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b86702bf-f494-4951-863e-be5b8462fbd1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "output_dir = \"./autotrain_configs\"\n",
+    "os.makedirs(output_dir, exist_ok=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3053d1e1-ca40-460c-8999-0787a1751d00",
+   "metadata": {},
+   "source": [
+    "## AutoTrain for each Experiment"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "025ccd2f-de54-4ac2-9f36-f606876dcd3c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import copy\n",
+    "\n",
+    "# Generate configs and run commands\n",
+    "for project_suffix, text_column in zip(project_suffixes, text_columns):\n",
+    "    # Deep-copy the template so its nested dicts are not mutated between experiments\n",
+    "    config = copy.deepcopy(config_template)\n",
+    "    config[\"project_name\"] = f\"mistral-v03-poe-{project_suffix}\"\n",
+    "    config[\"data\"][\"column_mapping\"][\"text_column\"] = text_column\n",
+    "\n",
+    "    # Save the config to a YAML file\n",
+    "    config_path = os.path.join(output_dir, f\"{text_column}.yml\")\n",
+    "    with open(config_path, \"w\") as f:\n",
+    "        yaml.dump(config, f)\n",
+    "\n",
+    "    # Run the command\n",
+    "    print(f\"Running autotrain with config: {config_path}\")\n",
+    "    subprocess.run([\"autotrain\", \"--config\", config_path], check=True)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.10"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
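
When a template with nested dictionaries is reused across experiments, as the config template in this notebook is, it matters whether each experiment gets its own copy of the nested parts. The following is a minimal sketch of that point, not the notebook's code: `template` is a reduced, hypothetical stand-in for `config_template`, and `build_configs` is a helper name introduced only for illustration. It uses the standard library only and does not call AutoTrain.

```python
import copy

# Reduced, hypothetical stand-in for the notebook's config_template
template = {
    "project_name": "",
    "data": {"column_mapping": {"text_column": ""}},
}


def build_configs(template, suffixes, columns, prefix="mistral-v03-poe-"):
    """Return one independent config dict per (suffix, column) pair."""
    configs = []
    for suffix, column in zip(suffixes, columns):
        # deepcopy also copies the nested "data" and "column_mapping" dicts
        config = copy.deepcopy(template)
        config["project_name"] = f"{prefix}{suffix}"
        config["data"]["column_mapping"]["text_column"] = column
        configs.append(config)
    return configs


configs = build_configs(
    template,
    ["RFA-mistral", "FA"],
    ["conversation_RFA_mistral", "conversation_FA"],
)

# By contrast, a shallow copy shares the nested dicts with the template,
# so writing through it changes the template itself.
shallow = template.copy()
shallow["data"]["column_mapping"]["text_column"] = "leaked"

print(template["data"]["column_mapping"]["text_column"])
print(configs[0]["data"]["column_mapping"]["text_column"])
```

Each dict returned by `build_configs` keeps its own `text_column`, while the write through `shallow` shows up in `template`. If configs are dumped to disk immediately inside the loop, the shallow copy happens to produce correct files, but the template no longer holds its original empty values afterwards.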