---
library_name: transformers
tags: []
---

# ***Mol-MoE***: Performant & Steerable Multi-Objective RLHF in Drug Design

*Diego Calanzone (1, 2), Pierluca D'Oro (2), Pierre-Luc Bacon (1, 2)*
*(1) Universite de Montreal, (2) Mila Quebec AI Institute*
## How to use this model

This LM is fine-tuned to generate molecules in the SMILES format with respect to desired properties. For unconditioned SMILES generation, prompt the model with the tokenizer's BOS token (`tokenizer.bos_token`).
For conditioned generation, please refer to the paper and the official codebase to derive different conditioned models.

This model is the result of merging 5 fine-tuned versions (`JNK3, DRD2, GSK3B, CYP2D6, CYP2C19`) with equal interpolation weights: *w_i = 0.2*.

An example of the generation pipeline:

```
import re

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Setup
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("ddidacus/RS-mol-llama-1b")
model = AutoModelForCausalLM.from_pretrained("ddidacus/RS-mol-llama-1b").to(device)
generation_kwargs = {
    "max_new_tokens": 128,
    "min_length": -1,
    "top_k": 0,  # 0 disables top-k filtering
    "top_p": 0.9,
    "do_sample": True,
    "pad_token_id": tokenizer.eos_token_id,
    "temperature": 1.0,
}

# Inference
query = tokenizer.bos_token  # unconditioned generation starts from the BOS token
toks = tokenizer([query], return_tensors="pt")["input_ids"].to(device)
output = model.generate(toks, **generation_kwargs)
output = tokenizer.batch_decode(output)

# Parsing: keep the text between the BOS and EOS markers
# (assumes the tokenizer's BOS/EOS strings delimit each generated molecule)
bos, eos = re.escape(tokenizer.bos_token), re.escape(tokenizer.eos_token)
pattern = rf"(?:{bos})+(.*?){eos}"
molecule = re.findall(pattern, output[0], re.DOTALL)
```

### Model Description

This model is a fine-tuned version of LLaMA 3.2 1B, trained in two stages:
1. Fine-tuning on ~3.5M molecules extracted from ZINC 250K, MOSES, and ChEMBL
2. RLHF-tuning using RLOO on 5 distinct reward functions from PyTDC [1]

- **Developed by:** Diego Calanzone (diego.calanzone@mila.quebec)
- **Model type:** Decoder-only Transformer
- **Finetuned from model:** LLaMA 3.2 1B

Read the paper for further details.

### Sources

[1] https://tdcommons.ai/single_pred_tasks/overview
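
### Weight interpolation sketch

The merge described under "How to use this model" (five fine-tuned versions, equal weight *w_i = 0.2*) can be illustrated with a minimal sketch of linear parameter interpolation. This is not the official merging code: the function name `interpolate_state_dicts` and the toy parameter dictionaries are hypothetical, and plain floats stand in for the tensors a real `state_dict()` would hold.

```python
from typing import Dict, Sequence


def interpolate_state_dicts(
    state_dicts: Sequence[Dict[str, float]],
    weights: Sequence[float],
) -> Dict[str, float]:
    """Return the weighted average of several parameter dictionaries.

    Values only need to support `+` and scalar `*`, so the same logic
    would apply to tensors; floats are used here to keep the sketch small.
    """
    if len(state_dicts) != len(weights):
        raise ValueError("need exactly one weight per state dict")
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("interpolation weights should sum to 1")
    names = state_dicts[0].keys()
    if any(sd.keys() != names for sd in state_dicts):
        raise ValueError("all state dicts must share the same parameter names")

    merged = {}
    for name in names:
        # Accumulate w_i * theta_i over all fine-tuned versions.
        total = weights[0] * state_dicts[0][name]
        for w, sd in zip(weights[1:], state_dicts[1:]):
            total = total + w * sd[name]
        merged[name] = total
    return merged


# Toy stand-ins for five fine-tuned checkpoints (hypothetical values).
toy_checkpoints = [
    {"layer.weight": float(i), "layer.bias": 10.0 * i} for i in range(1, 6)
]
weights = [0.2] * 5  # equal interpolation weights, as stated above
merged = interpolate_state_dicts(toy_checkpoints, weights)
```

With equal weights, each merged parameter is simply the mean of the corresponding parameters across the five versions; unequal weights that still sum to 1 would shift the merged model toward some objectives over others.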