---
license: cc-by-nc-4.0
---
<p align="center">
<img width="500px" alt="xLAM" src="https://huggingface.co/datasets/jianguozhang/logos/resolve/main/xlam-no-background.png">
</p>
<p align="center">
  <a href="https://www.salesforceairesearch.com/projects/xlam-large-action-models">[Homepage]</a>  |
  <a href="https://github.com/SalesforceAIResearch/xLAM">[Github]</a> |
  <a href="https://blog.salesforceairesearch.com/large-action-model-ai-agent/">[Blog]</a>
</p>
<hr>

## Model Summary

This repo provides GGUF-format files for the Llama-xLAM-2-8b-fc-r model. The original model is available at [Llama-xLAM-2-8b-fc-r](https://huggingface.co/Salesforce/Llama-xLAM-2-8b-fc-r).
[Large Action Models (LAMs)](https://blog.salesforceairesearch.com/large-action-models/) are advanced language models designed to enhance decision-making by translating user intentions into executable actions. As the **brains of AI agents**, LAMs autonomously plan and execute tasks to achieve specific goals, making them invaluable for automating workflows across diverse domains.  

## Model Overview
The new **xLAM-2** series, built on our most advanced data synthesis, processing, and training pipelines, marks a significant leap in **multi-turn reasoning** and **tool usage**. It achieves state-of-the-art performance on function-calling benchmarks like **BFCL** and **tau-bench**. We've also refined the **chat template** and **vLLM integration**, making it easier to build advanced AI agents. Compared to previous xLAM models, xLAM-2 offers superior performance and seamless deployment across applications.  
**This model release is for research purposes only.**  

## How to download GGUF files

1. **Install Hugging Face CLI:**

```
pip install huggingface-hub
```

2. **Login to Hugging Face:**
```
huggingface-cli login
```

3. **Download the GGUF model:**
```
huggingface-cli download Salesforce/Llama-xLAM-2-8b-fc-r-gguf Llama-xLAM-2-8b-fc-r-gguf --local-dir . --local-dir-use-symlinks False
```

## Prompt template
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{TASK_INSTRUCTION}
You have access to a set of tools. When using tools, make calls in a single JSON array: 

[{"name": "tool_call_name", "arguments": {"arg1": "value1", "arg2": "value2"}}, ... (additional parallel tool calls as needed)]

If no tool is suitable, state that explicitly. If the user's input lacks required parameters, ask for clarification. Do not interpret or respond until tool results are returned. Once they are available, process them or make additional calls if needed. For tasks that don't require tools, such as casual conversation or general advice, respond directly in plain text. The available tools are:

{AVAILABLE_TOOLS}

<|eot_id|><|start_header_id|>user<|end_header_id|>

{USER_QUERY}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{ASSISTANT_QUERY}<|eot_id|><|start_header_id|>user<|end_header_id|>

{USER_QUERY}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
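
The template above can also be filled in programmatically. The following is a minimal sketch, not the official chat template implementation: `build_prompt`, `header`, and `TOOL_INSTRUCTIONS` are hypothetical names, and serializing `{AVAILABLE_TOOLS}` with `json.dumps` is an assumption, since this card does not specify the exact serialization.

```python
import json

# Fixed instruction text copied from the prompt template above.
TOOL_INSTRUCTIONS = (
    "You have access to a set of tools. When using tools, make calls in a single JSON array: \n\n"
    '[{"name": "tool_call_name", "arguments": {"arg1": "value1", "arg2": "value2"}}, '
    "... (additional parallel tool calls as needed)]\n\n"
    "If no tool is suitable, state that explicitly. "
    "If the user's input lacks required parameters, ask for clarification. "
    "Do not interpret or respond until tool results are returned. "
    "Once they are available, process them or make additional calls if needed. "
    "For tasks that don't require tools, such as casual conversation or general advice, "
    "respond directly in plain text. The available tools are:"
)


def header(role):
    """Return the header that opens a turn for the given role."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n"


def build_prompt(task_instruction, tools, turns):
    """Fill the template.

    turns is a list of (role, content) pairs, e.g. [("user", "...")],
    alternating between "user" and "assistant" and ending with "user".
    """
    # Assumption: tools are embedded as a JSON string.
    system = f"{task_instruction}\n{TOOL_INSTRUCTIONS}\n\n{json.dumps(tools)}\n\n"
    parts = ["<|begin_of_text|>", header("system"), system, "<|eot_id|>"]
    for role, content in turns:
        parts.append(header(role) + content + "<|eot_id|>")
    # Leave the prompt open on an assistant header so the model writes the next turn.
    parts.append(header("assistant"))
    return "".join(parts)


prompt = build_prompt(
    "You are a helpful assistant that can use tools.",
    [{"name": "UserDetail", "parameters": {"type": "object"}}],
    [("user", "Extract Jason is 25 years old")],
)
print(prompt)
```

For multi-turn conversations, append earlier assistant replies and follow-up user messages to `turns` in order.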

## Usage

### Command Line
1. Install the llama.cpp framework from source, following the instructions [here](https://github.com/ggerganov/llama.cpp)
2. Run inference as shown below. To configure generation-related parameters, refer to the [llama.cpp documentation](https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md)
```
llama-cli -m [PATH-TO-LOCAL-GGUF]
```
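
When the model is prompted with the template from the Prompt template section, its reply is expected to be either a JSON array of tool calls or plain text. The sketch below shows one way to tell the two apart. `parse_tool_calls` is a hypothetical helper, and it assumes the reply contains only the JSON array with no surrounding text.

```python
import json


def parse_tool_calls(reply):
    """Return a list of {"name", "arguments"} dicts, or None for a plain-text reply."""
    try:
        data = json.loads(reply.strip())
    except json.JSONDecodeError:
        return None  # not JSON, so treat the reply as plain text
    if not isinstance(data, list):
        return None  # the template asks for a single JSON array
    calls = []
    for item in data:
        # Every entry must be an object with a string "name".
        if not (isinstance(item, dict) and isinstance(item.get("name"), str)):
            return None
        calls.append({"name": item["name"], "arguments": item.get("arguments", {})})
    return calls


reply = '[{"name": "UserDetail", "arguments": {"name": "Jason", "age": 25}}]'
print(parse_tool_calls(reply))
print(parse_tool_calls("Hello! How can I help?"))
```

Returning `None` for anything that does not match the expected shape keeps the caller's logic simple: a list means "execute these calls", `None` means "show the text to the user".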

### Python framework

1. Install [llama-cpp-python](https://github.com/abetlen/llama-cpp-python)
```
pip install llama-cpp-python
```
2. Refer to the [llama-cpp-python high-level API](https://github.com/abetlen/llama-cpp-python?tab=readme-ov-file#high-level-api). Here's an example:
```python
from llama_cpp import Llama

# Load the local GGUF file; replace the placeholder with the path to your download.
llm = Llama(
      model_path="[PATH-TO-MODEL]"
)
output = llm.create_chat_completion(
      messages = [
        {
          "role": "system",
          "content": "You are a helpful assistant that can use tools. You are developed by Salesforce xLAM team."

        },
        {
          "role": "user",
          "content": "Extract Jason is 25 years old"
        }
      ],
      tools=[{
        "type": "function",
        "function": {
          "name": "UserDetail",
          "parameters": {
            "type": "object",
            "title": "UserDetail",
            "properties": {
              "name": {
                "title": "Name",
                "type": "string"
              },
              "age": {
                "title": "Age",
                "type": "integer"
              }
            },
            "required": [ "name", "age" ]
          }
        }
      }],
      tool_choice={
        "type": "function",
        "function": {
          "name": "UserDetail"
        }
      }
)
print(output['choices'][0]['message'])
```
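
To act on the result, the tool calls need to be pulled out of the returned message. The sketch below assumes the message follows the OpenAI-style shape, with a `tool_calls` list whose entries carry a `function` object holding a `name` and JSON-encoded `arguments`. That shape is an assumption here and may vary with the llama-cpp-python version and chat format in use. `extract_tool_calls` is a hypothetical helper.

```python
import json


def extract_tool_calls(message):
    """Return a list of (name, arguments) pairs from an OpenAI-style message dict."""
    calls = []
    # "tool_calls" may be absent or None when the model answers in plain text.
    for call in message.get("tool_calls") or []:
        fn = call.get("function", {})
        args = fn.get("arguments", "{}")
        if isinstance(args, str):
            # Arguments are assumed to arrive as a JSON-encoded string.
            args = json.loads(args)
        calls.append((fn.get("name"), args))
    return calls


# Hand-written sample message for illustration, not real model output.
sample = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_0",
            "type": "function",
            "function": {
                "name": "UserDetail",
                "arguments": '{"name": "Jason", "age": 25}',
            },
        }
    ],
}
for name, args in extract_tool_calls(sample):
    print(name, args)
```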