---
license: mit
---

# SRTK Scorer

This model is a trained scorer for [SRTK](https://github.com/happen2me/subgraph-retrieval-toolkit). During subgraph retrieval, it scores the similarity between a query and each candidate expansion path.
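To illustrate where such a scorer fits, here is a minimal sketch of beam search over relation paths. It is not SRTK's implementation: the toy graph, the `overlap_scorer` stand-in (simple word overlap instead of the trained model), and the `beam_search_paths` helper are all hypothetical names introduced for this example, and the sketch does not model any decision to stop expanding early.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy graph: entity -> list of (relation, neighbor entity).
Graph = Dict[str, List[Tuple[str, str]]]
Path = Tuple[str, ...]
Scorer = Callable[[str, Path], float]


def overlap_scorer(query: str, path: Path) -> float:
    """Stand-in for the trained scorer: counts relation tokens found in the query."""
    words = set(query.lower().split())
    return float(sum(1 for rel in path for tok in rel.split("_") if tok in words))


def beam_search_paths(
    graph: Graph,
    start: str,
    query: str,
    scorer: Scorer,
    beam_width: int = 2,
    max_depth: int = 1,
) -> List[Tuple[Path, str]]:
    """Expand relation paths from `start`, keeping the top `beam_width` per depth."""
    beams: List[Tuple[Path, str]] = [((), start)]
    for _ in range(max_depth):
        candidates = [
            (path + (relation,), neighbor)
            for path, entity in beams
            for relation, neighbor in graph.get(entity, [])
        ]
        if not candidates:
            break  # nothing left to expand
        # Rank candidate paths by their similarity to the query.
        candidates.sort(key=lambda c: scorer(query, c[0]), reverse=True)
        beams = candidates[:beam_width]
    return beams


toy_graph: Graph = {
    "Q1": [("place_of_birth", "Q2"), ("occupation", "Q3"), ("spouse", "Q4")],
}
beams = beam_search_paths(
    toy_graph, "Q1", "what is the place of birth of Q1", overlap_scorer
)
print(beams)
```

With a beam width of 2 and a maximum depth of 1, the sketch keeps the two highest-scoring one-hop paths from the start entity; swapping `overlap_scorer` for a learned model is the role this scorer plays during retrieval.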

## Training Information

The model is initialized from `roberta-base` and trained jointly on the following datasets:

- [WebQSP for Freebase](https://www.microsoft.com/en-us/download/details.aspx?id=52763)
- [SimpleQuestionsWikidata for Wikidata](https://github.com/askplatypus/wikidata-simplequestions)
- [SimpleDBpediaQA](https://github.com/castorini/SimpleDBpediaQA)

It achieves an answer coverage rate of 0.9728 on SimpleQuestionsWikidata (depth 1) and 0.8501 on the WebQSP test set (depth 2), both with a beam width of only 2.
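As a rough illustration of the metric, the sketch below computes an answer coverage rate under the assumption that a question counts as covered when at least one gold answer entity appears in its retrieved subgraph. That definition, the `answer_coverage_rate` helper, and the sample entity IDs are assumptions made for this example and may differ from how SRTK's `--evaluate` option computes the figure.

```python
from typing import Iterable, Set


def answer_coverage_rate(
    retrieved: Iterable[Set[str]], answers: Iterable[Set[str]]
) -> float:
    """Fraction of questions whose retrieved entities include a gold answer."""
    pairs = list(zip(retrieved, answers))
    if not pairs:
        return 0.0
    # A question is covered if the two entity sets intersect (assumed definition).
    hits = sum(1 for nodes, gold in pairs if nodes & gold)
    return hits / len(pairs)


# Made-up entity IDs, one set per question.
retrieved_entities = [{"Q2", "Q3"}, {"Q9"}, {"Q5"}]
gold_answers = [{"Q2"}, {"Q7"}, {"Q5", "Q6"}]
rate = answer_coverage_rate(retrieved_entities, gold_answers)
print(f"{rate:.4f}")
```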

## Usage Example

First install the package:

```bash
pip install srtk
```

Then you can retrieve subgraphs with the help of this scorer:

```bash
srtk retrieve -i data/wikidata-simplequestions/intermediate/scores_test.jsonl \
    -o artifacts/subgraphs/wikidata-simple-contrast \
    -e http://localhost:1234/api/endpoint/sparql \
    --scorer-model-path drt/srtk-scorer \
    --beam-width 2 --max-depth 1 --evaluate
```

## Limitations

Because both SimpleQuestionsWikidata and SimpleDBpediaQA contain only one-hop relations, the model tends to stop after one hop when retrieving subgraphs on Wikidata and DBpedia. We plan to release an updated version of the model trained on a more diverse dataset.

## License

MIT