---
library_name: transformers
tags:
- sentiment-analysis
- aspect-based-sentiment-analysis
- transformers
- bert
language:
- tr
metrics:
- accuracy
base_model:
- dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
datasets:
- Sengil/Turkish-ABSA-Wsynthetic
---


# Aspect-Based Sentiment Analysis with Turkish 🇹🇷 Data

<!-- Provide a quick summary of what the model is/does. -->
This model performs **Aspect-Based Sentiment Analysis (ABSA) 🚀** for Turkish text. It predicts sentiment polarity (Positive, Neutral, Negative) towards specific aspects within a given sentence.

---

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->
This model is fine-tuned from the `dbmdz/bert-base-turkish-cased` pretrained BERT model. It is trained on the **Turkish-ABSA-Wsynthetic** dataset, which contains Turkish restaurant reviews annotated with aspect-based sentiments. The model is capable of identifying the sentiment polarity for specific aspects (e.g., "servis," "fiyatlar") mentioned in Turkish sentences.

- **Developed by:** Sengil
- **Language(s):** Turkish 🇹🇷
- **License:** Apache-2.0
- **Finetuned from model:** `dbmdz/bert-base-turkish-cased`
- **Number of Labels:** 3 (Negative, Neutral, Positive)

### Sources

<!-- Provide the basic links for the model. -->
- **Notebook:** [ABSA_Turkish_BERT_Based_Small](https://www.kaggle.com/code/mertsengil/absa-train-w-synthetic-restaurant-reviews)

---
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

This model can be used directly for analyzing aspect-specific sentiment in Turkish text, especially in domains like restaurant reviews.

### Downstream Use

It can be fine-tuned for similar tasks in different domains (e.g., e-commerce, hotel reviews, or customer feedback analysis).

### Out-of-Scope Use

- Not suitable for tasks other than sentiment analysis, or for languages other than Turkish.
- May not perform well on text whose domain-specific vocabulary differs significantly from the training data.

---

### Limitations

- May struggle with rare or ambiguous aspects not covered in the training data.
- May exhibit biases present in the training dataset.


## How to Get Started with the Model

<!-- This section provides code examples and links to further documentation. -->

```
!pip install -U transformers
```

Use the code below to get started with the model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Sengil/ABSA-Turkish-bert-based-small")
model = AutoModelForSequenceClassification.from_pretrained("Sengil/ABSA-Turkish-bert-based-small")

# Example inference: pair the sentence with the aspect to be scored
text = "Servis çok yavaştı ama yemekler lezzetliydi."
aspect = "servis"
formatted_text = f"[CLS] {text} [SEP] {aspect} [SEP]"

inputs = tokenizer(formatted_text, return_tensors="pt", padding="max_length", truncation=True, max_length=128)
with torch.no_grad():  # gradients are not needed for prediction
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=1).item()

# Map prediction to label
labels = {0: "Negative", 1: "Neutral", 2: "Positive"}
print(f"Sentiment for '{aspect}': {labels[predicted_class]}")
```
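The example sentence mentions more than one aspect, so it can be useful to score several aspects at once and to report a probability alongside each label. The following is a minimal sketch rather than part of the original notebook: `format_pair`, `softmax`, and `logits_to_prediction` are hypothetical helper names, the input format simply mirrors the example above, and the logits are made-up placeholder values, not real model output.

```python
import math

# Label order taken from the example above.
LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def format_pair(text: str, aspect: str) -> str:
    """Build the sentence/aspect string in the same format as the example above."""
    return f"[CLS] {text} [SEP] {aspect} [SEP]"


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_prediction(logits):
    """Return (label, probability) for one row of three logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


text = "Servis çok yavaştı ama yemekler lezzetliydi."
aspects = ["servis", "yemekler"]
batch = [format_pair(text, a) for a in aspects]

# Placeholder logits for illustration only. With the real model, one row per
# aspect would come from passing `batch` through the tokenizer and model and
# calling `.logits.tolist()` on the output.
example_logits = [[2.0, 0.1, -1.5], [-1.2, 0.0, 2.4]]

for aspect, row in zip(aspects, example_logits):
    label, prob = logits_to_prediction(row)
    print(f"{aspect}: {label} ({prob:.2f})")
```

Reporting the probability next to the label makes it easier to spot low-confidence predictions, for example on aspects that were rare in the training data.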


## Training Details

### Training Data

The model was fine-tuned on the `Turkish-ABSA-Wsynthetic.csv` dataset, which contains semi-synthetic Turkish sentences annotated for aspect-based sentiment analysis.

### Training Procedure

- Optimizer: AdamW
- Learning Rate: 2e-5
- Batch Size: 16
- Epochs: 5
- Max Sequence Length: 128
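For anyone reproducing the run, the hyperparameters listed above can be collected into a single configuration object. This is only a sketch: `FineTuneConfig` and `total_steps` are hypothetical names, the step count assumes no gradient accumulation, and the dataset size used below is a placeholder, since the card does not state the number of training examples.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    # Values listed in this model card.
    base_model: str = "dbmdz/bert-base-turkish-cased"
    optimizer: str = "AdamW"
    learning_rate: float = 2e-5
    batch_size: int = 16
    epochs: int = 5
    max_seq_length: int = 128
    num_labels: int = 3

    def total_steps(self, num_train_examples: int) -> int:
        """Optimizer steps over the whole run, assuming no gradient accumulation."""
        steps_per_epoch = math.ceil(num_train_examples / self.batch_size)
        return steps_per_epoch * self.epochs


config = FineTuneConfig()

# Hypothetical dataset size, used only to illustrate the arithmetic.
print(config.total_steps(num_train_examples=10_000))
```

The total step count is mainly useful when setting up a learning-rate schedule; the card does not say whether one was used.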


## Evaluation

The model achieved the following scores on the test set:

- Accuracy: 95.48%
- F1 Score (Weighted): 95.46%
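
As a reminder of what the two metrics measure, the sketch below computes accuracy and weighted F1 from scratch. It assumes the usual definition of weighted F1 (per-class F1 averaged with weights proportional to each class's number of true examples); the card does not state which library produced the reported scores, and the labels below are toy data, not the model's test set.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, count in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * count / total
    return score


# Toy labels: 0 = Negative, 1 = Neutral, 2 = Positive.
y_true = [0, 0, 1, 2, 2, 2]
y_pred = [0, 1, 1, 2, 2, 0]

print(f"Accuracy: {accuracy(y_true, y_pred):.4f}")
print(f"Weighted F1: {weighted_f1(y_true, y_pred):.4f}")
```

Weighted F1 stays close to accuracy when classes are balanced and diverges when a minority class is predicted poorly, which is why both are worth reporting.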


## Citation

```
@misc{absa_turkish_bert_based_small,
  title={Aspect-Based Sentiment Analysis for Turkish},
  author={Sengil},
  year={2024},
  url={https://huggingface.co/Sengil/ABSA_Turkish_BERT_Based_Small}
}
```

## Model Card Contact

For any questions or issues, please open an issue in the repository or get in touch via [LinkedIn](https://www.linkedin.com/in/mertsengil/).