---
library_name: transformers
tags:
- sentiment-analysis
- aspect-based-sentiment-analysis
- transformers
- bert
language:
- tr
metrics:
- accuracy
base_model:
- dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
datasets:
- Sengil/Turkish-ABSA-Wsynthetic
---
# Aspect-Based Sentiment Analysis with Turkish 🇹🇷 Data
This model performs **Aspect-Based Sentiment Analysis (ABSA) 🚀** for Turkish text. It predicts sentiment polarity (Positive, Neutral, Negative) towards specific aspects within a given sentence.
---
## Model Details
### Model Description
This model is fine-tuned from the `dbmdz/bert-base-turkish-cased` pretrained BERT model. It is trained on the **Turkish-ABSA-Wsynthetic** dataset, which contains Turkish restaurant reviews annotated with aspect-based sentiments. The model is capable of identifying the sentiment polarity for specific aspects (e.g., "servis," "fiyatlar") mentioned in Turkish sentences.
- **Developed by:** Sengil
- **Language(s):** Turkish 🇹🇷
- **License:** Apache-2.0
- **Finetuned from model:** `dbmdz/bert-base-turkish-cased`
- **Number of Labels:** 3 (Negative, Neutral, Positive)
### Sources
- **Notebook:** [ABSA_Turkish_BERT_Based_Small](https://www.kaggle.com/code/mertsengil/absa-train-w-synthetic-restaurant-reviews)
---
## Uses
### Direct Use
This model can be used directly for analyzing aspect-specific sentiment in Turkish text, especially in domains like restaurant reviews.
### Downstream Use
It can be fine-tuned for similar tasks in different domains (e.g., e-commerce, hotel reviews, or customer feedback analysis).
### Out-of-Scope Use
- Not suitable for tasks other than sentiment analysis or for languages other than Turkish.
- May not perform well on datasets with significantly different domain-specific vocabulary.
---
## Limitations
- May struggle with rare or ambiguous aspects not covered in the training data.
- May exhibit biases present in the training dataset.
## How to Get Started with the Model
```
!pip install -U transformers torch
```
Use the code below to get started with the model:
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_id = "Sengil/ABSA-Turkish-bert-based-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # disable dropout for inference

# Example inference: the sentence and the target aspect are joined into one input
text = "Servis çok yavaştı ama yemekler lezzetliydi."
aspect = "servis"
formatted_text = f"[CLS] {text} [SEP] {aspect} [SEP]"
inputs = tokenizer(formatted_text, return_tensors="pt", padding="max_length", truncation=True, max_length=128)

# No gradients are needed at inference time
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=1).item()

# Map the predicted class index to its label
labels = {0: "Negative", 1: "Neutral", 2: "Positive"}
print(f"Sentiment for '{aspect}': {labels[predicted_class]}")
```
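A sentence often mentions several aspects, and a confidence score is usually more useful than a bare class index. The sketch below shows one way to build one input per aspect and turn a row of logits into a label and a probability with a softmax. It uses only the standard library so it runs without downloading the model. The helper names (`format_input`, `softmax`, `decode`) and the logit values are made up for illustration; in practice each row would come from `outputs.logits.tolist()`.

```python
import math

# Label order taken from the example above.
LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def format_input(text: str, aspect: str) -> str:
    """Build the sentence/aspect string in the same layout as the example above."""
    return f"[CLS] {text} [SEP] {aspect} [SEP]"


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for one row of logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


text = "Servis çok yavaştı ama yemekler lezzetliydi."
aspects = ["servis", "yemekler"]

# Hypothetical logits, one row per aspect (NOT real model output).
made_up_logits = [[2.0, 0.0, -1.0], [-1.0, 0.0, 2.0]]

results = {}
for aspect, row in zip(aspects, made_up_logits):
    label, confidence = decode(row)
    results[aspect] = (label, confidence)
    print(f"{format_input(text, aspect)!r} -> {label} ({confidence:.2f})")
```

Running the model once per aspect keeps each prediction tied to a single target, which matches the one-aspect-per-input layout of the example above.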
## Training Details
### Training Data
The model was fine-tuned on the `Turkish-ABSA-Wsynthetic.csv` dataset. The dataset contains semi-synthetic Turkish sentences annotated for aspect-based sentiment analysis.
### Training Procedure
- Optimizer: AdamW
- Learning Rate: 2e-5
- Batch Size: 16
- Epochs: 5
- Max Sequence Length: 128
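
For reproduction attempts, the hyperparameters above can be collected in one place. The following is a minimal sketch, not the author's training script: `FineTuneConfig` and `total_optimizer_steps` are hypothetical names, and the step count assumes one optimizer update per batch (no gradient accumulation) with the final partial batch kept. The example size of 10,000 is a placeholder, not the real size of the training split.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as listed in this model card."""
    base_model: str = "dbmdz/bert-base-turkish-cased"
    optimizer: str = "AdamW"
    learning_rate: float = 2e-5
    batch_size: int = 16
    epochs: int = 5
    max_seq_length: int = 128
    num_labels: int = 3


def total_optimizer_steps(config: FineTuneConfig, num_train_examples: int) -> int:
    """Parameter updates over the whole run, assuming one update per batch."""
    steps_per_epoch = math.ceil(num_train_examples / config.batch_size)
    return steps_per_epoch * config.epochs


config = FineTuneConfig()
# Placeholder dataset size, used only to illustrate the calculation.
print(total_optimizer_steps(config, 10_000))
```

The step count is what a learning-rate scheduler with warmup would need, should one be added.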
## Evaluation
The model achieved the following scores on the test set:
- Accuracy: 95.48%
- F1 Score (Weighted): 95.46%
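
The card does not say which library produced these scores. The sketch below assumes the usual definitions: accuracy is the share of correct predictions, and weighted F1 is the per-class F1 averaged with weights proportional to each class's count in the true labels. The label lists are toy data, not the actual test set.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)
    total = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = count - tp
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is >= count >= 1 here.
        f1 = 2 * tp / (2 * tp + fp + fn)
        total += f1 * count
    return total / len(y_true)


# Toy labels (0 = Negative, 1 = Neutral, 2 = Positive), NOT the real test set.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

print(f"Accuracy: {accuracy(y_true, y_pred):.4f}")
print(f"Weighted F1: {weighted_f1(y_true, y_pred):.4f}")
```

Because the weights follow class frequency, weighted F1 stays close to accuracy when classes are balanced and diverges more when they are skewed.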
## Citation
```
@misc{absa_turkish_bert_based_small,
title={Aspect-Based Sentiment Analysis for Turkish},
author={Sengil},
year={2024},
url={https://huggingface.co/Sengil/ABSA_Turkish_BERT_Based_Small}
}
```
## Model Card Contact
For any questions or issues, please open an issue in the repository or contact the author on [LinkedIn](https://www.linkedin.com/in/mertsengil/).