metadata

title: SigmaTriple
emoji: 🔍
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: 1.32.0
app_file: app.py
pinned: false

SigmaTriple: Knowledge Graph Extraction from Markdown

This Hugging Face Space provides a Streamlit interface for extracting knowledge graphs from markdown text using the SciPhi/Triplex model.

Features

Extract Knowledge Graphs: Automatically identify entities and relationships from markdown text
Customizable Entity Types and Predicates: Define the types of entities and relationships you want to extract
Batch Processing: Process large markdown files efficiently using vLLM
Interactive Visualization: View the extracted knowledge graph as an interactive network diagram
File Upload Support: Upload markdown files directly or input text manually

How It Works

The application uses the SciPhi/Triplex model, which is fine-tuned for knowledge graph extraction
Markdown text is processed to extract plain text content
Large texts are split into overlapping chunks and processed in batches, so that context is preserved across chunk boundaries
The model identifies entities and relationships based on the specified entity types and predicates
Results are parsed and visualized as an interactive knowledge graph
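
The chunking and prompting steps above can be sketched roughly as follows. This is a minimal illustration and not the Space's actual code: the helper names `split_into_chunks` and `build_prompt`, the default chunk size and overlap, and the exact prompt wording are all assumptions. The prompt layout only loosely follows the style of the example on the SciPhi/Triplex model card, which remains the authoritative source for the format.

```python
import json


def split_into_chunks(text, chunk_size=2000, overlap=200):
    """Split text into character windows that share `overlap` characters.

    The sizes are illustrative defaults, not values taken from the app.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


def build_prompt(text, entity_types, predicates):
    """Assemble one extraction prompt (hypothetical wording)."""
    types_json = json.dumps({"entity_types": entity_types})
    preds_json = json.dumps({"predicates": predicates})
    return (
        "Perform Named Entity Recognition (NER) and extract knowledge graph "
        "triplets from the text.\n\n"
        f"**Entity Types:**\n{types_json}\n\n"
        f"**Predicates:**\n{preds_json}\n\n"
        f"**Text:**\n{text}\n"
    )


# One prompt per chunk; the list could then be passed to a batched generate call.
document = "Alice founded Acme. " * 300
prompts = [
    build_prompt(chunk, ["PERSON", "ORGANIZATION"], ["WORKS_AT", "FOUNDED"])
    for chunk in split_into_chunks(document)
]
```

Splitting on characters keeps the sketch simple. Splitting on sentences or tokens would avoid cutting an entity name in half, at the cost of more code.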

Usage

Configure Entity Types and Predicates:
- In the sidebar, customize the entity types (e.g., PERSON, ORGANIZATION) and predicates (e.g., WORKS_AT, FOUNDED) you want to extract
Input Text:
- Choose between direct text input or file upload
- For text input, simply paste your markdown text in the provided area
- For file upload, select a Markdown (.md or .markdown) or plain-text (.txt) file
Extract Knowledge Graph:
- Click the "Extract Knowledge Graph" button to process the text
- View the raw model output, extracted triplets table, and interactive visualization
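
To illustrate how the raw model output might be turned into the triplets table, here is a minimal parsing sketch. It assumes the model returns JSON with an `entities_and_triples` list whose items look like `[1], PERSON:Alice` for entities and `[1] WORKS_AT [2]` for relations. That format is an assumption based on the Triplex model card and should be checked against real output; the function name `parse_triplets` is hypothetical.

```python
import json
import re

# Assumed item shapes: "[1], PERSON:Alice" and "[1] WORKS_AT [2]".
ENTITY_RE = re.compile(r"^\[(\d+)\],\s*([^:]+):(.+)$")
TRIPLE_RE = re.compile(r"^\[(\d+)\]\s+(.+?)\s+\[(\d+)\]$")


def parse_triplets(raw_output):
    """Return (subject, predicate, object) tuples from one model response."""
    try:
        items = json.loads(raw_output).get("entities_and_triples", [])
    except (json.JSONDecodeError, AttributeError):
        return []  # malformed or unexpected output yields no triplets

    entities = {}
    relations = []
    for item in items:
        if not isinstance(item, str):
            continue
        item = item.strip()
        entity = ENTITY_RE.match(item)
        if entity:
            idx, _entity_type, name = entity.groups()
            entities[idx] = name.strip()
            continue
        triple = TRIPLE_RE.match(item)
        if triple:
            relations.append(triple.groups())

    # Keep only relations whose endpoints were both declared as entities.
    return [
        (entities[subj], pred, entities[obj])
        for subj, pred, obj in relations
        if subj in entities and obj in entities
    ]


raw = json.dumps({
    "entities_and_triples": [
        "[1], PERSON:Alice",
        "[2], ORGANIZATION:Acme",
        "[1] WORKS_AT [2]",
    ]
})
rows = parse_triplets(raw)
```

Each returned tuple maps naturally to one table row and one edge in the graph. If the model wraps its JSON in extra text, that wrapper would need to be stripped before calling the function.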

Use Cases

Research Papers: Extract key concepts and relationships from academic papers
Documentation: Create knowledge graphs from technical documentation
Content Analysis: Identify key entities and relationships in articles or blog posts
Educational Content: Visualize relationships between concepts in educational materials

Limitations

The quality of extraction depends on the clarity and structure of the input text
Very large documents may require significant processing time
The model may not capture all relationships, especially those requiring deep contextual understanding