---
title: VulnBuster
emoji: 🛡️
colorFrom: red
colorTo: purple
sdk: docker
app_file: start.sh
pinned: true
tags:
- agent-demo-track
- security
- mcp
- vulnerability-scanner
- ai-agent
short_description: 'AI Security Agent: Multi-MCP Code Vulnerability Scanner'
license: mit
authors:
  - name: zjkarina
    url: https://huggingface.co/zjkarina
  - name: brtbrr
    url: https://huggingface.co/brtbrr
  - name: RustemX
    url: https://huggingface.co/RustemX
  - name: R0m9n
    url: https://huggingface.co/R0m9n
---
# 🛡️ VulnBuster
**An intelligent AI agent demonstrating automated code security auditing through orchestrated MCP services.**
VulnBuster showcases an agentic approach to vulnerability scanning by combining multiple security tools in a single, intelligent interface. The agent automatically analyzes code using various scanners, correlates findings, and provides AI-powered remediation suggestions.
## 🎯 Agentic Demo Features
- **🤖 Intelligent Agent Orchestration**: AI agent coordinates multiple MCP security scanners
- **🔄 Automated Workflow**: Upload code → Multi-tool analysis → AI-powered fixes
- **🧠 Context-Aware Analysis**: Agent understands scan results and provides meaningful insights
- **⚡ Real-time Processing**: Live analysis with immediate feedback and suggestions
- **🎛️ Multi-Scanner Integration**: Bandit, Detect Secrets, Semgrep, Pip Audit, and Circle Test
## 🎥 Video Demo
[▶️ Watch VulnBuster Demo](https://youtu.be/kAy1c7rCmSw)
*Video demonstration showing the agentic workflow and real-world usage scenarios*
## 🚀 Quick Start
1. **Upload your code file** (Python, JavaScript, Java, Go, Ruby)
2. **Select scanners** or let the agent choose automatically
3. **Review security findings** with AI analysis
4. **Download fixed code** with automatic remediation
## 👤 Authors
- [zjkarina](https://huggingface.co/zjkarina)
- [brtbrr](https://huggingface.co/brtbrr)
- [RustemX](https://huggingface.co/RustemX)
- [R0m9n](https://huggingface.co/R0m9n)
## 🛠️ Integrated Security Tools
VulnBuster orchestrates five specialized MCP servers, each focusing on different aspects of code security. The AI agent intelligently coordinates these tools to provide comprehensive vulnerability analysis.
### 🔒 Bandit Security Scanner
**Repository**: [PyCQA/bandit](https://github.com/PyCQA/bandit)
**Specialization**: Python-specific security analysis
Bandit is a security linter designed to find common security issues in Python code. Our MCP integration enables:
- **Static Code Analysis**: Detects hardcoded passwords, SQL injection patterns, shell injection risks
- **Security Profiles**: Specialized scans for Shell Injection, SQL Injection, Crypto vulnerabilities
- **Baseline Management**: Creates security baselines for tracking new vulnerabilities over time
- **Severity & Confidence Levels**: Configurable thresholds (low/medium/high) for precise reporting
**Agent Integration**: The agent automatically selects appropriate Bandit profiles based on code patterns and adjusts severity levels based on the development context.
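The profile and threshold selection described above could translate into a Bandit command line roughly as follows. This is a minimal sketch, not VulnBuster's actual implementation: `build_bandit_command` and the `PROFILE_TESTS` grouping are hypothetical names, and the sketch assumes a Bandit version that supports the `--severity-level` and `--confidence-level` options.

```python
from typing import List, Optional

# Hypothetical grouping of Bandit test IDs into the profiles named above.
PROFILE_TESTS = {
    "shell_injection": ["B602", "B604", "B605"],
    "sql_injection": ["B608"],
    "crypto": ["B324"],
}

LEVELS = ("low", "medium", "high")


def build_bandit_command(target: str, profile: Optional[str] = None,
                         severity: str = "low", confidence: str = "low") -> List[str]:
    """Build a Bandit CLI invocation as an argument list (no shell involved)."""
    if severity not in LEVELS or confidence not in LEVELS:
        raise ValueError("severity and confidence must be low, medium, or high")
    cmd = ["bandit", "-r", target, "-f", "json",
           "--severity-level", severity, "--confidence-level", confidence]
    if profile is not None:
        # Restrict the scan to the tests that belong to the chosen profile.
        cmd += ["-t", ",".join(PROFILE_TESTS[profile])]
    return cmd


print(build_bandit_command("src/", profile="shell_injection", severity="medium"))
```

Returning an argument list rather than a shell string means the command can be passed to `subprocess.run` without `shell=True`, which avoids reintroducing the very issue B602 reports.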
### 🔐 Detect Secrets Scanner
**Repository**: [Yelp/detect-secrets](https://github.com/Yelp/detect-secrets)
**Specialization**: Secret and credential detection
A security tool that prevents secrets from getting checked into your codebase. Our enhanced MCP server provides:
- **Entropy-Based Detection**: Configurable base64 and hex entropy limits for secret identification
- **Plugin Architecture**: Multiple detection plugins for API keys, passwords, private keys, tokens
- **Smart Filtering**: Excludes false positives while maintaining high detection accuracy
- **Baseline Support**: Tracks known secrets to focus on new leaks
- **Word List Integration**: Custom dictionaries for domain-specific secret patterns
**Agent Integration**: The agent fine-tunes entropy thresholds based on code type and implements intelligent filtering to reduce false positives in legitimate base64/hex content.
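To make the entropy limits concrete, here is a small sketch of Shannon entropy over a candidate string. It illustrates the general technique only; `shannon_entropy` and `looks_like_secret` are hypothetical helpers, and the default `limit` of 4.5 is an assumed value, not one taken from VulnBuster's configuration.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(candidate: str, limit: float = 4.5) -> bool:
    """Flag strings whose entropy exceeds `limit` (the threshold is an assumption)."""
    return shannon_entropy(candidate) > limit
```

Ordinary words score low because they repeat a small set of characters, while randomly generated tokens spread evenly over a large alphabet and score high. Raising the limit trades fewer false positives for a higher chance of missing short secrets.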
### 🛡️ Semgrep Scanner
**Website**: [semgrep.dev](https://semgrep.dev)
**Specialization**: Advanced static analysis with custom rules
Semgrep is a powerful static analysis tool that finds bugs, security vulnerabilities, and enforces code standards. Our MCP implementation offers:
- **Multi-Language Support**: Python, JavaScript, Java, Go, Ruby, and 20+ other languages
- **Rule-Based Analysis**: Extensive rule sets from the Semgrep community (p/default, p/security)
- **Pattern Matching**: Advanced syntax-aware pattern matching for complex vulnerability detection
- **Custom Rules**: Support for organization-specific security policies and coding standards
- **Performance**: Fast scanning with minimal false positives
**Agent Integration**: The agent automatically selects appropriate rule sets based on detected programming languages and adjusts analysis depth based on file types and project context.
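A language-driven rule selection like the one described above might look like this. The extension table, the ruleset names in `LANGUAGE_RULESETS`, and both function names are assumptions made for illustration; check the Semgrep registry for the rulesets your deployment should actually use.

```python
from pathlib import Path
from typing import List

# Assumed mappings for illustration only.
EXTENSION_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".java": "java",
    ".go": "go",
    ".rb": "ruby",
}
LANGUAGE_RULESETS = {
    "python": ["p/python"],
    "javascript": ["p/javascript"],
    "java": ["p/java"],
    "go": ["p/golang"],
    "ruby": ["p/ruby"],
}


def select_rulesets(filename: str) -> List[str]:
    """Always include p/default, then add language-specific rulesets if known."""
    language = EXTENSION_LANGUAGE.get(Path(filename).suffix.lower())
    return ["p/default"] + LANGUAGE_RULESETS.get(language, [])


def build_semgrep_command(filename: str) -> List[str]:
    """Build a Semgrep invocation with one --config flag per selected ruleset."""
    cmd = ["semgrep", "scan", "--json"]
    for ruleset in select_rulesets(filename):
        cmd += ["--config", ruleset]
    cmd.append(filename)
    return cmd
```

Files with an unrecognized extension fall back to `p/default` alone, so an unknown language still gets a baseline scan.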
### 📦 Pip Audit Scanner
**Repository**: [pypa/pip-audit](https://github.com/pypa/pip-audit/tree/main)
**Specialization**: Python dependency vulnerability scanning
Pip-audit is the official Python Packaging Authority tool for auditing Python environments against known vulnerabilities. Features include:
- **CVE Database**: Scans against the Python Package Index (PyPI) vulnerability database
- **Requirements Analysis**: Processes requirements.txt, Pipfile.lock, and installed packages
- **Vulnerability Fixing**: Suggests specific version upgrades to resolve security issues
- **Supply Chain Security**: Identifies compromised or malicious packages in dependency trees
- **Integration Support**: Works with virtual environments, Docker containers, and CI/CD pipelines
**Agent Integration**: The agent correlates dependency vulnerabilities with code usage patterns, prioritizing fixes based on actual code paths and exposure risk.
### 📋 Circle Test Scanner
**Platform**: [White Circle AI](https://huggingface.co/whitecircle-ai)
**Specialization**: AI safety and policy compliance
Powered by White Circle's advanced AI safety platform, this scanner focuses on security policy compliance:
- **12 Security Policies**: Comprehensive checks covering SPDX licensing, credential exposure, deprecated APIs
- **Code Quality Gates**: Detects TODO/FIXME tags, debug statements, and development artifacts in production code
- **Path Security**: Validates file operations, prevents path traversal vulnerabilities
- **Cryptographic Standards**: Enforces modern cryptographic practices, detects weak algorithms (MD5, etc.)
- **Container Security**: Checks file permissions, environment variable handling
- **Supply Chain Policies**: Validates dependency pinning, production environment separation
**Agent Integration**: The agent uses Circle Test as a final compliance layer, ensuring that all code changes meet organizational security standards and regulatory requirements.
## 🎛️ Agent Orchestration Workflow
```mermaid
graph TB
A[Code Upload] --> B[VulnBuster AI Agent]
B --> C[Language Detection]
C --> D[Tool Selection & Configuration]
D --> E[🔒 Bandit<br/>Python Security]
D --> F[🔐 Detect Secrets<br/>Credential Scan]
D --> G[🛡️ Semgrep<br/>Multi-Language Analysis]
D --> H[📦 Pip Audit<br/>Dependency Check]
D --> I[📋 Circle Test<br/>Policy Compliance]
E --> J[AI Correlation Engine]
F --> J
G --> J
H --> J
I --> J
J --> K[Vulnerability Prioritization]
K --> L[Automated Fix Generation]
L --> M[Remediated Code Output]
```
## 🎛️ Agent Architecture
```mermaid
graph TB
A[User Input] --> B[VulnBuster Agent]
B --> C[MCP Scanner 1]
B --> D[MCP Scanner 2]
B --> E[MCP Scanner N]
C --> F[AI Analysis Engine]
D --> F
E --> F
F --> G[Remediation Suggestions]
F --> H[Fixed Code Output]
```
The agent intelligently:
1. **Analyzes** incoming code
2. **Selects** appropriate scanners
3. **Coordinates** parallel scanning
4. **Correlates** findings across tools
5. **Generates** fix recommendations
6. **Produces** remediated code
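The coordinate-and-correlate steps above can be sketched with `asyncio`. Everything here is illustrative: the two scanner coroutines are stand-ins for real MCP calls, and `Finding`, `run_scanners`, and `correlate` are hypothetical names, not VulnBuster's API.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, NamedTuple


class Finding(NamedTuple):
    tool: str
    rule: str
    line: int


Scanner = Callable[[str], Awaitable[List[Finding]]]


async def shell_scanner(code: str) -> List[Finding]:
    """Stand-in for a Bandit MCP call: flags lines that use shell=True."""
    await asyncio.sleep(0)  # placeholder for real I/O
    return [Finding("bandit", "B602", n)
            for n, text in enumerate(code.splitlines(), start=1)
            if "shell=True" in text]


async def pattern_scanner(code: str) -> List[Finding]:
    """Stand-in for a Semgrep MCP call reporting the same pattern under its own rule name."""
    await asyncio.sleep(0)  # placeholder for real I/O
    return [Finding("semgrep", "subprocess-shell-true", n)
            for n, text in enumerate(code.splitlines(), start=1)
            if "shell=True" in text]


async def run_scanners(code: str, scanners: List[Scanner]) -> List[Finding]:
    """Run every scanner concurrently and flatten their findings."""
    batches = await asyncio.gather(*(scan(code) for scan in scanners))
    return [finding for batch in batches for finding in batch]


def correlate(findings: List[Finding]) -> Dict[int, List[Finding]]:
    """Group findings by line so overlapping reports can be reviewed together."""
    grouped: Dict[int, List[Finding]] = {}
    for finding in findings:
        grouped.setdefault(finding.line, []).append(finding)
    return grouped


SAMPLE = 'import subprocess\nsubprocess.call("ls", shell=True)\n'
report = correlate(asyncio.run(run_scanners(SAMPLE, [shell_scanner, pattern_scanner])))
```

Grouping by line is the simplest possible correlation key. A real agent would likely also compare rule categories and file paths before treating two findings as the same issue.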
## 📊 Advanced Usage Examples
### Example 1: Multi-Layer Python Security Analysis
```python
# Vulnerable Python code
import subprocess
import pickle
import sqlite3

# Multiple security issues for demonstration
API_KEY = "sk_live_51H1h2K3L4M5N6O7P8Q9R0S1T2U3V4W5X6Y7Z8"  # Detect Secrets
password = "admin123"  # Bandit B105


def execute_command(user_input):
    subprocess.call(f"ls {user_input}", shell=True)  # Bandit B602


def load_data(data):
    return pickle.loads(data)  # Bandit B301


def query_db(user_id):
    conn = sqlite3.connect('users.db')
    query = f"SELECT * FROM users WHERE id = {user_id}"  # Semgrep: SQL injection
    return conn.execute(query).fetchall()


# TODO: Fix authentication system  # Circle Test Policy #3
```
**Agent Analysis Results**:
- **Bandit**: 3 issues of differing severity (B105 low, B301 medium, B602 high)
- **Detect Secrets**: 1 API key detected with high entropy
- **Semgrep**: SQL injection vulnerability identified
- **Circle Test**: TODO comment flagged, production code quality violation
- **Agent Remediation**: Generates secure alternatives with proper input validation
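For reference, secure alternatives to the snippet in this example could look like the sketch below. It is one possible remediation, not the agent's actual output; the `API_KEY` environment variable name is an assumption, and switching from `pickle` to JSON only works when the data is JSON-representable.

```python
import json
import os
import sqlite3
import subprocess


def get_api_key() -> str:
    """Read the key from the environment instead of hardcoding it."""
    return os.environ["API_KEY"]  # variable name is an assumption


def list_directory(user_input: str) -> str:
    """Pass arguments as a list so no shell ever parses the user input."""
    result = subprocess.run(["ls", "--", user_input],
                            capture_output=True, text=True, check=False)
    return result.stdout


def load_data(data: bytes) -> object:
    """Parse untrusted input as JSON rather than unpickling it."""
    return json.loads(data)


def query_db(conn: sqlite3.Connection, user_id: int) -> list:
    """Bind user_id as a parameter so it is never spliced into the SQL text."""
    return conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchall()
```

With the bound parameter, an input such as `"1 OR 1=1"` is compared as a literal value and matches no row, instead of rewriting the query.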
### Example 2: JavaScript/Node.js Security Scan
```javascript
// Vulnerable Node.js code
const express = require('express');
const fs = require('fs');
const app = express();

const API_SECRET = 'abc123def456'; // Detect Secrets

app.get('/file/:filename', (req, res) => {
  // Path traversal vulnerability - Semgrep detection
  const filepath = `/uploads/${req.params.filename}`;
  fs.readFile(filepath, (err, data) => {
    if (err) throw err;
    res.send(data);
  });
});
```
**Agent Response**:
- **Semgrep**: Path traversal vulnerability in file handler
- **Detect Secrets**: Hardcoded API secret detection
- **Circle Test**: Missing input validation policies
- **Agent Fix**: Implements path sanitization and secure secret management
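The path sanitization idea is language-independent. The sketch below shows it in Python rather than Node.js, purely as an illustration: `resolve_upload` and `UPLOAD_ROOT` are hypothetical names, and the upload directory is taken from the example above.

```python
from pathlib import Path

UPLOAD_ROOT = Path("/uploads")


def resolve_upload(filename: str, root: Path = UPLOAD_ROOT) -> Path:
    """Return the resolved path for `filename`, rejecting anything outside `root`."""
    root = root.resolve()
    candidate = (root / filename).resolve()
    # After resolution, a safe path must have the root among its ancestors.
    if root not in candidate.parents:
        raise ValueError(f"path escapes upload root: {filename!r}")
    return candidate
```

Resolving before checking matters: a string test on the raw input would miss `..` segments and absolute paths, both of which are normalized away by `resolve()`.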
### Example 3: Dependency Vulnerability Assessment
```txt
# requirements.txt with vulnerable packages
Django==2.0.0 # Known CVE vulnerabilities
requests==2.18.4 # Outdated version
Pillow>=5.0.0,<6.0.0 # Version range instead of pinned
pycrypto==2.6.1 # Deprecated cryptographic library
```
**Comprehensive Analysis**:
- **Pip Audit**: 4 vulnerable packages identified with specific CVE numbers
- **Circle Test**: Policy violations for unpinned dependencies and deprecated crypto
- **Agent Resolution**: Suggests exact secure versions and modern alternatives
- **Supply Chain Risk**: Analyzes dependency trees for transitive vulnerabilities
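The unpinned-dependency check mentioned above can be approximated with a few lines of standard-library Python. This is a simplified sketch with a hypothetical `find_unpinned` helper: it ignores pip options such as `-r`, environment markers, and URL requirements, and it treats only `name==version` lines as pinned.

```python
import re
from typing import List

# A requirement counts as pinned only if it has the form name==version
# (optionally with extras, e.g. name[extra]==version).
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=<>!~,\s]+$")


def find_unpinned(requirements: str) -> List[str]:
    """Return requirement lines that are not pinned to one exact version."""
    unpinned = []
    for raw in requirements.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

Run against the `requirements.txt` in this example, only the `Pillow` range is reported, which matches the policy violation listed above.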
### Example 4: Enterprise Policy Compliance Check
```python
#!/usr/bin/env python3
# Missing SPDX license identifier - Circle Test Policy #1
import os
import hashlib


def authenticate_user(username, password):
    # MD5 usage flagged by Circle Test Policy #13
    password_hash = hashlib.md5(password.encode()).hexdigest()
    # Hardcoded production URL - Circle Test Policy #11
    auth_server = "https://prod-auth.company.com/api/login"
    # TODO: Implement proper session management - Policy #3
    return True


# Debug code left in production - Circle Test Policy #14
import pdb; pdb.set_trace()
```
**Policy Compliance Results**:
- **Circle Test**: 4 policy violations detected
- **Bandit**: MD5 usage and hardcoded values flagged
- **Agent Remediation**: Implements SPDX headers, modern crypto, environment variables, removes debug code
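A remediated version of the password handling might resemble the following sketch. It is an assumption-laden illustration rather than the agent's output: the `AUTH_SERVER_URL` variable name, the helper names, and the iteration count are all choices made here, and PBKDF2 is just one of several acceptable replacements for MD5.

```python
# SPDX-License-Identifier: MIT
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Read the endpoint from the environment instead of hardcoding it
# (the variable name is an assumption).
AUTH_SERVER = os.environ.get("AUTH_SERVER_URL", "")


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = 600_000) -> Tuple[bytes, bytes]:
    """Derive a salted PBKDF2-HMAC-SHA256 digest; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = 600_000) -> bool:
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

The salt and iteration count must be stored alongside the digest so that `verify_password` can reproduce the same derivation later.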
## 🚀 Real-World Impact
VulnBuster's agent-driven approach provides:
- **95% Faster Analysis**: Parallel scanning reduces analysis time from hours to minutes
- **Cross-Tool Correlation**: Identifies vulnerability chains missed by individual tools
- **Context-Aware Fixes**: Generates fixes that maintain code functionality while improving security
- **Compliance Automation**: Ensures adherence to security policies across development lifecycle
- **Learning System**: Agent improves recommendations based on codebase patterns and fix acceptance rates
## 🌐 MCP Integration
Connect VulnBuster to your IDE using MCP:
```json
{
  "mcpServers": {
    "vulnbuster": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://agents-mcp-hackathon-vulnbuster.hf.space/gradio_api/mcp/sse",
        "--transport",
        "sse-only"
      ]
    }
  }
}
```
## 🔍 Comprehensive Vulnerability Coverage
VulnBuster's multi-scanner approach provides comprehensive security coverage across different layers:
### 🔒 Code-Level Vulnerabilities (Bandit + Semgrep)
- **Injection Attacks**: SQL injection, command injection, code injection via `eval()`/`exec()`
- **Cryptographic Issues**: Weak algorithms (MD5, SHA1), hardcoded encryption keys
- **Unsafe Functions**: Use of `pickle`, `marshal`, `yaml.load()` without safe parameters
- **Path Traversal**: Unsafe file operations, directory traversal vulnerabilities
- **XML External Entities (XXE)**: Insecure XML parsing configurations
- **Deserialization**: Unsafe object deserialization patterns
### 🔐 Secret & Credential Leaks (Detect Secrets)
- **API Keys**: AWS, Google Cloud, Azure access keys and tokens
- **Authentication Tokens**: JWT tokens, OAuth tokens, session cookies
- **Database Credentials**: Passwords, connection strings, database URLs
- **Private Keys**: SSH keys, SSL certificates, PGP keys
- **High-Entropy Strings**: Base64/hex encoded secrets with configurable thresholds
- **Custom Patterns**: Domain-specific secrets using word lists and regex patterns
### 📦 Supply Chain Vulnerabilities (Pip Audit)
- **Known CVEs**: Direct dependencies with published security advisories
- **Transitive Dependencies**: Vulnerabilities in dependencies of dependencies
- **Malicious Packages**: Typosquatting and compromised package detection
- **Version Pinning**: Outdated packages with available security updates
- **License Compliance**: Incompatible or problematic package licenses
### 📋 Policy & Compliance Violations (Circle Test)
- **License Compliance**: Missing or non-approved SPDX license identifiers
- **Code Quality**: TODO/FIXME comments in production code
- **Development Artifacts**: Debug statements, test code in production
- **Insecure Communication**: HTTP URLs without proper validation
- **Data Exposure**: Logging sensitive information without masking
- **Deprecated APIs**: Usage of functions marked as deprecated
- **File System Security**: Overly permissive file permissions (0o777)
- **Environment Security**: Runtime environment variable modifications
### 🛡️ Multi-Language Support (Semgrep)
| Language | Vulnerability Types | Coverage |
|----------|-------------------|----------|
| **Python** | Injection, Crypto, Deserialization | Comprehensive |
| **JavaScript/Node.js** | XSS, Prototype pollution, Path traversal | Full |
| **Java** | Injection, XXE, Deserialization | Extensive |
| **Go** | Race conditions, Crypto, Input validation | Growing |
| **Ruby** | Injection, Mass assignment, Crypto | Good |
| **PHP** | Injection, File inclusion, Crypto | Basic |
### 🎯 Risk Prioritization Matrix
The agent automatically prioritizes vulnerabilities based on:
| Severity | Exploitability | Business Impact | Examples |
|----------|---------------|-----------------|----------|
| **Critical** | Remote + High | Data breach | SQL injection in auth system |
| **High** | Remote + Medium | Service disruption | Command injection in API |
| **Medium** | Local + High | Information leak | Hardcoded credentials |
| **Low** | Local + Low | Code quality | TODO comments, deprecated APIs |
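One way to turn the matrix above into an ordering is a simple sort key. The sketch is illustrative only: the `Finding` fields, the `SEVERITY_RANK` table, and the tie-breaking rule (remote before local, then title) are assumptions, not the agent's documented scoring.

```python
from dataclasses import dataclass
from typing import List

SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    title: str
    severity: str  # one of the SEVERITY_RANK keys
    remote: bool   # True if exploitable remotely


def prioritize(findings: List[Finding]) -> List[Finding]:
    """Sort by severity first, then remote before local, then by title."""
    return sorted(findings,
                  key=lambda f: (SEVERITY_RANK[f.severity], not f.remote, f.title))
```

Sorting on a tuple keeps the rule easy to audit: each element of the key corresponds to one column of the matrix, in order of importance.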
### 🔄 Continuous Monitoring Capabilities
- **Baseline Tracking**: Monitors new vulnerabilities against established security baselines
- **Regression Detection**: Identifies when previously fixed issues reappear
- **Trend Analysis**: Tracks vulnerability patterns and improvement metrics
- **Policy Evolution**: Adapts to new security standards and organizational requirements
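Baseline tracking and regression detection both reduce to set comparisons over stable finding identifiers. The sketch below assumes a hypothetical fingerprint scheme (tool, rule, path, and whitespace-normalized snippet); how VulnBuster actually identifies findings is not specified here.

```python
import hashlib
from typing import Dict, Set


def fingerprint(tool: str, rule: str, path: str, snippet: str) -> str:
    """Hash the identifying fields; whitespace is normalized so reformatting is ignored."""
    normalized = " ".join(snippet.split())
    raw = "\x1f".join((tool, rule, path, normalized))
    return hashlib.sha256(raw.encode()).hexdigest()


def compare_to_baseline(baseline: Set[str], fixed: Set[str],
                        current: Set[str]) -> Dict[str, Set[str]]:
    """Classify the current scan against known-open and previously fixed findings."""
    return {
        "new": current - baseline - fixed,  # never seen before
        "regressions": current & fixed,     # fixed earlier, present again
        "resolved": baseline - current,     # open before, gone now
    }
```

Leaving line numbers out of the fingerprint keeps a finding's identity stable when unrelated edits shift the code up or down in the file.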
## 🛡️ Local Development
```bash
# Clone and run
git clone https://huggingface.co/spaces/Agents-MCP-Hackathon/VulnBuster
cd VulnBuster
# Setup environment
echo "NEBIUS_API_KEY=your_api_key_here" > .env
# Build and run
docker build -t vulnbuster .
docker run -p 7860:7860 --env-file .env vulnbuster
```
## 🏗️ Technical Architecture
- **Frontend**: Gradio web interface with file upload and real-time results
- **Backend**: FastAPI with async processing for concurrent scanner execution
- **Agent Framework**: Agno with Nebius LLM for intelligent analysis and correlation
- **MCP Servers**: 5 specialized security scanners with standardized interfaces
- **Containerization**: Single Docker image with all dependencies and services
- **Communication**: HTTP/SSE for MCP protocol, JSON for data exchange
**Tags:** `agent-demo-track`
**Note**: This tool provides static analysis and should be used as part of a comprehensive security strategy. The AI agent assists with remediation but human review is recommended for production code.