	---
	title: VulnBuster
	emoji: 🛡️
	colorFrom: red
	colorTo: purple
	sdk: docker
	app_file: start.sh
	pinned: true
	tags:
	- agent-demo-track
	- security
	- mcp
	- vulnerability-scanner
	- ai-agent
	short_description: 'AI Security Agent: Multi-MCP Code Vulnerability Scanner'
	license: mit
	authors:
	- name: zjkarina
	  url: https://huggingface.co/zjkarina
	- name: brtbrr
	  url: https://huggingface.co/brtbrr
	- name: RustemX
	  url: https://huggingface.co/RustemX
	- name: R0m9n
	  url: https://huggingface.co/R0m9n
	---

	# 🛡️ VulnBuster

	An intelligent AI agent demonstrating automated code security auditing through orchestrated MCP services.

	VulnBuster showcases an agentic approach to vulnerability scanning by combining multiple security tools in a single, intelligent interface. The agent automatically analyzes code using various scanners, correlates findings, and provides AI-powered remediation suggestions.

	## 🎯 Agentic Demo Features

	- 🤖 Intelligent Agent Orchestration: AI agent coordinates multiple MCP security scanners
	- 🔄 Automated Workflow: Upload code → Multi-tool analysis → AI-powered fixes
	- 🧠 Context-Aware Analysis: Agent understands scan results and provides meaningful insights
	- ⚡ Real-time Processing: Live analysis with immediate feedback and suggestions
	- 🎛️ Multi-Scanner Integration: Bandit, Detect Secrets, Semgrep, Pip Audit, and Circle Test

	## 🎥 Video Demo

	[▶️ Watch VulnBuster Demo](https://youtu.be/kAy1c7rCmSw)

	The video demonstrates the agentic workflow and real-world usage scenarios.

	## 🚀 Quick Start

	1. Upload your code file (Python, JavaScript, Java, Go, Ruby)
	2. Select scanners or let the agent choose automatically
	3. Review security findings with AI analysis
	4. Download fixed code with automatic remediation

	## 👤 Authors

	- [zjkarina](https://huggingface.co/zjkarina)
	- [brtbrr](https://huggingface.co/brtbrr)
	- [RustemX](https://huggingface.co/RustemX)
	- [R0m9n](https://huggingface.co/R0m9n)

	## 🛠️ Integrated Security Tools

	VulnBuster orchestrates five specialized MCP servers, each focusing on different aspects of code security. The AI agent intelligently coordinates these tools to provide comprehensive vulnerability analysis.

	### 🔒 Bandit Security Scanner
	Repository: [PyCQA/bandit](https://github.com/PyCQA/bandit)
	Specialization: Python-specific security analysis

	Bandit is a security linter designed to find common security issues in Python code. Our MCP integration enables:

	- Static Code Analysis: Detects hardcoded passwords, SQL injection patterns, shell injection risks
	- Security Profiles: Specialized scans for Shell Injection, SQL Injection, Crypto vulnerabilities
	- Baseline Management: Creates security baselines for tracking new vulnerabilities over time
	- Severity & Confidence Levels: Configurable thresholds (low/medium/high) for precise reporting

	Agent Integration: The agent automatically selects appropriate Bandit profiles based on code patterns and adjusts severity levels based on the development context.
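
As a rough illustration of the configurable thresholds described above, the sketch below builds a Bandit command line and reduces its JSON report to a short list. The helper names (`build_bandit_command`, `summarize_bandit_report`, `run_bandit`) are hypothetical and not taken from the VulnBuster code; only the Bandit CLI flags and JSON field names are Bandit's own.

```python
import json
import subprocess

LEVELS = ("low", "medium", "high")


def build_bandit_command(target, severity="medium", confidence="medium", baseline=None):
    """Build a Bandit CLI invocation that emits JSON for the given thresholds."""
    if severity not in LEVELS or confidence not in LEVELS:
        raise ValueError("severity and confidence must be low, medium, or high")
    cmd = [
        "bandit", "-r", target,
        "-f", "json",
        "--severity-level", severity,
        "--confidence-level", confidence,
    ]
    if baseline:
        cmd += ["-b", baseline]  # report only issues absent from the baseline file
    return cmd


def summarize_bandit_report(report_json):
    """Reduce Bandit's JSON report to (test_id, severity, file, line) tuples."""
    report = json.loads(report_json)
    return [
        (r["test_id"], r["issue_severity"], r["filename"], r["line_number"])
        for r in report.get("results", [])
    ]


def run_bandit(target, **options):
    """Run Bandit (must be installed) and return the summarized findings."""
    # Bandit exits non-zero when it finds issues, so check=False is deliberate.
    proc = subprocess.run(
        build_bandit_command(target, **options),
        capture_output=True, text=True, check=False,
    )
    return summarize_bandit_report(proc.stdout)
```

Keeping command construction separate from execution makes the threshold logic easy to test without Bandit installed.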

	### 🔍 Detect Secrets Scanner
	Repository: [Yelp/detect-secrets](https://github.com/Yelp/detect-secrets)
	Specialization: Secret and credential detection

	A security tool that prevents secrets from getting checked into your codebase. Our enhanced MCP server provides:

	- Entropy-Based Detection: Configurable base64 and hex entropy limits for secret identification
	- Plugin Architecture: Multiple detection plugins for API keys, passwords, private keys, tokens
	- Smart Filtering: Excludes false positives while maintaining high detection accuracy
	- Baseline Support: Tracks known secrets to focus on new leaks
	- Word List Integration: Custom dictionaries for domain-specific secret patterns

	Agent Integration: The agent fine-tunes entropy thresholds based on code type and implements intelligent filtering to reduce false positives in legitimate base64/hex content.
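
To make entropy-based detection concrete, here is a minimal sketch of a Shannon entropy check. The limits of 4.5 (base64) and 3.0 (hex) are assumed to match the detect-secrets defaults, and `looks_like_secret` is a hypothetical helper, not the plugin's actual implementation.

```python
import math
from collections import Counter

# Assumed thresholds; detect-secrets exposes both as configurable limits.
BASE64_LIMIT = 4.5
HEX_LIMIT = 3.0
HEX_CHARS = set("0123456789abcdefABCDEF")


def shannon_entropy(text):
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(candidate):
    """Flag a string whose entropy exceeds the limit for its character set."""
    is_hex = all(ch in HEX_CHARS for ch in candidate)
    limit = HEX_LIMIT if is_hex else BASE64_LIMIT
    return shannon_entropy(candidate) > limit
```

A short dictionary-like value such as `admin123` stays below the limit, which is why entropy checks are usually paired with keyword-based plugins for weak passwords.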

	### 🛡️ Semgrep Scanner
	Website: [semgrep.dev](https://semgrep.dev)
	Specialization: Advanced static analysis with custom rules

	Semgrep is a powerful static analysis tool that finds bugs, security vulnerabilities, and enforces code standards. Our MCP implementation offers:

	- Multi-Language Support: Python, JavaScript, Java, Go, Ruby, and 20+ other languages
	- Rule-Based Analysis: Extensive rule sets from the Semgrep community (p/default, p/security)
	- Pattern Matching: Advanced syntax-aware pattern matching for complex vulnerability detection
	- Custom Rules: Support for organization-specific security policies and coding standards
	- Performance: Fast scanning with minimal false positives

	Agent Integration: The agent automatically selects appropriate rule sets based on detected programming languages and adjusts analysis depth based on file types and project context.
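
The language-based rule selection could look roughly like the sketch below. The extension map and the per-language registry packs are assumptions made for illustration (check the Semgrep registry for current pack names); `p/default` is the rule set named above.

```python
from pathlib import Path

# Extensions for the languages listed in Quick Start.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".java": "java",
    ".go": "go",
    ".rb": "ruby",
}

# Assumed registry pack per language; verify the names before relying on them.
LANGUAGE_RULESETS = {
    "python": "p/python",
    "javascript": "p/javascript",
    "java": "p/java",
    "go": "p/golang",
    "ruby": "p/ruby",
}


def detect_language(filename):
    """Guess the language from the file extension, or return None."""
    return EXTENSION_TO_LANGUAGE.get(Path(filename).suffix.lower())


def build_semgrep_command(filename):
    """Build a Semgrep invocation with the default pack plus a language pack."""
    configs = ["p/default"]
    language = detect_language(filename)
    if language in LANGUAGE_RULESETS:
        configs.append(LANGUAGE_RULESETS[language])
    cmd = ["semgrep", "scan", "--json"]
    for config in configs:
        cmd += ["--config", config]
    return cmd + [filename]
```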

	### 📦 Pip Audit Scanner
	Repository: [pypa/pip-audit](https://github.com/pypa/pip-audit/tree/main)
	Specialization: Python dependency vulnerability scanning

	pip-audit is a tool maintained under the Python Packaging Authority (PyPA) for auditing Python environments against known vulnerabilities. Features include:

	- Advisory Database: Checks packages against the PyPA Advisory Database, served through the PyPI JSON API
	- Requirements Analysis: Processes requirements.txt, Pipfile.lock, and installed packages
	- Vulnerability Fixing: Suggests specific version upgrades to resolve security issues
	- Supply Chain Security: Identifies compromised or malicious packages in dependency trees
	- Integration Support: Works with virtual environments, Docker containers, and CI/CD pipelines

	Agent Integration: The agent correlates dependency vulnerabilities with code usage patterns, prioritizing fixes based on actual code paths and exposure risk.
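
As a sketch of how upgrade suggestions might be derived, the snippet below parses pip-audit's JSON output. The helper names are hypothetical, and the report shape (`dependencies`, `vulns`, `fix_versions`) is assumed from recent pip-audit releases; older releases emit a bare list, which the code also accepts.

```python
import json


def build_pip_audit_command(requirements_path):
    """Audit a requirements file and request machine-readable output."""
    return ["pip-audit", "-r", requirements_path, "-f", "json"]


def suggest_upgrades(report_json):
    """Map each vulnerable package to its advisories and candidate fix versions."""
    report = json.loads(report_json)
    deps = report["dependencies"] if isinstance(report, dict) else report
    suggestions = {}
    for dep in deps:
        vulns = dep.get("vulns", [])
        if not vulns:
            continue
        fixes = []
        for vuln in vulns:
            for version in vuln.get("fix_versions", []):
                if version not in fixes:  # de-duplicate, keep report order
                    fixes.append(version)
        suggestions[dep["name"]] = {
            "installed": dep.get("version"),
            "advisories": [vuln["id"] for vuln in vulns],
            "fix_versions": fixes,
        }
    return suggestions
```

The advisory IDs in any real report come from the advisory database; the function only regroups them per package.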

	### 📋 Circle Test Scanner
	Platform: [White Circle AI](https://huggingface.co/whitecircle-ai)
	Specialization: AI safety and policy compliance

	Powered by White Circle's advanced AI safety platform, this scanner focuses on security policy compliance:

	- 12 Security Policies: Comprehensive checks covering SPDX licensing, credential exposure, deprecated APIs
	- Code Quality Gates: Detects TODO/FIXME tags, debug statements, and development artifacts in production code
	- Path Security: Validates file operations, prevents path traversal vulnerabilities
	- Cryptographic Standards: Enforces modern cryptographic practices, detects weak algorithms (MD5, etc.)
	- Container Security: Checks file permissions, environment variable handling
	- Supply Chain Policies: Validates dependency pinning, production environment separation

	Agent Integration: The agent uses Circle Test as a final compliance layer, ensuring that all code changes meet organizational security standards and regulatory requirements.
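
The policy checks listed above can be pictured as simple line-level rules. The sketch below is a local stand-in with hypothetical policy names and regular expressions; it does not call the White Circle platform and is not its rule set.

```python
import re

# Hypothetical policy table: (policy name, pattern that signals a violation).
POLICIES = [
    ("todo-fixme", re.compile(r"#.*\b(TODO|FIXME)\b")),
    ("weak-hash-md5", re.compile(r"\bhashlib\.md5\(")),
    ("debugger-call", re.compile(r"\bpdb\.set_trace\(")),
    ("world-writable", re.compile(r"\b0o777\b")),
]
SPDX_HEADER = re.compile(r"SPDX-License-Identifier:\s*\S+")


def check_policies(source):
    """Return (line_number, policy_name) pairs for every violation found."""
    violations = []
    if not SPDX_HEADER.search(source):
        violations.append((1, "missing-spdx"))
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in POLICIES:
            if pattern.search(line):
                violations.append((lineno, name))
    return violations
```

Regex rules like these are cheap but blunt; they match text inside strings and comments alike, so a real policy engine would need more context.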

	## 🎛️ Agent Orchestration Workflow

	```mermaid
	graph TB
	A[Code Upload] --> B[VulnBuster AI Agent]
	B --> C[Language Detection]
	C --> D[Tool Selection & Configuration]

	D --> E[🔒 Bandit<br/>Python Security]
	D --> F[🔍 Detect Secrets<br/>Credential Scan]
	D --> G[🛡️ Semgrep<br/>Multi-Language Analysis]
	D --> H[📦 Pip Audit<br/>Dependency Check]
	D --> I[📋 Circle Test<br/>Policy Compliance]

	E --> J[AI Correlation Engine]
	F --> J
	G --> J
	H --> J
	I --> J

	J --> K[Vulnerability Prioritization]
	K --> L[Automated Fix Generation]
	L --> M[Remediated Code Output]
	```

	## 🎛️ Agent Architecture

	```mermaid
	graph TB
	A[User Input] --> B[VulnBuster Agent]
	B --> C[MCP Scanner 1]
	B --> D[MCP Scanner 2]
	B --> E[MCP Scanner N]
	C --> F[AI Analysis Engine]
	D --> F
	E --> F
	F --> G[Remediation Suggestions]
	F --> H[Fixed Code Output]
	```

	The agent intelligently:
	1. Analyzes incoming code
	2. Selects appropriate scanners
	3. Coordinates parallel scanning
	4. Correlates findings across tools
	5. Generates fix recommendations
	6. Produces remediated code
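
Steps 3 and 4 can be sketched with `asyncio`: scanners run concurrently and their findings are grouped by line. This is a minimal illustration under stated assumptions; the scanner coroutines are stand-ins, and the finding fields (`line`, `issue`, `tool`) are hypothetical rather than VulnBuster's actual schema.

```python
import asyncio
from collections import defaultdict


async def run_scanners(code, scanners):
    """Run every scanner coroutine concurrently; a failing scanner yields no findings."""
    names = list(scanners)
    results = await asyncio.gather(
        *(scanners[name](code) for name in names), return_exceptions=True
    )
    findings = []
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            continue  # one broken scanner should not abort the whole run
        findings.extend({**finding, "tool": name} for finding in result)
    return findings


def correlate(findings):
    """Group findings by line so issues reported by several tools surface together."""
    by_line = defaultdict(list)
    for finding in findings:
        by_line[finding["line"]].append(finding)
    return dict(sorted(by_line.items()))


# Stand-in scanners used only for this example.
async def fake_bandit(code):
    return [{"line": 3, "issue": "subprocess call with shell=True"}]


async def fake_semgrep(code):
    return [{"line": 3, "issue": "command injection"}, {"line": 7, "issue": "SQL injection"}]


async def broken_scanner(code):
    raise RuntimeError("scanner unavailable")


if __name__ == "__main__":
    scanners = {"bandit": fake_bandit, "semgrep": fake_semgrep, "broken": broken_scanner}
    print(correlate(asyncio.run(run_scanners("source code here", scanners))))
```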

	## 📊 Advanced Usage Examples

	### Example 1: Multi-Layer Python Security Analysis
	```python
	# Vulnerable Python code
	import subprocess
	import pickle
	import sqlite3

	# Multiple security issues for demonstration
	API_KEY = "sk_live_51H1h2K3L4M5N6O7P8Q9R0S1T2U3V4W5X6Y7Z8" # Detect Secrets
	password = "admin123" # Bandit B105

	def execute_command(user_input):
	    subprocess.call(f"ls {user_input}", shell=True)  # Bandit B602

	def load_data(data):
	    return pickle.loads(data)  # Bandit B301

	def query_db(user_id):
	    conn = sqlite3.connect('users.db')
	    query = f"SELECT * FROM users WHERE id = {user_id}"  # Semgrep: SQL injection
	    return conn.execute(query).fetchall()

	# TODO: Fix authentication system # Circle Test Policy #3
	```

	Agent Analysis Results:
	- Bandit: 3 issues of differing severity (B105 hardcoded password: low; B301 pickle: medium; B602 subprocess with shell=True: high)
	- Detect Secrets: 1 API key detected with high entropy
	- Semgrep: SQL injection vulnerability identified
	- Circle Test: TODO comment flagged, production code quality violation
	- Agent Remediation: Generates secure alternatives with proper input validation
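
For comparison, one possible shape of the remediated code is sketched below. It is an illustration of common fixes, not the agent's actual output; the environment variable name `PAYMENT_API_KEY` and the switch from pickle to JSON are assumptions.

```python
import json
import os
import sqlite3
import subprocess

# The secret is read from the environment; the variable name is a placeholder.
API_KEY = os.environ.get("PAYMENT_API_KEY")


def list_directory(user_input):
    # An argument list without shell=True means no shell ever parses the input.
    # "--" stops ls from treating the input as an option.
    result = subprocess.run(
        ["ls", "--", user_input], capture_output=True, text=True, check=False
    )
    return result.stdout


def load_data(data):
    # JSON parsing cannot execute code, unlike unpickling untrusted bytes.
    return json.loads(data)


def query_db(conn, user_id):
    # Parameter binding keeps user_id out of the SQL text.
    return conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchall()
```

Passing the connection in, rather than opening `users.db` inside the function, also makes the query easy to exercise against an in-memory database.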

	### Example 2: JavaScript/Node.js Security Scan
	```javascript
	// Vulnerable Node.js code
	const express = require('express');
	const fs = require('fs');

	const app = express();
	const API_SECRET = 'abc123def456'; // Detect Secrets

	app.get('/file/:filename', (req, res) => {
	  // Path traversal vulnerability - Semgrep detection
	  const filepath = `/uploads/${req.params.filename}`;
	  fs.readFile(filepath, (err, data) => {
	    if (err) throw err;
	    res.send(data);
	  });
	});
	```

	Agent Response:
	- Semgrep: Path traversal vulnerability in file handler
	- Detect Secrets: Hardcoded API secret detection
	- Circle Test: Missing input validation policies
	- Agent Fix: Implements path sanitization and secure secret management

	### Example 3: Dependency Vulnerability Assessment
	```txt
	# requirements.txt with vulnerable packages
	Django==2.0.0 # Known CVE vulnerabilities
	requests==2.18.4 # Outdated version
	Pillow>=5.0.0,<6.0.0 # Version range instead of pinned
	pycrypto==2.6.1 # Deprecated cryptographic library
	```

	Comprehensive Analysis:
	- Pip Audit: 4 vulnerable packages identified with specific CVE numbers
	- Circle Test: Policy violations for unpinned dependencies and deprecated crypto
	- Agent Resolution: Suggests exact secure versions and modern alternatives
	- Supply Chain Risk: Analyzes dependency trees for transitive vulnerabilities

	### Example 4: Enterprise Policy Compliance Check
	```python
	#!/usr/bin/env python3
	# Missing SPDX license identifier - Circle Test Policy #1

	import os
	import hashlib

	def authenticate_user(username, password):
	    # MD5 usage flagged by Circle Test Policy #13
	    password_hash = hashlib.md5(password.encode()).hexdigest()

	    # Hardcoded production URL - Circle Test Policy #11
	    auth_server = "https://prod-auth.company.com/api/login"

	    # TODO: Implement proper session management - Policy #3
	    return True

	# Debug code left in production - Circle Test Policy #14
	import pdb; pdb.set_trace()
	```

	Policy Compliance Results:
	- Circle Test: 5 policy violations detected (Policies #1, #3, #11, #13, #14)
	- Bandit: MD5 usage and hardcoded values flagged
	- Agent Remediation: Implements SPDX headers, modern crypto, environment variables, removes debug code
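
A remediated version might look like the sketch below. It is one reasonable set of fixes rather than the agent's actual output: PBKDF2 is swapped in for MD5, and the iteration count and the `AUTH_SERVER_URL` variable name are assumptions.

```python
# SPDX-License-Identifier: MIT
import hashlib
import hmac
import os

# The endpoint comes from the environment instead of being hardcoded.
AUTH_SERVER = os.environ.get("AUTH_SERVER_URL", "")

# Assumed work factor; tune it to your own latency budget.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Derive a salted PBKDF2-HMAC-SHA256 digest instead of a bare MD5 hash."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```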

	## 🚀 Real-World Impact

	VulnBuster's agent-driven approach provides:

	- 95% Faster Analysis: Parallel scanning reduces analysis time from hours to minutes
	- Cross-Tool Correlation: Identifies vulnerability chains missed by individual tools
	- Context-Aware Fixes: Generates fixes that maintain code functionality while improving security
	- Compliance Automation: Ensures adherence to security policies across development lifecycle
	- Learning System: Agent improves recommendations based on codebase patterns and fix acceptance rates

	## 🌐 MCP Integration

	Connect VulnBuster to your IDE using MCP:

	```json
	{
	  "mcpServers": {
	    "vulnbuster": {
	      "command": "npx",
	      "args": [
	        "-y",
	        "mcp-remote",
	        "https://agents-mcp-hackathon-vulnbuster.hf.space/gradio_api/mcp/sse",
	        "--transport",
	        "sse-only"
	      ]
	    }
	  }
	}
	```

	## 🔍 Comprehensive Vulnerability Coverage

	VulnBuster's multi-scanner approach provides comprehensive security coverage across different layers:

	### 🔒 Code-Level Vulnerabilities (Bandit + Semgrep)
	- Injection Attacks: SQL injection, command injection, code injection via `eval()`/`exec()`
	- Cryptographic Issues: Weak algorithms (MD5, SHA1), hardcoded encryption keys
	- Unsafe Functions: Use of `pickle`, `marshal`, `yaml.load()` without safe parameters
	- Path Traversal: Unsafe file operations, directory traversal vulnerabilities
	- XML External Entities (XXE): Insecure XML parsing configurations
	- Deserialization: Unsafe object deserialization patterns

	### 🔍 Secret & Credential Leaks (Detect Secrets)
	- API Keys: AWS, Google Cloud, Azure access keys and tokens
	- Authentication Tokens: JWT tokens, OAuth tokens, session cookies
	- Database Credentials: Passwords, connection strings, database URLs
	- Private Keys: SSH keys, SSL certificates, PGP keys
	- High-Entropy Strings: Base64/hex encoded secrets with configurable thresholds
	- Custom Patterns: Domain-specific secrets using word lists and regex patterns

	### 📦 Supply Chain Vulnerabilities (Pip Audit)
	- Known CVEs: Direct dependencies with published security advisories
	- Transitive Dependencies: Vulnerabilities in dependencies of dependencies
	- Malicious Packages: Typosquatting and compromised package detection
	- Version Pinning: Outdated packages with available security updates
	- License Compliance: Incompatible or problematic package licenses

	### 📋 Policy & Compliance Violations (Circle Test)
	- License Compliance: Missing or non-approved SPDX license identifiers
	- Code Quality: TODO/FIXME comments in production code
	- Development Artifacts: Debug statements, test code in production
	- Insecure Communication: HTTP URLs without proper validation
	- Data Exposure: Logging sensitive information without masking
	- Deprecated APIs: Usage of functions marked as deprecated
	- File System Security: Overly permissive file permissions (0o777)
	- Environment Security: Runtime environment variable modifications

	### 🛡️ Multi-Language Support (Semgrep)
	| Language | Vulnerability Types | Coverage |
	|----------|---------------------|----------|
	| Python | Injection, Crypto, Deserialization | Comprehensive |
	| JavaScript/Node.js | XSS, Prototype pollution, Path traversal | Full |
	| Java | Injection, XXE, Deserialization | Extensive |
	| Go | Race conditions, Crypto, Input validation | Growing |
	| Ruby | Injection, Mass assignment, Crypto | Good |
	| PHP | Injection, File inclusion, Crypto | Basic |

	### 🎯 Risk Prioritization Matrix

	The agent automatically prioritizes vulnerabilities based on:

	| Severity | Exploitability | Business Impact | Examples |
	|----------|----------------|-----------------|----------|
	| Critical | Remote + High | Data breach | SQL injection in auth system |
	| High | Remote + Medium | Service disruption | Command injection in API |
	| Medium | Local + High | Information leak | Hardcoded credentials |
	| Low | Local + Low | Code quality | TODO comments, deprecated APIs |
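
One simple way to apply such a matrix is to sort findings by severity rank, with remotely exploitable issues first within a rank. The field names (`severity`, `remote`) are hypothetical and chosen only for this sketch.

```python
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def prioritize(findings):
    """Order findings by severity, placing remotely exploitable ones first within a rank."""
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f["severity"]], not f.get("remote", False)),
    )
```

Because Python's sort is stable, findings that tie on both keys keep the order in which the scanners reported them.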

	### 🔄 Continuous Monitoring Capabilities

	- Baseline Tracking: Monitors new vulnerabilities against established security baselines
	- Regression Detection: Identifies when previously fixed issues reappear
	- Trend Analysis: Tracks vulnerability patterns and improvement metrics
	- Policy Evolution: Adapts to new security standards and organizational requirements
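
Baseline tracking and regression detection reduce to set comparisons over finding fingerprints. The sketch below assumes a hypothetical fingerprint of tool, rule, and file; line numbers are left out so that unrelated edits do not make old findings look new.

```python
def fingerprint(finding):
    """Identify a finding independently of its line number."""
    return (finding["tool"], finding["rule"], finding["file"])


def compare_to_baseline(baseline, previously_fixed, current):
    """Classify the latest scan against the baseline and the set of fixed fingerprints."""
    base = {fingerprint(f) for f in baseline}
    cur = {fingerprint(f) for f in current}
    return {
        "new": sorted(cur - base - previously_fixed),
        "regressions": sorted(cur & previously_fixed),
        "resolved": sorted(base - cur),
    }
```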

	## 🛡️ Local Development

	```bash
	# Clone and run
	git clone https://huggingface.co/spaces/Agents-MCP-Hackathon/VulnBuster
	cd VulnBuster

	# Setup environment
	echo "NEBIUS_API_KEY=your_api_key_here" > .env

	# Build and run
	docker build -t vulnbuster .
	docker run -p 7860:7860 --env-file .env vulnbuster
	```

	## 🏗️ Technical Architecture

	- Frontend: Gradio web interface with file upload and real-time results
	- Backend: FastAPI with async processing for concurrent scanner execution
	- Agent Framework: Agno with Nebius LLM for intelligent analysis and correlation
	- MCP Servers: 5 specialized security scanners with standardized interfaces
	- Containerization: Single Docker image with all dependencies and services
	- Communication: HTTP/SSE for MCP protocol, JSON for data exchange

	Tags: `agent-demo-track`

	Note: This tool provides static analysis and should be used as part of a comprehensive security strategy. The AI agent assists with remediation but human review is recommended for production code.