datadiver-ai is a web scraping tool that transforms unstructured websites into clean JSON. It extracts paragraphs, lists, links, and images using AI-powered processing.
<br>
<table><tr><th><a href="https://github.com/mostypc123/XediX"><img src="https://xedix.w3spaces.com/xedix-shield.png" alt="Made with XediX" height="35"></a></th></tr></table>
<br>
[!IMPORTANT]<br>
Extract structured data from any website with a simple API!
<br>
Overview
DataDiver AI is an intelligent web scraping tool that transforms unstructured web pages into clean, organized JSON data. Perfect for research, data analysis, content aggregation, and more!
<br>
Features
Universal Scraping - Works with virtually any website
AI-Powered - Uses Mistral AI for intelligent data processing
Content Categorization - Automatically organizes content by section
Rich Content Support - Extracts paragraphs, lists, links, and images
Simple API - Easy-to-use interface for quick integration
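
To illustrate content categorization, here is a minimal sketch of how a section heading could be turned into a JSON key. It assumes keys are derived from headings, which is only inferred from the example response in this README ("About Us" appears under the key `about_us`). The helper name `toSectionKey` is hypothetical and is not part of the project's code.

```typescript
// Hypothetical helper: derive a JSON-safe section key from a heading.
// The key format is an assumption inferred from the example response only.
function toSectionKey(heading: string): string {
  return heading
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_") // collapse each run of non-alphanumerics into "_"
    .replace(/^_+|_+$/g, ""); // drop leading and trailing underscores
}

console.log(toSectionKey("About Us")); // prints "about_us"
```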
Tech Stack
Next.js + React
TypeScript
JSDOM for HTML parsing
Mistral API for optimization
Custom CSS for beautiful UI
Installation
# Clone the repository
git clone https://github.com/divyanshudhruv/datadiver-ai.git
# Navigate to project directory
cd datadiver-ai
# Install dependencies
npm install
# Set up environment variables
cp .env.example .env
# Add your Mistral API key to .env file
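
A missing API key is easier to diagnose if the app fails early with a clear message. The sketch below shows one way to do that. The variable name `MISTRAL_API_KEY` and the helper `requireEnv` are assumptions for illustration; check `.env.example` for the name the project actually uses.

```typescript
// Hypothetical variable name; check .env.example for the real one.
const API_KEY_VAR = "MISTRAL_API_KEY";

// Return the value of a required variable, or throw with a clear message.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}: add it to your .env file`);
  }
  return value;
}

// Example with an in-memory object standing in for the real environment.
const apiKey = requireEnv({ MISTRAL_API_KEY: "your-key-here" }, API_KEY_VAR);
console.log(apiKey.length > 0); // prints true
```

In server-side code you would pass `process.env` instead of the in-memory object.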
Getting Started
# Start the development server
npm run dev
# Open your browser and navigate to
http://localhost:3000
Usage
Web Interface
Enter the URL you want to scrape
Click "Scrape"
View the structured JSON output
API Example
// Fetch data from a URL
const response = await fetch("/api/scrape", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ url: "https://example.com" })
});
const data = await response.json();
console.log(data);
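
The call above does not handle failures. The following is a minimal sketch of a wrapper that validates the URL, checks the HTTP status, and checks the `success` field shown in the example response. The names `scrapeUrl` and `FetchLike` are hypothetical, and treating `success !== true` as an error is an assumption about how the API reports failures. The fetch function is passed in so the sketch can run against a stub without a server.

```typescript
// Minimal shape of the parts of fetch this sketch relies on.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// POST a URL to /api/scrape and return the parsed JSON, or throw on failure.
async function scrapeUrl(
  targetUrl: string,
  fetchImpl: FetchLike
): Promise<unknown> {
  new URL(targetUrl); // throws a TypeError if targetUrl is not a valid URL

  const response = await fetchImpl("/api/scrape", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url: targetUrl }),
  });
  if (!response.ok) {
    throw new Error(`Scrape request failed with HTTP ${response.status}`);
  }

  const data = (await response.json()) as { success?: boolean } | null;
  // Assumption: a failed scrape is reported as success !== true.
  if (data?.success !== true) {
    throw new Error("Scrape response reported failure");
  }
  return data;
}

// Stub standing in for the real endpoint so the sketch runs offline.
const stubFetch: FetchLike = async () => ({
  ok: true,
  status: 200,
  json: async () => ({ success: true, url: "https://example.com" }),
});

scrapeUrl("https://example.com", stubFetch).then((data) => console.log(data));
```

In a page served by the app you would pass the global `fetch` instead of the stub. The relative path `/api/scrape` only resolves in a browser; from a Node script you would need the full address, for example `http://localhost:3000/api/scrape`.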
Example Response
{
  "success": true,
  "url": "https://example.com",
  "data": {
    "title": "Example Website",
    "meta": {
      "description": "This is an example website"
    },
    "content": {
      "about_us": {
        "title": "About Us",
        "items": [
          {
            "type": "paragraph",
            "text": "We are a sample company demonstrating DataDiver AI"
          },
          {
            "type": "list",
            "listType": "unordered",
            "items": ["Feature 1", "Feature 2", "Feature 3"]
          }
        ]
      }
    }
  }
}
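
When consuming this response from TypeScript, it can help to describe its shape with types. The sketch below is inferred from the example above only: the interface names and the `flattenText` helper are hypothetical, and real responses may include fields or item types (such as links and images) that are not modeled here.

```typescript
// Types inferred from the example response; real responses may carry more.
interface ParagraphItem { type: "paragraph"; text: string }
interface ListItem { type: "list"; listType: string; items: string[] }
type ContentItem = ParagraphItem | ListItem;
interface Section { title: string; items: ContentItem[] }
interface ScrapeResponse {
  success: boolean;
  url: string;
  data: {
    title: string;
    meta: { description: string };
    content: Record<string, Section>;
  };
}

// Collect paragraph text and list entries as plain strings, per section title.
// Item types other than "paragraph" and "list" are skipped.
function flattenText(response: ScrapeResponse): Record<string, string[]> {
  const out: Record<string, string[]> = {};
  for (const section of Object.values(response.data.content)) {
    const lines: string[] = [];
    for (const item of section.items) {
      if (item.type === "paragraph") lines.push(item.text);
      else if (item.type === "list") lines.push(...item.items);
    }
    out[section.title] = lines;
  }
  return out;
}

const example: ScrapeResponse = {
  success: true,
  url: "https://example.com",
  data: {
    title: "Example Website",
    meta: { description: "This is an example website" },
    content: {
      about_us: {
        title: "About Us",
        items: [
          { type: "paragraph", text: "We are a sample company demonstrating DataDiver AI" },
          { type: "list", listType: "unordered", items: ["Feature 1", "Feature 2", "Feature 3"] },
        ],
      },
    },
  },
};

console.log(flattenText(example));
```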
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
# Create a new branch
git checkout -b feature/amazing-feature
# Make your changes and commit them
git commit -m 'Add some amazing feature'
# Push to the branch
git push origin feature/amazing-feature
# Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.