AI-powered CSV cleaning and validation for seamless data imports
Features • Quick Start • Demo • Architecture
Transform messy, inconsistent CSV files into clean, import-ready data. CSV Cleaner Agent analyzes your data, detects quality issues, and recommends cleaning steps, all powered by the Claude Agent SDK.
Perfect for preparing data imports for Shopify, QuickBooks, Business Central, and more.
Smart CSV Parsing - Automatic header detection and data structure analysis
Data Profiling - Detect column types, null values, and anomalies
AI-Powered Insights - Intelligent cleaning recommendations based on your target platform
Fast Processing - Runs on Bun, a fast JavaScript runtime
Custom MCP Tools - Extensible tool architecture for specialized cleaning operations
- Column mapping and header normalization
- Date format standardization
- Phone number formatting
- Deduplication
- Platform-specific schema validation (Shopify, QuickBooks, Business Central)
- Detailed cleaning reports
- Web interface for file upload and preview
- Saved cleaning presets
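The list above names header normalization and deduplication without describing how they work. The following is only a minimal sketch of one plausible approach; `normalizeHeader` and `dedupeRows` are hypothetical helper names, not functions taken from this repository.

```typescript
// Hypothetical helper: normalize a header to snake_case by trimming,
// lowercasing, and collapsing runs of non-alphanumeric characters to "_".
function normalizeHeader(header: string): string {
  return header
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

// Hypothetical helper: keep the first row for each combination of key columns,
// comparing values case-insensitively and ignoring surrounding whitespace.
function dedupeRows(
  rows: Record<string, string>[],
  keys: string[],
): Record<string, string>[] {
  const seen = new Set<string>();
  return rows.filter((row) => {
    const signature = keys
      .map((k) => (row[k] ?? "").trim().toLowerCase())
      .join("\u0000");
    if (seen.has(signature)) return false;
    seen.add(signature);
    return true;
  });
}
```

Comparing keys case-insensitively is a design choice that also catches duplicates hidden by inconsistent casing; a stricter import target might prefer exact matching.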
# Clone the repository
git clone https://github.com/joeynyc/-CSVCleanerAgent.git
cd CSVCleanerAgent
# Install dependencies
bun install
# Set up your API key
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY
Interactive Mode:
bun start
Analyze a Specific File:
bun start "Analyze sample.csv and suggest cleaning steps"
Development Mode (Auto-reload):
bun run dev
Type Checking:
bun run typecheck
Try it with the included sample CSV that contains common data quality issues:
bun start "Profile the data in sample.csv and identify issues"
The agent will:
- Parse the CSV structure
- Analyze each column for data types and quality
- Detect issues (missing values, format inconsistencies, duplicates)
- Recommend cleaning strategies
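To make the column-analysis step more concrete, here is a rough sketch of how a single column could be profiled for its dominant type and missing values. It is an illustration under stated assumptions, not the agent's actual logic: `profileColumn`, the `ColumnProfile` shape, and the list of placeholder strings treated as missing are all hypothetical.

```typescript
type ColumnType = "number" | "boolean" | "date" | "string" | "empty";

interface ColumnProfile {
  inferredType: ColumnType;
  nullCount: number;
  distinctCount: number;
}

// Assumption: blank cells and these placeholders count as missing.
const MISSING = new Set(["", "null", "n/a", "na"]);

function isMissing(value: string): boolean {
  return MISSING.has(value.trim().toLowerCase());
}

function classify(value: string): Exclude<ColumnType, "empty"> {
  const v = value.trim();
  if (/^-?\d+(\.\d+)?$/.test(v)) return "number";
  if (/^(true|false)$/i.test(v)) return "boolean";
  if (/^\d{4}-\d{2}-\d{2}$/.test(v) || /^\d{1,2}[\/-]\d{1,2}[\/-]\d{4}$/.test(v)) {
    return "date";
  }
  return "string";
}

function profileColumn(values: string[]): ColumnProfile {
  const present = values.filter((v) => !isMissing(v));
  const counts = { number: 0, boolean: 0, date: 0, string: 0 };
  for (const v of present) counts[classify(v)] += 1;

  // Majority vote over non-missing cells; ties go to the earlier entry in `order`.
  const order: Exclude<ColumnType, "empty">[] = ["number", "boolean", "date", "string"];
  let inferredType: ColumnType = "empty";
  let best = 0;
  for (const t of order) {
    if (counts[t] > best) {
      best = counts[t];
      inferredType = t;
    }
  }

  return {
    inferredType,
    nullCount: values.length - present.length,
    distinctCount: new Set(present.map((v) => v.trim())).size,
  };
}
```

A column whose majority type is `number` but which still contains a few `string` cells is a likely candidate for an anomaly report.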
Sample Data Issues:
- Missing names
- Inconsistent date formats (YYYY-MM-DD vs MM/DD/YYYY vs DD-MM-YYYY)
- Various phone number formats
- Inconsistent SKU casing
- Empty values in required fields
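The mixed date formats listed above are a good example of an issue that can be detected mechanically. The sketch below classifies a value into one of the three formats and rewrites it as YYYY-MM-DD. It assumes the separator alone distinguishes the day/month order (slashes mean MM/DD/YYYY, dashes mean DD-MM-YYYY), which may not hold for real data; `detectDateFormat` and `toIsoDate` are hypothetical names, and the dates in the usage note are illustrative rather than taken from sample.csv.

```typescript
type DateFormat = "YYYY-MM-DD" | "MM/DD/YYYY" | "DD-MM-YYYY" | "unknown";

// Assumption: the separator alone tells the two day/month orders apart.
function detectDateFormat(value: string): DateFormat {
  const v = value.trim();
  if (/^\d{4}-\d{2}-\d{2}$/.test(v)) return "YYYY-MM-DD";
  if (/^\d{2}\/\d{2}\/\d{4}$/.test(v)) return "MM/DD/YYYY";
  if (/^\d{2}-\d{2}-\d{4}$/.test(v)) return "DD-MM-YYYY";
  return "unknown";
}

// Returns the date as YYYY-MM-DD, or null if the format is unknown
// or the calendar date does not exist.
function toIsoDate(value: string): string | null {
  const v = value.trim();
  const [a = "", b = "", c = ""] = v.split(/[\/-]/);
  let iso: string;
  switch (detectDateFormat(v)) {
    case "YYYY-MM-DD":
      iso = `${a}-${b}-${c}`;
      break;
    case "MM/DD/YYYY":
      iso = `${c}-${a}-${b}`;
      break;
    case "DD-MM-YYYY":
      iso = `${c}-${b}-${a}`;
      break;
    default:
      return null;
  }
  // Round-trip through Date to reject values such as month 13 or February 30.
  const parsed = new Date(`${iso}T00:00:00Z`);
  if (isNaN(parsed.getTime())) return null;
  return parsed.toISOString().slice(0, 10) === iso ? iso : null;
}
```

With these helpers, `toIsoDate("01/15/2024")` and `toIsoDate("15-01-2024")` both normalize to the same value as `toIsoDate("2024-01-15")`, while unrecognized input yields `null` so it can be reported rather than silently changed.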
┌─────────────────────────────────────────────┐
│              Claude Agent SDK               │
│      (Agent Loop + Context Management)      │
└─────────────────┬───────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────┐
│          MCP Server (csv-cleaner)           │
│  ┌────────────────┐  ┌──────────────────┐   │
│  │  parse_csv     │  │  profile_data    │   │
│  │  - Headers     │  │  - Type detection│   │
│  │  - Row count   │  │  - Null analysis │   │
│  │  - Samples     │  │  - Anomalies     │   │
│  └────────────────┘  └──────────────────┘   │
└─────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────┐
│               Your CSV Files                │
└─────────────────────────────────────────────┘
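The diagram shows parse_csv reporting headers, a row count, and sample rows. As a dependency-free sketch of what the core of such a tool might compute, consider the following. The `ParseResult` shape and the `parseCsv` and `splitCsvLine` names are assumptions made for illustration; the actual tool in index.ts may be structured differently, and registration with the MCP server is omitted here.

```typescript
// Hypothetical result shape, based only on the three items in the parse_csv box.
interface ParseResult {
  headers: string[];
  rowCount: number;
  samples: Record<string, string>[];
}

// Split one CSV line into fields, honoring double-quoted fields and "" escapes.
// Simplification: a quoted field may not contain a line break.
function splitCsvLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') {
        current += '"';
        i++; // skip the second quote of the escape pair
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Treat the first non-blank line as the header row and return a small sample.
function parseCsv(text: string, sampleSize = 3): ParseResult {
  const lines = text.split(/\r?\n/).filter((l) => l.trim() !== "");
  const headers = splitCsvLine(lines[0] ?? "").map((h) => h.trim());
  const rows = lines.slice(1).map((line) => {
    const cells = splitCsvLine(line);
    const row: Record<string, string> = {};
    headers.forEach((h, i) => {
      row[h] = (cells[i] ?? "").trim();
    });
    return row;
  });
  return { headers, rowCount: rows.length, samples: rows.slice(0, sampleSize) };
}
```

Returning only a few sample rows instead of the whole file keeps the tool result small, which matters when the output is fed back into the agent's context.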
- Claude Agent SDK - Autonomous agent framework
- Bun - Fast JavaScript runtime
- TypeScript - Type-safe development
- Zod - Schema validation
- MCP - Model Context Protocol for custom tools
CSVCleanerAgent/
├── index.ts # Main agent implementation
├── package.json # Dependencies and scripts
├── tsconfig.json # TypeScript configuration
├── .env.example # Environment template
├── sample.csv # Example data with quality issues
└── README.md # You are here
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with Claude Agent SDK by Anthropic
- Powered by Bun runtime
- Inspired by the need for better data quality in business operations
Star this repo if you find it useful!
Made with AI