AI agent that cleans messy CSV files using Claude Agent SDK. Automatically detects data quality issues, normalizes formats, and prepares CSVs for import into Shopify, QuickBooks, and Business Central. Built with TypeScript and Bun.

joeynyc/-CSVCleanerAgent

CSV Cleaner Agent

AI-powered CSV cleaning and validation for seamless data imports

License: MIT · TypeScript · Bun · Claude Agent SDK

Features · Quick Start · Demo · Architecture


Overview

CSV Cleaner Agent turns messy, inconsistent CSV files into clean, import-ready data. It analyzes your data, detects quality issues, and recommends cleaning steps, using the Claude Agent SDK.

Perfect for preparing data imports for Shopify, QuickBooks, Business Central, and more.

Features

  • Smart CSV Parsing - Automatic header detection and data structure analysis
  • Data Profiling - Detect column types, null values, and anomalies
  • AI-Powered Insights - Intelligent cleaning recommendations based on your target platform
  • Fast Processing - Built on Bun for lightning-fast performance
  • Custom MCP Tools - Extensible tool architecture for specialized cleaning operations

Upcoming Features

  • Column mapping and header normalization
  • Date format standardization
  • Phone number formatting
  • Deduplication
  • Platform-specific schema validation (Shopify, QuickBooks, Business Central)
  • Detailed cleaning reports
  • Web interface for file upload and preview
  • Saved cleaning presets

Quick Start

Prerequisites

  • Bun (JavaScript runtime)
  • An Anthropic API key (set as ANTHROPIC_API_KEY)

Installation

# Clone the repository
git clone https://github.com/joeynyc/-CSVCleanerAgent.git
cd ./-CSVCleanerAgent

# Install dependencies
bun install

# Set up your API key
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY

Usage

Interactive Mode:

bun start

Analyze a Specific File:

bun start "Analyze sample.csv and suggest cleaning steps"

Development Mode (Auto-reload):

bun run dev

Type Checking:

bun run typecheck

Demo

Try it with the included sample CSV that contains common data quality issues:

bun start "Profile the data in sample.csv and identify issues"

The agent will:

  1. Parse the CSV structure
  2. Analyze each column for data types and quality
  3. Detect issues (missing values, format inconsistencies, duplicates)
  4. Recommend cleaning strategies

Sample Data Issues:

  • Missing names
  • Inconsistent date formats (YYYY-MM-DD vs MM/DD/YYYY vs DD-MM-YYYY)
  • Various phone number formats
  • Inconsistent SKU casing
  • Empty values in required fields
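
To make the date-format issue concrete, here is a hedged sketch of how the three layouts above might be normalized to ISO 8601. Date standardization is listed as an upcoming feature, so this is not the project's code. The function name normalizeDate is hypothetical, and the sketch assumes the separator identifies the layout: slashes mean MM/DD/YYYY and dashes with a trailing year mean DD-MM-YYYY.

```typescript
interface DateParts {
  year: number;
  month: number;
  day: number;
}

type PartKey = keyof DateParts;

// Each entry pairs a pattern with the order of its capture groups.
// Assumption: the separator alone decides which layout a value uses.
const FORMATS: { pattern: RegExp; order: [PartKey, PartKey, PartKey] }[] = [
  { pattern: /^(\d{4})-(\d{1,2})-(\d{1,2})$/, order: ["year", "month", "day"] },
  { pattern: /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/, order: ["month", "day", "year"] },
  { pattern: /^(\d{1,2})-(\d{1,2})-(\d{4})$/, order: ["day", "month", "year"] },
];

// Returns YYYY-MM-DD, or null when the value matches no layout
// or names a day that does not exist (such as month 13).
function normalizeDate(raw: string): string | null {
  const value = raw.trim();
  for (const { pattern, order } of FORMATS) {
    const match = pattern.exec(value);
    if (!match) continue;

    const parts: DateParts = { year: 0, month: 0, day: 0 };
    order.forEach((key, i) => {
      parts[key] = Number(match[i + 1]);
    });

    // Round-trip through Date to reject impossible calendar dates.
    const date = new Date(Date.UTC(parts.year, parts.month - 1, parts.day));
    const valid =
      date.getUTCFullYear() === parts.year &&
      date.getUTCMonth() === parts.month - 1 &&
      date.getUTCDate() === parts.day;
    if (!valid) return null;

    const pad = (n: number): string => String(n).padStart(2, "0");
    return `${parts.year}-${pad(parts.month)}-${pad(parts.day)}`;
  }
  return null;
}
```

Values that return null are candidates for manual review rather than silent correction, since a value like 03/04/2024 cannot be disambiguated without knowing the source's convention.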

Architecture

┌─────────────────────────────────────────────┐
│           Claude Agent SDK                  │
│  (Agent Loop + Context Management)          │
└─────────────────┬───────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────┐
│          MCP Server (csv-cleaner)           │
│  ┌────────────────┐  ┌──────────────────┐  │
│  │  parse_csv     │  │  profile_data    │  │
│  │  - Headers     │  │  - Type detection│  │
│  │  - Row count   │  │  - Null analysis │  │
│  │  - Samples     │  │  - Anomalies     │  │
│  └────────────────┘  └──────────────────┘  │
└─────────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────┐
│         Your CSV Files                      │
└─────────────────────────────────────────────┘
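
The parse_csv box in the diagram lists headers, row count, and samples as its outputs. The following is a rough, dependency-free sketch of a function producing that shape. It is not the tool's actual code and does not use the Claude Agent SDK or MCP APIs. The names ParseResult, splitCsvLine, and parseCsv are hypothetical, and the sketch does not handle quoted fields that span multiple lines.

```typescript
// Hypothetical result shape, modeled on the parse_csv box in the diagram.
interface ParseResult {
  headers: string[];
  rowCount: number;
  samples: Record<string, string>[];
}

// Split one CSV line into fields, honoring double-quoted fields and "" escapes.
function splitCsvLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Treat the first non-blank line as the header row and return a few sample records.
function parseCsv(text: string, sampleSize = 3): ParseResult {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return { headers: [], rowCount: 0, samples: [] };

  const headers = splitCsvLine(lines[0] ?? "").map((h) => h.trim());
  const rows = lines.slice(1).map(splitCsvLine);

  const samples = rows.slice(0, sampleSize).map((cells) => {
    const record: Record<string, string> = {};
    headers.forEach((header, i) => {
      record[header] = cells[i] ?? ""; // short rows yield empty cells
    });
    return record;
  });

  return { headers, rowCount: rows.length, samples };
}
```

Returning only a handful of sample rows keeps the payload small, which matters when the result is passed back into an agent's context.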

Tech Stack

  • Claude Agent SDK - Autonomous agent framework
  • Bun - Fast JavaScript runtime
  • TypeScript - Type-safe development
  • Zod - Schema validation
  • MCP - Model Context Protocol for custom tools

Project Structure

CSVCleanerAgent/
├── index.ts              # Main agent implementation
├── package.json          # Dependencies and scripts
├── tsconfig.json         # TypeScript configuration
├── .env.example          # Environment template
├── sample.csv            # Example data with quality issues
└── README.md             # You are here

Contributing

Contributions are welcome! Here's how you can help:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Built with Claude Agent SDK by Anthropic
  • Powered by Bun runtime
  • Inspired by the need for better data quality in business operations

Star this repo if you find it useful!

Made with AI
