CLAIM-Architecture
Project Overview
CLAIM (Comprehensive Logic for Analyzing and Identifying Microservices) is a system
built to analyze Git repositories for microservices and infrastructure components. The system
utilizes a microservices architecture to enhance scalability, modularity, and maintainability.
CLAIM is containerized using Docker and orchestrated via Docker Compose, with each
service performing a specific function in the analysis pipeline.
1. High-Level Architecture
CLAIM’s architecture is designed with the following principles:
Microservices: Each service is designed to perform a single, isolated function, which
allows for independent development, deployment, and scaling.
Event-Driven Communication: Services communicate over REST APIs; each request is
triggered by an event in the analysis workflow, such as a submitted repository URL
or a completed processing step.
Database: MongoDB serves as the metadata storage solution, optimized for flexible
and scalable data storage.
Containerization: Docker ensures each service has a consistent runtime environment,
and Docker Compose orchestrates the multi-service environment.
Architecture Diagram
Below is an illustration of CLAIM's high-level architecture.
+-----------------------+
|       CLAIM UI        |
+-----------+-----------+
            |
            |
+-----------v-----------+
|      API Gateway      |
+-----------+-----------+
            |
            |
+-----------v-----------+
|   Repository Miner    |
+-----------+-----------+
            |
            |
+-----------v-----------+
|      File Parser      |
+-----------+-----------+
            |
            |
+-----------v-----------+
| Heuristic Identifier  |
+-----------+-----------+
            |
            |
+-----------v-----------+
|   Metadata Storage    |
|       (MongoDB)       |
+-----------------------+
Component Summary
CLAIM UI: The web-based interface for user interactions.
API Gateway: Central entry point that routes requests to other services.
Repository Miner: Clones and examines repositories.
File Parser: Analyzes docker-compose.yml files.
Heuristic Identifier: Classifies components as microservices or infrastructure.
Metadata Storage (MongoDB): Stores analysis results and metadata.
2. Component-Level Architecture
2.1 CLAIM UI
Purpose: Allows users to input a repository URL and view analysis results.
Technology: Flask (Python), HTML, CSS.
Endpoints:
o /: Homepage with a form for entering the repository URL.
o /analyze: Submits the repository URL to the API Gateway for analysis.
Data Flow:
o The user enters a URL and submits it, triggering a request to the API Gateway.
Results are displayed after processing.
2.2 API Gateway
Purpose: Manages incoming requests and routes them to appropriate services.
Technology: Flask (Python).
Endpoints:
o /analyze: Accepts a repository URL and coordinates the analysis workflow.
Data Flow:
o Receives requests from the UI, forwards the repository URL to the Repository
Miner, and consolidates responses to send back to the UI.
Responsibilities:
o Serves as a mediator, handling request routing and error management to
ensure a smooth user experience.
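The routing and error handling described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual gateway code: it uses only the standard library instead of Flask so that it stays self-contained, and the host name `repository-miner`, the port 5001, the `repo_url` field, and the helper names `post_json` and `handle_analyze` are all assumptions. It also assumes the Repository Miner's response already carries the result of the downstream stages.

```python
import json
import urllib.request

# Hypothetical address: in Docker Compose the service name usually doubles
# as the host name on the shared network. Name and port are assumptions.
MINER_URL = "http://repository-miner:5001/clone"


def post_json(url, payload, timeout=30):
    """POST a JSON payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def handle_analyze(body, send=post_json):
    """Validate the request, forward it to the Repository Miner, and
    return a (response_body, http_status) pair for the UI."""
    repo_url = str((body or {}).get("repo_url") or "").strip()
    if not repo_url:
        return {"error": "repo_url is required"}, 400
    try:
        result = send(MINER_URL, {"repo_url": repo_url})
    except (OSError, ValueError) as exc:
        # OSError covers connection failures and timeouts (URLError is a
        # subclass); ValueError covers a response that is not valid JSON.
        return {"error": f"analysis failed: {exc}"}, 502
    return {"repo_url": repo_url, "result": result}, 200
```

In a Flask route, the returned pair could be passed on as the JSON body and status code. Injecting `send` keeps the coordination logic testable without a running Repository Miner.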
2.3 Repository Miner Service
Purpose: Clones repositories and identifies the presence of docker-compose.yml files.
Technology: GitPython (Python library).
Endpoints:
o /clone: Clones the repository and locates the docker-compose.yml file, sending
it to the File Parser for processing.
Data Flow:
o Receives the repository URL from the API Gateway, clones the repository,
searches for docker-compose.yml, and sends the file content to the File Parser.
Responsibilities:
o Cloning repositories, handling errors if cloning fails, and verifying the
presence of a docker-compose.yml file.
o Maintaining a temporary storage location for cloned repositories.
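As a rough sketch of the clone-and-locate step, the code below clones into a temporary directory and picks the docker-compose.yml closest to the repository root. The function names, the `claim-` directory prefix, the shallow clone (`depth=1`), and the "shallowest file wins" rule are assumptions made for illustration; the document does not say how CLAIM handles repositories with several compose files.

```python
import tempfile
from pathlib import Path

COMPOSE_NAME = "docker-compose.yml"


def find_compose_file(repo_dir):
    """Return the compose file closest to the repository root, or None."""
    matches = [
        path
        for path in Path(repo_dir).rglob(COMPOSE_NAME)
        if path.is_file() and ".git" not in path.parts
    ]
    # Fewest path components first; ties are broken alphabetically.
    return min(matches, key=lambda p: (len(p.parts), str(p)), default=None)


def clone_and_locate(repo_url):
    """Clone repo_url into temporary storage and return the content of its
    docker-compose.yml, or None if the repository does not contain one."""
    from git import Repo  # GitPython, imported lazily

    # The temporary directory is removed when the block exits.
    with tempfile.TemporaryDirectory(prefix="claim-") as workdir:
        # Raises git.exc.GitCommandError if the clone fails.
        Repo.clone_from(repo_url, workdir, depth=1)
        compose_path = find_compose_file(workdir)
        if compose_path is None:
            return None
        return compose_path.read_text(encoding="utf-8")
```

Keeping the search separate from the clone means the lookup rule can be exercised against any local directory without network access.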
2.4 File Parser Service
Purpose: Analyzes docker-compose.yml files to extract defined services and their
configurations.
Technology: PyYAML (Python library for YAML processing).
Endpoints:
o /parse: Accepts the content of docker-compose.yml, parses it, and forwards the
data to the Heuristic Identifier for classification.
Data Flow:
o Receives the docker-compose.yml content, parses each service definition, and
structures the data for further processing.
Responsibilities:
o Parsing and organizing data from docker-compose.yml files.
o Extracting each service’s configuration, dependencies, and runtime
environment.
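The parsing step might look something like the sketch below, which turns the `services` mapping of a compose file into a list of flat records. The record fields (`name`, `image`, `build`, `ports`, `depends_on`, `environment`) and the function names are assumptions chosen for illustration, not the service's documented output format. The only PyYAML call used is `yaml.safe_load`.

```python
def extract_services(compose):
    """Normalize the `services` mapping of a parsed compose file into a
    list of flat records, one per service."""
    services = (compose or {}).get("services") or {}
    records = []
    for name, spec in services.items():
        spec = spec or {}
        # depends_on may be a list of names or a mapping of name -> condition.
        depends_on = spec.get("depends_on") or []
        if isinstance(depends_on, dict):
            depends_on = list(depends_on)
        # environment may be a mapping or a list of "KEY=value" strings.
        environment = spec.get("environment") or {}
        if isinstance(environment, list):
            environment = dict(str(item).partition("=")[::2] for item in environment)
        records.append({
            "name": name,
            "image": spec.get("image"),
            "build": spec.get("build"),
            "ports": list(spec.get("ports") or []),
            "depends_on": depends_on,
            "environment": environment,
        })
    return records


def parse_compose(text):
    """Parse docker-compose.yml content and return the service records."""
    import yaml  # PyYAML, imported lazily

    # safe_load avoids constructing arbitrary Python objects from YAML tags.
    return extract_services(yaml.safe_load(text))
```

Normalizing both forms of `depends_on` and `environment` here means the Heuristic Identifier only has to handle one shape per field.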
2.5 Heuristic Identifier Service
Purpose: Applies heuristic rules to classify each component as a microservice or an
infrastructure component.
Technology: Custom heuristics (Python).
Endpoints:
o /identify: Accepts parsed data from the File Parser and categorizes each
service.
Data Flow:
o Receives parsed services data, applies classification rules, and sends classified
data to the Metadata Storage service.
Responsibilities:
o Identifying application logic (microservices) versus infrastructure services
(e.g., databases, caches).
o Tagging components based on heuristic criteria for accurate classification.
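The document does not list the heuristic rules themselves, so the following is only an illustration of what a rule-based classifier of this kind could look like. The image list, the rule order, the `category` and `reason` fields, and the default of treating unmatched services as microservices are all assumptions.

```python
# Illustrative list of database and cache images; the actual rule set
# used by CLAIM is not specified in this document.
INFRASTRUCTURE_IMAGES = {"mongo", "postgres", "mysql", "mariadb", "redis", "memcached"}


def image_base_name(image):
    """Strip registry, namespace, tag, and digest from an image reference,
    e.g. 'docker.io/library/redis:7-alpine' becomes 'redis'."""
    last_segment = image.rsplit("/", 1)[-1]
    return last_segment.split("@", 1)[0].split(":", 1)[0].lower()


def classify(service):
    """Tag one parsed service record as 'infrastructure' or 'microservice'."""
    name = service.get("name")
    image = service.get("image")
    if image and image_base_name(image) in INFRASTRUCTURE_IMAGES:
        reason = f"image '{image_base_name(image)}' is a known database or cache"
        return {"name": name, "category": "infrastructure", "reason": reason}
    if service.get("build"):
        reason = "built from source in the repository"
        return {"name": name, "category": "microservice", "reason": reason}
    # Assumed default: anything no infrastructure rule matches is treated
    # as application logic.
    return {"name": name, "category": "microservice", "reason": "no infrastructure rule matched"}
```

Recording a `reason` alongside each category makes individual classifications easier to audit when a rule misfires.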
2.6 Metadata Storage Service (MongoDB)
Purpose: Stores structured analysis metadata for each repository.
Technology: MongoDB.
Endpoints:
o /store: Receives metadata from Heuristic Identifier and stores it.
o /retrieve: Allows retrieval of stored metadata for specific repositories.
Data Flow:
o Receives categorized data from the Heuristic Identifier and stores it in
MongoDB. Provides retrieval functionality when requested.
Responsibilities:
o Storing and managing analysis results.
o Providing structured metadata access for future reference.
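A minimal sketch of the logic behind /store and /retrieve is shown below, assuming one document per repository keyed by its URL. The document shape, the function names, and the database and collection names mentioned in the comment are hypothetical. `replace_one` and `find_one` are standard PyMongo collection methods.

```python
from datetime import datetime, timezone

# With PyMongo, a collection could be obtained like this (the host,
# database, and collection names are assumptions):
#   from pymongo import MongoClient
#   collection = MongoClient("mongodb://mongo:27017")["claim"]["analyses"]


def build_document(repo_url, classified_services):
    """Assemble the metadata document stored for one repository."""
    return {
        "repo_url": repo_url,
        "analyzed_at": datetime.now(timezone.utc),
        "services": list(classified_services),
    }


def store(collection, repo_url, classified_services):
    """Upsert the analysis result so that re-analyzing a repository
    replaces its previous document instead of adding a duplicate."""
    document = build_document(repo_url, classified_services)
    collection.replace_one({"repo_url": repo_url}, document, upsert=True)
    return document


def retrieve(collection, repo_url):
    """Return the stored metadata for repo_url, or None if absent."""
    # Excluding _id keeps the result directly JSON-serializable.
    return collection.find_one({"repo_url": repo_url}, projection={"_id": 0})
```

Passing the collection in as an argument keeps these functions independent of connection setup, so they can be tested against a stand-in object.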
3. Technology Stack
Back-End (Services)
Python: The primary programming language, leveraging Flask for API development
and specific libraries like GitPython and PyYAML.
MongoDB: NoSQL database chosen for flexibility, scalability, and compatibility with
JSON-like document storage.
Front-End (UI)
HTML/CSS: For the web interface, providing a form for URL input and displaying
results in a structured format.
Containerization
Docker: Each microservice runs in its own Docker container, allowing for consistent
environments and easy scaling.
Docker Compose: Orchestrates the multi-container setup, handling service
dependencies, networking, and configuration.
Conclusion
CLAIM’s architecture combines modularity, scalability, and maintainability through a
microservices approach. MongoDB provides flexible data storage, while Docker ensures
consistent deployment across environments. The architecture supports future expansion,
allowing CLAIM to handle increasingly complex repository analyses and scale as needed.