
CLAIM - Architecture Document

Project Overview
CLAIM (Comprehensive Logic for Analyzing and Identifying Microservices) is a system
built to analyze Git repositories for microservices and infrastructure components. The system
utilizes a microservices architecture to enhance scalability, modularity, and maintainability.
CLAIM is containerized using Docker and orchestrated via Docker Compose, with each
service performing a specific function in the analysis pipeline.

1. High-Level Architecture
CLAIM’s architecture is designed with the following principles:
 Microservices: Each service is designed to perform a single, isolated function, which
allows for independent development, deployment, and scaling.
 Pipeline Communication: Services communicate via synchronous REST calls; each
step in the analysis workflow triggers the next.
 Database: MongoDB serves as the metadata storage solution, optimized for flexible
and scalable data storage.
 Containerization: Docker ensures each service has a consistent runtime environment,
and Docker Compose orchestrates the multi-service environment.
Architecture Diagram
Below is an illustration of CLAIM's high-level architecture.
+-----------------------+
|       CLAIM UI        |
+-----------+-----------+
            |
            |
+-----------v-----------+
|      API Gateway      |
+-----------+-----------+
            |
            |
+-----------v-----------+
|    Repository Miner   |
+-----------+-----------+
            |
            |
+-----------v-----------+
|      File Parser      |
+-----------+-----------+
            |
            |
+-----------v-----------+
|  Heuristic Identifier |
+-----------+-----------+
            |
            |
+-----------v-----------+
|   Metadata Storage    |
|       (MongoDB)       |
+-----------------------+
Component Summary
 CLAIM UI: The web-based interface for user interactions.
 API Gateway: Central entry point that routes requests to other services.
 Repository Miner: Clones and examines repositories.
 File Parser: Analyzes docker-compose.yml files.
 Heuristic Identifier: Classifies components as microservices or infrastructure.
 Metadata Storage (MongoDB): Stores analysis results and metadata.

2. Component-Level Architecture
2.1 CLAIM UI
 Purpose: Allows users to input a repository URL and view analysis results.
 Technology: Flask (Python), HTML, CSS.
 Endpoints:
o /: Homepage with a form for entering the repository URL.
o /analyze: Submits the repository URL to the API Gateway for analysis.
 Data Flow:
o The user enters a URL and submits it, triggering a request to the API Gateway.
Results are displayed after processing.
2.2 API Gateway
 Purpose: Manages incoming requests and routes them to appropriate services.
 Technology: Flask (Python).
 Endpoints:
o /analyze: Accepts a repository URL and coordinates the analysis workflow.
 Data Flow:
o Receives requests from the UI, forwards the repository URL to the Repository
Miner, and consolidates responses to send back to the UI.
 Responsibilities:
o Serves as a mediator, handling request routing and error management to
ensure a smooth user experience.
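The gateway's coordination role can be sketched in a few lines. This is a minimal sketch, not CLAIM's actual implementation: the downstream services are abstracted as plain callables (in the running system each would be an HTTP request to the corresponding service), and all names are illustrative.

```python
def analyze_repository(repo_url, clone, parse, identify, store):
    """Coordinate the analysis pipeline for one repository.

    Each argument after repo_url stands in for an HTTP call to the
    corresponding service (Repository Miner, File Parser, Heuristic
    Identifier, Metadata Storage).
    """
    compose_content = clone(repo_url)      # Repository Miner: /clone
    if compose_content is None:
        return {"error": "no docker-compose.yml found", "repo": repo_url}
    services = parse(compose_content)      # File Parser: /parse
    classified = identify(services)        # Heuristic Identifier: /identify
    store(repo_url, classified)            # Metadata Storage: /store
    return {"repo": repo_url, "services": classified}
```

Error handling (step failures, timeouts) would wrap each call in the real gateway; the early return above shows the one error path the document describes, a repository with no docker-compose.yml.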
2.3 Repository Miner Service
 Purpose: Clones repositories and identifies the presence of docker-compose.yml files.
 Technology: GitPython (Python library).
 Endpoints:
o /clone: Clones the repository and locates the docker-compose.yml file, sending
it to the File Parser for processing.
 Data Flow:
o Receives the repository URL from the API Gateway, clones the repository,
searches for docker-compose.yml, and sends the file content to the File Parser.
 Responsibilities:
o Cloning repositories, handling errors if cloning fails, and verifying the
presence of a docker-compose.yml file.
o Maintaining a temporary storage location for cloned repositories.
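A sketch of the miner's two jobs, cloning and file discovery, might look as follows. The clone step uses GitPython's Repo.clone_from as the document states; the function names and the temporary-directory scheme are assumptions for illustration.

```python
import tempfile
from pathlib import Path

def find_compose_file(root):
    """Return the first docker-compose.yml found under root, or None."""
    for candidate in sorted(Path(root).rglob("docker-compose.yml")):
        return candidate
    return None

def clone_and_locate(repo_url):
    """Clone repo_url into a temporary directory and locate its compose file.

    Requires GitPython; the import is deferred so find_compose_file can be
    used on its own.
    """
    from git import Repo  # GitPython
    workdir = tempfile.mkdtemp(prefix="claim-")
    Repo.clone_from(repo_url, workdir)
    return find_compose_file(workdir)
```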
2.4 File Parser Service
 Purpose: Analyzes docker-compose.yml files to extract defined services and their
configurations.
 Technology: PyYAML (Python library for YAML processing).
 Endpoints:
o /parse: Accepts the content of docker-compose.yml, parses it, and forwards the
data to the Heuristic Identifier for classification.
 Data Flow:
o Receives the docker-compose.yml content, parses each service definition, and
structures the data for further processing.
 Responsibilities:
o Parsing and organizing data from docker-compose.yml files.
o Extracting each service’s configuration, dependencies, and runtime
environment.
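The structuring step can be sketched as below, assuming the raw YAML has already been loaded with PyYAML's yaml.safe_load. The specific fields pulled out (image, build, ports, depends_on, environment) are illustrative choices, not CLAIM's actual schema.

```python
def extract_services(compose_data):
    """Structure service definitions from a parsed docker-compose mapping.

    compose_data is the dict produced by yaml.safe_load on the file's
    content.
    """
    services = []
    for name, definition in (compose_data.get("services") or {}).items():
        definition = definition or {}  # a bare "name:" entry parses to None
        services.append({
            "name": name,
            "image": definition.get("image", ""),
            "build": "build" in definition,
            "ports": definition.get("ports", []),
            "depends_on": definition.get("depends_on", []),
            "environment": definition.get("environment", {}),
        })
    return services
```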
2.5 Heuristic Identifier Service
 Purpose: Applies heuristic rules to classify each component as a microservice or an
infrastructure component.
 Technology: Custom heuristics (Python).
 Endpoints:
o /identify: Accepts parsed data from the File Parser and categorizes each
service.
 Data Flow:
o Receives parsed services data, applies classification rules, and sends classified
data to the Metadata Storage service.
 Responsibilities:
o Identifying application logic (microservices) versus infrastructure services
(e.g., databases, caches).
o Tagging components based on heuristic criteria for accurate classification.
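One simple heuristic of this kind is to match the service's image name against well-known infrastructure images. The rule set below is illustrative, not CLAIM's actual heuristics.

```python
# Image-name substrings that typically indicate infrastructure rather
# than application logic (illustrative list).
INFRASTRUCTURE_IMAGES = (
    "mongo", "postgres", "mysql", "redis", "memcached",
    "rabbitmq", "kafka", "zookeeper", "nginx", "traefik", "elasticsearch",
)

def classify_service(service):
    """Tag one parsed service as 'infrastructure' or 'microservice'."""
    image = service.get("image", "").lower()
    if any(marker in image for marker in INFRASTRUCTURE_IMAGES):
        kind = "infrastructure"
    else:
        # Services built from local source or running custom images are
        # treated as application microservices.
        kind = "microservice"
    return {**service, "kind": kind}
```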
2.6 Metadata Storage Service (MongoDB)
 Purpose: Stores structured analysis metadata for each repository.
 Technology: MongoDB.
 Endpoints:
o /store: Receives metadata from Heuristic Identifier and stores it.
o /retrieve: Allows retrieval of stored metadata for specific repositories.
 Data Flow:
o Receives categorized data from the Heuristic Identifier and stores it in
MongoDB. Provides retrieval functionality when requested.
 Responsibilities:
o Storing and managing analysis results.
o Providing structured metadata access for future reference.
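The store and retrieve operations map naturally onto pymongo's replace_one (with upsert) and find_one. A minimal sketch, where collection is a pymongo Collection and the document shape is an assumption for illustration:

```python
def store_metadata(collection, repo_url, services):
    """Upsert the analysis result for one repository.

    collection is a pymongo Collection (e.g. client["claim"]["analyses"]).
    Upserting keeps one document per repository across repeated analyses.
    """
    document = {"repo_url": repo_url, "services": services}
    collection.replace_one({"repo_url": repo_url}, document, upsert=True)

def retrieve_metadata(collection, repo_url):
    """Return the stored analysis for repo_url, or None if absent."""
    return collection.find_one({"repo_url": repo_url})
```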

3. Data Flow and Inter-Service Communication


Step-by-Step Data Flow
1. User Input and Initial Request:
o A user inputs a repository URL into the CLAIM UI and submits it for analysis.
o The UI sends this URL to the API Gateway via a POST request to the /analyze
endpoint.
2. Routing to Repository Miner:
o The API Gateway receives the URL and forwards it to the Repository Miner
Service.
o Repository Miner clones the repository and searches for docker-compose.yml.
3. Parsing docker-compose.yml:
o If a docker-compose.yml file is found, its content is sent to the File Parser
Service.
o The File Parser parses each service and organizes the data.
4. Heuristic Classification:
o Parsed service data is sent to the Heuristic Identifier Service.
o The Heuristic Identifier classifies each component based on pre-defined rules,
tagging components as either microservices or infrastructure.
5. Storing Metadata:
o The classified metadata is sent to the Metadata Storage Service, which stores it
in MongoDB.
o The Metadata Storage Service confirms successful storage back to the API
Gateway.
6. Displaying Results:
o The API Gateway compiles the metadata and sends it back to the UI.
o The UI displays the categorized microservices and infrastructure components
to the user.
Inter-Service Communication
 REST API Calls: Each microservice exposes a REST API, keeping services loosely
coupled through simple request/response interactions.
 JSON Payloads: Data exchanged between services is structured as JSON payloads,
simplifying parsing and processing across services.
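As an illustration (field names are assumptions, not CLAIM's actual schema), the Heuristic Identifier might send the Metadata Storage service a payload like:

```json
{
  "repo_url": "https://github.com/example/shop.git",
  "services": [
    {"name": "web", "image": "example/shop-web:1.0", "kind": "microservice"},
    {"name": "db", "image": "mongo:6", "kind": "infrastructure"}
  ]
}
```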

4. Technology Stack
Back-End (Services)
 Python: The primary programming language, using Flask for API development
alongside libraries such as GitPython and PyYAML.
 MongoDB: NoSQL database chosen for flexibility, scalability, and compatibility with
JSON-like document storage.
Front-End (UI)
 HTML/CSS: For the web interface, providing a form for URL input and displaying
results in a structured format.
Containerization
 Docker: Each microservice runs in its own Docker container, allowing for consistent
environments and easy scaling.
 Docker Compose: Orchestrates the multi-container setup, handling service
dependencies, networking, and configuration.

5. Dockerization and Deployment


Docker Files
Each service has its own Dockerfile that defines its dependencies, environment, and start-up
commands. For example:
 The Repository Miner Dockerfile installs Git and Python packages needed for
cloning and analyzing repositories.
 The File Parser Dockerfile includes YAML parsing dependencies.
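As a sketch, a Repository Miner Dockerfile could look like the following; the base image, package names, and entry-point path are assumptions, not CLAIM's actual files:

```dockerfile
FROM python:3.11-slim
# Git is needed so GitPython can clone repositories.
RUN apt-get update && apt-get install -y --no-install-recommends git \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "miner.py"]
```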
Docker Compose Setup
Docker Compose manages the multi-container environment by:
 Defining all services in docker-compose.yml with specific port mappings and network
settings.
 Creating a shared network (e.g., claim_default) to allow services to communicate
using service names.
 Automating the start-up, scaling, and shutdown of containers in a single command
(docker-compose up).
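A trimmed docker-compose.yml along these lines illustrates the setup; service names, ports, and paths are assumptions for illustration:

```yaml
services:
  api-gateway:
    build: ./api-gateway
    ports:
      - "8000:8000"
    depends_on:
      - repository-miner
  repository-miner:
    build: ./repository-miner
  metadata-storage:
    image: mongo:6
    volumes:
      - mongo-data:/data/db
volumes:
  mongo-data:
```

On the shared network Compose creates (e.g. claim_default), services reach one another by service name, such as http://repository-miner:5000.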
6. Design Considerations and Scalability
Scalability
Each microservice in CLAIM can be independently scaled by adding more instances,
allowing the system to handle increased loads. For example:
 During high traffic, the API Gateway or Repository Miner can be scaled to handle
more incoming requests or larger repositories.
Future Extensions
CLAIM’s architecture is designed for flexibility and can support additional features with
minimal disruption:
 New SCM Integrations: Support for Bitbucket or GitLab repositories.
 Enhanced Heuristics: Adding machine learning models to improve classification
accuracy.
 Kubernetes Migration: Transitioning to Kubernetes for automated scaling, failover,
and monitoring as CLAIM grows.

Conclusion
CLAIM’s architecture combines modularity, scalability, and resilience through a
microservices approach. MongoDB provides flexible data storage, while Docker ensures
consistent deployment across environments. The architecture supports future expansion,
allowing CLAIM to handle increasingly complex repository analyses and scale as needed.
