
don't start Kafka using Docker; give commands to run it in VS Code

To implement your YADTQ system with Kafka, worker nodes, and a client that asks for user
input for operations and values, here's a recommended structure for your project and the
steps to set up and run it on macOS:

1. Project Structure

Yadtq/
├── core/
│   ├── __init__.py
│   ├── kafka_connector.py   # Kafka connection and task queue management
│   ├── result_backend.py    # Result storage and status management
│   └── task_processor.py    # Task processing logic
├── workers/
│   ├── worker1.py
│   ├── worker2.py
│   └── worker3.py
├── client/
│   └── client.py
└── requirements.txt         # List of dependencies

2. Dependencies

First, create a requirements.txt file with the necessary Python dependencies:

kafka-python
redis

Install the dependencies using the following command:

pip install -r requirements.txt
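
To keep the project isolated on macOS, you can optionally create and activate a virtual environment before installing (the environment name venv below is just an example):

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt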

3. Core Components

core/kafka_connector.py

This will handle the connection to Kafka and queue management.

from kafka import KafkaProducer, KafkaConsumer
import json

class KafkaConnector:
    def __init__(self, broker='localhost:9092', topic='tasks'):
        self.broker = broker
        self.topic = topic
        # Producer serializes task dicts to JSON bytes
        self.producer = KafkaProducer(
            bootstrap_servers=self.broker,
            value_serializer=lambda v: json.dumps(v).encode('utf-8')
        )
        # Consumer deserializes JSON bytes back into dicts; all workers share one group
        self.consumer = KafkaConsumer(
            self.topic,
            bootstrap_servers=self.broker,
            group_id='yadtq-group',
            value_deserializer=lambda x: json.loads(x.decode('utf-8'))
        )

    def send_task(self, task):
        self.producer.send(self.topic, value=task)
        self.producer.flush()  # make sure the task actually leaves the producer buffer

    def get_task(self):
        for message in self.consumer:
            yield message.value

core/result_backend.py

This handles task results and stores them in a backend (Redis or in-memory).

import redis
import json

class ResultBackend:
    def __init__(self, host='localhost', port=6379):
        self.r = redis.Redis(host=host, port=port, db=0)

    def set_status(self, task_id, status, result=None):
        result_data = {
            'status': status,
            'result': result
        }
        self.r.set(task_id, json.dumps(result_data))

    def get_status(self, task_id):
        result_data = self.r.get(task_id)
        if result_data:
            return json.loads(result_data)
        return None
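
For a quick sanity check of the backend (purely illustrative; the task id below is made up), you can set and read back a status like this:

# Illustrative check of the result backend; "demo-task-id" is a made-up id.
from core.result_backend import ResultBackend

backend = ResultBackend()
backend.set_status("demo-task-id", "queued")
print(backend.get_status("demo-task-id"))   # -> {'status': 'queued', 'result': None}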

core/task_processor.py

This processes the tasks received from Kafka and updates the result backend.

import time
from core.kafka_connector import KafkaConnector
from core.result_backend import ResultBackend

class TaskProcessor:
    def __init__(self):
        self.kafka = KafkaConnector()
        self.backend = ResultBackend()

    def process_task(self, task):
        task_id = task['task-id']
        self.backend.set_status(task_id, 'processing')
        # Simulate task processing
        time.sleep(7)  # Simulating the task duration

        try:
            if task['task'] == 'add':
                result = sum(task['args'])
            elif task['task'] == 'sub':
                result = task['args'][0] - task['args'][1]
            elif task['task'] == 'multiply':
                result = task['args'][0] * task['args'][1]
            else:
                raise ValueError("Unknown task")

            self.backend.set_status(task_id, 'success', result)

        except Exception as e:
            self.backend.set_status(task_id, 'failed', str(e))

4. Worker Nodes

Each worker will continuously pull tasks from Kafka and process them.

workers/worker1.py, worker2.py, worker3.py

Each worker can be almost identical.

import sys
import time
from core.task_processor import TaskProcessor

def start_worker(worker_id):
    print(f"Worker-{worker_id} starting...")
    processor = TaskProcessor()
    # Iterate over the shared Kafka consumer; each message is one task
    for task in processor.kafka.get_task():
        print(f"Worker-{worker_id} processing task {task['task-id']}")
        processor.process_task(task)
        time.sleep(7)  # Sleep between tasks

if __name__ == "__main__":
    worker_id = sys.argv[1]
    start_worker(worker_id)

5. Client

The client will ask for user input and send tasks to the Kafka queue.

client/client.py

import uuid
import time
from core.kafka_connector import KafkaConnector

def get_task_from_user():
    task = input("Enter task to perform (add/sub/multiply): ").strip()
    args = input("Enter values (comma separated): ").strip().split(',')
    args = [int(arg) for arg in args]
    return {
        "task-id": str(uuid.uuid4()),
        "task": task,
        "args": args
    }

def main():
    kafka = KafkaConnector()
    while True:
        task = get_task_from_user()
        print(f"Submitting task {task['task-id']} to the queue...")
        kafka.send_task(task)
        time.sleep(7)

if __name__ == "__main__":
    main()
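
The client above only submits tasks. If you also want it to report results, one possible extension (a sketch, not part of the original files) is to poll the result backend until a worker marks the task as finished:

# Possible extension (illustrative): wait for a submitted task's result.
import time
from core.result_backend import ResultBackend

def wait_for_result(task_id, poll_interval=2):
    backend = ResultBackend()
    while True:
        status = backend.get_status(task_id)
        if status and status['status'] in ('success', 'failed'):
            return status
        time.sleep(poll_interval)  # still queued or processing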

6. Running the System

1. Start Kafka: First, ensure Kafka is running on your local machine. You can
use Docker to run Kafka if you don't have it installed locally (a non-Docker
alternative for macOS is sketched right after this list).
Example Docker setup for Kafka:

docker-compose -f kafka-docker-compose.yml up

2. Run Workers: Open three terminals and, from the project root (so that the
core package is importable), run one worker in each terminal:

python -m workers.worker1 1
python -m workers.worker2 2
python -m workers.worker3 3

3. Run Client: In another terminal, again from the project root, run the client:

python -m client.client

4. Task Submission: The client will prompt for an operation (e.g., add,
sub, multiply) and the values to be processed. Once submitted, tasks will be
distributed across the three workers.
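
If you prefer not to use Docker (as the prompt at the top asks), a rough non-Docker alternative on macOS is to install Kafka with Homebrew and create the tasks topic yourself. The exact service commands depend on your Kafka version (older releases still need ZooKeeper), so treat this as a sketch rather than exact instructions. Creating the topic with 3 partitions lets the consumer group spread tasks across the three workers:

brew install kafka
brew services start zookeeper   # only for Kafka versions that still require ZooKeeper
brew services start kafka
kafka-topics --create --topic tasks --bootstrap-server localhost:9092 --partitions 3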

7. Testing Fault Tolerance

To test fault tolerance, you can simulate worker crashes by manually stopping one
or more workers during task processing. Because all workers consume from the same
Kafka consumer group, the topic's partitions are rebalanced to the remaining
workers, so unprocessed tasks are automatically picked up by the active ones.

Additional Notes:

• Kafka Consumer Group: Kafka's consumer group feature ensures that tasks are
distributed evenly among the workers.
• Heartbeat: Add logic for workers to send periodic heartbeats so you can tell
whether they are still active; a minimal sketch is shown below.

give commands to run this project on macOS, with virtual env commands too
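
A minimal heartbeat sketch, assuming the same local Redis instance that ResultBackend already uses (the key names and TTL are illustrative, not part of the original code):

# Heartbeat sketch (illustrative): workers refresh a key with a TTL,
# and any process can check whether a worker is still alive.
import time
import redis

r = redis.Redis(host='localhost', port=6379, db=0)
HEARTBEAT_TTL = 15  # seconds; a worker counts as dead once its key expires

def send_heartbeat(worker_id):
    # Call this periodically from the worker loop; the key expires on its
    # own if the worker stops refreshing it.
    r.set(f"heartbeat:worker-{worker_id}", int(time.time()), ex=HEARTBEAT_TTL)

def is_alive(worker_id):
    return r.exists(f"heartbeat:worker-{worker_id}") == 1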
