Data Science - Analyst Requirement

Sr. No. | Job Title
1 | AI Developer
2 | Business Intelligence (BI) Developer
3 | Senior Data Infrastructure Engineer
4 | Sr. Machine Learning Engineer
5 | AI/ML Programmer
6 | Data Engineer
7 | Data Specialist
8 | AI Engineer
9 | GPT Engines and Copilot Integration Developer
10 | AI/ML Resource Requirement
11 | Senior AI Engineer (Head of AI Engineering)
12 | Power BI Developer (4+ Years Experience)
13 | Power BI and Azure Data Factory
14 | Power BI and Azure Data Factory
Job Description
AI Developer (LLMs, LLAMA Models, Fine-tuning & AI Training Expert)

As a BI Developer, you will work with product and technology teams to help design and develop data visualizations
in Power BI and Domo platforms to make a significant impact in transforming the financial industry.
In this role, you will have the following responsibilities. The ideal candidate will be responsible for working with
Business Stakeholders, Project Managers, Product Owners, and Business Analysts to:
Responsibilities
● Drive customer conversations to define the requirements and overall technical solutions for data and related applications.
● Help define the metrics and KPI development for effective decision-making.
About the role
As a Senior Data Infrastructure Engineer, you will be working with Data Engineering, Analyst, and Operations teams to
maintain a secure, scalable, and reliable data platform that powers the XXX Enterprise Data Hub. The position requires
working knowledge of public cloud, big data fundamentals, excellent programming skills, and experience with relational and
non-relational databases, containerization, IAM & networking security, and troubleshooting.
What You'll Do
● Responsible for building and managing our cloud infrastructure (AWS) with a code-first mindset.
● Design and architect ETL/ELT pipelines running on AWS EMR and Snowflake.
● Performance-tune data pipelines using technologies such as Apache Spark, Hive, and Trino.
● Write and maintain high-quality code and participate in design discussions and code reviews.
● Implement data storage solutions such as S3, RDS, and NoSQL databases.
● Automate workflows using orchestration systems like Airflow and Oozie (see the sketch after this list).
● Build monitoring and observability dashboards.
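A minimal sketch of the orchestration work described above, assuming Apache Airflow 2.x; the DAG id, schedule, and task bodies are illustrative placeholders, not taken from the job post.

```python
# Hypothetical Airflow 2.x DAG; dag_id, schedule, and task logic are
# placeholders for illustration only.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull raw data from S3")  # placeholder extract step


def transform():
    print("run Spark/Hive transformation")  # placeholder transform step


with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform",
                                    python_callable=transform)
    extract_task >> transform_task  # run extract before transform
```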
Job Summary:
Join our elite team as a Senior Machine Learning Engineer at QxLabAI, a leader in AI and machine learning innovation. You'll be integral to advancing our cutting-edge machine learning technologies. Your role involves working alongside top-tier researchers, software engineers, and data scientists to enhance, develop, and implement advanced machine learning models, tackling some of the most challenging problems in the field.
Key Responsibilities:
Large Language Model (LLM) Mastery:
● Advanced Model Architecture: Deep understanding and experience with LLM architectures including Transformers, GPT variants, BERT, etc. Expertise in the complexities of attention mechanisms, positional encodings, and other key LLM components.
● Sophisticated LLM Deployment: Skilled in deploying advanced pre-trained models like GPT-4, BERT, etc. Mastery in optimization techniques for LLMs, including advanced methods like quantization and distillation.
● Expert Fine-tuning: Proven track record in fine-tuning LLMs for specialized tasks. Proficiency in advanced fine-tuning strategies and ability to adapt models to specific industry or domain needs (a minimal fine-tuning sketch follows this section).
● Comprehensive Training from Scratch: Extensive experience in training LLMs from the ground up, including:
● Complex Data Handling: Expert in curating and processing large, diverse datasets. Keen ability to identify and mitigate biases.
● Addressing Computational Challenges: Mastery in distributed training, maximizing multi-GPU and TPU use, and implementing advanced techniques for efficient memory usage.
● Model Stability and Innovation: Addressing training challenges like "catastrophic forgetting" and innovating in gradient clipping, learning rate adjustments, and custom loss functions.
● Advanced Hyperparameter Optimization: Expertise in fine-tuning key hyperparameters for peak model performance.
Research and Algorithm Development:
● Lead and innovate in the design and implementation of machine learning algorithms, ensuring fairness and interpretability.
● Oversee data management, ensuring quality and compliance, and perform advanced data preprocessing techniques.
Model Implementation and Validation:
● Develop scalable and efficient machine learning models using advanced frameworks.
● Implement and oversee thorough validation processes to ensure model integrity.
Optimization and Scaling:
● Lead in model optimization for enhanced computational efficiency and accuracy.
● Collaborate with infrastructure teams for large-scale model deployment.
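As a concrete illustration of the fine-tuning work above, here is a minimal sketch using the Hugging Face Transformers Trainer; the base model ("gpt2"), data file, and hyperparameters are assumptions for illustration, not requirements from the post.

```python
# Hedged fine-tuning sketch with Hugging Face Transformers; model name,
# data file, and hyperparameters are placeholders, not from the post.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # stand-in for a GPT-variant or LLAMA-family model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("text", data_files={"train": "train.txt"})


def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, padding="max_length",
                    max_length=512)
    enc["labels"] = enc["input_ids"].copy()  # causal LM: labels = inputs
    return enc


train_set = dataset["train"].map(tokenize, batched=True,
                                 remove_columns=["text"])

args = TrainingArguments(output_dir="finetune-out", num_train_epochs=1,
                         per_device_train_batch_size=2, learning_rate=5e-5)
Trainer(model=model, args=args, train_dataset=train_set).train()
```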
Job Summary:
We are looking for a skilled AI/ML Programmer to join our dynamic team. In this role, you will contribute to the development and implementation of innovative machine learning models and AI solutions. This position is ideal for someone with a passion for AI and machine learning technologies, eager to collaborate with a diverse team of professionals to solve complex problems.
Key Responsibilities:
AI and Machine Learning Development:
● Model Implementation: Assist in the development of AI and machine learning models, including understanding foundational architectures like Transformers, GPT variants, and BERT.
● Model Deployment: Gain experience in deploying pre-trained models such as GPT-4 and BERT, and learn optimization techniques for effective performance (see the sketch below).
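As one concrete example of working with pre-trained models, the sketch below loads a ready-made model through the Hugging Face pipeline API; the task and input text are illustrative choices, not from the post.

```python
# Minimal sketch of using a pre-trained model via the Hugging Face
# `pipeline` API; task choice and input are illustrative.
from transformers import pipeline

# Downloads a small default model for the task on first use.
classifier = pipeline("sentiment-analysis")
print(classifier("Deploying pre-trained models can be this simple to start."))
```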
TECHNICAL REQUIREMENTS
Programming languages: Strong technical ability in Python, Java, C++, or Scala.
Cloud Programming: An understanding of common cloud platforms including Google Cloud, Azure, Oracle, and AWS,
including the ability to work with other cloud-based data services.
Data Security: An understanding of data security best practices and the ability to comply with data privacy regulations
helps candidates navigate the heavily regulated world of construction and engineering. Knowledge of data compliance,
legal risks, and industry-standard protocols.
Scripting and Automation: High proficiency in scripting for automation tasks and system administration.
Previous Sector Experience: Experience in one or more of the following sectors: manufacturing, construction, facilities
management, oil and gas, aerospace, and automotive industries.

TECHNICAL REQUIREMENTS
Core expertise:
∙ Experience with Unix and PowerShell scripting
∙ Understanding of dimensional and relational data modeling
Must have:
GenAI Development: OpenAI, LangChain, VectorDBs.
ML Related: Real-time ML, Databricks Delta Live Tables, Azure IoT Hub, Azure Functions.
Data Science Workbench Related: Infrastructure Engineering, Automation, CI/CD.
In all three tracks, Python + Azure + MLOps / Model Deployment experience is mandatory (a model-serving sketch follows below).
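A hedged sketch of the Azure-based model deployment these tracks call for, assuming the Azure Functions Python v2 programming model; the route name and the stubbed scoring logic are assumptions, not details from the post.

```python
# Hypothetical Azure Functions (Python v2 model) endpoint for model scoring;
# the route name and scoring logic are placeholders.
import json

import azure.functions as func

app = func.FunctionApp()


@app.route(route="predict", methods=["POST"])
def predict(req: func.HttpRequest) -> func.HttpResponse:
    payload = req.get_json()  # expects a JSON body with model features
    score = 0.5  # placeholder: load the registered model and score `payload`
    return func.HttpResponse(json.dumps({"score": score}),
                             mimetype="application/json")
```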

Task:

• Provide answers for tickets: when a new ticket arrives, we want an AI to understand the request, analyze the existing
documentation and knowledge base, and generate a possible answer as an internal note. The agent could then use it
as-is, or modify it as needed (a hedged sketch of this flow follows the list).

• Connect our internal data sources into an AI chatbot: we want to integrate our licenses, CRM, SharePoint, knowledge base,
and ticketing system into an AI chatbot that can give us quick and accurate answers. We are interested in using MS Copilot for
this, as it already has access to our data and can combine it with other sources.

• Chatbot for our technical documentation: we want to feed a large language model with our existing technical documentation
and provide human-language search and user-friendly solutions for user inquiries.
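A hedged sketch of the first task, retrieval-augmented drafting of an internal note, assuming the OpenAI Python SDK; the model name and the retrieve() helper are hypothetical stand-ins for the real knowledge-base lookup.

```python
# Hedged sketch of retrieval-augmented ticket answering; the model name and
# the retrieve() helper are hypothetical placeholders.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def retrieve(ticket_text: str) -> list[str]:
    """Hypothetical lookup of related docs/KB articles (e.g., a vector DB)."""
    return ["<relevant documentation snippet>", "<relevant KB article>"]


def draft_internal_note(ticket_text: str) -> str:
    context = "\n".join(retrieve(ticket_text))
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {"role": "system",
             "content": "Draft an internal note answering the ticket, "
                        "using only the provided context."},
            {"role": "user",
             "content": f"Ticket:\n{ticket_text}\n\nContext:\n{context}"},
        ],
    )
    return resp.choices[0].message.content


print(draft_internal_note("User cannot activate their license key."))
```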
Job Description:
Join us in pioneering next-generation Wi-Fi-based vision and 3D sensing technologies that allow for real-time spatial sensing,
including the ability to see behind walls. We are looking for a talented AI/ML Engineer to help develop and optimize
algorithms that leverage wireless signals to create real-time 3D models, enabling applications such as human activity
recognition, object detection, and environmental sensing without the need for traditional cameras.
Key Responsibilities:
Design and optimize advanced signal processing algorithms that convert Wi-Fi signals into detailed 3D sensing data.
Develop and implement AI/ML models for Wi-Fi-based object detection, localization, and 3D human mesh reconstruction (an illustrative model stub follows this list).
Collaborate on innovative research, pushing the boundaries of Wi-Fi-based vision for real-time human activity recognition and
environmental sensing.
Work closely with engineering teams to ensure real-time performance on edge devices and cloud infrastructures with
minimal latency and high efficiency.
Validate, test, and deploy models in dynamic environments for a variety of applications, ensuring robustness and reliability.
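To make the modeling side concrete, here is a deliberately simple PyTorch stub for classifying human activity from windows of Wi-Fi channel state information (CSI); the subcarrier/antenna counts and the number of classes are assumptions, not specifics from the post.

```python
# Illustrative PyTorch stub for Wi-Fi CSI-based activity recognition;
# subcarrier/antenna counts and class count are assumed, not from the post.
import torch
import torch.nn as nn


class CSIActivityNet(nn.Module):
    """Classify activity from a (subcarriers*antennas, time) CSI window."""

    def __init__(self, n_subcarriers=64, n_antennas=3, n_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_subcarriers * n_antennas, 128, kernel_size=5,
                      padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over the time axis
        )
        self.head = nn.Linear(128, n_classes)

    def forward(self, x):  # x: (batch, subcarriers*antennas, time)
        return self.head(self.features(x).squeeze(-1))


# Example: a batch of 8 CSI windows, 100 time steps each.
logits = CSIActivityNet()(torch.randn(8, 64 * 3, 100))
print(logits.shape)  # torch.Size([8, 6])
```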
Required Qualifications:
Bachelor's or Master's degree in Electrical Engineering, Computer Science, or a related field.
4+ years of experience in AI/ML model development, digital signal processing (DSP), and algorithm design.
Proficiency in TensorFlow, Keras, PyTorch, and other machine learning frameworks.
Strong programming skills in MATLAB, Python, or C++ for developing real-time processing algorithms.
Experience in data-driven modelling for signal interpretation and 3D spatial understanding.
Preferred Qualifications:
Responsibilities:
· Innovate and Implement: Design, develop, and deploy AI solutions to optimize design options, streamline engineering
workflows, and improve project planning.
· Lead and Mentor: Guide AI product development, mentor junior team members, and foster a collaborative work environment.
· Collaborate: Work closely with business leaders and end-users to identify AI-driven process improvements and enable
data-driven decision-making.
· Prototype and Pilot: Rapidly prototype and test AI solutions in a fast-paced, agile environment.
Your Mission:
1. Lead KPI dashboard development using Power BI & Domo and bring data insights to life.
2. Drive conversations with business teams to shape data strategies and technical solutions.
3. Work your magic on complex DAX queries and customize visuals with JavaScript libraries.
4. Integrate data from SQL Server, Postgres, MongoDB, NoSQL, and beyond!
Optimize performance, implement security, and create unified data models for analytics.
5. Leverage Python scripting (see the sketch below), and have hands-on experience with SSRS, SSAS, SSIS.
6. Familiar with CI/CD pipelines (Jenkins/Bamboo) and cloud environments like AWS? That's a big plus!
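For mission item 5, a hedged sketch of Python scripting inside a Power BI Python visual: Power BI hands the selected fields to the script as a pandas DataFrame named `dataset` and renders the matplotlib output. The column names here are illustrative assumptions.

```python
# Sketch of a Power BI Python visual script; Power BI injects the queried
# fields as a pandas DataFrame named `dataset`. Column names are illustrative.
import matplotlib.pyplot as plt
import pandas as pd

# dataset = pd.DataFrame(...)  # provided by Power BI at runtime
monthly = dataset.groupby("month", as_index=False)["revenue"].sum()
plt.bar(monthly["month"], monthly["revenue"])
plt.title("Revenue by month")
plt.show()  # Power BI renders the matplotlib figure as the visual
```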
Job Description:

We are seeking a skilled Power BI and Azure Data Factory (ADF) Developer to design and implement data analytics solutions.
The ideal candidate will be responsible for building, maintaining, and optimizing data pipelines using ADF and creating
interactive and insightful dashboards in Power BI.

Key Responsibilities:
Develop, deploy, and manage ETL pipelines in Azure Data Factory (a pipeline-trigger sketch follows this list).
Design and build Power BI dashboards and reports to meet business requirements.
Collaborate with business stakeholders to gather data and reporting requirements.
Optimize and troubleshoot ADF workflows and Power BI datasets for performance.
Integrate multiple data sources such as Azure SQL, Blob Storage, and external APIs into a unified data model.
Ensure data quality, consistency, and security across all reporting and ETL processes.
Stay up-to-date with the latest BI and data engineering tools and technologies.
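A hedged sketch of programmatically triggering one of these ADF pipelines through the Azure management REST API (the documented createRun endpoint); the subscription, resource group, factory, and pipeline names are placeholders.

```python
# Hedged sketch: trigger an Azure Data Factory pipeline run via the Azure
# management REST API. All resource names below are placeholders.
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token(
    "https://management.azure.com/.default").token

url = ("https://management.azure.com/subscriptions/<subscription-id>"
       "/resourceGroups/<resource-group>/providers/Microsoft.DataFactory"
       "/factories/<factory-name>/pipelines/<pipeline-name>/createRun"
       "?api-version=2018-06-01")

resp = requests.post(url, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
print(resp.json())  # contains the runId of the new pipeline run
```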

Requirements:
Proven experience with Power BI, including DAX, Power Query, and data modeling.
Hands-on experience with Azure Data Factory, including pipeline and activity creation.
Strong understanding of Azure services such as Data Lake, Azure SQL, and Synapse Analytics.
Proficiency in SQL and data transformation techniques.
Excellent problem-solving skills and attention to detail.
Strong communication and teamwork abilities.
Job Description:

We are seeking a skilled Power BI and Azure Data Factory (ADF) Developer to design and implement data analytics solutions.
The ideal candidate will be responsible for building, maintaining, and optimizing data pipelines using ADF and creating
interactive and insightful dashboards in Power BI.

Key Responsibilities:
Develop, deploy, and manage ETL pipelines in Azure Data Factory.
Design and build Power BI dashboards and reports to meet business requirements.
Collaborate with business stakeholders to gather data and reporting requirements.
Optimize and troubleshoot ADF workflows and Power BI datasets for performance.
Integrate multiple data sources such as Azure SQL, Blob Storage, and external APIs into a unified data model.
Ensure data quality, consistency, and security across all reporting and ETL processes.
Stay up-to-date with the latest BI and data engineering tools and technologies.

Requirements:
Proven experience with Power BI, including DAX, Power Query, and data modeling.
Hands-on experience with Azure Data Factory, including pipeline and activity creation.
Strong understanding of Azure services such as Data Lake, Azure SQL, and Synapse Analytics.
Proficiency in SQL and data transformation techniques.
Excellent problem-solving skills and attention to detail.
Strong communication and teamwork abilities.
Technical Skills
Large Language Model (LLM) development and fine-tuning
LLAMA model expertise
Training and fine-tuning AI models for specific applications
Prompt engineering and model optimization
Data preparation and preprocessing for AI training
Model evaluation, testing, and deployment
API integration for AI solutions
Advanced knowledge of NLP techniques and model architectures

Data visualization
Data modeling
KPI and metrics development
Power BI performance optimization
DAX query writing
Data integration
Data strategy and unified data layer formulation
Dashboard customization with JavaScript
Cloud infrastructure management
ETL/ELT pipeline design
Data pipeline performance tuning
Big data processing
Relational and non-relational databases
Code review and design discussion
Workflow automation
Monitoring and observability
Cost optimization
Security and access management
Troubleshooting and incident management
Documentation and standards maintenance
Large Language Model (LLM) architecture and deployment
Model fine-tuning and hyperparameter optimization
Data curation, preprocessing, and bias mitigation
Distributed training and multi-GPU/TPU optimization
Model stability, innovation, and custom loss functions
Advanced machine learning algorithms
Model validation and performance monitoring
Cloud deployment and infrastructure scaling
Production-level code implementation
Performance monitoring and maintenance
Communication and cross-functional collaboration
Mentorship and team leadership

Machine learning model implementation


Model fine-tuning and training support
Data preparation and preprocessing
Model validation (cross-validation, A/B testing)
Model optimization and scaling
Cloud deployment assistance
Performance monitoring
Foundational knowledge in NLP and computer vision (preferred)
Communication and collaboration skills
Proficiency in Python, Java, C++, or Scala
Cloud programming and cloud-based data services
Data security and compliance
Scripting and automation for system administration
Industry knowledge in manufacturing, construction, facilities management, oil and gas,
aerospace, or automotive sectors

Unix and PowerShell scripting


Dimensional and relational data modeling
Database security, performance, and backup/recovery
Strong statistical and mathematical skills
Generative AI development
Real-time machine learning
Infrastructure engineering
Automation and CI/CD for data science
MLOps and model deployment
Python programming
Azure cloud services

GPT and AI model integration


Prompt engineering and natural language understanding
Knowledge management integration (CRM, SharePoint, ticketing systems)
Data source integration with AI systems
API development and data connectivity
Chatbot development for technical documentation
Automation for offer generation
Data extraction and processing for dynamic pricing
Experience with MS Copilot and Microsoft ecosystem
Understanding of ticketing and CRM workflows
Document search and retrieval optimization
AI/ML model development and optimization
Digital signal processing (DSP)
Wi-Fi-based object detection and 3D sensing algorithms
Signal interpretation and 3D spatial modeling
Real-time processing and low-latency optimization
Model validation, testing, and deployment for robustness
Computer vision and 3D reconstruction techniques
Wireless sensing systems and Wi-Fi-based spatial awareness
Edge computing and cloud architecture for real-time AI model deployment

AI/ML solution design and deployment


AI product lifecycle management
Prototyping and agile development
AI model enhancement and maintenance
Architectural review for scalable AI solutions
LLMs and prompt engineering
Retrieval-Augmented Generation (RAG) methodologies
KPI dashboard development and data visualization
Data strategy formulation and technical solution design
Advanced DAX query writing
Data integration across SQL Server, Postgres, MongoDB, NoSQL
Performance optimization and security implementation
Unified data modeling for analytics
Python scripting for data manipulation
Experience with SSRS, SSAS, SSIS
ETL pipeline development and management
Power BI dashboard and report design
Data modeling and DAX query proficiency
Data integration from multiple sources (Azure SQL, Blob Storage, APIs)
Performance optimization and troubleshooting in ADF and Power BI
Data quality, consistency, and security management
Collaboration with stakeholders for data and reporting requirements
ETL pipeline development and management in Azure Data Factory
Power BI dashboard and report creation
Advanced data modeling and DAX query writing in Power BI
Data integration from multiple sources (Azure SQL, Blob Storage, external APIs)
Performance optimization for ADF workflows and Power BI datasets
Data quality, consistency, and security management across reporting and ETL processes
SQL and data transformation techniques
Stakeholder collaboration for data and reporting requirements
Tools
LLAMA (Large Language Model Meta AI)
PyTorch
TensorFlow
Hugging Face Transformers
LangChain
Data versioning and management tools (e.g., DVC)
Docker and Kubernetes for deployment
Jupyter Notebook and Google Colab for experimentation
Version control (Git/GitHub)
Cloud platforms (AWS, Azure, GCP)
Power BI
Domo
SQL
DAX (Data Analysis Expressions)
JavaScript libraries for visualization (e.g., D3.js)
Data warehousing solutions (e.g., Snowflake, Redshift)
ETL tools (e.g., Alteryx, SSIS)
Excel
AWS (S3, EMR, RDS, IAM)
Snowflake
Apache Spark
Apache Hive
Trino
Airflow
Oozie
NoSQL databases (e.g., DynamoDB)
Monitoring tools (e.g., CloudWatch, Grafana)
Containerization (e.g., Docker, Kubernetes)
Infrastructure as Code (e.g., Terraform, CloudFormation)
LLMs (e.g., GPT, BERT, Transformers)
TensorFlow
PyTorch
Cloud platforms (AWS, Azure, GCP)
Distributed training frameworks (e.g., Horovod)
Hyperparameter tuning tools (e.g., Optuna)
DevOps tools for deployment and monitoring (e.g., Docker,
Kubernetes, MLflow)
Version control (Git)
Python, C++

Machine learning frameworks (TensorFlow, PyTorch)


Python, C++
Cloud platforms (AWS, Azure, GCP)
Version control (Git)
Basic deployment tools (e.g., Docker)
Data handling tools (e.g., Pandas, NumPy)
Cloud platforms (Google Cloud, Azure, Oracle, AWS)
Automation and scripting tools (e.g., Bash, PowerShell)
Data security protocols and compliance frameworks
Version control (Git)
Data processing frameworks (e.g., Apache Spark, Hadoop)

SSIS, SSRS, SSAS


Database management systems (e.g., SQL Server, Oracle)
Windows Server
Cloud platforms (Azure, Google Cloud, Oracle, AWS)
OpenAI
LangChain
Vector databases (VectorDBs)
Databricks Delta Live Tables
Azure IoT Hub
Azure Functions
Data Science Workbench tools
CI/CD pipelines (e.g., Jenkins, GitHub Actions, Azure
DevOps)

GPT models (OpenAI, Azure OpenAI)


Microsoft Copilot
Azure Cognitive Services
CRM platforms (e.g., Salesforce, Dynamics 365)
SharePoint
Ticketing systems (e.g., Zendesk, ServiceNow)
Chatbot frameworks (e.g., Bot Framework, LangChain)
API integration tools
Document processing tools (e.g., DocAI)
Automation and workflow tools (e.g., Power Automate,
Zapier)
Python and JavaScript for scripting and integration
Excel/Power BI for data handling and visualization (if needed)
TensorFlow
Keras
PyTorch
MATLAB
Python
C++
Machine learning and signal processing libraries (e.g.,
SciPy, NumPy)
Cloud platforms (AWS, Azure, or Google Cloud for edge
and cloud deployment)
Edge computing platforms (e.g., NVIDIA Jetson,
Qualcomm Snapdragon)

Programming languages: Python


AI/ML frameworks: PyTorch, LangChain
LLMs: OpenAI GPT and similar models
Database management: MySQL, ChromaDB, Pinecone
Automation tools: BeautifulSoup, Scrapy, Selenium
Cloud platforms: AWS, Azure, GCP
Version control and CI/CD: Git, Jenkins, GitHub Actions
Power BI
Domo
SQL databases (SQL Server, Postgres, MongoDB,
NoSQL)
Python
SSRS, SSAS, SSIS
CI/CD tools (Jenkins, Bamboo)
Cloud platforms (AWS, Azure)
Power BI (including DAX, Power Query)
Azure Data Factory
Azure SQL
Azure Data Lake
Synapse Analytics
SQL
Data integration tools and APIs
Power BI (DAX, Power Query)
Azure Data Factory (ADF)
Azure SQL Database
Azure Data Lake
Synapse Analytics
SQL
