Bachelor of Technology
Information Technology
More effective people and organizations that dream, believe, create and deliver!
In the age of the intelligent individual, we believe that the key measure of success is when effective people and organizations engage in open, honest, two-way symmetrical communication based on understanding and meaning. They therefore help people develop to the best of their abilities, so that they can have what they want and need, which in turn helps them grow further. They show leadership in offering education programs that enable and transform the way people and businesses find, manage, interact and communicate with one another, making the organization one that understands and satisfies the education, entertainment and self-actualization needs of its customers. The company believes that real value lies in the knowledge gained through practical experience rather than in external credentials. They engage with students, observe what is working and what is not, and take prompt decisions to ensure that learning outcomes are not compromised.
Companies need to understand the complexities in students' lives, which are sometimes dismissed as trivial; they need to see things from the student's perspective and help students understand what is right for them. Working in the education sector is not like working in e-commerce, payments or entertainment; it is a noble cause.
With these efforts over two years and beyond, they are working to make YBI Foundation an institution that imparts excellent education and addresses the core problems of the Indian education system.
DECLARATION
I, Deepanshu Bhola, a student of B.Tech (Information Technology), hereby declare that the summer training report entitled "Cancer Prediction App", which is submitted to the Department of Information Technology, HMR Institute of Technology & Management, Hamidpur, Delhi, affiliated to Guru Gobind Singh Indraprastha University, Dwarka (New Delhi), in partial fulfilment of the requirements for the award of the degree of Bachelor of Technology in Information Technology, has not previously formed the basis for the award of any degree, diploma or other similar title or recognition.
This is to certify that the above statement made by the candidate is correct to the best of my knowledge.
ACKNOWLEDGEMENT
I am pleased to present this industrial training report entitled CANCER PREDICTION APP. It is indeed a great pleasure and a moment of immense satisfaction for me to express my profound gratitude and indebtedness towards my guide, Mr. Gautam Yadav, whose enthusiasm has been a constant source of inspiration. I am extremely thankful for the guidance and untiring attention which he bestowed on me right from the beginning. His valuable and timely suggestions at crucial stages, and above all his constant encouragement, have made it possible for me to complete this work. I would also like to give my sincere thanks to Ms. Renu Chaudhary, Head of the Department of Information Technology, for the necessary help and for providing the facilities required for the completion of this project report. I would like to thank the entire teaching staff who were directly or indirectly involved in data collection and software assistance in bringing forward this training report. I express my deep sense of gratitude towards my parents for their sustained cooperation and good wishes, which have been a prime source of inspiration in taking this project work to its end without any hurdles. Last but not least, I would like to thank all my B.Tech. colleagues for their cooperation and useful suggestions, and all those who have directly or indirectly helped in the completion of this project work.
ABSTRACT
Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed
to think, learn, and solve problems like humans. AI systems can perform tasks such as recognizing speech,
making decisions, translating languages, and recognizing patterns, often with minimal human
intervention. Machine Learning is an application of artificial intelligence (AI) that provides systems with
the ability to automatically learn and improve from experience without being explicitly programmed.
Machine learning focuses on the development of computer programs that can access data and use it to learn
for themselves. During my summer training at YBI Foundation, I acquired hands-on experience in
Artificial Intelligence (AI) and Machine Learning (ML), focusing on both theoretical foundations and
practical implementations. The program introduced me to various machine learning algorithms, data
preprocessing techniques, and model evaluation methods. As part of the training, I developed a project
titled "Cancer Prediction App", which aimed to predict the likelihood of cancer based on patient data. The
app utilized machine learning algorithms, particularly logistic regression and support vector machines, to
analyze medical datasets and provide predictions. I worked on various stages of the project, including data
collection, feature selection, model training, and performance optimization. This experience not only
enhanced my understanding of AI and ML but also helped me strengthen my skills in Python
programming, data analysis, and model deployment. The Cancer Prediction App project showcased the
practical potential of AI in the healthcare industry and underscored the importance of accurate and reliable
predictive models in medical diagnosis.
TABLE OF CONTENTS
CHAPTER 1 INTRODUCTION
1.1 About YBI Foundation
1.2 Internship Overview
1.3 Objectives
CHAPTER 2 TECHNICAL FOUNDATION
2.1 Mathematics for Machine Learning
CHAPTER 3 DATA SCIENCE FUNDAMENTALS
3.1 Data Collection and Preprocessing
CHAPTER 4 MACHINE LEARNING
CHAPTER 5 DEEP LEARNING
CHAPTER 6 PROJECTS AND IMPLEMENTATIONS
CHAPTER 7 TOOLS AND TECHNOLOGIES
7.1 Development Environment and Infrastructure
CHAPTER 8 BEST PRACTICES AND INDUSTRY STANDARDS
8.1 Code Quality and Architecture
CHAPTER 9 LEARNING OUTCOMES
9.1 Technical Skills Acquired
CHAPTER 10 CHALLENGES AND SOLUTIONS
10.1 Technical Challenges
CHAPTER 11 TOOLS AND TECHNOLOGIES USED
11.1 Overview of Tools and Technologies
11.2 Programming Languages
11.3 Machine Learning and Data Science Libraries
CHAPTER 12 CHALLENGES FACED AND LESSONS LEARNED
12.1 Introduction
12.2 Data-Related Challenges
APPENDIX
Appendix A: Code Repositories and Implementation Details
Appendix B: Project Documentation
Appendix C: Certificates and Achievements
Appendix D: Weekly Progress Reports
Appendix E: Reference Materials
CONCLUSION
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
1.1 About YBI Foundation
The Youth Business International (YBI) Foundation is a global network aimed at fostering
entrepreneurship among young people, especially those facing social, economic, or personal barriers.
YBI’s mission is to help young entrepreneurs build successful and sustainable businesses that can create
jobs and stimulate local economies. With a focus on both developed and developing countries, YBI
tailors its approach based on local needs, cultural contexts, and the specific challenges young
entrepreneurs face in different regions.
1. Core Areas of Focus
YBI’s services are designed to support youth entrepreneurship across four key areas:
A. Mentoring
Mentorship is a cornerstone of YBI’s model, offering young entrepreneurs the opportunity to learn from
experienced business professionals. The mentoring process typically involves:
One-on-One Relationships: Young entrepreneurs are matched with experienced mentors who can guide
them through the business development process, offering both practical advice and emotional support.
Long-Term Engagement: Mentorship relationships usually last for 6 to 12 months, providing sustained
guidance as entrepreneurs navigate the challenges of starting and scaling their businesses.
Structured Framework: YBI provides training to mentors to ensure they are equipped to offer both
technical business advice and personal development support, fostering a holistic growth environment.
B. Training and Capacity Building
Training programs are offered in various formats, ranging from short workshops to longer, in-depth
courses. YBI’s training covers essential business skills, including:
Business Planning and Strategy: Entrepreneurs are taught how to develop comprehensive business plans,
including market analysis, operational strategies, and financial planning.
Financial Management: YBI emphasizes the importance of sound financial practices, teaching young
entrepreneurs how to manage cash flow, access credit, and maintain financial records.
Marketing and Branding: Entrepreneurs learn how to identify their target markets, build brands, and
execute marketing strategies both online and offline.
C. Access to Finance
One of the biggest hurdles for young entrepreneurs is access to funding. YBI helps bridge this gap by:
Providing Microloans and Seed Funding: Many YBI network members offer financial assistance directly
to entrepreneurs, often in the form of low-interest loans or grants. These financial products are designed
for young entrepreneurs who may lack collateral or credit history.
Connecting with Investors: YBI often partners with angel investors, venture capitalists, and crowdfunding
platforms to help young entrepreneurs secure larger-scale investment.
Guidance on Financial Literacy: Alongside providing access to funding, YBI offers financial literacy
programs to ensure entrepreneurs understand the financial obligations and risks associated with
borrowing.
2. Global Network and Local Impact
YBI is not a single entity but a network of over 50 independent organizations that operate in more than
70 countries. Each local organization is autonomous but adheres to YBI’s core values and
methodologies. This structure allows YBI to:
Tailor Solutions Locally: Programs are adapted to reflect local market conditions, cultural norms, and
regulatory environments. For example, an entrepreneurship training program in Kenya may focus on
agribusiness, while one in Canada may emphasize tech startups.
Share Global Knowledge: Through its global network, YBI facilitates the exchange of best practices,
success stories, and lessons learned among its members, ensuring continual improvement in program
delivery.
3. Impact Measurement
YBI places a strong emphasis on monitoring and evaluating the impact of its programs. Through data
collection and analysis, YBI tracks key performance indicators such as:
Job Creation: Measuring the number of jobs created by the businesses YBI supports.
Business Survival Rates: Tracking how many businesses are still operating after 1-3 years, which serves
as a proxy for long-term success.
Social Impact: Assessing how entrepreneurship has contributed to wider social outcomes, such as poverty
reduction and community development.
Machine Learning Frameworks: Tools like TensorFlow and PyTorch, used to build, train, and deploy
machine learning models.
Data Processing and Exploration: Techniques for cleaning and exploring data, handling missing values,
and feature scaling.
C. Machine Learning Techniques
Supervised Learning: Study algorithms like regression, support vector machines (SVM), decision trees,
and random forests. Learn how to use labeled data to make predictions.
Unsupervised Learning: Work with algorithms like k-means clustering and hierarchical clustering to
uncover patterns in unlabeled data.
Deep Learning: Introduction to neural networks, convolutional neural networks (CNNs) for image
processing, and recurrent neural networks (RNNs) for sequential data.
Natural Language Processing (NLP): Techniques for text processing, such as sentiment analysis,
tokenization, and language modeling.
D. Model Evaluation and Tuning
Performance Metrics: Learn how to evaluate model performance using metrics like accuracy, precision,
recall, F1 score, and ROC-AUC.
Cross-Validation: Understanding the importance of splitting data and using techniques like k-fold cross-
validation to improve model generalization.
Hyperparameter Tuning: Techniques like grid search and random search for finding the best model
parameters.
E. AI Ethics and Responsible AI
Fairness and Bias: Understanding the ethical implications of AI/ML, addressing bias in data and models,
and ensuring fairness in machine learning predictions.
Transparency: Learn about the importance of model interpretability and techniques like SHAP or LIME
for explaining model decisions.
3. Internship Responsibilities
Interns are typically expected to take on the following responsibilities during the course of their
internship:
Data Collection and Preprocessing: Work with raw datasets, cleaning and transforming data for analysis
and model training.
Model Development: Assist in developing and training machine learning models, from prototype to
deployment.
Experimentation and Tuning: Experiment with different machine learning algorithms, hyperparameters,
and training techniques to improve model performance.
Documentation: Maintain clear documentation of model development processes, code, and results.
Collaboration: Work with teams of data scientists, software engineers, and other AI/ML professionals on
real-world projects.
4. Tools and Technologies Used
Interns will typically work with a range of tools and technologies, including:
Programming Languages: Primarily Python (with R, Java, or C++ as secondary languages).
Machine Learning Libraries: TensorFlow, PyTorch, Keras, Scikit-learn.
Data Processing Tools: Pandas, NumPy, Dask.
Visualization Tools: Matplotlib, Seaborn, Plotly.
Cloud Platforms: AWS, Google Cloud, or Azure for AI/ML projects at scale.
Version Control: Git for managing code and project versioning.
5. Career Opportunities
Upon completing an AI/ML internship, participants can pursue a variety of career paths:
Machine Learning Engineer: Focus on building and optimizing machine learning models for deployment
in production environments.
Data Scientist: Analyze data to extract insights and build predictive models.
AI Researcher: Conduct research on advanced AI algorithms and contribute to academic or industrial
advancements in the field.
AI Product Manager: Manage the development and deployment of AI-driven products and solutions.
1.3 Objectives
The objective of an AI/ML internship is to provide participants with practical experience and
foundational knowledge in the fields of artificial intelligence and machine learning. The internship
aims to bridge the gap between theoretical learning and real-world applications, preparing participants
for future roles in AI/ML development, data science, or related fields.
o Foster critical thinking and analytical skills by working on real-world problems where AI/ML techniques
can be applied.
o Enable interns to experiment with different machine learning algorithms and strategies to solve specific
tasks such as classification, regression, clustering, or natural language processing.
o Introduce the concepts of AI ethics, fairness, and transparency in machine learning to ensure responsible
and sustainable AI practices.
o Help participants gain proficiency in AI/ML frameworks such as TensorFlow, PyTorch, and Scikit-learn,
along with key programming languages like Python.
o Introduce modern tools, technologies, and cloud-based platforms (such as AWS, Google Cloud, or Azure)
used in the industry to build and deploy AI solutions.
o Equip interns with the knowledge and skills required to pursue roles such as machine learning engineer,
data scientist, AI researcher, or AI product manager.
o Provide career guidance, networking opportunities, and mentorship from industry professionals to help
interns understand the career pathways available in AI/ML.
A student can apply this internship experience in his or her future work. YBI Foundation gave me the
opportunity to gather practical experience and prepare this report. I prepared this report under the
supervision of Ms. Renu Chaudhary, Assistant Professor, IT Department, HMR Institute of Technology and
Management. The study focuses mainly on the "Cancer Prediction App", so the technologies used in
building the application are briefly described in the report. The code and the output of the project are
attached to give a detailed explanation of the field and a sample project related to it.
CHAPTER 2
TECHNICAL FOUNDATION
2.1 Mathematics for Machine Learning
Mathematics for Machine Learning forms the backbone of understanding and developing machine
learning algorithms. A strong grasp of key mathematical concepts is essential for both applying machine
learning techniques and understanding how algorithms work under the hood.
Here’s an overview of the essential mathematics topics that are foundational for machine learning:
1. Linear Algebra
Linear algebra is crucial for handling high-dimensional data and is at the core of many machine learning
algorithms, particularly in deep learning.
Vectors and Matrices: Understanding operations on vectors and matrices is essential, as datasets are
often represented as matrices, with each row representing an example and each column representing a
feature.
o Vector operations: Addition, subtraction, dot product, and norm.
o Matrix operations: Multiplication, inversion, transposition.
Eigenvalues and Eigenvectors: These play a significant role in dimensionality reduction techniques like
Principal Component Analysis (PCA), which helps to reduce the complexity of data without losing much
information.
Matrix Factorization: Techniques such as Singular Value Decomposition (SVD) are used in
recommendation systems, among other applications.
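To make these operations concrete, the following is a minimal NumPy sketch (the matrix values are arbitrary illustrations, not taken from the project) showing vector operations, matrix multiplication, inversion, and eigendecomposition:
import numpy as np

# Vectors: addition, dot product, and Euclidean norm
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
print(a + b, a.dot(b), np.linalg.norm(a))

# Matrices: multiplication with the transpose, and inversion
M = np.array([[2.0, 1.0], [1.0, 3.0]])
print(M @ M.T)
print(np.linalg.inv(M))

# Eigenvalues and eigenvectors, the building blocks of PCA-style decompositions
values, vectors = np.linalg.eig(M)
print(values)
print(vectors)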
2. Calculus
Calculus, particularly differential calculus, is used to optimize machine learning algorithms, especially
during the training phase.
Derivatives and Partial Derivatives: Derivatives describe how a model's output or loss changes with respect
to its parameters (such as the weights in a neural network), which indicates how those parameters should be updated.
o Gradient: The gradient is the vector of partial derivatives and is used to find the direction of
steepest ascent or descent.
Gradient Descent: One of the core optimization algorithms in machine learning, it involves finding the
local minimum of a function by following the negative gradient of the function.
o Backpropagation: In neural networks, the chain rule from calculus is applied during backpropagation to
compute gradients that help in updating the weights.
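As a brief illustration of the idea described above (a toy sketch, not part of the training material), gradient descent on the one-variable function f(w) = (w - 3)^2 repeatedly steps against its derivative f'(w) = 2(w - 3):
# Minimal gradient descent on f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w = 0.0                # initial parameter value
learning_rate = 0.1
for step in range(100):
    gradient = 2 * (w - 3)               # derivative of the loss at the current w
    w = w - learning_rate * gradient     # move against the gradient
print(w)  # converges towards the minimiser w = 3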
3. Probability and Statistics
Probability theory and statistics provide the tools to model uncertainty, make predictions, and draw
inferences from data.
Probability Distributions: Understanding various probability distributions (e.g., Gaussian, Bernoulli,
Binomial, and Poisson distributions) is crucial, as many machine learning algorithms assume certain data
distributions.
Bayes’ Theorem: It is the foundation of many probabilistic algorithms, including Naive Bayes classifiers
and Bayesian Networks.
Expectation and Variance: These are measures of the central tendency and spread of data, important in
understanding models' predictions.
Maximum Likelihood Estimation (MLE): A method for estimating the parameters of a statistical model
by maximizing the likelihood that the process described by the model produced the observed data.
Hypothesis Testing: Techniques for determining whether a specific hypothesis about the data is
statistically significant or not (e.g., p-values, confidence intervals).
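For instance, Bayes' theorem P(A|B) = P(B|A)P(A) / P(B) can be evaluated directly; the numbers below describe a hypothetical screening test and are illustrative only, not taken from the project dataset:
# Hypothetical screening test: prior prevalence, sensitivity, and false-positive rate
p_disease = 0.01             # P(disease)
p_pos_given_disease = 0.95   # P(positive | disease), the sensitivity
p_pos_given_healthy = 0.05   # P(positive | no disease), the false-positive rate

# Total probability of a positive test result
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Posterior probability of disease given a positive test (Bayes' theorem)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # roughly 0.161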
def predict(x):
    return 1 if x > 0.5 else 0  # placeholder body completing the truncated example: a simple threshold rule
Dictionaries/Hash Maps: Useful for storing key-value pairs, often used in cases where you need to
quickly retrieve data (e.g., mappings of categories to numerical values in classification).
Example:
label_mapping = {'dog': 0, 'cat': 1, 'bird': 2}
Tuples: Immutable sequences, useful for storing data that should not change, such as fixed coordinates
or parameters.
Example:
point = (3, 4)
Sets: A collection of unique items, often used to remove duplicates from data.
Example:
unique_labels = set([1, 2, 2, 3])
Stacks and Queues: Stacks use a Last-In-First-Out (LIFO) approach, while queues use a First-In-First-
Out (FIFO) structure. These structures are less commonly used in direct ML model building but are
essential in search algorithms, graph traversals, or when managing tasks in priority order.
Example:
from collections import deque
queue = deque([1, 2, 3])
queue.append(4)
print(queue.popleft()) # Removes the first element
Trees and Graphs: Data structures like decision trees, random forests, and graphs are essential for certain
ML algorithms.
o Decision Trees: Used in classification and regression tasks, where each node represents a
decision based on a feature, and the leaves represent outcomes.
o Graph Data Structures: Useful in network analysis, recommendation systems, and graph-based
learning techniques.
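A common lightweight way to represent a graph in Python is a dictionary mapping each node to its list of neighbours (the same shape of input used by the BFS example later in this chapter); the small graph below is purely illustrative:
# Adjacency-list representation of a small, illustrative undirected graph
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['A', 'D'],
    'D': ['B', 'C'],
}
print(graph['A'])  # neighbours of node A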
3. Algorithms
Machine learning models themselves are often implemented as algorithms that learn from data.
Understanding algorithms helps in improving the efficiency and performance of model training and
inference.
Search Algorithms:
o Linear Search: Checking every element one by one (inefficient for large datasets).
o Binary Search: Efficiently finding an element in a sorted array, reducing the time complexity to
O(log n).
Sorting Algorithms:
o Bubble Sort, Quick Sort, Merge Sort: Sorting data is a common preprocessing step in machine
learning pipelines.
Quick Sort has an average-case complexity of O(n log n) and is commonly used due to
its efficiency.
Dynamic Programming:
o Solves problems by breaking them down into simpler overlapping subproblems. Used in
optimization problems, like finding the shortest path in a graph or the most efficient way to carry
out tasks (e.g., Markov decision processes).
Greedy Algorithms:
o Make locally optimal choices at each step with the hope of finding a global optimum. This is
used in various AI/ML optimization techniques, including feature selection and scheduling tasks.
Graph Algorithms:
o Breadth-First Search (BFS) and Depth-First Search (DFS): Used in problems like navigating through
states, search spaces, or finding connected components in graphs.
o Dijkstra’s Algorithm: Used for finding the shortest path in a graph, applicable in recommendation
engines, routing, and navigation systems.
Example of BFS:
from collections import deque

def bfs(graph, start):
    # breadth-first traversal over an adjacency-list graph (dict of node -> list of neighbours)
    visited = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node not in visited:
            visited.add(node)
            # enqueue unvisited neighbours so they are explored level by level
            queue.extend(n for n in graph[node] if n not in visited)
    return visited
CHAPTER 3
DATA SCIENCE FUNDAMENTALS
3.1 Data Collection and Preprocessing
1. Data Collection
Data Collection refers to the process of gathering relevant data for a specific analysis or machine
learning project. The quality, quantity, and relevance of the data directly affect the performance of
machine learning models and the accuracy of data-driven insights.
Types of Data:
Structured Data: Organized into rows and columns, often found in databases, spreadsheets, or CSV files
(e.g., customer records, sales data).
Unstructured Data: Does not follow a pre-defined structure (e.g., text documents, images, videos, social
media posts).
Semi-Structured Data: Contains some organizational structure but does not follow a rigid schema (e.g.,
JSON, XML files, log data).
Sources of Data:
Manual Data Entry: Data manually entered by users or through forms (e.g., surveys, input fields).
APIs: Many companies offer APIs (Application Programming Interfaces) for retrieving structured or
unstructured data (e.g., Twitter API, OpenWeatherMap API).
Web Scraping: The process of automatically extracting data from websites (e.g., scraping product data
from e-commerce sites).
Sensors/IoT Devices: Data generated by devices in real-time (e.g., temperature sensors, fitness trackers).
Public Datasets: Many organisations and platforms provide open datasets (e.g., UCI Machine Learning
Repository, Kaggle, government data portals).
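As an example of working with a public dataset, the Wisconsin breast cancer data bundled with scikit-learn can be loaded into a pandas DataFrame in a few lines (using this particular dataset here is an illustrative choice, not necessarily the project's exact source):
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Load a public, structured dataset into a DataFrame
data = load_breast_cancer(as_frame=True)
df = data.frame                       # features plus a 'target' column
print(df.shape)
print(df['target'].value_counts())    # class balance of the labels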
Data Preprocessing
Data Preprocessing is the next crucial step after data collection, where raw data is cleaned, transformed,
and prepared for analysis or machine learning models. Raw data is often messy and
contains missing values, outliers, or irrelevant information. Preprocessing ensures that the data is in a
suitable format for further analysis.
Example (removing duplicate rows with pandas):
data.drop_duplicates(inplace=True)
Outlier Detection and Treatment: Outliers can skew your analysis or affect the performance of your
models.
Methods:
Z-Score: Identify outliers based on how far a data point deviates from the mean.
IQR (Interquartile Range): Remove outliers that lie outside a certain range.
Transformations: Use techniques like logarithmic or square root transformations to reduce the
impact of outliers.
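A minimal pandas sketch of the IQR rule described above (the column name 'feature' and the values are placeholders):
import pandas as pd

# Illustrative data with one obvious outlier in a placeholder column named 'feature'
df = pd.DataFrame({'feature': [10, 12, 11, 13, 12, 95]})

# IQR rule: keep only rows within 1.5 * IQR of the quartiles
q1, q3 = df['feature'].quantile(0.25), df['feature'].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df_no_outliers = df[df['feature'].between(lower, upper)]
print(df_no_outliers)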
Figure 3.1
Figure 3.2
CHAPTER 4
MACHINE LEARNING
Figure 4.1
1. Linear Regression
Purpose: Predicts a continuous target value as a linear combination of the input features.
Use Cases: Used in finance (e.g., predicting stock prices), economics, and real estate (e.g., predicting
house prices).
2. Logistic Regression
Purpose: Used for binary classification problems to predict the probability that a given input belongs to
a particular category.
Method: Applies the logistic function to a linear combination of the input features, producing an S-
shaped curve that outputs probabilities between 0 and 1. The decision boundary is determined by a
threshold (e.g., 0.5).
Use Cases: Used in medical fields for disease classification, marketing for customer churn prediction,
and many other binary classification tasks.
3. Decision Trees
Purpose: Can be used for both classification and regression tasks.
Method: A tree-like model that splits the data into subsets based on feature values, leading to a decision
outcome at each leaf node. Each internal node represents a feature test, and each branch represents the
outcome of the test.
Use Cases: Commonly used in finance (e.g., credit scoring), healthcare (e.g., diagnosing patients), and
various decision-making processes.
4. Random Forests
Purpose: An ensemble method that improves the performance of decision trees.
Method: Builds multiple decision trees during training and merges their outputs to increase accuracy and
control overfitting. Each tree is trained on a random subset of the data and features.
Use Cases: Used in scenarios requiring high accuracy, such as stock market predictions, image
classification, and medical diagnosis.
5. Support Vector Machines (SVM)
Purpose: Primarily used for classification tasks, but can also be adapted for regression.
Method: Finds the hyperplane that best separates data points of different classes in a high-dimensional
space. The goal is to maximize the margin between the closest points of the classes (support vectors).
Use Cases: Effective in high-dimensional spaces, used in text classification, image recognition, and
bioinformatics.
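The sketch below shows how two of the algorithms described above could be fitted with scikit-learn on the bundled breast cancer data; it is an illustrative setup rather than the project's exact pipeline:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Feature scaling helps both logistic regression and SVMs converge
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

for model in (LogisticRegression(max_iter=1000), SVC(kernel='rbf')):
    model.fit(X_train, y_train)
    print(type(model).__name__, model.score(X_test, y_test))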
2. Dimensionality Reduction
Definition: Dimensionality reduction techniques reduce the number of features (variables) in a dataset
while retaining its essential structure and information. This can simplify models and reduce
computational costs.
Common Techniques:
o Principal Component Analysis (PCA): Transforms the data into a new coordinate system where
the greatest variance by any projection lies on the first coordinate (principal component), the
second greatest variance on the second coordinate, and so on.
o t-Distributed Stochastic Neighbor Embedding (t-SNE): A nonlinear technique that reduces
dimensions while preserving local structure, often used for visualizing high-dimensional data in
two or three dimensions.
o Autoencoders: Neural networks that learn to compress data into a lower-dimensional space and
then reconstruct the input data from this representation.
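For example, PCA can compress the 30 breast cancer features into two principal components for visualisation (an illustrative sketch, assuming the scikit-learn breast cancer dataset):
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X, y = load_breast_cancer(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scale

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
print(X_2d.shape)                        # (569, 2)
print(pca.explained_variance_ratio_)     # variance captured by each component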
Model Evaluation Metrics:
o F1 Score: The harmonic mean of precision and recall, providing a balance between the two.
o Mean Absolute Error (MAE) and Mean Squared Error (MSE): Metrics for regression tasks that
measure the average error between predicted and actual values.
3. Hyperparameter Tuning
Definition: Hyperparameter tuning involves optimizing the hyperparameters of a machine learning
model to improve its performance. Hyperparameters are external to the model and are set before training
(e.g., learning rate, number of trees in a random forest).
Techniques:
o Grid Search: Exhaustively searches through a specified parameter grid to find the optimal
combination of hyperparameters.
o Random Search: Samples a fixed number of hyperparameter combinations randomly, which can
be more efficient than grid search.
o Bayesian Optimization: A probabilistic model-based approach that selects hyperparameters
based on past evaluation results to find optimal settings more efficiently.
Benefits: Enhances model performance and ensures that the best parameter configurations are used
during training.
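A minimal grid search over random forest hyperparameters might look like the sketch below; the parameter grid values are arbitrary examples, not the settings used in the project:
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

# Small, illustrative grid; real searches usually cover more values
param_grid = {'n_estimators': [100, 300], 'max_depth': [None, 5, 10]}
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=5, scoring='f1')
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))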
CHAPTER 5
DEEP LEARNING
Figure 5.1
Figure 5.2
Architecture of CNNs
CNNs typically consist of several key layers:
Input Layer: Accepts the raw pixel values of the input image, typically in the format of height × width ×
channels (e.g., RGB images have three channels).
Convolutional Layers:
o Convolution Operation: Applies a set of filters (kernels) to the input image. Each filter slides
(convolves) over the input, computing the dot product between the filter and the local region of
the image. This operation captures local patterns and features (e.g., edges, textures).
Feature Maps: The result of applying the convolution operation is a set of feature maps, which represent
the presence of specific features in the input image.
Activation Function: Commonly, the ReLU (Rectified Linear Unit) activation function is applied to
introduce non-linearity, allowing the network to learn complex patterns.
Pooling Layers:
o Purpose: Down-sample the feature maps, reducing their spatial dimensions while retaining the
most important features. This helps to reduce computational complexity and overfitting.
Popular CNN Architectures
LeNet-5: One of the earliest CNN architectures, primarily used for handwritten digit recognition.
AlexNet: A deep CNN that won the ImageNet competition in 2012, known for its depth and the use of
ReLU and dropout.
VGGNet: Characterized by its simple architecture using small (3x3) convolution filters, allowing for
deeper networks.
ResNet: Introduced residual connections (skip connections) that help mitigate the vanishing gradient
problem in very deep networks.
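A minimal Keras sketch of the convolution–activation–pooling pattern described above (the input shape, filter counts, and the assumption of 28x28 grayscale images with 10 classes are chosen only for illustration):
from tensorflow import keras
from tensorflow.keras import layers

# Tiny CNN: convolution -> ReLU -> pooling, repeated, followed by a dense classifier head
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),               # height x width x channels
    layers.Conv2D(32, (3, 3), activation='relu'),  # learn local features (edges, textures)
    layers.MaxPooling2D((2, 2)),                   # down-sample the feature maps
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax'),        # class probabilities
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()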
Figure 5.3
Architecture of RNNs
Basic Structure: An RNN consists of an input layer, one or more recurrent layers, and an output layer.
o Input Layer: Takes the sequential data as input, where each element of the sequence is fed to the network
at each time step.
o Recurrent Layer: Contains the hidden states that capture information from previous time steps.
Applications of RNNs
Natural Language Processing: RNNs are widely used for tasks such as language modeling, text
generation, and machine translation.
Speech Recognition: RNNs can process audio signals to transcribe spoken language into text.
Time Series Prediction: RNNs are effective for predicting future values in a sequence, such as stock
prices or weather data.
Video Analysis: RNNs can analyze sequential frames in video data for tasks like action recognition.
5.4 Transfer Learning
Definition: Transfer learning is a machine learning approach where a model developed for one task is
reused as the starting point for a model on a second, related task. It leverages knowledge gained from a
pre-trained model to improve learning in a different but related problem domain.
Purpose: It aims to reduce the time and computational resources required to train models from scratch,
especially in scenarios where labeled data is scarce.
How Transfer Learning Works
Pre-trained Models: In transfer learning, a model is typically pre-trained on a large dataset (e.g.,
ImageNet for image classification) to learn generalized features. These features can be adapted to a new
task with different data.
Fine-Tuning: The pre-trained model can be fine-tuned on a smaller dataset specific to the new task. This
involves:
o Freezing Layers: Keeping some layers of the pre-trained model unchanged while only retraining
the last few layers.
o Adjusting Hyperparameters: Modifying learning rates and other hyperparameters to optimize
performance for the new task.
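A hedged Keras sketch of this freeze-and-fine-tune pattern, using an ImageNet-pre-trained image model as the base (VGG16 and the binary head below are illustrative choices, not part of the project):
from tensorflow import keras
from tensorflow.keras import layers

# Load a model pre-trained on ImageNet, without its original classification head
base = keras.applications.VGG16(weights='imagenet', include_top=False,
                                input_shape=(224, 224, 3))
base.trainable = False                 # freeze the pre-trained layers

# Attach a small new head for a hypothetical binary task
model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer=keras.optimizers.Adam(1e-4),
              loss='binary_crossentropy', metrics=['accuracy'])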
CHAPTER 6
PROJECTS AND IMPLEMENTATIONS
Figure 6.1
The logistic regression model learns its coefficients through maximum likelihood estimation, attempting to
find values that best explain the observed data while avoiding overfitting.
Dataset Analysis and Preparation
The project utilized a comprehensive cancer dataset that required careful preparation and analysis:
Data Exploration Phase:
1. Initial Data Assessment:
o The dataset was first examined for completeness and quality
o Statistical summaries were generated to understand feature distributions
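A representative sketch of this assessment step with pandas is shown below; the scikit-learn breast cancer data stands in for the project dataset, so the exact columns are an assumption:
import pandas as pd
from sklearn.datasets import load_breast_cancer

df = load_breast_cancer(as_frame=True).frame   # illustrative stand-in for the project dataset

df.info()                      # column types and non-null counts (completeness check)
print(df.isnull().sum())       # missing values per feature
print(df.describe())           # statistical summaries of feature distributions
print(df['target'].value_counts(normalize=True))  # class balance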
Figure 6.2
CHAPTER 7
TOOLS AND TECHNOLOGIES
7.1 Development Environment and Infrastructure
The development environment for this project was carefully architected to ensure optimal performance,
reproducibility, and collaboration capabilities. Google Colaboratory served as the primary development
platform, offering several crucial advantages for machine learning implementation:
Development Infrastructure: The cloud-based infrastructure provided by Google Colaboratory offered
several key benefits:
Cloud Computing Resources:
Access to high-performance GPU acceleration for model training
12GB RAM allocation enabling efficient data processing
68GB storage capacity for dataset management
Automatic session management and resource allocation
Python Environment Management: The development environment maintained strict version control of
dependencies:
Python 3.8 as the base interpreter
Pip package manager for dependency management
Virtual environment isolation to prevent package conflicts
Automatic package installation and compatibility resolution
Interactive Development Features: The notebook-based development environment facilitated:
Real-time code execution and visualization
In-line markdown documentation
Interactive debugging capabilities
Dynamic code cell management
Integrated error tracking and resolution
Figure 7.1
CHAPTER 8
BEST PRACTICES AND INDUSTRY STANDARDS
8.1 Code Quality and Architecture
Our implementation adhered to stringent coding standards and architectural principles designed to
ensure long-term maintainability and scalability. The codebase follows a modular architecture, with
clear separation of concerns between different functional areas. This organization enhances code
readability while facilitating future modifications and improvements. Each module operates
independently while maintaining clear interfaces with other components, creating a robust and flexible
system.
Documentation plays a central role in our implementation, with comprehensive coverage of both code-
level details and broader architectural concepts. Each function and class is accompanied by detailed
documentation explaining its purpose, parameters, and return values. Implementation notes capture the
rationale behind key decisions, while usage examples demonstrate practical applications of different
components. This thorough documentation ensures that future developers can understand and build
upon our work effectively.
Model transparency forms a key component of our ethical framework. Through careful documentation
and visualization of decision boundaries, we maintain clear understanding of how the model reaches
its conclusions. This transparency extends to confidence scoring and uncertainty quantification,
ensuring users understand the reliability of model predictions in different scenarios.
Figure 8.1
CHAPTER 9
LEARNING OUTCOMES
9.1 Technical Skills Acquired
Throughout the internship, we gained a comprehensive understanding of the AI and ML ecosystem,
focusing on both theoretical foundations and practical applications. Our work on cancer prediction, in
particular, allowed us to hone specific technical skills essential in the field of data science and machine
learning:
Python Proficiency: As the primary language for the project, Python played a central role in our work.
We deepened our understanding of Python’s capabilities, especially its use in handling large datasets,
data preprocessing, and implementing machine learning algorithms. Additionally, working with libraries
like NumPy, Pandas, and Matplotlib enhanced our data manipulation and visualization skills.
Machine Learning Algorithms: We became proficient in key machine learning models such as linear
regression, logistic regression, decision trees, random forests, and support vector machines (SVMs).
Understanding how to apply these models to real-world datasets, including selecting the appropriate
algorithms for specific problems, was a vital part of our learning.
Deep Learning Fundamentals: We were introduced to neural networks and more complex architectures
like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). The cancer
prediction project, although not purely focused on deep learning, provided insight into how neural
networks could be applied for advanced predictive models in healthcare.
Data Science Workflow: We learned the importance of following a structured data science workflow,
from data collection and preprocessing to model training, evaluation, and fine-tuning. Mastering
techniques such as data cleaning, feature engineering, and hyperparameter tuning proved invaluable in
improving model performance.
Problem Solving: Tackling technical challenges such as algorithm selection, data preprocessing issues,
and model evaluation required a methodical approach to problem-solving. We learned how to break down
complex tasks into manageable parts and how to troubleshoot effectively when encountering roadblocks.
CHAPTER 10
CHALLENGES AND SOLUTIONS
10.1 Technical Challenges
During the course of our internship, we encountered several technical challenges, each of which
provided valuable learning opportunities. These challenges pushed us to deepen our understanding of
machine learning concepts and develop better problem-solving strategies.
Data Collection and Cleaning: One of the first major hurdles was dealing with the raw data for our cancer
prediction project. The dataset we worked with had missing values, inconsistencies, and noisy data. This
required us to invest a significant amount of time in data cleaning and preprocessing. We used Pandas
extensively for handling missing values, filling gaps using statistical imputation, and removing duplicate
entries. Additionally, feature scaling was essential to normalize data and improve the model’s
performance.
Feature Engineering: Transforming raw data into useful features was another challenge. We had to
understand the domain of cancer diagnosis and treatment, which required careful selection of features that
were relevant for prediction. The complexity of healthcare data often made it difficult to decide which
features to include or exclude. We approached this problem using both domain research and exploratory
data analysis (EDA) techniques. Tools like Seaborn and Matplotlib were invaluable for visualizing feature
correlations, which ultimately helped improve the accuracy of our model.
Model Selection: Determining which machine learning model would best suit the dataset was a significant
decision. Early in the process, we tried various models, including logistic regression, decision trees, and
support vector machines (SVM). However, some models suffered from overfitting or poor generalization
to new data. To address this, we incorporated cross-validation techniques and grid search to find the
optimal hyperparameters. Random Forests and logistic regression emerged as the best performers for our
specific dataset after extensive experimentation.
Model Performance and Evaluation: Another technical challenge was ensuring that the chosen model
did not just perform well on training data but also generalized well to unseen data. This required careful
evaluation using metrics such as accuracy, precision, recall, F1-score, and the ROC-AUC curve. We ran
into issues where the model’s accuracy appeared high but recall was low, indicating it might miss
predicting positive cancer cases. By iterating on feature selection, adjusting class imbalances, and tuning
hyperparameters, we managed to optimize performance to a satisfactory level.
Collaboration and Peer Learning: When faced with issues in model implementation, we turned to our
peers and online forums for help. Discussing problems with teammates or consulting Stack Overflow
helped us find solutions more quickly. We also participated in group discussions facilitated by YBI, which
allowed us to share insights and learn from others’ experiences.
Iterative Experimentation: Solving machine learning problems often requires trying different
approaches, and we embraced an iterative method. When initial models failed to perform well, we iterated
on our feature selection, refined the preprocessing pipeline, and experimented with different algorithms.
For example, when our decision tree models overfit, we switched to random forests, which gave us better
generalization by averaging results across multiple trees.
Documentation and Testing: We found that maintaining thorough documentation of our code and
processes was crucial. By keeping detailed records of each experiment and the corresponding results, we
could better understand what worked and what didn’t. Additionally, writing unit tests for individual
functions in our codebase helped us identify errors early, preventing larger problems later in the project.
Breaking Down Complex Problems: Complex issues such as hyperparameter tuning or handling
imbalanced datasets could sometimes feel overwhelming. To avoid confusion, we broke these challenges
down into smaller, manageable tasks. For example, when dealing with class imbalance, we started by
testing different resampling techniques like SMOTE (Synthetic Minority Over-sampling Technique) and
undersampling, before settling on a solution that worked best for our dataset.
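A minimal sketch of the SMOTE resampling step using the imbalanced-learn library (assuming that library is the implementation used; the synthetic data below is illustrative):
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Illustrative imbalanced data: roughly 10% positive class
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
print('before:', Counter(y))

# Oversample the minority class with synthetic examples (applied to training data only)
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print('after: ', Counter(y_res))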
We also found that complex models required extensive tuning, while simple models with default parameters
sometimes performed just as well.
Figure 10.1
CHAPTER 11
TOOLS AND TECHNOLOGIES USED
11.1 Overview of Tools and Technologies
Throughout the course of our AI/ML-based cancer prediction internship, we leveraged a variety of tools
and technologies to complete the project successfully. These tools were essential for different phases
of the project, including data handling, model development, evaluation, and visualization. Below is a
breakdown of the most important tools and technologies we used:
11.2 Programming Languages
Python: Python was the primary programming language we used for the entire project. Its extensive
libraries, readability, and ease of use make it the go-to language for most data science and machine
learning tasks. Python's rich ecosystem of machine learning libraries like Scikit-learn, TensorFlow, and
Pandas provided us with powerful tools to analyze data, build models, and fine-tune them.
11.3 Machine Learning and Data Science Libraries
Scikit-learn: Scikit-learn was the cornerstone of our machine learning workflow. This library provided us
with a wide range of tools for model selection, feature engineering, preprocessing, and model evaluation.
We used Scikit-learn for training various models, including logistic regression and random forests. The
ease of using functions such as train_test_split, GridSearchCV, and classification_report made it an
invaluable part of the project.
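For example, train_test_split and classification_report typically combine as in the sketch below (an illustrative workflow rather than the project's exact code):
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    stratify=y, random_state=42)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)

# Per-class precision, recall and F1 on the held-out test split
print(classification_report(y_test, model.predict(X_test)))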
Pandas: For data manipulation and preprocessing, Pandas was essential. Its ability to handle data in the
form of DataFrames allowed us to quickly clean, manipulate, and analyze the cancer dataset. With Pandas,
we could easily handle missing values, drop unnecessary columns, and extract relevant features.
Operations such as filtering rows, applying group-by functions, and merging datasets were crucial for
preparing the dataset for machine learning.
NumPy: NumPy, which forms the backbone of many machine learning and data science libraries, was
used extensively for numerical operations. It helped us manage arrays and perform mathematical
operations that are foundational in machine learning algorithms. Matrix manipulations, element-wise
operations, and data type conversions were handled seamlessly using NumPy.
Matplotlib and Seaborn: Visualizing data was an integral part of understanding the dataset and
communicating insights. We used Matplotlib and Seaborn to create a variety of plots, such as histograms,
scatter plots, heatmaps, and box plots, to identify trends, correlations, and outliers. These libraries enabled
us to understand feature distributions and relationships between variables, helping us in feature
engineering and model selection.
TensorFlow: We experimented with TensorFlow for building deep learning models, although it was not
the final approach for the project. TensorFlow’s versatility in constructing neural networks made it a useful
tool for testing more complex models, especially when considering future enhancements that could
involve deep learning techniques. Its integration with Keras made it relatively simple to create, train, and
evaluate neural networks.
Keras: Built on top of TensorFlow, Keras was used for building prototypes of neural networks due to its
user-friendly API. While traditional machine learning models proved sufficient for our cancer prediction
project, Keras helped us gain insights into deep learning model construction and how they might be applied
to healthcare problems.
Jupyter Notebooks: Throughout the project, we used Jupyter Notebooks for coding, documentation,
and visualization. The notebook interface allowed us to test individual blocks of code quickly, iterate
on them, and document our findings in real-time. It also made it easy to share code and insights with
teammates, making it an essential tool for collaborative work.
Google Colab: Google Colab provided us with cloud-based access to GPUs and TPUs, which were
especially useful when experimenting with deep learning models. Colab’s seamless integration with
Google Drive also allowed us to collaborate and share resources with ease. Additionally, its free GPU
support made it an affordable option for running complex computations on large datasets.
GitHub: Version control was critical throughout the project, and we relied heavily on GitHub for code
management and collaboration. GitHub’s repository management and branching features allowed us to
work on different parts of the project simultaneously, track changes, and maintain a history of our
codebase. GitHub Issues and Pull Requests helped us manage tasks and review code before merging it
into the main branch.
SHAP (SHapley Additive exPlanations): Towards the end of the project, we explored SHAP for model
interpretability. In healthcare applications, it’s critical to understand why a model makes certain
predictions, and SHAP values allowed us to visualize the contribution of each feature in a model’s
decision-making process. SHAP is a method used to interpret the output
of machine learning models by explaining the contribution of each feature to the model's predictions. It is
based on Shapley values from cooperative game theory, which ensures a fair distribution of contributions
among the features. SHAP provides both global and local interpretability: globally, it helps understand the
overall impact of features on model decisions, while locally, it explains individual predictions by
quantifying how much each feature influenced the output. This makes SHAP particularly valuable for
complex models like XGBoost, neural networks, or random forests, where interpreting the decision-
making process can be challenging. The method ensures transparency, helping users trust and validate
machine learning models, which is especially important in critical applications like finance, healthcare,
and legal systems.
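A hedged sketch of how SHAP values might be computed for a tree-based model (assuming the shap package is installed; the model, dataset, and plotting call are illustrative, and return formats can differ between SHAP versions):
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(random_state=42).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global summary of feature impact (the shape of shap_values depends on the SHAP version)
shap.summary_plot(shap_values, X)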
CHAPTER 12
CHALLENGES FACED AND LESSONS LEARNED
12.1 Introduction
Throughout the duration of our internship project on AI/ML-based cancer prediction, we encountered
numerous challenges that tested both our technical and analytical abilities. Overcoming these obstacles
not only deepened our understanding of machine learning and data science but also honed our problem-
solving skills. This chapter outlines the key challenges we faced during the project and the valuable
lessons we learned from them.
12.2 Data-Related Challenges
12.2.1 Data Availability and Quality
One of the first major challenges we encountered was related to the availability and quality of the
dataset. In healthcare applications, obtaining a comprehensive, accurate dataset is critical for model
performance. Initially, we struggled with the limited availability of public cancer-related datasets.
Additionally, the dataset we eventually selected contained a significant amount of missing values,
inconsistencies, and noise, which required extensive preprocessing.
Lesson Learned: We learned that the quality of data is paramount in machine learning projects. No
matter how advanced the model, its predictions are only as good as the data it is trained on. This
experience taught us the importance of robust data cleaning processes and how to handle missing or
incorrect data using techniques like imputation, filtering, and careful preprocessing.
12.2.2 Imbalanced Data
Another data-related challenge was the imbalance between cancer-positive and cancer-negative cases
in the dataset. Since the number of cancer-positive cases was significantly smaller than the cancer-
negative ones, it led to biased model performance that favored predicting the majority class.
Lesson Learned: We addressed this issue by applying techniques like SMOTE (Synthetic Minority
Over-sampling Technique) to balance the dataset. This challenge highlighted the importance of
addressing class imbalances, especially in healthcare applications, where false negatives can have
serious consequences. The experience emphasized that model evaluation metrics like accuracy should
not be the sole determinant; other metrics like precision, recall, and the F1 score are equally important
in imbalanced datasets.
Lesson Learned: This challenge taught us the importance of rigorous model evaluation and the ethical
considerations of using AI in healthcare. It is essential to not only strive for high accuracy but also
ensure that the model’s limitations are clearly understood and communicated. Transparency, fairness,
and explainability in AI models are crucial, especially in sensitive applications like cancer prediction.
APPENDIX
The project documentation includes explanatory notes that clarify the purpose and impact of specific implementation choices. The
documentation also includes troubleshooting guides and common issues that might be encountered
during deployment or maintenance.
Performance metrics and evaluation procedures are documented with detailed explanations of their
significance and interpretation. This includes comprehensive coverage of the confusion matrix,
accuracy scores, precision, recall, and F1-score metrics. The documentation provides context for
understanding these metrics in the specific context of cancer prediction.
Appendix C: Certificates and Achievements
Throughout the course of this internship at YBI Foundation, several key milestones and achievements
were documented. The primary certificate of completion demonstrates proficiency in machine learning
fundamentals and practical implementation skills. This certificate validates the successful completion
of the core curriculum and project requirements.
The project implementation itself received recognition for its thorough approach to medical data
analysis and careful consideration of ethical implications in healthcare AI. The documentation of these
achievements includes detailed descriptions of the specific skills and competencies demonstrated
throughout the project development process.
Technical competencies certified through this internship include proficiency in Python programming,
data analysis using pandas and numpy, machine learning implementation using scikit-learn, and version
control using Git and GitHub. The certification process validated practical experience in handling real-
world datasets and implementing machine learning solutions for critical healthcare applications.
Appendix D: Weekly Progress Reports
The development of the cancer prediction model progressed through several distinct phases, each
documented in weekly progress reports. These reports provide a chronological record of the project's
evolution and the learning journey throughout the internship.
Week 1 focused on foundational concepts and environment setup. During this period, the necessary
development tools were installed and configured, including Python, required libraries, and the Google
Colaboratory environment. The week included intensive study of machine learning fundamentals and
their applications in healthcare.
Week 2 involved deep diving into data analysis and preprocessing techniques. The cancer dataset was
thoroughly examined, with particular attention paid to understanding the significance of each feature
and its relationship to the target variable. This period included extensive exploratory data analysis and
the development of initial data preprocessing strategies.
Week 3 concentrated on model implementation and training. The logistic regression model was
developed, trained, and initially evaluated. This week involved intensive coding sessions, debugging,
and optimization of the model's performance. The implementation was iteratively refined based on
performance metrics and validation results.
The final week focused on model evaluation, documentation, and project finalization. Comprehensive
testing was performed to ensure the model's reliability and accuracy. The documentation was
completed, including detailed explanations of the implementation and its results.
CONCLUSION
Throughout this internship journey with YBI Foundation, I have gained invaluable insights into the
rapidly evolving field of Artificial Intelligence and Machine Learning. The comprehensive program
structure, starting from the fundamentals of AI/ML to hands-on experience with Google Colab, has
provided me with a solid foundation in this domain. The course not only emphasized theoretical
knowledge but also its practical applications in real-world scenarios.
The cancer prediction project I developed served as a practical demonstration of machine learning
concepts, allowing me to understand the complete lifecycle of an ML project - from data preprocessing
to model deployment. This hands-on experience has enhanced my problem-solving abilities and
technical skills significantly. The project helped me grasp the importance of data analysis, feature
selection, and model evaluation in creating effective machine learning solutions.
This internship has reinforced my understanding of AI's transformative potential across various
industries. The exposure to industry-standard tools and methodologies has prepared me for future
challenges in the field. Moving forward, I am confident that the knowledge and skills acquired during
this internship will prove instrumental in my professional growth. The experience has not only
enhanced my technical capabilities but also developed my analytical thinking and project management
skills, making me better equipped for future opportunities in the AI/ML domain.
One of the most significant aspects of this internship was learning to navigate and utilize Google Colab
effectively. This cloud-based platform has revolutionized the way we approach machine learning
projects, offering powerful computational resources and a collaborative environment. The hands-on
experience with Colab has made me proficient in implementing machine learning algorithms and
handling large datasets efficiently, skills that are crucial in today's data-driven world.
The weekly progression of the course was well-structured, allowing for gradual skill development and
concept mastery. The initial weeks focused on building a strong theoretical foundation, which proved
essential when tackling more complex topics later in the program. The interactive learning environment
fostered engagement and encouraged practical application of concepts, making the learning process
both effective and enjoyable.
The feedback received during project development was constructive and helped improve both the
technical aspects of the work and my understanding of best practices in the field.
Looking ahead, this internship has laid a strong foundation for my career in AI and machine learning.
The combination of theoretical knowledge and practical experience has prepared me to take on more
challenging projects in the future. I am particularly excited about exploring advanced machine learning
concepts and contributing to innovative solutions that can make a meaningful impact in various
domains.