
An

Internship Report
On

Artificial Intelligence Virtual Internship


Submitted To
Bhilai Institute of Technology, Durg
in partial fulfillment of requirement for the award of degree
of
Bachelor of Technology
in

Information Technology

By

Abhishek Dewangan

(300103322301)

BHILAI INSTITUTE OF TECHNOLOGY DURG,


CHHATTISGARH (INDIA)
DEPARTMENT OF INFORMATION TECHNOLOGY

Session: 2021-2025
DECLARATION BY THE CANDIDATE

I hereby declare that the Industrial Internship report entitled “Cognizant- Artificial
Intelligence” submitted by me to Bhilai Institute of Technology, Durg in partial
fulfilment of the requirement for the award of the degree of Bachelor of Technology in
Information Technology Engineering is a record of bonafide industrial training
undertaken by me under the supervision of Smith, Cognizant. I further declare that the
work reported in this report has not been submitted and will not be submitted, either
in part or in full, for the award of any other degree or diploma in this institute or any
other institute or university.

Signature of the student:


Name of student: Abhishek Dewangan
University roll no: 300103322301
Enrollment no.: BH8068
ACKNOWLEDGEMENT

I would like to express my sincere gratitude and appreciation to everyone who has
helped me during my internship. First and foremost, I would like to thank Smith for
providing me with the opportunity to intern at Cognizant. Their support, guidance,
and encouragement have been instrumental in my learning and growth.

I would also like to thank my colleagues, who have welcomed me with open arms and
have been incredibly supportive throughout my internship. Their willingness to share
their knowledge and expertise has been invaluable.

Furthermore, I would like to express my appreciation to Smith, who has been my mentor
during my internship. Their feedback and constructive criticism have helped me improve
my skills and work more efficiently.

Finally, I would like to thank my family and friends for their unwavering support and
encouragement throughout my internship. Their constant motivation has helped me
stay focused and achieve my goals.

Name of student: Abhishek Dewangan

University roll no: 300103322301

Enrollment no.: BH8068


ABSTRACT

The AI Cognizant Virtual Internship plays a crucial role in bridging the gap between
academic knowledge and practical industry application, particularly for students and
professionals aspiring to enter the field of Artificial Intelligence. Traditional education
often emphasizes theoretical concepts, which, while essential, do not always equip
individuals with the hands-on skills needed in the industry. This internship addresses this
gap by providing participants with real-world projects that allow them to apply AI
techniques and tools to solve practical problems, deepening their understanding and
enhancing their practical skill set.

A significant benefit of the internship is the opportunity for participants to build a portfolio
of projects that demonstrates their ability to implement AI concepts in real-world
scenarios. This portfolio serves as a tangible asset in the competitive job market, setting
participants apart from others who may lack practical experience. Employers value this
hands-on experience, as it shows that candidates not only possess theoretical knowledge
but also the capability to apply it effectively in professional settings.

In addition to technical skills, the internship helps participants develop essential
competencies such as project management, teamwork, and communication. By engaging
with real-world challenges, participants gain insights into the practical application of AI
across various industries, while also honing their ability to collaborate and communicate
effectively. This comprehensive experience makes them well-rounded candidates who are
better prepared to meet the demands of the AI industry, significantly boosting their
competitiveness in the job market.
TABLE OF CONTENTS

1. About the organization
2. Application of the gained knowledge in/during the training
3. Comparison of competency levels before and after the training
4. Learnt during training
5. Objectives
6. Technologies used
7. Purpose and importance
8. Area and scope
References
Conclusion

ABOUT THE ORGANIZATION

Cognizant is a prominent global IT services and consulting company, renowned for its
ability to help organizations navigate and thrive in the digital era. Founded in 1994 as an
in-house technology unit of Dun & Bradstreet, Cognizant has since evolved into a Fortune
500 company, listed among the world’s most admired and fastest-growing firms. The
company’s rapid ascent in the tech industry can be attributed to its strong focus on
innovation, customer-centricity, and its ability to leverage emerging technologies to deliver
measurable business outcomes.

With its headquarters in Teaneck, New Jersey, Cognizant operates in over 40 countries,
employing more than 300,000 professionals worldwide. The company’s global presence
and vast workforce enable it to serve a diverse client base that spans various industries,
including healthcare, financial services, insurance, manufacturing, retail, and
communications. This extensive industry expertise allows Cognizant to offer tailored
solutions that address the specific challenges and opportunities within each sector.

Cognizant’s service offerings are comprehensive, encompassing digital strategy, technology
consulting, IT infrastructure, and business process services. The company is particularly
well-known for its digital transformation initiatives, where it partners with clients to
modernize their operations, enhance their customer experiences, and drive innovation.
This is achieved through a deep understanding of the client’s business combined with
expertise in key technologies such as artificial intelligence (AI), machine learning, cloud
computing, big data analytics, Internet of Things (IoT), and cybersecurity.

One of Cognizant’s core strengths lies in its ability to integrate digital solutions with
traditional IT systems, creating seamless and scalable platforms that enable businesses to
operate more efficiently and competitively. The company’s focus on digital transformation
is not just about adopting new technologies but also about reimagining business models,
optimizing processes, and fostering a culture of continuous innovation.
Cognizant’s client-centric approach is a cornerstone of its business philosophy. The
company emphasizes close collaboration with clients, aiming to understand their unique
challenges and deliver solutions that are not only effective but also aligned with their
strategic goals. This commitment to client success has earned Cognizant a reputation for
reliability and excellence, leading to long-term partnerships with many of the world’s
leading organizations.

Beyond its commercial success, Cognizant is also dedicated to corporate social
responsibility (CSR) and sustainability. The company’s CSR initiatives focus on areas such
as education, workforce development, environmental sustainability, and community
engagement. Through its Cognizant Foundation and other philanthropic efforts, the
company supports numerous programs aimed at promoting digital literacy, STEM
education, and social equity. Cognizant’s environmental initiatives are designed to reduce
the company’s carbon footprint and promote sustainable business practices across its
global operations.

Cognizant also places a strong emphasis on ethical business conduct and governance. The
company adheres to strict ethical standards in all its operations and is committed to
maintaining transparency and accountability in its business practices. This ethical
framework not only strengthens Cognizant’s corporate reputation but also ensures that it
operates in a manner that is responsible and respectful of all its stakeholders.
APPLICATION OF THE GAINED KNOWLEDGE IN/DURING THE
TRAINING

● One of the foundational areas covered is Machine Learning Algorithms, where
participants learn to understand and implement techniques such as supervised learning,
unsupervised learning, regression, and classification. These algorithms form the
backbone of AI, enabling systems to learn from data, make predictions, and improve
over time. Participants gain hands-on experience in applying these algorithms to
different types of data, understanding how to select the appropriate method based
on the problem at hand (a short illustrative sketch follows this list).

● Data Preprocessing is another critical area of focus during the training.
Participants learn techniques for cleaning, normalizing, and transforming raw data,
which is a crucial step before feeding it into machine learning models. This involves
handling missing values, encoding categorical variables, scaling numerical features,
and more. Effective data preprocessing is vital because the quality of data directly
impacts the performance of AI models. By mastering these techniques, participants
ensure that the data used for modeling is in its most optimal form.

● In addition to data preparation, Model Evaluation is a key component of the
training. Participants learn various methods to assess the performance of machine
learning models, using metrics such as accuracy, precision, recall, and F1-score.
Understanding these metrics allows participants to critically evaluate the
effectiveness of their models and make informed decisions about improvements.
This evaluation process is crucial for ensuring that the models not only perform
well on training data but also generalize effectively to new, unseen data.
● Python Programming is also a significant focus of the training, as Python is one of
the most widely used programming languages in AI. Participants learn to write
efficient and optimized code, utilizing powerful libraries such as NumPy, pandas,
and scikit-learn. These libraries are essential for tasks ranging from numerical
computations and data manipulation to implementing machine learning algorithms.
The training emphasizes best practices in Python programming, ensuring that
participants can develop scalable and maintainable AI applications.

● Finally, Project Management skills are integrated into the training, teaching
participants how to develop and manage AI projects from inception to deployment.
This includes planning the project timeline, managing resources, collaborating with
team members, and adhering to quality standards. Effective project management is
crucial in ensuring that AI projects are completed on time, within scope, and meet
the desired objectives. By gaining these skills, participants are better prepared to
take on leadership roles in AI projects, ensuring their successful execution in a real-
world setting.
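
A minimal, self-contained sketch of how these pieces fit together, using scikit-learn (the tiny
toy DataFrame and its column names are invented purely for illustration and are not part of
the internship material):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical dataset: two numeric features and a binary target column 'label'
data = pd.DataFrame({
    'feature_1': [1.0, 2.5, 3.1, 4.7, 5.2, 6.8, 7.4, 8.9],
    'feature_2': [10, 20, 15, 30, 25, 40, 35, 50],
    'label':     [0, 0, 0, 0, 1, 1, 1, 1],
})
X = data.drop(columns=['label'])
y = data['label']

# Data preprocessing: split, then scale the features so they contribute comparably
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.75, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Supervised learning: fit a simple classifier
model = LogisticRegression()
model.fit(X_train, y_train)

# Model evaluation: the metrics discussed above
y_pred = model.predict(X_test)
print('accuracy :', accuracy_score(y_test, y_pred))
print('precision:', precision_score(y_test, y_pred, zero_division=0))
print('recall   :', recall_score(y_test, y_pred, zero_division=0))
print('f1-score :', f1_score(y_test, y_pred, zero_division=0))

The same split-scale-fit-evaluate pattern carries over to the larger models and datasets used
later in the training.
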
COMPARISON OF COMPETENCY LEVELS BEFORE AND AFTER THE
TRAINING

Comparing competency levels before and after the data analyst internship program can
effectively highlight the growth and development achieved during the training. Below are some
key areas where this comparison can be made:

● Data Analyst Knowledge: Prior to the training, my understanding of data analysis was
primarily confined to the basic concepts covered in the academic curriculum. However,
after completing the program, I have gained a much deeper understanding of Python, its
functionalities, and how to effectively implement and develop applications using the
platform.

● Technical Skills: Before the training, my experience with programming languages and
tools used in data analysis, such as Python, Power BI, Tableau, and R, was limited. After
the internship, I have become proficient in these technologies and have acquired hands-
on experience in developing data analysis applications.

● Problem-Solving Skills: The internship involved tackling real-world problems and
challenges commonly encountered in data analysis. Before the training, my experience
with solving complex problems was limited. Now, I am much more skilled at
problem-solving and have developed a stronger understanding of how to approach and
resolve complex issues.

● Collaboration and Communication Skills: In real-world data analysis projects,
collaboration and communication skills are crucial. Before the training, my experience in
working within a team and effectively communicating was limited. After completing the
internship, I am now confident in my ability to collaborate with others and communicate
ideas and solutions effectively.
LEARNT DURING TRAINING

Exploratory Data Analysis

import pandas as pd
import seaborn as sns

# Load the sample sales data and drop the redundant index column
df = pd.read_csv('/content/sample_sales_data (1).csv')
df.drop(columns=['Unnamed: 0'], inplace=True)
df.head()

The dataset contains the following columns:

Index(['transaction_id', 'timestamp', 'product_id', 'category',
       'customer_type', 'unit_price', 'quantity', 'total', 'payment_type'],
      dtype='object')

Distributions and Plots

def plot_continuous_distribution(data: pd.DataFrame = None, column: str = None, height: int = 8):
    # Histogram with a KDE overlay for a single numeric column
    _ = sns.displot(data, x=column, kde=True, height=height,
                    aspect=height/5).set(title=f'Distribution of {column}')

def get_unique_values(data, column):
    # Report how many distinct values a column has, and how often each occurs
    num_unique_values = len(data[column].unique())
    value_counts = data[column].value_counts()
    print(f"Column: {column} has {num_unique_values} unique values\n")
    print(value_counts)

def plot_categorical_distribution(data: pd.DataFrame = None, column: str = None,
                                  height: int = 8, aspect: int = 2):
    # Bar chart of category frequencies
    _ = sns.catplot(data=data, x=column, kind='count', height=height,
                    aspect=aspect).set(title=f'Distribution of {column}')

def correlation_plot(data: pd.DataFrame = None):
    # Use the DataFrame passed in (not the global df) and return the styled matrix
    corr = data.corr(numeric_only=True)
    return corr.style.background_gradient(cmap='coolwarm')

plot_continuous_distribution(df, 'unit_price')
plot_continuous_distribution(df, 'quantity')
plot_continuous_distribution(df, 'total')

Handling Missing Data


Start by examining your dataset for any missing values. It's essential to identify where data might
be missing and decide how to handle it. Depending on the situation, you might drop rows or
columns with missing data or fill in the gaps with appropriate values, such as the mean or median
for numerical data or the mode for categorical data.
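
As a brief illustration on the sales DataFrame loaded earlier (assuming, purely for the
example, that some of these columns contain gaps):

# Count missing values per column
print(df.isnull().sum())

# Option 1: drop rows that are missing critical fields
df = df.dropna(subset=['unit_price', 'quantity'])

# Option 2: fill numeric gaps with the median, categorical gaps with the mode
df['total'] = df['total'].fillna(df['total'].median())
df['payment_type'] = df['payment_type'].fillna(df['payment_type'].mode()[0])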

Outlier Detection
Outliers are data points that deviate significantly from other observations and can potentially
distort your analysis. You can identify outliers by visualizing the data or calculating statistical
measures. Once detected, you can decide whether to remove them, transform them, or keep them
based on their relevance to your analysis.
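
For example, a simple interquartile-range (IQR) rule, shown here as a sketch on the
unit_price column, flags points that sit far outside the typical range:

# Flag values more than 1.5 * IQR outside the middle 50% as potential outliers
q1 = df['unit_price'].quantile(0.25)
q3 = df['unit_price'].quantile(0.75)
iqr = q3 - q1
outliers = df[(df['unit_price'] < q1 - 1.5 * iqr) | (df['unit_price'] > q3 + 1.5 * iqr)]
print(f"{len(outliers)} potential outliers in unit_price")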

Feature Engineering
This involves creating new features or modifying existing ones to better capture the underlying
patterns in your data. For instance, you might derive new features such as the day of the week or
month from a timestamp or create interaction terms between existing variables to enhance
predictive models.
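
A small sketch of this idea on the sales data (it assumes the timestamp column is parsed to
datetime first; revenue_per_unit is an illustrative derived feature, not one from the
internship material):

df['timestamp'] = pd.to_datetime(df['timestamp'])
df['day_of_week'] = df['timestamp'].dt.dayofweek   # 0 = Monday ... 6 = Sunday
df['month'] = df['timestamp'].dt.month

# An illustrative derived/interaction feature
df['revenue_per_unit'] = df['total'] / df['quantity']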

Categorical Data Analysis


For categorical variables, it's essential to explore their distribution and relationship with other
variables. Visualizing how different categories are distributed or how they relate to continuous
variables can reveal important insights. Additionally, encoding categorical data into numerical form
can be a critical step if you plan to use machine learning models.
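
Encoding the categorical columns into numerical form could, for instance, use one-hot
encoding (a sketch with pandas):

# One-hot encode the categorical columns before feeding the data to a model
encoded = pd.get_dummies(df, columns=['category', 'customer_type', 'payment_type'])
print(encoded.columns)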
Correlation Analysis
Exploring correlations between numerical variables helps to understand relationships within the
data. A correlation matrix, often visualized with a heatmap, can quickly show which variables are
positively or negatively correlated. This can be useful for feature selection or identifying potential
multicollinearity in regression models.
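
A minimal version of the heatmap described here, restricted to the numeric columns
(numeric_only assumes a reasonably recent pandas version):

import matplotlib.pyplot as plt

corr = df.corr(numeric_only=True)             # correlation matrix of the numeric columns
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.show()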

Pairwise Plotting
Pairwise plotting (or pair plots) allows you to visualize relationships between multiple pairs of
variables at once. This is particularly useful when exploring interactions and correlations between
variables, helping you spot trends, clusters, and potential outliers in the data.
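
For instance, seaborn builds this grid in a single call (a sketch on the numeric sales
columns):

import seaborn as sns

sns.pairplot(df[['unit_price', 'quantity', 'total']])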

Data Normalization and Scaling


If your data includes features with varying scales, normalization or scaling might be necessary,
especially before using distance-based algorithms like K-Nearest Neighbors (KNN) or clustering.
Normalization ensures that each feature contributes equally to the analysis, improving model
performance and interpretability.
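
As a sketch, scikit-learn's scalers handle this in a few lines (MinMaxScaler rescales to
[0, 1]; StandardScaler gives zero mean and unit variance):

from sklearn.preprocessing import MinMaxScaler, StandardScaler

numeric_cols = ['unit_price', 'quantity', 'total']
df[numeric_cols] = MinMaxScaler().fit_transform(df[numeric_cols])
# alternatively: df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])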

Summary Statistics
Generating summary statistics gives a quick overview of the central tendency, variability, and
distribution of your data. This can include measures like mean, median, standard deviation, and
quartiles, providing insights into the overall structure and behavior of your dataset.
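
In pandas this amounts to a single call:

# Count, mean, std, min, quartiles and max for every numeric column
print(df.describe())

# Include categorical columns as well (counts, unique values, most frequent value)
print(df.describe(include='all'))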

By incorporating these additional steps, your EDA will be more comprehensive, leading to
deeper insights into the data and a stronger starting point for modeling.
# method to build the plot
from bokeh.plotting import figure  # assumed import: Bokeh provides the figure used below

def get_plot(stock_1, stock_2, date, value):
    stock_1 = dataset[dataset['symbol'] == stock_1]
    stock_2 = dataset[dataset['symbol'] == stock_2]

    stock_1_name = stock_1['symbol'].unique()[0]
    stock_1_range = stock_1[(stock_1['short_date'] >= date[0]) &
                            (stock_1['short_date'] <= date[1])]
    stock_2_name = stock_2['symbol'].unique()[0]
    stock_2_range = stock_2[(stock_2['short_date'] >= date[0]) &
                            (stock_2['short_date'] <= date[1])]

    plot = figure(title='Stock prices',
                  x_axis_label='Date',
                  x_range=stock_1_range['short_date'],
                  y_axis_label='Price in $USD',
                  plot_width=800, plot_height=500)
    plot.xaxis.major_label_orientation = 1
    plot.grid.grid_line_alpha = 0.3

    if value == 'open-close':
        add_candle_plot(plot, stock_1_name, stock_1_range, 'blue')
        add_candle_plot(plot, stock_2_name, stock_2_range, 'orange')

    if value == 'volume':
        plot.line(stock_1_range['short_date'], stock_1_range['volume'],
                  legend_label=stock_1_name, muted_alpha=0.2)
        plot.line(stock_2_range['short_date'], stock_2_range['volume'],
                  legend_label=stock_2_name, muted_alpha=0.2,
                  line_color='orange')

    plot.legend.click_policy = "mute"
    return plot
def add_candle_plot(plot, stock_name, stock_range, color):
    inc_1 = stock_range.close > stock_range.open
    dec_1 = stock_range.open > stock_range.close
    w = 0.5

    plot.segment(stock_range['short_date'], stock_range['high'],
                 stock_range['short_date'], stock_range['low'],
                 color="grey")

    plot.vbar(stock_range['short_date'][inc_1], w,
              stock_range['high'][inc_1], stock_range['close'][inc_1],
              fill_color="green", line_color="black",
              legend_label=('Mean price of ' + stock_name), muted_alpha=0.2)

    plot.vbar(stock_range['short_date'][dec_1], w,
              stock_range['high'][dec_1], stock_range['close'][dec_1],
              fill_color="red", line_color="black",
              legend_label=('Mean price of ' + stock_name), muted_alpha=0.2)

    stock_mean_val = stock_range[['high', 'low']].mean(axis=1)
    plot.line(stock_range['short_date'], stock_mean_val,
              legend_label=('Mean price of ' + stock_name),
              muted_alpha=0.2, line_color=color, alpha=0.5)

plot_categorical_distribution(df, 'category')

plot_categorical_distribution(df, 'customer_type')

plot_categorical_distribution(df, 'payment_type')
get_plot Method
● Purpose: This method generates a stock price plot for two selected stocks within a specified
date range, allowing for visualization of either the open-close price movement or the volume
of trades.
● Parameters:
1. stock_1, stock_2: Symbols of the two stocks to be compared.

2. date: A tuple representing the start and end dates for the data range.
3. value: Determines what aspect of the stock data to visualize (either 'open-close' for
candlestick charts or 'volume' for trade volume).
● Process:
1. Data Filtering: It filters the dataset to obtain data relevant to the selected stocks
within the specified date range.
2. Plot Initialization: A plot is created with appropriate labels, dimensions, and grid
settings.
3. Plotting:

■ If value is 'open-close', the add_candle_plot function is called to add candlestick plots for both stocks.
■ If value is 'volume', it plots the trade volume for each stock over time using line plots.
4. Interactivity: The legend allows users to mute and unmute the plot lines, enhancing
interactivity.

add_candle_plot Method

● Purpose: This function adds candlestick plots to the main plot for the specified stock,
indicating the opening, closing, high, and low prices.
● Parameters:
1. plot: The main plot object where the candlestick charts will be added.

2. stock_name: The name of the stock to display in the legend.

3. stock_range: The filtered data range for the stock.

4. color: The color used for the mean price line.

● Process:
1. Candlestick Calculation: Determines whether the stock closed higher or lower
than it opened (inc_1 and dec_1).
2. Drawing Candles:

■ Segments: Vertical lines representing the range between high and low prices.
■ Bars: Vertical bars filled with green or red, depending on whether the stock
price increased or decreased.
3. Mean Price Line: A line representing the average of the high and low prices is
drawn over the candlestick plot.

plot_categorical_distribution Function

● Purpose: This function visualizes the distribution of a categorical variable using a bar plot.
● Parameters:
○ data: The DataFrame containing the data.
○ column: The categorical column to be visualized.
○ height, aspect: Dimensions of the plot.
● Process:
○ It generates a bar plot showing the frequency of each category in the specified
column. This helps understand the distribution and prevalence of different
categories within the dataset.

Example Plots

● Category Distribution: The plot_categorical_distribution(df, 'category') will show the
distribution of different product categories within the dataset.
● Customer Type Distribution: The plot_categorical_distribution(df, 'customer_type') will
illustrate how different types of customers are represented.
● Payment Type Distribution: The plot_categorical_distribution(df, 'payment_type') will
display the frequency of different payment methods used by customers.
# Count plot of the category column
sns.countplot(data=df, y='category').set(title='Distribution of Category')

# Heatmap of correlations between the numeric columns
sns.heatmap(df.corr(numeric_only=True))
Data Merging

Parameters:
    data (pd.DataFrame): The input DataFrame.
    column (str): The name of the column containing timestamp data.

Returns:
    A modified DataFrame with each timestamp truncated to the start of its hour.

from datetime import datetime  # needed for strptime below

def convert_timestamp_to_hourly(data: pd.DataFrame = None, column: str = None):
    dummy = data.copy()
    new_ts = dummy[column].tolist()
    # Format each timestamp as 'YYYY-MM-DD HH:00:00', then parse it back to datetime
    new_ts = [i.strftime('%Y-%m-%d %H:00:00') for i in new_ts]
    new_ts = [datetime.strptime(i, '%Y-%m-%d %H:00:00') for i in new_ts]
    dummy[column] = new_ts
    return dummy

sales_df = convert_timestamp_to_hourly(sales_df, 'timestamp')
stock_df = convert_timestamp_to_hourly(stock_df, 'timestamp')
temp_df = convert_timestamp_to_hourly(temp_df, 'timestamp')

This step ensures that all timestamps in the sales, stock, and temperature data are
consistent and aligned on an hourly basis.

sales_agg = sales_df.groupby(['timestamp', 'product_id']).agg({'quantity': 'sum'}).reset_index()

This step consolidates the sales data, making it easier to analyze and model trends on an
hourly basis.

1. Timestamp Conversion: Align timestamps across datasets to the nearest hour for
consistency, making it easier to merge data from different sources.
2. Data Aggregation: Summarize sales data by aggregating quantities sold per hour
and product, simplifying analysis and trend detection.
3. Merging DataFrames: Combine sales, stock, and temperature datasets based on the
aligned timestamp to create a unified dataset for comprehensive analysis (see the
sketch after this list).
4. Handling Missing Data: Address any missing values post-merge using techniques
like forward-filling, interpolation, or dropping rows.
5. Final Analysis: Use the combined dataset to explore trends, perform correlation
analysis, and create visualizations such as time-series plots and heatmaps to derive
actionable insights.
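
A minimal sketch of steps 3 and 4, assuming the hourly sales_agg, stock_df and temp_df
DataFrames prepared above share a timestamp column (the join keys and the fill strategy
here are illustrative assumptions):

# Merge sales, stock and temperature data on the aligned hourly timestamp
merged_df = sales_agg.merge(stock_df, on=['timestamp', 'product_id'], how='left')
merged_df = merged_df.merge(temp_df, on='timestamp', how='left')

# Handle missing values introduced by the merge
merged_df['quantity'] = merged_df['quantity'].fillna(0)   # hours with no recorded sales
merged_df = merged_df.ffill()                             # forward-fill any remaining gaps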

Feature Engineering

merged_df['timestamp_day_of_month'] = merged_df['timestamp'].dt.day
merged_df['timestamp_day_of_week'] = merged_df['timestamp'].dt.dayofweek
merged_df['timestamp_hour'] = merged_df['timestamp'].dt.hour
merged_df.drop(columns=['timestamp'], inplace=True)
merged_df.head()

These time-based features help the model capture temporal patterns in the data, which
could be important for forecasting sales and stock levels.

Modeling

Model Setup:
Target Variable (y): estimated_stock_pct (percentage of estimated stock).

Features (X): All other columns in merged_df.

X = merged_df.drop(columns=['estimated_stock_pct'])
y = merged_df['estimated_stock_pct']
accuracy = []

Cross-Validation and Model Training:


A Random Forest Regressor is used to predict the target variable. The data is split into
training and testing sets, and the model is evaluated using Mean Absolute Error (MAE)
across K folds.

# Assumed imports (defined earlier in the notebook):
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# K (number of folds) and split (training fraction) are defined elsewhere,
# e.g. K = 10 and split = 0.75

for fold in range(0, K):

    # Instantiate algorithm
    model = RandomForestRegressor()
    scaler = StandardScaler()

    # Create training and test samples
    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=split,
                                                        random_state=42)

    # Scale X data; we scale the data because it helps the algorithm to converge
    # and helps the algorithm to not be greedy with large values
    scaler.fit(X_train)
    X_train = scaler.transform(X_train)
    X_test = scaler.transform(X_test)

    # Train model
    trained_model = model.fit(X_train, y_train)

    # Generate predictions on test sample
    y_pred = trained_model.predict(X_test)

    # Compute accuracy, using mean absolute error
    mae = mean_absolute_error(y_true=y_test, y_pred=y_pred)
    accuracy.append(mae)
    print(f"Fold {fold + 1}: MAE = {mae:.3f}")

print(f"Average MAE: {(sum(accuracy) / len(accuracy)):.2f}")


Model: A RandomForestRegressor is instantiated for each fold. This model is an ensemble
method that combines the predictions of multiple decision trees to improve accuracy and
robustness.

Scaler: StandardScaler is initialized to standardize the feature data by removing the mean
and scaling to unit variance. This ensures that the model's learning process is not biased by
features with larger scales.

Feature Importance Plot:

The relative importance of each feature is visualized to understand which variables
contribute most to the model’s predictions.

# Assumed imports: numpy and matplotlib are used below
import numpy as np
import matplotlib.pyplot as plt

features = [i.split("__")[0] for i in X.columns]
importances = model.feature_importances_
indices = np.argsort(importances)

fig, ax = plt.subplots(figsize=(10, 20))
plt.title('Feature Importances')
plt.barh(range(len(indices)), importances[indices], color='y', align='center')
plt.yticks(range(len(indices)), [features[i] for i in indices])
plt.xlabel('Relative Importance')
plt.show()


Identifying important features helps in refining the model and provides insights into
which factors most influence the estimated stock levels.
OBJECTIVES

● Specialized Training: Participants receive in-depth training in specialized areas
such as natural language processing (NLP), computer vision, and deep learning,
allowing them to gain expertise in niche domains within AI.

● Tool Proficiency: The program ensures that participants become proficient in
essential tools like TensorFlow, Keras, PyTorch, and other AI frameworks, enabling
them to build and deploy sophisticated models.

● Problem-Solving Techniques: Interns are taught advanced problem-solving
techniques, such as feature engineering, hyperparameter tuning, and model
optimization, which are critical for improving model performance and accuracy.

● Hands-On Projects: Interns work on hands-on projects that require them to design,
develop, and deploy AI models, simulating real-world scenarios and challenges.

● End-to-End Implementation: Participants gain experience in the entire AI project
lifecycle, from data collection and preprocessing to model development, evaluation,
and deployment.

● Exposure to Diverse Use Cases: The internship provides exposure to a variety of
AI applications across different industries, such as healthcare, finance, and
e-commerce, allowing interns to understand the versatility of AI solutions.
TECHNOLOGIES USED

Programming Language: Python

Python is the cornerstone of AI and data science, recognized for its simplicity,
readability, and extensive ecosystem of libraries and frameworks. As the primary
programming language used in the AI Cognizant Virtual Internship, Python provides
a versatile platform for developing everything from simple scripts to complex
machine learning models. Its syntax is user-friendly, which makes it accessible to
beginners, while its vast array of libraries enables the handling of advanced AI tasks.
Python's extensive support for various data types, powerful in-built functions, and
ease of integration with other technologies make it the ideal language for AI
development.

Libraries and Frameworks

● NumPy: This fundamental library is essential for numerical computing in Python.
NumPy supports large, multi-dimensional arrays and matrices, along with a
collection of mathematical functions to operate on these arrays.

● pandas: pandas is a powerful library used for data manipulation and analysis.
It provides data structures like DataFrames.
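
A two-line taste of what working with these libraries looks like (an illustrative sketch,
not code taken from the internship itself):

import numpy as np
import pandas as pd

prices = np.array([10.5, 12.0, 9.75])                      # NumPy array with vectorized math
catalog = pd.DataFrame({'product': ['A', 'B', 'C'],        # pandas DataFrame for tabular data
                        'price_with_tax': prices * 1.18})  # apply a factor to every price at once
print(catalog)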
Development Tools

● Jupyter Notebook: Jupyter Notebook is an open-source web application that
allows participants to create and share documents that contain live code,
equations, visualizations, and narrative text. It’s widely used in data science
for its ability to combine code execution with rich text and visual outputs.
This tool is particularly valuable for iterating on ideas, documenting the
modeling process, and presenting results in an interactive format.

Data Visualization Tools

● Matplotlib and Seaborn: While already mentioned as libraries, it’s worth
emphasizing their role as data visualization tools. Matplotlib provides a solid
foundation for creating a wide range of static, animated, and interactive
visualizations.

Collaboration Tools

● GitHub: Beyond its role in code versioning, GitHub serves as a central hub for
project collaboration. Participants can use GitHub to share their code
repositories with team members, track issues, manage pull requests, and
collaborate on code development.
PURPOSE AND IMPORTANCE

The AI Cognizant Virtual Internship plays a pivotal role in closing the gap between
theoretical knowledge acquired in academic settings and the practical skills required in the
AI industry. For students and professionals aspiring to transition into AI roles, this
internship provides an invaluable platform to gain hands-on experience that is often
missing from traditional educational programs. While academic courses typically focus on
foundational theories, mathematical principles, and basic programming, they may not fully
prepare individuals for the complexities and challenges encountered in real-world AI
projects. This is where the AI Cognizant Virtual Internship becomes essential, offering a
structured environment where participants can apply what they’ve learned in a practical
context.

Through this internship, participants engage in a series of carefully designed projects that
mimic the kinds of challenges they will face in the industry. These projects cover a wide
range of AI applications, from machine learning and data analysis to natural language
processing and computer vision. By working on these projects, interns learn to navigate the
entire AI development lifecycle, from data preprocessing and model selection to
deployment and performance evaluation. This practical experience is crucial not only for
reinforcing theoretical knowledge but also for developing a deeper understanding of how
AI solutions are implemented in real-world scenarios. Participants learn to deal with the
nuances of real data, such as handling missing values, dealing with imbalanced datasets,
and optimizing models for performance and scalability.

In summary, the AI Cognizant Virtual Internship is more than just a learning experience—
it’s a transformative journey that equips participants with the skills, confidence, and
practical experience needed to excel in AI roles. It bridges the gap between academia and
industry, ensuring that participants are not only knowledgeable but also industry-ready.
This combination of theoretical understanding and practical application makes the
internship an essential step for anyone looking to succeed in the rapidly evolving field of AI.
AREA AND SCOPE

The scope of the AI Cognizant Virtual Internship is expansive, designed to provide
participants with a comprehensive understanding of artificial intelligence and its many
applications. This internship is meticulously structured to cover various critical aspects of
AI, making it an ideal program for those aspiring to build careers in AI, data science, and
software engineering.

Data Analysis
One of the foundational elements of the internship is data analysis, a crucial skill in AI that
involves extracting meaningful insights from complex datasets. Participants are introduced
to advanced data manipulation techniques using tools like pandas and NumPy, enabling
them to handle large datasets efficiently.

Machine Learning
The machine learning component of the internship is designed to immerse participants in
the core concepts and techniques of this rapidly growing field. Participants implement
various machine learning algorithms, ranging from simple linear regression to more
complex models like decision trees, support vector machines, and neural networks.

Project Management
Beyond technical skills, the internship also emphasizes the importance of project
management in the context of AI. Participants learn to manage AI projects from inception
to deployment, ensuring that they are delivered on time and meet quality standards.
REFERENCES

https://github.com/openlists/PythonResources

https://github.com/showcases/data-visualization

https://github.com/microsoft/ML-For-Beginners

https://github.com/PyGithub/PyGithub
CONCLUSION

In conclusion, the cross-validation process implemented using the RandomForestRegressor model
is a comprehensive approach to evaluating and fine-tuning the model's performance. By
systematically splitting the dataset into multiple training and testing subsets across K folds, the
model is trained and tested repeatedly on different portions of the data. This method ensures that
the model is not just overfitting to a single subset but is learning patterns that generalize well to
various data points.

The use of StandardScaler to normalize the features before training further enhances the model's
ability to converge efficiently, ensuring that all features contribute equally to the learning process.
This is particularly important in machine learning, where features with larger magnitudes can
otherwise disproportionately influence the model's predictions.

After training, the model's predictions are evaluated using the Mean Absolute Error (MAE) metric,
which provides a clear indication of the average error in the model's predictions. The MAE is
calculated for each fold, and the final average MAE across all folds offers a reliable measure of the
model's overall accuracy and performance.

By adopting this cross-validation approach, the code ensures that the RandomForestRegressor
model is robust, accurate, and capable of generalizing well to new, unseen data. This thorough
evaluation process ultimately leads to a more reliable and effective model, making it well-suited for
real-world applications where consistent and accurate predictions are critical.
