Internship Report
On
Information Technology
By
Abhishek Dewangan
(300103322301)
Session: 2021-2025
DECLARATION BY THE CANDIDATE
I hereby declare that the Industrial Internship report entitled “Cognizant - Artificial
Intelligence”, submitted by me to Bhilai Institute of Technology, Durg in partial
fulfilment of the requirement for the award of the degree of Bachelor of Technology in
Information Technology Engineering, is a record of bona fide industrial training
undertaken by me under the supervision of Smith, Cognizant. I further declare that the
work reported here has not been submitted, and will not be submitted, either in part
or in full, for the award of any other degree or diploma in this institute or any
other institute or university.
ACKNOWLEDGEMENT
I would like to express my sincere gratitude and appreciation to everyone who has
helped me during my internship. First and foremost, I would like to thank Smith for
providing me with the opportunity to intern at Cognizant. Their support, guidance,
and encouragement have been instrumental in my learning and growth.
I would also like to thank my colleagues, who have welcomed me with open arms and
have been incredibly supportive throughout my internship. Their willingness to share
their knowledge and expertise has been invaluable.
Finally, I would like to thank my family and friends for their unwavering support and
encouragement throughout my internship. Their constant motivation has helped me
stay focused and achieve my goals.
The AI Cognizant Virtual Internship plays a crucial role in bridging the gap between
academic knowledge and practical industry application, particularly for students and
professionals aspiring to enter the field of Artificial Intelligence. Traditional education
often emphasizes theoretical concepts, which, while essential, do not always equip
individuals with the hands-on skills needed in the industry. This internship addresses this
gap by providing participants with real-world projects that allow them to apply AI
techniques and tools to solve practical problems, deepening their understanding and
enhancing their practical skill set.
A significant benefit of the internship is the opportunity for participants to build a portfolio
of projects that demonstrates their ability to implement AI concepts in real-world
scenarios. This portfolio serves as a tangible asset in the competitive job market, setting
participants apart from others who may lack practical experience. Employers value this
hands-on experience, as it shows that candidates not only possess theoretical knowledge
but also the capability to apply it effectively in professional settings.
Cognizant is a prominent global IT services and consulting company, renowned for its
ability to help organizations navigate and thrive in the digital era. Founded in 1994 as an
in-house technology unit of Dun & Bradstreet, Cognizant has since evolved into a Fortune
500 company, listed among the world’s most admired and fastest-growing firms. The
company’s rapid ascent in the tech industry can be attributed to its strong focus on
innovation, customer-centricity, and its ability to leverage emerging technologies to deliver
measurable business outcomes.
With its headquarters in Teaneck, New Jersey, Cognizant operates in over 40 countries,
employing more than 300,000 professionals worldwide. The company’s global presence
and vast workforce enable it to serve a diverse client base that spans various industries,
including healthcare, financial services, insurance, manufacturing, retail, and
communications. This extensive industry expertise allows Cognizant to offer tailored
solutions that address the specific challenges and opportunities within each sector.
One of Cognizant’s core strengths lies in its ability to integrate digital solutions with
traditional IT systems, creating seamless and scalable platforms that enable businesses to
operate more efficiently and competitively. The company’s focus on digital transformation
is not just about adopting new technologies but also about reimagining business models,
optimizing processes, and fostering a culture of continuous innovation.
Cognizant’s client-centric approach is a cornerstone of its business philosophy. The
company emphasizes close collaboration with clients, aiming to understand their unique
challenges and deliver solutions that are not only effective but also aligned with their
strategic goals. This commitment to client success has earned Cognizant a reputation for
reliability and excellence, leading to long-term partnerships with many of the world’s
leading organizations.
Cognizant also places a strong emphasis on ethical business conduct and governance. The
company adheres to strict ethical standards in all its operations and is committed to
maintaining transparency and accountability in its business practices. This ethical
framework not only strengthens Cognizant’s corporate reputation but also ensures that it
operates in a manner that is responsible and respectful of all its stakeholders.
APPLICATION OF THE GAINED KNOWLEDGE DURING THE TRAINING
● Project Management skills are integrated into the training, teaching
participants how to develop and manage AI projects from inception to deployment.
This includes planning the project timeline, managing resources, collaborating with
team members, and adhering to quality standards. Effective project management is
crucial in ensuring that AI projects are completed on time, within scope, and meet
the desired objectives. By gaining these skills, participants are better prepared to
take on leadership roles in AI projects, ensuring their successful execution in a real-
world setting.
COMPARISON OF COMPETENCY LEVELS BEFORE AND AFTER THE
TRAINING
Comparing competency levels before and after the data analyst internship program can
effectively highlight the growth and development achieved during the training. Below are some
key areas where this comparison can be made:
● Data Analysis Knowledge: Prior to the training, my understanding of data analysis was
primarily confined to the basic concepts covered in the academic curriculum. After
completing the program, I have gained a much deeper understanding of Python, its
data-analysis functionality, and how to use it effectively to implement and develop
applications.
● Technical Skills: Before the training, my experience with programming languages and
tools used in data analysis, such as Python, Power BI, Tableau, and R, was limited. After
the internship, I have become proficient in these technologies and have acquired hands-
on experience in developing data analysis applications.
import pandas as pd

# load the sales dataset; the path was truncated in the report, so a .csv file is assumed
df = pd.read_csv('/content/sample_sales_data.csv')
# drop the auto-written index column (column name assumed) and preview the data
df.drop(columns=['Unnamed: 0'], inplace=True)
df.head()

# correlation matrix of the numeric columns, rendered with a colour gradient
corr = df.corr()
corr.style.background_gradient(cmap='coolwarm')

# distributions of the continuous variables (plotting helper defined in the notebook)
plot_continuous_distribution(df, 'unit_price')
plot_continuous_distribution(df, 'quantity')
plot_continuous_distribution(df, 'total')
Outlier Detection
Outliers are data points that deviate significantly from other observations and can potentially
distort your analysis. You can identify outliers by visualizing the data or calculating statistical
measures. Once detected, you can decide whether to remove them, transform them, or keep them
based on their relevance to your analysis.
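For example, a minimal sketch of the common 1.5 * IQR rule, applied here to the 'total' column
from the sales data above (the threshold and column choice are illustrative, not from the report):

# flag values lying more than 1.5 * IQR beyond the quartiles as potential outliers
q1 = df['total'].quantile(0.25)
q3 = df['total'].quantile(0.75)
iqr = q3 - q1
outliers = df[(df['total'] < q1 - 1.5 * iqr) | (df['total'] > q3 + 1.5 * iqr)]
print(f"{len(outliers)} potential outliers in 'total'")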
Feature Engineering
This involves creating new features or modifying existing ones to better capture the underlying
patterns in your data. For instance, you might derive new features such as the day of the week or
month from a timestamp or create interaction terms between existing variables to enhance
predictive models.
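For instance, a short sketch of both ideas, assuming the sales data has 'timestamp',
'unit_price', and 'quantity' columns as in the earlier snippets:

# derive calendar features from a timestamp column
df['timestamp'] = pd.to_datetime(df['timestamp'])
df['day_of_week'] = df['timestamp'].dt.dayofweek
df['month'] = df['timestamp'].dt.month
# interaction term between two existing variables
df['price_x_quantity'] = df['unit_price'] * df['quantity']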
Pairwise Plotting
Pairwise plotting (or pair plots) allows you to visualize relationships between multiple pairs of
variables at once. This is particularly useful when exploring interactions and correlations between
variables, helping you spot trends, clusters, and potential outliers in the data.
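With seaborn (one common choice; the library is an assumption here) this is a single call:

import seaborn as sns

# scatter plot for every pair of numeric columns, with histograms on the diagonal
sns.pairplot(df)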
Summary Statistics
Generating summary statistics gives a quick overview of the central tendency, variability, and
distribution of your data. This can include measures like mean, median, standard deviation, and
quartiles, providing insights into the overall structure and behavior of your dataset.
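In pandas, one call produces these statistics for every numeric column:

# count, mean, std, min, quartiles, and max per numeric column
df.describe()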
By incorporating these additional steps, your EDA will be more comprehensive, leading to
deeper insights and better-informed modeling decisions.
from bokeh.plotting import figure

# method to build the plot
def get_plot(stock_1, stock_2, date, value):
    stock_1_name = stock_1['symbol'].unique()[0]
    stock_2_name = stock_2['symbol'].unique()[0]
    # filter both stocks to the selected (start, end) date range
    stock_1_range = stock_1[(stock_1['date'] >= date[0]) & (stock_1['date'] <= date[1])]
    stock_2_range = stock_2[(stock_2['date'] >= date[0]) & (stock_2['date'] <= date[1])]
    plot = figure(title='Stock prices',
                  x_axis_label='Date',
                  x_range=list(stock_1_range['short_date']),  # categorical axis of date labels
                  y_axis_label='Price in $USD',
                  plot_width=800, plot_height=500)
    plot.xaxis.major_label_orientation = 1
    plot.grid.grid_line_alpha = 0.3
    if value == 'open-close':
        # candlestick charts for both stocks (branch body reconstructed from the
        # description below; the original lines were lost)
        add_candle_plot(plot, stock_1_name, stock_1_range, 'blue')
        add_candle_plot(plot, stock_2_name, stock_2_range, 'orange')
    if value == 'volume':
        # traded volume drawn as one line per stock
        plot.line(stock_1_range['short_date'], stock_1_range['volume'],
                  legend_label=stock_1_name, muted_alpha=0.2)
        plot.line(stock_2_range['short_date'], stock_2_range['volume'],
                  legend_label=stock_2_name, muted_alpha=0.2,
                  line_color='orange')
    # clicking a legend entry mutes the corresponding series
    plot.legend.click_policy = "mute"
    return plot

def add_candle_plot(plot, stock_name, stock_range, color):
    # rows where the stock closed higher (inc_1) or lower (dec_1) than it opened
    inc_1 = stock_range['close'] > stock_range['open']
    dec_1 = stock_range['open'] > stock_range['close']
    w = 0.5  # candle width
    # grey segments spanning each day's high-low range
    plot.segment(stock_range['short_date'], stock_range['high'],
                 stock_range['short_date'], stock_range['low'],
                 color="grey")
    # green bars for rising candles, red bars for falling candles
    plot.vbar(stock_range['short_date'][inc_1], w,
              stock_range['high'][inc_1], stock_range['close'][inc_1],
              fill_color='green', line_color='green', legend_label=stock_name)
    plot.vbar(stock_range['short_date'][dec_1], w,
              stock_range['high'][dec_1], stock_range['close'][dec_1],
              fill_color='red', line_color='red', legend_label=stock_name)
    # overlay a line at the mean of the high and low prices
    stock_mean_val = stock_range[['high', 'low']].mean(axis=1)
    plot.line(stock_range['short_date'], stock_mean_val,
              legend_label=stock_name, color=color)
# distribution of each categorical variable (EDA helper described below)
plot_categorical_distribution(df, 'category')
plot_categorical_distribution(df, 'customer_type')
plot_categorical_distribution(df, 'payment_type')
get_plot Method
● Purpose: This method generates a stock price plot for two selected stocks within a specified
date range, allowing for visualization of either the open-close price movement or the volume
of trades.
● Parameters:
1. stock_1, stock_2: DataFrames of price data for the two stocks to be compared
(each stock's symbol is read from its 'symbol' column).
2. date: A tuple representing the start and end dates for the data range.
3. value: Determines what aspect of the stock data to visualize (either 'open-close' for
candlestick charts or 'volume' for trade volume).
● Process:
1. Data Filtering: It filters the dataset to obtain data relevant to the selected stocks
within the specified date range.
2. Plot Initialization: A plot is created with appropriate labels, dimensions, and grid
settings.
3. Plotting: Depending on value, candlestick charts are added for both stocks via
add_candle_plot ('open-close'), or their traded volumes are drawn as lines
('volume'); the legend's click policy lets the viewer mute either series.
add_candle_plot Method
● Purpose: This function adds candlestick plots to the main plot for the specified stock,
indicating the opening, closing, high, and low prices.
● Parameters:
1. plot: The main plot object to which the candlestick chart is added.
2. stock_name: The stock's name, used for its legend entries.
3. stock_range: The date-filtered price data for the stock.
4. color: The color of the mean-price line drawn over the candles.
● Process:
1. Candlestick Calculation: Determines whether the stock closed higher or lower
than it opened (inc_1 and dec_1).
2. Drawing Candles:
■ Segments: Vertical lines representing the range between high and low prices.
■ Bars: Vertical bars filled with green or red, depending on whether the stock
price increased or decreased.
3. Mean Price Line: A line representing the average of the high and low prices is
drawn over the candlestick plot.
plot_categorical_distribution Function
● Purpose: This function visualizes the distribution of a categorical variable using a bar plot.
● Parameters:
○ data: The DataFrame containing the data.
○ column: The categorical column to be visualized.
○ height, aspect: Dimensions of the plot.
● Process:
○ It generates a bar plot showing the frequency of each category in the specified
column. This helps understand the distribution and prevalence of different
categories within the dataset.
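The helper's body is not reproduced in the report; a plausible implementation with seaborn,
matching the parameters described above, might look like this:

import seaborn as sns

def plot_categorical_distribution(data, column, height=5, aspect=2):
    # bar plot of how often each category occurs in the given column
    sns.catplot(data=data, x=column, kind='count', height=height, aspect=aspect)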
Example Plots
# correlation heatmap of the numeric features (seaborn imported as sns)
sns.heatmap(df.corr())
Data Merging
# round every timestamp down to its hour so the three datasets share a common grain
stock_df = convert_timestamp_to_hourly(stock_df, 'timestamp')
This step ensures that all timestamps in the sales, stock, and temperature data are
aligned to the same hourly resolution.
# total quantity sold per product per hour
sales_df = sales_df.groupby(['timestamp', 'product_id']).agg({'quantity': 'sum'}).reset_index()
This step consolidates the sales data, making it easier to analyze and model
trends on an hourly basis.
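The convert_timestamp_to_hourly helper itself is not reproduced in the report; a minimal
sketch of what it likely does, assuming the column holds parseable datetimes:

import pandas as pd

def convert_timestamp_to_hourly(df, column):
    # truncate every timestamp to the start of its hour
    df[column] = pd.to_datetime(df[column]).dt.floor('h')
    return df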
1. Timestamp Conversion: Align timestamps across datasets to the nearest hour for
consistency, making it easier to merge data from different sources.
2. Data Aggregation: Summarize sales data by aggregating quantities sold per hour
and product, simplifying analysis and trend detection.
3. Merging DataFrames: Combine sales, stock, and temperature datasets based on the
aligned timestamp to create a unified dataset for comprehensive analysis (see the
sketch after this list).
4. Handling Missing Data: Address any missing values post-merge using techniques
like forward-filling, interpolation, or dropping rows.
5. Final Analysis: Use the combined dataset to explore trends, perform correlation
analysis, and create visualizations such as time-series plots and heatmaps to derive
actionable insights.
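A minimal sketch of steps 3 and 4, assuming hourly frames named sales_df, stock_df, and
temp_df (the temperature frame's name is hypothetical) keyed as described:

# left-join stock levels and temperature readings onto the hourly sales records
merged_df = sales_df.merge(stock_df, on=['timestamp', 'product_id'], how='left')
merged_df = merged_df.merge(temp_df, on='timestamp', how='left')
# forward-fill gaps introduced by the joins; interpolation or dropping rows also work
merged_df = merged_df.ffill()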
Feature Engineering
merged_df['timestamp_day_of_month'] = merged_df['timestamp'].dt.day
merged_df['timestamp_day_of_week'] = merged_df['timestamp'].dt.dayofweek
merged_df['timestamp_hour'] = merged_df['timestamp'].dt.hour
merged_df.drop(columns=['timestamp'], inplace=True)
merged_df.head()
These time-based features help the model capture temporal patterns in the data, which
could be important for forecasting sales and stock levels.
Modeling
Model Setup:
Target Variable (y): estimated_stock_pct (percentage of estimated stock).
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

X = merged_df.drop(columns=['estimated_stock_pct'])
y = merged_df['estimated_stock_pct']
accuracy = []
for fold in range(0, 10):  # repeated train/test splits; the fold count is assumed here
    model = RandomForestRegressor()
    scaler = StandardScaler()
    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.75)  # ratio assumed
    X_train = scaler.fit_transform(X_train)  # fit the scaler on the training data only
    X_test = scaler.transform(X_test)
    trained_model = model.fit(X_train, y_train)
    y_pred = trained_model.predict(X_test)
    mae = mean_absolute_error(y_true=y_test, y_pred=y_pred)
    accuracy.append(mae)
    print(f"Fold {fold + 1}: MAE = {mae:.3f}")
Scaler: StandardScaler is initialized to standardize the feature data by removing the mean
and scaling to unit variance. This ensures that the model's learning process is not biased by
features with larger scales.
import numpy as np

features = [col for col in X.columns]      # feature names
importances = model.feature_importances_   # importance score per feature
indices = np.argsort(importances)          # sorted from least to most important
● Hands-On Projects: Interns work on hands-on projects that require them to design,
develop, and deploy AI models, simulating real-world scenarios and challenges.
Python is the cornerstone of AI and data science, recognized for its simplicity,
readability, and extensive ecosystem of libraries and frameworks. As the primary
programming language used in the AI Cognizant Virtual Internship, Python provides
a versatile platform for developing everything from simple scripts to complex
machine learning models. Its syntax is user-friendly, which makes it accessible to
beginners, while its vast array of libraries enables the handling of advanced AI tasks.
Python's extensive support for various data types, powerful in-built functions, and
ease of integration with other technologies make it the ideal language for AI
development.
● pandas: pandas is a powerful library for data manipulation and analysis. It provides
tabular data structures such as Series and DataFrames, along with tools for loading,
cleaning, filtering, and aggregating data.
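For instance, a minimal illustration with made-up data (not from the internship tasks):

import pandas as pd

# build a small DataFrame and aggregate it by product
sales = pd.DataFrame({'product': ['a', 'b', 'a'], 'quantity': [3, 1, 2]})
print(sales.groupby('product')['quantity'].sum())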
Development Tools
● Jupyter Notebook: An interactive environment for creating documents that combine live code,
equations, visualizations, and narrative text. It’s widely used in data science
for its ability to combine code execution with rich text and visual outputs.
This tool is particularly valuable for iterating on ideas, documenting the
modeling process, and presenting results in an interactive format.
Collaboration Tools
● GitHub: Beyond its role in code versioning, GitHub serves as a central hub for
project collaboration. Participants can use GitHub to share their code
repositories with team members, track issues, manage pull requests, and
collaborate on code development.
PURPOSE AND IMPORTANCE
The AI Cognizant Virtual Internship plays a pivotal role in closing the gap between
theoretical knowledge acquired in academic settings and the practical skills required in the
AI industry. For students and professionals aspiring to transition into AI roles, this
internship provides an invaluable platform to gain hands-on experience that is often
missing from traditional educational programs. While academic courses typically focus on
foundational theories, mathematical principles, and basic programming, they may not fully
prepare individuals for the complexities and challenges encountered in real-world AI
projects. This is where the AI Cognizant Virtual Internship becomes essential, offering a
structured environment where participants can apply what they’ve learned in a practical
context.
Through this internship, participants engage in a series of carefully designed projects that
mimic the kinds of challenges they will face in the industry. These projects cover a wide
range of AI applications, from machine learning and data analysis to natural language
processing and computer vision. By working on these projects, interns learn to navigate the
entire AI development lifecycle, from data preprocessing and model selection to
deployment and performance evaluation. This practical experience is crucial not only for
reinforcing theoretical knowledge but also for developing a deeper understanding of how
AI solutions are implemented in real-world scenarios. Participants learn to deal with the
nuances of real data, such as handling missing values, dealing with imbalanced datasets,
and optimizing models for performance and scalability.
In summary, the AI Cognizant Virtual Internship is more than just a learning experience—
it’s a transformative journey that equips participants with the skills, confidence, and
practical experience needed to excel in AI roles. It bridges the gap between academia and
industry, ensuring that participants are not only knowledgeable but also industry-ready.
This combination of theoretical understanding and practical application makes the
internship an essential step for anyone looking to succeed in the rapidly evolving field of AI.
AREA AND SCOPE
Data Analysis
One of the foundational elements of the internship is data analysis, a crucial skill in AI that
involves extracting meaningful insights from complex datasets. Participants are introduced
to advanced data manipulation techniques using tools like pandas and NumPy, enabling
them to handle large datasets efficiently.
Machine Learning
The machine learning component of the internship is designed to immerse participants in
the core concepts and techniques of this rapidly growing field. Participants implement
various machine learning algorithms, ranging from simple linear regression to more
complex models such as decision trees, support vector machines, and neural networks.
Project Management
Beyond technical skills, the internship also emphasizes the importance of project
management in the context of AI. Participants learn to manage AI projects from inception
to deployment, ensuring that they are delivered on time and meet quality standards.
CONCLUSION
The use of StandardScaler to normalize the features before training further enhances the model's
ability to converge efficiently, ensuring that all features contribute equally to the learning process.
This is particularly important in machine learning, where features with larger magnitudes can
otherwise disproportionately influence the model's predictions.
After training, the model's predictions are evaluated using the Mean Absolute Error (MAE) metric,
which provides a clear indication of the average error in the model's predictions. The MAE is
calculated for each fold, and the final average MAE across all folds offers a reliable measure of the
model's overall accuracy and performance.
By adopting this cross-validation approach, the code ensures that the RandomForestRegressor
model is robust, accurate, and capable of generalizing well to new, unseen data. This thorough
evaluation process ultimately leads to a more reliable and effective model, making it well-suited for
real-world applications where consistent and accurate predictions are critical.