Internship Report
21INT37 - Innovation/Entrepreneurship/Societal Internship
Submitted by
1BI21CS063
Kothakota Bindhu Sree
2023-2024
ACKNOWLEDGEMENT
The satisfaction and euphoria that accompany the successful completion of any task
would be incomplete without complimenting those who made it possible and whose
guidance and encouragement made my efforts successful. My sincere thanks go to all
those who supported me in completing this Internship successfully.
My sincere thanks to Dr. M. U. Aswath, Principal, BIT and Dr. J. Girija, HOD,
Department of CSE, BIT for their encouragement, support and guidance to the student
community in all fields of education. I am grateful to our institution for providing us with a
congenial atmosphere to carry out the Internship successfully.
I extend my sincere thanks to all the department faculty members and non-teaching staff
for supporting me directly or indirectly in the completion of this Internship.
1.1 Overview
1.2 Objective
1.3.1 Purpose
1.3.2 Scope
1.3.3 Applicability
Chapter 2 - Problem Statement
Chapter 3 - Methodology/System Architecture/Algorithm
Chapter 4 - Tools/Technologies
Chapter 5 - Implementation
Chapter 6 - Results
Chapter 8 - References
Internship Certificate
BANGALORE INSTITUTE OF TECHNOLOGY
K. R. Road, V. V. Puram, Bengaluru-560004
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
INTERNSHIP RUBRICS
21INT68 - 2021 BATCH - 2023-2024
INTRODUCTION
1.1 Overview
This project for Sprocket Central Pty Ltd represents a comprehensive endeavor aimed at
harnessing the power of data-driven insights to inform strategic decision-making
processes. The overarching objective is to analyze provided datasets and develop an
interactive dashboard tailored to targeting the new 1000 customer list. The project unfolds
in two key phases: Data Quality Assessment and Dashboard Development, each playing a
crucial role in achieving the project's objectives.
The Data Quality Assessment phase serves as the foundation of the project, focusing on
ensuring the integrity and reliability of the datasets. This involves a meticulous
examination of the data to identify and address various data quality issues. Missing values
are detected and resolved to ensure data completeness, while inconsistencies in data
formats, such as date formats or categorical variable naming conventions, are
standardized to facilitate accurate analysis. Additionally, duplicate entries are identified
and outliers are investigated to determine their impact on the analysis.
The project then transitions to the Dashboard Development phase, where the focus is on
creating a visually appealing and informative dashboard. This dashboard serves as a
dynamic platform for presenting key insights and analysis results derived from the
datasets. The dashboard is designed to be user-friendly and intuitive, with interactive
elements allowing stakeholders to explore the data further and gain deeper insights.
The significance of the project lies in its potential to empower Sprocket Central Pty Ltd
with actionable insights derived from data analysis, enabling the company to optimize its
marketing and targeting strategies effectively.
1.2 Objective
trends, and market sentiment, enabling data-driven insights for marketing and
customer engagement strategies.
Collaborative Data Analysis with AI:
Explore collaborative data analysis workflows using AI-driven tools and
platforms.
These detailed objectives provide a clear roadmap for conducting the data analysis and
dashboard development process, ensuring that the outcomes are aligned with the client's
needs and objectives.
1.3.1 Purpose
The purpose of the above project for IITIAN TEACHES Pvt Ltd is multifaceted and revolves
around leveraging AI-based tools to achieve the following objectives:
1. Enhancing Productivity through AI Tools: The core mission revolves around empowering
individuals and organizations to harness the transformative power of AI tools for enhancing
productivity. Through comprehensive courses, we provide learners with the knowledge and
skills needed to leverage AI-driven solutions effectively in their day-to-day workflows.
2. Interactive Learning Platform: The platform serves as a hub for professionals seeking to
elevate their productivity through AI tools. With a diverse range of courses covering various
AI applications, from data analysis and predictive modeling to natural language processing
and computer vision, learners have the flexibility to customize their learning journey
according to their specific interests and career goals. Through engaging multimedia content,
interactive quizzes, and hands-on projects, we foster an immersive learning environment that
enables learners to acquire practical skills and expertise in AI-driven productivity tools.
3. Continuous Learning and Professional Development: The platform offers a range
of resources and tools for continuous learning and professional development, including online
forums, networking events, and mentorship opportunities. By fostering a community of
like-minded professionals and experts, we create a collaborative ecosystem where individuals can
exchange ideas, share best practices, and stay updated on the latest advancements in AI
technology.
1.3.2 Scope
The scope of the project encompasses specific tasks, activities, and deliverables that will be
undertaken to achieve the project's purpose.
It includes:
Data Collection: Gathering the datasets provided by Sprocket Central Pty Ltd, ensuring
completeness and accuracy.
Data Preparation: Cleaning and preprocessing the data to address issues such as missing
values, inconsistencies, and duplicates.
Exploratory Data Analysis (EDA): Conducting exploratory analysis to understand the
characteristics and patterns present in the data.
Statistical Analysis: Performing statistical tests and modeling techniques to extract
meaningful insights from the data.
Dashboard Development: Creating an interactive dashboard to visualize key findings and
analysis results, allowing stakeholders to explore the data dynamically.
Targeting Strategy Formulation: Using insights derived from the analysis to develop
targeted marketing and outreach strategies for the new 1000 customer list.
Stakeholder Engagement: Collaborating with stakeholders to gather requirements,
provide updates on project progress, and solicit feedback on analysis findings and
dashboard prototypes.
Documentation and Reporting: Documenting methodologies, analysis results, insights,
and recommendations in a comprehensive report for stakeholders' reference.
The scope delineates the specific activities and deliverables that will be executed within the
project, providing a roadmap for achieving the project's purpose effectively.
1.3.3 Applicability
The applicability of this project extends to various industries and organizations seeking to
leverage data-driven insights to inform strategic decision-making processes.
Chapter 2
PROBLEM STATEMENT
Task 01:
The problem at hand is the presence of data quality issues within the datasets provided by
Sprocket Central Pty Ltd. These issues include missing values, inconsistencies, duplicates,
outliers, and discrepancies in data integrity.
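The kinds of issues named above can be counted before any cleaning is attempted. The following is a minimal sketch using Pandas, assuming a small hypothetical table; the column names (`customer_id`, `gender`, `tenure`) and the helper `quality_summary` are illustrative assumptions and do not come from the client datasets.

```python
import pandas as pd

# Hypothetical sample standing in for one of the client datasets;
# the column names and values are assumptions made for illustration only.
customers = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "gender": ["F", "Female", "Female", "M", None],
    "tenure": [5.0, 12.0, 12.0, None, 7.0],
})


def quality_summary(df: pd.DataFrame) -> dict:
    """Count common data quality issues in a DataFrame (hypothetical helper)."""
    return {
        "rows": len(df),
        # Number of missing values in each column.
        "missing_per_column": df.isna().sum().to_dict(),
        # Rows that exactly repeat an earlier row.
        "duplicate_rows": int(df.duplicated().sum()),
        # Distinct non-missing values per column; an unexpectedly high count
        # for a categorical column hints at inconsistent labels.
        "distinct_values": df.nunique().to_dict(),
    }


summary = quality_summary(customers)
print(summary)
```

A summary of this kind only reports the issues; deciding how to resolve each one is a separate step.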
Task 02:
Chapter 3
METHODOLOGY
2. Data Profiling: Perform data profiling to summarize key statistics and characteristics of
the datasets. This includes analyzing data types, distributions, summary statistics, and
identifying any initial data quality issues.
3. Missing Values Handling: Identify missing values within the datasets and determine
appropriate handling strategies. This may involve techniques such as imputation (e.g., mean,
median, mode imputation), deletion of missing records, or advanced imputation methods
(e.g., predictive modeling).
4. Data Cleaning: Address inconsistencies, duplicates, and outliers in the data to improve its
quality. This may include standardizing data formats, resolving inconsistencies in naming
conventions, removing duplicate entries, and detecting and addressing outliers using
statistical methods.
5. Data Integration: Integrate and reconcile related datasets to ensure consistency and
coherence in the data. This involves identifying common identifiers or keys and merging
datasets based on these keys. Ensure data integrity by resolving any discrepancies or conflicts
between datasets.
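Steps 3 to 5 above can be sketched in Pandas as follows. This is an illustration under stated assumptions, not the exact procedure used in the project: the two tables, their column names, the choice of median imputation, and the 1.5 * IQR outlier rule are all assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical tables; column names and values are illustrative assumptions.
customers = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "gender": ["F", "Female", "Female", "M", "Male"],
    "tenure": [5.0, 12.0, 12.0, None, 7.0],
})
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 9],
    "list_price": [100.0, 120.0, 110.0, 105.0, 5000.0],
})

cleaned = customers.copy()

# Step 4: standardize inconsistent category labels.
cleaned["gender"] = cleaned["gender"].replace({"F": "Female", "M": "Male"})

# Step 4: remove exact duplicate records. This is done before imputation
# so that repeated rows do not skew the median.
cleaned = cleaned.drop_duplicates().reset_index(drop=True)

# Step 3: impute missing numeric values with the column median.
cleaned["tenure"] = cleaned["tenure"].fillna(cleaned["tenure"].median())

# Step 4: flag (rather than delete) outliers using the 1.5 * IQR rule.
q1, q3 = transactions["list_price"].quantile([0.25, 0.75])
iqr = q3 - q1
transactions["is_outlier"] = ~transactions["list_price"].between(
    q1 - 1.5 * iqr, q3 + 1.5 * iqr
)

# Step 5: merge on the shared key; the indicator column exposes
# transactions whose customer_id has no match in the customer table.
merged = transactions.merge(cleaned, on="customer_id", how="left", indicator=True)
unmatched = merged[merged["_merge"] == "left_only"]

print(cleaned)
print(unmatched[["customer_id", "list_price"]])
```

Flagging outliers and unmatched keys, instead of dropping them silently, keeps the decision about each record visible for review.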
2. Data Preparation: Prepare the cleaned and validated datasets for dashboard development.
This involves structuring the data, formatting it appropriately for visualization, and
potentially aggregating or summarizing data to meet dashboard requirements.
3. Dashboard Design: Design the layout, structure, and visual elements of the dashboard
based on the gathered requirements and best practices in data visualization. Consider user
personas, user stories, and user experience (UX) principles to ensure the dashboard meets
stakeholder needs.
5. Dashboard Development: Develop the interactive dashboard in Tableau. Implement
the designed layout and visualizations, ensuring
interactivity, responsiveness, and alignment with branding guidelines.
6. Testing and Iteration: Test the dashboard functionality and usability to identify any
issues or areas for improvement. Gather feedback from stakeholders and iterate on the design
and functionality accordingly to enhance user experience and effectiveness.
7. Deployment and Training: Deploy the dashboard to stakeholders and provide training
and support as needed to ensure they can effectively use and interpret the dashboard for
decision-making. Provide user documentation and tutorials to facilitate adoption and usage.
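The data preparation step above (structuring and aggregating data for the dashboard) can be illustrated with a short Pandas sketch. The table, its column names (`transaction_date`, `wealth_segment`, `profit`), and the monthly aggregation level are assumptions for illustration; the actual dashboard may use different fields and granularity.

```python
import pandas as pd

# Hypothetical cleaned transaction data; column names and values are assumptions.
transactions = pd.DataFrame({
    "transaction_date": pd.to_datetime(
        ["2017-01-05", "2017-01-20", "2017-02-03", "2017-02-17"]
    ),
    "wealth_segment": [
        "Mass Customer", "High Net Worth", "Mass Customer", "Mass Customer"
    ],
    "profit": [50.0, 120.0, 80.0, 30.0],
})

# Aggregate to one row per month and segment, a tidy shape that a
# visualization tool can read directly.
monthly = (
    transactions
    .assign(month=transactions["transaction_date"].dt.strftime("%Y-%m"))
    .groupby(["month", "wealth_segment"], as_index=False)
    .agg(total_profit=("profit", "sum"), transaction_count=("profit", "count"))
)

# Serialize to CSV text; passing a file path to to_csv would instead
# write a file that can be used as a dashboard data source.
csv_text = monthly.to_csv(index=False)
print(csv_text)
```

Pre-aggregating in this way is optional, since a dashboard tool can also aggregate row-level data itself; it is mainly useful when the raw data is large or needs a fixed reporting grain.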
Chapter 4
TOOLS AND TECHNOLOGY
1. Jupyter Notebook: Jupyter Notebook is an open-source web application that allows for
the creation and sharing of documents containing live code, equations, visualizations, and
narrative text. It provides an interactive environment for data analysis and exploration.
2. Pandas Library: Pandas is a powerful Python library for data manipulation and analysis.
It provides data structures (such as DataFrames and Series) and functions for cleaning,
transforming, and analyzing tabular data.
4. Data Preparation: Pandas is used for data preparation tasks such as loading datasets,
cleaning missing values, handling outliers, and performing data transformations. Jupyter
Notebook provides an interactive environment for executing and documenting these data
preparation steps.
5. Exploratory Data Analysis (EDA): Pandas and Matplotlib/Seaborn are used for
conducting exploratory data analysis to gain insights into the data's distribution,
relationships, and patterns. Visualizations such as histograms, scatter plots, and box plots are
created to explore and understand the data.
By using Pandas and Jupyter Notebook, we leverage the
flexibility, interactivity, and rich visualization capabilities of Python-based data analysis
tools to create informative exploratory data analysis notebooks for stakeholders at
Sprocket Central Pty Ltd.
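As an illustration of the exploratory analysis described above, the sketch below draws a histogram and a box plot with Pandas and Matplotlib. The data and the column names (`age`, `past_3_years_purchases`) are hypothetical, and the non-interactive `Agg` backend is chosen only so the example runs without a display; in a Jupyter Notebook that line can be dropped so the figure renders inline.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical customer data; column names and values are assumptions.
customers = pd.DataFrame({
    "age": [23, 35, 41, 29, 52, 47, 38, 61],
    "past_3_years_purchases": [12, 48, 75, 30, 66, 81, 54, 20],
})

fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: shows the shape of the age distribution.
customers["age"].plot.hist(bins=5, ax=ax_hist)
ax_hist.set_title("Age distribution")
ax_hist.set_xlabel("Age")

# Box plot: shows the spread of purchase counts and any extreme values.
customers.boxplot(column="past_3_years_purchases", ax=ax_box)
ax_box.set_title("Purchases in the past 3 years")

fig.tight_layout()

# Summary statistics complement the plots.
print(customers.describe())
```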
1. Tableau Desktop: Tableau Desktop is the primary tool used for creating interactive
dashboards. It offers a user-friendly interface and powerful visualization capabilities,
allowing users to drag and drop data fields to create various visualizations.
2. Data Connection: Tableau connects to various data sources, including databases (e.g.,
SQL Server, MySQL), spreadsheets (e.g., Excel), cloud services (e.g., Google Analytics),
and web data connectors. It allows for seamless integration of data from multiple sources
into the dashboard.
5. Publishing and Sharing: Once the dashboard is created in Tableau Desktop, it can be
published to Tableau Server or Tableau Online for sharing with stakeholders. Tableau
Server enables secure access to the dashboard via web browsers or Tableau Mobile app,
with options for embedding in web pages or applications.
Chapter 5
IMPLEMENTATION
Chapter 6
RESULTS
Chapter 7
REFLECTION NOTES
The process of conducting a data quality assessment was crucial in ensuring the reliability
and accuracy of the analysis conducted. It highlighted the importance of thorough data
cleaning and validation procedures to address issues such as missing values,
inconsistencies, and outliers.
Developing the interactive dashboard provided insights into the power of data visualization
in communicating key findings and analysis results effectively. It emphasized the
significance of designing user-friendly interfaces and incorporating interactivity to
facilitate stakeholder engagement and decision-making.
Utilizing tools such as Tableau, Pandas, and Jupyter Notebook showcased the versatility
and effectiveness of different technologies in data analysis and dashboard development.
Each tool offered unique features and capabilities, allowing for a comprehensive approach
to data exploration and visualization.
Collaborating with stakeholders throughout the project was essential in gathering
requirements, soliciting feedback, and ensuring alignment with organizational objectives.
Effective communication and engagement fostered a sense of ownership and involvement
among stakeholders, leading to more impactful outcomes.
Ultimately, the project's success will be measured by its impact on driving informed
decision-making and achieving business objectives for Sprocket Central Pty Ltd. By
leveraging data-driven insights and visualization techniques, the project aims to empower
stakeholders with actionable recommendations to drive growth and profitability.
Looking ahead, there are opportunities to expand the scope of the project by incorporating
advanced analytics techniques, exploring additional data sources, and integrating predictive
modeling capabilities into the dashboard. Continuously evolving and adapting to emerging
trends and technologies will be key to sustaining the project's impact over time.
The project has provided valuable insights into the importance of data quality assessment,
dashboard development, stakeholder collaboration, and continuous improvement in driving
business success through data-driven decision-making. It serves as a foundation for future
initiatives aimed at leveraging data as a strategic asset for Sprocket Central Pty Ltd.
Chapter 8
REFERENCES
References for the project:
1. Tableau: https://www.tableau.com
2. Pandas Documentation: https://pandas.pydata.org/docs/
3. Jupyter Notebook Documentation: https://jupyter.org/documentation
4. Matplotlib Documentation: https://matplotlib.org/stable/contents.html
5. Seaborn Documentation: https://seaborn.pydata.org/tutorial.html
These references were consulted for guidance on using specific tools and technologies, as
well as for documentation and tutorials on data analysis, visualization, and dashboard
development.
INTERNSHIP CERTIFICATE