An Internship Report On
An Analysis of
Amazon Sales Data using Data Analytics
Submitted in partial fulfillment of the requirements for
the 6th Semester of
BACHELOR OF COMPUTER APPLICATIONS
Of
Submitted by
ASFIYA KAUSAR
U18HF21S0020
3rd Year 6th sem
Under the guidance of
Internship Guide
Dr. Priya M S
Head of the Department of Computer Applications
St. Anne’s First Grade College for Women
Bachelor of Computer Applications
Millers Road, Vasanth Nagar, Bangalore - 560052
2023 - 2024
iNeuron Intelligence Pvt Ltd
17th Floor Tower A, Brigade Signature Towers,
Sannatammanahalli, Bengaluru, Karnataka -562129.
Internship Experience Letter
DATE: 6th May 2024
TO WHOM IT MAY CONCERN
This is to certify that Ms. Asfiya Kausar of VI Sem BCA, St. Anne’s First
Grade College for Women, having Reg. No. U18HF21S0020, has successfully
completed an internship program from 4th April 2024 to 6th May 2024 in Amazon Sales
Data Analysis at INEURON INTELLIGENCE PRIVATE LIMITED. During their internship
programme with us, they demonstrated exceptional skills with a self-motivated attitude
to learn new things and implement them end to end in line with our
industrial standards. Their performance was excellent and they were able to complete the
project successfully on time.
We wish them all the best for future endeavours.
Regards, Sudhanshu Kumar
CERTIFICATE
This is to certify that the internship entitled
An Analysis of
Amazon Sales Data using Data Analytics
Submitted in partial fulfillment of the requirements of the 6th
Semester of
Bachelor of Computer Applications
is a result of the bonafide work carried out by
ASFIYA KAUSAR
U18HF21S0020
III BCA
From 04-04-2024 to 06-05-2024, at iNeuron.ai
Head of the Department: Dr. Priya M S, Department of Computer Applications,
St. Anne’s FGC for Women
Principal: Rev. Dr. Sr. Aneecia, St. Anne’s FGC for Women
St. Anne’s First Grade College for Women
Millers Road, Vasanth Nagar, Bengaluru - 560052
Phone - 7618789552
[email protected]
Student Declaration
I, Asfiya Kausar, hereby declare that this report entitled
An Analysis of Amazon Sales Data using Data Analytics
is a study conducted by me during the internship from 04-04-2024 to 06-05-2024
at iNeuron.ai, in partial fulfillment of the examination of VI Semester BCA.
ASFIYA KAUSAR
ACKNOWLEDGEMENT
Firstly, I would like to thank Sudhanshu Kumar of iNeuron.ai for
giving me an opportunity to do an internship within the organization.
We are indebted to our cherished Principal “Dr. Sr. Aneecia”, whose
support and permission were instrumental in undertaking this
dissertation. Their guidance throughout this challenging endeavor has
been invaluable.
We are deeply grateful to our respected Head of Department, “Dr.
Priya M S”, whose support and permission were instrumental in enabling
us to embark on this dissertation journey and guiding us through its
completion.
Our sincere appreciation goes to our dedicated guide, “Mohammed
Iliyaz” for their invaluable mentorship and timely assistance in bringing
our project to fruition.
We are also indebted to the Staff of the Computer Department for
their guidance and support throughout this endeavor. Last but not least,
we express our heartfelt gratitude to our beloved parents and dear
friends for their unwavering encouragement and inspiration.
ASFIYA KAUSAR
U18HF21S0020
CONTENTS
1 Executive Summary
 1.1 Purpose
 1.2 Scope of the internship
 1.3 Outcome of the internship
2 Introduction and Organizational Profile
 2.1 Scope of work in company
 2.2 Domain description
 2.3 Organizational Profile
  2.3.1 Products & Clients
3 Work Description
 3.1 Organizational Chart
 3.2 Intern's job role description
 3.3 Programming language/concept/technology
 3.4 Software/hardware used
4 Learning Outcome
 4.1 Abstract of work experience
 4.2 Any application development/technology learnt
5 Bibliography
Annexures
 Letter of application to the employer for internship [from college]
 Letter of Acceptance by Employer [internship confirmation letter by Employer]
 Log sheet provided by Company
 Questionnaires
1. Executive Summary
Problem statement
Create a report by extracting, transforming, and loading data that captures
sales trends by year, month, and quarter, and find relationships in the data
to understand and analyse the facts.
Benefits
Helps make better business decisions.
Helps analyse customer trends and satisfaction, which can lead to new and
better products and services.
Gives better insight into the customer base.
Simplifies resource management.
Objective
Sales management has grown in importance in response to increased
competitiveness and the necessity for efficient distribution strategies to
decrease costs and raise revenues.
Find the monthly and yearly sales and profit trends.
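The trend computation named in this objective can be sketched in Python with only the standard library; the records below are hypothetical, and in practice a library such as Pandas would be used on the real Amazon sales file:

```python
from collections import defaultdict

# Hypothetical sales records: (date "YYYY-MM-DD", sale amount, profit).
records = [
    ("2011-03-15", 1200.0, 240.0),
    ("2011-07-02",  800.0, 120.0),
    ("2012-03-20", 1500.0, 300.0),
    ("2012-11-11",  950.0, 190.0),
]

# Accumulate totals per year and per (year, month).
yearly = defaultdict(lambda: {"sales": 0.0, "profit": 0.0})
monthly = defaultdict(lambda: {"sales": 0.0, "profit": 0.0})

for date, sales, profit in records:
    year, month, _ = date.split("-")
    yearly[year]["sales"] += sales
    yearly[year]["profit"] += profit
    monthly[(year, month)]["sales"] += sales
    monthly[(year, month)]["profit"] += profit

for year in sorted(yearly):
    print(year, yearly[year])
```

The same grouping extended to quarters (month 1-3 is Q1, and so on) yields the quarterly view the problem statement asks for.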
1.1 Purpose
With effective sales data analysis, you can work smarter without having to work
longer or harder. You’ll be seeing better results, closing more deals, and your team
will be happier and more motivated because they’ll be achieving success
themselves.
Identifying your bottlenecks
Let’s start out by looking at your pipeline. It’s the easiest place to start and you
don’t need to change your team’s behavior to understand where things could be
improved.
First of all, take a look at your likelihood of sales numbers and how likely leads
are to move on from each stage of your pipeline. Analysis should reveal any
bottlenecks in your sales process. Remember, you’re always going to lose more
deals than you close; that’s the nature of sales, but you can act to identify stages
where a lot of leads are dropping out of your pipeline and work out where the
problem lies.
Once you’ve isolated the part that needs work, it’s simply a case of looking at your
accumulated data regarding the underperforming section of your pipeline to
determine what prevented deals from progressing to the next stage.
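The bottleneck check described above amounts to computing the conversion rate between consecutive pipeline stages; a minimal Python sketch with made-up stage counts:

```python
# Hypothetical counts of leads reaching each pipeline stage, in order.
stages = [("Lead", 200), ("Qualified", 120), ("Demo", 90),
          ("Proposal", 30), ("Closed", 24)]

# Conversion rate from each stage to the next; the lowest rate
# marks the bottleneck in the pipeline.
conversions = []
for (name_a, count_a), (name_b, count_b) in zip(stages, stages[1:]):
    rate = count_b / count_a
    conversions.append((f"{name_a} -> {name_b}", rate))

bottleneck = min(conversions, key=lambda pair: pair[1])
print(bottleneck)
```

Here only a third of demos become proposals, so that transition is where the accumulated deal data should be examined first.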
1.2 Scope of the internship
Gives experience
A data analyst internship is a resume booster that highlights your real-world data
analysis experience and hands-on application of techniques. It shows employers that
you have practical skills and are familiar with industry analytics tools, making you a
strong candidate for future data analyst positions.
Develops data analyst skills
Working with experienced professionals helps you gain valuable insights and develop
practical expertise in data collection, analysis, and interpretation. On top of that,
internships offer firsthand experience with industry-standard tools and technologies,
giving you the chance to build your work experience and expertise to apply for entry-
level positions.
Gives exposure to the industry
A data analyst internship is your chance to get up close and personal with data
analysis, learning all about its practices, challenges, and trends. You'll gain valuable
insights into how data is used across different sectors, helping you see how your skills
make a real impact in the business world.
Offers networking opportunities
Joining an internship program helps you gain valuable experience and expand your
network, so you can tap into the wisdom of industry experts and fellow interns. These
connections can open doors to exciting job opportunities and mentorships, taking your
career in data analysis to the next level.
1.3 Outcome of the internship
Informed Decisions
With accurate and timely sales data, businesses can make more informed
decisions about their product development, marketing, and sales strategies.
For instance, by analysing historical sales data, businesses can identify
seasonal trends, customer preferences, and market demands. This
information can then be used to develop new products or services, target
specific customer segments, and optimise marketing campaigns. Additionally,
analysing sales per region helps in determining where products or services
are selling the best, enhancing sales and marketing efforts through intelligent
performance insights and actionable suggestions for improving these efforts.
Understanding Customer Behaviour
Sales data analysis provides valuable insights into customer behaviour,
including their buying patterns, preferences, and pain points. By using sales
analytics and understanding customer behaviour, businesses can develop
targeted marketing campaigns, improve customer service, and create
products and services that better meet customer needs.
Identifying Profitable Products and Services
Sales data analysis helps businesses identify their most profitable products
and services. This information can then be used to allocate resources more
effectively, focus on high-potential opportunities, and discontinue
underperforming products or services.
Tracking Progress
Sales data analysis allows businesses to track their progress over time and
measure the effectiveness of their sales and marketing strategies. By using
predictive sales analysis and comparing current sales data to historical data,
businesses can identify areas of improvement and make necessary
adjustments to their strategies.
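Identifying the most profitable products, as described above, reduces to aggregating profit per product and ranking; a minimal Python sketch over hypothetical rows:

```python
from collections import defaultdict

# Hypothetical (product, revenue, profit) rows from a sales extract.
rows = [
    ("Phone Case", 500.0, 200.0),
    ("Charger",    900.0, 150.0),
    ("Phone Case", 300.0, 120.0),
    ("Headphones", 700.0, 350.0),
]

# Total profit per product.
profit_by_product = defaultdict(float)
for product, revenue, profit in rows:
    profit_by_product[product] += profit

# Rank products by total profit, most profitable first.
ranking = sorted(profit_by_product.items(), key=lambda kv: kv[1], reverse=True)
print(ranking)
```

The tail of such a ranking is where underperforming products that may be candidates for discontinuation show up.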
2. Introduction and Organizational Profile
2.1 Scope of work in company
The scope of work for an intern can vary significantly depending on the company,
industry, and specific role they’re assigned to. However, some common tasks and
responsibilities for interns might include:
1. Assisting with projects: Interns often support various projects within their
department or team. This could involve conducting research, gathering data,
or assisting with the implementation of plans.
2. Administrative tasks: Interns may be responsible for administrative duties
such as organizing files, scheduling meetings, answering emails, or
preparing documents and presentations.
3. Learning and training: Internships are also learning opportunities. Interns
may receive training in specific software, tools, or processes relevant to
their role or industry. They may also shadow more experienced employees
to gain insight into their roles and responsibilities.
4. Supporting team members: Interns often provide support to team
members by helping them with tasks, collaborating on projects, or offering
assistance as needed.
5. Special projects: Depending on the company and the intern’s skills and
interests, they may be given the opportunity to work on special projects that
align with their career goals or areas of expertise.
6. Contributing ideas: Interns are encouraged to contribute their ideas and
perspectives to discussions and projects. They may be asked to brainstorm
solutions to problems or offer suggestions for improvement.
7. Networking: Interns have the opportunity to network with professionals in
their field, both within the company and through industry events or
networking opportunities. Building relationships and making connections
can be valuable for future career opportunities.
8. Professional development: Internships often include opportunities for
professional development, such as workshops, seminars, or mentorship
programs, to help interns develop new skills and advance their careers.
9. Feedback and evaluation: Interns may receive feedback on their
performance throughout their internship and have regular evaluations to
assess their progress and areas for improvement.
Overall, the scope of work for an intern is typically designed to provide them with
a valuable learning experience while also contributing to the goals and objectives
of the company.
2.2 Domain description
The field of data analytics involves the analysis of large datasets to uncover
insights, trends, and patterns that can inform decision-making and drive business
strategies. It encompasses a variety of techniques, tools, and methodologies aimed
at extracting valuable information from data.
Domain Description:
1. Data Collection and Integration: Data analytics begins with collecting
relevant data from various sources, including databases, spreadsheets,
sensors, social media, and other sources. This data is then integrated and
organized into a format suitable for analysis.
2. Data Cleaning and Preprocessing: Raw data often contains errors,
missing values, or inconsistencies that need to be addressed before analysis.
Data cleaning involves identifying and correcting these issues to ensure the
quality and integrity of the data.
3. Exploratory Data Analysis (EDA): EDA involves examining the data
visually and statistically to understand its characteristics and identify
patterns or relationships. This typically includes summary statistics, data
visualization, and correlation analysis.
4. Statistical Analysis: Statistical techniques are used to analyze data and
infer insights. This may involve hypothesis testing, regression analysis, time
series analysis, and other statistical methods to uncover relationships and
trends within the data.
5. Machine Learning and Predictive Analytics: Machine learning
algorithms are used to build models that can make predictions or
classifications based on historical data. Predictive analytics leverages these
models to forecast future trends or outcomes and make data-driven
decisions.
6. Data Visualization: Data visualization techniques are used to present data
in a visual format, such as charts, graphs, and dashboards, to facilitate
understanding and communication of insights to stakeholders.
7. Big Data Analytics: With the proliferation of big data, organizations are
increasingly using advanced analytics techniques to analyze large and
complex datasets. This involves technologies such as distributed computing,
parallel processing, and cloud computing to handle the volume, velocity,
and variety of big data.
8. Text Analytics and Natural Language Processing (NLP): Text analytics
and NLP techniques are used to analyze unstructured text data, such as
emails, social media posts, and customer reviews. This involves extracting
insights from text, sentiment analysis, topic modeling, and other techniques
to understand and categorize textual data.
9. Data Mining: Data mining involves discovering patterns and relationships
in large datasets using techniques such as clustering, association rule
mining, and anomaly detection. This can uncover hidden insights and
opportunities for optimization or improvement.
10. Business Intelligence (BI): Data analytics is often integrated into business
intelligence systems to provide actionable insights and support strategic
decision-making within organizations. BI tools and dashboards enable users
to interactively explore data and generate reports for stakeholders.
Overall, data analytics plays a crucial role in helping organizations extract value
from their data assets and gain a competitive advantage in today’s data-driven
world. It encompasses a wide range of techniques and technologies aimed at
transforming data into actionable insights to drive business growth and innovation.
2.3 Organizational Profile
Welcome to iNeuron, your gateway to affordable, high-quality education in
emerging technologies. We prioritize making education accessible without
compromising excellence. Our industry-expert crafted programs ensure learners
receive practical, real-world skills.
With a commitment to affordability, iNeuron's 360-degree learning solution
integrates virtual labs, an internship portal, and a job portal. Our innovative ecosystem
empowers students to thrive in the dynamic tech landscape. Join the iNeuron
community, where collaboration and innovation fuel success. Whether you're a
beginner or a professional seeking to upskill, iNeuron is your partner on the
journey to a future filled with limitless possibilities.
Our goal is to make education and experiential skills affordable and accessible to
everyone regardless of their disparate economic and educational backgrounds. We
empower students to make demands unlike any other platform or institute because
curiosity cannot be contained. Learning cannot be boxed in a book. So let’s step
ahead and 'build together'.
2.3.1 Products and Clients
Affordable online courses
Affordable online courses along with learning communities.
Best in Class/Industry Mentors
Mentors are renowned community contributors, successful entrepreneurs & Industry
leaders.
Internship Portal
Simulate with 500+ real industry projects, and earn your experience letter
Job Portal
An unparalleled job portal that rewards both recruiters and applicants.
One Neuron
One Neuron, a subscription-based OTT platform.
Virtual Lab
Product development in an R&D lab covering robotics, drones, and customized
products such as electronic devices and AI on edge devices.
3 Work Description
3.1 Organizational Chart
Sudhanshu Kumar (CEO)
Syed Shikalgar, Soniya S, Sindhu S, Dariohs B
3.2 Intern's job role description
Company Overview:
iNeuron is a dynamic and innovative Edtech company dedicated to IT training.
We believe in harnessing the power of data to drive informed decision-making and
enhance business performance. As a data analytics intern with us, you'll have the
opportunity to work alongside seasoned professionals and gain hands-on
experience in leveraging data to derive actionable insights.
Position Overview:
We are seeking a highly motivated and analytical individual to join our team as a
Data Analytics Intern. The ideal candidate is passionate about data, possesses
strong analytical skills, and is eager to learn and contribute to our data-driven
initiatives. This internship offers the opportunity to gain valuable experience in
data analysis, statistical modeling, and data visualization techniques.
Key Responsibilities:
Assist in collecting, cleaning, and analyzing large datasets from various
sources.
Use statistical methods to uncover patterns, trends, and insights within the
data.
Support the development and implementation of predictive models and
algorithms.
Create visually compelling dashboards and reports to communicate findings
to stakeholders.
Collaborate with cross-functional teams to understand business
requirements and deliver data-driven solutions.
Contribute to ongoing projects and initiatives aimed at improving data
quality and accessibility.
Stay updated on industry trends and emerging technologies in data
analytics.
Qualifications:
Currently pursuing a degree in Data Science, Statistics, Computer Science,
Mathematics, or a related field.
Strong proficiency in data analysis tools and programming languages such
as Python, R, or SQL.
Familiarity with statistical techniques and machine learning algorithms.
Excellent problem-solving skills and attention to detail.
Effective communication skills with the ability to explain technical concepts
to non-technical stakeholders.
Ability to work both independently and collaboratively in a fast-paced
environment.
Prior experience with data visualization tools (e.g., Tableau, Power BI) is a
plus.
Duration:
This is a 1-month internship position.
3.3 Programming language/concept/technology
Python is a high-level, versatile programming language known for its simplicity,
readability, and ease of use. Developed by Guido van Rossum and first released in
1991, Python has gained immense popularity among developers, data scientists,
researchers, and educators alike.
Key Features of Python:
1. Simple and Readable Syntax: Python's syntax is designed to be easily
readable and straightforward, making it ideal for beginners and experienced
programmers alike. Its indentation-based block structure encourages clean
and organized code.
2. Interpreted and Interactive: Python is an interpreted language, meaning
that code is executed line by line, which allows for interactive development
and debugging. This makes it easy to test small code snippets and
experiment with different solutions.
3. Extensive Standard Library: Python comes with a comprehensive
standard library, offering modules and packages for a wide range of tasks,
from file I/O and networking to web development and data manipulation.
This reduces the need for external dependencies and accelerates
development.
4. Dynamic Typing: Python is dynamically typed, meaning that variable
types are determined at runtime rather than explicitly declared. This
flexibility simplifies coding and allows for rapid prototyping and
development.
5. Cross-Platform Compatibility: Python is available for various operating
systems, including Windows, macOS, and Linux, making it a versatile
choice for developing applications that can run across different platforms
without modification.
6. Large and Active Community: Python boasts a large and vibrant
community of developers who contribute to its ecosystem by creating
libraries, frameworks, and tools. This active community provides extensive
support, resources, and documentation, making it easy to find solutions to
problems and stay updated on best practices.
7. Versatility: Python is widely used across various domains, including web
development (with frameworks like Django and Flask), data science (with
libraries like NumPy, Pandas, and SciPy), machine learning and artificial
intelligence (with libraries like TensorFlow and PyTorch), scripting,
automation, scientific computing, and more.
Applications of Python:
Web Development: Frameworks like Django and Flask are popular for
building web applications and APIs.
Data Science and Analytics: Python is widely used for data manipulation,
analysis, and visualization, with libraries like Pandas, NumPy, Matplotlib,
and Seaborn.
Machine Learning and AI: Python's extensive ecosystem of libraries,
including TensorFlow, PyTorch, and scikit-learn, makes it a preferred
choice for developing machine learning and AI applications.
Scripting and Automation: Python's simplicity and versatility make it well-
suited for scripting tasks and automating repetitive processes.
Scientific Computing: Python is widely used in scientific computing and
research, with libraries like SciPy and SymPy providing tools for numerical
computation, symbolic mathematics, and more.
Overall, Python's combination of simplicity, flexibility, and powerful features
makes it an excellent choice for a wide range of programming tasks, from
beginner-level scripting to complex software development projects.
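A tiny illustration of the dynamic typing and readable syntax described above:

```python
# Dynamic typing: the type of a variable is determined at runtime,
# and no declaration is needed to change it.
value = 42            # an int
value = "forty-two"   # now a str

# Readable, compact data manipulation with a list comprehension:
# keep only the sales above the average.
sales = [120, 95, 110, 130]
above_average = [s for s in sales if s > sum(sales) / len(sales)]
print(above_average)
```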
3.4 Software/hardware used
Tools Used:
1. Microsoft Excel
2. Microsoft Power BI
3. Jupyter Notebook
Microsoft Excel
Budgeting, chart creation, data analytics and more – all at your
fingertips. The Excel spreadsheet and budgeting app lets you create,
view, edit and share files, charts and data. Excel’s built-in file editor
lets you manage your finances with on-the-go budget and expense
tracking integration. We make it easy to review and analyze data, edit
templates, and more.
With Excel you can confidently edit documents, track expenses, and
compile charts and data. Create charts directly from your phone for
convenient data analysis, accounting, and financial management.
Access to spreadsheets, pivot tables and chart makers makes budgeting
in Excel easy.
Make spreadsheets and data files with robust formatting tools and
features that boost your productivity. Build charts and sheets that meet
specific needs with Excel’s wide array of worksheet resources.
Spreadsheets, business collaboration, charts and data analysis tools all
on your phone with Microsoft Excel.
Microsoft Excel Features:
Spreadsheets & Calculations
• Create charts, budgets, task lists, accounting & financial analysis
with Excel's modern templates.
• Use an accounting calculator, data analysis tools & familiar formulas
to run calculations on spreadsheets.
• Workbook sheets and charts are easier to read and use with rich
Office features & formatting options.
• Spreadsheet & chart features, formats & formulas operate the same
way on any device.
Accounting, Budget & Expense Tracking
• Budget Template: Spreadsheets & charts help calculate budget needs.
• Budget Planner: Budget templates & tools to help you drill down to
your finance needs.
• Budget Tracker: Track expenses & save money.
• Accounting app: Use as a tax calculator for estimates, personal
finances & more.
Data Analysis
• Chart maker: Annotate, edit & insert charts that bring data to life.
• Data analysis: Add & edit chart labels to highlight key insights.
• Budget tracker: Track expenses with a personal budget template.
• Pivot Charts and spreadsheet visualization tools offer easily
digestible formats.
Review and Edit
• File editor: Edit documents, charts and data from anywhere.
• Data analysis features like sort & filter columns.
• Annotate charts, highlight parts of worksheets, create shapes & write
equations with the draw tab on devices with touch capabilities.
Collaborate and Work Anywhere
• Share files & Excel sheets in a few taps to invite others to edit, view
or leave comments.
• Edit & copy your worksheet in the body of an email or attach a link
to your workbook.
Microsoft Excel is your all-in-one expense manager, chart maker,
budget planner, and more. Get more done today with extensive
spreadsheet tools to enhance your productivity.
REQUIREMENTS:
1 GB RAM or above
To create or edit documents, sign in with a free Microsoft account on
devices with a screen size smaller than 10.1 inches.
Microsoft Power BI
Power BI is a collection of software services, apps, and connectors that work
together to turn your unrelated sources of data into coherent, visually immersive,
and interactive insights. Your data might be an Excel spreadsheet, or a collection
of cloud-based and on-premises hybrid data warehouses. Power BI lets you easily
connect to your data sources, visualize and discover what's important, and share
that with anyone or everyone you want.
The parts of Power BI
Power BI consists of several elements that all work together, starting with these
three basics:
A Windows desktop application called Power BI Desktop.
An online software as a service (SaaS) service called the Power BI
service.
Power BI Mobile apps for Windows, iOS, and Android devices.
These three elements—Power BI Desktop, the service, and the mobile apps—are
designed to let you create, share, and consume business insights in the way that
serves you and your role most effectively.
Beyond those three, Power BI also features two other elements:
Power BI Report Builder, for creating paginated reports to share in
the Power BI service. Read more about paginated reports later in this
article.
Power BI Report Server, an on-premises report server where you can publish your
Power BI reports, after creating them in Power BI Desktop. Read more about Power
BI Report Server later in this article.
How Power BI matches your role
How you use Power BI depends on your role in a project or on a team. Other
people, in other roles, might use Power BI differently.
For example, you might primarily use the Power BI service to view reports and dashboards.
Your number-crunching, business-report-creating coworker might make extensive use of
Power BI Desktop or Power BI Report Builder to create reports, then publish those reports to
the Power BI service, where you view them. Another coworker, in sales, might mainly use the
Power BI Mobile app to monitor progress on sales quotas, and to drill into new sales lead
details.
If you're a developer, you might use Power BI APIs to push data into semantic
models or to embed dashboards and reports into your own custom applications.
Have an idea for a new visual? Build it yourself and share it with others.
You also might use each element of Power BI at different times, depending on
what you're trying to achieve or your role for a given project.
How you use Power BI can be based on which feature or service of Power BI is
the best tool for your situation. For example, you can use Power BI Desktop to
create reports for your own team about customer engagement statistics and you
can view inventory and manufacturing progress in a real-time dashboard in the
Power BI service. You can create a paginated report of mailable invoices, based
on a Power BI semantic model. Each part of Power BI is available to you, which is
why it's so flexible and compelling.
Explore documents that pertain to your role:
Power BI for business users
Power BI Desktop for report creators
Power BI Report Builder for enterprise report creators
Power BI for administrators
Power BI for developers
o What is Power BI embedded analytics?
o Create your own visuals in Power BI
o What can developers do with the Power BI API?
The flow of work in Power BI
One common workflow in Power BI begins by connecting to data sources in
Power BI Desktop and building a report. You then publish that report from Power
BI Desktop to the Power BI service, and share it so business users in the Power BI
service and on mobile devices can view and interact with the report.
This workflow is common, and shows how the three main Power BI elements
complement one another.
Use the deployment pipeline tool
In the Power BI service, you can use the deployment pipeline tool to test your
content before you release it to your users. The deployment pipeline tool can help
you deploy reports, dashboards, semantic models, and paginated reports. Read
about how to get started with deployment pipelines in the Power BI service.
How Microsoft Fabric works with Power BI
Microsoft Fabric is an offering that combines data + services in a unified
environment, making it easier to perform analysis and analytics on various sets of
data. Power BI is an example of one of the services that's integrated with
Microsoft Fabric, and your organization's OneLake data store is an example of
the data that can be used, analyzed, or visualized. Large organizations find
Microsoft Fabric particularly useful, since it can corral large stores of data
and then use services (like Power BI) to bring greater value from that data to
the business.
Administration of Power BI is now handled by Microsoft Fabric, but your favorite
tools like the Power BI service and Power BI Desktop still operate like they
always have - as a service that can turn your data, whether in OneLake or in Excel,
into powerful business intelligence insights.
Paginated reports in the Power BI service
Another workflow involves paginated reports in the Power BI service. Enterprise
report creators design paginated reports to be printed or shared. They can also
share these reports in the Power BI service. They're called paginated because
they're formatted to fit well on a page. They're often used for operational reports,
or for printing forms such as invoices or transcripts. They display all the data in a
table, even if the table spans multiple pages. Power BI Report Builder is the
standalone tool for authoring paginated reports.
Jupyter Notebook
Introduction
Jupyter Notebook is a notebook authoring application, under the Project
Jupyter umbrella. Built on the power of the computational notebook
format, Jupyter Notebook offers fast, interactive new ways to prototype and
explain your code, explore and visualize your data, and share your ideas with
others.
Notebooks extend the console-based approach to interactive computing in a
qualitatively new direction, providing a web-based application suitable for
capturing the whole computation process: developing, documenting, and executing
code, as well as communicating the results. The Jupyter notebook combines two
components:
A web application: A browser-based editing program for interactive authoring of
computational notebooks which provides a fast interactive environment for
prototyping and explaining code, exploring and visualizing data, and sharing ideas
with others
Computational notebook documents: shareable documents that combine
computer code, plain-language descriptions, data, rich visualizations such as 3D
models, charts, graphs and figures, mathematics, and interactive controls
Main features of the web application
In-browser editing for code, with automatic syntax highlighting,
indentation, and tab completion/introspection.
The ability to execute code from the browser, with the results of
computations attached to the code which generated them.
Displaying the result of computation using rich media representations, such
as HTML, LaTeX, PNG, SVG, etc. For example, publication-quality figures
rendered by the matplotlib library can be included inline.
In-browser editing for rich text using the Markdown markup language, so
commentary for the code is not limited to plain text.
The ability to easily include mathematical notation within markdown cells
using LaTeX, and rendered natively by MathJax.
Notebook documents
Notebook documents contain the inputs and outputs of an interactive session as
well as additional text that accompanies the code but is not meant for execution. In
this way, notebook files can serve as a complete computational record of a session,
interleaving executable code with explanatory text, mathematics, and rich
representations of resulting objects. These documents are internally JSON files
and are saved with the .ipynb extension. Since JSON is a plain text format, they
can be version-controlled and shared with colleagues.
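Because .ipynb files are plain JSON, they can be inspected with any JSON tool. As an illustrative sketch, the minimal notebook below is built in memory rather than read from a real file, but it has the same top-level structure Jupyter saves to disk:

```python
import json

# A minimal notebook document, constructed by hand for illustration.
# Real .ipynb files created by Jupyter share this top-level layout.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Sales analysis"]},
        {"cell_type": "code", "execution_count": 1, "metadata": {},
         "outputs": [], "source": ["print('hello')"]},
    ],
}

# Round-trip through JSON text, exactly as saving and reloading would.
text = json.dumps(notebook)
loaded = json.loads(text)

# List the cell types, as a version-control diff tool might summarize them.
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)  # ['markdown', 'code']
```

Since the format is plain text, two revisions of a notebook can be compared line by line, which is what makes version control with Git practical.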
Notebooks may be exported to a range of static formats, including HTML (for
example, for blog posts), reStructuredText, LaTeX, PDF, and slide shows, via the
nbconvert command.
Furthermore, any .ipynb notebook document available from a public URL can be
shared via the Jupyter Notebook Viewer (nbviewer). This service loads the
notebook document from the URL and renders it as a static web page. The results
may thus be shared with a colleague, or as a public blog post, without other users
needing to install Jupyter Notebook themselves. In effect, nbviewer is simply
nbconvert as a web service, so you can do your own static conversions with
nbconvert, without relying on nbviewer.
To get the most out of this tutorial, familiarity with programming,
particularly Python and pandas, is recommended. However, even if you have
experience with another language, the Python code in this article should be
accessible.
Jupyter Notebooks can also serve as a flexible platform for learning pandas and
Python. In addition to the core functionality, we'll explore some exciting features:
Cover the basics of installing Jupyter and creating your first notebook
Delve deeper into important terminology and concepts
Explore how notebooks can be shared and published online
Demonstrate the use of Jupyter Widgets, Jupyter AI, and discuss security
considerations
(This article was written as a Jupyter Notebook and published in read-only form,
showcasing the versatility of notebooks. Most of our programming tutorials and
Python courses were created using Jupyter Notebooks).
By the end of this tutorial, you'll have a solid understanding of how to set up and
utilize Jupyter Notebooks effectively, along with exposure to powerful features
like Jupyter AI, while keeping security in mind.
4. Learning Outcome
4.1 Abstract of work experience
During my tenure as a Data Analyst Intern, I embarked on a transformative
learning journey that equipped me with invaluable skills and insights into the
realm of data analytics. This abstract encapsulates the essence of my learning
experience, highlighting key areas of growth and accomplishments.
Data Acquisition and Preprocessing: My internship commenced with an
exploration of data acquisition techniques, where I gained proficiency in sourcing,
cleansing, and preprocessing diverse datasets. Through hands-on experience with
tools such as SQL and Python libraries like Pandas, I honed my ability to extract
relevant information from raw data and prepare it for analysis.
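A minimal sketch of this kind of preprocessing with pandas is shown below; the column names and values are invented for illustration and are not taken from the actual internship dataset:

```python
import pandas as pd

# Hypothetical raw sales records: a duplicate row, a missing region,
# and numeric values stored as text.
raw = pd.DataFrame({
    "order_id": [1001, 1002, 1002, 1003],
    "region": ["Asia", "Europe", "Europe", None],
    "units_sold": ["10", "5", "5", "8"],
})

# Typical cleaning steps: drop duplicate rows, convert text to numbers,
# and remove records missing a key field.
clean = (
    raw.drop_duplicates()
       .assign(units_sold=lambda df: pd.to_numeric(df["units_sold"]))
       .dropna(subset=["region"])
       .reset_index(drop=True)
)
print(len(clean))  # 2 rows survive cleaning
```

The same steps scale unchanged from this toy frame to a CSV with millions of rows, which is what makes pandas a standard tool for the preparation stage.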
Statistical Analysis and Modeling: Moving deeper into the analytics process, I
studied statistical analysis and modeling methodologies, learning how to
identify patterns, trends, and correlations within datasets. Through projects and
guided exercises, I applied statistical techniques such as regression analysis,
hypothesis testing, and clustering to derive meaningful insights and inform
decision-making processes.
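As a small illustration of the regression and correlation techniques mentioned above, a least-squares line can be fitted with NumPy; the spend and sales figures here are made up for the example:

```python
import numpy as np

# Invented advertising-spend vs. sales figures for illustration.
spend = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sales = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

# Pearson correlation coefficient: strength of the linear relationship.
r = np.corrcoef(spend, sales)[0, 1]

# Simple linear regression by least squares:
# sales is approximately slope * spend + intercept.
slope, intercept = np.polyfit(spend, sales, deg=1)

print(round(r, 3))      # close to 1.0: strong positive correlation
print(round(slope, 2))  # roughly 2 units of sales per unit of spend
```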
Data Visualization and Communication: A pivotal aspect of my internship
involved mastering the art of data visualization and communication. Leveraging
tools like Tableau and Matplotlib, I acquired the ability to create visually
compelling dashboards and reports that effectively communicated complex
findings to stakeholders. This skill proved instrumental in distilling technical
insights into actionable recommendations, fostering a culture of data-driven
decision-making within the organization.
Collaboration and Professional Development: Throughout my internship, I
actively engaged in cross-functional collaboration, working alongside seasoned
professionals and contributing to team projects. This collaborative environment
provided me with exposure to real-world challenges and opportunities to refine my
problem-solving and communication skills. Additionally, I participated in
workshops, seminars, and mentorship programs, further enhancing my
professional development and expanding my knowledge of emerging trends in the
field of data analytics.
Conclusion: In conclusion, my internship as a Data Analyst was a transformative
and enriching experience that deepened my understanding of data analytics
principles and methodologies. Through hands-on projects, collaborative
endeavors, and continuous learning, I emerged as a more confident and proficient
data analyst, ready to embark on future endeavors in this dynamic and rapidly
evolving field.
4.2 Any application development/ technology learnt
Exploratory Data Analysis (EDA) is a critical phase in the data analysis process
that focuses on understanding the structure, patterns, and relationships within a
dataset. It involves examining and visualizing data to uncover insights, identify
trends, and formulate hypotheses before proceeding with more in-depth analysis or
modeling. Here's a breakdown of the key components and techniques involved in
exploratory data analysis:
1. Data Summarization: EDA typically begins with summarizing the key
characteristics of the dataset, including measures of central tendency (mean,
median, mode), dispersion (variance, standard deviation), and distribution
(skewness, kurtosis). Descriptive statistics provide an initial overview of the
dataset's properties.
2. Univariate Analysis: Univariate analysis involves analyzing individual
variables in isolation to understand their distribution and properties. This
may include generating histograms, box plots, or bar charts to visualize the
distribution of numerical and categorical variables. Summary statistics and
frequency tables are also used to describe the characteristics of each
variable.
3. Bivariate Analysis: Bivariate analysis explores the relationships between
pairs of variables to uncover potential correlations or associations. Scatter
plots, correlation matrices, and cross-tabulations are commonly used
techniques to visualize the relationship between two variables. This helps
identify patterns and dependencies that may exist within the data.
4. Multivariate Analysis: Multivariate analysis extends beyond the
examination of individual variables and explores interactions between
multiple variables simultaneously. Techniques such as cluster analysis,
principal component analysis (PCA), and factor analysis are employed to
identify underlying structures or groupings within the dataset. These
techniques help uncover complex relationships and patterns that may not be
evident in univariate or bivariate analyses.
5. Data Visualization: Data visualization plays a crucial role in EDA, as it
enables the exploration of complex datasets through graphical
representations. Visualizations such as scatter plots, histograms, heatmaps,
and pair plots help identify trends, outliers, and anomalies within the data.
Interactive visualization tools like Plotly and Tableau facilitate dynamic
exploration and analysis of data.
6. Missing Data and Outlier Detection: EDA also involves identifying and
handling missing data and outliers, as these can significantly impact the
analysis results. Techniques such as imputation, deletion, or outlier
treatment are employed to address missing values and outliers appropriately,
ensuring the integrity and reliability of the analysis.
7. Hypothesis Generation: EDA often leads to the formulation of hypotheses
or insights about the underlying data generating process. These hypotheses
serve as the basis for further analysis, experimentation, or modeling,
guiding the direction of subsequent data-driven investigations.
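The missing-value and outlier steps above can be sketched in a few lines of pandas; the tiny dataset, the median-imputation choice, and the 1.5 × IQR threshold are illustrative assumptions, not prescriptions:

```python
import pandas as pd

# Invented sales amounts with one missing value and one extreme outlier.
df = pd.DataFrame({"amount": [12.0, 15.0, None, 14.0, 13.0, 400.0, 16.0, 15.5]})

# Missing-value treatment: impute with the median (one common choice).
df["amount"] = df["amount"].fillna(df["amount"].median())

# Outlier detection with the IQR rule: keep values inside
# [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
q1, q3 = df["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
mask = df["amount"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
trimmed = df[mask]

print(len(df), len(trimmed))  # 8 rows before, 7 after removing the outlier
```

On a real dataset the same quantile computation runs per column, and the flagged rows would be inspected before deletion rather than dropped automatically.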
Overall, exploratory data analysis is a crucial preliminary step in the data analysis
process, providing a comprehensive understanding of the dataset and laying the
groundwork for more advanced analysis techniques. By leveraging descriptive
statistics, visualization, and statistical techniques, EDA enables data analysts to
uncover meaningful insights and derive actionable conclusions from complex
datasets.
5. Bibliography
https://www.microsoft.com/en-in/microsoft-365/excel
https://www.microsoft.com/en-us/power-platform/products/power-bi
https://www.dataquest.io/blog/jupyter-notebook-tutorial/
Annexures
iNeuron Intelligence Pvt Ltd
17th Floor Tower A, Brigade Signature Towers,
Sannatammanahalli, Bengaluru, Karnataka 562129.
LOGBOOK
Day | Topic | Content Description | Session
Day 1 | Python Basics – Part 1 | Introduction, Programming Importance, Variables & Operators | Completed
Day 2 | Python Basics – Part 2 | List Methods, String Methods, Conditional Statements | Completed
Day 3 | Python Basics – Part 3 | Loops, Functions, Other Must-Know Functions | Completed
Day 4 | Python Basics Assignment | Tuples, Sets, Dictionaries and their Methods, Comprehensions | Completed
Day 5 | Numpy – Part 1 | Questions based on Python basics and data structures concepts | Completed
Day 6 | Numpy – Part 2 | Array Dimensions, List vs Numpy, Reshaping, Indexing, Operations | Completed
Day 7 | Pandas – Part 1 | View vs Copy, Hstack vs Vstack, Concatenation, Insert/Append/Delete Functions | Completed
Day 8 | Pandas – Part 2 | Pandas Series, DataFrame, Concatenation, Top Commands, Indexing DataFrame | Completed
Day 9 | Python Library Assignment | Upload and Read Data, Groupby, Data Range, Reading CSV files | Completed
Day 10 | Data Cleaning Module | Questions based on Numpy and Pandas libraries | Completed
Day 11 | Exploratory Data Analysis – Part 1 | Introduction, NaN Cases, Missing Value Treatment, Ffill, Bfill, Imputation Techniques, DataFrame Iterations, Pandas Functions | Completed
Day 12 | Exploratory Data Analysis – Part 2 | Deleting Rows & Columns, Duplicate Values, Missing Values Handling, Removing Outliers | Completed
Day 13 | Python EDA Assignment | IQR Outlier & Z-Score, Data Visualization, Complete Matplotlib and Seaborn libraries | Completed
Day 14 | Statistics – Part 1 | Questions based on Data Cleaning & Exploratory Data Analysis concepts | Completed
Day 15 | Stats Behind Plots – Part 1 | Introduction, Mean, Median, Mode, Population & Sample Mean | Completed
Day 16 | Stats Behind Plots – Part 2 | Population & Sample Variance, Standard Deviation | Completed
Day 17 | Statistics Case Study 1 | Univariate, Bivariate, Multivariate, Histogram, Boxplot, Bar plot, etc. | Completed
Day 18 | Statistics Case Study 2 | Scatterplot, Line Plot, Pie Plot, Treemap, Advanced Line, Bar & Pie charts, etc. | Completed
Day 19 | Statistics Assignment | Concepts of Descriptive Statistics working with the Employees dataset | Completed
Day 20 | Getting Started with Power BI | Questions based on Applied Statistics working with the Mobile Phone dataset | Completed
Day 21 | Business Challenge | Introduction, Power BI Options | Completed
Day 22 | Creating Charts in Tableau | Power BI Installation, Power BI Desktop, etc. | Completed
Day 23 | Charts Standout Techniques | Working with Datasets | Completed
Day 24 | Filters, Groups & Sets | Power BI Interface, Connecting Data, etc. | Completed
Day 25 | Calculated Fields & Parameters | Charting Methods, Line Chart, Types of Bar Chart, Scatter Chart, Tree-map, Text Tables, Pie Chart, etc. | Completed
Day 26 | Dashboard and Stories | Formatting Layout & Axes, Dual Axis, Combined Axis and Tool Tips | Completed
Day 27 | Power BI Case Study | Types of Filters, Filtering Measures, Groups & Sets Implementation | Completed
Day 28 | Power BI Assignment | Number Calculations, Logical Functions and Parameters | Completed
Day 29 | Power BI Assignment | Dashboard Layout, Action Filter, Hands-on Experience working with Power BI with the help of a sample dataset | Completed
Day 30 | Power BI Assignment | Major Dashboard Project | Completed
Day 31 | Project Submission | One Project | Submitted
Regards,
Sudhanshu Kumar
CEO & Chief AI Engineer at iNeuron.ai