The document provides an overview of various data science tools and techniques including Excel, Power BI, Python programming, SQL, NumPy, and data visualization methods. It describes the purpose and functionality of each tool. Excel is used for organizing, analyzing, and manipulating tabular data. Power BI allows creation of interactive reports and visualizations from various data sources. Python is an interpreted, object-oriented programming language useful for data analysis and rapid application development. SQL is used to create and manage relational databases. NumPy is a Python library for numerical computations that uses n-dimensional arrays. Visualization methods like bar charts, line charts and heatmaps can provide insights from large datasets.
Advance Data Science Clusters
ADVANCE DATA SCIENCE
Presented by B.kavya

Advance Data Science :
A Data Science Advanced Course provides exposure to machine learning and other advanced techniques. It helps you learn software development through dashboards, machine learning, prediction models, regression analysis and other current technologies.
The objective of this programme is to introduce participants to the different tools and techniques used for handling, managing, analyzing and interpreting data.

EXCEL :
Excel is a spreadsheet application developed by Microsoft that allows users to organize, analyze, and manipulate data in a tabular format. It provides a user-friendly interface that enables users to enter data, perform calculations, and create charts and graphs.
Excel stores data in worksheets, which are organized into workbooks. A workbook is a collection of one or more worksheets, and each worksheet is composed of cells arranged in rows and columns. Each cell can contain data, such as text, numbers, formulas, or functions.

Power BI :
Power BI is a business intelligence tool developed by Microsoft. It allows you to create interactive visualizations and reports from various data sources, including Excel spreadsheets, cloud-based and on-premises databases, and other online services. Power BI lets users transform, analyze and visualize data, which makes it suitable for businesses of all sizes.
Let’s explore some scenarios where Power BI can be useful.
Scenario 1: Sales Analysis
Scenario 2: Customer Behavior Analysis
Scenario 3: Financial Analysis

Python Programming :
Python is an interpreted, object-oriented, high-level programming language. Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components together.

Conditional Statements :
Almost every non-trivial program relies on conditional statements (if, elif, else) to decide which code runs. Most problem statements given to you will need them.

Dictionary :
A dictionary is a collection of key-value pairs that is changeable and indexed by key. Since Python 3.7 a dictionary preserves insertion order; in earlier versions it was unordered.

LISTS :
There are many built-in functions and methods for lists that help solve problem statements with ease and in fewer lines of code.

Tuples :
There are built-in functions and methods for tuples that help solve problem statements with ease and in fewer lines of code. Unlike lists, tuples cannot be changed after they are created.

STRING :
A string is a sequence of characters. String literals in Python are surrounded by either single quotation marks or double quotation marks.

Tabulae :
Tabulae (plural of tabula) is a Latin word that means "tables," and in the context of data analysis, it refers to a collection of information in a structured format. Tabulae are often used in data visualization and analysis to provide insights into large datasets.

Types of Tabulae Charts :
There are several types of tabulae charts that are commonly used in data visualization.
Some of these include:
Bar Chart: A bar chart uses rectangular bars to represent data. The height or length of each bar corresponds to the value of the data it represents.
Line Chart: A line chart uses lines to connect data points. It is often used to show trends over time.
Pie Chart: A pie chart uses slices of a circle to represent data. The size of each slice corresponds to the value of the data it represents.
Scatter Chart: A scatter chart uses dots to represent data. It is often used to show the relationship between two variables.
Heatmap Chart: A heatmap chart uses color to represent data. It is often used to show patterns in large datasets.

METADATA :
Metadata is data that describes other data. It provides information about the structure, content, and context of a dataset. Metadata is essential for understanding and working with tabulae because it provides the information needed to analyze and interpret the data accurately.

Visual Analysis :
Visual analysis is the process of analyzing data using visual tools and techniques. It involves using charts, graphs, and other visualizations to understand and interpret data. Visual analysis is essential for understanding large datasets because it allows analysts to see patterns and trends that may not be apparent in raw data.

SQL :
SQL, or Structured Query Language, is a language used to manage and manipulate relational databases. It is a standard language supported by most relational database systems, including MySQL, Oracle, SQL Server, and PostgreSQL. SQL allows users to create and manage databases, as well as perform tasks such as inserting, updating, and retrieving data. It uses a set of commands and syntax to communicate with the database and manipulate the data stored within it.
Some of the most commonly used SQL commands include:
SELECT: retrieves data from one or more tables in the database.
INSERT: adds new data to a table in the database.
UPDATE: modifies existing data in a table in the database.
DELETE: removes data from a table in the database.
CREATE: creates a new table or database in the database system.
DROP: deletes a table or database from the database system.

NUMPY :
NUMPY ARRAY :
NumPy is a Python library for numerical computations. It is an essential tool for scientific computing, data analysis and machine learning. One of the fundamental objects in NumPy is the N-dimensional array, or ndarray.

Operations on Array :
Creating an array: NumPy allows you to create an array from existing data such as a Python list, optionally with a specific data type, using the np.array() function.
Indexing and Slicing: You can access individual elements of an array by indexing, and multiple elements of an array by slicing.
Reshaping: You can change the shape of an array using the np.reshape() function.
Concatenation: Using the np.concatenate() function, you can concatenate two or more arrays.
Splitting: Using the np.split() function, you can split an array into two or more smaller arrays.
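
The array operations described in the NumPy section can be tried out with a short script. This is only a minimal sketch: the sample values and the variable names (a, b, first, middle, grid, joined, parts) are chosen here for illustration and do not come from the original slides.

```python
import numpy as np

# Creating an array: build a 1-D array of six integers from a Python list.
a = np.array([1, 2, 3, 4, 5, 6])

# Indexing and slicing: one element, then a range of elements.
first = a[0]       # the element at position 0
middle = a[1:4]    # the elements at positions 1, 2 and 3

# Reshaping: arrange the same six values as 2 rows by 3 columns.
grid = np.reshape(a, (2, 3))

# Concatenation: join two arrays end to end.
b = np.array([7, 8, 9])
joined = np.concatenate((a, b))

# Splitting: cut the nine-element array into three equal pieces.
parts = np.split(joined, 3)

print(first)
print(middle)
print(grid)
print(joined)
print(parts)
```

np.split() with an integer argument requires the array length to be divisible by that number, which is why the nine-element array is split into three pieces here.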