Assignment_on_datatypes
• Floating-point (float): Used for numbers with a decimal point. Example: 3.14, -0.001.
• String (str): Used for sequences of characters. Example: "Hello, World!", "Python".
• List: An ordered collection of elements that can be of different types. Example: [1, 2, 3, "aiml",
4.5].
• Tuple: An ordered collection of elements that are immutable. Example: (1, 2, 3, "aiml", 4.5).
• Dictionary (dict): A collection of key-value pairs. Example: {"name": "Vaibhav", "age": 35}.
• Class: A blueprint for creating objects, encapsulating data and methods to manipulate that
data.
Example:
class Car:
    def __init__(self, brand, year):
        self.brand = brand
        self.year = year
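The built-in type() function can confirm which data type Python assigns to a value. The following is a minimal sketch; the variable names are illustrative and the sample values are taken from the list above.

```python
# One sample value per data type described above
samples = [
    3.14,
    "Python",
    [1, 2, 3, "aiml", 4.5],
    (1, 2, 3, "aiml", 4.5),
    {"name": "Vaibhav", "age": 35},
]

# type() reports the data type Python inferred for each value
type_names = [type(value).__name__ for value in samples]
print(type_names)
```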
Why Are Data Types Essential?
1. Memory Management:
• Data types tell the interpreter how much memory to allocate for a value. For
instance, in languages with fixed-size types, a 32-bit int takes up less memory than a 64-bit float.
2. Data Integrity:
• Using the correct data type ensures that the data is stored and manipulated correctly. For
example, treating a string as an integer without converting it first would lead to errors.
3. Type Checking:
• Data types enable the interpreter to perform type checking, which helps catch errors at
runtime. For example, trying to add a string to an integer would result in a runtime error.
4. Readability and Maintainability:
• Clearly defined data types make the code more readable and easier to maintain. It becomes
clear what kind of data is expected and how it should be used.
5. Performance Optimization:
• Choosing the right data type can lead to performance optimizations. For example, using a list
instead of a dictionary for simple collections can be more efficient.
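The type-checking point above can be illustrated with a short sketch. The helper name try_add is hypothetical and only serves to show the runtime error without stopping the program.

```python
def try_add(a, b):
    """Return a + b, or a short error message if the types are incompatible."""
    try:
        return a + b
    except TypeError as error:
        return f"TypeError: {error}"

print(try_add(50, 5))          # two integers add normally
print(try_add("50", 5))        # str + int is rejected at runtime
print(try_add(int("50"), 5))   # converting first makes the operation valid
```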
Primitive data types refer to the simplest types of data that a language can support. Generally, these
types are predefined by programming languages and serve as building blocks for data manipulation.
Below are some of the primitive data types in Python:
In contrast, composite data types are built from one or more primitive data types. They can hold
collections of values, which makes them more sophisticated.
Some commonly used composite data types in Python are:
1. List: An ordered collection of elements that can be of different types. Example: [1, 2, 3, "aiml",
4.5].
2. Tuple: An ordered collection of elements that are immutable. Example: (1, 2, 3, "aiml", 4.5).
3. Dictionary (dict): A collection of key-value pairs. Example: {"name": "Vaibhav", "age": 35}.
4. Set: An unordered collection of unique elements. Example: {1, 2, 3, "aiml"}.
Let's consider an example where we are working on an AI/ML course project. We might use both
primitive and composite data types to handle various data.
# Primitive data types
num_students = 50
average_score = 85.6
is_course_active = True
# Composite data type (dictionary)
student_info = {
    "name": "Vaibhav",
    "age": 35,
}
Key Differences
• Composite data types are complex and can hold multiple values.
• Composite data types are used to store collections of data and to structure data in a more
organized way.
• Composite data types may require more memory as they can store multiple values.
• Operations on primitive data types are straightforward (e.g., arithmetic operations on integers).
• Operations on composite data types can be more complex (e.g., accessing elements in a list or
dictionary).
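The difference in how the two kinds of values are used can be sketched as follows. The score values and the variable names are illustrative assumptions, not part of the original example.

```python
# Primitive value: a single number, used directly in arithmetic
num_students = 50
num_students_next_year = num_students + 10

# Composite values: several items, reached by index or by key
scores = [85.6, 90.0, 78.5]                      # hypothetical sample scores
student_info = {"name": "Vaibhav", "age": 35}

first_score = scores[0]              # list access by position
student_age = student_info["age"]    # dictionary access by key
print(num_students_next_year, first_score, student_age, len(scores))
```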
3. How do you declare and initialize variables with specific data
types in your preferred programming language?
Declare and Initialize Variables in Python
In Python, we do not declare a variable's type. We simply assign a value to a variable, and Python
infers the data type from that value at runtime. This is what makes Python a dynamically typed
language. Here are a few examples:
num_students = 50
average_score = 85.6
is_course_active = True
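Because the type belongs to the value rather than to the variable name, the same name can be rebound to values of different types. A minimal sketch, using an illustrative variable called value:

```python
value = 50
first_type = type(value).__name__     # the name currently refers to an int

value = 85.6                          # rebinding to a float is allowed
second_type = type(value).__name__

value = True                          # and then to a boolean
third_type = type(value).__name__

print(first_type, second_type, third_type)
```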
We are also likely to encounter classes while learning AI/ML, since structured data is often
represented as a class. Here is an example of how to declare and initialize a class:
class Course:
    def __init__(self, name, instructor, year):
        self.name = name
        self.instructor = instructor
        self.year = year

# Illustrative values; the original instantiation is not shown
ai_ml_course = Course("Introduction to AI and ML", "Instructor Name", 2024)
print(f"Instructor: {ai_ml_course.instructor}")
print(f"Year: {ai_ml_course.year}")
Summary
Variable declaration and initialization is simple in Python because of its dynamic typing.
We just need to assign a value to a variable, and Python will do the rest. This makes
Python particularly easy to use.
Type casting or type conversion is the process of changing a value of one data type into another
data type. It is significant for several reasons:
1. Data Compatibility:
In many cases, we need to convert data types so that operations or functions are
compatible. For example, mathematical operations can only be performed on numeric
data types, so numbers stored as strings must first be cast to integers or floats.
2. Precision of Measurement:
Converting data types lets us maintain or adjust the precision of data. For example,
an integer may be converted to a float for a calculation in
which decimal points are important.
3. Memory Management:
Type conversion can help conserve space. For example, in languages with fixed-size types,
converting a floating-point number to an integer can save room if decimal precision is no longer necessary.
4. Interoperability:
When working with external systems or APIs, data may need to be converted
into the types that those systems expect.
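The precision trade-off described above can be sketched with Python's built-in conversion functions. The variable names as_float, truncated, and rounded are illustrative.

```python
num_students = 50
average_score = 85.6

as_float = float(num_students)    # 50 -> 50.0, the value is preserved
truncated = int(average_score)    # 85.6 -> 85, the fractional part is dropped
rounded = round(average_score)    # 85.6 -> 86, rounded to the nearest integer

print(as_float, truncated, rounded)
```

Note that int() truncates rather than rounds, so round() is usually the better choice when the nearest whole number is wanted.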
Python provides built-in functions for type casting. Some common conversions are shown below,
using examples from a university course.
We might read data from a file or user input, which is often in string format. However,
for numerical operations, we need to convert these strings into integers or floats.
# String to Integer
num_students_str = "50"
num_students = int(num_students_str)
# String to Float
average_score_str = "85.6"
average_score = float(average_score_str)
Sometimes, we might need to convert numerical data to strings for display or concatenation
purposes.
# Integer to String
num_students = 50
num_students_str = str(num_students)
print(f"Number of Students (str): {num_students_str}")
# Float to String
average_score = 85.6
average_score_str = str(average_score)
We might need to convert lists to sets or tuples for specific operations, such as removing duplicates
or ensuring immutability.
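# Hypothetical sample data (not shown in the original)
student_scores = [85, 90, 78, 90, 85]
unique_scores = set(student_scores)            # removes duplicates
student_scores_tuple = tuple(student_scores)   # immutable copy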
We might need to extract keys or values from a dictionary and convert them to a list for further
processing.
keys_list = list(student_info.keys())
values_list = list(student_info.values())
num_students = int(data["num_students"])
average_score = float(data["average_score"])
highest_score = max(student_scores)
In this example, we read data from a file, converted the strings to integers and floats, and then
performed further processing, such as calculating the highest score.
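A self-contained sketch of this file-based flow is given below. It assumes a simple "key=value" text layout and uses io.StringIO as a stand-in for a real file; the layout, the scores line, and the sample values are assumptions made for illustration.

```python
import io

# Stand-in for a real file; the "key=value" layout is an assumption
sample_file = io.StringIO(
    "num_students=50\naverage_score=85.6\nscores=85,90,78\n"
)

data = {}
for line in sample_file:
    key, _, value = line.strip().partition("=")
    data[key] = value                    # every value is read as a string

num_students = int(data["num_students"])        # str -> int
average_score = float(data["average_score"])    # str -> float
student_scores = [int(s) for s in data["scores"].split(",")]
highest_score = max(student_scores)

print(num_students, average_score, highest_score)
```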
Dynamic Typing
• Definition: In dynamically typed programming languages, the type of a
variable is resolved at run time. There is no need to declare the type of a
variable explicitly.
• Examples: Python, JavaScript, Ruby, and PHP.
• Advantages:
1. Flexibility: Variables can hold values of any type, since types do not
have to be declared explicitly.
2. Conciseness: It usually requires less code since types do not need to
be explicitly declared.
• Disadvantages:
1. Type Safety: Type-related errors are caught only at run time, so they
can surface while the program is running.
2. Performance: Programs may run more slowly, since type checks
have to be performed at run time.
3. Readability and Maintainability: Code can be harder to read and
maintain because types are rarely declared explicitly.
Dynamic Typing
1. Data Validation: Data validation is done at runtime, which can cause a
runtime error if not properly handled.
2. Memory Management: Memory management can be less optimal since
types are determined only during runtime.
3. Function Signatures: Function signatures are vague, which may make it
harder to show what types of arguments a function expects and what
types it may return.
Example in Python (Dynamic Typing)
Let's look at a simple example in Python to illustrate dynamic typing:
def add(a, b):
    return a + b
In this example, the add function can handle both integers and strings because
Python is dynamically typed. The type of a and b is determined at runtime,
allowing the same function to work with different types of data.
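A short usage sketch of the add function with different argument types; the sample arguments are illustrative.

```python
def add(a, b):
    return a + b

print(add(2, 3))           # integers: arithmetic addition -> 5
print(add("AI", "/ML"))    # strings: concatenation -> AI/ML
print(add([1, 2], [3]))    # lists: concatenation -> [1, 2, 3]
```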
In this example, the add function in Java can only handle integers because Java is
statically typed. The types of a and b are declared explicitly, and trying to use
the function with strings would result in a compile-time error.
Example usage
Here, course_enrollments is [1000, 2000, 1500, 2147483647]. The last value is
the maximum value of a 32-bit signed integer.
course_enrollments = [1000, 2000, 1500, 2147483647]
# The last value is the maximum 32-bit signed integer (2**31 - 1)
try:
    total = calculate_total_students(course_enrollments)
    print(f"Total students enrolled: {total}")
except OverflowError as e:
    print(e)
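The definition of calculate_total_students does not appear in this example. One hypothetical version is sketched below; it assumes non-negative counts and simulates a 32-bit signed limit, since Python's own int would not overflow here. The constant INT32_MAX and the error message are illustrative.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest 32-bit signed integer

def calculate_total_students(enrollments):
    """Sum enrollments, raising OverflowError past the 32-bit signed limit."""
    total = 0
    for count in enrollments:
        # Check before adding, so the running total never exceeds the limit
        if total > INT32_MAX - count:
            raise OverflowError("Total exceeds the 32-bit signed integer range")
        total += count
    return total

course_enrollments = [1000, 2000, 1500, 2147483647]
try:
    total = calculate_total_students(course_enrollments)
    print(f"Total students enrolled: {total}")
except OverflowError as e:
    print(e)
```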
Mitigation Strategies
1. Use Larger Data Types: In languages with fixed-size integers, switching
to a larger data type (e.g., long in Java) reduces the risk of overflow and
underflow.
2. Check for Overflow/Underflow: Implement checks in our code to
identify when an operation could produce overflow or underflow, as in the
example above.
3. Use Arbitrary-Precision Libraries: There are languages and libraries
that take advantage of arbitrary-precision arithmetic, i.e., large or small
numbers can be processed without overflow or underflow.
4. Modular Arithmetic: In certain situations, modular arithmetic
can be used to control overflow by bounding values within an appropriate
interval.
5. Language Features: Utilize language-specific features or libraries
designed to handle large numbers safely. For example, Python's int type
automatically handles large integers.
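Two of these strategies can be sketched in Python: relying on the arbitrary-precision int, and using modular arithmetic to bound a value. The helper wrap_int32 is a hypothetical function that mimics how a 32-bit signed integer would wrap around.

```python
INT32_MAX = 2**31 - 1

# Python's int grows as needed, so this sum does not overflow
total = INT32_MAX + 4500
print(total)

def wrap_int32(value):
    """Map any integer into the 32-bit signed range using modular arithmetic."""
    return (value + 2**31) % 2**32 - 2**31

# The same sum, bounded to 32 bits, wraps around to a negative number
wrapped = wrap_int32(total)
print(wrapped)

# Arbitrary precision: a value far beyond 64 bits is still exact
big = 2**100
print(big)
```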
7. What are the common numeric data types, and how do they
differ in terms of range and precision?
Common Numeric Data Types
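Integer (int)
• Range: Unlimited in Python; integers grow as needed, bounded only by
available memory.
• Precision: Exact.
Floating-point (float)
• Range: Wide but finite; Python floats are typically 64-bit IEEE 754
doubles, up to roughly 1.8 × 10^308.
• Precision: Limited to about 15 to 17 significant decimal digits.
Complex (complex)
• Range: Complex numbers are made of a real part and an imaginary part,
each being a float.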
• Precision: Similar to floats, complex numbers have limited precision.
• Example: Representing complex data in AI/ML algorithms.
complex_number = 3 + 4j
Differences in Range and Precision
• Integers: Have unlimited range in Python and are exact. They are good
for counting discrete objects, such as the number of students.
• Floats: Have a wide range but limited precision. They are appropriate for
values such as grades or measurements, where the fractional
part matters.
• Complex Numbers: Used for special purposes, e.g., certain AI/ML
algorithms whose arithmetic involves imaginary components.
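These differences can be observed directly. The sketch below uses illustrative values; sys.float_info.max is the largest finite float on the current platform.

```python
import sys

# Integers are exact, however large they get
exact = 10**20 + 1
print(exact)

# Floats have limited precision, so some decimal sums are slightly off
print(0.1 + 0.2 == 0.3)                 # False: binary rounding error
print(abs((0.1 + 0.2) - 0.3) < 1e-9)    # True: compare with a tolerance instead
print(sys.float_info.max)               # largest finite float on this platform

# Complex numbers store a float real part and a float imaginary part
z = 3 + 4j
print(z.real, z.imag, abs(z))           # 3.0 4.0 5.0
```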
Example: AI/ML University Course Onboarding Process
Now, suppose we want to determine the combined number of students enrolled
in different areas of AI/ML as well as the average grade achieved by students.
# List of enrollments in different AI/ML courses
course_enrollments = [150, 200, 180, 220]
# Calculate total number of students
total_students = sum(course_enrollments)
# List of average grades in different AI/ML courses
course_grades = [85.5, 90.0, 88.75, 92.3]
# Calculate overall average grade
average_grade = sum(course_grades) / len(course_grades)
print(f"Total number of students: {total_students}")
print(f"Overall average grade: {average_grade:.2f}")
In this example:
We represent the number of students using integers.
We use floats to represent the average grades, ensuring we capture the
fractional component.
Strings are a basic data type in programming, used for representing and processing text. They
are character sequences that may contain letters, numerals, symbols, and whitespace. In Python,
strings are delimited by either single (') or double (") quotation marks.
Importance of Strings:
1. Data Representation: Strings are used for storing and representing textual data, e.g., names,
addresses, and descriptions.
2. Communication: Strings play a vital role in presenting messages to users, in recording
information and in API interactions.
3. Data Parsing: Strings are widely used for parsing and processing input data, e.g., reading files or
managing user input.
The following are some typical string processing operations in Python, set
in the context of an AI/ML university course onboarding pipeline:
1. Concatenation
2. String Length
3. Substring
4. String Replacement
5. String Splitting
6. String Joining
7. Case Conversion
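Each of the seven operations listed above can be sketched in one line. The student name is hypothetical, and the variable names are illustrative.

```python
first_name = "Alice"                     # hypothetical student name
course = "Introduction to AI and ML"

full_greeting = "Welcome, " + first_name          # 1. concatenation
name_length = len(first_name)                     # 2. string length
prefix = course[:12]                              # 3. substring (slicing)
renamed = course.replace("AI and ML", "AI/ML")    # 4. string replacement
words = course.split(" ")                         # 5. string splitting
slug = "-".join(words)                            # 6. string joining
shouted = first_name.upper()                      # 7. case conversion

print(full_greeting, name_length, prefix, renamed, words, slug, shouted, sep="\n")
```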
Let's consider an example where we need to process student names and course details during the
onboarding process.
In this example, we use string concatenation, formatting, and iteration to generate personalized
welcome messages for each student.
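A minimal sketch of such a welcome-message generator is shown here. The student names, the lowercase input, and the message wording are assumptions made for illustration.

```python
# Hypothetical student names and course title
students = ["alice", "bob"]
course_name = "Introduction to AI and ML"

welcome_messages = []
for name in students:
    # capitalize() fixes the casing; the f-string formats the course details
    message = "Welcome, " + name.capitalize() + "! " + f"You are enrolled in {course_name}."
    welcome_messages.append(message)

for message in welcome_messages:
    print(message)
```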
Arrays are a fundamental data structure used to store collections of data. They are a sequence of
elements, all of the same type, stored in contiguous memory locations. Arrays allow for efficient
access and manipulation of data, making them essential for various programming tasks.
1. Organized Storage: Arrays provide a way to store multiple values in a single variable, keeping data
organized and easily accessible.
2. Efficient Access: Elements in an array can be accessed quickly using their index, which makes
retrieval and updates efficient.
3. Memory Management: Arrays use contiguous memory locations, which can improve cache
performance and overall program efficiency.
4. Data Manipulation: Arrays support various operations like sorting, searching, and iterating, which
are crucial for data processing tasks.
In Python, arrays can be implemented using lists, which are flexible and can store elements of
different types.
However, for numerical data, the array module or libraries like numpy are often used for better
performance and functionality.
Using Lists
In this example, we use a numpy array to store the enrollments for different AI/ML courses and
calculate the total number of students using the np.sum function.
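The contrast between a flexible list and a typed array can be sketched with the standard library alone. This version uses the array module as a stand-in so that it runs without extra installs; with NumPy available, np.array and np.sum would play the same roles. The sample values are illustrative.

```python
from array import array

# A list can mix types, which suits general-purpose collections
mixed = [150, "AI Basics", 88.75]

# array("i", ...) stores only signed integers in a compact typed buffer
course_enrollments = array("i", [150, 200, 180, 220])

course_enrollments[1] = 210      # index-based update
total_students = sum(course_enrollments)
print(total_students)
```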
10. What are user-defined data types, and how can they be created
and used in programming?
User-Defined Data Types
User-defined data types (UDTs) are custom data structures created by programmers to represent
complex data in a more meaningful and organized way. They allow us to define our own data types
that can encapsulate multiple attributes and behaviours, making our code more modular, readable,
and maintainable.
In Python, user-defined data types are typically created using classes. A class is a blueprint for
creating objects (instances), and it can contain attributes (data) and methods (functions) that define
the behaviour of the objects.
Let's consider an example where we need to manage information about students and courses during
the onboarding process for an AI/ML university course.
1. Defining the Student Class
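class Student:
    def __init__(self, name, student_id, email):
        self.name = name
        self.student_id = student_id
        self.email = email

    def display_info(self):
        print(f"Name: {self.name}")
        print(f"Email: {self.email}")

# Example usage (the ID and email are illustrative values)
student = Student("Alice", "S001", "alice@example.com")
student.display_info()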
In this example, the Student class has three attributes: name, student_id, and email. It also has a
method display_info to print the student's information.
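2. Defining the Course Class
class Course:
    def __init__(self, course_name, course_code, instructor):
        self.course_name = course_name
        self.course_code = course_code
        self.instructor = instructor
        self.students = []

    def add_student(self, student):
        self.students.append(student)

    def display_course_info(self):
        print(f"Instructor: {self.instructor}")
        print("Enrolled Students:")
        for student in self.students:
            student.display_info()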
The Course class has attributes for the course name, course code, and instructor. It also has a list to
store enrolled students and methods to add students and display course information.
3. Using the Classes
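student1 = Student("Alice", "S001", "alice@example.com")  # illustrative ID and email
student2 = Student("Bob", "S002", "bob@example.com")      # illustrative ID and email
course = Course("Introduction to AI and ML", "AIML101", "Instructor Name")  # illustrative code and instructor
course.add_student(student1)
course.add_student(student2)
course.display_course_info()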
In this example, we create instances of the Student class for Alice and Bob, and an instance of the
Course class for the "Introduction to AI and ML" course. We then add the students to the course and
display the course information.
• Reusability: Once defined, UDTs can be reused across different parts of the program or even in
other programs.
• Encapsulation: UDTs encapsulate data and behaviour, promoting data integrity and reducing the
risk of unintended interference.
By understanding and utilizing user-defined data types, we can create more structured and
maintainable code, especially when dealing with complex data and operations.