AIML

The document covers fundamental concepts in programming, including C language basics, data structures, Java basics, operating systems, DBMS, artificial intelligence, machine learning, and neural networks. It provides definitions, differences, and applications of various programming concepts such as variables, arrays, stacks, queues, linked lists, trees, and memory management. Additionally, it explains key programming functions and principles relevant to each topic.


List of Contents

→C FUNDAMENTALS

→DATA STRUCTURES

→JAVA BASICS

→BASICS OF OS

→DBMS

→ARTIFICIAL INTELLIGENCE BASICS

→MACHINE LEARNING FUNDAMENTALS

→NEURAL NETWORKS BASICS

C FUNDAMENTALS

1. Why is C called a mid-level programming language?

Due to its ability to support both low-level and high-level features, C is considered a middle-level language. Programs written in C are translated into assembly code, and the language supports pointer arithmetic (a low-level feature) while remaining machine-independent (a high-level feature). Because it bridges both worlds, C is often referred to as a middle-level language. C can be used to write operating systems as well as applications such as menu-driven consumer billing systems.

2. What are tokens in C?


• Tokens are the smallest individual units in a program that are meaningful to the compiler.
• The token types in C are: keywords, identifiers, constants, strings, special symbols, and operators.

3. What is the difference between printf() and scanf() functions?

printf() is used to print formatted output to the screen, while scanf() is used to read formatted input from the user.
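As a short sketch of the same idea in a testable form, sscanf() parses formatted input from a string exactly the way scanf() parses it from the keyboard, and snprintf() produces formatted output into a buffer the way printf() writes it to the screen (the function names parse_age and format_age are illustrative only):

```c
#include <stdio.h>

/* sscanf() reads formatted input (like scanf(), but from a string). */
int parse_age(const char *input) {
    int age = -1;
    sscanf(input, "%d", &age);   /* "%d" converts the text to an int */
    return age;
}

/* snprintf() writes formatted output (like printf(), but into a buffer). */
void format_age(int age, char *out, size_t n) {
    snprintf(out, n, "You are %d years old", age);
}
```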

4. What is a built-in function in C?


Built-in (library) functions in C are functions supplied ready-made by the standard library, for example scanf(), printf(), strcpy(), strcmp(), strlen(), and strcat(). (strlwr() is also often listed, though it is a compiler extension rather than standard C.)

5. What is a Preprocessor?
A preprocessor is a software program that processes a source file before
sending it to be compiled. Inclusion of header files, macro expansions,
conditional compilation, and line control are all possible with the
preprocessor.

6. Difference between local and global variables in C.


Global variables are declared outside all functions, and their scope extends to the end of the program. Local variables are declared inside a function, and their scope is limited to the function (or block) in which they are declared.

7. What are arrays in C programming?


An array is a group of elements of the same type. The size of a declared array is fixed and cannot be changed after declaration (only dynamically allocated memory can be resized, with realloc()). An array occupies contiguous memory locations, which makes code simpler and element access fast.
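A minimal sketch of contiguous, same-type storage and indexed access (sum_array is an illustrative name, not from the original text):

```c
/* Arrays store elements of one type in contiguous memory, so any
   element can be reached directly by its index. */
int sum_array(const int *a, int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += a[i];   /* a[i] is the element at offset i */
    }
    return total;
}
```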

8. What is the purpose of #include?
#include is a preprocessor directive that tells the preprocessor to insert the contents of the named header file into the program; for example, #include <stdio.h> brings in the standard input-output declarations.

9. What do you understand by dynamic memory allocation in C?


When memory is allocated at the run time in any program and can be
increased at the time of execution, it is called dynamic memory allocation in
C programming.

10. What is recursion in C?


When a function calls itself, the process is known as recursion, and the function is known as a recursive function. Recursion has two phases: the winding phase, in which calls accumulate until the base case is reached, and the unwinding phase, in which the calls return.
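The classic factorial function illustrates both phases (a standard textbook example, not specific to this document):

```c
/* Recursion: the winding phase pushes calls until the base case,
   then the unwinding phase multiplies results on the way back up. */
unsigned long factorial(unsigned int n) {
    if (n <= 1) {
        return 1;                  /* base case: stops the winding phase */
    }
    return n * factorial(n - 1);   /* unwinding: combine on the return path */
}
```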

11. What is a pointer in C?


A pointer is a variable that stores the memory address of another value. Pointers are used to access memory directly and efficiently: memory is allocated to a variable, and a pointer variable holds the address of that memory.
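A minimal sketch of holding and dereferencing an address (the function name is illustrative):

```c
/* A pointer holds the address of another variable; dereferencing (*)
   reads or writes the value stored at that address. */
void increment_through_pointer(int *p) {
    *p = *p + 1;   /* modifies the caller's variable via its address */
}
```

A caller would pass an address with the & operator, e.g. increment_through_pointer(&x).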

12. What is the difference between formal and actual parameters in C?


The parameters passed from the calling function (such as main) in a function call are called actual parameters. The parameters declared in the called function's definition are called formal parameters.
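A small sketch of the distinction (add is an illustrative name):

```c
/* a and b are FORMAL parameters: names declared in the definition. */
int add(int a, int b) {
    return a + b;
}
/* In a call such as add(2, 3), the values 2 and 3 are the
   ACTUAL parameters supplied by the caller. */
```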

13. What is typecasting in C?


When a value of one data type is converted into another data type, the process is known as typecasting.
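A common illustrative case is casting an int to double so that division keeps its fractional part instead of truncating:

```c
/* Explicit typecast: (double)sum forces floating-point division,
   so 7 / 2 yields 3.5 instead of the truncated integer result 3. */
double average(int sum, int count) {
    return (double)sum / count;
}
```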

14. What is union in C program?


A union in C is a user-defined data type that allows different types of data to share a single memory unit. A union occupies the memory of its largest member; it does not use the sum of the sizes of all its members.
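A minimal sketch showing members sharing one storage area (the type name Value is illustrative):

```c
/* All members of a union share the same storage, so the union's size
   is (at least) the size of its largest member, not the sum of all. */
union Value {
    int   i;
    float f;
    char  c;
};
```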

15. Explain the difference between malloc() and calloc().


malloc() allocates memory dynamically and leaves it uninitialized, while calloc() allocates memory dynamically and initializes all bytes to zero.
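A small sketch of the zero-initialization difference (make_zeroed_array is an illustrative helper):

```c
#include <stdlib.h>

/* calloc(n, size) allocates n * size bytes and zero-fills them;
   malloc(n * size) would return the same amount uninitialized.
   Both return NULL on failure, so the result must be checked. */
int *make_zeroed_array(size_t n) {
    int *a = calloc(n, sizeof *a);
    return a;   /* caller releases the block with free() */
}
```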

16. Explain the use of the volatile keyword in C.

The volatile keyword indicates that a variable may be changed unexpectedly by external factors, such as hardware interrupts or concurrent threads, so the compiler must not optimize away reads and writes to it.
DATA STRUCTURES

1. What are Data Structures?


A data structure is a physical or logical way of organizing data within a program. How the data is organized determines how well a program performs. There are many types of data structures, each with its own uses.

1. What are some applications of Data structures?


• Decision Making
• Genetics
• Image Processing
• Blockchain
• Numerical and Statistical Analysis
• Compiler Design
• Database Design

2. Explain the difference between file structure and storage structure?


File Structure: The representation of data in secondary or auxiliary memory, i.e., a device such as a hard disk or pen drive, where the stored data remains intact until manually deleted.
Storage Structure: Here the data is stored in main memory, i.e., RAM, and is deleted once the function that uses it finishes executing.

3. What is a stack data structure? What are the applications of stack?


A stack is a linear data structure in which items are added to and removed from the same end, called the top, following a particular order of operations (LIFO). Because of this, a stack can capture the state of an application at a particular point in time, as the call stack does.
Some applications for stack data structure:
• It acts as temporary storage during recursive operations
• Redo and Undo operations in doc editors
• Reversing a string
• Parenthesis matching
• Postfix to Infix Expressions
• Function calls order
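The LIFO behaviour described above can be sketched with a small fixed-capacity stack (the names Stack, stack_push, and stack_pop are illustrative):

```c
#define STACK_MAX 100

/* A fixed-capacity stack: push and pop both operate at the top,
   so the last item pushed is the first item popped (LIFO). */
typedef struct {
    int data[STACK_MAX];
    int top;   /* index of the next free slot; 0 when empty */
} Stack;

void stack_push(Stack *s, int v) {
    if (s->top < STACK_MAX) {
        s->data[s->top++] = v;
    }
}

int stack_pop(Stack *s) {
    return s->top > 0 ? s->data[--s->top] : -1;   /* -1 signals "empty" */
}
```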

4. What is a queue data structure? What are the applications of queue?

A queue is a linear data structure that allows users to store items in a systematic, ordered manner. Items are added to the queue at the rear end and removed from the front.
Some applications of queue data structure:
• Breadth-first search algorithm in graphs
• Operating system: job scheduling operations, Disk scheduling, CPU
scheduling etc.
• Call management in call centres
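The FIFO behaviour can be sketched with a small circular-array queue (the names Queue, enqueue, and dequeue are illustrative):

```c
#define QUEUE_MAX 100

/* A circular-array queue: enqueue at the rear, dequeue from the
   front, so the first item added is the first removed (FIFO). */
typedef struct {
    int data[QUEUE_MAX];
    int front, rear, count;
} Queue;

void enqueue(Queue *q, int v) {
    if (q->count == QUEUE_MAX) return;        /* queue is full */
    q->data[q->rear] = v;
    q->rear = (q->rear + 1) % QUEUE_MAX;      /* wrap around */
    q->count++;
}

int dequeue(Queue *q) {
    if (q->count == 0) return -1;             /* queue is empty */
    int v = q->data[q->front];
    q->front = (q->front + 1) % QUEUE_MAX;
    q->count--;
    return v;
}
```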

5. Differentiate between stack and queue data structure.


• A stack is a linear data structure where data is added and removed at the top; a queue is a linear data structure where data is added at the rear end and removed from the front.
• A stack is based on the LIFO (Last In First Out) principle; a queue is based on the FIFO (First In First Out) principle.
• The insertion operation in a stack is known as push; in a queue it is known as enqueue.
• The delete operation in a stack is known as pop; in a queue it is known as dequeue.
• A stack has a single pointer for both addition and deletion (top); a queue has two pointers (front and rear).
• Stacks are used in solving recursion problems; queues are used in solving sequential processing problems.

6. What is a linked list data structure? What are the applications for the
Linked list?
A linked list can be thought of as a series of linked nodes (or items) that are connected
by links (or paths). Each link represents an entry into the linked list, and each entry points
to the next node in the sequence. The order in which nodes are added to the list is
determined by the order in which they are created.
Some applications of linked list data structure:
• Stack, Queue, binary trees, and graphs are implemented using linked lists.
• Dynamic management for Operating System memory.
• Round robin scheduling for operating system tasks.
• Forward and backward operation in the browser.
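A minimal singly linked list sketch of the node-and-link idea described above (the names Node, push_front, and list_length are illustrative):

```c
#include <stdlib.h>

/* Each node stores a data item plus a pointer (link) to the next node. */
typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Insert at the head in O(1); returns the new head of the list. */
Node *push_front(Node *head, int value) {
    Node *n = malloc(sizeof *n);
    if (!n) return head;      /* allocation failed; list unchanged */
    n->value = value;
    n->next  = head;
    return n;
}

/* Walk the links until the terminating NULL, counting nodes. */
int list_length(const Node *head) {
    int len = 0;
    for (; head; head = head->next) len++;
    return len;
}
```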

7. Elaborate on different types of Linked List data structures?


• Singly Linked List: A singly linked list is a data structure used to store multiple items, in which each item is stored in a separate node. Each node holds the data and a single pointer to the next node, so the list can be traversed in one direction only.
• Doubly Linked List: A doubly linked list is a data structure that allows for two-
way data access such that each node in the list points to the next node in the list
and also points back to its previous node. In a doubly linked list, each node can
be accessed by its address, and the contents of the node can be accessed by its
index.
• Circular Linked List: A circular linked list is a unidirectional linked list where
each node points to its next node and the last node points back to the first node,
which makes it circular.
• Doubly Circular Linked List: A doubly circular linked list is a linked list where
each node points to its next node and its previous node and the last node points
back to the first node and first node’s previous points to the last node.

8. Difference between Array and Linked List.


• An array is a collection of data elements of the same type; a linked list is a collection of entities known as nodes, each divided into two sections: data and address.
• An array keeps its elements in a single contiguous block of memory; a linked list stores its elements anywhere in memory.
• The memory size of an array is fixed and cannot be changed during runtime; the memory of a linked list is allocated during runtime.
• An array's elements are not dependent on one another; linked list elements are dependent on one another, since each node points to the next.
• It is easier and faster to access an element in an array; in a linked list, it takes time to access an element.
• Memory utilization is ineffective in the case of an array; memory utilization is effective in the case of linked lists.
• Operations like insertion and deletion take longer in an array; they are faster in a linked list.

9. What is binary tree data structure? What are the applications for binary trees?
A binary tree is a data structure used to organize data in a way that allows for efficient retrieval and manipulation. Each node has at most two children, and nodes with no children are called leaves. Binary trees are widely used, for example in computer networks for storing routing table information. Some of the applications are:
• Decision trees
• Expression evaluation
• Database indices

10. What is binary search tree data structure?

A binary search tree is a binary tree that stores items in sorted order: for every node, all keys in its left subtree are smaller than the node's key, and all keys in its right subtree are larger. Each node stores a key, used to locate the item, and searching compares the target key against node keys to determine whether the item is present.
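The ordering rule can be sketched with insert and lookup operations (the names TreeNode, bst_insert, and bst_contains are illustrative):

```c
#include <stdlib.h>

/* BST invariant: keys smaller than a node go left, larger go right. */
typedef struct TreeNode {
    int key;
    struct TreeNode *left, *right;
} TreeNode;

TreeNode *bst_insert(TreeNode *root, int key) {
    if (!root) {
        TreeNode *n = malloc(sizeof *n);
        if (n) { n->key = key; n->left = n->right = NULL; }
        return n;
    }
    if (key < root->key)      root->left  = bst_insert(root->left, key);
    else if (key > root->key) root->right = bst_insert(root->right, key);
    return root;   /* duplicates are ignored in this sketch */
}

/* Each comparison discards half the remaining tree. */
int bst_contains(const TreeNode *root, int key) {
    while (root) {
        if (key == root->key) return 1;
        root = key < root->key ? root->left : root->right;
    }
    return 0;
}
```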

11. What is a priority queue?


Priority Queue is an abstract data type that is similar to a queue in that each element is
assigned a priority value. The order in which elements in a priority queue are served is
determined by their priority (i.e., the order in which they are removed). If the elements
have the same priority, they are served in the order they appear in the queue.

12. What is graph data structure and what are its representations?

A graph is a type of non-linear data structure made up of nodes and edges. The nodes are also known as vertices, and edges are lines or arcs that connect any two nodes in the graph. Common representations are:
• Adjacency Matrix
• Adjacency List

13. What is the difference between the Breadth First Search (BFS) and Depth
First Search (DFS)?
• BFS stands for "Breadth First Search"; DFS stands for "Depth First Search".
• BFS finds the shortest path (in an unweighted graph) using the Queue data structure; DFS traverses using the Stack data structure.
• When compared to DFS, BFS is slower; when compared to BFS, DFS is faster.
• BFS performs better when the target is close to the source; DFS performs better when the target is far from the source.
• BFS necessitates more memory; DFS necessitates less memory.
• In BFS, nodes that have been traversed are removed from the queue; in DFS, visited nodes are pushed onto the stack and removed when there are no more nodes to visit.
• Backtracking is not an option in BFS; the DFS algorithm is a recursive algorithm that employs the concept of backtracking.
• BFS is based on the FIFO principle (First In First Out); DFS is based on the LIFO principle (Last In First Out).

14. What is a heap data structure?


A heap is a special tree-based non-linear data structure in which the tree is a complete binary tree. A binary tree is said to be complete if all levels are completely filled except possibly the last, and the last level has all its elements as far left as possible. Heaps are of two types:
Max-Heap:
In a Max-Heap the data element present at the root node must be the greatest among all
the data elements present in the tree.
This property should be recursively true for all sub-trees of that binary tree.
Min-Heap:
In a Min-Heap the data element present at the root node must be the smallest (or
minimum) among all the data elements present in the tree.
This property should be recursively true for all sub-trees of that binary tree.

JAVA BASICS

1. What is meant by variable?

Variables are locations in memory that can hold values. Before assigning any value to
a variable, it must be declared.

2. What are the kinds of variables in Java? What are their uses?

• Java has three kinds of variables namely, the instance variable, the local variable and
the class variable.

• Local variables are used inside blocks as counters or in methods as temporary variables
and are used to store information needed by a single method.

• Instance variables are used to define attributes or the state of a particular object and are
used to store information needed by multiple methods in the objects.

• Class variables are global to a class and to all the instances of the class and are useful
for communicating between different objects of all the same class or keeping track of
global states.

3. How are the variables declared?

• Variables can be declared anywhere in the method definition and can be


initialized during their declaration.They are commonly declared before usage at the
beginning of the definition.

• Variables with the same data type can be declared together. Local variables
must be given a value before usage.

4. What are variable types?

Variable types can be any data type that java supports, which includes the eight primitive
data types, the name of a class or interface and an array.

5. How do you assign values to variables?

Values are assigned to variables using the assignment operator =

6. What is a literal? How many types of literals are there?

A literal represents a value of a certain type where the type describes how that value
behaves. There are different types of literals namely number literals, character
literals,boolean literals, string literals,etc.

7. What is an array?

An array is an object that stores a list of items.

8. How do you declare an array?

The array variable declaration indicates the type of object that the array holds. Ex: int arr[];

9. What is a string?

A sequence of characters is called a string.

10. When a string literal is used in the program, Java automatically creates
instances of the string class.

True

11. Which operator is to create and concatenate string?

Addition operator(+).

12. How do we change the values of the elements of the array?

The array subscript expression can be used to change the values of the elements of the
array.

13. What is a final variable?

If a variable is declared as final, its value cannot be changed once assigned. It becomes a constant.

14. What is static variable?

Static variables are shared by all instances of a class.

15. What is mean by garbage collection?

When an object is no longer referred to by any variable, Java automatically reclaims


memory used by that object. This is known as garbage collection.

16. What are methods and how are they defined?

Methods are functions that operate on instances of the classes in which they are defined. Objects can communicate with each other using methods and can call methods in other classes. A method definition has four parts: the name of the method, the type of object or primitive type the method returns, a list of parameters, and the body of the method. A method's signature is a combination of the first three parts mentioned above.

17. How are methods called?

Calling a method is similar to referring to an instance variable: methods are accessed using dot notation.
Ex: obj.methodname(param1, param2)

18. Which method is used to determine the class of an object?

The getClass() method can be used to find out what class an object belongs to. This method is defined in the Object class and is therefore available to all objects.

19. What is casting?


Casting is used to convert the value of one type to another.

20. What are packages? What is the use of packages?

The package statement defines a namespace in which classes are stored. If you omit the package statement, the classes are put into the default package. Signature: package pkg;
Use: It specifies the package to which the classes defined in a file belong. A package is both a naming and a visibility control mechanism.

21. What is the difference between importing "java.applet.Applet" and "java.applet.*"?

"java.applet.Applet" imports only the class Applet from the package java.applet, whereas "java.applet.*" imports all the classes from the java.applet package.

22. What do you understand by package access specifiers?

Public: Anything declared as public can be accessed from anywhere.

Private: Anything declared private cannot be seen outside of its class.

Default: Visible to other classes in the same package only; unlike protected, it does not extend to subclasses in other packages.

23. What is an interface? What is the use of an interface?

An interface is similar to a class, but it may contain only method signatures, not bodies. It defines a contract that implementing classes must fulfil.

24. Is it necessary to implement all methods in an interface?

Yes. All the methods have to be implemented.

25. What is the difference between an interface and an abstract class?

All the methods declared inside an interface are abstract, and the keyword abstract is not needed for them. An abstract class, by contrast, may contain both abstract and concrete methods.

BASICS OF OS

1. What is a process and process table?

A process is an instance of a program in execution. For example, a Web Browser is a


process, and a shell (or command prompt) is a process. The operating system is
responsible for managing all the processes that are running on a computer and allocates
each process a certain amount of time to use the processor. In addition, the operating
system also allocates various other resources that processes will need, such as computer
memory or disks. To keep track of the state of all the processes, the operating system
maintains a table known as the process table. Inside this table, every process is listed
along with the resources the process is using and the current state of the process.

2. What are the different states of the process?


Processes can be in one of three states: running, ready, or waiting. The running state
means that the process has all the resources it needs for execution and it has been given
permission by the operating system to use the processor. Only one process can be in the
running state at any given time. The remaining processes are either in a waiting state
(i.e., waiting for some external event to occur such as user input or disk access) or a
ready state (i.e., waiting for permission to use the processor). In a real operating system,
the waiting and ready states are implemented as queues that hold the processes in these
states.
3. What is a Thread?
A thread is a single sequence stream within a process. Because threads have some of the
properties of processes, they are sometimes called lightweight processes. Threads are a
popular way to improve the application through parallelism. For example, in a browser,
multiple tabs can be different threads. MS Word uses multiple threads, one thread to
format the text, another thread to process inputs, etc.

4. What are the differences between process and thread?


Threads are lightweight processes that share the same address space, including the code section, data section, and operating system resources such as open files and signals. However, each thread has its own program counter (PC), register set, and stack, allowing threads to execute independently within the same process context. Unlike processes, threads are not fully independent entities; they can communicate and synchronize more efficiently, making them suitable for concurrent and parallel execution in a multi-threaded environment.

5. What are the benefits of multithreaded programming?


It makes the system more responsive and enables resource sharing. It allows better utilization of multiprocessor architectures. It is more economical and is therefore preferred.

6. What is Thrashing?
Thrashing is a situation in which the performance of a computer degrades or collapses. It occurs when a system spends more time servicing page faults than executing transactions. While handling page faults is necessary to realize the benefits of virtual memory, thrashing has a negative effect on the system: as the page fault rate increases, more transactions need servicing from the paging device, so the queue at the paging device grows and the service time for a page fault increases.

7. What is Buffer?
A buffer is a memory area that stores data being transferred between two devices or
between a device and an application.

8. What is virtual memory?

Virtual memory creates an illusion that each user has one or more contiguous address
spaces, each beginning at address zero. The sizes of such virtual address spaces are
generally very high. The idea of virtual memory is to use disk space to extend the RAM.
Running processes don’t need to care whether the memory is from RAM or disk. The
illusion of such a large amount of memory is created by subdividing the virtual memory
into smaller pieces, which can be loaded into physical memory whenever they are needed
by a process.

9. Explain the main purpose of an operating system?

An operating system acts as an intermediary between the user of a computer and


computer hardware. The purpose of an operating system is to provide an environment in
which a user can execute programs conveniently and efficiently.
An operating system is a software that manages computer hardware. The hardware must
provide appropriate mechanisms to ensure the correct operation of the computer system
and to prevent user programs from interfering with the proper operation of the system.

10. What is demand paging?


The process of loading the page into memory on demand (whenever a page fault occurs)
is known as demand paging.

11. What is a kernel?

A kernel is the central component of an operating system that manages the operations of
computers and hardware. It basically manages operations of memory and CPU time. It
is a core component of an operating system. Kernel acts as a bridge between applications
and data processing performed at the hardware level using interprocess communication
and system calls.

12. What are the different scheduling algorithms?


First-Come, First-Served (FCFS) Scheduling.
Shortest-Job-Next (SJN) Scheduling.
Priority Scheduling.
Shortest Remaining Time.
Round Robin(RR) Scheduling.
Multiple-Level Queues Scheduling.

13. Describe the objective of multi-programming.

Multi-programming increases CPU utilization by organizing jobs (code and data) so that the CPU always has one to execute. The main objective of multiprogramming is to keep multiple jobs in main memory; if one job becomes busy with I/O, the CPU can be assigned to another job.

14. What is the time-sharing system?

Time-sharing is a logical extension of multiprogramming. The CPU switches among tasks so frequently that the user can interact with each program while it is running. A time-shared operating system allows multiple users to share a computer simultaneously.

15. What problem we face in computer system without OS?

• Poor resource management


• Lack of User Interface
• No File System
• No Networking
• Error handling is big issue
16. Give some benefits of multithreaded programming?

A thread is also known as a lightweight process. The idea is to achieve parallelism by dividing a process into multiple threads. Threads within the same process run in a shared memory space, which makes creating threads and switching between them cheaper than creating and switching between processes.

17. Briefly explain FCFS.

FCFS stands for First Come First Served. In the FCFS scheduling algorithm, the job that arrived first in the ready queue is allocated the CPU, then the job that came second, and so on. FCFS is a non-preemptive scheduling algorithm: a process holds the CPU until it either terminates or performs I/O. Thus, if a longer job has been assigned the CPU, the many shorter jobs after it will have to wait.
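The waiting-time effect can be sketched with a small calculation, assuming all jobs arrive at time 0 in queue order (fcfs_avg_wait is an illustrative name):

```c
/* FCFS waiting times: each job waits for the sum of the CPU bursts
   of all jobs ahead of it. Returns the average waiting time, which
   shows how one long job at the front delays every job behind it. */
double fcfs_avg_wait(const int *burst, int n) {
    int wait = 0, total_wait = 0;
    for (int i = 0; i < n; i++) {
        total_wait += wait;   /* job i has waited this long so far */
        wait += burst[i];     /* jobs after i must also wait for i */
    }
    return n > 0 ? (double)total_wait / n : 0.0;
}
```

For bursts {24, 3, 3} the waits are 0, 24, and 27, so a single long first job pushes the average up sharply.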

18. What is the RR scheduling algorithm?

The round-robin scheduling algorithm schedules processes fairly: each job runs for a fixed time slot (quantum) and is interrupted if it has not completed by then, after which the next job in the queue runs. Cycling through the jobs in this way makes the scheduling fair.

• Round-robin is cyclic in nature, so starvation doesn’t occur


• Round-robin is a variant of first-come, first-served scheduling
• No priority or special importance is given to any process or task
• RR scheduling is also known as Time slicing scheduling
19. Enumerate the different RAID levels?

A redundant array of independent disks is a set of several physical disk drives that the
operating system sees as a single logical unit. It played a significant role in narrowing
the gap between increasingly fast processors and slow disk drives. RAID has different
levels:

• Level-0
• Level-1
• Level-2
• Level-3
• Level-4
• Level-5
• Level-6
20. What is Banker’s algorithm?

The banker’s algorithm is a resource allocation and deadlock avoidance algorithm. It tests for safety by simulating the allocation of the predetermined maximum possible amounts of all resources, and then performs a "safe state" check on possible activities before deciding whether the allocation should be allowed to continue.

21. State the main difference between logical and physical address space?

Logical address vs. physical address:

• Basic: A logical address is generated by the CPU; a physical address is a location in the memory unit.
• Address space: Logical address space is the set of all logical addresses generated by the CPU in reference to a program; physical address space is the set of all physical addresses mapped to the corresponding logical addresses.
• Visibility: Users can view the logical address of a program; users can never view the physical address of the program.
• Generation: The logical address is generated by the CPU; the physical address is computed by the MMU.
• Access: The user can use the logical address to access the physical address; the user can access physical addresses indirectly, but not directly.

22. How does dynamic loading aid in better memory space utilization?

With dynamic loading, a routine is not loaded until it is called. This method is especially
useful when large amounts of code are needed in order to handle infrequently occurring
cases such as error routines.

23. What are overlays?

The concept of overlays is that whenever a process is running it will not use the complete
program at the same time, it will use only some part of it. Then overlay concept says that
whatever part you required, you load it and once the part is done, then you just unload
it, which means just pull it back and get the new part you required and run it. Formally,
“The process of transferring a block of program code or other data into internal memory,
replacing what is already stored”.

24. What is fragmentation?

As processes are loaded into and removed from memory, the free memory space is broken into small pieces. Fragmentation occurs when these free blocks are too small to satisfy allocation requests from other processes, so memory remains unused even though the total free space may be sufficient. This kind of problem arises in dynamic memory allocation systems when the free blocks are small and cannot satisfy any request.

25. What is the basic function of paging?

Paging is a technique used for non-contiguous memory allocation. It is a fixed-size partitioning scheme: both main memory and secondary memory are divided into equal fixed-size partitions. The partitions of secondary memory are called pages, and the partitions of main memory are called frames.

Paging is a memory management method used to fetch processes from secondary memory into main memory in the form of pages. Each process is split into parts, where the size of each part equals the page size (the last part may be smaller than the page size). The pages of a process are stored in frames of main memory depending on their availability.

26. How does swapping result in better memory management?

Swapping is a simple memory/process management technique used by the operating system (OS) to increase processor utilization: some blocked processes are moved from main memory to secondary memory, forming a queue of temporarily suspended processes, and execution continues with the newly arrived process. At regular intervals set by the operating system, processes can be copied from main memory to a backing store and then copied back later. Swapping allows more processes to be run than can fit into memory at one time.

27. Name some classic synchronization problems.

• Bounded-buffer
• Readers-writers
• Dining philosophers
• Sleeping barber
28. What is the Direct Access Method?

The direct access method is based on a disk model of a file: the file is viewed as a numbered sequence of blocks or records, and arbitrary blocks can be read or written in any order. Direct access is advantageous when accessing large amounts of information. A related term, direct memory access (DMA), is a method that allows an input/output (I/O) device to send or receive data directly to or from main memory, bypassing the CPU to speed up memory operations; the process is managed by a chip known as a DMA controller (DMAC).

29. When does thrashing occur?

Thrashing occurs when processes on the system frequently access pages that are not available in main memory.

30. What is the best page size when designing an operating system?

The best paging size varies from system to system, so there is no single best when it
comes to page size. There are different factors to consider in order to come up with a
suitable page size, such as page table, paging time, and its effect on the overall efficiency
of the operating system.

31. What is multitasking?

Multitasking is a logical extension of a multiprogramming system that supports multiple


programs to run concurrently. In multitasking, more than one task is executed at the same
time. In this technique, the multiple tasks, also known as processes, share common
processing resources such as a CPU.

32. What is caching?

The cache is a smaller and faster memory that stores copies of the data from frequently
used main memory locations. There are various different independent caches in a CPU,
which store instructions and data. Cache memory is used to reduce the average time to
access data from the Main memory.

33. What is spooling?

Spooling stands for Simultaneous Peripheral Operations On-Line. It refers to putting
jobs in a buffer, a special area in memory or on a disk, where a device can access them
when it is ready. Spooling is useful because devices access data at different rates.

34. What is the functionality of an Assembler?

The Assembler is used to translate the program written in Assembly language into
machine code. The source program is an input of an assembler that contains assembly
language instructions. The output generated by the assembler is the object code or
machine code understandable by the computer.
35. What are interrupts?

Interrupts are signals emitted by hardware or software when a process or an event
needs immediate attention. An interrupt alerts the processor to a high-priority process
requiring interruption of the currently running process. For I/O devices, one of the bus
control lines is dedicated to this purpose and is called the interrupt request line; the
routine the processor then executes to service the interrupt is the Interrupt Service
Routine (ISR).

36. What is GUI?

GUI is short for Graphical User Interface. It provides users with an interface wherein
actions can be performed by interacting with icons and graphical symbols.

37. What is preemptive multitasking?

Preemptive multitasking is a type of multitasking that allows computer programs to share


operating systems (OS) and underlying hardware resources. It divides the overall
operating and computing time between processes, and the switching of resources
between different processes occurs through predefined criteria.

38. What is a pipe and when is it used?

A Pipe is a technique used for inter-process communication. A pipe is a mechanism by


which the output of one process is directed into the input of another process. Thus it
provides a one-way flow of data between two related processes.

39. What are the advantages of semaphores?

• They are machine-independent.
• They are easy to implement.
• Correctness is easy to determine.
• They can have many different critical sections with different semaphores.
• They permit acquiring many resources simultaneously.
• No waste of resources due to busy waiting.
40. What is a bootstrap program in the OS?

Bootstrapping is the process of loading a set of instructions when a computer is first


turned on or booted. During the startup process, diagnostic tests are performed, such as
the power-on self-test (POST), which sets or checks configurations for devices and
implements routine testing for the connection of peripherals, hardware, and external
memory devices. The bootloader or bootstrap program is then loaded to initialize the OS.

41. What is IPC?

Inter-process communication (IPC) is a mechanism that allows processes to


communicate with each other and synchronize their actions. The communication
between these processes can be seen as a method of cooperation between them.

42. What are the different IPC mechanisms?

These are the methods in IPC:

• Pipes (Same Process): This allows a flow of data in one direction only,
analogous to a simplex system (e.g., a keyboard). Data from the output is usually
buffered until the input process receives it; the two processes must have a
common origin.

• Named Pipes (Different Processes): This is a pipe with a specific name, so it
can be used by processes that don't share a common process origin, e.g., a
FIFO, where data written to the pipe is read first in, first out.

• Message Queuing: This allows messages to be passed between processes
using either a single queue or several message queues. It is managed by the
system kernel; the messages are coordinated using an API.

• Semaphores: This is used in solving problems associated with


synchronization and avoiding race conditions. These are integer values that
are greater than or equal to 0.

• Shared Memory: This allows the interchange of data through a defined area
of memory. Semaphore values have to be obtained before data can get access
to shared memory.

• Sockets: This method is mostly used to communicate over a network between
a client and a server. It allows for a standard connection that is computer- and
OS-independent.

43. What is the difference between preemptive and non-preemptive


scheduling?

• In preemptive scheduling, the CPU is allocated to a process for a limited
time, whereas in non-preemptive scheduling, the CPU is allocated to the
process until it terminates or switches to the waiting state.

• The executing process in preemptive scheduling is interrupted in the middle
of execution when a higher-priority process arrives, whereas in non-preemptive
scheduling the executing process is not interrupted and runs until its
execution completes.

• In Preemptive Scheduling, there is the overhead of switching the process


from the ready state to the running state, and vice versa, and maintaining the
ready queue. Whereas the case of non-preemptive scheduling has no
overhead of switching the process from running state to ready state.

• In preemptive scheduling, if a high-priority process frequently arrives in the


ready queue, then a process with low priority has to wait for a long time and
may starve. On the other hand, in non-preemptive scheduling, if the CPU is
allocated to a process having a larger burst time, then the processes with
a small burst time may starve.

• Preemptive scheduling attains flexibility by allowing the critical processes to


access the CPU as they arrive in the ready queue, no matter what process is
executing currently. Non-preemptive scheduling is called rigid as even if a
critical process enters the ready queue the process running CPU is not
disturbed.

• Preemptive scheduling has to maintain the integrity of shared data, which
adds cost; this is not the case with non-preemptive scheduling.

44. What is the zombie process?

A process that has finished the execution but still has an entry in the process table to
report to its parent process is known as a zombie process. A child process always first
becomes a zombie before being removed from the process table. The parent process
reads the exit status of the child process which reaps off the child process entry from the
process table.

45. What are orphan processes?

A process whose parent process no longer exists, i.e., either finished or terminated
without waiting for its child process to terminate, is called an orphan process.

46. What are starvation and aging in OS?

Starvation: Starvation is a resource management problem where a process does not get
the resources it needs for a long time because the resources are being allocated to other
processes.

Aging: Aging is a technique to avoid starvation in a scheduling system. It works by


adding an aging factor to the priority of each request. The aging factor must increase the
priority of the request as time passes and must ensure that a request will eventually be
the highest priority request

47. Write about monolithic kernel?

Apart from microkernel, Monolithic Kernel is another classification of Kernel. Like


microkernel, this one also manages system resources between application and hardware,
but user services and kernel services are implemented under the same address space. It
increases the size of the kernel, thus increasing the size of an operating system as well.
This kernel provides CPU scheduling, memory management, file management, and other
operating system functions through system calls. As both services are implemented under
the same address space, this makes operating system execution faster.

48. What is Context Switching?

Switching the CPU to another process means saving the state of the old process and
loading the saved state of the new process. In context switching, the state of the old
process is stored in its Process Control Block so that the CPU can serve the new
process, and the old process can later be resumed from the point where it left off.

49. What is the difference between the Operating system and kernel?

• The operating system is system software; the kernel is the system software that
forms the core part of the operating system.
• The operating system provides an interface between the user and the hardware;
the kernel provides an interface between applications and the hardware.
• The operating system's purposes also include protection and security; the
kernel's main purposes are memory management, disk management, process
management, and task management.
• Every system needs an operating system to run; every operating system needs a
kernel to run.
• Types of operating systems include single-user and multi-user OS,
multiprocessor OS, real-time OS, and distributed OS; types of kernels include
the monolithic kernel and the microkernel.
• The operating system is the first program to load when the computer boots up;
the kernel is the first program to load when the operating system loads.

50. What is the difference between process and thread?

1. A process is a program in execution; a thread is a segment of a process.
2. A process is less efficient in terms of communication; a thread is more
efficient.
3. Processes are isolated from one another; threads share memory.
4. A process is called a heavyweight process; a thread is called a lightweight
process.
5. Process switching uses an interface to the operating system; thread switching
does not require a call to the operating system or an interrupt to the kernel.
6. If one process is blocked, the execution of other processes is not affected; if
one thread is blocked, another thread in the same task may not be able to run.
7. A process has its own Process Control Block, stack, and address space; a
thread has its parent's PCB, its own Thread Control Block and stack, and
shares a common address space.

51. What is a PCB?

The process control block (PCB) is a data structure used to track a process's
execution status. A PCB contains information about the process, i.e., registers,
quantum, priority, etc. The process table is an array of PCBs, meaning it logically
contains a PCB for every current process in the system.

52. When is a system in a safe state?


The set of dispatchable processes is in a safe state if there exists at least one temporal
order in which all processes can be run to completion without resulting in a deadlock.

53. What is Cycle Stealing?


Cycle stealing is a method of accessing computer memory (RAM) or the bus without
interfering with the CPU. It is similar to direct memory access (DMA) in allowing I/O
controllers to read or write RAM without CPU intervention.

54. What are a Trap and Trapdoor?


A trap is a software interrupt, usually the result of an error condition; it is a
non-maskable interrupt and has the highest priority. A trapdoor is a secret,
undocumented entry point into a program, used to grant access without the normal
methods of access authentication.

55. Write a difference between process and program?

1. A program contains a set of instructions designed to complete a specific task;
a process is an instance of an executing program.
2. A program is a passive entity, residing in secondary memory; a process is an
active entity, created during execution and loaded into main memory.
3. A program exists in a single place and continues to exist until it is deleted; a
process exists for a limited span of time, terminating after the completion of
its task.
4. A program is a static entity; a process is a dynamic entity.
5. A program has no resource requirements other than the memory space for
storing its instructions; a process needs resources such as CPU, memory
address space, and I/O during its lifetime.
6. A program does not have a control block; a process has its own control block,
called the Process Control Block.

56. What is a dispatcher?


The dispatcher is the module that gives process control over the CPU after it has been
selected by the short-term scheduler. This function involves the following:

• Switching context

• Switching to user mode
• Jumping to the proper location in the user program to restart that program

57. Define the term dispatch latency?


Dispatch latency can be described as the amount of time it takes for a system to respond
to a request for a process to begin operation. With a scheduler written specifically to
honor application priorities, real-time applications can be developed with a bounded
dispatch latency.

58. What are the goals of CPU scheduling?


• Max CPU utilization [keep the CPU as busy as possible]

• Fair allocation of CPU
• Max throughput [Number of processes that complete their execution per time
unit]

• Min turnaround time [Time taken by a process to finish execution]

• Min waiting time [Time a process waits in ready queue]

• Min response time [Time when a process produces the first response]

59. What is a critical- section?


When more than one process accesses the same code segment, that segment is known as
the critical section. The critical section contains shared variables or resources which are
needed to be synchronized to maintain the consistency of data variables. In simple terms,
a critical section is a group of instructions/statements or regions of code that need to be
executed atomically such as accessing a resource (file, input or output port, global data,
etc.).

60. Name some synchronization techniques.


• Mutexes

• Condition variables

• Semaphores

• File locks

Intermediate OS Interview Questions

61. Write a difference between a user-level thread and a kernel-level thread?


User-level thread Kernel level thread
User threads are kernel threads implemented by users. implemented by OS. are

OS doesn’t recognize user- Kernel threads level threads. recognized by OS. are

Implementation of the
Implementation of User

28
perform kernel thread is threads is easy.
complicated.
Context switch time is less. Context switch time is more. Context switch requires no
Hardware support is needed.
User-level thread Kernel level thread hardware support.
If one user-level thread If one kernel thread perform performs a blocking a the blocking
operation then operation then entire another thread can continue process will be blocked.
execution.
User-level threads are Kernel level threads are designed as dependent designed as
independent threads. threads.

62. Write down the advantages of multithreading?

Some of the most important benefits of multithreading (MT) are:

• Improved throughput. Many concurrent compute operations and I/O requests
within a single process.

• Simultaneous and fully symmetric use of multiple processors for computation


and I/O.

• Superior application responsiveness. If a request can be launched on its own


thread, applications do not freeze or show the “hourglass”. An entire
application will not block or otherwise wait, pending the completion of
another request.

• Improved server responsiveness. Large or complex requests or slow clients


don’t block other requests for service. The overall throughput of the server is
much greater.

• Minimized system resource usage. Threads impose minimal impact on


system resources.
Threads require less overhead to create, maintain, and manage than a
traditional process.

• Program structure simplification. Threads can be used to simplify the


structure of complex applications, such as server-class and multimedia
applications. Simple routines can be written for each activity, making
complex programs easier to design and code, and more adaptive to a wide
variation in user demands.

• Better communication. Thread synchronization functions can be used to


provide enhanced process-to-process communication. In addition, sharing
large amounts of data through separate threads of execution within the same

address space provides extremely high-bandwidth, low-latency
communication between separate tasks within an application.

63. Difference between Multithreading and


Multitasking?

1. In multithreading, multiple threads execute at the same time in the same or
different parts of a program; in multitasking, several programs are executed
concurrently.
2. The CPU switches between multiple threads of a process; in multitasking, the
CPU switches between multiple tasks and processes.
3. A thread is lightweight; a task (process) is heavyweight.
4. Multithreading is a feature of the process; multitasking is a feature of the OS.
5. Multithreading shares computing resources among the threads of a single
process; multitasking shares computing resources (CPU, memory, devices,
etc.) among processes.

64. What are the drawbacks of semaphores?


• Priority Inversion is a big limitation of
semaphores.
• Their use is not enforced but is by convention only.
• The programmer has to keep track of all calls to wait and signal the
semaphore.
• With improper use, a process may block indefinitely. Such a situation is called
Deadlock.
65. What is Peterson’s approach?
It is a concurrent programming algorithm. It is used to synchronize two processes that
maintain the mutual exclusion for the shared resource. It uses two variables, a bool array
flag of size 2 and an int variable turn to accomplish it.

66. Define the term Bounded waiting?
A system is said to satisfy the bounded waiting condition if a process that wants to
enter its critical section is guaranteed to enter it within some finite time.
67. What are the solutions to the critical section problem?
There are three solutions to the critical section problem:

• Software solutions
• Hardware solutions
• Semaphores
68. What is a Banker’s algorithm?
The banker’s algorithm is a resource allocation and deadlock avoidance algorithm that
tests for safety by simulating the allocation for the predetermined maximum possible
amounts of all resources, then makes an “s-state” check to test for possible activities,
before deciding whether allocation should be allowed to continue.

69. What is concurrency?


When a process exists simultaneously with another process, the processes are said to
be concurrent.

70. What are the drawbacks of concurrency?

• It is required to protect multiple applications from one another.

• It is required to coordinate multiple applications through additional


mechanisms.

• Additional performance overheads and complexities in operating systems are


required for switching among applications.
• Sometimes running too many applications concurrently leads to severely
degraded performance.

71. What are the necessary conditions which can lead to a deadlock in a system?
Mutual Exclusion: There is a resource that cannot be shared.
Hold and Wait: A process is holding at least one resource and waiting for another
resource, which is with some other process.
No Preemption: The operating system is not allowed to take a resource back from a
process until the process gives it back.
Circular Wait: A set of processes wait for each other in circular form.
72. What are the issues related to concurrency?
• Non-atomic: Operations that are non-atomic but interruptible by multiple
processes can cause problems.

• Race conditions: A race condition occurs if the outcome depends on which


of several processes gets to a point first.

• Blocking: Processes can block waiting for resources. A process could be


blocked for a long period of time waiting for input from a terminal. If the
process is required to periodically update some data, this would be very
undesirable.

• Starvation: It occurs when a process does not obtain service to progress.


• Deadlock: It occurs when two processes are blocked and hence neither can
proceed to execute

73. Why do we use precedence graphs?


A precedence graph is a directed acyclic graph that is used to show the execution level
of several processes in the operating system. It has the following properties also:

• Nodes of graphs correspond to individual statements of program code.

• An edge between two nodes represents the execution order.

• A directed edge from node A to node B shows that statement A executes first
and then Statement B executes

74. Explain the resource allocation graph?


The resource allocation graph (RAG) shows the state of the system in terms of
processes and resources. One advantage of having such a diagram is that it is
sometimes possible to see a deadlock directly from the RAG.

75. What is a deadlock?


Deadlock is a situation when two or more processes wait for each other to finish and
none of them ever finish. Consider an example when two trains are coming toward each
other on the same track and there is only one track, none of the trains can move once
they are in front of each other. A similar situation occurs in operating systems when

there are two or more processes that hold some resources and wait for resources held by
other(s).
76. What is the goal and functionality of memory management?
The goal and functionality of memory management are as follows;
• Relocation
• Protection
• Sharing
• Logical organization
• Physical organization
77. Write a difference between physical address and logical address?
• Basic: the logical address is the virtual address generated by the CPU; the
physical address is a location in a memory unit.
• Address space: the set of all logical addresses generated by the CPU in
reference to a program is the Logical Address Space; the set of all physical
addresses mapped to the corresponding logical addresses is the Physical
Address Space.
• Visibility: the user can view the logical address of a program but can never
view its physical address.
• Access: the user uses the logical address to access the physical address; the
physical address cannot be accessed directly.
• Generation: the logical address is generated by the CPU; the physical address
is computed by the MMU.

78. Explain address binding?
The Association of program instruction and data to the actual physical memory locations
is called Address Binding.

79. Write different types of address binding?


Address Binding is divided into three types as follows.

• Compile-time Address Binding

• Load time Address Binding

• Execution time Address Binding

80. Write an advantage of dynamic allocation algorithms?

• When we do not know how much amount of memory would be needed for
the program beforehand.

• When we want data structures without any upper limit of memory space.

• When you want to use your memory space more efficiently.


• Insertions and deletions in dynamically created lists can be done very easily
just by manipulating addresses, whereas with statically allocated memory,
insertions and deletions lead to more movement and wastage of memory.

• When you want to use the concept of structures and linked lists in
programming, dynamic memory allocation is a must.

81. Write a difference between internal fragmentation and external


fragmentation?

1. In internal fragmentation, fixed-sized memory blocks are assigned to a
process; in external fragmentation, variable-sized memory blocks are
assigned.
2. Internal fragmentation happens when the allocated block is larger than the
process requires; external fragmentation happens when processes are removed
from memory, leaving scattered holes.
3. The solution to internal fragmentation is the best-fit block; the solutions to
external fragmentation are compaction, paging, and segmentation.
4. Internal fragmentation occurs when memory is divided into fixed-sized
partitions; external fragmentation occurs when memory is divided into
variable-sized partitions based on the sizes of processes.
5. The difference between the memory allocated and the space required is called
internal fragmentation; the unused spaces formed between non-contiguous
memory fragments, too small to serve a new process, are called external
fragmentation.

82. Define the Compaction?


The process of collecting fragments of available memory space into contiguous blocks
by moving programs and data in a computer’s memory or disk.

83. Write about the advantages and disadvantages of a hashed-page table?


Advantages

• The main advantage is synchronization.

• In many situations, hash tables turn out to be more efficient than search trees
or any other table lookup structure. For this reason, they are widely used in
many kinds of computer software, particularly for associative arrays,
database indexing, caches, and sets.

Disadvantages

• Hash collisions are practically unavoidable when hashing a random subset
of a large set of possible keys.

• Hash tables become quite inefficient when there are many collisions.
• Hash table does not allow null values, like a hash map.

84. Write a difference between paging and


segmentation?

1. In paging, the program is divided into fixed-size pages; in segmentation, the
program is divided into variable-size sections.
2. For paging, the operating system is accountable; for segmentation, the
compiler is accountable.
3. Page size is determined by the hardware; section size is given by the user.
4. Paging is faster in comparison to segmentation; segmentation is slower.
5. Paging can result in internal fragmentation; segmentation can result in
external fragmentation.
6. In paging, the logical address is split into a page number and page offset; in
segmentation, it is split into a section number and section offset.
7. Paging comprises a page table that encloses the base address of every page;
segmentation comprises a segment table that encloses the segment number
and segment offset.
8. A page table is employed to keep the page data; a section table maintains the
section data.
9. In paging, the operating system must maintain a free-frame list; in
segmentation, the operating system maintains a list of holes in main memory.
10. Paging is invisible to the user; segmentation is visible to the user.
11. In paging, the processor needs the page number and offset to calculate the
absolute address; in segmentation, the processor uses the segment number
and offset to calculate the full address.

85. Write a definition of Associative Memory and Cache Memory?

• A memory unit accessed by content is called associative memory; a fast and
small memory is called cache memory.
• Associative memory reduces the time required to find an item stored in
memory; cache memory reduces the average memory access time.
• In associative memory, data is accessed by its content; in cache memory, data
is accessed by its address.
• Associative memory is used where search time must be very short; cache
memory is used when a particular group of data is accessed repeatedly.
• The basic characteristic of associative memory is its logic circuit for matching
its content; the basic characteristic of cache memory is its fast access.

86. What is “Locality of reference”?


The locality of reference refers to a phenomenon in which a computer program tends to
access the same set of memory locations for a particular time period. In other words,
Locality of Reference refers to the tendency of the computer program to access
instructions whose addresses are near one another.

87. Write down the advantages of virtual memory?

• A higher degree of multiprogramming.
• Allocating memory is easy and cheap
• Eliminates external fragmentation
• Data (page frames) can be scattered all over physical memory
• Pages are mapped appropriately anyway
• Large programs can be written, as the virtual space available is huge
compared to physical memory.
• Less I/O required leads to faster and easy swapping of processes.
• More physical memory is available, as programs are stored on virtual
memory, so they occupy very less space on actual physical memory.
• More efficient swapping
88. How to calculate performance in virtual memory?

The performance of a virtual memory management system depends on the total
number of page faults, which in turn depends on the paging policies and frame
allocation.

Effective access time = (1-p) x Memory access time + p x page fault time

89. Write down the basic concept of the file system?

A file is a collection of related information that is recorded on secondary storage. Or file


is a collection of logically related entities. From the user’s perspective, a file is the
smallest allotment of logical secondary storage.

90. Write the names of different operations on file?


Operation on file:

• Create
• Open
• Read
• Write
• Rename
• Delete
• Append
• Truncate
• Close
91. Define the term Bit-Vector?

A Bitmap or Bit Vector is a series or collection of bits where each bit corresponds to a
disk block. The bit can take two values: 0 and 1: 0 indicates that the block is allocated
and 1 indicates a free block.

92. What is a File allocation table?


FAT stands for File Allocation Table and this is called so because it allocates different
files and folders using tables. This was originally designed to handle small

file systems and disks. A file allocation table (FAT) is a table that an operating system
maintains on a hard disk that provides a map of the cluster (the basic units of logical
storage on a hard disk) that a file has been stored in.

93. What is rotational latency?


Rotational Latency: Rotational latency is the time taken by the desired sector of the
disk to rotate into a position where it can be accessed by the read/write heads. A disk
scheduling algorithm that gives minimum rotational latency is better.

94. What is seek time?


Seek Time: Seek time is the time taken to locate the disk arm to a specified track where
the data is to be read or written. So the disk scheduling algorithm that gives a minimum
average seek time is better.

Advanced OS Interview Questions

95. What is Belady’s Anomaly?


Bélády’s anomaly is a phenomenon in which, for some page replacement policies,
increasing the number of page frames results in an increase in the number of page
faults. It occurs when First In First Out (FIFO) page replacement is used.

96. What happens if a non-recursive mutex is locked more than once?


Deadlock. If a thread that had already locked a mutex, tries to lock the mutex again, it
will enter into the waiting list of that mutex, which results in a deadlock. It is because no
other thread can unlock the mutex. An operating system implementer can exercise care
in identifying the owner of the mutex and return it if it is already locked by the same
thread to prevent deadlocks.

97. What are the advantages of a multiprocessor system?

There are some main advantages of a multiprocessor system:

• Enhanced performance.

• Multiple applications.

• Multi-tasking inside an application.

• High throughput and responsiveness.

• Hardware sharing among CPUs.

98. What are real-time systems?


A real-time system means that the system is subjected to real-time, i.e., the response
should be guaranteed within a specified timing constraint or the system should meet the
specified deadline.

99. How to recover from a deadlock?


We can recover from a deadlock by following methods:

• Process termination
  o Abort all deadlocked processes
  o Abort one process at a time until the deadlock is eliminated
• Resource preemption
  o Rollback
  o Selecting a victim

100. What factors determine whether a detection algorithm must be utilized in a deadlock avoidance system?


One is that it depends on how often a deadlock is likely to occur under the
implementation of this algorithm. The other has to do with how many processes will be
affected by deadlock when this algorithm is applied.

101. Explain the resource allocation graph?


The resource allocation graph (RAG) describes the state of the system in terms of
processes and resources. One advantage of having such a diagram is that sometimes a
deadlock can be detected directly by inspecting the RAG.

DBMS

1. Define database management system and its applications.


Database management system (DBMS) is a collection of interrelated data and a set
of programs to access those data. Applications:
• Banking
• Airlines
• Universities
• Credit card transactions
• Tele communication
• Finance
• Sales
• Manufacturing
• Human resources

2. What are the advantages of using a DBMS?


The advantages of using a DBMS are:
a) Controlling redundancy
b) Restricting unauthorized access
c) Providing multiple user interfaces
d) Enforcing integrity constraints
e) Providing backup and recovery

3. What are the disadvantages of file processing system?


a. Data redundancy and inconsistency
b. Difficulty in accessing data
c. Atomicity of updates
d. Concurrent access by multiple users
e. Security problems

4. List the features of a database.


• It is a persistent (stored) collection of related data.
• The data is input (stored) only once.
• The data is organized (in some fashion).
• The data is accessible and can be queried (effectively and efficiently).

5. Define a database.

A database is a collection of related data. It is defined by specifying the data types,
structures, and constraints of the data to be stored, using a Data Definition Language (DDL).

6. Define a data model.


A data model is a collection of concepts that can be used to describe the
structure of a database. The model provides the necessary means to achieve this
abstraction.

7. What are the categories of data models?

High level/conceptual data models – provide concepts close to the way users perceive
the data.
Physical data models – provide concepts that describe the details of how data is
stored in the computer. These concepts are generally meant for the specialist, not
the end user.
Representational data models – provide concepts that may be understood by the end
user but are not far removed from the way data is organized.

8. Define high level/conceptual data model.


Entity – represents a real world object or concept.
Attribute – represents a property of interest that describes an entity, such as name or
salary.
Relationship – represents an association among two or more entities.

9. Define representational data models.


Representational data models are used most frequently in commercial DBMSs. They
include relational data models,
and legacy models such as network and hierarchical models.

10. Define physical data model.


Physical data models describe how data is stored in files by representing record
formats, record orderings and access paths.

11. What is object data model.


Object data models –a group of higher level implementation data models closer to
conceptual data models

12. What is internal level schema? The internal level has an internal schema, which
describes the physical storage structure of the database.

13. What is the conceptual level schema.

The conceptual level –has a conceptual schema which describes the structure of the
database for users. It hides the details of the physical storage structures, and concentrates
on describing entities, data types, relationships, user operations and constraints. Usually
a representational data model is used to describe the conceptual schema.

14. What is an external or view level schema.


The External or View level – includes external schemas or user views. Each external
schema describes the part of the database that a particular user group is interested in
and hides the rest of the database from that user group. It is represented using a
representational data model.

15. List the components of DBMS.


The major components of database management system are:
• Software
• Hardware
• Data
• Procedures
• Database Access Language
• Users
16. What is relational model.
The relational model represents the database as a collection of relations. A relation is
nothing but a table of values. Every row in the table represents a collection of related
data values. These rows in the table denote a realworld entity or relationship.

17. List some of the relational model concepts.


1. Attribute: Each column in a Table. Attributes are the properties which
define a relation. e.g., Student_Rollno, NAME,etc.
2. Tables – In the relational model, relations are saved in table format. A table
is stored along with its entities and has two properties, rows and columns. Rows
represent records and columns represent attributes.
3. Tuple – It is nothing but a single row of a table, which contains a single record.
4. Relation Schema: A relation schema represents the name of the relation with
its attributes.
5. Degree: The total number of attributes which in the relation is called the
degree of the relation.
6. Cardinality: Total number of rows present in the Table.

7. Column: The column represents the set of values for a specific attribute.
8. Relation instance – Relation instance is a finite set of tuples in the RDBMS
system. Relation instances never have duplicate tuples.
9. Relation key - Every row has one, two or multiple attributes, which is called
relation key.
10. Attribute domain – Every attribute has some predefined value and scope
which is known as attribute domain

18. List some relational integrity constraints.


1. Domain Constraints
2. Key constraints
3. Referential integrity constraints

19. Define domain constraints.


Domain constraints are violated if an attribute value does not appear in the
corresponding domain or is not of the appropriate data type. Domain constraints
specify that within each tuple, the value of each attribute must be an atomic value
from the domain of that attribute. The data types associated with domains include
integers, real numbers, characters, Booleans, and fixed- or variable-length strings.
Example: CREATE DOMAIN CustomerName CHECK (VALUE NOT NULL). This
example creates a domain CustomerName whose value must not be NULL.

20. Define key constraints.


An attribute that can uniquely identify a tuple in a relation is called the key of the table.
The value of this attribute has to be unique across the different tuples in the relation.
Example: In the given table, CustomerID is a key attribute of the Customer table.
Each customer has a single key value; CustomerID = 1 belongs only to
CustomerName = "Google".
CustomerID CustomerName

1 Google
2 Amazon
3 Apple

21. Define referential integrity constraints.


Referential integrity constraints are based on the concept of foreign keys. A foreign key
is an attribute of a relation that refers to a key attribute of a different (or the same)
relation. The referential integrity constraint states that the referenced key value must
exist in the referenced table.
Example:
In the above example, we have 2 relations, Customer and Billing.
Tuple for CustomerID =1 is referenced twice in the relation Billing. So we know
CustomerName=Google has billing amount $300
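As a sketch, the Customer/Billing example can be tried with SQLite from Python. The table and column names follow the example above; note that SQLite only enforces foreign keys when the pragma is enabled:

```python
import sqlite3

# In-memory database; SQLite enforces foreign keys only when this pragma is on.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT)")
con.execute("""CREATE TABLE Billing (
    BillID INTEGER PRIMARY KEY,
    CustomerID INTEGER REFERENCES Customer(CustomerID),
    Amount REAL)""")

con.execute("INSERT INTO Customer VALUES (1, 'Google')")
con.execute("INSERT INTO Billing VALUES (10, 1, 300.0)")      # OK: customer 1 exists
try:
    con.execute("INSERT INTO Billing VALUES (11, 99, 50.0)")  # customer 99 does not exist
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # the foreign key constraint blocks the insert
```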

22. List the operations which can be done on the relational model.

The operations are insert, update, delete and select.
• Insert is used to insert data into the relation.
• Delete is used to delete tuples from the table.
• Modify allows you to change the values of some attributes in existing tuples.
• Select allows you to choose a specific range of data.

23. What are the advantages of relational model.


• Simplicity: A relational data model is simpler than the hierarchical and
network model.
• Structural Independence: The relational database is only concerned with
data and not with a structure. This can improve the performance of the model.
• Easy to use: The relational model is easy as tables consisting of rows and
columns is quite natural and simple to understand
• Query capability: It makes possible for a highlevel query language like SQL
to avoid complex database navigation.
• Data independence: The structure of a database can be changed without
having to change any application.
• Scalable: A relational database can be enlarged, in both the number of rows
and the number of fields, to enhance its usability.

24. What are the disadvantages of relational model?


• Few relational databases have limits on field lengths which can't be exceeded.
• Relational databases can sometimes become complex as the amount of data
grows, and the relations between pieces of data become more complicated.
• Complex relational database systems may lead to isolated databases where
the information cannot be shared from one system to another.

25. Define relational algebra.


Relational algebra is a procedural query language used as an intermediate
language within a DBMS.
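A minimal sketch of the relational-algebra selection (sigma) and projection (pi) operators over an invented student relation, with relations modeled as lists of dictionaries:

```python
# Relations as lists of dicts; selection and projection as plain functions.
students = [
    {"rollno": 1, "name": "Asha", "dept": "CSE"},
    {"rollno": 2, "name": "Ravi", "dept": "ECE"},
    {"rollno": 3, "name": "Mina", "dept": "CSE"},
]

def select(relation, predicate):   # sigma: keep tuples matching a condition
    return [t for t in relation if predicate(t)]

def project(relation, attrs):      # pi: keep only the named attributes
    return [{a: t[a] for a in attrs} for t in relation]

cse = select(students, lambda t: t["dept"] == "CSE")
print(project(cse, ["name"]))  # [{'name': 'Asha'}, {'name': 'Mina'}]
```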

ARTIFICIAL INTELLIGENCE BASICS

1. Define Artificial Intelligence (AI).


The study of how to make computers do things at which, at the moment, people are
better.
• Systems that think like humans
• Systems that act like humans
• Systems that think rationally
• Systems that act rationally

2. Define Artificial Intelligence formulated by Haugeland.


The exciting new effort to make computers think machines with minds in the full and
literal sense.

3. Define Artificial Intelligence in terms of human performance.


The art of creating machines that performs functions that require intelligence when
performed by people.

4. Define Artificial Intelligence in terms of rational acting.


A field of study that seeks to explain and emulate intelligent behaviors in terms of
computational processes – Schalkoff. The branch of computer science that is concerned
with the automation of intelligent behavior – Luger & Stubblefield.

5. Define Artificial Intelligence in terms of rational thinking.

The study of mental faculties through the use of computational models – Charniak &
McDermott. The study of the computations that make it possible to perceive, reason
and act – Winston.

6. Define Rational Agent.


It is one that acts, so as to achieve the best outcome (or) when there is uncertainty, the
best expected outcome.

7. Define Agent.
An Agent is anything that can be viewed as perceiving (i.e.) understanding its
environment through sensors and acting upon that environment through actuators.

8. Define an Omniscient agent.


An omniscient agent knows the actual outcome of its action and can act
accordingly; but omniscience is impossible in reality.

9. What are the factors that a rational agent should depend on at any given time?

1. The performance measure that defines the degree of success.
2. Everything that the agent has perceived so far; this complete perceptual
history is called the percept sequence.
3. What the agent knows about the environment.
4. The actions that the agent can perform.

10. Define Architecture.


The action program will run on some sort of computing device which is called as
Architecture

11. List the various type of agent program.


• Simple reflex agent program.
• Agent that keep track of the world.
• Goal based agent program.
• Utility based agent program

12. Give the structure of agent in an environment?

An agent interacts with its environment through sensors and actuators: it perceives
(i.e. understands) its environment through sensors and acts upon that environment
through actuators.

13. Define Percept Sequence.


An agent’s choice of action at any given instant can depend on the entire percept
sequence observed to date.

14. Define Agent Function.


It is a mathematical description which deals with the agent’s behavior that maps
the given percept sequence into an action.

15. Define Agent Program.


Agent function for an agent will be implemented by agent program.

16. How agent should act?


Agent should act as a rational agent. Rational agent is one that does the right thing,
(i.e.) right actions will cause the agent to be most successful in the environment.

17. How to measure the performance of an agent?


The performance measure of an agent is obtained by analyzing two aspects: how
and when the agent acts.

18. Define performance measures.


Performance measure embodies the criterion for success of an agent’s behavior.

19. Define Ideal Rational Agent.


For each possible percept sequence, a rational agent should select an action that is
expected to maximize its performance measure, given the evidence provided by the
percept sequence and whatever built in knowledge the agent has.

20. Define Omniscience.


An Omniscience agent knows the actual outcome of its actions and can act accordingly.

21. Define Information Gathering.


Doing actions in order to modify future percepts sometimes called information gathering.

22. What is autonomy?


A rational agent should be autonomous. It should learn what it can do to compensate
for partial (or) in correct prior knowledge.

23. What is important for task environment?


PEAS → P – Performance measure, E – Environment, A – Actuators, S – Sensors
Example: an interactive English tutor.
Performance measure: maximize student’s score on test
Environment: set of students, testing agency
Actuators: display of exercises, suggestions, corrections
Sensors: keyboard entry

24. What is environment program?


It defines the relationship between agents and environments.

25. List the properties of environments.


o Fully Observable vs Partially Observable
o Deterministic vs Stochastic
o Episodic vs Sequential
o Static vs Dynamic
o Discrete vs Continuous
o Single Agent vs Multi agent
  a. Competitive multi agent
  b. Co-operative multi agent

26. What is Environment Class (EC) and Environment Generator (EG)?
EC – It is defined as a group of environments.
EG – It selects the environment from the environment class in which the agent has to run.

27. What is the structure of intelligent Agent?


Intelligent Agent = Architecture + Agent Program

28. Define problem solving agent.


Problem solving agent is one kind of goal based agent, where the agent should select
one action from a sequence of actions which lead to desirable states.

29. List the steps involved in simple problem solving technique.


i. Goal formulation
ii. Problem formulation
iii. Search
iv. Solution
v. Execution phase

30. What are the different types of problem?


Single state problems, multiple state problems, contingency problems, and
exploration problems.

31. What are the components of a problem?


There are four main components:
i. initial state
ii. successor function
iii. goal test
iv. path cost
Related notions are the operator, the state space, and a path.

32. Define State Space.


The set of all possible states reachable from the initial state by any sequence of action is
called state space.

33. Define Path.


A path in the state space is a sequence of state connected by sequence of actions.

34. Define Path Cost.


A function that assigns a numeric cost to each path; the cost of a path is the sum of
the costs of the individual actions along the path.

35. Give example problems for Artificial Intelligence.


i. Toy problems
ii. Real world problems

36. Give examples for real world and toy problems.


Real world problem examples:
i. Airline travel problem
ii. Touring problem
iii. Traveling salesman problem
iv. VLSI layout problem
v. Robot navigation
vi. Automatic assembly
vii. Internet searching
Toy problem examples:
Vacuum world problem, 8-queens problem, 8-puzzle problem

37. Define search tree.


The tree which is constructed for the search process over the state space is called search
tree.

38. Define search node.


The root of the search tree that is the initial state of the problem is called search node.

39. Define fringe.


The collection of nodes that have been generated but not yet expanded, this collection is
called fringe or frontier.

40. List the performance measures of search strategies.


i. Completeness
ii. Optimality
iii. Time complexity
iv. Space complexity

41. Define branching factor (b).


The number of successor nodes connected to each node in the search tree is called the
branching factor.

42. What is important for agent?


Time, i.e. intervals, is important for the agent to decide when to take an action.
There are 2 kinds:
i. Moments
ii. Extended intervals

43. Define Backtracking search.
The variant of depth first search called backtracking search. Only one successor is
generated at a time rather than all successor, partially expanded node remembers which
successor generate next is called Backtracking search.

44. Define Uniform cost search.


Uniform cost search expands the node ‘n’ with the lowest path cost instead of expanding
the shallowest node.

45. Define Depth first search.


It expands the deepest node in the current fringe of the search tree.

46. Define depth limited search.


The problem of unbounded tress can be avoided by supplying depth limit 1(i.e.) nodes
at depth 1 are treated as if they have no successors. This is called Depth Limited
search.

47. What is informed search?


One that uses problem – specific knowledge beyond the definition of the problem
itself and it can find solutions more efficiently than an uninformed strategy.

48. What is the use of QUEUING_FN?


QUEUING_FN inserts a set of elements into the queue. Different varieties of the
queuing function produce different varieties of the search algorithm.

49. Mention the criteria for the evaluation of search strategy.


There are 4 criteria: Completeness, time complexity, space complexity, optimality

4. List the various search strategies.


a. BFS
b. Uniform cost search
c. DFS
d. Depth limited search
e. Iterative deepening search
f. Bidirectional search

50. List the various informed search strategy.


Best first search – greedy search, A* search.
Memory bounded search – iterative deepening A* (IDA*) search, simplified memory
bounded A* (SMA*) search.
Iterative improvement search – hill climbing, simulated annealing.

51. What is Best First Search?


Best First Search is an instance of the general TREE SEARCH or GRAPH SEARCH
algorithm in which a node is selected for expansion based on an evaluation function,
f(n).

52. Define Evaluation function, f(n).


A node with the lowest evaluation is selected for expansion, because evaluation
measures distance to the goal.

53. Define Heuristic function, h(n).

h(n) is defined as the estimated cost of the cheapest path from node n to a goal node.

54. Define Greedy Best First Search.


It expands the node that is closest to the goal (i.e.) to reach solution in a quicker
way. It is done by using the heuristic function: f(n) = h(n).

55. Define A* search.


A* search evaluates nodes by combining g(n), the cost to reach the node and h(n), the
cost to get from the node to the goal. f(n) = g(n) + h(n)
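A minimal A* sketch over a small invented weighted graph, always expanding the node with the lowest f(n) = g(n) + h(n); the heuristic values below are assumed admissible estimates:

```python
import heapq

def a_star(graph, h, start, goal):
    """A*: expand the frontier node with the lowest f(n) = g(n) + h(n)."""
    frontier = [(h[start], 0, start, [start])]   # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nxt, cost in graph[node]:
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):  # found a cheaper route to nxt
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h[nxt], g2, nxt, path + [nxt]))

graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 5)],
         "B": [("G", 1)], "G": []}
h = {"S": 3, "A": 2, "B": 1, "G": 0}  # assumed admissible heuristic
print(a_star(graph, h, "S", "G"))  # (['S', 'A', 'B', 'G'], 4)
```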

56. Define Admissible heuristic h (n).


In A* search, if it is optimal then, h(n) is an admissible heuristic which means h(n)
never overestimates the cost to reach the goal.

57. What is triangle inequality?


It states that each side of a triangle cannot be longer than the sum of the other two slides
of the triangle.

58. What are the 2 types of memory bounded heuristic algorithms?


i. Recursive Best First Search (RBFS)
ii. Memory bounded A* (MA*)

59. Differentiate BFS & DFS.


BFS means breadth-wise search. Its space complexity is higher. It is complete and
finds the shallowest solution, which is optimal when all step costs are equal. Its
queuing function appends newly generated nodes at the end of the queue (FIFO).
DFS means depth-wise search. Its space complexity is lower. It does not guarantee an
optimal solution. Its queuing function inserts newly generated nodes at the front of
the queue (LIFO).
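The FIFO/LIFO difference can be seen in a short sketch over an invented graph: BFS returns a shallowest path, while DFS may return a deeper one it reaches first.

```python
from collections import deque

graph = {"A": ["C", "B"], "B": ["E", "D"], "C": ["F"], "E": ["F"], "D": [], "F": []}

def bfs(start, goal):
    """FIFO frontier: explores level by level, returns a shallowest path."""
    frontier, visited = deque([[start]]), {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])

def dfs(start, goal):
    """LIFO frontier: dives deep first; the path found need not be shortest."""
    frontier, visited = [[start]], set()
    while frontier:
        path = frontier.pop()
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt in graph[node]:
            frontier.append(path + [nxt])

print(bfs("A", "F"))  # ['A', 'C', 'F'] -- shortest path
print(dfs("A", "F"))  # ['A', 'B', 'E', 'F'] -- deeper path found first
```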

60. What is RBFS?
It keeps track of the f-value of the best alternative path available from any ancestor of
the current node. RBFS remembers the f-value of the best leaf in the forgotten subtree
and can therefore decide whether it is worth re-expanding that subtree some time later.
61. Define iterative deepening search.

Iterative deepening is a strategy that sidesteps the issue of choosing the best depth
limit by trying all possible depth limits: first depth 0, then depth 1, then depth 2, and
so on.

62. What are the 2 ways to use all available memory?


i. Memory bounded A*(MA*)
ii. Simplified Memory bounded A*(SMA*)

63. What is SMA* search?


SMA* expands the best leaf until memory is full and it drops the oldest worst leaf
node and expands the newest best leaf node.

64. What is called as bidirectional search?


The idea behind bidirectional search is to simultaneously search both forward from the
initial state & backward from the goal & stop when the two searches meet in the
middle.

65. What is metalevel state space?


Each state in a metalevel state space captures the internal state of a program that is
searching in an object level state space.

66. What is Manhattan distance, h2?


The sum of the horizontal and vertical distances of the tiles from their goal positions
in a 15 puzzle problem is called Manhattan distance (or) city block distance.
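A sketch of the Manhattan-distance heuristic for the 15-puzzle, with states as flat lists and 0 marking the blank; the sample state below is invented (tiles 1 and 2 swapped):

```python
def manhattan(state, goal, width=4):
    """Sum of horizontal + vertical tile displacements (15-puzzle: width 4)."""
    pos = {tile: divmod(i, width) for i, tile in enumerate(goal)}
    dist = 0
    for i, tile in enumerate(state):
        if tile == 0:                 # the blank does not count
            continue
        r, c = divmod(i, width)       # current row, column
        gr, gc = pos[tile]            # goal row, column
        dist += abs(r - gr) + abs(c - gc)
    return dist

goal  = list(range(1, 16)) + [0]           # tiles 1..15, blank last
state = [2, 1] + list(range(3, 16)) + [0]  # tiles 1 and 2 swapped
print(manhattan(state, goal))  # 2
```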

67. Give the drawback of DFS.


The drawback of DFS is that it can get stuck going down the wrong path. Many
problems have very deep or even infinite search trees, so DFS will never be able to
recover from an unlucky choice at one of the nodes near the top of the tree. DFS
should therefore be avoided for search trees with large or infinite maximum depths.

68. Define Branching factor b*.


The effective branching factor b* is the branching factor that a uniform tree of depth
d would have to have in order to contain N+1 nodes.

69. Write the time & space complexity associated with depth limited search.

Time complexity = O(b^l), where b is the branching factor and l the depth limit.
Space complexity = O(bl)

70. What is a relaxed problem?


A problem with fewer restrictions on the actions is called a relaxed problem.

71. What is a pattern database?


This database is the storage of exact solution costs for every possible sub problem
instance.

72. What is a disjoint pattern database?


The sum of the two costs is still a lower bound on the cost of solving the entire problem
is called a disjoint pattern database.

73. What is local search?


It operates using a single current state rather than multiple paths and generally moves
only to neighbours of that state.

74. Define Optimization Problems.


The aim of this problem is to find the best state according to an objective function.
75. What are the 2 parts of Landscape?
i. Location, defined by the state.
ii. Elevation, defined by the value of the heuristic cost function (or objective
function).

76. Define Global minimum.


If elevation corresponds to cost, then the aim is to find the lowest valley is called global
minimum.
77. Define Global Maximum.
If elevation corresponds to an objective function, then the aim is to find the highest peak
is called global maximum.

78. Define Hill Climbing search.


It is a loop that continually moves in the direction of increasing value, i.e. uphill, and
terminates when it reaches a “peak” where no neighbor has a higher value.
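A minimal hill-climbing sketch over an invented one-dimensional objective; it stops as soon as no neighbor improves, which is exactly why it can get stuck on a local maximum:

```python
def hill_climb(f, x, neighbours):
    """Keep moving to the best neighbour until none improves the current state."""
    while True:
        best = max(neighbours(x), key=f)
        if f(best) <= f(x):
            return x        # peak reached (possibly only a local maximum)
        x = best

# Maximise a simple objective over the integers: peak at x = 3.
f = lambda x: -(x - 3) ** 2 + 9
print(hill_climb(f, 0, lambda x: [x - 1, x + 1]))  # 3
```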

79. List some drawbacks of hill climbing process.

Local maxima: a local maximum, as opposed to the global maximum, is a peak that is
lower than the highest peak in the state space. Once a local maximum is reached, the
algorithm will halt even though the solution may be far from satisfactory.
Plateaux: a plateau is an area of the state space where the evaluation function is
essentially flat. The search will conduct a random walk.

80. What is the meaning for greedy local search?

It picks a good neighbor state without thinking ahead about where to go next.

81. Define Local maxima.


A local maximum is a peak that is higher than each of its neighboring states, but lower
than the global maximum.

82. What are the variants of hill climbing?


i. Stochastic hill climbing
ii. First choice hill climbing
iii. Simulated annealing search
iv. Local beam search
v. Stochastic beam search

83. Define annealing.


Annealing is the process used to harden metals (or) glass by heating them to a high
temperature and then gradually cooling them, thus allowing the material to coalesce
into a low energy crystalline state.

84. Define simulated annealing.


This algorithm, instead of picking the best move, it picks a random move. If the move
improves the situation, it is always accepted.
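A sketch of simulated annealing; the objective, starting temperature, and cooling rate below are invented for illustration. Worse moves are accepted with probability exp(delta/t), which shrinks as the temperature cools:

```python
import math, random

def simulated_annealing(f, x, neighbour, t0=10.0, cooling=0.95, steps=500):
    """Pick a random move; always accept improvements, sometimes accept worse ones."""
    t = t0
    for _ in range(steps):
        x2 = neighbour(x)
        delta = f(x2) - f(x)
        if delta > 0 or random.random() < math.exp(delta / t):
            x = x2
        t *= cooling   # gradually "cool" so bad moves become rarer
    return x

random.seed(0)
f = lambda x: -(x - 7) ** 2          # single global maximum at x = 7
best = simulated_annealing(f, 0, lambda x: x + random.choice([-1, 1]))
print(best)
```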

85. What is the advantage of memory bounded search techniques?


We can reduce space requirements of A* with memory bounded algorithm such as
IDA* & SMA*.

86. Give the procedure of IDA* search.


Minimize f(n) = g(n) + h(n), which combines the advantages of uniform cost search
and greedy search. A* is complete and optimal, but its space complexity is still
prohibitive. Iterative improvement algorithms, by contrast, keep only a single state
in memory, but can get stuck on local maxima. In IDA*, each iteration is a depth
first search, just as in regular iterative deepening, but the depth first search is
modified to use an f-cost limit rather than a depth limit. Thus each iteration
expands all nodes inside the contour for the current f-cost.

87. List some properties of SMA* search.


* It will utilize whatever memory is made available to it.
* It avoids repeated states as far as its memory allows.
* It is complete if the available memory is sufficient to store the shallowest
solution path.
* It is optimal if enough memory is available to store the shallowest optimal
solution path; otherwise it returns the best solution that can be reached with the
available memory.
* When enough memory is available for the entire search tree, the search is
optimally efficient.

88. What is Genetic Algorithms?


Genetic Algorithm is a variant of stochastic beam search in which successor states are
generated by combining two parent states, rather than by modifying a single state.

89. Define Online Search agent.


Agent operates by interleaving computation and action (i.e.) first it takes an action, and
then it observes the environment and computes the next action.

90. What are the things that agent knows in online search problems?
a. Actions(s)
b. Step cost function C(s, a, s’)
c. Goal TEST(s)

91. Define CSP.


Constraint Satisfaction problem (CSP) is defined by a set of variables X1,X2,…Xn and
set of constraints C1,C2,…Cm.

92. Define Successor function.


A value can be assigned to any unassigned variable, provided that does not conflict
with previously assigned variables.

93. What are the types of constraints?


There are 5 types,
a. Unary constraints relates one variable.
b. A binary constraint relates two variables.
c. Higher order constraints relate more than two variables.
d. Absolute constraints.
e. Preference constraints.
94. Define MRV.
Minimum remaining values heuristic chooses the variable with the fewest “legal” values.

95. Define LCV.


Least constraining value heuristic prefers the value that rules out the fewest choices for
the neighboring variables in the constraint graph.

96. Define Conflict directed back jumping.


A back jumping algorithm that uses conflict sets defined in this way is called Conflict
directed back jumping.

97. Define constraint propagation.


It is the general term for propagating, i.e. spreading, the implications of a constraint
on one variable onto other variables.

98. Define Cycle cut set.


The process of choosing a subset S of variables from the CSP such that the constraint
graph becomes a tree after removal of S; S is called a cycle cutset.

99. Define Tree decomposition.


The constraint graph is divided into a set of connected sub problems. Each sub problem
is solved independently and the resulting solutions are then combined. This process is
called tree decomposition.

100. Define Alpha beta pruning.


Alpha beta pruning eliminates away branches that cannot possibly influence the final
decision
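A sketch of minimax with alpha-beta pruning over a tiny invented game tree given as nested lists (leaves are numeric utilities):

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, children, value):
    """Minimax with alpha-beta pruning: cut branches that cannot change the result."""
    kids = children(node)
    if depth == 0 or not kids:
        return value(node)
    if maximizing:
        best = -math.inf
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False, children, value))
            alpha = max(alpha, best)
            if beta <= alpha:
                break          # beta cut-off: MIN will never allow this branch
        return best
    best = math.inf
    for child in kids:
        best = min(best, alphabeta(child, depth - 1, alpha, beta, True, children, value))
        beta = min(beta, best)
        if beta <= alpha:
            break              # alpha cut-off
    return best

# Tiny game tree: MAX chooses between two MIN nodes with leaf utilities.
tree = [[3, 5], [2, 9]]
children = lambda n: n if isinstance(n, list) else []
value = lambda n: n
print(alphabeta(tree, 3, -math.inf, math.inf, True, children, value))  # 3
```

In the second MIN node, the leaf 9 is never evaluated: once MIN can force 2, which is worse for MAX than the 3 already guaranteed, the branch is pruned.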

101. Define FOL.


FOL is a first order logic. It is a representational language of knowledge which is
powerful than propositional logic (i.e.) Boolean Logic. It is an expressive, declarative,
compositional language.

102. Define a knowledge Base:


Knowledge base is the central component of knowledge base agent and it is described
as a set of representations of facts about the world.

103. With an example, show objects, properties functions and relations.


Example: “EVIL KING JOHN, BROTHER OF RICHARD, RULED ENGLAND IN 1200”
Objects: John, Richard, England, 1200
Relation: Ruled
Properties: Evil, King
Functions: Brother of

104. Define a Sentence?


Each individual representation of facts is called a sentence. The sentences are expressed
in a language called as knowledge representation language.

105. Define an inference procedure


An inference procedure reports whether or not a sentence is entailed by a knowledge
base, given the knowledge base and the sentence. An inference procedure i can be
described by the sentences that it can derive. If i can derive α from the knowledge
base KB, we write KB ⊢i α, read “α is derived from KB by i” or “i derives α from KB”.

106. Define Ontological commitment.


The difference between propositional and first order logic lies in their ontological
commitment, i.e. what each assumes about the nature of reality.

107. Define Epistemological commitment.


The logic that allows the possible states of knowledge with respect to each fact.

108. Define domain and domain elements.


The set of objects is called domain, sometimes these objects are referred as domain
elements.

109. What are the three levels in describing knowledge based agent?
• Logical level
• Implementation level
• Knowledge level or epistemological level

110. Define Syntax?


Syntax is the arrangement of words. Syntax of a knowledge describes the possible
configurations that can constitute sentences. Syntax of the language describes how to
make sentences.

111. Define Semantics
The semantics of the language defines the truth of each sentence with respect to each
possible world. With this semantics, when a particular configuration exists with in an
agent, the agent believes the corresponding sentence.

112. Define Logic


Logic is one which consists of:
i. A formal system for describing states of affairs, consisting of a) syntax and
b) semantics.
ii. Proof theory – a set of rules for deducing the entailments of a set of sentences.

113. What are the 3 types of symbols which are used to indicate objects, relations
and functions?
i) Constant symbols for objects
ii) Predicate symbols for relations
iii) Function symbols for functions

114. Define terms.


A term is a logical expression that refers to an object. We use the 3 kinds of symbols
above to build a term.

115. Define Atomic sentence.
Atomic sentence is formed from both objects and relations.
Example: Brother (William, Richard) – William is the brother of Richard.

116. Define Quantifier and it’s types.


Quantifiers are used to express properties of entire collection of objects rather than
representing the objects by name.
Types:
i. Universal quantifier
ii. Existential quantifier
iii. Nested quantifiers

117. What are the two we use to query and answer in knowledge base?
ASK and TELL.

118. Define kinship domain.


The domain of family relationship is called kinship domain which consists of objects
unary predicate, binary predicate, function, relation.

119. Define syntactic sugar.


Extension to the standard syntax (i.e.) procedure that does not change the semantics
(i.e.) meaning is called syntactic sugar.

120. Define synchronic and diachronic sentence. Sentences dealing with the same time
are called synchronic sentences. Sentences that allow reasoning “across time” are
called diachronic sentences.

121. What are the 2 types of synchronic rules? i. Diagnostic rules ii. Causal rules.

122. Define skolem constant.


The existential sentence says there is some object satisfying a condition, and the
instantiation process is just giving a name to that object. That name must not belong to
another object. The new name is called skolem constant.

123. What is truth preserving?


An inference algorithm that derives only entailed sentences is called sound or truth
preserving

124. Define a Proof


A sequence of applications of inference rules is called a proof. Finding a proof is exactly
like finding a solution to a search problem. If the successor function is defined to
generate all possible applications of inference rules, then the search algorithms can be
applied to find proofs.

125. Define a Complete inference procedure


An inference procedure is complete if it can derive all sentences that are entailed by a
set of premises.

126. Define Interpretation


An interpretation specifies exactly which objects, relations and functions are referred
to by the constant, predicate, and function symbols.

127. Define Validity of a sentence


A sentence is valid or necessarily true if and only if it is true under all possible
interpretations in all possible worlds.
128. Define Satisfiability of a sentence
A sentence is satisfiable if and only if there is some interpretation in some
world for which it is true.

129. Define true sentence


A sentence is true under a particular interpretation if the state of affairs it represents is
the case.

130. What are the basic Components of propositional logic?


Logical constants (True, False), proposition symbols (P, Q, ...), logical connectives (¬, ∧, ∨, ⇒, ⇔) and parentheses.

131. Define Modus Ponens rule in propositional logic.
Modus Ponens is a standard pattern of inference: given an implication α ⇒ β and its
premise α, the conclusion β can be inferred. Applying such rules derives chains of
conclusions that lead to the desired goal.

132. Define AND –Elimination rule in propositional logic


The AND-elimination rule states that from a given conjunction it is possible to infer any
of the conjuncts.
The OR-introduction rule states that from a sentence, we can infer its disjunction with
anything.

133. Define Unification.


Lifted inference rules require finding substitutions that make different logical
expressions look identical. This process is called unification.

134. Define Occur check.


When matching a variable in 2 expressions against a complex term, one must check
whether the variable itself occurs inside the term; if it does, the match fails. This is
called the occur check.
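These two ideas can be sketched in Python as a toy unifier (an illustration, not from the source; the term encoding, with '?'-prefixed strings as variables and tuples as compound terms, is my own):

```python
def is_var(t):
    """Variables are strings starting with '?'; other strings are constants."""
    return isinstance(t, str) and t.startswith('?')

def unify(x, y, s):
    """Return a substitution (dict) making x and y identical, or None."""
    if s is None:
        return None
    if x == y:
        return s
    if is_var(x):
        return unify_var(x, y, s)
    if is_var(y):
        return unify_var(y, x, s)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for a, b in zip(x, y):       # unify argument lists pairwise
            s = unify(a, b, s)
        return s
    return None

def occurs(v, t, s):
    """Occur check: does variable v appear anywhere inside term t?"""
    if v == t:
        return True
    if is_var(t) and t in s:
        return occurs(v, s[t], s)
    if isinstance(t, tuple):
        return any(occurs(v, a, s) for a in t)
    return False

def unify_var(v, t, s):
    if v in s:
        return unify(s[v], t, s)
    if occurs(v, t, s):              # e.g. ?x against f(?x) must fail
        return None
    return {**s, v: t}

# Unifying Brother(?x, Richard) with Brother(William, Richard) binds ?x
print(unify(('Brother', '?x', 'Richard'),
            ('Brother', 'William', 'Richard'), {}))   # {'?x': 'William'}
```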

135. Define pattern matching.


The inner loop of an algorithm involves finding all the possible unifiers with facts
in the KB. This is called pattern matching.

136. Explain the function of the Rete algorithm.
The Rete algorithm preprocesses the set of rules in the KB to construct a data-flow
network in which each node is a literal from a rule premise.

137. Define magic set.


Rewriting the rule set, using information from the goal, so that only relevant variable
bindings are considered; the set of such bindings is called the magic set.

138. Define backward chaining.


This algorithm works backward from the goal, chaining through rules to find known facts
that support the proof.

139. Define Prolog program.


It is a set of definite clauses written in a notation somewhat different from standard FOL.

140. What are the divisions of knowledge in the OTTER theorem prover?

i. Set of Support (SOS)
ii. Usable axioms
iii. Rewrites (or demodulators)
iv. A set of parameters and sentences

141. What are the 2 types of frame problem?
i. Representational frame problem
ii. Inferential frame problem

142. What are the 2 types of processes?
i. Discrete events: they have a definite structure.
ii. Liquid events: categories of events with process.

143. Define fluent calculus.


Discarding situation calculus and inventing a new formalism for writing axioms is
called fluent calculus.

144. What is important for an agent?


Time, i.e. intervals, is important for an agent to take an action.
There are 2 kinds:
i. Moments
ii. Extended intervals

145. Define runtime variables.


Plans to gather and use information are represented using a shorthand notation called
runtime variables.
Example:
[Lookup(Agent, "PhoneNumber(Divya)", n), Dial(n)]

MACHINE LEARNING FUNDAMENTALS

1. What is Machine learning?


Machine learning is a branch of computer science concerned with systems that
automatically learn and improve with experience. For example, robots are programmed
so that they can perform tasks based on the data they gather from sensors. Machine
learning automatically learns programs from data.

2. What Are the Different Types of Machine Learning?


There are three types of machine learning:

Supervised Learning

In supervised machine learning, a model makes predictions or decisions based on


past or labeled data. Labeled data refers to sets of data that are given tags or labels,
and thus made more meaningful.

Unsupervised Learning

In unsupervised learning, we don't have labeled data. A model can identify patterns,
anomalies, and relationships in the input data.

Reinforcement Learning

Using reinforcement learning, the model can learn based on the rewards it received for
its previous action.

Consider an environment where an agent is working. The agent is given a target to


achieve. Every time the agent takes some action toward the target, it is given positive
feedback. And, if the action taken is going away from the goal, the agent is given
negative feedback.

3. What are the five popular algorithms of Machine Learning?


• Decision Trees

• Neural Networks (back propagation)
• Probabilistic networks
• Nearest Neighbor
• Support vector machines
4. Name three different categories while creating a model?
• Training dataset
• Validation dataset
• Test dataset
5. What is Overfitting, and How Can You Avoid It?

Overfitting is a situation that occurs when a model learns the training set too well,
taking up random fluctuations in the training data as concepts. These impact the
model's ability to generalize and don't apply to new data.

When a model is given the training data, it shows close to 100 percent accuracy
(technically, a slight loss). But when we use the test data, there may be errors and low
efficiency. This condition is known as overfitting.
There are multiple ways of avoiding overfitting, such as:
• Regularization. It involves a cost term for the features involved with the
objective function
• Making a simple model. With fewer variables and parameters, the
variance can be reduced
• Cross-validation methods like k-folds can also be used
• If some model parameters are likely to cause overfitting, techniques for
regularization like LASSO can be used that penalize these parameters

Training Set:
• The training set is the set of examples given to the model to analyze and learn from
• 70% of the total data is typically taken as the training dataset
• This is labeled data used to train the model

Test Set:
• The test set is used to test the accuracy of the hypothesis generated by the model
• The remaining 30% is taken as the testing dataset
• We test without labeled data and then verify the results with the labels

6. What is ‘training Set’ and ‘test Set’ in a Machine Learning Model? How Much Data
Will You Allocate for Your Training, Validation, and Test Sets?

There is a three-step process followed to create a model:

1. Train the model

2. Test the model

3. Deploy the model

Consider a case where you have labeled data for 1,000 records. One way to train the
model is to expose all 1,000 records during the training process. Then you take a small
set of the same data to test the model, which would give good results in this case.

But, this is not an accurate way of testing. So, we set aside a portion of that data called
the ‘test set’ before starting the training process. The remaining data is called the
‘training set’ that we use for training the model. The training set passes through the
model multiple times until the accuracy is high, and errors are minimized.

67
Now, we pass the test data to check if the model can accurately predict the values and
determine if training is effective. If you get errors, you either need to change your
model or retrain it with more data.

Regarding the question of how to split the data into a training set and test set, there is
no fixed rule, and the ratio can vary based on individual preferences.
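The set-aside step described above can be sketched in plain Python (a minimal illustration; the 70/30 ratio is just the common default mentioned earlier, and the function name is my own):

```python
import random

def train_test_split(records, test_ratio=0.3, seed=42):
    """Shuffle the records, set aside a portion as the test set before
    training begins, and use the rest as the training set."""
    rng = random.Random(seed)           # fixed seed for a repeatable split
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

train, test = train_test_split(list(range(1000)))
print(len(train), len(test))   # 700 300
```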
7. How Do You Handle Missing or Corrupted Data in a Dataset?
One of the easiest ways to handle missing or corrupted data is to drop those rows or
columns or replace them entirely with some other value.
There are two useful methods in Pandas:
• isnull() and dropna() will help to find the columns/rows with missing
data and drop them
• fillna() will replace the wrong values with a placeholder value
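A minimal Pandas sketch of those two approaches (the column names and fill values here are illustrative, not from the source):

```python
import numpy as np
import pandas as pd

# A toy frame with one missing value in each column
df = pd.DataFrame({"age": [25, np.nan, 31],
                   "city": ["Pune", "Delhi", None]})

missing_per_column = df.isnull().sum()   # count missing cells per column
complete = df.dropna()                   # keep only fully populated rows
filled = df.fillna({"age": df["age"].mean(),  # placeholder: column mean
                    "city": "unknown"})       # placeholder: a constant
```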

8. How Can You Choose a Classifier Based on a Training Set Data Size?
When the training set is small, a model that has high bias and low variance seems to
work better, because it is less likely to overfit. For example, Naive Bayes works best
when the training set is small.

For large training sets, models with low bias and high variance tend to perform better,
as they can capture complex relationships.

9. Explain the Confusion Matrix with Respect to Machine Learning Algorithms.

A confusion matrix (or error matrix) is a specific table that is used to measure the
performance of an algorithm. It is mostly used in supervised learning; in unsupervised
learning, it's called the matching matrix. The confusion matrix has
two parameters:
• Actual
• Predicted
It also has identical sets of features in both of these dimensions. Consider a confusion
matrix (binary matrix) shown below, with rows for actual values and columns for
predicted values:

                Predicted: Yes   Predicted: No
Actual: Yes          12               1
Actual: No            3               9

Here,
For actual values:
Total Yes = 12 + 1 = 13
Total No = 3 + 9 = 12
Similarly, for predicted values:
Total Yes = 12 + 3 = 15
Total No = 1 + 9 = 10
For a model to be accurate, the values across the diagonal should be high. The total
sum of all the values in the matrix equals the total number of observations in the test data set.
For the above matrix, total observations = 12 + 3 + 1 + 9 = 25
Now, accuracy = sum of the values across the diagonal / total dataset
= (12 + 9) / 25
= 21 / 25
= 84%
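The worked calculation above can be expressed as a small function (an illustrative sketch; the function name is my own):

```python
def accuracy(confusion):
    """Accuracy = sum of the diagonal cells / sum of all cells."""
    total = sum(sum(row) for row in confusion)
    diagonal = sum(confusion[i][i] for i in range(len(confusion)))
    return diagonal / total

# rows = actual, columns = predicted, using the counts from the example
cm = [[12, 1],
      [3, 9]]
print(accuracy(cm))   # 0.84
```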
10. What Is a False Positive and False Negative and How Are They Significant?
False positives are those cases that wrongly get classified as True but are False.
False negatives are those cases that wrongly get classified as False but are True.
In the term ‘False Positive,’ the word ‘Positive’ refers to the ‘Yes’ row of the predicted
value in the confusion matrix. The complete term indicates that the system has
predicted it as a positive, but the actual value is negative.

So, looking at the confusion matrix, we get:


False-positive = 3

True positive = 12

Similarly, in the term ‘False Negative,’ the word


‘Negative’ refers to the ‘No’ row of the predicted value in the confusion matrix. And the
complete term indicates that the system has predicted it as negative, but the actual value
is positive.

So, looking at the confusion matrix, we get:


False Negative = 1
True Negative = 9

11. What Are the Three Stages of Building a Model in Machine Learning?
The three stages of building a machine learning model are:

• Model Building
Choose a suitable algorithm for the model and train it according to the
requirement
• Model Testing
Check the accuracy of the model through the test data
• Applying the Model
Make the required changes after testing and use the final model for real-
time projects
Here, it's important to remember that once in a while, the model needs to be checked to
make sure it's working correctly. It should be modified to make sure that it is up to date.
12. What is Deep Learning?

Deep learning is a subset of machine learning that involves systems that think and
learn like humans, using artificial neural networks. The term 'deep' comes from the fact
that you can have several layers of neural networks.
One of the primary differences between machine learning and deep learning is that
feature engineering is done manually in machine learning. In the case of deep learning,
the model consisting of neural networks will automatically determine which features to
use (and which not to use).
Machine Learning:
• Enables machines to take decisions on their own, based on past data
• Needs only a small amount of data for training
• Works well on low-end systems, so you don't need large machines
• Most features need to be identified in advance and manually coded
• The problem is divided into two parts, solved individually, and then combined

Deep Learning:
• Enables machines to take decisions with the help of artificial neural networks
• Needs a large amount of training data
• Needs high-end machines because it requires a lot of computing power
• The machine learns the features from the data it is provided
• The problem is solved in an end-to-end manner

13. Give a popular application of machine learning that you see on day to day basis?
The recommendation engine implemented by major ecommerce websites uses Machine
Learning.

14. What Are the Applications of Supervised Machine Learning in Modern Businesses?

Applications of supervised machine learning include:

• Email Spam Detection
Here we train the model using historical data that consists of emails
categorized as spam or not spam. This labeled information is fed as
input to the model.
• Healthcare Diagnosis
By providing images regarding a disease, a model can be trained to
detect if a person is suffering from the disease or not.
• Sentiment Analysis
This refers to the process of using algorithms to mine documents and
determine whether they’re positive, neutral, or negative in sentiment.
• Fraud Detection
By training the model to identify suspicious patterns, we can detect
instances of possible fraud.

15. What is Semi-supervised Machine Learning?


Supervised learning uses data that is completely labeled, whereas unsupervised
learning uses unlabeled training data. In the case of semi-supervised learning, the
training data contains a small amount of labeled data and a large amount of unlabeled data.

16. What Are Unsupervised Machine Learning Techniques?

There are two techniques used in unsupervised learning: clustering and association.

Clustering

Clustering problems involve data to be divided into subsets. These subsets, also called
clusters, contain data that are similar to each other. Different clusters reveal different
details about the objects, unlike classification or regression.

Association

In an association problem, we identify patterns of associations between different


variables or items.

For example, an e-commerce website can suggest other items for you to buy, based on
the prior purchases that you have made, spending habits, items in your wishlist, other
customers’ purchase habits, and so on.

17. What is the Difference Between Supervised and Unsupervised Machine Learning?

• Supervised learning - This model learns from the labeled data and makes
a future prediction as output
• Unsupervised learning - This model uses unlabeled input data and allows
the algorithm to act on that information without guidance.

18. What is the Difference Between Inductive Machine Learning and Deductive Machine
Learning?

Inductive Learning:
• It observes instances based on defined principles to draw a conclusion
• Example: Allow the child to play with fire. If he or she gets burned, they will learn
that it is dangerous and will refrain from making the same mistake again

Deductive Learning:
• It concludes from given experiences
• Example: Explaining to a child to keep away from the fire by showing a video
where fire causes damage

19. Compare K-means and KNN Algorithms.


K-means:
• K-means is unsupervised in nature
• K-means is a clustering algorithm
• The points in each cluster are similar to each other, and each cluster is different
from its neighboring clusters

KNN:
• KNN is supervised in nature
• KNN is a classification algorithm
• It classifies an unlabeled observation based on its K (can be any number)
surrounding neighbors

20. What Is ‘naive’ in the Naive Bayes Classifier?

The classifier is called ‘naive’ because it makes assumptions that may or may not turn
out to be correct.

The algorithm assumes that the presence of one feature of a class is not related to the
presence of any other feature (absolute independence of features), given the class
variable.

For instance, a fruit may be considered to be a cherry if it is red in color and round in
shape, regardless of other features. This assumption may or may not be right (as an
apple also matches the description).

21. Explain How a System Can Play a Game of Chess Using Reinforcement Learning.

Reinforcement learning has an environment and an agent. The agent performs some
actions to achieve a specific goal. Every time the agent performs a task that is taking it
towards the goal, it is rewarded. And, every time it takes a step that goes against that goal
or in the reverse direction, it is penalized.

Earlier, chess programs had to determine the best moves after much research on
numerous factors. Building a machine designed to play such games would require
many rules to be specified.

With reinforced learning, we don’t have to deal with this problem as the learning agent
learns by playing the game. It will make a move (decision), check if it’s the right move
(feedback), and keep the outcomes in memory for the next step it takes (learning).
There is a reward for every correct decision the system takes and punishment for the
wrong one.

22. How Will You Know Which Machine Learning Algorithm to Choose for Your
Classification Problem?
While there is no fixed rule to choose an algorithm for a classification problem, you can
follow these guidelines:
• If accuracy is a concern, test different algorithms and cross-validate them
• If the training dataset is small, use models that have low variance and
high bias
• If the training dataset is large, use models that have high variance and
little bias
23. How is Amazon Able to Recommend Other Things to Buy? How Does the
Recommendation Engine Work?

Once a user buys something from Amazon, Amazon stores that purchase data for future
reference and finds products that are most likely also to be bought. This is possible
because of the association algorithm, which can identify patterns in a given dataset.

24. When Will You Use Classification over Regression?

Classification is used when your target is categorical, while regression is used when
your target variable is continuous. Both classification and regression belong to the
category of supervised machine learning algorithms.

Examples of classification problems include:

• Predicting yes or no
• Estimating gender

• Breed of an animal

• Type of color
Examples of regression problems include:
• Estimating sales and price of a product
• Predicting the score of a team
• Predicting the amount of rainfall

25. How Do You Design an Email Spam Filter?


Building a spam filter involves the following process:

• The email spam filter will be fed with thousands of emails


• Each of these emails already has a label:
‘spam’ or ‘not spam.’

• The supervised machine learning algorithm will then determine which
type of emails are being marked as spam based on spam words
like the lottery, free offer, no money, full refund, etc.
• The next time an email is about to hit your inbox, the spam filter will use
statistical analysis and algorithms like Decision Trees and SVM to
determine how likely the email is spam
• If the likelihood is high, it will label it as spam, and the email won’t hit
your inbox
• Based on the accuracy of each model, we will use the algorithm with the
highest accuracy after testing all the models

26. What is a Random Forest?

A ‘random forest’ is a supervised machine learning algorithm that is generally used for
classification problems. It operates by constructing multiple decision trees during the
training phase. The random forest chooses the decision of the majority of the trees as
the final decision.

27. Considering a Long List of Machine Learning Algorithms, given a Data Set,
How Do You Decide Which One to Use?

There is no master algorithm for all situations. Choosing an algorithm depends on the
following questions:
• How much data do you have, and is it continuous or categorical?
• Is the problem related to classification, association, clustering, or
regression? Predefined variables (labeled), unlabeled, or mix?
• What is the goal?
Based on the above questions, the following algorithms can be used:

28. What is Bias and Variance in a Machine Learning Model?
Bias
Bias in a machine learning model occurs when the predicted values are further from the
actual values. Low bias indicates a model where the prediction values are very close to
the actual ones.
Underfitting: High bias can cause an algorithm to miss the relevant relations between
features and target outputs.
Variance
Variance refers to the amount the target model will change when trained with
different training data. For a good model, the variance should be minimized.
Overfitting: High variance can cause an algorithm to model the random noise in the
training data rather than the intended outputs.
29. What is the Trade-off Between Bias and Variance?
The bias-variance decomposition essentially decomposes the learning error from any
algorithm by adding the bias, variance, and a bit of irreducible error due to noise in the
underlying dataset.

Necessarily, if you make the model more complex and add more variables, you’ll lose
bias but gain variance. To get the optimally-reduced amount of error, you’ll have to
trade off bias and variance. Neither high bias nor high variance is desired.

High bias and low variance algorithms train models that are consistent, but inaccurate on
average.
High variance and low bias algorithms train models that are accurate but inconsistent.

30. Define Precision and Recall.

Precision

Precision is the ratio of the number of events you correctly recall to the total number of
events you recall (a mix of correct and wrong recalls).

Precision = (True Positive) / (True Positive + False Positive)

Recall

A recall is the ratio of the number of events you can correctly recall to the total number
of events.
Recall = (True Positive) / (True Positive + False Negative)
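The two formulas can be sketched directly in Python, using the counts from the earlier confusion-matrix example (an illustration; the function names are my own):

```python
def precision(tp, fp):
    """Fraction of predicted positives that are actually positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of actual positives that the model found."""
    return tp / (tp + fn)

print(precision(tp=12, fp=3))   # 0.8
print(recall(tp=12, fn=1))      # 12/13, about 0.923
```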
31. What is a Decision Tree Classification?
A decision tree builds classification (or regression) models as a tree structure, with
datasets broken up into ever-smaller subsets while developing the decision tree,
literally in a tree-like way with branches and nodes.
Decision trees can handle both categorical and numerical data.
32. What is Pruning in Decision Trees, and How Is It Done?
Pruning is a technique in machine learning that reduces the size of decision trees. It
reduces the complexity of the final classifier, and hence improves predictive accuracy
by the reduction of overfitting.

Pruning can occur in:


• Top-down fashion. It will traverse nodes and trim subtrees starting at the
root

• Bottom-up fashion. It will begin at the leaf nodes


There is a popular pruning algorithm called reduced error pruning, in which:

• Starting at the leaves, each node is replaced with its most popular class

• If the prediction accuracy is not affected, the change is kept

• There is an advantage of simplicity and speed

33. Briefly Explain Logistic Regression.

Logistic regression is a classification algorithm used to predict a binary outcome for a


given set of independent variables.
The output of logistic regression is either a 0 or 1 with a threshold value of generally
0.5. Any value above 0.5 is considered as 1, and any point below 0.5 is considered as 0.
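The thresholding step can be sketched as follows (an illustrative snippet; the raw score z would come from a fitted model, which is omitted here):

```python
import math

def sigmoid(z):
    """Squash a raw score into the (0, 1) range."""
    return 1 / (1 + math.exp(-z))

def predict(z, threshold=0.5):
    """Map the squashed score to a class: 1 above the threshold, else 0."""
    return 1 if sigmoid(z) > threshold else 0

print(predict(2.0))    # 1  (sigmoid(2.0) is about 0.88)
print(predict(-1.0))   # 0  (sigmoid(-1.0) is about 0.27)
```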

34. Explain the K Nearest Neighbor Algorithm.

K nearest neighbor algorithm is a classification algorithm that works in a way that a


new data point is assigned to a neighboring group to which it is most similar.

In K nearest neighbors, K can be an integer greater than 1. So, for every new data
point we want to classify, we compute to which neighboring group it is closest.
Let us classify an object using the following example. Consider there are three clusters:
• Football
• Basketball
• Tennis ball

Let the new data point to be classified be a black ball. We use KNN to classify it. Assume
K = 5 (initially).

Next, we find the K (five) nearest data points, as shown.

Observe that all five selected points do not belong to the same cluster. There are three
tennis balls and one each of basketball and football.
When multiple classes are involved, we prefer the majority. Here the majority is with
the tennis ball, so the new data point is assigned to this cluster.
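The majority-vote step can be sketched in plain Python (a toy example with made-up 2D points standing in for the three ball clusters):

```python
from collections import Counter

def knn_classify(train, query, k=5):
    """train: list of (point, label) pairs. Classify query by the majority
    label among its k nearest neighbours."""
    # squared Euclidean distance; ordering is the same as true Euclidean
    dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q))
    nearest = sorted(train, key=lambda pl: dist(pl[0], query))[:k]
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]

train = [((1, 1), "tennis"), ((1, 2), "tennis"), ((2, 1), "tennis"),
         ((8, 8), "football"), ((9, 9), "basketball")]
print(knn_classify(train, (1.5, 1.5), k=5))   # tennis (3 of 5 neighbours)
```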

35. What is a Recommendation System?
Anyone who has used Spotify or shopped at Amazon will recognize a recommendation
system: It’s an information filtering system that predicts what a user might want to hear
or see based on choice patterns provided by the user.

36. What is Kernel SVM?


Kernel SVM is the abbreviated version of the kernel support vector machine. Kernel
methods are a class of algorithms for pattern analysis, and the most common one is the
kernel SVM.

37. What Are Some Methods of Reducing Dimensionality?


You can reduce dimensionality by combining features with feature engineering,
removing collinear features, or using algorithmic dimensionality reduction.

38. What is Principal Component Analysis?

Principal Component Analysis (PCA) is a multivariate statistical technique that is used
for analyzing quantitative data. The objective of PCA is to reduce higher-dimensional
data to lower dimensions, remove noise, and extract crucial information such as
features and attributes from large amounts of data.
39. What do you understand by the F1 score?
The F1 score is a metric that combines both Precision and Recall. It is also the
weighted average of precision and recall. The F1 score can be calculated using the
below formula:

F1 = 2 * (P * R) / (P + R)
The F1 score is one when both Precision and Recall scores are one.
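The formula translates directly to code, and the property just stated can be checked (a small illustrative sketch):

```python
def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * (p * r) / (p + r)

print(f1_score(1.0, 1.0))        # 1.0 when both precision and recall are 1
print(f1_score(0.8, 12 / 13))    # about 0.857 for the earlier P/R values
```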

40. What do you understand by Type I vs Type II error?


Type I Error: Type I error occurs when the null hypothesis is true and we reject it.
Type II Error: Type II error occurs when the null hypothesis is false and we accept it.

41. Explain Correlation and Covariance?

Correlation: Correlation tells us how strongly two random variables are related to each
other. It takes values between -1 and +1.

Formula to calculate correlation:
Corr(X, Y) = Cov(X, Y) / (σX σY)

Covariance: Covariance tells us the direction of the linear relationship between two
random variables. It can take any value between -∞ and +∞.

Formula to calculate covariance:
Cov(X, Y) = Σ (Xi - X̄)(Yi - Ȳ) / n
42. What are Support Vectors in SVM?

Support Vectors are data points that are nearest to the hyperplane. It influences the
position and orientation of the hyperplane. Removing the support vectors will alter the
position of the hyperplane. The support vectors help us build our support vector
machine model.

43. What is Ensemble learning?


Ensemble learning is a combination of the results obtained from multiple
machine learning models to increase the accuracy for improved decision-
making.

Example: A Random Forest with 100 trees can provide much better results than using
just one decision tree.

44. What is Cross-Validation?

Cross-Validation in Machine Learning is a statistical resampling technique that uses


different parts of the dataset to train and test a machine learning algorithm on different
iterations. The aim of cross-validation is to test the model’s ability to predict a new set
of data that was not used to train the model. Cross-validation avoids the overfitting of
data.
K-Fold Cross Validation is the most popular resampling technique that divides the
whole dataset into K sets of equal sizes.
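The K-fold splitting idea can be sketched in plain Python (an illustration; folds here are taken by striding, whereas real libraries usually take contiguous blocks):

```python
def k_fold_splits(data, k=5):
    """Yield (train, validation) pairs; each fold serves as the
    held-out set exactly once."""
    folds = [data[i::k] for i in range(k)]       # k roughly equal folds
    for i in range(k):
        validation = folds[i]
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        yield train, validation

data = list(range(10))
for train, val in k_fold_splits(data, k=5):
    print(len(train), len(val))   # 8 2, five times
```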
45. What are the different methods to split a tree in a decision tree algorithm?
Variance: Splitting the nodes of a decision tree using the variance is done when the target
variable is continuous.

Information Gain: Splitting the nodes of a decision tree using information gain is
preferred when the target variable is categorical.

Gini Impurity: Splitting the nodes of a decision tree using Gini impurity is followed
when the target variable is categorical.
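As an illustration of the last criterion, Gini impurity for a node's labels is 1 minus the sum of squared class proportions (a small sketch, not from the source):

```python
def gini_impurity(labels):
    """1 - sum(p_i^2): 0 for a pure node, higher for mixed nodes."""
    n = len(labels)
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return 1 - sum((c / n) ** 2 for c in counts.values())

print(gini_impurity(["yes", "yes", "yes"]))        # 0.0 (pure node)
print(gini_impurity(["yes", "no", "yes", "no"]))   # 0.5 (maximally mixed)
```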

46. How does the Support Vector Machine algorithm handle self-learning?
The SVM algorithm has a learning rate and an expansion rate, which take care of
self-learning. The learning rate compensates or penalizes the hyperplanes for making
incorrect moves, while the expansion rate handles finding the maximum separation
area between different classes.
47. What are the assumptions you need to take before starting with linear regression?
There are primarily 5 assumptions for a Linear Regression model:

• Multivariate normality
• No auto-correlation
• Homoscedasticity
• Linear relationship
• No or little multicollinearity
48. What is the difference between Lasso and Ridge regression?

Lasso (also known as L1) and Ridge (also known as L2) regression are two popular
regularization techniques that are used to avoid overfitting of data. These methods are
used to penalize the coefficients to find the optimum solution and reduce complexity.
Lasso regression works by penalizing the sum of the absolute values of the
coefficients. In Ridge or L2 regression, the penalty function is determined by the sum of
the squares of the coefficients.

Neural Networks

1. Explain multilayer perceptron.


The Multilayer Perceptron (MLP) model features multiple layers that are interconnected
in such a way that they form a feed-forward neural network. Each neuron in one layer
has directed connections to the neurons of a separate layer. It consists of three types of
layers: the input layer, output layer and hidden layer.

2. What is vanishing gradient problem?


When back-propagation is used, the earlier layers will receive very small updates
compared to the later layers. This problem is referred to as the vanishing gradient
problem. The vanishing gradient problem is essentially a situation in which a deep
multilayer feed-forward network or a recurrent neural network (RNN) does not have the
ability to propagate useful gradient information from the output end of the model back
to the layers near the input end of the model.

3. Explain the advantages of deep learning.

Advantages of deep learning:

• No need for feature engineering
• Deep learning solves the problem on an end-to-end basis
• Deep learning gives more accuracy
4. Explain back propagation.

Backpropagation is a training method used for a multilayer neural network. It is also


called the generalized delta rule. It is a gradient descent method which minimizes the
total squared error of the output computed by the net.

5. What are hyperparameters?

Hyperparameters are parameters whose values control the learning process and
determine the values of model parameters that a learning algorithm ends up learning.

6. Define ReLU.

The Rectified Linear Unit (ReLU) helps solve the vanishing gradient problem. ReLU is a
piecewise linear (and therefore nonlinear) function that outputs the input directly if it is
positive; otherwise, it outputs zero.
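The definition above is a one-liner in NumPy; a minimal sketch:

```python
import numpy as np

def relu(x):
    # Passes positive values through unchanged; clips negatives to zero.
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, 0.0, 3.0])))  # [0. 0. 3.]
```

Because the gradient is exactly 1 for positive inputs, the chain of derivatives in backpropagation does not shrink the way it does with sigmoid-style activations.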

7. What is the vanishing gradient problem?

The vanishing gradient problem arises when training neural networks with
gradient-based methods such as backpropagation. It makes it difficult to learn and tune
the parameters of the earlier layers in the network.

8. Define normalization.

Normalization is a data pre-processing tool used to bring the numerical data to a common
scale without distorting its shape.
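One common form is min-max scaling, shown here as a minimal NumPy sketch with illustrative data:

```python
import numpy as np

def min_max_normalize(x):
    # Rescale values to [0, 1]; relative spacing (the "shape") is preserved.
    return (x - x.min()) / (x.max() - x.min())

data = np.array([10.0, 20.0, 30.0, 40.0])
scaled = min_max_normalize(data)
print(scaled)  # values from 0 to 1, equally spaced like the input
```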

9. What is batch normalization?

It is a method of adaptive reparameterization, motivated by the difficulty of training very


deep models. In Deep networks, the weights are updated for each layer. So the output
will no longer be on the same scale as the input.
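The core computation can be sketched in NumPy. This is a simplified illustration of the training-time formula (per-feature statistics over the batch, then a learnable scale and shift), not a full framework implementation:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each feature over the batch to zero mean / unit variance,
    # then apply a learnable scale (gamma) and shift (beta).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

batch = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])  # 3 samples, 2 features
out = batch_norm(batch)
```

After the transform, every layer's output is back on a comparable scale regardless of what the layer's weights did to the input.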

10. Explain the advantages of the ReLU function.


Advantages of the ReLU function:

a) ReLU is simple to compute and has a predictable gradient for the backpropagation of
the error.

b) Easy to implement and very fast.

c) It can be used for deep network training.

11. Explain Ridge regression.


Ridge regression, also known as L2 regularization, is a technique of regularization to
avoid the overfitting in training data set, which introduces a small bias in the training
model, through which one can get long term predictions for that input.

12. Explain dropout.


Dropout was introduced by "Hinton et al" and this method is now very popular. It consists
of setting to zero the output of each hidden neuron in a chosen layer with some probability,
and is proven to be very effective in reducing overfitting.

13. Explain disadvantages of deep learning

Disadvantages of deep learning

• DL needs high-performance hardware.


• DL needs much more time to train.
• It is very difficult to assess its performance in real-world applications.
• It is very hard to understand.
14. Explain the need for hidden layers.
1. A network with only two layers (input and output) can only represent the
input with whatever representation already exists in the input data.
2. If the data is discontinuous or non-linearly separable, the innate
representation is inconsistent, and the mapping cannot be learned using two layers (Input
and Output).
3. Therefore, hidden layer(s) are used between input and output layers.
15. Explain activation functions.
Activation functions, also known as transfer functions, are used to map input nodes to
output nodes in a certain fashion. They help normalize the output to a range such as 0 to 1
or -1 to 1. The activation function is a key factor in a neural network: it decides
whether or not a neuron will be activated and its output transferred to the next layer.
16. What is Deep Learning?
If you are going for a deep learning interview, you definitely know what exactly deep
learning is. However, with this question the interviewee expects you to give an in-detail
answer, with an example. Deep Learning involves taking large volumes of structured or
unstructured data and using complex algorithms to train neural networks. It performs
complex operations to extract hidden patterns and features (for instance, distinguishing
the image of a cat from that of a dog).
17.What is a Neural Network?
Neural Networks replicate the way humans learn, inspired by how the neurons in our
brains fire, only much simpler.

The most common Neural Networks consist of three network layers:


1. An input layer
2. A hidden layer (this is the most important layer where feature extraction takes
place, and adjustments are made to train faster and function better)
3. An output layer
Each layer contains neurons called “nodes,” performing various operations. Neural
Networks are used in deep learning algorithms like CNN, RNN, GAN, etc.

18. What Is a Multi-layer Perceptron (MLP)?
As in Neural Networks, MLPs have an input layer, a hidden layer, and an output layer. It
has the same structure as a single-layer perceptron with one or more hidden layers. A
single-layer perceptron can classify only linearly separable classes with binary output
(0, 1), but an MLP can classify nonlinear classes.

Except for the input layer, each node in the other layers uses a nonlinear activation
function: the inputs arriving at a node are weighted, summed together, and passed through
the activation function to produce the node's output. MLP uses
a supervised learning method called “backpropagation.” In backpropagation, the neural
network calculates the error with the help of cost function. It propagates this error
backward from where it came (adjusts the weights to train the model more accurately).

19. What Is Data Normalization, and Why Do We Need It?


The process of standardizing and reforming data is called “Data Normalization.” It’s a
pre-processing step to eliminate data redundancy. Often, data comes in, and you get the
same information in different formats. In these cases, you should rescale values to fit into
a particular range, achieving better convergence.

20. What is the Boltzmann Machine?


One of the most basic Deep Learning models is a Boltzmann Machine, resembling a
simplified version of the Multi-Layer Perceptron. This model features a visible input
layer and a hidden layer -- just a two-layer neural net that makes stochastic decisions as
to whether a neuron should be on or off. Nodes are connected across layers, but no two
nodes of the same layer are connected.

21. What Is the Role of Activation Functions in a Neural Network?


At the most basic level, an activation function decides whether a neuron should be fired
or not. It accepts the weighted sum of the inputs and bias as input to any activation
function. Step function, Sigmoid, ReLU, Tanh, and Softmax are examples of activation
functions.

22. What Is the Cost Function?


Also referred to as “loss” or “error,” cost function is a measure to evaluate how good
your model’s performance is. It’s used to compute the error of the output layer during
backpropagation. We push that error backward through the neural network and use that
during the different training functions.

23. What Is Gradient Descent?
Gradient Descent is an optimization algorithm used to minimize the cost function or
error. The aim is to find the local or global minimum of the function. The gradient
determines the direction the model should take to reduce the error.
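The idea fits in a few lines; a minimal sketch minimizing the illustrative function f(w) = (w - 3)^2:

```python
# Minimize f(w) = (w - 3)^2 by following the negative gradient.
# The gradient f'(w) = 2(w - 3) points uphill, so we step the other way.
w = 0.0
learning_rate = 0.1
for _ in range(100):
    gradient = 2 * (w - 3)
    w -= learning_rate * gradient

print(round(w, 6))  # 3.0, the minimum of the function
```

Each step moves `w` a fraction (the learning rate) of the gradient toward the minimum; the same loop, with gradients computed by backpropagation, trains a neural network.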

22. What Do You Understand by Backpropagation?


This is one of the most frequently asked deep learning interview questions.
Backpropagation is a technique to improve the performance of the network. It
backpropagates the error and updates the weights to reduce the error.
24. What Is the Difference Between a Feedforward Neural Network and Recurrent Neural
Network?
In this deep learning interview question, the interviewee expects you to give a detailed
answer.

A Feedforward Neural Network signals travel in one direction from input to output. There
are no feedback loops; the network considers only the current input. It cannot memorize
previous inputs (e.g., CNN).

A Recurrent Neural Network’s signals travel in both directions, creating a looped


network. It considers the current input with the previously received inputs for generating
the output of a layer and can memorize past data due to its internal memory.

25. What Are the Applications of a Recurrent Neural Network (RNN)?
The RNN can be used for sentiment analysis, text mining, and image captioning.
Recurrent Neural Networks can also address time series problems such as predicting the
prices of stocks in a month or quarter.

26. What Are the Softmax and ReLU Functions?


Softmax is an activation function that generates the output between zero and one. It
divides each output, such that the total sum of the outputs is equal to one. Softmax is
often used for output layers.

ReLU (or Rectified Linear Unit) is the most widely used activation function. It gives an
output of X if X is positive and zeros otherwise. ReLU is often used for hidden layers.
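Softmax is also a short NumPy function; a minimal sketch with illustrative logits:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs)  # three probabilities that sum to one; index 0 is largest
```

Because the outputs are non-negative and sum to one, they can be read as class probabilities, which is why softmax suits output layers.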

27. What Are Hyperparameters?


This is another frequently asked deep learning interview question. With neural networks,
you’re usually working with hyperparameters once the data is formatted correctly. A
hyperparameter is a parameter whose value is set before the learning process begins. It
determines how a network is trained and the structure of the network (such as the number
of hidden units, the learning rate, epochs, etc.).

28. What Will Happen If the Learning Rate Is Set Too Low or Too High?
When your learning rate is too low, training of the model will progress very slowly as we
are making minimal updates to the weights. It will take many updates before reaching
the minimum point.

If the learning rate is set too high, this causes undesirable divergent behavior in the loss
function due to drastic updates in weights. The model may fail to converge (never settle
at a point that gives good output) or even diverge (the updates are too chaotic for the
network to train).

29. What Is Dropout and Batch Normalization?


Dropout is a technique of dropping out hidden and visible units of a network randomly
to prevent overfitting of data (typically dropping 20 percent of the nodes). It doubles the
number of iterations needed to converge the network.

Batch normalization is the technique to improve the performance and stability of neural
networks by normalizing the inputs in every layer so that they have mean output
activation of zero and standard deviation of one.
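Dropout can be sketched with a random mask. This illustrates the common "inverted dropout" variant, in which survivors are rescaled at training time so no adjustment is needed at inference:

```python
import numpy as np

def dropout(activations, rate=0.2, seed=0):
    # Randomly zero out `rate` of the units; scale survivors by 1/(1-rate)
    # ("inverted dropout") so the expected activation is unchanged.
    rng = np.random.default_rng(seed)
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

h = np.ones(1000)  # stand-in hidden-layer activations
dropped = dropout(h, rate=0.2)
```

Roughly 20 percent of the 1000 units come out as zero, yet the mean activation stays near 1 because of the rescaling.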

30. What Is the Difference Between Batch Gradient Descent and Stochastic Gradient
Descent?
Batch Gradient Descent:

• Computes the gradient using the entire dataset.
• Takes time to converge because the volume of data is huge and the weights update
slowly.

Stochastic Gradient Descent:

• Computes the gradient using a single sample.
• Converges much faster than batch gradient descent because it updates the weights
more frequently.

31. What is Overfitting and Underfitting, and How to Combat Them?


Overfitting occurs when the model learns the details and noise in the training data to the
degree that it adversely impacts the execution of the model on new information. It is
more likely to occur with nonlinear models that have more flexibility when learning a
target function. An example would be if a model is looking at cars and trucks, but only
recognizes trucks that have a specific box shape. It might not be able to notice a flatbed
truck because there's only a particular kind of truck it saw in training. The model
performs well on training data, but not in the real world.

Underfitting refers to a model that is neither well-trained on the data nor able to
generalize to new information. This usually happens when there is too little or incorrect
data to train the model. Underfitting results in both poor performance and accuracy.

32. How Are Weights Initialized in a Network?

There are two methods here: we can either initialize the weights to zero or assign them
randomly.

Initializing all weights to 0: This makes your model similar to a linear model. All the
neurons and every layer perform the same operation, giving the same output and making
the deep net useless.

Initializing all weights randomly: Here, the weights are assigned randomly by initializing
them very close to 0. It gives better accuracy to the model since every neuron performs
different computations. This is the most commonly used method.

33. What Are the Different Layers on CNN?

There are four layers in CNN:
1. Convolutional Layer - the layer that performs a convolutional operation,
creating several smaller picture windows to go over the data.
2. ReLU Layer - it brings non-linearity to the network and converts all the
negative pixels to zero. The output is a rectified feature map.
3. Pooling Layer - pooling is a down-sampling operation that reduces the
dimensionality of the feature map.
4. Fully Connected Layer - this layer flattens the pooled feature maps and
combines the extracted features to produce the final classification output.
34. What is Pooling on CNN, and How Does It Work?
Pooling is used to reduce the spatial dimensions of a CNN. It performs down-sampling
operations to reduce the dimensionality and creates a pooled feature map by sliding a
filter over each feature map.

35. What Are Vanishing and Exploding Gradients?


While training an RNN, your slope can become either too small or too large; this makes
the training difficult. When the slope is too small, the problem is known as a
“Vanishing Gradient.” When the slope tends to grow exponentially instead of decaying,
it’s referred to as an “Exploding Gradient.” Gradient problems lead to long training
times, poor performance, and low accuracy
36. What Is the Difference Between Epoch, Batch, and Iteration in Deep Learning?
• Epoch - Represents one iteration over the entire dataset (everything put into
the training model).
• Batch - Refers to when we cannot pass the entire dataset into the neural
network at once, so we divide the dataset into several batches.
• Iteration - if we have 10,000 images as data and a batch size of 200, then an
epoch should run 50 iterations (10,000 divided by 200).
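The relationship between the three terms is simple arithmetic; a minimal sketch with the example numbers above:

```python
# One epoch covers the whole dataset once; the number of iterations per
# epoch is the dataset size divided by the batch size.
dataset_size = 10_000
batch_size = 200

iterations_per_epoch = dataset_size // batch_size
print(iterations_per_epoch)  # 50
```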
37. Why is Tensorflow the Most Preferred Library in Deep Learning?
Tensorflow provides both C++ and Python APIs, making it easier to work on and has a
faster compilation time compared to other Deep Learning libraries like Keras and Torch.
Tensorflow supports both CPU and GPU computing devices.

38. What Do You Mean by Tensor in Tensorflow?


This is another most frequently asked deep learning interview question. A tensor is a
mathematical object represented as arrays of higher dimensions. These arrays of data
with different dimensions and ranks fed as input to the neural network are called
“Tensors.”

39. Explain a Computational Graph.

Everything in TensorFlow is based on creating a computational graph. It is a network of
nodes, where nodes represent mathematical operations and edges represent tensors. Since
data flows in the form of a graph, it is also called a “DataFlow Graph.”
40. What Is an Auto-encoder?
This Neural Network has three layers in which the input neurons are equal to the output
neurons. The network's target output is the same as the input. It uses dimensionality
reduction to restructure the input. It works by compressing the image input to a latent
space representation then reconstructing the output from this representation.

41. What Is Bagging and Boosting?


Bagging and Boosting are ensemble techniques to train multiple models using the same
learning algorithm and then taking a call.With Bagging, we take a dataset and split it into
training data and test data. Then we randomly select data to place into the bags and train
the model separately.With Boosting, the emphasis is on selecting data points which give
wrong output to improve the accuracy.

42. What is the significance of using the Fourier transform in Deep Learning tasks?

The Fourier transform function efficiently analyzes, maintains, and manages large
datasets. You can use it to generate real-time array data that is helpful for processing
multiple signals.

43. What do you understand by transfer learning? Name a few commonly used transfer
learning models.
Transfer learning is the process of transferring the learning from a model to another
model without having to train it from scratch. It takes critical parts of a pre-trained model
and applies them to solve new but similar machine learning problems.

Some of the popular transfer learning models are:


• VGG-16
• BERT
• GPT-3
• Inception V3
• Xception
44. What is the difference between SAME and VALID padding in Tensorflow?
Using the Tensorflow library, tf.nn.max_pool performs the max-pooling operation.
tf.nn.max_pool has a padding argument that takes two values: SAME or VALID.
Padding == “SAME” ensures that the filter is applied to all the elements of the input.

The input image gets fully covered by the filter and specified stride. The padding type is
named SAME as the output size is the same as the input size (when stride=1).

Padding == “VALID” implies there is no padding in the input image. The filter
window always stays inside the input image. It assumes that all the dimensions are valid
so that the input image gets fully covered by a filter and the stride defined by you.
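The resulting output sizes follow the standard formulas; this is an illustrative sketch of the arithmetic, not TensorFlow's actual implementation:

```python
import math

def conv_output_size(n, f, stride, padding):
    # n: input size, f: filter size, stride: step size.
    if padding == "SAME":
        return math.ceil(n / stride)            # output == input when stride is 1
    if padding == "VALID":
        return math.ceil((n - f + 1) / stride)  # filter stays inside the input
    raise ValueError(f"unknown padding: {padding}")

print(conv_output_size(28, 3, 1, "SAME"))   # 28
print(conv_output_size(28, 3, 1, "VALID"))  # 26
```

With stride 1, SAME keeps a 28-wide input at 28 while VALID shrinks it to 26, matching the descriptions above.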

45. What are some of the uses of Autoencoders in Deep Learning?


• Autoencoders are used to convert black and white images into colored
images.
• Autoencoder helps to extract features and hidden patterns in the data.
• It is also used to reduce the dimensionality of data.
• It can also be used to remove noises from images.
46. What is the Swish Function?
Swish is an activation function proposed by Google which is an alternative to the ReLU
activation function.

It is represented as: f(x) = x * sigmoid(x).

The Swish function works better than ReLU for a variety of deeper models.

The derivative of Swish can be written as: y’ = y + sigmoid(x) * (1 - y)
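The function and the stated derivative can be checked numerically; a small sketch comparing the formula against a finite difference at an arbitrary point:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def swish(x):
    return x * sigmoid(x)

# Verify y' = y + sigmoid(x) * (1 - y) against a central finite difference.
x = 1.5
y = swish(x)
analytic = y + sigmoid(x) * (1 - y)
numeric = (swish(x + 1e-6) - swish(x - 1e-6)) / 2e-6
print(abs(analytic - numeric) < 1e-5)  # True
```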

47. What are the reasons for mini-batch gradient being so useful?
• Mini-batch gradient is highly efficient compared to stochastic gradient
descent.
• It lets you attain generalization by finding the flat minima.
• Mini-batch gradient helps avoid local minima to allow gradient
approximation for the whole dataset.
48. What do you understand by Leaky ReLU activation function?
Leaky ReLU is an advanced version of the ReLU activation function. In general, the
ReLU function defines the gradient to be 0 when all the values of inputs are less than
zero. This deactivates the neurons. To overcome this problem, Leaky ReLU activation
functions are used. It has a very small slope for negative values instead of a flat slope.
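The small negative slope is a one-line change relative to plain ReLU; a minimal NumPy sketch:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # A small slope `alpha` for negative inputs keeps the gradient nonzero,
    # so neurons with negative pre-activations can still learn.
    return np.where(x > 0, x, alpha * x)

out = leaky_relu(np.array([-10.0, 5.0]))
print(out)  # -0.1 and 5.0
```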
49. What is Data Augmentation in Deep Learning?
Data Augmentation is the process of creating new data by enhancing the size and quality
of training datasets to ensure better models can be built using them. There are different
techniques to augment data such as numerical data augmentation, image augmentation,
GAN-based augmentation, and text augmentation.

50. Explain the Adam optimization algorithm.


Adaptive Moment Estimation or Adam optimization is an extension to the stochastic
gradient descent. This algorithm is useful when working with complex problems
involving vast amounts of data or parameters. It needs less memory and is efficient.

Adam optimization algorithm is a combination of two gradient descent methodologies -

Momentum and Root Mean Square Propagation.
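A single Adam update can be sketched as follows; this is a simplified NumPy illustration of the standard update formulas (with the usual default decay rates), applied to the toy problem of minimizing w^2:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad       # momentum: first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2  # RMSProp-style second moment
    m_hat = m / (1 - beta1 ** t)             # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize f(w) = w^2 (gradient 2w), starting from w = 1.
w, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * w, m, v, t)
```

The first moment plays the role of momentum and the second moment rescales each step, which is the combination of the two methodologies named above.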


51.Why is a convolutional neural network preferred over a dense neural network
for an image classification task?
• The number of parameters in a convolutional neural network is much smaller
than that of a dense neural network. Hence, a CNN is less likely
to overfit.

• CNN allows you to look at the weights of a filter and visualize what the
network learned. So, this gives a better understanding of the model. CNN
trains models in a hierarchical way, i.e., it learns the patterns by explaining
complex patterns using simpler ones.
52. Which strategy does not prevent a model from over-fitting to the training data?
1. Dropout

2. Pooling
3. Data augmentation
4. Early stopping
Answer: 2) Pooling - it is a layer in CNN that performs a downsampling operation.

53. Explain two ways to deal with the vanishing gradient problem in a deep neural
network.
• Use the ReLU activation function instead of the sigmoid function
• Initialize neural networks using Xavier initialization that works with tanh
activation.
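Xavier (Glorot) initialization can be sketched directly; the uniform limit sqrt(6 / (fan_in + fan_out)) below is the standard Glorot-uniform variant, with illustrative layer sizes:

```python
import numpy as np

def xavier_init(fan_in, fan_out, seed=0):
    # Glorot-uniform: draw weights from U(-limit, limit) so the variance of
    # activations stays roughly constant from layer to layer.
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    rng = np.random.default_rng(seed)
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

W = xavier_init(256, 128)  # weights for a 256 -> 128 layer
```

Scaling the range by the fan-in and fan-out keeps early-layer signals from either exploding or vanishing, which complements the ReLU fix above.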
54. Why is a deep neural network better than a shallow neural network?
Both deep and shallow neural networks can approximate the values of a function. But
the deep neural network is more efficient as it learns something new in every layer. A
shallow neural network has only one hidden layer. But a deep neural network has several
hidden layers that create a deeper representation and computation capability.

55. What is the need to add randomness in the weight initialization process?
If you set the weights to zero, then every neuron at each layer will produce the same
result and the same gradient value during backpropagation. So, the neural network won’t
be able to learn the function as there is no asymmetry between the neurons. Hence,
randomness to the weight initialization process is crucial.

56. How can you train hyperparameters in a neural network?


Hyperparameters in a neural network can be trained using four components:
• Batch size: Indicates the size of the input data.
• Epochs: Denotes the number of times the training data is visible to the neural
network to train.
• Momentum: Used to get an idea of the next steps that occur with the data
being executed.
• Learning rate: Represents the time required for the network to update the
parameters and learn.
57. How Does an LSTM Network Work?
Long-Short-Term Memory (LSTM) is a special kind of recurrent neural network capable
of learning long-term dependencies, remembering information for long periods as its
default behavior.
