CS Preliminaries: ECS289A

The document provides an overview of key concepts in computer science and algorithms, including:
• Problems are solved through computational solutions called algorithms, which are implemented as programs. Data is stored and accessed through databases and analyzed for hypothesis generation and testing.
• Algorithms are analyzed for correctness and for efficiency in time and space, usually expressed in big-O notation. Many important problems are NP-complete, meaning they likely cannot be solved efficiently.
• Popular algorithmic techniques include sorting, graph algorithms, dynamic programming, and heuristics like simulated annealing. Data structures like arrays, linked lists, trees and hash tables are important tools for solving problems efficiently.

Uploaded by Harish Hari

CS Preliminaries

ECS289A
Computer Science
• Computational solutions to problems: algorithms
• Programming the solutions: programs
• Data storage and access: databases
• Data analysis: for hypothesis generation and testing
• Human-computer interfaces: interaction with data
• Building systems: hardware and software
• Education

What is a solution to a problem? An algorithm
• A procedure designed to perform a certain task or solve a particular problem
• Algorithms are recipes: ordered lists of steps to follow in order to complete a task
• The abstract idea behind a particular implementation in a computer program

1. Algorithms in Bioinformatics
Theoretical Computer Scientists are
contributors to the genomic revolution

• Sequence comparison
• Genome Assembly
• Phylogenetic Trees
• Microarray design (sequencing by hybridization, SBH)
• Data Integration
• Gene network inference

Algorithm Design
• Recognize the structure of a given problem:
– Where does it come from?
– What does it remind you of?
– How does it relate to established problems?
• Build on existing, efficient data structures
and algorithms to solve the problem
• If the problem is difficult to solve efficiently, use approximation algorithms

Problems and Solutions
In algorithmic lingo:
• Problems are precisely specified, general mathematical tasks that take variables as input and yield variables as output.
• Particularizations (assigning values to the
variables) are called instances.
• Problem: Multiply(a,b): Given integers a
and b, compute their product a*b.
• Instance: Multiply (13, 243).

Algorithms produce solutions for any given instance of a general problem
Multiply(a,b):
0) Let Product = 0
1) Take the k-th rightmost digit of b (starting with k = 1) and multiply a by it. Attach k-1 zeros to the right of the result, and add it to Product.
2) Repeat Step 1 for k = 2, 3, … until all digits of b have been used.
3) Now Product = a*b

Multiply(13, 243) = 3159
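The steps above can be sketched in Python as follows. This is a minimal illustration, assuming non-negative integers; the function name `multiply` and the use of the decimal string of b to read off its digits are choices made here, not part of the slides.

```python
def multiply(a: int, b: int) -> int:
    """Grade-school multiplication of non-negative integers a and b."""
    product = 0                          # Step 0
    digits = str(b)[::-1]                # digits of b, rightmost first
    for k, ch in enumerate(digits, start=1):
        partial = a * int(ch)            # multiply a by the k-th rightmost digit
        partial *= 10 ** (k - 1)         # attach k-1 zeros to the right
        product += partial               # add to Product
    return product                       # Step 3: Product = a*b


print(multiply(13, 243))  # 3159
```

For the instance Multiply(13, 243) the partial products are 39, 520 and 2600, which sum to 3159.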

Algorithm Analysis
• Correctness
– Exact solutions require a proof of correctness
– Heuristics: approximate solutions
• Resource Efficiency (complexity)
– Time: number of steps to follow to obtain a
solution as a function of the input size
– Space: amount of memory required for the
algorithm execution
• Best, Average, and Worst Case Analysis

Time / Space Complexity
• Input size: how many units of
constant size does it take to represent
the input? This is dependent on the
computational model, but can be
thought of as the storage size of the
input. The input size is usually n.
• Running time: f(n) = const., n, log n, Poly(n), e^n

Big Oh Notation
• Asymptotic upper bound on the number of
steps an algorithm takes (in the worst case)

• f(n) = O(g(n)) iff there are constants c > 0 and n0 such that for all n >= n0, 0 <= f(n) <= c*g(n)
• More intuitively: for large n, f(n) is at most a constant multiple of g(n), i.e. an algorithm with time complexity (t.c.) f(n) will, up to a constant factor, not take more time than one with t.c. g(n)

Big Oh, examples
• Const. = O(1)
• 3n = O(n)
• 3n = O(n^2)
• log n = O(n)
• Poly(n) = O(e^n)
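As a check against the definition, three of these examples can be verified by exhibiting a constant c and a threshold n_0 from which the bound holds. The constants below are one possible choice made for illustration, not values taken from the slides.

```latex
\begin{align*}
3n &\le 3 \cdot n     && \text{for all } n \ge 1 \quad (c = 3,\ n_0 = 1) \;\Rightarrow\; 3n = O(n) \\
3n &\le 3 \cdot n^2   && \text{for all } n \ge 1 \quad (c = 3,\ n_0 = 1) \;\Rightarrow\; 3n = O(n^2) \\
\log n &\le 1 \cdot n && \text{for all } n \ge 1 \quad (c = 1,\ n_0 = 1) \;\Rightarrow\; \log n = O(n)
\end{align*}
```

The second line shows that big-O is only an upper bound: it does not have to be tight.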

• An O(n) time algorithm is called linear
• O(Poly(n)) is polynomial
• O(e^n) is exponential

Basic Complexity Theory
• Classification of Problems based on the
time/space complexity of their solutions

• Class P: problems with polynomial time algorithms, t.c. = O(Poly(n))
• Class NP (non-deterministic polynomial): problems for which a proposed solution to an instance can be verified in Poly(n) time
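To make "verified in Poly(n) time" concrete, here is a hedged sketch of a verifier for the Vertex Cover decision problem (does a graph have a set of at most k vertices that touches every edge?). The function name `verify_vertex_cover` and the edge-list input format are assumptions made for illustration. Checking a proposed cover takes time linear in the number of edges, even though no polynomial time method for finding one is known.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def verify_vertex_cover(edges: Iterable[Edge], cover: Set[str], k: int) -> bool:
    """Check a proposed solution (certificate) for Vertex Cover."""
    if len(cover) > k:
        return False                     # certificate uses too many vertices
    # Every edge must have at least one endpoint in the cover.
    return all(u in cover or v in cover for u, v in edges)


edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(verify_vertex_cover(edges, {"b", "c"}, 2))  # every edge touches b or c
print(verify_vertex_cover(edges, {"a", "d"}, 2))  # edge (b, c) is left uncovered
```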

Complexity, contd.
• NP-complete problems: a polynomial algorithm for one of them would mean all problems in NP are polynomial time
• But NO polynomial time algorithms for NP-complete problems are known
• P ≠ NP? Still unsolved, although strongly suspected to be true.
• NP-complete problems: 3-SAT, Hamiltonian Cycle, Vertex Cover, Maximum Clique, etc. Thousands of NP-complete problems are known
• Compendium:
http://www.nada.kth.se/~viggo/problemlist/compendium.html

Why All That?
• Many important problems in the real world
tend to be NP-complete
• That means exact solutions are intractable, except for very small instances
• Proving a problem to be NP-complete is
just a first step: a good algorist would use
good and efficient heuristics

Popular Algorithms
• Sorting
• String Matching
• Graph Algorithms
– Graph representation: linked (adjacency) lists, adjacency matrix
– Graph Traversal (Depth First and Breadth First)
– Minimum Spanning Trees
– Shortest Paths
• Linear Programming

Algorithmic Techniques
• Combinatorial Optimization Problems
– Find min (max) of a given function under given
constraints
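• Greedy: make the locally best choice at each step
• Dynamic Programming: finds the best global solution, if the problem has a suitable structure (optimal solutions built from optimal solutions to subproblems)
• Simulated Annealing: a good general-purpose heuristic when not much is known about the problem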

Data Structures
• Once a given problem is digested,
algorithm design becomes an engineering
discipline: having a big toolbox and
matching the tools to the task at hand
• A major part of the toolbox are data
structures:
Data representations allowing efficient
performance of basic operations

Basic Operations
• Store/Search:
– Search(x)
– Delete(x)
– Insert(x)
• Priority:
– FindMIN
– FindMAX
• Set:
– UnionSet
– FindElement

Basic Data Structures
• Static: arrays and matrices
– Array of n elements: a[i], 1 <= i <= n (shown for n = 5)
   1     2     3     4     5
  a[1]  a[2]  a[3]  a[4]  a[5]
– Matrix of n*n elements: m[i][j], 1 <= i, j <= n (first three rows shown for n = 4)
        1        2        3        4
  1  m[1][1]  m[1][2]  m[1][3]  m[1][4]
  2  m[2][1]  m[2][2]  m[2][3]  m[2][4]
  3  m[3][1]  m[3][2]  m[3][3]  m[3][4]
• Basic operations are O(1)

Dynamic Data Structures: linked lists,
trees and balanced trees, hash tables
• No static memory allocation: items are added/deleted on the go
• Linked Lists (basic operations are O(n)):
a -> b -> c -> NIL
• Trees
– Balanced tree: height is O(log n), so basic operations are O(log n)

Hash Tables
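[Figure: the hash function f(key) maps keys to buckets; each bucket holds a linked list of entries, e.g. a -> b -> c -> NIL, d -> e -> f -> NIL, g -> h -> i -> NIL]

A good hash function f(key) yields constant expected search time, O(1).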
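A hash table whose buckets hold chains of entries can be sketched as below. This is an illustrative sketch only: the class name `ChainedHashTable`, the `num_buckets` parameter, and the use of Python's built-in `hash` as f(key) are assumptions, and a Python list stands in for each linked list.

```python
class ChainedHashTable:
    """Hash table with separate chaining (one chain of entries per bucket)."""

    def __init__(self, num_buckets: int = 8) -> None:
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key) -> int:
        # f(key): built-in hash reduced to a bucket index
        return hash(key) % len(self.buckets)

    def insert(self, key, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)     # key already present: overwrite
                return
        bucket.append((key, value))

    def search(self, key):
        # Only the chain in one bucket is scanned, not the whole table.
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None

    def delete(self, key) -> bool:
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False


table = ChainedHashTable()
for key in "abcdefghi":
    table.insert(key, key.upper())
print(table.search("e"))  # E
print(table.search("z"))  # None
```

Search cost is proportional to the length of one chain, so the O(1) behaviour depends on the hash function spreading keys evenly over the buckets.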

Set Data Structures
• Given sets A={1,2,3,4} and B={1,3}
• Operations: Find, Union
• Example:
– Find(A,3) = yes
– Find(A,5) = no
– Find(B,3) = yes
– Union(A,B) = {1,2,3,4}
• Very efficient: almost linear in the number
of union+find operations
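The "almost linear" bound is the one usually quoted for the disjoint-set (union-find) forest with union by size and path compression, so the sketch below assumes that variant. Note the differences from the example above: the sets here are disjoint, and `find` returns a representative element of the set rather than yes/no. The class and method names are illustrative.

```python
class UnionFind:
    """Disjoint-set forest with union by size and path compression."""

    def __init__(self, elements):
        self.parent = {x: x for x in elements}   # each element starts as its own set
        self.size = {x: 1 for x in elements}

    def find(self, x):
        root = x
        while self.parent[root] != root:         # walk up to the representative
            root = self.parent[root]
        while self.parent[x] != root:            # path compression
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx                            # already in the same set
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx                      # attach the smaller tree to the larger
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]
        return rx


uf = UnionFind(range(1, 6))
uf.union(1, 2)
uf.union(3, 4)
print(uf.find(1) == uf.find(2))  # True
print(uf.find(2) == uf.find(3))  # False
```

Two elements are in the same set exactly when their representatives are equal.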
Graphs
• Graph G(V, E): V is a set of vertices, E a set of edges
V = {v1, v2, v3, v4, v5, v6}
E = { (v1, v2), (v1, v5), (v1, v6),
(v2, v3), (v2, v5), (v2, v6),
(v3, v4), (v3, v5), (v3, v6) }
[Figure: drawing of the graph G]

• Linked list representation:
v1: v2, v5, v6
v2: v1, v3, v5, v6
v3: v2, v4, v5, v6
v4: v3
v5: v1, v2, v3
v6: v1, v2, v3

• Adjacency matrix representation:
     V1  V2  V3  V4  V5  V6
V1    0   1   0   0   1   1
V2    1   0   1   0   1   1
V3    0   1   0   1   1   1
V4    0   0   1   0   0   0
V5    1   1   1   0   0   0
V6    1   1   1   0   0   0
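Both representations can be derived from the vertex set V and edge set E of the example graph. The sketch below assumes an undirected graph without self-loops; the function names `adjacency_list` and `adjacency_matrix` are illustrative.

```python
vertices = ["v1", "v2", "v3", "v4", "v5", "v6"]
edges = [
    ("v1", "v2"), ("v1", "v5"), ("v1", "v6"),
    ("v2", "v3"), ("v2", "v5"), ("v2", "v6"),
    ("v3", "v4"), ("v3", "v5"), ("v3", "v6"),
]


def adjacency_list(vertices, edges):
    """Map each vertex to the sorted list of its neighbours."""
    adj = {v: [] for v in vertices}
    for u, w in edges:
        adj[u].append(w)      # undirected: record the edge in both directions
        adj[w].append(u)
    return {v: sorted(ns) for v, ns in adj.items()}


def adjacency_matrix(vertices, edges):
    """Return an n*n 0/1 matrix; entry [i][j] is 1 iff vertices i and j are adjacent."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    m = [[0] * n for _ in range(n)]
    for u, w in edges:
        m[index[u]][index[w]] = 1
        m[index[w]][index[u]] = 1
    return m


for v, neighbours in adjacency_list(vertices, edges).items():
    print(f"{v}: {', '.join(neighbours)}")
for row in adjacency_matrix(vertices, edges):
    print(row)
```

The list form uses space proportional to the number of edges, while the matrix always uses n*n entries but answers "are u and w adjacent?" in O(1).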

A Greedy Clustering Example

• Clustering is a very important tool in the analysis of large quantities of data
• Clustering: given a number of objects, we want to group them based on similarity
• Here we will work out a very simple example: clustering points in a plane by single-link hierarchical clustering

Clustering Points in the Plane
Problem 1: Given n points p1(x1, y1), p2(x2, y2), …, pn(xn, yn) in a plane, cluster them so that if the distance between two points is less than D they are in the same cluster
Input: D, p1(x1, y1), p2(x2, y2), …, pn(xn, yn)
Output: Sets (clusters) of points C1, C2, …, Ck.
[Figure: clusters C1 and C2 and the distance D]

Algorithm Draft
• Calculate distances between point pairs

• Sort the distances in ascending order

Unsorted             Sorted
p2 p1 d2,1           p7 p5 d7,5
p3 p2 d3,2    ->     p3 p1 d3,1
p3 p1 d3,1           p4 p3 d4,3
…                    …

• Move through the sorted list of distances and add a new point to a cluster if the distance is < D.

Algorithm in Detail
• Data structure for the graph: adjacency matrix
p2 p1 d2,1
p3 p2 d3,2
p3 p1 d3,1
…
• Data structure for the clusters: Set (Union/Find)

Algorithm in Detail
• Calculate distances: O(n^2)
– For all pairs i, j calculate d(i,j)
• Sort the adjacency table by distance: O(n^2 log n)
• Start with n sets, {p1}, {p2}, …, {pn}. Build a linked-list representation of a graph:
– Get the next smallest distance, d(i,j)
– If d(i,j) >= D, done
– Else add the edge (pi, pj) and Union(Find(pi), Find(pj))
• Traverse the graph to find the connected components (DFS)
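The procedure above can be sketched in Python as follows. This is a minimal sketch under some assumptions: points are (x, y) tuples, distance is Euclidean, and the clusters are read directly off the union-find structure instead of building a graph and running DFS, which yields the same connected components. The function name `single_link_clusters` is illustrative.

```python
import math
from itertools import combinations


def single_link_clusters(points, D):
    """Group points so that any two points closer than D share a cluster."""
    n = len(points)
    # Calculate all pairwise distances: O(n^2)
    pairs = [(math.dist(points[i], points[j]), i, j)
             for i, j in combinations(range(n), 2)]
    # Sort by distance: O(n^2 log n)
    pairs.sort()
    # Start with n singleton sets; parent[i] == i marks a representative
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path halving keeps trees shallow
            i = parent[i]
        return i

    for d, i, j in pairs:
        if d >= D:
            break                            # all remaining distances are >= D
        parent[find(i)] = find(j)            # Union(Find(pi), Find(pj))

    # Collect points by representative to obtain the clusters
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(points[i])
    return list(clusters.values())


points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
print(single_link_clusters(points, D=1.5))
# [[(0, 0), (1, 0), (2, 0)], [(10, 0), (11, 0)]]
```

In this example (0, 0) and (2, 0) end up in the same cluster although they are 2 apart, which is more than D: they are linked through (1, 0).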
Algorithm Analysis
• Correctness:
– All distances less than D are added
– Clusters contain all points with distance < D to
some other point in the cluster
• Time complexity:
– Dominated by the sorting step
– O(n^2 log n)

Discussion
• This algorithm is known as Single-Link
Hierarchical Clustering
• It is a version of Kruskal's Minimum Spanning Tree Algorithm, stopped once the next distance reaches D
• It is fast: O(n^2 log n) for n points

Performance on Real Data
• Poor: chaining effects (a chain of nearby points can link distant points into one cluster)

Better Approaches:
Complete-Link Clustering
Problem 2: Given n points p1(x1, y1), p2(x2, y2), …, pn(xn, yn) in a plane, cluster them so that the distance between any two points in a cluster is less than D

Input: D, p1(x1, y1), p2(x2, y2), …, pn(xn, yn)
Output: Sets (clusters) of points C1, C2, …, Ck.
[Figure: clusters C1 and C2 and the distance D]

2. Bio-databases
• A biological database is a large, organized
body of persistent data, usually associated
with computerized software designed to
update, query, and retrieve components of
the data stored within the system.
– easy access to the information
– a method for extracting only that information
needed to answer a specific biological question
• Many databases are linked through a unique search and retrieval system, e.g. NCBI's Entrez.

Database Interfacing
• APIs: scripts in Perl, Python, R
• Direct online:
– NCBI Entrez
– KEGG
– Reactome
– etc.
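As one example of the API route, a Python script can query NCBI Entrez through its E-utilities web service. The sketch below only builds a search URL and defines, but does not call, a helper that fetches it. The base URL and the `db`, `term`, `retmax` and `retmode` parameters are given here from general knowledge of E-utilities rather than from the slides, so check them against the current NCBI documentation and usage policy. The function names and the search term are illustrative.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the NCBI E-utilities service (not stated in the slides)
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch query URL for the given Entrez database and search term."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def esearch(db: str, term: str, retmax: int = 20) -> bytes:
    """Run the query over the network and return the raw response body."""
    with urlopen(esearch_url(db, term, retmax)) as response:
        return response.read()


print(esearch_url("gene", "BRCA1 AND human[orgn]", retmax=5))
```

Separating URL construction from the network call makes the query easy to inspect and test without contacting the server.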

3. Workflows
