CS201 Midterm2 Helper Sheet
ArrayList
ArrayList<String> studentList = new ArrayList<String>();
ArrayList can't hold primitives like int or char, BUT use the wrappers Integer, Character
studentList.add("John");
studentList.remove(0);
studentList.remove("Lily");
list.get(i)
list.contains("hello")
int bat = Collections.frequency(list, "bat");
Array
dataType[] arrayName = new dataType[size];
Changing Elements: arrayName[index] = value
arrayName.length
Traverse array:
for (int i = 0; i < arrayName.length; i++)
for (dataType element : arrayName)
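The ArrayList and array operations listed above can be tried together in one small program. This is only a sketch: the class name ListBasics and the sample names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of any course starter code.
class ListBasics {
    // Builds a small list and applies add / remove-by-index / remove-by-value.
    static List<String> demoList() {
        ArrayList<String> studentList = new ArrayList<>();
        studentList.add("John");
        studentList.add("Lily");
        studentList.add("Ana");
        studentList.remove(0);       // removes by index: "John"
        studentList.remove("Lily");  // removes by value: first "Lily"
        return studentList;
    }

    // Counts how many times target occurs in words.
    static int count(List<String> words, String target) {
        return Collections.frequency(words, target);
    }

    // Sums an int array with an indexed loop (arrays may hold primitives).
    static int sum(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(demoList());
        System.out.println(count(List.of("bat", "cat", "bat"), "bat"));
        System.out.println(sum(new int[] {1, 2, 3}));
    }
}
```

Note that remove(0) takes an index while remove("Lily") takes a value; with an ArrayList<Integer> the two overloads are easy to confuse.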
P3 CHAT MARKOV
1. AbstractMarkovModel Class
Purpose:
Abstract base class for Markov models, defining core methods for text-based training and
generation.
1. setTraining(String text): Abstract, splits text into words.
Runtime: O(n), where n is the number of words.
2. getFollows(WordGram wgram): Abstract, retrieves words following a WordGram.
Runtime: O(n) (BaseMarkovModel) / O(1) average (HashMarkovModel).
3. getRandomNextWord(WordGram wgram): Picks a random word following a
WordGram.
Runtime: O(1) for the random pick itself; the total cost is dominated by the getFollows call.
4. getRandomGram(): Generates a random WordGram.
Runtime: O(k), where k is the WordGram length.
2. WordGram Class
Purpose: Represents an immutable sequence of words.
Key Methods and Runtime:
1. equals(Object o): Checks if two WordGram objects are equal.
Runtime: O(k).
2. shiftAdd(String last): Returns a new WordGram with a word shifted in.
Runtime: O(k).
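A minimal sketch of why both methods are O(k), assuming the words are stored in a private String array. The class name SimpleWordGram and its constructor are hypothetical and may not match the course's WordGram; the sketch also assumes k is at least 1.

```java
import java.util.Arrays;

// Hypothetical stand-in for WordGram; field and constructor are assumptions.
final class SimpleWordGram {
    private final String[] words;

    // Copies size words from source, starting at index start.
    SimpleWordGram(String[] source, int start, int size) {
        words = Arrays.copyOfRange(source, start, start + size);
    }

    // O(k): copy words 1..k-1 one slot to the left, put last at the end,
    // and return a NEW object so this one stays unchanged.
    SimpleWordGram shiftAdd(String last) {
        String[] shifted = new String[words.length];
        System.arraycopy(words, 1, shifted, 0, words.length - 1);
        shifted[words.length - 1] = last;
        return new SimpleWordGram(shifted, 0, shifted.length);
    }

    // O(k): compares the two arrays word by word.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SimpleWordGram)) {
            return false;
        }
        return Arrays.equals(words, ((SimpleWordGram) o).words);
    }

    // Must agree with equals so the object works as a HashMap key.
    @Override
    public int hashCode() {
        return Arrays.hashCode(words);
    }

    @Override
    public String toString() {
        return String.join(" ", words);
    }
}
```

Overriding hashCode together with equals matters here because the HashMap-based model uses these objects as keys.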
3. BaseMarkovModel Class
Purpose:
Uses sequential search to find words following a WordGram.
Key Methods and Runtime:
1. setTraining(String text): Splits text into words.
Runtime: O(n).
2. getFollows(WordGram wgram): Searches through the entire text.
Runtime: O(n) per call.
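The sequential search can be sketched as below. To keep the example self-contained it uses a plain String array for the gram instead of a WordGram; the class ScanFollows and helper matchesAt are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the scan-the-whole-text approach.
class ScanFollows {
    // True if gram appears in words beginning at index start.
    static boolean matchesAt(String[] words, int start, String[] gram) {
        for (int j = 0; j < gram.length; j++) {
            if (!words[start + j].equals(gram[j])) {
                return false;
            }
        }
        return true;
    }

    // Scans all n words on every call and collects each word that
    // immediately follows an occurrence of gram.
    static List<String> getFollows(String[] words, String[] gram) {
        List<String> follows = new ArrayList<>();
        int k = gram.length;
        for (int i = 0; i + k < words.length; i++) {
            if (matchesAt(words, i, gram)) {
                follows.add(words[i + k]);
            }
        }
        return follows;
    }
}
```

Strictly, each comparison costs O(k), so one call is O(n*k); the sheet's O(n) treats k as a small constant.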
4. HashMarkovModel Class
Purpose:
Uses a HashMap to pre-compute and store follow-word mappings for faster lookups.
Key Methods and Runtime:
1. setTraining(String text): Populates a HashMap with WordGram mappings.
Runtime: O(n).
2. getFollows(WordGram wgram): Looks up words from the HashMap.
Runtime: O(1) average, O(n) worst-case.
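A sketch of the precomputed map, assuming a List<String> of k words as the key in place of a WordGram (List already defines equals and hashCode). The class name HashFollows and the k parameter on setTraining are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the course model keys the map on WordGram objects.
class HashFollows {
    private final Map<List<String>, List<String>> followMap = new HashMap<>();

    // One pass over the text: map each k-word window to the word after it.
    void setTraining(String text, int k) {
        followMap.clear();
        String[] words = text.split("\\s+");
        List<String> all = Arrays.asList(words);
        for (int i = 0; i + k < words.length; i++) {
            List<String> key = new ArrayList<>(all.subList(i, i + k));
            followMap.computeIfAbsent(key, unused -> new ArrayList<>())
                     .add(words[i + k]);
        }
    }

    // Average O(1) lookup; returns an empty list for an unseen gram.
    List<String> getFollows(List<String> gram) {
        return followMap.getOrDefault(gram, Collections.emptyList());
    }
}
```

The trade-off is extra memory and a slower setTraining in exchange for fast repeated getFollows calls during generation.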
P4 DNA
The IDnaStrand interface specifies the operations required to manipulate DNA strands.
Key Methods:
1. cutAndSplice: Replaces every occurrence of a given enzyme sequence with a splicee
sequence.
Search for Enzyme:
Iterate through the strand, searching for the enzyme using indexOf().
Every time an occurrence of the enzyme is found:
o The segment before the enzyme is extracted.
o The enzyme is replaced with the splicee string.
Build the New Strand:
If this is the first match, a new strand is created using getInstance().
For subsequent matches, the new segments are appended to the existing strand.
After each replacement, the start index is moved to the end of the current enzyme
match.
2. size: Returns the length of the strand.
3. append: Adds DNA sequences to the end of the strand.
4. reverse: Reverses the DNA strand.
5. charAt: Accesses a character at a given index.
6. getInstance: Creates a new instance of the DNA strand type.
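The cutAndSplice loop described in item 1 can be sketched on plain Strings. This swaps in a StringBuilder where the course code would call getInstance() and append() on an IDnaStrand, and the class name SpliceSketch is hypothetical. When the enzyme never occurs, this sketch returns the input unchanged, which is an assumption and may differ from the course version.

```java
// Hypothetical String-based sketch of the search-and-replace loop.
class SpliceSketch {
    // Returns a new string with every occurrence of enzyme replaced by splicee.
    static String cutAndSplice(String strand, String enzyme, String splicee) {
        if (enzyme.isEmpty()) {
            return strand;  // guard: an empty enzyme would never advance
        }
        StringBuilder result = new StringBuilder();
        int start = 0;
        int pos = strand.indexOf(enzyme, start);
        while (pos >= 0) {
            result.append(strand, start, pos);  // segment before the enzyme
            result.append(splicee);             // splicee in place of the enzyme
            start = pos + enzyme.length();      // resume after this match
            pos = strand.indexOf(enzyme, start);
        }
        result.append(strand.substring(start)); // whatever follows the last match
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(cutAndSplice("aagaattcttgaattcgg", "gaattc", "X"));
    }
}
```

Because the start index jumps past each match, overlapping occurrences of the enzyme are not matched twice.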
Shift Add (WordGram.shiftAdd, from P3): shifts the existing sequence to the left (dropping the first
word) and adds a new word to the end of the sequence.
StringStrand
Description: Stores the DNA data as a simple String.
Key Operations:
o Each append creates a new String and copies all existing characters, so repeated appends are costly.
o reverse() creates a new reversed string using StringBuilder.
Runtime Analysis:
1. append(): O(n) (new string created each time).
2. reverse(): O(n) (creating and reversing with StringBuilder).
3. cutAndSplice(): O(m*n) (repeated string operations during search/replace).
Pros:
o Simple to implement and straightforward.
Cons:
o Slow for large inputs due to string immutability and frequent copying.
StringBuilderStrand
Description: Uses a StringBuilder for better performance with dynamic strings.
Key Operations:
o Appending is performed directly on the StringBuilder, avoiding repeated copying.
o reverse() reverses the content in-place with StringBuilder.
Runtime Analysis:
1. append(): O(1) amortized per appended character (no new string creation).
2. reverse(): O(n) (in-place reversal).
3. cutAndSplice(): O(m*n) (still linear search but better append performance).
Pros:
o Faster appends compared to StringStrand.
Cons:
o Still not optimal for very large data due to the sequential nature of string operations.
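The difference between the two append strategies can be seen in a small sketch. The class AppendCompare and its method names are hypothetical; concatCopies simply counts the characters that the String version copies, under the assumption that each concatenation copies the whole result.

```java
// Hypothetical comparison of String concatenation and StringBuilder appends.
class AppendCompare {
    // StringStrand style: each + builds a new String from the old contents.
    static String concatAppend(String piece, int times) {
        String s = "";
        for (int i = 0; i < times; i++) {
            s = s + piece;
        }
        return s;
    }

    // StringBuilderStrand style: append writes into a growable buffer.
    static String builderAppend(String piece, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    // Characters copied by concatAppend: len + 2*len + ... + times*len.
    static long concatCopies(int pieceLength, int times) {
        return (long) pieceLength * times * (times + 1) / 2;
    }
}
```

Both methods return the same text, but the copy count for the String version grows quadratically in the number of appends, while the StringBuilder version copies each character a constant number of times on average.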
LinkStrand
Description: Uses a linked list (Node) to store DNA segments. Each append adds a new node to
the list.
Key Operations:
o Appending is constant time by adding new nodes.
o reverse() reverses the linked list by creating new nodes in reverse order.
Runtime Analysis:
1. append(): O(1) (constant time with linked list).
2. reverse(): O(n) (traversing and creating new nodes).
3. cutAndSplice(): O(m + n) (efficient due to constant-time appends).
Pros:
o Best for large data due to constant-time appends.
Cons:
o More complex implementation compared to string-based approaches.
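A minimal sketch of the linked-list idea, assuming a tail pointer so that append does not walk the list. The class LinkSketch, its fields, and the choice to reverse each node's text while building the reversed list are assumptions for illustration, not the course's LinkStrand.

```java
// Hypothetical linked-list strand; names and fields are assumptions.
class LinkSketch {
    private static class Node {
        String info;
        Node next;
        Node(String info) { this.info = info; }
    }

    private Node first;
    private Node last;
    private long size;

    LinkSketch(String dna) {
        first = last = new Node(dna);
        size = dna.length();
    }

    // Constant time in the number of nodes: link one new node after the tail.
    LinkSketch append(String dna) {
        last.next = new Node(dna);
        last = last.next;
        size += dna.length();
        return this;
    }

    long size() { return size; }

    // Visits nodes front to back and pushes a reversed copy of each onto
    // the front of a new list, so the original strand is left unchanged.
    LinkSketch reverse() {
        Node head = null;
        Node tail = null;
        for (Node cur = first; cur != null; cur = cur.next) {
            Node copy = new Node(new StringBuilder(cur.info).reverse().toString());
            copy.next = head;
            head = copy;
            if (tail == null) {
                tail = copy;  // first copy ends up as the new tail
            }
        }
        LinkSketch rev = new LinkSketch("");
        rev.first = head;
        rev.last = tail;
        rev.size = size;
        return rev;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (Node cur = first; cur != null; cur = cur.next) {
            sb.append(cur.info);
        }
        return sb.toString();
    }
}
```

Appending never copies existing DNA text, which is why splicing many copies of a long splicee stays cheap with this representation.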