tansimia
diff --git a/Diff for: ‎Assignments/EN/Assignment_5.ipynb
+266-28 b/Diff for: ‎Assignments/EN/Assignment_5.ipynb
diff --git a/Diff for: ‎Data/figs/Sample-concordance-lines-of-actually.png
111 KB b/Diff for: ‎Data/figs/Sample-concordance-lines-of-actually.png
@@ -1,7 +1,6 @@
 {
  "cells": [
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -17,15 +16,13 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "***"
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -34,7 +31,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 1,
    "metadata": {},
    "outputs": [
     {
@@ -67,30 +64,34 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "***"
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "You will solve the following exercises using Pure Python.\n",
-    "1. Count words in a text\n",
+    "#### You will solve the following exercises using Pure Python.\n",
+    "1. Count words in a text  \n",
     "2. Sort a list of words in various ways  \n",
-    "   • ascii order  \n",
-    "   • \"rhyming\" order  \n",
-    "3. Extract useful info from a dictionary\n",
-    "4. Compute ngram statistics\n",
-    "5. Make a Concordance"
+    "   • ascii order   \n",
+    "   • \"rhyming\" order   \n",
+    "3. Extract useful info for a dictionary  \n",
+    "4. Compute ngram statistics  \n",
+    "5. Make a Concordance  "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "***"
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -105,7 +106,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 2,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -114,7 +115,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 3,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -123,7 +124,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 4,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -132,30 +133,268 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 5,
    "metadata": {},
    "outputs": [],
    "source": [
     "# d)"
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "##### 2. Sorting and reversing lines of text\n",
     "\n",
-    "a. Sort each line ignoring case\n",
-    "• sort –n Numeric order\n",
-    "• sort –r Reverse sort\n",
-    "• sort –nr Reverse numeric sort"
+    "a. Sort each line alphabetically ignoring case  \n",
+    "b. Sort in numeric ([ascii](https://python-reference.readthedocs.io/en/latest/docs/str/ASCII.html)) order  \n",
+    "c. Alphabetically reverse sort (ignoring case)  \n",
+    "d. Reverse numeric ([ascii](https://python-reference.readthedocs.io/en/latest/docs/str/ASCII.html)) sort  "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# a)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# b)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# c)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# d)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### 3. Extract useful info for a dictionary\n",
+    "\n",
+    "a. Find the 50 most common words  \n",
+    "b. Find the words in the NYT that end in \"zz\" "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# a)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# b)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### 4. Compute ngrams and other statistics\n",
+    "\n",
+    "a. Find the 10 most common bigrams  \n",
+    "b. Find the 10 most common trigrams  \n",
+    "c. Count the lines, the words, and the characters  \n",
+    "d. How many all-uppercase words are there in this NYT file?  \n",
+    "e. How many 4-letter words?  \n",
+    "f. How many different words are there with no vowels?  \n",
+    "g. What subtypes do they belong to?  \n",
+    "h. How many “1 syllable” words are there?  "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# a)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# b)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# c)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# d)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# e)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# f)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# g)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 19,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# h)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### 5. Make a Concordance\n",
+    "\n",
+    "a. Create a concordance display for an arbitrary word. See the example below.  \n",
+    "\n",
+    "![](../../Data/figs/Sample-concordance-lines-of-actually.png)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# a)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "***"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### Extra Credit – Secret Message\n",
+    "+ The answers to the extra credit exercises will reveal a secret message.  \n",
+    "+ We will be working with the following text file for these exercises:  \n",
+    "[Link to Text](https://web.stanford.edu/class/cs124/lec/secret_ec.txt)  "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### Extra Credit Exercise 1\n",
+    "• Find the 2 most common words in secret_ec.txt containing the letter e.  \n",
+    "• Your answer will correspond to the first two words of the secret message.  "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### Extra Credit Exercise 2\n",
+    "• Find the 2 most common bigrams in secret_ec.txt where the second word in the bigram ends with a consonant.  \n",
+    "• Your answer will correspond to the next four words of the secret message.  "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### Extra Credit Exercise 3\n",
+    "• Find all 5-letter words that appear only once in secret_ec.txt.  \n",
+    "• Concatenate your result. This will be the final word of the secret message.  \n",
+    "\n",
+    "What is the secret message?  "
    ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
   }
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
@@ -169,10 +408,9 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.10"
-  },
-  "orig_nbformat": 4
+   "version": "3.10.6"
+  }
  },
  "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
 }
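
Review note: the exercise cells added in this diff are empty placeholders, so here is a minimal sketch of the kind of standard-library-only code the notebook seems to be asking for. It assumes that "Pure Python" means no third-party packages, and that a word is a maximal run of ASCII letters after lowercasing. The helper names (`tokenize`, `most_common_ngrams`, `rhyming_sort`, `concordance`) and the sample sentence are hypothetical and do not come from the notebook; "rhyming" order is read here as sorting by reversed spelling.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of ASCII letters (an assumed definition of 'word')."""
    return re.findall(r"[a-z]+", text.lower())


def most_common_ngrams(words, n, k):
    """Return the k most frequent n-grams as (tuple_of_words, count) pairs."""
    # zip over n shifted copies of the word list yields consecutive n-word tuples
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(grams).most_common(k)


def rhyming_sort(words):
    """Sort words by their reversed spelling so that similar endings end up adjacent."""
    return sorted(words, key=lambda w: w[::-1])


def concordance(words, target, width=3):
    """Return one line per occurrence of target, with up to `width` words of context per side."""
    lines = []
    for i, w in enumerate(words):
        if w == target:
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            # Right-align the left context so the keyword lines up in a column
            lines.append(f"{left:>20} [{w}] {right}")
    return lines


# Hypothetical sample text; the assignment itself uses an NYT file not shown in this diff.
sample = "The cat sat on the mat. The cat ran."
words = tokenize(sample)

print(Counter(words).most_common(2))    # word counts
print(most_common_ngrams(words, 2, 1))  # most frequent bigram
print(rhyming_sort(set(words)))         # "rhyming" order
for line in concordance(words, "cat"):
    print(line)
```

The same helpers could be pointed at a real file by reading it with `open(path).read()` and passing the result to `tokenize`; a different tokenization rule (for example, keeping case for the all-uppercase exercise) would change the counts.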