THE UNIVERSITY OF WESTERN AUSTRALIA

SCHOOL OF COMPUTER SCIENCE & SOFTWARE ENGINEERING

CITS3242: Programming Paradigms

Lab sheet 2: Indexing a text document.

In this lab, you will write a program that creates an index for a text document. The index will list
every word that occurs in the document, together with the number of every line on which the word
appears. The words in the index will be sorted alphabetically, the line numbers listed for each
word will be in ascending order, and there will be no duplicated entries.
This lab uses some features that we have not yet seen in lectures: polymorphism and type
abbreviations. The code that uses them is already in the starting point, so you do not need to
know how to write such code, only how to use the provided definitions, which should be easy.

Make sure that you have the following files, which are available via the unit web page:
Lab2index.fs    contains type declarations and some trivial functions
inp             contains an example text file
inp.index       contains the index produced for inp

Compare inp and inp.index to make sure that you understand the problem specification. Put inp in
an appropriate folder to use for testing once your program is complete.
Create a new project and add the file Lab2index.fs to it, then add the functions below. The types
for each function include type abbreviations, which are defined in the starting point. You should
use these types to guide you in writing your program.
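
The lab sheet does not list the abbreviations themselves, so the following is only a guess at their shape, inferred from the function signatures and examples in this sheet. The definitions in Lab2index.fs are authoritative and may differ.

```fsharp
// Assumed shapes only: inferred from the signatures in this lab sheet,
// not copied from Lab2index.fs.
type fileContents = string
type line = string
type lineNumber = int
type wordStr = string
type index = (wordStr * lineNumber list) list

// A type abbreviation is just an alias, so ordinary strings and ints
// can be used wherever the abbreviated names appear.
let sample : fileContents = "a zz\r\nb yy"
let entry : wordStr * lineNumber = ("zz", 1)
```

Because abbreviations are aliases rather than new types, they document intent without requiring any conversion code.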

1. Define a function
numberLines : fileContents -> (line * lineNumber) list

that takes the contents of a file and returns a list containing the lines from the file, each paired
with its line number. E.g.,
numberLines "a zz\r\nb yy\r\ncx d\r\n\r\nd zz d"
// yields: [ ("a zz",1); ("b yy",2); ("cx d",3); ("",4); ("d zz d",5) ]
(Hint: use lines and zip.)

[Note: “\r\n” is an end of line under Windows. However, beyond test cases like this it is
generally better to use Environment.NewLine, so that code will work with other operating
systems.]
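
A minimal sketch of the approach suggested by the hint, assuming that lines splits the file contents on "\r\n" and that zip refers to List.zip. The lines function below is a hypothetical stand-in for the one in the starting point code, and plain string and int are used in place of the type abbreviations.

```fsharp
// Hypothetical stand-in for `lines` from Lab2index.fs: splits on the
// Windows line ending used in the test case, keeping empty lines.
let lines (s : string) : string list =
    s.Split([| "\r\n" |], System.StringSplitOptions.None) |> List.ofArray

// Pair each line with its 1-based line number.
let numberLines (contents : string) : (string * int) list =
    let ls = lines contents
    List.zip ls [1 .. List.length ls]

let numbered = numberLines "a zz\r\nb yy\r\ncx d\r\n\r\nd zz d"
// numbered = [("a zz",1); ("b yy",2); ("cx d",3); ("",4); ("d zz d",5)]
```

List.zip requires both lists to have the same length, which is why the range is built from List.length ls.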



2. Use numberLines to define a function
numberWords : fileContents -> (wordStr * lineNumber) list

that takes the contents of a file and returns a list containing the words of the file, each paired
with its line number. E.g.,
numberWords "a zz\r\nb yy\r\ncx d\r\n\r\nd zz d\r\n"
// yields: [("a",1); ("zz",1); ("b",2); ("yy",2);
//          ("cx",3); ("d",3); ("d",5); ("zz",5); ("d",5)]

(Hint: use words and a list comprehension.)
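
One way the hint might be followed is a nested list comprehension: the outer loop walks the numbered lines and the inner loop walks the words of each line. In this sketch, lines and words are hypothetical stand-ins for the starting point helpers (words is assumed to split on spaces and drop empty strings), and numberLines is one possible solution to question 1.

```fsharp
// Hypothetical stand-ins for helpers from Lab2index.fs.
let lines (s : string) =
    s.Split([| "\r\n" |], System.StringSplitOptions.None) |> List.ofArray
let words (s : string) =
    s.Split([| ' ' |], System.StringSplitOptions.RemoveEmptyEntries) |> List.ofArray

// One possible numberLines (question 1).
let numberLines contents =
    let ls = lines contents
    List.zip ls [1 .. List.length ls]

// Each word inherits the number of the line it came from.
let numberWords (contents : string) : (string * int) list =
    [ for (l, n) in numberLines contents do
        for w in words l do
            yield (w, n) ]

let pairs = numberWords "a zz\r\nb yy\r\ncx d\r\n\r\nd zz d\r\n"
```

Empty lines contribute no words, so line 4 of the example does not appear in the result even though it is still counted when numbering.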

3. Use loseAdjacentDuplicates (see below) to define a function
distinctfsts : (wordStr * lineNumber) list -> wordStr list

that takes a list of pairs xys and returns the list of distinct first fields in xys. xys is assumed
to be sorted. E.g.,

distinctfsts [("a",1); ("b",2); ("cx",3); ("d",3); ("d",5);
              ("d",5); ("yy",2); ("zz",1); ("zz",5)]
// yields: ["a"; "b"; "cx"; "d"; "yy"; "zz"]

(Hint: use a list comprehension.)


Note – you should use the following function from the starting point code:
loseAdjacentDuplicates : 'a list -> 'a list

Here 'a is a type variable: the function can be used at any type obtained by replacing 'a with
a concrete type, for example string list -> string list. This function will remove duplicates
from a list, as long as the duplicates are grouped together, which they will be if the list is sorted.
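
To illustrate the polymorphism described above, here is one possible definition of loseAdjacentDuplicates together with a sketch of distinctfsts built on it. The definition in Lab2index.fs may differ; this version is an assumption written only to make the example self-contained.

```fsharp
// One possible definition (the starting point code may differ).
// Drops an element whenever it equals the element that follows it.
let rec loseAdjacentDuplicates xs =
    match xs with
    | x :: y :: rest when x = y -> loseAdjacentDuplicates (y :: rest)
    | x :: rest -> x :: loseAdjacentDuplicates rest
    | [] -> []

// The same function works at different element types.
let ints = loseAdjacentDuplicates [1; 1; 2; 3; 3]    // [1; 2; 3]
let strs = loseAdjacentDuplicates ["d"; "d"; "yy"]   // ["d"; "yy"]

// Sketch of distinctfsts: project out the first fields, then drop repeats.
let distinctfsts (xys : (string * int) list) : string list =
    loseAdjacentDuplicates [ for (x, _) in xys -> x ]
```

Only adjacent repeats are removed, so loseAdjacentDuplicates [1; 2; 1] leaves the list unchanged. This is why the input is assumed to be sorted.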

4. Use loseAdjacentDuplicates again to define a function
snds : wordStr -> (wordStr * lineNumber) list -> lineNumber list

that takes a value x and a list of pairs xys and returns the list of distinct second fields
associated with x in xys. x is assumed to occur in xys, and xys is assumed to be sorted. E.g.,
snds "d" [("a",1); ("b",2); ("cx",3); ("d",3); ("d",5);
("d",5); ("yy",2); ("zz",1); ("zz",5)]

// yields: [3; 5]

(Hint: use a list comprehension.)
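
A sketch of how the hint could be applied: a list comprehension with a filter keeps the line numbers paired with x, and loseAdjacentDuplicates then removes repeats. The definition of loseAdjacentDuplicates here is an assumed stand-in for the one in the starting point code.

```fsharp
// Assumed stand-in for loseAdjacentDuplicates from Lab2index.fs.
let rec loseAdjacentDuplicates xs =
    match xs with
    | x :: y :: rest when x = y -> loseAdjacentDuplicates (y :: rest)
    | x :: rest -> x :: loseAdjacentDuplicates rest
    | [] -> []

// Keep the second field of every pair whose first field is x.
let snds (x : string) (xys : (string * int) list) : int list =
    loseAdjacentDuplicates [ for (w, n) in xys do if w = x then yield n ]

let dLines =
    snds "d" [("a",1); ("b",2); ("cx",3); ("d",3); ("d",5);
              ("d",5); ("yy",2); ("zz",1); ("zz",5)]
```

The filter yields [3; 5; 5] for this input. Since the pairs are sorted, equal line numbers for the same word sit next to each other, and the duplicate 5 is removed.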



5. Use distinctfsts and snds to define a function
combineWords : (wordStr * lineNumber) list -> index

that takes a list of words paired with individual line numbers wis and returns an index, i.e. a
list of words each paired with a list of line numbers. wis is assumed to be sorted. E.g.,
combineWords [("a",1); ("b",2); ("cx",3); ("d",3); ("d",5);
              ("yy",2); ("zz",1); ("zz",5)]
// yields: [("a",[1]); ("b",[2]); ("cx",[3]); ("d",[3; 5]); ("yy",[2]); ("zz",[1; 5])]

(Hint: use a list comprehension.)
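
A possible shape for combineWords, assuming distinctfsts and snds behave as specified in questions 3 and 4. All helper definitions below are illustrative stand-ins, and the index type is written out as (string * int list) list, which is an assumption about the abbreviation in Lab2index.fs.

```fsharp
// Illustrative stand-ins for the helpers from the earlier questions.
let rec loseAdjacentDuplicates xs =
    match xs with
    | x :: y :: rest when x = y -> loseAdjacentDuplicates (y :: rest)
    | x :: rest -> x :: loseAdjacentDuplicates rest
    | [] -> []
let distinctfsts (xys : (string * int) list) =
    loseAdjacentDuplicates [ for (x, _) in xys -> x ]
let snds (x : string) (xys : (string * int) list) =
    loseAdjacentDuplicates [ for (w, n) in xys do if w = x then yield n ]

// For each distinct word, collect the line numbers recorded for it.
let combineWords (wis : (string * int) list) : (string * int list) list =
    [ for w in distinctfsts wis -> (w, snds w wis) ]

let idx =
    combineWords [("a",1); ("b",2); ("cx",3); ("d",3); ("d",5);
                  ("yy",2); ("zz",1); ("zz",5)]
```

This rescans the whole list once per distinct word, which is simple but quadratic in the worst case. That is unlikely to matter for files the size of inp.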

6. Use numberWords, combineWords and sort to define a function
makeIndex : fileContents -> index

that takes the contents of a file and returns an index for the file. See inp and inp.index for
examples.
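
When testing makeIndex, it can help to compare its result against an independent reference. The sketch below is not the approach the lab asks for: it uses the standard library functions List.groupBy, List.distinct and List.sortBy instead of numberWords and combineWords. The name referenceIndex is hypothetical, and the splitting rules ("\r\n" between lines, spaces between words) are assumptions about how the starting point helpers behave.

```fsharp
// Hypothetical cross-check, built from standard library functions only.
let referenceIndex (contents : string) : (string * int list) list =
    contents.Split([| "\r\n" |], System.StringSplitOptions.None)
    |> List.ofArray
    |> List.mapi (fun i l -> (l, i + 1))            // number the lines from 1
    |> List.collect (fun (l, n) ->                  // pair each word with its line
        l.Split([| ' ' |], System.StringSplitOptions.RemoveEmptyEntries)
        |> List.ofArray
        |> List.map (fun w -> (w, n)))
    |> List.groupBy fst                             // gather the pairs for each word
    |> List.map (fun (w, ps) ->
        (w, ps |> List.map snd |> List.distinct |> List.sort))
    |> List.sortBy fst                              // alphabetical by word

let expected = referenceIndex "a zz\r\nb yy\r\ncx d\r\n\r\nd zz d"
// expected = [("a",[1]); ("b",[2]); ("cx",[3]); ("d",[3; 5]); ("yy",[2]); ("zz",[1; 5])]
```

If makeIndex and referenceIndex disagree on an input, the difference usually points to a sorting or duplicate-removal step that was missed.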

7. Test main by giving it the full path to a copy of inp, and check the index that it writes to
inp.index.

