Lab1: Shell Scripting

Props to Greg Kesden for this assignment

Due Date: Sun, 25 Jan 2009 by 11:59pm


Background

This assignment asks you to implement a short shell script. Given a list of keywords and a list of files, the script should produce an organized index that shows the line number of each occurrence of each keyword within the specified files. Basically, this script makes it easy to find a reference within a file by keyword.


Specification

The script accepts two arguments, each of which is a quoted list. The first argument is the list of keywords. The second argument is the list of files to search. Each file is to be searched for occurrences of each keyword. Only whole-word occurrences count: a keyword must not match when it appears as a substring of a longer word, e.g., if one of the keywords is "ark", occurrences of "mark" should not be flagged as a match.

Should the user invoke the script with any number of arguments other than two (2), it should display its usage help and terminate, as below:

    Usage: index "keywords" "file list"
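
One way the argument check could look is sketched below. The function name check_args is hypothetical, and printing the usage line to standard output (rather than standard error) is an assumption; the specification only says the usage help should be displayed.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when called with exactly two arguments.
check_args() {
    if [ "$#" -ne 2 ]; then
        # Assumption: usage goes to standard output.
        echo 'Usage: index "keywords" "file list"'
        return 1
    fi
    return 0
}

# In a full script this would typically be: check_args "$@" || exit 1
check_args "Parker Waterman" "pens/*" && echo "argument count ok"
```

Passing "$@" in quotes matters here: it keeps each quoted list as a single argument, so the count stays at two.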

The output should be neat and readable. The Sample Interaction section below shows an example. Please try to match this format as closely as possible, using tabs to force the leading whitespace on a line.


Sample Interaction

> ./index "Parker Waterman" "pens/*"
Parker
        pens/fountain-pens.txt:
                12 15 39 58 62 67 
        pens/inks.txt:
                8 21 42 
        pens/other-pens.txt:
                7 22 
        pens/pens.txt:
                20 26 
Waterman
        pens/fountain-pens.txt:
                4 10 23 27 28 29 37 43 44 47 54 59 69 71 
        pens/inks.txt:
                11 19 22 24 28 30 32 47 
        pens/other-pens.txt:
                10 27 
        pens/pens.txt:
                5 18 35 39 40 42 


Implementation Hints and Strategy

grep can be used to find lines that contain matching patterns, such as words. RTFM to find the grep switches that print line numbers and that restrict matches to whole words.
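
As a quick experiment at the prompt, the sketch below builds a throwaway file and searches it. It assumes a grep that supports -n (line numbers) and -w (whole-word matching); -w is available in GNU and BSD grep but is not required by POSIX, so check your man page. The file contents are made up for illustration.

```shell
#!/bin/sh
# Build a small throwaway file so the example is self-contained.
sample=$(mktemp)
printf '%s\n' 'the ark sailed' 'mark the spot' 'ark again' > "$sample"

# -n prefixes each matching line with its line number.
# -w matches whole words only, so "mark" is not reported for "ark".
hits=$(grep -n -w 'ark' "$sample")
echo "$hits"

rm -f "$sample"
```

When grep is given a single file, each output line has the form "number:text"; with several files, the file name is added in front.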

cut can be used to select the necessary information from the grep output. For example, cut -d: -f3 selects the third colon-delimited field of each input line.
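
To see how the fields fall out, the sketch below splits one made-up line in the "file:number:text" shape that grep -n prints when it searches more than one file. The variable names are hypothetical.

```shell
#!/bin/sh
# A made-up line in the "file:number:text" shape.
line='pens/inks.txt:8:Parker made this ink'

# -d: sets the delimiter to a colon; -fN picks the Nth field.
file_field=$(printf '%s\n' "$line" | cut -d: -f1)
number_field=$(printf '%s\n' "$line" | cut -d: -f2)
text_field=$(printf '%s\n' "$line" | cut -d: -f3)

printf '%s | %s | %s\n' "$file_field" "$number_field" "$text_field"
```

Note that if the matched text itself contains a colon, -f3 returns only the part before it; -f3- would keep everything from the third field onward.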

tr, short for translate, can be used to substitute one character for another or, with the -d option, to delete a character outright.
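
Both uses of tr can be tried on literal input, as in the sketch below. The numbers and variable names are made up for illustration.

```shell
#!/bin/sh
# Substitute: turn each newline into a space, joining the lines into one.
joined=$(printf '8\n21\n42\n' | tr '\n' ' ')

# Delete: remove every colon from the input.
stripped=$(printf 'a:b:c\n' | tr -d ':')

printf '[%s]\n' "$joined"
printf '[%s]\n' "$stripped"
```

Because the final newline is also translated, the joined result ends with a space.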

Try these commands with some sample searches at the command prompt before using them in your shell script!

My solution first checked the argument count. If it wasn't right, it printed the usage and exited. Then, it entered a nested loop. The outer loop was "for each keyword in the list." The nested loop was "for each file in the list".

Basically, I printed out the keyword. Then, indented, I printed out each hit for each file. I got the hits and line numbers using grep, then massaged its output. Lastly, I used printf to generate the output.
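
The nested loop described above might be sketched as follows. This is one possible arrangement under stated assumptions, not the reference solution: the function name build_index is hypothetical, grep is assumed to support -w, and skipping files that contain no hits is a guess, since the specification does not say what to print for them.

```shell
#!/bin/sh
# Hypothetical function: $1 is the keyword list, $2 is the file list.
build_index() {
    # $1 is left unquoted on purpose so the shell splits it into words.
    for keyword in $1; do
        printf '%s\n' "$keyword"
        # $2 is left unquoted on purpose so it is split and globs expand.
        for file in $2; do
            # grep prints "number:text" for one file; cut keeps the number;
            # tr joins the numbers onto a single line.
            numbers=$(grep -n -w -e "$keyword" "$file" | cut -d: -f1 | tr '\n' ' ')
            # Assumption: files with no hits are left out of the index.
            if [ -n "$numbers" ]; then
                printf '\t%s:\n\t\t%s\n' "$file" "$numbers"
            fi
        done
    done
}
```

A call such as build_index "Parker Waterman" "pens/*" would then print each keyword followed by tab-indented file names and line numbers. File names that contain spaces would break the unquoted expansion, which this sketch does not handle.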


Good things to research (RTFM)


Handing in your Solution

The handin directory will be set up next week.