Lab 5: Perl Scripting

Due Date: Sun 21 Mar 2010 by 23:59


Objective:

Once you are done with this lab, you should be familiar with:


Assignment:

Using the web without a search engine has become almost unthinkable. We all use Google, Yahoo, Bing, or a similar service to locate information online. In this assignment, you are to write a Perl script that takes a keyword as a command-line argument and searches for that keyword on Google, Yahoo, and Bing.
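As a starting point, here is a minimal sketch of reading the keyword and building one query URL per search engine. The Google URL prefix is the one used in the wget example in this handout; the Yahoo and Bing prefixes are assumptions that you should verify in a browser. The names url_encode and build_search_urls are hypothetical helpers, not part of any library.

```perl
use strict;
use warnings;

# Engine name and query URL prefix. Only the Google prefix appears in this
# handout; the Yahoo and Bing prefixes are assumptions to check yourself.
my @ENGINES = (
    [ "Google", "http://www.google.com/search?q=" ],
    [ "Yahoo",  "http://search.yahoo.com/search?p=" ],
    [ "Bing",   "http://www.bing.com/search?q=" ],
);

# Percent-encode every byte that is not an unreserved URL character,
# so that keywords containing spaces or '&' do not break the query string.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Return a list of [engine name, full search URL] pairs for one keyword.
sub build_search_urls {
    my ($keyword) = @_;
    my $encoded = url_encode($keyword);
    return map { [ $_->[0], $_->[1] . $encoded ] } @ENGINES;
}

# The keyword is the first command-line argument.
my $keyword = shift @ARGV;
if (defined $keyword) {
    for my $pair (build_search_urls($keyword)) {
        print "$pair->[0]: $pair->[1]\n";
    }
} else {
    print STDERR "Usage: $0 keyword\n";
}
```

Keeping the engines in an ordered list rather than a hash means they are always processed in the same order.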

The script should then combine the top 5 URLs returned by each search engine and rank the combined list. A URL's rank is based on the number of search engines that returned it: a URL found by more search engines ranks higher.
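One possible way to do the combining and ranking is sketched below, using a hash that maps each URL to the list of engines that returned it. The sample data, the sub names rank_urls and format_line, and the tie-breaking rule (URLs with the same count keep the order in which they were first seen) are all assumptions made for illustration; the assignment does not say how ties should be ordered.

```perl
use strict;
use warnings;

# Combine per-engine result lists and rank URLs by how many engines found them.
# $results:      hash ref, engine name => array ref of URLs (best result first)
# $engine_order: array ref giving the order in which engines are processed
# Returns a list of [url, [engines that returned it]] pairs, best rank first.
sub rank_urls {
    my ($results, $engine_order) = @_;
    my (%found_by, %first_seen);
    my $counter = 0;
    for my $engine (@$engine_order) {
        my @urls = @{ $results->{$engine} || [] };
        @urls = @urls[0 .. 4] if @urls > 5;    # keep only the top 5
        my %seen;                              # skip repeats within one engine
        for my $url (@urls) {
            next if $seen{$url}++;
            $first_seen{$url} = $counter++ unless exists $first_seen{$url};
            push @{ $found_by{$url} }, $engine;
        }
    }
    # More engines first; ties keep first-seen order (an assumption).
    my @ranked = sort {
        scalar(@{ $found_by{$b} }) <=> scalar(@{ $found_by{$a} })
            || $first_seen{$a} <=> $first_seen{$b}
    } keys %found_by;
    return map { [ $_, $found_by{$_} ] } @ranked;
}

# Build one output line in the form: "url" - Engine & Engine
sub format_line {
    my ($url, $engines) = @_;
    return sprintf('"%s" - %s', $url, join(" & ", @$engines));
}

# Hypothetical sample data standing in for real search results.
my %sample = (
    Google => [ "http://a.example/", "http://b.example/" ],
    Yahoo  => [ "http://b.example/", "http://c.example/" ],
    Bing   => [ "http://b.example/", "http://a.example/" ],
);
for my $entry (rank_urls(\%sample, [ "Google", "Yahoo", "Bing" ])) {
    print format_line(@$entry), "\n";
}
```

Sorting on the size of each engine list keeps the ranking rule in one place, so a different tie-breaker only requires changing the second comparison.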

Finally, the script should display the URLs in rank order, with the highest ranked URLs listed first. The display should also show which search engine(s) provided the result. Your format should match the following example:

"http://www.tessa.org/" - Google & Yahoo & Bing
"http://www.tessacs.org/" - Google & Yahoo & Bing
"http://en.wikipedia.org/wiki/Tessa_No%C3%ABl" - Yahoo & Bing
"http://www.tessafrica.net/" - Google & Bing
"http://twitter.com/tessa" - Google
"http://tessasbraces.blogspot.com/" - Google
"http://www.tessahoran.com" - Yahoo
"http://www.tessaradley.com" - Yahoo
"http://www.tessa.com" - Bing

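To download web pages, use the UNIX command wget. A useful form of the command is:

wget -U "Mozilla/5.0" -O "outputfile" URL > /dev/null 2>&1

For example:

wget -U "Mozilla/5.0" -O "temp.html" "http://www.google.com/search?q=tessa" > /dev/null 2>&1

The -U option sets the browser identification (User-Agent) string, and -O names the file in which the page is saved. Because wget writes its progress messages to standard error, the redirection must include 2>&1 to silence them. See the wget man page for more information.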
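The following sketch shows one way a Perl script might run wget and then pull absolute links out of the saved page. The sub names fetch_page and extract_links are hypothetical, and the regular expression is a deliberately simple assumption: it matches any double-quoted http or https href, so on a real results page it will also pick up the search engine's own navigation links, which you will need to filter out for each engine. The sketch uses wget's -q (quiet) option in place of shell redirection.

```perl
use strict;
use warnings;

# Download $url into $file with wget and return the page contents,
# or an empty return on failure. The list form of system() bypasses the
# shell, so characters such as '&' in the URL need no extra quoting.
sub fetch_page {
    my ($url, $file) = @_;
    my $status = system("wget", "-q", "-U", "Mozilla/5.0", "-O", $file, $url);
    return if $status != 0;
    open(my $fh, "<", $file) or return;
    local $/;                 # slurp mode: read the whole file at once
    my $html = <$fh>;
    close($fh);
    return $html;
}

# Return the absolute http/https links found in anchor tags, in page order,
# without duplicates. This is a naive pattern, not a full HTML parser.
sub extract_links {
    my ($html) = @_;
    my (@links, %seen);
    while ($html =~ /<a\s[^>]*?href="(https?:\/\/[^"]+)"/gi) {
        push @links, $1 unless $seen{$1}++;
    }
    return @links;
}

# Demonstration on an inline sample, so no network access is needed.
my $sample_html = '<a href="http://www.tessa.org/">Tessa</a> '
                . '<a href="/preferences">Settings</a>';
print "$_\n" for extract_links($sample_html);
```

In a full script, the string returned by fetch_page for each search URL would be passed to extract_links, and the first five relevant links kept.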

Handing in your Solution

Using the same process as in previous labs, copy your Perl (.pl) file to the handin directory. Submissions received by Saturday midnight are early (+10% bonus), submissions received by Sunday midnight are on time, and submissions received by Monday midnight, the end of the late period, are late (-10% penalty).