CS 477/677 Analysis of Algorithms
Fall 2007
Dr. George Bebis

Course Project
Due Date: 11/29/2007

Part 1: Comparison of Sorting Algorithms (70% of the project grade)

The objective of the first part of the assignment is to study how the theoretical analysis of a variety of sorting algorithms compares with their actual performance. The sorting algorithms you will study are:

    Insertion Sort
    Merge Sort
    Quick Sort
    Heap Sort
    Radix Sort

The major emphasis of this assignment is on analyzing the performance of the algorithms, NOT on coding the algorithms. Therefore, you will be given a template program which implements the above algorithms. You will need to make only a few modifications to test out different strategies. Most of your time should be spent on designing careful test cases and analyzing your results in order to draw useful conclusions regarding the performance of the various algorithms.

Template Program

The code is documented well enough for you to take it and modify it. The template program can read the input from a file, or create three types of input lists: (i) randomly generated elements, (ii) elements sorted in increasing order, (iii) elements sorted in decreasing order. If needed, you can add additional functionality to it. The format of the input files is as follows:

    [n]            /* number of positive integers to sort */
    [element 1]
    [element 2]
    :
    [element n]

The times returned are in microseconds and seconds *on my machine*. However, the value returned is system dependent. The time reported as microseconds is 10^6 times the time reported as seconds. If you are unsure of the units, you will have to check the man page for your machine. However, primarily what would be of interest to you is
the *relative* times, i.e., the units should not matter except for determining the constants.

I don't think you will find a random number generator that generates a number in a range you specify. You will need to read the man pages for "rand", "drand", and "random" and their counterparts to see how they work, but most of them return a number in a fixed range, say (0,1). To get a number in the range you want, say [F,L], you would need to find a way to translate the random number r in (0,1) to something in [F,L]. There are a number of ways to do this. Here is just one (rand() below stands for whichever generator you choose, scaled to (0,1); C's own rand() returns an integer between 0 and RAND_MAX):

    r1 = rand()         /* random number between 0 and 1 */
    t1 = r1 * (L - F)   /* random number between 0 and L-F */
    r2 = F + t1         /* random number between F and L */

To compile the program, copy sort.c and Makefile into your directory and type make (it is written in C and uses the gcc compiler).

Coding

The version of QuickSort in the template program uses the first element in the array as the pivot element. You will study different strategies for selecting the pivot:

    Pivot Choice 1: The first element in the list (used in the template program).
    Pivot Choice 2: A random element in the array.
    Pivot Choice 3: The median of the first, middle, and last elements in the array.

You should implement Pivot Choices 2 and 3 listed above. You will then have three different versions of QuickSort (you will need to add them to the template program as additional menu options).

Graduate Students Only: Implement and compare two more sorting algorithms: Bubble Sort and Counting Sort.

Analysis

1. [10 points] Theoretical question. Assume the n input elements are integers in the range [0, n-1]. For each algorithm, determine the best, average, and worst-case inputs. Your write-up should list these for each algorithm. Include a sentence or two of justification for each one. You should answer what you expect to be true based on a theoretical analysis (and you should not refer to experimental results). In the subsequent questions we will compare the experimental results to these theoretical predictions.

2. [5 points] Data generation and experimental setup. The choice of test data is up to you (i.e., for each sorting algorithm, which input sizes should be tested, how many different inputs of the same size, which particular inputs of a given size). Be smart about which experiments to run, i.e., don't run larger or more tests than you need to answer the above questions reasonably well. Also, note that you will need to run your experiments several times in order to get stable measurements (i.e., times will vary depending upon system load, input, etc.). Your experimental setup must be described in terms of the following:

    What kind of machine did you use?
    What timing mechanism?
    How many times did you repeat each experiment?
    What times are reported?
    How did you select the inputs?
    Did you use the same inputs for all sorting algorithms?

3. [10 points] Which of the three versions of Quicksort seems to perform the best?

    Graph the best case running time as a function of input size n for all three versions (use the best case input you determined in each case in part 1).
    Graph the worst case running time as a function of input size n for all three versions (use the worst case input you determined in each case in part 1).
    Graph the average case running time as a function of input size n for all three versions.

4. [15 points] Which of the five sorts seems to perform the best (consider the best version of Quicksort)?

    Graph the best case running time as a function of input size n for the five sorts (use the best case input you determined in each case in part 1).
    Graph the worst case running time as a function of input size n for the five sorts (use the worst case input you determined in each case in part 1).
    Graph the average case running time as a function of input size n for the five sorts.

5. [15 points] To what extent do the best, average, and worst case analyses (from class/textbook) of each sort agree with the experimental results? To answer this question you would need to find a way to compare the experimental results for a sort with its predicted theoretical times.
One way to compare a time obtained experimentally to a predicted time of O(f(n)) (e.g., f(n) = n^2) would be to divide the times for a number of runs with different input sizes by f(n) and see if you get a horizontal line (after some input size n_0). That n_0 would represent the n_0 value for the asymptotic analysis. The value on the y-axis (assuming you put input size on the x-axis) will give you the constant value of the big-O. For each sort, and for each case (best, average, and worst), determine whether the observed experimental running time is of the same order as predicted by the asymptotic analysis. Your determination should be backed up by your experiments and analysis, and you must explain your reasoning. If you found the sort didn't conform to the asymptotic analysis, you should try to understand why and provide an explanation.

6. [15 points] For the comparison sorts, is the number of comparisons really a good predictor of the execution time? In other words, is a comparison a good choice of basic operation for analyzing these algorithms? To answer this question you would need to analyze your data to see if the number of comparisons is correlated with execution time. Plot (time / #comp) vs. n and refer to these plots in your answer.

Deliverables

A HARDCOPY REPORT. It should address the points mentioned above. Your write-up must include a coherent discussion of which experiments you ran, how many times you ran them, etc. Grading on this assignment will put the greatest weight on the choice of test data and the quality and insightfulness of your discussion of your results. Don't be put off too much if there are some discrepancies between the theoretical results and the experiments. If that happens, try to explain why it occurred. Reports must be typed and carry a high percentage of your grade. Write your report carefully; explain things as clearly as possible and check for spelling errors. Answer the questions in the order presented. Use meaningful titles for each subsection and figure captions to explain the graphs. Also, graphs should be numbered and must be in the same section where they are discussed.

AN ELECTRONIC COPY OF YOUR CODE. If you have several files to turn in, then please submit them in a rar file by email. Provide instructions on how to run your code.
Please note that I do not want a hardcopy of your code, your raw output, or a log of your program's execution.

Part 2: Problem Solving and Analysis (30% of the project grade)
Design and implement an efficient algorithm that, given a set S of n integers and another integer x, determines whether or not there exist two elements in S whose sum is exactly x.

(a) First, solve this problem in a brute-force manner by checking all possible pairs of elements. Show the pseudo-code and provide an analysis of the running time of this approach.

(b) Second, find a more efficient algorithm that does not require checking all possible pairs of elements. Show the pseudo-code and provide an analysis of the running time of this approach.

Deliverables

A HARDCOPY REPORT. It should address the points mentioned above. Your write-up must include a coherent discussion of the running time analysis of each approach.

AN ELECTRONIC COPY OF YOUR CODE. If you have several files to turn in, then please submit them in a rar file by email. Provide instructions on how to run your code. Please note that I do not want a hardcopy of your code, your raw output, or a log of your program's execution.

Contribution Statement

Provide a detailed statement about the contribution of each group member in completing this assignment. In particular, comment on positive and negative aspects of working together with your partner on this assignment. Please note that in the event that you cannot work together with your partner on this assignment, you should inform me as soon as possible. Individual solutions will not be accepted unless there is a good reason.
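
Supplementary note on generating test inputs

The range translation described under Template Program (mapping a random number in (0,1) to [F,L]) can be sketched in C roughly as follows. This is only one possible approach, not part of the template program: the function names random_in_range and write_random_input are hypothetical, and the sketch assumes integer elements and that L - F + 1 fits comfortably in an int. Because C's rand() returns an integer between 0 and RAND_MAX, it is first scaled into [0, 1).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not in the template program):
   returns a pseudo-random integer in the inclusive range [F, L]. */
int random_in_range(int F, int L)
{
    assert(F <= L);  /* the range must not be empty */

    /* rand() yields an int in [0, RAND_MAX]; dividing by RAND_MAX + 1
       gives a real number r1 in [0, 1). */
    double r1 = rand() / ((double)RAND_MAX + 1.0);

    /* Scale to [0, L - F + 1) so that truncation can reach L itself. */
    double t1 = r1 * ((double)L - (double)F + 1.0);

    /* Truncate and shift: an integer in [F, L]. */
    return F + (int)t1;
}

/* Hypothetical helper: writes n random integers in [F, L] using the
   input-file format from the handout (n first, then one element per line). */
void write_random_input(FILE *out, int n, int F, int L)
{
    fprintf(out, "%d\n", n);
    for (int i = 0; i < n; i++)
        fprintf(out, "%d\n", random_in_range(F, L));
}
```

Calling srand() once with a fixed seed before write_random_input(stdout, 10, 1, 100) would make the generated list repeatable, which helps when the same inputs are to be reused across all sorting algorithms. Note that scaling rand() this way is only approximately uniform when the range is large relative to RAND_MAX.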