hashtable - Java spell checker using hash tables -
i don't want codes. want learn logic myself need pointing right direction. pseudocode fine. need create spell checker using hash tables primary data structure. know may not best data structure job tasked do. words correct spellings come text file. please guide me on how approach problem.
the way i'm thinking of doing it:
i'm guessing need create adt class takes string words.
i need main class reads dictionary text file , takes sentence inputted user. class scans string of words places each word arraylist noting spaces in between words. boolean method pass each word in arraylist class handle misspellings , return if word valid or false.
i believe need create class generates misspellings word list , stores them hash table? there boolean method takes string parameter checks in table if word valid , return true or false.
in generating misspellings, key concepts have out be: (take example word: "hello")
- missing characters. e.g. "ello", "helo"
- jumbled version of word. e.g. "ehllo", "helol"
- phonetic misspelling. e.g. "fello" ('f' 'h')
how can improve on thinking?
edit! came using hashset
/** * main program loads correct dictionary spellings * , takes input analyzed user. * @author catherine austria */ public class spellchecker { private static string stringinput; // input check; private static string[] checkthis; // stringinput turned array of words check. public static hashset dictionary; // dictionary used /** * main method. * @param args argh! */ public static void main(string[] args) { setup(); }//end of main /** * method loads dictionary , initiates checks errors in scanned input. */ public static void setup(){ int tablesize=59000; dictionary = new hashset(tablesize); try { //system.out.print(system.getproperty("user.dir"));//just find user's working directory; // combined filereader bufferreader statement //the file located in edu.frostburg.cosc310 bufferedreader bufferedreader = new bufferedreader(new filereader("./dictionary.txt")); string line = null; // notes 1 line @ time while((line = bufferedreader.readline()) != null) { dictionary.add(line);//add dictinary word in } prompt(); bufferedreader.close(); //close file } catch(filenotfoundexception ex) { ex.printstacktrace();//print error } catch(ioexception ex) { ex.printstacktrace();//print error } }//end of setup /** * prompt auto generated tests or manual input test. */ public static void prompt(){ system.out.println("type number below: "); system.out.println("1. auto generate test\t2.manual input\t3.exit"); scanner theline = new scanner(system.in); int choice = theline.nextint(); // manual input if(choice==1) autotest(); else if(choice==2) startwinput(); else if (choice==3) system.exit(0); else system.out.println("invalid input. exiting."); } /** * manual input of sentence or words. */ public static void startwinput(){ //printdictionary(bufferedreader); // print dictionary system.out.println("spell checker c. austria\nplease enter text check: "); scanner theline = new scanner(system.in); stringinput = theline.nextline(); // manual input system.out.print("\nyou have entered text: "+stringinput+"\ninitiating check..."); /*------------------------------------------------------------------------------------------------------------*/ //final long starttime = system.currenttimemillis(); //speed test wordfinder grammarnazi = new wordfinder(); //instance of misspell splitstring(removepunctuation(stringinput));//turn string line string[] grammarnazi.initialcheck(checkthis); //final long endtime = system.currenttimemillis(); //system.out.println("total execution time: " + (endtime - starttime) ); }//end of startwinput /** * generates testing case. */ public static void autotest(){ system.out.println("spell checker c. austria\nthis sentence being tested:\nthe dog foud hom. , m ct hisse xdgfchv!@# "); wordfinder grammarnazi = new wordfinder(); //instance of misspell splitstring(removepunctuation("the dog foud hom. , m ct hisse xdgfchv!@# "));//turn string line string[] grammarnazi.initialcheck(checkthis); }//end of autotest /** * method prints entire dictionary. * used in testing. * @param bufferedreader dictionary file */ public static void printdictionary(bufferedreader bufferedreader){ string line = null; // notes 1 line @ time try{ while((line = bufferedreader.readline()) != null) { system.out.println(line); } }catch(filenotfoundexception ex) { ex.printstacktrace();//print error } catch(ioexception ex) { ex.printstacktrace();//print error } }//end of printdictionary /** * methods splits passed string , puts them string[] * @param sentence sentence needs editing. */ public static void splitstring(string sentence){ // split sentence in between " " aka spaces checkthis = sentence.split(" "); }//end of splitstring /** * method removes punctuation , capitalization string. * @param sentence sentence needs editing. * @return edited sentence. */ public static string removepunctuation(string sentence){ string newsentence; // new sentence //remove evil punctuation , convert whole line lowercase newsentence = sentence.tolowercase().replaceall("[^a-za-z\\s]", "").replaceall("\\s+", " "); return newsentence; }//end of removepunctuation } class checks misspellings public class wordfinder extends spellchecker{ private int wordslength;//length of string[] check private list<string> wrongwords = new arraylist<string>();//stores incorrect words /** * methods checks string[] spelling errors. * hashes each index in string[] see if in dictionary hashset * @param words string list of misspelled words check */ public void initialcheck(string[] words){ wordslength=words.length; system.out.println(); for(int i=0;i<wordslength;i++){ //system.out.println("what i'm checking: "+words[i]); //test if(!dictionary.contains(words[i])) wrongwords.add(words[i]); } //end //manualwordlookup(); //for testing dictionary if (!wrongwords.isempty()) { system.out.println("mistakes have been made!"); printincorrect(); } //end if if (wrongwords.isempty()) { system.out.println("\n\nmove along. end of program."); } //end if }//end of initialcheck /** * method prints incorrect words in string[] being checked , generates suggestions. */ public void printincorrect(){//delete guy system.out.print("these words [ "); (string wrongword : wrongwords) { system.out.print(wrongword + " "); }//end of system.out.println("]seems incorrect.\n"); suggest(); }//end of printincorrect /** * method gives suggestions user based on wrong words she/he misspelled. */ public void suggest(){ misspell test = new misspell(); while(!wrongwords.isempty()&&test.possibilities.size()<=5){ string wordcheck=wrongwords.remove(0); test.generatemispellings(wordcheck); //if possibilities size greater 0 print suggestions if(test.possibilities.size()>=0) test.print(test.possibilities); }//end of while }//end of suggest /*entering test zone*/ /** * allows tester thorough dictionary words if valid; , testing only. */ public void manualwordlookup(){ system.out.print("enter 'ext' exit.\n\n"); scanner line = new scanner(system.in); string look=line.nextline(); do{ if(dictionary.contains(look)) system.out.print(look+" valid\n"); else system.out.print(look+" invalid\n"); look=line.nextline(); }while (!look.equals("ext")); }//end of manualwordlookup } /** * main class responsible generating misspellings. * @author catherine austria */ public class misspell extends spellchecker{ public list<string> possibilities = new arraylist<string>();//stores possible suggestions private list<string> tempholder = new arraylist<string>(); //telps transposition method private int ldistance=0; // distance related 2 words private string wrongword;// original wrong word. /** * execute methods make misspellings. * @param wordcheck word being checked. */ public void generatemispellings(string wordcheck){ wrongword=wordcheck; try{ concatfl(wordcheck); concatll(wordcheck); replacefl(wordcheck); replacell(wordcheck); deletefl(wordcheck); deletell(wordcheck); pluralize(wordcheck); transposition(wordcheck); }catch(stringindexoutofboundsexception e){ system.out.println(); }catch(arrayindexoutofboundsexception e){ system.out.println(); } } /** * method concats word behind each of alphabet letters , checks if in dictionary. * fl first letter * @param word word being manipulated. */ public void concatfl(string word){ char cur; // current character string tempword=""; // stores temp made word for(int i=97;i<123;i++){ cur=(char)i;//assign ascii index value tempword+=cur; //if word in dictionary add possibilities list tempword=tempword.concat(word); //add passed string end of tempword checkdict(tempword); //check see if in dictionary tempword="";//reset temp word contain nothing }//end of }//end of concatfl /** * concatenates alphabet letters behind each of word , checks if in dictionary. ll last letter. * @param word word being manipulated. */ public void concatll(string word){ char cur; // current character string tempword=""; // stores temp made word for(int i=123;i>97;i--){ cur=(char)i;//assign ascii index value tempword=tempword.concat(word); //add passed string end of tempword tempword+=cur; //if word in dictionary add possibilities list checkdict(tempword); tempword="";//reset temp word contain nothing }//end of }//end of concatll /** * method replaces first letter (fl) of word alphabet letters. * @param word word being manipulated. */ public void replacefl(string word){ char cur; // current character string tempword=""; // stores temp made word for(int i=97;i<123;i++){ cur=(char)i;//assign ascii index value tempword=cur+word.substring(1,word.length()); //add ascii of ad substring of word index 1 till word's last index checkdict(tempword); tempword="";//reset temp word contain nothing }//end of }//end of replacefl /** * method replaces last letter (ll) of word alphabet letters * @param word word being manipulated. */ public void replacell(string word){ char cur; // current character string tempword=""; // stores temp made word for(int i=97;i<123;i++){ cur=(char)i;//assign ascii index value tempword=word.substring(0,word.length()-1)+cur; //add ascii of ad substring of word index 1 till word's last index checkdict(tempword); tempword="";//reset temp word contain nothing }//end of }//end of replacell /** * deletes first letter , sees if in dictionary * @param word word being manipulated. */ public void deletefl(string word){ string tempword=word.substring(1,word.length()-1); // stores temp made word checkdict(tempword); //print(possibilities); }//end of deletefl /** * deletes last letter , sees if in dictionary * @param word word being manipulated. */ public void deletell(string word){ string tempword=word.substring(0,word.length()-1); // stores temp made word checkdict(tempword); //print(possibilities); }//end of deletell /** * method pluralizes word input * @param word word being manipulated. */ public void pluralize(string word){ string tempword=word+"s"; checkdict(tempword); }//end of pluralize /** * it's purpose check word if in dictionary. * if is, add possibilities list. * @param word word being checked. */ public void checkdict(string word){ if(dictionary.contains(word)){//check see if tempword in dictionary //if tempword in dictionary, check if in possibilities list //then if tempword not in list, add tempword list if(!possibilities.contains(word)) possibilities.add(word); } }//end of checkdict /** * method transposes letters of word different places. * not best implementation. guy last minute addition. * @param word word being manipulated. */ public void transposition(string word){ wrongword=word; int wordlen=word.length(); string[] mixer = new string[wordlen]; //string[] length of passed word //make word string[] for(int i=0;i<wordlen;i++){ mixer [i]=word.substring(i,i+1); } shift(mixer); }//end of transposition /** * method takes string[] list shifts value in between * elements in list , checks if in dictionary, adds if so. * agree brute force implementation. * @param mixer string array being shifted around. */ public void shift(string[] mixer){ system.out.println(); string wordvalue=""; for(int i=0;i<=tempholder.size();i++){ resethelper(tempholder);//reset helper transposehelper(mixer);//fill tempholder string wordfirstvalue=tempholder.remove(i);//remove value @ index in tempholder for(int j=0;j<tempholder.size();j++){ int inttemp=0; string temp; while(inttemp<j){ temp=tempholder.remove(inttemp); tempholder.add(temp); wordvalue+=wordfirstvalue+printword(tempholder); inttemp++; if(dictionary.contains(wordvalue)) if(!possibilities.contains(wordvalue)) possibilities.add(wordvalue); wordvalue=""; }//end of while }//end of }//end }//end of shift /** * method fills list tempholder contents string[] * @param wordmix string array being shifted around. */ public void transposehelper(string[] wordmix){ for(int i=0;i<wordmix.length;i++){ tempholder.add(wordmix[i]); } }//end of transposehelper /** * resets list * @param thislist removes content of list */ public void resethelper(list<string> thislist){ while(!thislist.isempty()) thislist.remove(0); //while list not empty, remove first value }//end of resethelper /** * method prints out list * @param listprint list print out. */ public void print(list<string> listprint){ if (possibilities.isempty()) { system.out.print("can't seem find related words "+wrongword); return; } system.out.println("maybe meant these "+wrongword+": "); system.out.printf("%s", listprint); resethelper(possibilities); }//end of print /** * returns string word version of list * @param listprint list make word. * @return generated word version of list. */ public string printword(list<string> listprint){ object[] suggests = listprint.toarray(); string theword=""; for(object word: suggests){//form listprint elements word theword+=word; } return theword; }//end of printword }
it sounds want quick way verify word spelled correctly, or find correct spelling. if trying can use hashmap<string,string>
(i.e. hash table string keys , string values). whenever find word in dictionary enter key null value indicating word not changed (i.e. correct spelling). can compute , add keys possible misspellings , give correct word value.
you'd have devise way carefully, because if dictionary has 2 similar words "clot" , "colt" computed misspellings of 1 may replace correct spelling (or misspellings) of other. once done can word see if in dictionary, if misspelling of dictionary word (and word), or if not found @ all.
i believe bad design though, because table has exponentially larger (i assume, quite large) dictionary. , because spent lot of time calculating many misspelling every word in dictionary (very big overhead if check few lines may contain few of these words). given little liberty opt hashset<string>
(which hash table without values) filled dictionary words. allows check if word in dictionary or not.
you can dynamically compute other ways spell words when encounter ones not in dictionary. if doing line or 2 should not slow @ (certainly faster computing alternatives in dictionary). if wanted every whole file may want keep smaller hashmap<string,string>
separate dictionary store corrections find since author may misspell word same way in future. checking before computing alternatives keeps duplicating efforts several times over.
Comments
Post a Comment