C++ - Reducing time complexity of string comparison


I have a dictionary .txt file with around a thousand words and their definitions. I've written a program that takes the first word of each line of this file and checks it against a string input by the user:

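    #include <algorithm>
    #include <cctype>
    #include <fstream>
    #include <sstream>
    #include <string>

    using std::string;

    // Lowercase a string in place; the unsigned char cast avoids undefined
    // behaviour when tolower is handed a negative char value.
    void toLowerInPlace(string& s)
    {
        std::transform(s.begin(), s.end(), s.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    }

    // Returns a status message, so the return type is string rather than void.
    string checkword(string input)
    {
        std::ifstream infile("oxford.txt");
        if (!infile.is_open())
        {
            return "unable to open file";
        }
        toLowerInPlace(input); // the input only needs lowercasing once, not per line
        string line;
        while (getline(infile, line))
        {
            // read the first word of each line
            std::istringstream iss(line);
            string word;
            iss >> word;
            // make sure the strings being compared are the same case
            toLowerInPlace(word);
            if (word == input)
            {
                // do a thing with the word
            }
        }
        return "end of file"; // infile is closed automatically when it goes out of scope
    }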

But if I'm checking more than a sentence, the time it takes to process becomes noticeable. I've thought about a few ways of making this time shorter:

  • Making a .txt file for each letter of the alphabet (pretty easy to do, but not really a fix in the long term)
  • Using unordered_set to compare the strings (like in this question); the only problem might be the initial creation of the set from the text file
  • Using some other data structure to compare strings (like std::map)

Given that the data is already "sorted", what kind of data structure or method should I employ in order to (if possible) reduce the time complexity? Also, are there any issues with the function I am using to compare strings? (For example, would string::compare() be quicker than "=="?)

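A tree (std::map)? Or a hash map (std::unordered_map)? Your linear search is a brute-force solution: every lookup re-reads the file, so it costs O(n) per word. A tree gives O(log n) lookups and a hash map gives O(1) on average, so both of the above would be substantially superior for multiple searches.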
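As a rough illustration of the hash map option, here is a minimal sketch that reads the file once and then answers lookups from memory. The names `toLower`, `loadDictionary` and `isKnownWord` are made up for this example, and it assumes each line has the form "word definition...", as the question describes.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Lowercase a copy of s; the unsigned char cast avoids undefined behaviour
// for negative char values passed to std::tolower.
std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: read "word definition..." lines once and index them
// by the lowercased first word.
std::unordered_map<std::string, std::string> loadDictionary(std::istream& in) {
    std::unordered_map<std::string, std::string> dict;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        std::string word;
        if (!(iss >> word)) continue;              // skip blank lines
        std::string definition;
        std::getline(iss >> std::ws, definition);  // rest of the line, may be empty
        dict.emplace(toLower(word), definition);   // keeps the first entry for a repeated word
    }
    return dict;
}

// Average O(1) per lookup, instead of one pass over the file per word.
bool isKnownWord(const std::unordered_map<std::string, std::string>& dict,
                 const std::string& input) {
    return dict.find(toLower(input)) != dict.end();
}
```

You would open `std::ifstream infile("oxford.txt")` once, pass it to `loadDictionary`, and then call `isKnownWord` for every word of the user's sentence. Swapping in `std::map` only changes the container type; lookups become O(log n) and the entries stay ordered.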

Of course, that only really helps if you are going to use this data multiple times per program run, which you didn't specify in your question. If not, there's not much benefit in loading, parsing and storing all the data just to perform a single lookup and then quit. Put a break in on success, at least.
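For the single-lookup case, the early exit could look something like the sketch below. `containsWord` and `equalsIgnoreCase` are hypothetical names; the function takes any `std::istream`, so you can hand it the opened `std::ifstream`. It assumes each entry appears at most once, so nothing is lost by stopping at the first match.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Case-insensitive equality that leaves both strings untouched.
bool equalsIgnoreCase(const std::string& a, const std::string& b) {
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y) {
                          return std::tolower(x) == std::tolower(y);
                      });
}

// Hypothetical helper: scan line by line and stop at the first match.
bool containsWord(std::istream& in, const std::string& input) {
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        std::string word;
        if (iss >> word && equalsIgnoreCase(word, input)) {
            return true;   // early exit: the rest of the file is never read
        }
    }
    return false;
}
```

This is still O(n) in the worst case, but a hit near the top of the file no longer pays for reading everything after it.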

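You imply that your input file is sorted. You could hack together a binary search solution with file seeking (which is really cheap) and snapping to the nearest newline on each iteration to determine where all the words with the same leading (say) 3 characters are in your file. For a thousand entries, though, this is probably overkill.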

