content_indexer v0.1.0 ContentIndexer.Services.PreProcess
Content and query pre-processing functions that are passed to the SearchUtils.compile and SearchUtils.compile_query functions. Here we just do some extra work on a markdown file, i.e. removing the header.
The important thing to note is that these two functions take the content as a string and return a list of tokenized strings.
The steps we are taking:
(1) Remove all the stop words. They are noise and we should never search by them.
(2) Remove non-character data and white space.
Using streams means most of the work happens in a single pass.
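The steps above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the module name `PreProcessSketch`, the function `tokenize/1`, and the stop-word list are all assumptions made for the example. The sketch strips non-letter characters before filtering stop words so that punctuation attached to a word (such as "the,") does not hide it from the stop-word check; that ordering is a choice of this sketch.

```elixir
defmodule PreProcessSketch do
  # Hypothetical stop-word list; the real list is not shown in these docs.
  @stop_words ~w(a an and the of to in is it)

  # Takes the content as a string and returns a list of token strings.
  def tokenize(text) when is_binary(text) do
    text
    |> String.downcase()
    # With no pattern, String.split/1 splits on white space and drops empty parts.
    |> String.split()
    # The Stream calls below are lazy: each token flows through all three
    # steps in one pass when Enum.to_list/1 finally runs.
    # Note: this simple pattern also strips non-ASCII letters.
    |> Stream.map(&String.replace(&1, ~r/[^a-z]/, ""))
    |> Stream.reject(&(&1 == ""))
    |> Stream.reject(&(&1 in @stop_words))
    |> Enum.to_list()
  end
end
```

For example, `PreProcessSketch.tokenize("The quick, brown fox is in the box!")` returns `["quick", "brown", "fox", "box"]`.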
Functions
pre_process_content(content, file_name)
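The docs do not say what the markdown "header" looks like, so the following is only a guess at the idea. It assumes the header is a front-matter block delimited by `---` lines at the very top of a `.md` file; the module name `MarkdownHeaderSketch` and the function `strip_header/2` are hypothetical and are not part of ContentIndexer.

```elixir
defmodule MarkdownHeaderSketch do
  # Assumption: the "header" is a front-matter block between two "---" lines
  # at the top of the file. The real module may define it differently.
  def strip_header(content, file_name) do
    if Path.extname(file_name) == ".md" do
      # Split on the first two lines consisting only of "---".
      case String.split(content, ~r/^---[ \t]*$/m, parts: 3) do
        # An empty first part means the file began with a "---" line.
        ["", _header, body] -> String.trim_leading(body)
        # No leading front matter: leave the content untouched.
        _ -> content
      end
    else
      # Non-markdown files are passed through unchanged.
      content
    end
  end
end
```

Under these assumptions, the remaining body would then be tokenized in the same way as a query.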
pre_process_query(query)