content_indexer v0.2.0 ContentIndexer.Services.SearchUtils
utility functions to crawl a folder with files and extract content - the actual processing of the content is handled by the file_pre_process_func function that we are using from the ContentIndexer.Services.PreProcess module - however this can easily be swapped out by passing your own pre-process
Link to this section Summary
Functions
Compiles the query using the passed pre-process function
See the ContentIndexer.Services.PreProcess
module for what sort of pre-processing is done to the query tokens list
crawls a folder and process the content into tokens using the passed in function
See the ContentIndexer.Services.PreProcess
module for what sort of pre-processing is done when the content is crawled
see crawl function - this version does the same thing - just in the background using a Task
Link to this section Functions
Compiles the query using the passed pre-process function
See the ContentIndexer.Services.PreProcess
module for what sort of pre-processing is done to the query tokens list
## Parameters
- query: List of String query tokens
- file_pre_process_func: Function with 1 parameter that is used to pre-process the token data
## Example
iex> ContentIndexer.Services.SearchUtils.compile_query(["bread", "and", "butter"], &ContentIndexer.Services.PreProcess.pre_process_query/)
[{"bread", -0.34657359027997264}, {"butter", -0.34657359027997264}]
crawls a folder and process the content into tokens using the passed in function
See the ContentIndexer.Services.PreProcess
module for what sort of pre-processing is done when the content is crawled
## Parameters
- data_folder: String - folder name
- file_pre_process_func: Function with 2 parameters that is used to pre-process the token data
## Example
iex> ContentIndexer.Services.SearchUtils.crawl("test/fixtures", &ContentIndexer.Services.PreProcess.pre_process_content/2)
[
{"test1.md",
["test1", "this", "test", "file", "one", "two", "simpl", "line", "text"...]},
{"test2.md",
["test2", "cook", "great", "hobbi", "nor", "again", "anyon", "who", "love"...]},
{"test3.md",
["test3", "how", "about", "learn", "new", "music", "instrument", "year"...]}
]
see crawl function - this version does the same thing - just in the background using a Task.