content_indexer v0.2.0 ContentIndexer.Services.SearchUtils

Utility functions to crawl a folder of files and extract their content. The actual processing of the content is handled by the file_pre_process_func function, taken here from the ContentIndexer.Services.PreProcess module. It can easily be swapped out by passing your own pre-process function.

Summary

Functions

compile_query(query, query_pre_process_func): Compiles the query using the passed pre-process function. See the ContentIndexer.Services.PreProcess module for the pre-processing applied to the list of query tokens.

crawl(data_folder, file_pre_process_func): Crawls a folder and processes the content into tokens using the passed-in function. See the ContentIndexer.Services.PreProcess module for the pre-processing applied when the content is crawled.

crawl_async(data_folder, file_pre_process_func): See the crawl function. This version does the same thing, just in the background using a Task.

Functions

accum_list(list)
accum_list(list, acc)
build_index(data_folder, file_pre_process_func)
compile_query(query, query_pre_process_func)

Compiles the query using the passed pre-process function. See the ContentIndexer.Services.PreProcess module for the pre-processing applied to the list of query tokens.

## Parameters

- query: List of String query tokens
- query_pre_process_func: Function with 1 parameter that is used to pre-process the query tokens

## Example

iex> ContentIndexer.Services.SearchUtils.compile_query(["bread", "and", "butter"], &ContentIndexer.Services.PreProcess.pre_process_query/1)
      [{"bread", -0.34657359027997264}, {"butter", -0.34657359027997264}]
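Because the pre-process function is passed in, a custom one can be substituted. The following is a minimal sketch, not the library's code: `QuerySketch`, its stop-word list, and its stand-in `compile_query/2` are all hypothetical, and it assumes the query tokens are simply handed to the 1-parameter function. Unlike the shipped pre-processor, which returns weighted `{token, weight}` tuples, this sketch returns plain tokens only.

```elixir
defmodule QuerySketch do
  # Hypothetical stop-word list, for illustration only.
  @stop_words MapSet.new(["and", "or", "the", "a"])

  # Hypothetical 1-parameter query pre-processor:
  # downcases each token and drops stop words.
  def pre_process_query(tokens) when is_list(tokens) do
    tokens
    |> Enum.map(&String.downcase/1)
    |> Enum.reject(&MapSet.member?(@stop_words, &1))
  end

  # Stand-in for compile_query/2 (assumption: it applies the
  # passed function to the list of query tokens).
  def compile_query(query, query_pre_process_func)
      when is_function(query_pre_process_func, 1) do
    query_pre_process_func.(query)
  end
end

QuerySketch.compile_query(["Bread", "and", "butter"], &QuerySketch.pre_process_query/1)
```

With this shape, any function of one parameter that accepts a list of tokens can be captured with `&Module.name/1` and passed as the second argument.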
crawl(data_folder, file_pre_process_func)

Crawls a folder and processes the content into tokens using the passed-in function. See the ContentIndexer.Services.PreProcess module for the pre-processing applied when the content is crawled.

## Parameters

- data_folder: String - folder name
- file_pre_process_func: Function with 2 parameters that is used to pre-process the token data

## Example

iex> ContentIndexer.Services.SearchUtils.crawl("test/fixtures", &ContentIndexer.Services.PreProcess.pre_process_content/2)
      [
        {"test1.md",
          ["test1", "this", "test", "file", "one", "two", "simpl", "line", "text"...]},
        {"test2.md",
          ["test2", "cook", "great", "hobbi", "nor", "again", "anyon", "who", "love"...]},
        {"test3.md",
          ["test3", "how", "about", "learn", "new", "music", "instrument", "year"...]}
      ]
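To illustrate how a 2-parameter pre-process function plugs into a crawl, here is a rough sketch. It is not the library's implementation: `CrawlSketch` and `simple_pre_process/2` are hypothetical, and the argument order `(content, file_name)` is an assumption inferred from the example output, where each token list starts with the file's base name.

```elixir
defmodule CrawlSketch do
  # Hypothetical stand-in for crawl/2. Reads every regular file in
  # `data_folder` and passes its content and file name to
  # `pre_process_func`, returning a list of {file_name, tokens} tuples.
  def crawl(data_folder, pre_process_func) when is_function(pre_process_func, 2) do
    data_folder
    |> File.ls!()
    |> Enum.sort()
    |> Enum.map(fn name -> {name, Path.join(data_folder, name)} end)
    |> Enum.filter(fn {_name, path} -> File.regular?(path) end)
    |> Enum.map(fn {name, path} ->
      {name, pre_process_func.(File.read!(path), name)}
    end)
  end

  # Hypothetical pre-processor, assumed argument order (content, file_name).
  # Downcases the content, splits it on non-alphanumeric characters, and
  # puts the file's base name first. No stemming or stop-word removal.
  def simple_pre_process(content, file_name) do
    base = Path.rootname(file_name)

    words =
      content
      |> String.downcase()
      |> String.split(~r/[^a-z0-9]+/, trim: true)

    [base | words]
  end
end
```

Under these assumptions, `CrawlSketch.crawl("test/fixtures", &CrawlSketch.simple_pre_process/2)` would return one `{file_name, tokens}` tuple per file, in the same shape as the output above but without the stemming seen there.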
crawl_async(data_folder, file_pre_process_func)

See the crawl function. This version does the same thing, just in the background using a Task.
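The general pattern of running a crawl in the background with a Task can be sketched as follows. This is an assumption about the approach, not the library's source: `CrawlAsyncSketch` is hypothetical, the crawl function is passed in as a third argument purely to keep the example self-contained, and whether the real crawl_async returns the Task or awaits it is not stated in these docs.

```elixir
defmodule CrawlAsyncSketch do
  # Hypothetical wrapper: starts `crawl_func` in a separate process
  # and returns the Task so the caller can collect the result later.
  def crawl_async(data_folder, file_pre_process_func, crawl_func)
      when is_function(crawl_func, 2) do
    Task.async(fn -> crawl_func.(data_folder, file_pre_process_func) end)
  end
end

# Fake crawl and pre-process functions, used only to show the flow.
fake_crawl = fn folder, pre_process ->
  [{folder, pre_process.("some content", "file.md")}]
end

fake_pre_process = fn content, file_name -> [file_name, content] end

task = CrawlAsyncSketch.crawl_async("docs", fake_pre_process, fake_crawl)

# The caller is free to do other work here, then block for the result.
result = Task.await(task, 5_000)
```

`Task.async/1` links the task to the calling process, so a crash during the crawl would also bring down the caller. `Task.Supervisor` is the usual alternative when that coupling is not wanted.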