Installation | Overview | Tokenization | Stemming | Stop Words | Search Index | Index Strategy

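Js Search: client-side search library

Js Search enables efficient client-side searches of JavaScript and JSON objects. It is ES5 compatible and does not require jQuery or any other third-party libraries.

Js Search began as a lightweight implementation of Lunr JS, offering runtime performance improvements and a smaller file size. It has since expanded to include a richer feature set: stemming, stop words, and TF-IDF ranking.

Here are some JS Perf benchmarks comparing the two search libraries. (Thanks to olivernn for tweaking the Lunr side for a better comparison!)

Initial building of the search index
Running a search

If you are looking for a simpler, web-worker optimized JS search utility, check out js-worker-search.

If you like this project, become a sponsor or ☕ buy me a coffee.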

Installation

You can install using either Bower or NPM, like so:

npm install js-search
bower install js-search

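Overview

At a high level, you configure Js Search by telling it which fields to index for searching, and then add the objects to be searched.

For example, a simple use of Js Search looks like this: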

import * as JsSearch from 'js-search';

var theGreatGatsby = {
  isbn: '9781597226769',
  title: 'The Great Gatsby',
  author: {
    name: 'F. Scott Fitzgerald'
  },
  tags: ['book', 'inspirational']
};
var theDaVinciCode = {
  isbn: '0307474275',
  title: 'The DaVinci Code',
  author: {
    name: 'Dan Brown'
  },
  tags: ['book', 'mystery']
};
var angelsAndDemons = {
  isbn: '074349346X',
  title: 'Angels & Demons',
  author: {
    name: 'Dan Brown'
  },
  tags: ['book', 'mystery']
};

// 'isbn' is the field that uniquely identifies each document.
var search = new JsSearch.Search('isbn');
search.addIndex('title');
search.addIndex(['author', 'name']); // nested field, given as a path
search.addIndex('tags');

search.addDocuments([theGreatGatsby, theDaVinciCode, angelsAndDemons]);

search.search('The');     // [theGreatGatsby, theDaVinciCode]
search.search('scott');   // [theGreatGatsby]
search.search('dan');     // [angelsAndDemons, theDaVinciCode]
search.search('mystery'); // [angelsAndDemons, theDaVinciCode]
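
The addIndex call above accepts either a field name or an array describing a path into a nested object. The following is a minimal sketch of how such a path could be resolved against a document. The helper name getNestedFieldValue is hypothetical and is not part of the js-search API; it only illustrates the idea.

```javascript
// Hypothetical helper (not part of js-search): resolve a field that is
// given either as a single name or as an array of nested property names.
function getNestedFieldValue(document, field) {
  var path = Array.isArray(field) ? field : [field];
  var value = document;
  for (var i = 0; i < path.length; i++) {
    if (value == null) {
      return undefined; // an intermediate object is missing
    }
    value = value[path[i]];
  }
  return value;
}

var book = {
  title: 'The Great Gatsby',
  author: { name: 'F. Scott Fitzgerald' }
};

var titleValue = getNestedFieldValue(book, 'title');
var authorName = getNestedFieldValue(book, ['author', 'name']);
var missingValue = getNestedFieldValue(book, ['publisher', 'name']);
```

In this sketch a missing intermediate object yields undefined rather than throwing, so documents that lack an optional nested field can simply be skipped for that index.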

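Tokenization

Tokenization is the process of breaking text (e.g. sentences) into smaller, searchable tokens (e.g. words or parts of words). Js Search provides a basic tokenizer that should work well for English, but you can provide your own like so: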

search.tokenizer = {
  tokenize(text /* string */) {
    // Convert text to an Array of strings and return the Array
  }
};
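
As a concrete illustration of the tokenizer shape shown above, here is a minimal sketch of a custom tokenizer that splits on whitespace and hyphens. The name hyphenAwareTokenizer and the splitting rule are assumptions chosen for this example; only the tokenize(text) method returning an array of strings comes from the text above.

```javascript
// Hypothetical custom tokenizer: splits on runs of whitespace and hyphens.
// It exposes the tokenize(text) method described above and returns an Array
// of strings.
var hyphenAwareTokenizer = {
  tokenize: function (text) {
    return text
      .split(/[\s\-]+/)
      .filter(function (token) {
        return token.length > 0; // drop empty strings from leading/trailing separators
      });
  }
};

var tokens = hyphenAwareTokenizer.tokenize('client-side  search library');
// tokens is ['client', 'side', 'search', 'library']

// With js-search loaded, the object could then be assigned as shown above:
// search.tokenizer = hyphenAwareTokenizer;
```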

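Stemming

Stemming is the process of reducing search tokens to their root (or "stem") so that searches for different forms of a word will still yield results. For example, "search", "searching", and "searched" can all be reduced to the stem "search".

Js Search does not implement its own stemming library, but it does support stemming through the use of third-party libraries.

To enable stemming, use the StemmingTokenizer like so: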

var stemmer = require('porter-stemmer').stemmer;

search.tokenizer =
  new JsSearch.StemmingTokenizer(
    stemmer, // Function should accept a string param and return a string
    new JsSearch.SimpleTokenizer());
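
To make the wrapping idea above concrete, the following sketch shows one way a stemming tokenizer could decorate another tokenizer. Everything here is hypothetical: createStemmingTokenizer is not the library's implementation, and toyStemmer is a deliberately crude suffix stripper, not the Porter algorithm.

```javascript
// Toy stemmer for illustration only. It strips a few common English
// suffixes and is NOT the Porter algorithm (it mangles words like "class").
function toyStemmer(word) {
  return word.replace(/(ing|ed|es|s)$/, '');
}

// Hypothetical decorator: tokenizes with the inner tokenizer, then stems
// every resulting token.
function createStemmingTokenizer(stemFn, innerTokenizer) {
  return {
    tokenize: function (text) {
      return innerTokenizer.tokenize(text).map(function (token) {
        return stemFn(token);
      });
    }
  };
}

// Simple whitespace tokenizer used as the inner tokenizer in this sketch.
var whitespaceTokenizer = {
  tokenize: function (text) {
    return text.split(/\s+/).filter(Boolean);
  }
};

var stemmingTokenizer = createStemmingTokenizer(toyStemmer, whitespaceTokenizer);
var stems = stemmingTokenizer.tokenize('search searching searched');
// stems is ['search', 'search', 'search']
```

Because all three word forms collapse to the same stem, a query for any one of them can match documents containing the others.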

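Stop Words

Stop words are very common words (e.g. a, an, and, the, of) that usually carry little semantic meaning. By default, Js Search does not filter these words, but filtering can be enabled by using the StopWordsTokenizer like so: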

search.tokenizer =
  new JsSearch.StopWordsTokenizer(
    new JsSearch.SimpleTokenizer());

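By default, Js Search uses a slightly modified version of the Google History stop words listed at www.ranks.nl/stopwords. You can modify this list of stop words by adding or removing values from the JsSearch.StopWordsMap object like so: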

JsSearch.StopWordsMap.the = false; // Do not treat "the" as a stop word
JsSearch.StopWordsMap.bob = true;  // Treat "bob" as a stop word

Note that stop words are lower case, so using a case-sensitive sanitizer may prevent some stop words from being removed.
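
The case-sensitivity caveat is easy to see in a small standalone sketch. The stopWords map and removeStopWords function below are hypothetical stand-ins written for this example; they mimic the map-of-booleans idea described above but are not the library's own code.

```javascript
// Hypothetical stop-word map, keyed by lower-case word.
var stopWords = { a: true, an: true, and: true, the: true, of: true };

// Hypothetical filter: keeps every token that is not flagged true in the map.
function removeStopWords(tokens, stopWordsMap) {
  return tokens.filter(function (token) {
    return stopWordsMap[token] !== true;
  });
}

// Lower-case tokens are filtered as expected.
var lowerCased = removeStopWords(['the', 'sound', 'of', 'music'], stopWords);
// lowerCased is ['sound', 'music']

// Tokens that kept their capitalization do not match the lower-case keys.
var mixedCase = removeStopWords(['The', 'Sound', 'Of', 'Music'], stopWords);
// mixedCase is ['The', 'Sound', 'Of', 'Music']

// Editing the map changes which words are filtered.
stopWords.the = false; // stop treating "the" as a stop word
stopWords.bob = true;  // start treating "bob" as a stop word
var afterEdit = removeStopWords(['the', 'bob', 'music'], stopWords);
// afterEdit is ['the', 'music']
```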

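Configuring the search index

There are two search indexes packaged with js-search.

Term frequency-inverse document frequency (or TF-IDF) is a numeric statistic intended to reflect how important a word (or words) is to a document within a corpus. The TF-IDF value increases proportionally to the number of times a word appears in the document, but is offset by how frequently the word appears across the corpus. This helps to adjust for the fact that some words (e.g. and, or, the) appear more frequently than others.

By default, Js Search supports TF-IDF ranking, but this can be disabled for performance reasons if it is not required. You can specify an alternate ISearchIndex implementation in order to disable TF-IDF, like so: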

// default
search.searchIndex = new JsSearch.TfIdfSearchIndex();

// Search index capable of returning results matching a set of tokens
// but without any meaningful rank or order.
search.searchIndex = new JsSearch.UnorderedSearchIndex();
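
The TF-IDF description above can be worked through with a small standalone calculation. This sketch uses a common smoothed textbook variant, idf = 1 + ln(N / (1 + df)); that formula, the function names, and the tiny corpus are assumptions made for illustration and are not claimed to match the scoring inside TfIdfSearchIndex.

```javascript
// Number of times the term occurs in one tokenized document.
function termFrequency(term, tokens) {
  return tokens.filter(function (token) {
    return token === term;
  }).length;
}

// Smoothed inverse document frequency (assumed variant, see lead-in).
// The +1 in the denominator avoids division by zero for unseen terms.
function inverseDocumentFrequency(term, corpus) {
  var containing = corpus.filter(function (tokens) {
    return tokens.indexOf(term) !== -1;
  }).length;
  return 1 + Math.log(corpus.length / (1 + containing));
}

function tfIdf(term, tokens, corpus) {
  return termFrequency(term, tokens) * inverseDocumentFrequency(term, corpus);
}

// Tiny corpus of already-tokenized, lower-cased titles.
var corpus = [
  ['the', 'great', 'gatsby'],
  ['the', 'davinci', 'code'],
  ['angels', 'and', 'demons']
];

var theScore = tfIdf('the', corpus[0], corpus);       // common word, low weight
var gatsbyScore = tfIdf('gatsby', corpus[0], corpus); // rare word, higher weight
var absentScore = tfIdf('gatsby', corpus[1], corpus); // term not in this document
```

In this corpus "the" appears in two of three documents while "gatsby" appears in one, so "gatsby" contributes more to the first document's score, and a term that is absent from a document contributes zero.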

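Configuring the index strategy

There are three index strategies packaged with js-search.

PrefixIndexStrategy indexes for prefix searches. (For example, the term "cat" is indexed as "c", "ca", and "cat", allowing prefix search lookups.)

AllSubstringsIndexStrategy indexes for all substrings. In other words, "c", "ca", "cat", "a", "at", and "t" all match "cat".

ExactWordIndexStrategy indexes for exact word matches. For example, "bob" will match "bob jones" (but "bo" will not).

By default, Js Search supports prefix indexing, but this is configurable. You can specify an alternate IIndexStrategy implementation in order to disable prefix indexing, like so: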

// default
search.indexStrategy = new JsSearch.PrefixIndexStrategy();

// this index strategy is built for all substrings matches.
search.indexStrategy = new JsSearch.AllSubstringsIndexStrategy();

// this index strategy is built for exact word matches.
search.indexStrategy = new JsSearch.ExactWordIndexStrategy();
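
The difference between the three strategies comes down to which strings a single token is expanded into at indexing time. The functions below are a hypothetical sketch of those expansions, written to reproduce the "cat" examples above; they are not the library's implementation, and the function names are invented for this example.

```javascript
// Prefix strategy: every leading substring of the token.
function expandPrefixes(token) {
  var result = [];
  for (var end = 1; end <= token.length; end++) {
    result.push(token.substring(0, end));
  }
  return result;
}

// All-substrings strategy: every contiguous substring of the token.
// Duplicates are not removed in this sketch (e.g. for a token like "aa").
function expandAllSubstrings(token) {
  var result = [];
  for (var start = 0; start < token.length; start++) {
    for (var end = start + 1; end <= token.length; end++) {
      result.push(token.substring(start, end));
    }
  }
  return result;
}

// Exact-word strategy: only the token itself.
function expandExactWord(token) {
  return [token];
}

var prefixes = expandPrefixes('cat');        // ['c', 'ca', 'cat']
var substrings = expandAllSubstrings('cat'); // ['c', 'ca', 'cat', 'a', 'at', 't']
var exact = expandExactWord('cat');          // ['cat']
```

The trade-off visible here is index size: a token of length n produces n entries under the prefix expansion, n(n+1)/2 under the all-substrings expansion, and one under the exact-word expansion.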


Additional information

Version: 1.0.0
Type: Other source code
Updated: 2024-12-26
Size: 131.83 KB
Source: GitHub
