ndx

1.0.0

A lightweight full-text indexing and searching library.

This library is designed for a specific use case: all documents are stored on disk (IndexedDB) and can be dynamically added to or removed from the index.

The query function supports only the disjunction operator. A query such as one two behaves as "one" or "two".
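
To illustrate the disjunction semantics described above, here is a minimal sketch in plain JavaScript. It does not use ndx: `postings` and `queryAny` are hypothetical names, and the map is a hand-built toy index rather than the library's data structure.

```javascript
// Toy inverted index (hypothetical data, not the ndx data structure):
// each term maps to the set of document keys that contain it.
const postings = new Map([
  ["one", new Set(["a", "b"])],
  ["two", new Set(["b", "c"])],
  ["three", new Set(["d"])],
]);

// Disjunctive (OR) matching: a document matches when it contains
// at least one of the query terms.
function queryAny(postings, query) {
  const matches = new Set();
  for (const term of query.split(" ")) {
    const docs = postings.get(term);
    if (docs === undefined) continue;
    for (const key of docs) matches.add(key);
  }
  return [...matches].sort();
}

// Documents containing "one" OR "two".
console.log(queryAny(postings, "one two")); // [ 'a', 'b', 'c' ]
```

Document "d" is not returned because it contains neither term, while "a" and "c" are returned even though each contains only one of the two.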

The inverted index does not store term positions, so the query function cannot search for phrases such as "Super Mario".
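
The following sketch shows why phrase search is out of reach without term positions. It is a toy index written for illustration only: `buildIndex` is a hypothetical helper and does not reflect how ndx stores its data.

```javascript
// Build a toy inverted index that records only which documents contain a
// term, not where the term occurs (hypothetical helper, not the ndx API).
function buildIndex(docs) {
  const index = new Map();
  for (const [key, text] of Object.entries(docs)) {
    for (const term of text.toLowerCase().split(" ")) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(key);
    }
  }
  return index;
}

const index = buildIndex({
  1: "Super Mario",
  2: "Mario is super",
});

// Both documents appear under "super" and under "mario", so the index
// alone cannot tell which one contains the exact phrase "Super Mario".
const superDocs = [...index.get("super")].sort();
const marioDocs = [...index.get("mario")].sort();
console.log(superDocs, marioDocs); // [ '1', '2' ] [ '1', '2' ]
```

Distinguishing the two documents would require storing the position of each term occurrence, which this kind of index deliberately leaves out.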

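There are many alternative solutions with different tradeoffs that may suit your particular use case better. For simple document search over a static dataset, consider using something like FST and deploying it as an edge function (WASM).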

Features

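Multi-field full-text indexing and searching.
Per-field score boosting.
BM25 ranking function to rank matching documents.
Trie-based dynamic inverted index.
Configurable tokenizer and term filter.
Free-text queries with query expansion.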
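
As a rough illustration of the configurable tokenizer and term filter, the sketch below runs a text through the two stages by hand. The functions have the same shape as the ones passed to `createIndex()` in the example that follows, but `toTerms` is a hypothetical helper, and dropping empty terms is a choice made in this sketch, not a statement about ndx behavior.

```javascript
// Tokenizer: splits on runs of whitespace instead of a single space.
const tokenizer = (s) => s.trim().split(/\s+/);

// Term filter: lowercases a token and strips leading and trailing
// characters that are not ASCII letters or digits.
const termFilter = (token) =>
  token.toLowerCase().replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, "");

// Hypothetical helper that chains both stages and drops empty terms.
const toTerms = (text) =>
  tokenizer(text).map(termFilter).filter((term) => term !== "");

console.log(toTerms("  Hello,   World! ")); // [ 'hello', 'world' ]
```

Swapping either function changes which terms end up in the index, for example to add stemming or stop-word removal.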

Example

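// Note: the original example calls `indexRemove()` and `indexVacuum()` without
// importing them; they are assumed here to be exported from "ndx".
import { createIndex, indexAdd, indexRemove, indexVacuum } from "ndx";
import { indexQuery } from "ndx/query";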

const termFilter = (term) => term.toLowerCase();

function createDocumentIndex(fields) {
  // `createIndex()` creates an index data structure.
  // The first argument specifies how many different fields we want to index.
  const index = createIndex(
    fields.length,
    // Tokenizer is a function that breaks text into words, phrases, symbols,
    // or other meaningful elements called tokens.
    (s) => s.split(" "),
    // Filter is a function that processes tokens and returns terms; terms are
    // used in the inverted index to index documents.
    termFilter,
  );
  // `fieldGetters` is an array of functions that will be used to retrieve
  // data from different fields.
  const fieldGetters = fields.map((f) => (doc) => doc[f.name]);
  // `fieldBoostFactors` is an array of boost factors for each field; in this
  // example all fields have identical weight.
  const fieldBoostFactors = fields.map(() => 1);
  // `removed` tracks the ids of removed documents. It is not declared in the
  // original example; a `Set` is assumed because `remove()` reads
  // `removed.size`.
  const removed = new Set();

  return {
    index,
    // `add()` will add documents to the index.
    add(doc) {
      indexAdd(
        index,
        fieldGetters,
        // Document key, it can be a unique document id or a reference to a
        // document if you want to store all documents in memory.
        doc.id,
        // Document.
        doc,
      );
    },
    // `remove()` will remove documents from the index.
    remove(id) {
      // When a document is removed we just mark its id as removed. The index
      // data structure still contains references to the removed document.
      indexRemove(index, removed, id);
      if (removed.size > 10) {
        // `indexVacuum()` removes all references to removed documents from the
        // index.
        indexVacuum(index, removed);
      }
    },

    // `search()` will be used to perform queries.
    search(q) {
      return indexQuery(
        index,
        fieldBoostFactors,
        // BM25 ranking function constants:
        // BM25 k1 constant, controls non-linear term frequency normalization
        // (saturation).
        1.2,
        // BM25 b constant, controls to what degree document length normalizes
        // tf values.
        0.75,
        q,
      );
    },
  };
}

// Create a document index that will index the `content` field.
const index = createDocumentIndex([{ name: "content" }]);

const docs = [
  {
    "id": "1",
    "content": "Lorem ipsum dolor",
  },
  {
    "id": "2",
    "content": "Lorem ipsum",
  },
];

// Add documents to the index.
docs.forEach((d) => { index.add(d); });

// Perform a search query.
index.search("Lorem");
// => [{ key: "2", score: ... }, { key: "1", score: ... }]
//
// The document with id `"2"` is ranked higher because its `"content"` field
// contains fewer terms than that of the document with id `"1"`.

index.search("dolor");
// => [{ key: "1", score: ... }]
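
To make the ranking in the example concrete, the sketch below scores the term "lorem" for both documents with one common formulation of BM25, using the same constants (k1 = 1.2, b = 0.75). The `bm25` function is a hypothetical stand-alone helper; whether ndx uses exactly this IDF variant is an assumption, so the absolute scores may differ from what the library returns.

```javascript
// One common BM25 formulation; whether ndx uses exactly this IDF variant is
// an assumption of this sketch.
function bm25(tf, docLen, avgDocLen, docCount, docsWithTerm, k1 = 1.2, b = 0.75) {
  // Inverse document frequency: rarer terms contribute more.
  const idf = Math.log(1 + (docCount - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
  // Length normalization: fields longer than average are penalized.
  const lengthNorm = 1 - b + b * (docLen / avgDocLen);
  return (idf * (tf * (k1 + 1))) / (tf + k1 * lengthNorm);
}

// Field lengths from the example: "Lorem ipsum dolor" (3 terms) and
// "Lorem ipsum" (2 terms); "lorem" occurs once in each of the 2 documents.
const avgDocLen = (3 + 2) / 2;
const scoreDoc1 = bm25(1, 3, avgDocLen, 2, 2);
const scoreDoc2 = bm25(1, 2, avgDocLen, 2, 2);

console.log(scoreDoc2 > scoreDoc1); // true: the shorter field scores higher
```

Both documents contain the term exactly once, so the only difference between the two scores comes from the length normalization controlled by `b`.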