npm
This is a failure-resilient replication process from the npm registry to an Algolia index. It replicates all npm packages to an Algolia index and keeps it up to date. The state of the replication is saved in the Algolia index settings.
The replication should always be running, with at most one instance per Algolia index at any time. If the process fails, restart it and the replication will continue from the last point it remembers.
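The resume-on-restart behaviour described above can be illustrated with a minimal sketch. Everything here is hypothetical: `stateStore` is an in-memory stand-in for the state kept in the Algolia index settings, and `replicate` is not the project's actual function.

```javascript
// Hypothetical in-memory stand-in for the state saved in the index settings.
const stateStore = {
  saved: { seq: 0 },
  load() {
    return { ...this.saved };
  },
  save(state) {
    this.saved = { ...state };
  },
};

// Handle changes in order and persist the sequence number after each one,
// so that a restarted process skips everything already handled.
function replicate(changes, store, handleChange) {
  let { seq } = store.load();
  for (const change of changes) {
    if (change.seq <= seq) continue; // handled by an earlier run
    handleChange(change);
    seq = change.seq;
    store.save({ seq });
  }
  return seq;
}

const changes = [
  { seq: 1, id: 'babel-core' },
  { seq: 2, id: 'babel-messages' },
  { seq: 3, id: 'lodash' },
];
const indexed = [];

// First run: a simulated crash happens while handling the third change.
try {
  replicate(changes, stateStore, (change) => {
    if (change.seq === 3) throw new Error('simulated crash');
    indexed.push(change.id);
  });
} catch (error) {
  // In production, a supervisor would restart the process here.
}

// Second run: resumes from the saved sequence and only handles seq 3.
const lastSeq = replicate(changes, stateStore, (change) => {
  indexed.push(change.id);
});
```

Saving the sequence only after a change has been handled means a crash can cause a change to be processed twice, but never skipped.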
The Algolia index is currently used, for free, by a few selected projects (e.g. yarnpkg.com, codesandbox.io, jsdelivr.com).
If you want to use this index in your project, please create a support request with Algolia Support.
This is an open source project for the community and is not supported by Algolia.
To be eligible your project must meet these requirements:
You can also use the code or the public Docker image to run your own instance (as of September 2021 it will create ~3M records x4).
For every npm package, we create a record in the Algolia index. The resulting records have the following schema:
{
name: 'babel-core',
downloadsLast30Days: 10978749,
downloadsRatio: 0.08310651682685861,
humanDownloadsLast30Days: '11m',
jsDelivrHits: 11684192,
popular: true,
version: '6.26.0',
versions: {
// [...]
'7.0.0-beta.3': '2017-10-15T13:12:35.166Z',
},
tags: {
latest: '6.26.0',
old: '5.8.38',
next: '7.0.0-beta.3',
},
description: 'Babel compiler core.',
dependencies: {
'babel-code-frame': '^6.26.0',
// [...]
},
devDependencies: {
'babel-helper-fixtures': '^6.26.0',
// [...]
},
repository: {
url: 'https://github.com/babel/babel/tree/master/packages/babel-core',
host: 'github.com',
user: 'babel',
project: 'babel',
path: '/tree/master/packages/babel-core',
branch: 'master',
},
readme: '# babel-core\n\n> Babel compiler core.\n\n\n [... truncated at 200kb]',
owner: {
// either GitHub owner or npm owner
name: 'babel',
avatar: 'https://github.com/babel.png',
link: 'https://github.com/babel',
},
deprecated: 'Deprecated', // This field will be removed, please use `isDeprecated` instead
isDeprecated: true,
deprecatedReason: 'Deprecated',
isSecurityHeld: false, // See https://github.com/npm/security-holder
badPackage: false,
homepage: 'https://babeljs.io/',
license: 'MIT',
keywords: [
'6to5',
'babel',
'classes',
'const',
'es6',
'harmony',
'let',
'modules',
'transpile',
'transpiler',
'var',
'babel-core',
'compiler',
],
created: 1424009748555,
modified: 1508833762239,
lastPublisher: {
name: 'hzoo',
email: '[email protected]',
avatar: 'https://gravatar.com/avatar/851fb4fa7ca479bce1ae0cdf80d6e042',
link: 'https://www.npmjs.com/~hzoo',
},
owners: [
{
email: '[email protected]',
name: 'thejameskyle',
avatar: 'https://gravatar.com/avatar/8a00efb48d632ae449794c094f7d5c38',
link: 'https://www.npmjs.com/~thejameskyle',
},
// [...]
],
lastCrawl: '2017-10-24T08:29:24.672Z',
dependents: 3321,
types: {
ts: 'definitely-typed', // definitely-typed | included | false
definitelyTyped: '@types/babel__core',
},
moduleTypes: ['unknown'], // esm | cjs | none | unknown
styleTypes: ['none'], // file extensions like css, less, scss or none if no style files present
humanDependents: '3.3k',
changelogFilename: null, // if babel-core had a changelog, it would be the raw GitHub url here
objectID: 'babel-core',
// the following fields are considered internal and may change at any time
_downloadsMagnitude: 8,
_jsDelivrPopularity: 5,
_popularName: 'babel-core',
_searchInternal: {
alternativeNames: [
// alternative versions of this name, to show up on confused searches
],
},
}
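The human-readable fields in the schema (`humanDownloadsLast30Days`, `humanDependents`) are abbreviated forms of the raw counts. As a rough sketch, the abbreviation could look like the following. `toHumanCount` is a hypothetical helper, and the exact rounding rules used by the project are an assumption.

```javascript
// Hypothetical helper: abbreviate a count with at most one decimal digit.
function toHumanCount(count) {
  const units = [
    { value: 1e9, suffix: 'b' },
    { value: 1e6, suffix: 'm' },
    { value: 1e3, suffix: 'k' },
  ];
  for (const { value, suffix } of units) {
    if (count >= value) {
      // Round to one decimal; whole numbers stringify without ".0".
      const rounded = Math.round((count / value) * 10) / 10;
      return `${rounded}${suffix}`;
    }
  }
  return String(count);
}

// Values taken from the babel-core record above.
const record = { downloadsLast30Days: 10978749, dependents: 3321 };
const humanDownloads = toHumanCount(record.downloadsLast30Days); // '11m'
const humanDependents = toHumanCount(record.dependents); // '3.3k'
```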
If you want to learn more about how Algolia's ranking algorithm works, you can read this blog post.
We're restricting the search to use a subset of the attributes only:
_popularName
name
description
keywords
owner.name
owners.name
Algolia provides default prefix search capabilities (matching words from only their beginning). This is disabled for the owner.name and owners.name attributes.
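The restrictions above map onto two Algolia index settings, `searchableAttributes` and `disablePrefixOnAttributes`. The object below is a sketch of what such settings could look like; the project's real settings may use modifiers or a different order.

```javascript
// Sketch of index settings. In Algolia, attributes listed earlier in
// searchableAttributes rank higher than those listed later.
const indexSettings = {
  searchableAttributes: [
    '_popularName',
    'name',
    'description',
    'keywords',
    'owner.name',
    'owners.name',
  ],
  // Owner names only match on whole words, never on prefixes.
  disablePrefixOnAttributes: ['owner.name', 'owners.name'],
};
```

With these settings, typing `bab` can match `babel-core` by name, but will not match an owner called `babel` until the full word is typed.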
Algolia provides default typo-tolerance.
Using the optionalFacetFilters feature of Algolia, we're boosting exact matches on the name of a package to always be on top of the results.
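As an illustration of the exact-match boost, a search request could carry an optional facet filter on the package name. `buildSearchParams` is a hypothetical helper, and filtering on `name` assumes that attribute is declared in `attributesForFaceting`; the attribute the project actually filters on may differ.

```javascript
// Hypothetical helper building the search parameters for a user query.
function buildSearchParams(query) {
  return {
    query,
    // Records whose `name` facet equals the query are boosted; records
    // that do not match are still returned because the filter is optional.
    optionalFacetFilters: [`name:${query}`],
  };
}

const params = buildSearchParams('babel-core');
```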
For each package, we use the number of downloads in the last 30 days as Algolia's customRanking setting. This is used to sort results that have the same textual relevance against each other.
For instance, a search for babel will match both babel-core and babel-messages. From a textual-relevance point of view, those two packages match in exactly the same way. In such a case, Algolia relies on the customRanking setting and therefore puts the package with the highest number of downloads in the past 30 days first.
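The tie-breaking described above can be modelled with a toy comparator. This is a simplification and not Algolia's real ranking pipeline: `textualScore` is a hypothetical stand-in for textual relevance, and the download count for babel-messages is made up for illustration.

```javascript
// Toy ranking: textual relevance first, then downloads as the tie-breaker.
function rank(hits) {
  return [...hits].sort(
    (a, b) =>
      b.textualScore - a.textualScore ||
      b.downloadsLast30Days - a.downloadsLast30Days
  );
}

const hits = [
  // Illustrative download count, not real data.
  { name: 'babel-messages', textualScore: 1, downloadsLast30Days: 5000000 },
  { name: 'babel-core', textualScore: 1, downloadsLast30Days: 10978749 },
];

// Both hits have the same textual score, so downloads decide the order.
const ordered = rank(hits).map((hit) => hit.name);
```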
Some packages will be considered as popular if they have been downloaded "more" than others. We currently consider a package popular if it either:
has more than 0.005% of the total number of npm downloads,
This popular flag is also used to boost some records over non-popular ones.
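The download-share criterion can be sketched as below. It covers only the 0.005% threshold, assumes a strict greater-than comparison, and uses a made-up total for npm downloads; the function names are hypothetical.

```javascript
// Threshold from the text, expressed as a percentage of all npm downloads.
const POPULAR_DOWNLOADS_RATIO = 0.005;

// Share of total npm downloads held by one package, in percent.
function downloadsRatio(packageDownloads, totalNpmDownloads) {
  return (packageDownloads / totalNpmDownloads) * 100;
}

function isPopularByDownloads(packageDownloads, totalNpmDownloads) {
  return (
    downloadsRatio(packageDownloads, totalNpmDownloads) >
    POPULAR_DOWNLOADS_RATIO
  );
}

// Illustrative total, not a real figure.
const totalNpmDownloads = 13e9;
const babelCoreIsPopular = isPopularByDownloads(10978749, totalNpmDownloads);
const tinyPackageIsPopular = isPopularByDownloads(100, totalNpmDownloads);
```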
yarn
apiKey=... yarn start
To restart from a particular point (or from the beginning):
seq=0 apiKey=... yarn start
This is useful when you want to completely resync the npm registry because:
seq represents a change sequence in CouchDB lingo.
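One way to picture how an explicit `seq` interacts with the saved state is the sketch below. `resolveStartSeq` is a hypothetical helper, and the precedence rule (environment variable over saved state) is an assumption based on the commands above.

```javascript
// Hypothetical helper: an explicit `seq` environment variable overrides
// the sequence number remembered from the previous run.
function resolveStartSeq(env, savedSeq) {
  if (env.seq === undefined || env.seq === '') return savedSeq;
  const parsed = Number.parseInt(env.seq, 10);
  if (Number.isNaN(parsed) || parsed < 0) {
    throw new Error(`Invalid seq: ${env.seq}`);
  }
  return parsed;
}

const resumed = resolveStartSeq({}, 42); // no override: continue at 42
const fullResync = resolveStartSeq({ seq: '0' }, 42); // start from scratch
```

In a real process the first argument would be `process.env`.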
Our goal with this project is to:
When the process starts with seq=0:
Replicate and watch are separated because:
See CONTRIBUTING.md