Intro

Project re-isearch is a novel multimodal search and retrieval engine built on mathematical models and algorithms that differ from the all-too-common inverted index. Its design places effectively no limits on word frequency, term length, number of fields, or complexity of structured data. It even supports overlap, where fields or structures cross each other's boundaries (common examples are quotes, lines/sentences, biblical verses, and annotations). This model enables a completely flexible unit of retrieval and flexible modes of search.
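
To make the idea of overlapping fields concrete, here is a minimal sketch in Python. It is not re-isearch's API or storage format: the `Field` type, the `crosses` and `fields_containing` helpers, and the sample text are all assumptions made for illustration. Fields are modelled as character ranges, and a quote is allowed to straddle a sentence boundary.

```python
from typing import NamedTuple

class Field(NamedTuple):
    """Hypothetical field record: a name plus a half-open character range."""
    name: str
    start: int  # inclusive offset
    end: int    # exclusive offset

def crosses(a: Field, b: Field) -> bool:
    """True if a and b overlap but neither fully contains the other."""
    overlap = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return overlap and not a_in_b and not b_in_a

def fields_containing(fields: list[Field], pos: int) -> list[str]:
    """Names of every field whose range covers the given offset."""
    return [f.name for f in fields if f.start <= pos < f.end]

text = 'He said "stop. Go back" and left.'
fields = [
    Field("sentence", 0, 14),   # He said "stop.
    Field("quote", 8, 23),      # "stop. Go back"
    Field("sentence", 15, 33),  # Go back" and left.
]

hit = text.find("Go")
print(fields_containing(fields, hit))  # ['quote', 'sentence']
print(crosses(fields[0], fields[1]))   # True: the quote crosses the sentence boundary
```

A strictly nested (tree-shaped) field model could not represent the quote above, because it starts in one sentence and ends in the next. Treating fields as independent ranges avoids that restriction.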

  • Low-code ETL / "Any-to-Any" NoSQL datastore architecture

  • Handles a wide range of document formats, natively or via filters, including “live” data.

  • Phrase, case, proximity, wildcard, parametric, range, phonetic, fuzzy, thesauri, polymorphic datatypes (including numeric, dates, geospatial, ranges etc.) and object capabilities.

  • Set-based, with an exhaustive collection of (binary and unary) set operations.

  • A number of different query languages including “smart” plain language expressions.

  • Fully customizable and extensible.

  • Plugin architecture that allows for binary-distributed (proprietary) third-party extensions.

  • Useful for Analytics, Recommendation / Autosuggestion and a host of other applications

  • Support for Peer-to-Peer and Federated architectures.

  • Flexible scripting language interfaces (including Python)

  • Tiny and efficient. Can run on low-powered systems (even embedded) with a minimum of RAM.

  • Freely available under a permissive software license.
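
The set-based model named in the feature list can be pictured with ordinary Python sets. This is only a sketch: the operator names (`op_and`, `op_andnot` and so on), the sample terms, and the record IDs are assumptions for illustration, not re-isearch's query syntax or internals.

```python
# Illustrative data only: term -> set of matching record IDs.
hits = {
    "search": {1, 2, 3, 5},
    "engine": {2, 3, 4},
    "lucene": {3, 5},
}
all_records = {1, 2, 3, 4, 5, 6}  # universe that the unary NOT complements against

# Binary operations: two result sets in, one result set out.
def op_and(a, b):
    return a & b

def op_or(a, b):
    return a | b

def op_andnot(a, b):
    return a - b

def op_xor(a, b):
    return a ^ b

# Unary operation: one result set in, one result set out.
def op_not(a, universe=all_records):
    return universe - a

# (search AND engine) ANDNOT lucene
result = op_andnot(op_and(hits["search"], hits["engine"]), hits["lucene"])
print(sorted(result))                                  # [2]
print(sorted(op_xor(hits["search"], hits["engine"])))  # [1, 4, 5]
print(sorted(op_not(hits["lucene"])))                  # [1, 2, 4, 6]
```

Because every operation returns another set, expressions compose freely: the output of one operation is a valid input to the next.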

Core engine development language: reduced subset of highly portable C++

Plugin extensions development language: C++

Application development languages: C++, C, Java, PHP, Python, R and Tcl/Tk, among others

Licence: Apache 2.0

A few possible paradigm-changing uses:

  • Multimedia/video: using a combination of speech-to-text and image-captioning pre-processing.

  • Distributed internet search on IPFS

  • Centroid federated search

Source: https://github.com/re-Isearch/re-Isearch

Comparisons to other engines (especially Lucene):

https://github.com/re-Isearch/re-Isearch/blob/master/docs/re-Isearch-vs-Others.pdf

Handbook (PDF):

https://github.com/re-Isearch/re-Isearch/blob/master/docs/re-Isearch-Handbook.pdf

Design Whitepaper:

https://github.com/re-Isearch/re-Isearch/blob/master/docs/re-Isearch-Design.pdf

FOSDEM’22 Talk:

https://fosdem.org/2022/schedule/event/lt_re_lsearch/ (~14 min)

References:

European Union N(ext) G(eneration) I(nternet): https://www.ngi.eu/funded_solution/re-isearch/

NLnet Foundation: https://nlnet.nl/project/Re-iSearch/

Visual Search: https://isea-archives.siggraph.org/art-events/metahaven-exodus-cross-search/

Comparisons

Market Overview

There are many companies in this space, and market caps are relatively high. Intrafind and Elastic are based on Apache Lucene.

Commercial market: ecommerce product search.

The engine has a number of unique features that set its possibilities apart from the standard solutions.

Elastic Path’s Java ecommerce platform is based on open source technologies such as Spring Framework, Apache OpenJPA, Eclipse RCP, Apache Solr, Apache Velocity, Groovy, Direct Web Remoting, jQuery and more. Since Apache Solr is itself built on Lucene, its search is basically Lucene as well.
