Intro

Project re-isearch: a novel multimodal search and retrieval engine using mathematical models and algorithms different from the all-too-common inverted index. The design allows it to have effectively no limits on the frequency of words, term length, number of fields or complexity of structured data and support even overlap—where fields or structures cross other's boundaries (common examples are quotes, line/sentences, biblical verse, annotations). Its model enables a completely flexible unit of retrieval and modes of search.

  • Low-code ETL / "Any-to-Any" NoSQL datastore architecture

  • Handles a wide range of native (and via filters) document formats including “live” data.

  • Phrase, case, proximity, wildcard, parametric, range, phonetic, fuzzy, thesauri, polymorphism datatypes (including numeric, dates, geospatial, ranges etc.) and object capabilities.

  • Set based with an exhaustive collection of (binary and unary) set operations.

  • A number of different query languages including “smart” plain language expressions.

  • Fully Customizable and extendable.

  • Plugin architecture that allows for binary distributed (proprietary) 3rd party extensions

  • Useful for Analytics, Recommendation / Autosuggestion and a host of other applications

  • Support for Peer-to-Peer and Federated architectures.

  • Flexible scripting language interfaces (including Python)

  • Tiny, efficient. Can run on low powered systems (even embedded) with a minimum of RAM

  • Freely available under a permissive software license.

Core engine development language: reduced subset of highly portable C++

Plugin extensions development language: C++

Application development language: Among others C++, C, Java, PHP, Python, R, Tcl/Tk

Licence: Apache 2.0

A few possible paradigm changing uses:

  • Multi-media/video: using a combination of speech to text and image captioning pre-procesing.

  • distributed internet search on IPFS

  • Centroid federated search

Source: https://github.com/re-Isearch/re-Isearch

Comparisons to other engines (Especially Lucene):

https://github.com/re-Isearch/re-Isearch/blob/master/docs/re-Isearch-vs-Others.pdf

Handbook (PDF):

https://github.com/re-Isearch/re-Isearch/blob/master/docs/re-Isearch-Handbook.pdf

Design Whitepaper:

https://github.com/re-Isearch/re-Isearch/blob/master/docs/re-Isearch-Design.pdf

FOSDEM’22 Talk:

https://fosdem.org/2022/schedule/event/lt_re_lsearch/ (~14 min)

References:

European Union N(ext) G(eneration) I(nternet): https://www.ngi.eu/funded_solution/re-isearch/

Nlnet Foundation: https://nlnet.nl/project/Re-iSearch/

Visual Search: https://isea-archives.siggraph.org/art-events/metahaven-exodus-cross-search/

Comparisons

re-Isearch

Typesense

Algolia

ElasticSearch

Meilisearch

intraFind

Open Source?

Yes

Yes

No

Source-available

Yes

No

License

Apache 2.0

GPL 2

Commerical

SSPL

MIT

Commercial

First Commit

1992,2020

2015

2012

2010

2018

2010

Built Using

C++

C++

C++

Java

Rust

Java

Core Search Algorithm

Own

Own

Own

Lucene

Own

Lucene

Primary Index Location

Disk, exploits virtual memory system

RAM

RAM

Disk, with RAM cache

Disk with Memory Mapped files

Disk, with RAM cache

re-Isearch

MarkLogic

Elasticsearch

Apache Solr

NoSQL search engine

Operational and transactional Enterprise NoSQL database

A distributed, RESTful modern search and analytics engine

A widely used distributed, scalable search engine

NativeXML DBMS, RDF Store, search engine

NativeXML DBMS, RDF Store, search engine

Search engine

Search engine

Object DBMS including Spatial

Document store

Spatial DBMS

Spatial DBMS

1994-2011, reborn 2021

Since 2001

Since 2010

Since 2006

C++

C++

Java

Java

Free

Commerical

Partially Free

Free

Open Source

Proprietary

Open Source

Open Source

XML support

XML support

JSON Only

XML support

Foreign keys, Join

No foreign keys

No foreign keys

No foreign keys

Schema-Free

Schema-Free

Schema-Free

Schema

Multi-language API, Z39.50, SRU/W. CQL, IB Query Language,...

Multi-language API, Xquery, SPARQL,

Java API

RESTful HTTP/JSON API

Java API

RESTful HTTP/JSON API

Search during index

Search during index

No

No

Own algorithms

Own algorithms

Based on Lucene

Based on Lucene

Market Overview

There are many of the companies in this space and market caps are relatively high. Intrafind and Elastic are based on Apache Lucene.

Commercial market: ecommerce product search.

The engine has a number of unique features that set its possibilites apart from the standard solutions.

Elastic Path’s Java ecommerce platform is based on open source technologies such as Spring Framework, Apache OpenJPA, Eclipse RCP, Apache Solr, Apache Velocity, Groovy, Direct Web Remoting, jQuery and more. So basically … Lucene….

Last updated