Use re-Isearch in Python

A Short Introduction

re-Isearch is implemented mainly in C/C++. Its Python bindings are generated with SWIG (Simplified Wrapper and Interface Generator).

Install SWIG

You may need to install the PCRE library:

 sudo apt-get install libpcre3 libpcre3-dev

After downloading the SWIG source archive (swig-2.0.12.tar.gz in this example), unpack, build and install it:

tar -xzvf swig-2.0.12.tar.gz
cd swig-2.0.12
./configure
make
sudo make install
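
To confirm that the installation worked, you can check that a swig executable is reachable on your PATH. The following is a minimal sketch; the helper name find_on_path is made up for this example and is not part of SWIG or re-Isearch.

```python
from __future__ import print_function
import os

def find_on_path(program):
    """Return the full path of `program` if it is found on PATH, else None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, program)
        # Accept only regular files that the current user may execute
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

swig_path = find_on_path("swig")
if swig_path:
    print("Found swig at", swig_path)
else:
    print("swig not found on PATH; check the installation step above")
```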

Next, go to the swig folder in your local copy of re-Isearch:

cd /re-Isearch/swig

In that folder, set PYVERSION to the Python version you are using.

Make sure you have the appropriate python-dev package installed. It provides the header files needed to compile C/C++ extensions for Python. As of now, only Python 2.x is supported.
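
One way to check both requirements is to ask the interpreter itself where its headers should live. This is only a sketch: python_headers_present is a hypothetical helper name, and it assumes Python.h sits directly in the include directory reported by sysconfig, which is the usual layout but may differ on some systems.

```python
from __future__ import print_function
import os
import sys
import sysconfig

def python_headers_present():
    """Return True if Python.h exists in this interpreter's include directory."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.isfile(os.path.join(include_dir, "Python.h"))

print("Python version:", sys.version.split()[0])
print("Python.h found:", python_headers_present())
if sys.version_info[0] != 2:
    # The binding described here targets Python 2.x only
    print("Note: this interpreter is not Python 2.x")
```

Run it with the same interpreter you named in PYVERSION, since each Python installation has its own include directory.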

Now simply run

make Python

This creates a shared object file that can be loaded from your Python scripts.
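
Before moving on, it can help to verify that Python can actually load the module. The sketch below assumes the shared object ends up in the swig folder and is importable as IB (the name used by the scripts in this guide); try_import_ib is a hypothetical helper written for this example.

```python
from __future__ import print_function
import sys

def try_import_ib(build_dir):
    """Try to import the IB module from build_dir; return the module or None."""
    if build_dir not in sys.path:
        # Make the directory holding the shared object visible to the importer
        sys.path.insert(0, build_dir)
    try:
        import IB
    except ImportError as exc:
        print("Could not import IB from %s: %s" % (build_dir, exc))
        return None
    return IB

# Assumed location of the built module; adjust to your checkout
ib = try_import_ib("/re-Isearch/swig")
if ib is not None:
    print("IB module loaded from", ib.__file__)
```

If the import fails, the usual suspects are a PYVERSION that does not match the interpreter you are running, or a build directory that is not on sys.path.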


To create an index, run the following script in a folder containing shakespeare.xml (found in /re-Isearch/test/data):

import sys
from IB import *

# Path/name of the index database (placeholder value, choose your own)
junk = "SHAKESPEARE"

pdb = IDB(junk)
print "This is PyIB version %s/%s" % (sys.version.split()[0], pdb.GetVersionID())
if not pdb.IsDbCompatible():
    raise ValueError("The specified database %r is not compatible with this version. Re-index!" % junk)

# Queue the file for indexing
pdb.AddRecord("shakespeare.xml")

# Build the index; a false return value signals failure
if not pdb.Index():
    print "Indexing error encountered"


To run an example search against that index, use the following script:

import sys
from IB import *

# Must match the database name used when indexing (placeholder value)
junk = "SHAKESPEARE"

pdb = IDB(junk)
print "This is PyIB version %s/%s" % (sys.version.split()[0], pdb.GetVersionID())
if not pdb.IsDbCompatible():
    raise ValueError("The specified database %r is not compatible with this version. Re-index!" % junk)

sentence = "to be or not to be"
#sentence = "Hate Christian OR"
#sentence = "Hate Jew OR"

squery = SQUERY(sentence)
print squery
# The QUERY object passed to the search below. As written it is created
# empty and is not built from squery; check the IB API for how to attach
# the parsed query.
query = QUERY()

elements = pdb.GetTotalRecords()

print "Database ", junk, " has ", elements, " elements"

total = 10
if elements > 0:
    rset = pdb.VSearchSmart(query)
    print type(rset)
    print rset
    total = rset.GetTotalEntries()
    print "Searching for: ", query
    print "Got = ", total, " Records"
    # Print the results; entries are numbered from 1
    for i in range(1, total + 1):
        result = rset.GetEntry(i)
        # Matching text, with the hit delimited by "____" markers
        area = pdb.Context(result, "____", "____")
        datum = result.GetDate()

        score = result.GetScore()
        hits = result.GetHitTable()
        print "[", i, "] ", rset.GetScaledScore(score, 100), " ", score, " ", pdb.Present(result, ELEMENT_Brief)
        print "\tFormat: ", result.GetDoctype()
        print "\tFile:", result.GetFullFileName(), "  [", result.GetRecordStart(), "-", result.GetRecordEnd(), "]"
        print "\tDate: ", datum.RFCdate()
        print "\tMatch: ", area
else:
    print 'Empty Index!'

pdb = None  # Release the database object
