Writings mostly about Lotus Notes/Domino...by me :
Jesper Kiaer,Espergærde, Denmark

Looking for a Notes/Domino developer? I'm available


Quick guide on how to index data from IBM Domino databases in the Apache Solr search engine
Full Text indexing has always been a great feature of IBM Notes and Domino. In the old days it was rare to see other systems with Full Text indexing, and it was a truly unique and useful feature.

Unfortunately for IBM Notes and Domino, two things eroded that advantage.
- IBM did not keep improving the Full Text engine: a new engine arrived in release 5, but since then only minor upgrades have been made.
- A new search engine appeared when Doug Cutting created the Java-based Full Text engine Lucene. By 2001 it was part of Apache as Open Source, and it has grown ever since.
It has since been the foundation of, or inspiration for, most Full Text engines.

Apache Solr is an Enterprise Search Engine based on Lucene. It is free and Open Source, so why not have a look at it?

At IBM Connect 2014 IBM announced that Solr will be used to index mail databases in a later major release...whatever that means...

In this quick tutorial I will show you how to set it up and have it running on some of your Domino data fast.

What is Apache Solr?


From their website:

"SolrTM is the popular, blazing fast open source enterprise search platform from the Apache LuceneTM project. Its major features include powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. .."

The goals of this quick tutorial:

- setting up a Solr server for testing
- have Solr index a Domino database
- query the data

1. The IBM Domino database to be indexed
The database to be indexed will be a simple web-enabled database.

One form which has 3 fields:

Subject - a Text field
Body - a Rich Text field
Attachments - a Rich Text field for attachments

2. Installation of Apache Solr
For this tutorial I will take the easy route.
This means downloading Apache Solr 4.10 and just using a modified version of the included example.
Start by downloading Solr at http://lucene.apache.org/solr/.
Unpack the files anywhere you want.
In this tutorial I will just run it on my desktop PC.

3. Preparing Solr for Domino data
The most important files in Solr are the files schema.xml and solrconfig.xml.


Solr can handle data as dynamic or static fields. We will define a few static fields.

In the schema.xml file we will add

<field name="title" type="text_general" indexed="true" stored="true" multiValued="false"/>
<field name="subject" type="text_general" indexed="true" stored="true"/>
<field name="body" type="text_general" indexed="true" stored="true"/>
<field name="docurl" type="string" indexed="true" stored="true"/>
<field name="domino_doc_type" type="string" indexed="true" stored="true"/>

The field id is very important: it is the key of the Solr document. We will use the UNID of the Domino document as the id of our Solr documents.

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />

4. Getting data from IBM Domino into Solr
There are different ways of getting data into Solr.
You can use the Data Import Handler (DIH), which is part of Solr, to import data from an RDBMS, XML, file systems or websites, or you can use Apache Nutch, Apache ManifoldCF and more.

What I will use is the Java API called SolrJ.

For Domino documents the steps are:

- Connect to the Solr server.
- For each Domino document, create a Solr document and fill in the fields with data from the Domino document.
- If the Domino document has attachments, upload them to the server.
- For every 1,000 documents, commit the Solr documents to the server.

I will be using DIIOP to get the data from the Domino database. This is not exactly the fastest way to get data, but for this purpose it is fine.

- Connect to server

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SolrImporter {
    Collection<SolrInputDocument> solrDocs = new ArrayList<SolrInputDocument>();
    HttpSolrServer solrServer;
    String solrUrl = "http://127.0.0.1:8983/solr/";

    public void init() {
        solrServer = new HttpSolrServer(solrUrl);
        solrServer.setSoTimeout(10000);
        solrServer.setConnectionTimeout(10000);
        solrServer.setDefaultMaxConnectionsPerHost(100);
        solrServer.setMaxTotalConnections(100);
        solrServer.setFollowRedirects(false);
        solrServer.setAllowCompression(true);
        solrServer.setMaxRetries(1);

        try {
            solrServer.deleteByQuery("*:*"); // delete everything!
        } catch (SolrServerException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

In this demo, every time I import data all data in Solr is deleted first.
In real life you would of course update the Solr documents by their ID.

- Creating the Solr documents:

In Solr data is saved in documents, which are much like Domino documents.

SolrInputDocument solrDoc = new SolrInputDocument();
solrDoc.addField("id", doc.getUniversalID(), 1.0f);
solrDoc.addField("subject", doc.getItemValueString("subject"), 1.0f);
solrDoc.addField("body", ((RichTextItem)doc.getFirstItem("Body")).getFormattedText(false, 0, 0), 1.0f);
solrDoc.addField("docurl", "http://jezzper.com/"+ db.getFilePath()+"/0/"+doc.getUniversalID());
solrDoc.addField("domino_doc_type","document");

//add to collection of docs to be submitted
boolean result = solrDocs.add(solrDoc);

Notice the last parameter: it gives you the possibility to "boost" the field (give it greater weight) in the search index, or the opposite. A number larger than 1.0 boosts the field; a number smaller than 1.0 de-emphasizes it.

For performance's sake, don't commit after every Solr document.
Here I commit the collected documents to the Solr server for every 1,000 documents:

if ((i % 1000) == 0) {
    solrServer.add(solrDocs);
    solrServer.commit();
    solrDocs.clear();
}

- For attachments I will use another approach:
For each attachment, I will extract the attachment to the file system and then upload the file to the Solr server.

while (e.hasMoreElements()) {
    EmbeddedObject eo = (EmbeddedObject) e.nextElement();
    if (eo.getType() == EmbeddedObject.EMBED_ATTACHMENT) {
        try {
            // extract the attachment to the local file system
            eo.extractFile("c:\\extracts\\" + eo.getSource());
        } catch (Exception e2) {
            e2.printStackTrace();
        }
        try {
            // upload the extracted file; the Solr id is UNID + "." + file name
            AttachmentImporter.Upload(solrServer, "c:\\extracts\\" + eo.getSource(),
                    doc.getUniversalID() + "." + eo.getSource());
        } catch (Exception e1) {
            e1.printStackTrace();
        }
    }
}

Solr uses Apache Tika which can detect and extract metadata and text content from various document types.

We will send the attachments to the Solr server, which will detect the content type and automatically extract metadata and content from the files.

The benefit of this is that it is very easy to do, but on the other hand you do not want to send a 2 GB AVI file to the server just to get the metadata extracted.
In that case you might want to consider a solution where you extract the metadata yourself and only save that in a Solr document.
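As a small sketch of such a guard (the 100 MB threshold and the shouldUploadToTika helper are my own illustration, not from the original code), you could check the file size before uploading:

```java
import java.io.File;

public class UploadGuard {
    // Upper bound for files we are willing to stream to the Solr
    // /update/extract handler; larger files should have their metadata
    // extracted locally instead of being uploaded.
    public static final long MAX_UPLOAD_BYTES = 100L * 1024 * 1024; // 100 MB

    // Returns true when the file exists and is small enough to upload.
    public static boolean shouldUploadToTika(File f) {
        return f.isFile() && f.length() <= MAX_UPLOAD_BYTES;
    }
}
```

With something like this in place, the attachment loop could skip oversized files and index only their name and size.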

For the ID we will use the Domino UNID + "." + the attachment's internal name.
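As a small illustration of that key (the attachmentId helper name is mine, not from the post), stripping any Windows path prefix the same way the Upload method does:

```java
public class SolrIds {
    // Builds the Solr document id for an attachment: the parent document's
    // UNID, a dot, and the attachment's file name (any "c:\..." path
    // prefix is stripped off first).
    public static String attachmentId(String unid, String filePath) {
        String name = filePath.substring(filePath.lastIndexOf('\\') + 1);
        return unid + "." + name;
    }
}
```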

public static void Upload(HttpSolrServer solrServer, String fileName, String id)
        throws IOException, SolrServerException {
    ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
    req.setParam("literal.id", id);
    req.setParam("literal.attachment_name", fileName.substring(fileName.lastIndexOf('\\') + 1));
    req.setParam("literal.domino_doc_type", "attachment");

    // unknown fields get the attr_ prefix; the extracted text goes to attr_content
    req.setParam("uprefix", "attr_");
    req.setParam("fmap.content", "attr_content");

    File attFile = new File(fileName);
    // empty content type lets Tika detect the content type itself
    req.addFile(attFile, "");

    req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
    solrServer.request(req);

    // workaround: release the request and force GC so the tmp attachment
    // file can be deleted on Windows
    req = null;
    System.gc();
    System.out.println("Is attachment deleted from tmp directory? " + attFile.delete());
}

5. The Solr administrator console
You can manage, analyze and query the Solr server from a browser at http://localhost:8983/solr/



Select the core called "collection1" and you will get the menu to analyze and query the data.



6. Query the data
The basic form of a query in Solr is

field:search text

so body:pdf means search for the text "pdf" in the body field.
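Such a query can be sent to the server as a plain HTTP GET. A minimal sketch that just builds the request URL (the core name "collection1" and the wt=json parameter are assumptions matching the defaults used in this tutorial):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SolrQueryUrl {
    // Builds a select URL like:
    // http://127.0.0.1:8983/solr/collection1/select?q=body%3Apdf&wt=json
    public static String build(String baseUrl, String core, String query) {
        try {
            return baseUrl + "/" + core + "/select?q="
                    + URLEncoder.encode(query, "UTF-8") + "&wt=json";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```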

The response can be served in many formats (JSON, XML, CSV etc.) when querying the Solr server.
Another way is to use SolrJ again to query and handle the result, for example in an XPage, or to use the JSON output in Dojo.

You can send data from SolrJ to the Solr server as Java Beans, and also get the result back as Java Beans.

An example of a document returned as JSON in a result :

{ "id": "4A4DF8CF83AFFC8FC1257D6000416FCB",
"subject": "Elvis Costello",
"body": "Elvis Costello was in Denmark and visited the Elvis Presly Graceland museum in \nRanders\n\nA picture of Elvis Costello ",
"docurl": "http://jezzper.com/jezzper/Solr.nsf/0/4A4DF8CF83AFFC8FC1257D6000416FCB",
"domino_doc_type": "document",
"_version_": 1480501411752968200
},

An example of all 7 documents returned (3 Domino documents and 4 attachments):

Search *:* gives:

As XML: Search result.xml
As JSON: Search result.JSON


This was just a quick example of getting up and running with Domino data in Solr.
There is much more to Solr so I will probably do some more blogging on Solr over the next months, if I can find the time.

Downloads
Domino data export to Solr Search Engine Source Code
Libraries needed for SolrJ 4.10
Demo Domino database for Solr indexing

Published by: Jesper B. Kiær at 29-09-2014 00:45:00 Full Post

XPages: Creating PDF files from an XPage or HTML with no coding or installation of software
It is often a wish from users to be able to get a PDF file of what is on an XPage.

Normally the user will install software so they can print to a PDF file.
There are free ones, which often contain malware, or commercial ones, which can be quite expensive.
Other options are hand-coding the PDF file with iText or PDFBox etc., which can be a large task.

So lately I have been testing the open source solution wkhtmltopdf, which I have been looking at for some time.
I have been testing it by making PDFs from different web sites, and there are some issues with font kerning, iframes not loading correctly, etc.
But if you are in control of the HTML you can work around these minor issues.

There is an external C API for creating the PDF file, which I am calling from Java.
So basically all you need is a URL and the name of the PDF file to be created.
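A minimal sketch of that idea from Java, shelling out to the wkhtmltopdf command-line tool rather than binding the C API the post uses (assumes wkhtmltopdf is installed and on the PATH; the helper names are mine):

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class HtmlToPdf {
    // Builds the command line: wkhtmltopdf <url> <output.pdf>
    public static List<String> buildCommand(String url, String pdfFile) {
        return Arrays.asList("wkhtmltopdf", url, pdfFile);
    }

    // Runs the tool and waits for it to finish; 0 means success.
    public static int convert(String url, String pdfFile)
            throws IOException, InterruptedException {
        return new ProcessBuilder(buildCommand(url, pdfFile))
                .inheritIO()
                .start()
                .waitFor();
    }
}
```

For example, convert("http://example.com", "example.pdf") would render that page to example.pdf.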

An example:

Jesse Gallagher's great article today "Setting up nginx in Front of a Domino Server"
would give this PDF file:
Setting up nginx in Front of a Domino Server.pdf

You might need to resize your browser's width to get the same result.

You can create images from HTML as well

Published by: Jesper B. Kiær at 19-09-2014 13:43:00 Full Post

Goodbye NSF ...welcome XSF ...XPages Storage Facility
IBM Notes as a development platform has been getting a good overhaul in recent years with the arrival of XPages.

This has really boosted IBM Notes/Domino, breathed new life into it, and opened up a lot of new possibilities.

On the other hand, the document database, the Notes Storage Facility (NSF), has grown old (but maybe not ugly yet),
and there has been no major evolution for many years.

It is ridiculous that we have to deal with ancient restrictions in 2014 like:

• It is slow or at least "not fast"
• Databases have a maximum size of 64 GB
• Field size limits (32/64 KB)
• 64 KB result limits on @DbLookup, @DbColumn etc.
• Can only create/edit 100 docs per second, otherwise it leads to "Time Creep" on the server




The most promising Document and Graph Database today is OrientDB.
In many ways it resembles IBM Notes, but it is modern and has a lot of cool stuff like:

• It is a Document and Graph Database
• Written in Java
• Master <-> Master replication like IBM Notes
• Virtually no limits
• You query with SQL
• Supports transactions
• and much more...

and it is FAST!

In 2013 I did a simple test in IBM Notes and OrientDB.
Create 10,000 documents, each with 2 text fields and one attachment of 7 KB.

In IBM Notes it took 191 seconds; in OrientDB it took 19 seconds!!
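That is roughly a tenfold difference. As a quick sanity check of those numbers (simple integer arithmetic, not from the original post):

```java
public class Throughput {
    // Documents created per second, rounded down.
    public static long docsPerSecond(long docs, long seconds) {
        return docs / seconds;
    }
}
```

So about 52 documents per second in IBM Notes versus about 526 per second in OrientDB.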

So IBM, at Connect 2014 I really hope you have seen the light and the need for a new document and graph database for IBM Notes/Domino (well, for XPages only).
You can keep the NSF for the old stuff.

Please announce that 2014 will be the year IBM Notes releases a new and modern document and graph database for XPages.

Feel free to name it XPages Storage Facility, or just XSF.

Published by: Jesper B. Kiær at 23-01-2014 23:30:00 Full Post

Beware @MailDbName now returns a different result in version 9.0.1 of IBM Notes
Just a slight warning about @MailDbName in version 9.0.1 of IBM Notes.

I have a customer with a lot of old code which relies on @MailDbName, and it has worked for years.
Last year they started to use managed replicas for mail databases, and @MailDbName would still return values as if the mail database was server based.
In version 9.0.1 this has changed, and @MailDbName now returns an empty string, as if it is a local replica.

Published by: Jesper B. Kiær at 26-11-2013 11:15:47 Full Post

The extremely slow categorized views have been fixed in IBM Domino 9.0.1. Here is how.
In a former post on my blog I wrote about view panels in XPages in IBM Domino being very, very slow.

Have a look at http://nevermind.dk/nevermind/blog.nsf/subject/viewpanel-performance-issue-confirmed-by-ibm.

Philippe Riand wrote to me some time ago that a fix would be available in 8.5.4 aka 9.0, but apparently the fix did not make it in time.
But in 9.0.1 there is a fix for the issue :-)

In the "What's new for Developers in IBM Domino & Domino Designer 9.0.1" webinar (a must see) you can see how to use the new navigator setting.
No wait ...I'll show you.

Go to xsp.properties and add the setting:

xsp.domino.view.navigator=ByNoteId

Done!



(the old way is xsp.domino.view.navigator=ByPosition)

I have just tested the old troublesome database.
The old way is still very, very slow, but after changing to the new way of getting data by NoteId it now works and it is really fast! :-)

Published by: Jesper B. Kiær at 11-11-2013 12:40:00 Full Post

Is Sametime just bloated and too complicated?
I stumbled upon this 970-page slideshow, "IBM Sametime 9 Complete - Basic Features Installation - From Zero To Hero", from IBM on Sametime 9.0:


http://www.slideshare.net/FrankAltenburg/ibm-sametime-9-complete-basic-features-installation-from-zero-to-hero-for-domino-ldap-version-10

Really?? 970 pages? ...and these are only the basic features.

OK, if the product was a rocket going to the Moon, this might be an appropriately sized slideshow.

Sametime is living in the past and has just become way too bloated and complicated

Anyway, most organizations would most likely be happier with a simpler and cheaper (free) solution.

Take a secure chat server like Openfire: it installs and configures in less than 10 minutes, can be used with LDAP (Domino), and has a Java API, a chat client, and web and mobile clients.
It has web video conferencing (WebRTC) via a plugin, connects to an Open Source PBX like Asterisk, and more...

just sayin'....

Published by: Jesper B. Kiær at 10-11-2013 22:46:00 Full Post