Replacing the Relevanssi attachment server

There may be situations where it’s necessary to replace the Relevanssi attachment indexing server with your own server. If your documents are sensitive, you may not be allowed to have them processed on a third-party server.

It’s possible to make Relevanssi use a custom server. This requires three steps: setting up the attachment reading server, setting up an intermediary layer between Relevanssi and that server, and changing the server URL in Relevanssi.

Setting up your own indexing server

First of all, you need to set up your own server. I recommend Apache Tika: it’s the best tool I know for reading attachment files. The easiest way to set up Tika is to run it in a Docker container.

Setting up a Docker container is an easy process, and I’m not going to go into it here; there are plenty of good guides for that. The Docker documentation is a good place to start.
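
Once the container is running, it can be worth confirming that Tika answers before moving on. The following is a minimal sketch, not part of Relevanssi: it assumes the official apache/tika image with Tika’s default port 9998 published on localhost, and the function name rlv_sketch_tika_version is made up for illustration. It asks the Tika server’s /version endpoint for its version string.

```php
<?php
/**
 * Hypothetical helper: returns the version string reported by a Tika server,
 * or null if the server cannot be reached or does not answer with HTTP 200.
 */
function rlv_sketch_tika_version( string $tika_base_url ): ?string {
    $ch = curl_init( rtrim( $tika_base_url, '/' ) . '/version' );
    curl_setopt_array(
        $ch,
        array(
            CURLOPT_RETURNTRANSFER => true, // Return the body instead of printing it.
            CURLOPT_CONNECTTIMEOUT => 5,    // Give up quickly if nothing is listening.
            CURLOPT_TIMEOUT        => 10,
        )
    );
    $body   = curl_exec( $ch );
    $status = curl_getinfo( $ch, CURLINFO_HTTP_CODE );

    if ( false === $body || 200 !== $status ) {
        return null;
    }
    return trim( $body );
}

// Assumed URL: Tika's default port on the local machine. Adjust to your setup.
$version = rlv_sketch_tika_version( 'http://localhost:9998' );
echo null === $version ? "Tika is not reachable\n" : "Connected to {$version}\n";
```

If this reports that Tika is not reachable, check that the container is running and that the port is published before continuing with the next step.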

Setting up the intermediary server

Setting up the Tika server is not enough, because while it does great work reading the attachment contents, it does not work smoothly with Relevanssi. There needs to be an intermediary layer that will receive the files from Relevanssi and send them to Tika, and then receive the response and send it back to Relevanssi.

This doesn’t require a separate server. This step can just be a PHP script on the same server as your WordPress setup. I’ve shared a fairly basic version of the script on Gist, which should get you started.

This file needs to be uploaded to your server as index.php in a directory of its own. Let’s assume it’s at https://www.example.com/intermediary/index.php. You also need to edit the file to put in the URL of your own Tika server.
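
To give an idea of what the Tika-facing half of such a script does, here is a minimal sketch. It is not the Gist script: the function names rlv_sketch_tika_endpoint and rlv_sketch_extract_text are made up for illustration, and the Tika base URL is whatever your own server uses. It only shows the step of handing file contents to Tika’s /tika endpoint and getting plain text back; how Relevanssi sends the file and what response it expects is left to the Gist script.

```php
<?php
/**
 * Hypothetical helper: builds the URL of Tika's text extraction endpoint
 * from the base URL of the Tika server.
 */
function rlv_sketch_tika_endpoint( string $tika_base_url ): string {
    return rtrim( $tika_base_url, '/' ) . '/tika';
}

/**
 * Hypothetical helper: sends raw file contents to Tika and returns the
 * extracted plain text. Throws if the request fails.
 */
function rlv_sketch_extract_text( string $tika_base_url, string $file_contents ): string {
    $ch = curl_init( rlv_sketch_tika_endpoint( $tika_base_url ) );
    curl_setopt_array(
        $ch,
        array(
            CURLOPT_CUSTOMREQUEST  => 'PUT',          // Tika reads the document from a PUT body.
            CURLOPT_POSTFIELDS     => $file_contents, // Raw bytes of the attachment.
            CURLOPT_HTTPHEADER     => array(
                'Content-Type: application/octet-stream', // Let Tika detect the file type.
                'Accept: text/plain',                     // Ask for plain text output.
            ),
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT        => 60,
        )
    );
    $text   = curl_exec( $ch );
    $status = curl_getinfo( $ch, CURLINFO_HTTP_CODE );
    $error  = curl_error( $ch );

    if ( false === $text || 200 !== $status ) {
        throw new RuntimeException( "Tika request failed (HTTP {$status}): {$error}" );
    }
    return $text;
}
```

In an intermediary script, the text returned by rlv_sketch_extract_text() would then be passed back to Relevanssi in the format it expects. Keeping the Tika URL as a parameter means it is the only value to change when the Tika server moves.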

Changing the server URL in Relevanssi

The attachment server URL is changed with a filter function on relevanssi_attachment_server_url. Add this to your site, with the correct URL:

// Point Relevanssi at the intermediary script instead of the default attachment server.
add_filter( 'relevanssi_attachment_server_url', 'rlv_personal_url' );
function rlv_personal_url( $url ) {
    return 'https://www.example.com/intermediary/';
}

Note that the URL needs to be the directory URL, without index.php; it will be added automatically later if necessary.

With all these steps done, you should now be able to process attachments with your own server.