Process sensitive PDF documents on your own private server
Custom Indexing Server
There may be situations where it’s necessary to replace the default Relevanssi attachment indexing server with your own. If your documents are highly sensitive, you may not be allowed to have them processed on a third-party server.
It’s possible to make Relevanssi use a custom server. This requires three steps: setting up the attachment reading server, setting up an intermediary script, and changing the server URL in Relevanssi.
1. Setting up your own indexing server
First, you need to set up your own server. I strongly recommend Apache Tika: it’s the best tool I know for reading attachment files. The easiest way to set up Tika is to run it in a Docker container.
Setting up a Docker container somewhere is an easy process, and I’m not going to go into that here; there are plenty of excellent guides for that. The Docker documentation is a great place to start.
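Once the container is running, it’s worth checking that Tika responds before moving on. Here’s a minimal sketch of such a check in PHP. It assumes the container was started from the official apache/tika image with the default port 9998 published (for example with docker run -d -p 9998:9998 apache/tika) and that the server is reachable at http://localhost:9998; adjust the address to match your setup. The function names are made up for this example.

```php
<?php
// Sketch only: the function names and the localhost address are assumptions.

// Join a base URL and a path without doubling or dropping slashes.
function rlv_sketch_tika_endpoint( string $base_url, string $path ): string {
    return rtrim( $base_url, '/' ) . '/' . ltrim( $path, '/' );
}

// Ask the Tika server for its version string.
// Returns null if the server cannot be reached or responds with an error.
function rlv_sketch_tika_version( string $base_url ): ?string {
    $context = stream_context_create(
        array(
            'http' => array(
                'method'  => 'GET',
                'timeout' => 5,
            ),
        )
    );
    $body = @file_get_contents( rlv_sketch_tika_endpoint( $base_url, 'version' ), false, $context );
    return false === $body ? null : trim( $body );
}
```

Calling rlv_sketch_tika_version( 'http://localhost:9998' ) should return a short version string if the server is up, and null otherwise.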
2. Setting up the intermediary server
Setting up the Tika server is not enough, because while it does great work reading the attachment contents, it does not communicate smoothly with Relevanssi directly. There needs to be an intermediary layer that will receive the files from Relevanssi, send them to Tika, and then receive the response to send back to Relevanssi.
This doesn’t require a whole separate server. This step can just be a PHP script hosted on the same server as your WordPress setup. I’ve shared a fairly basic version of the script on Gist that should get you started.
This file needs to be uploaded to your server as index.php in a dedicated directory. Let’s assume it’s located at https://www.example.com/intermediary/index.php. You also need to edit the file and enter the exact URL of your own Tika server.
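To give an idea of what the Tika half of the intermediary does, here’s a minimal sketch of passing a file’s contents to Tika and getting plain text back. This is not the script from the Gist: it only covers the call to Tika and leaves out everything Relevanssi-specific (how the file arrives and how the response is formatted), so use the Gist for those parts. The function names and the http://localhost:9998/tika address are assumptions.

```php
<?php
// Sketch only: not the Gist script. Function names are hypothetical.

// Build the HTTP options for sending raw file bytes to Tika's /tika endpoint.
function rlv_sketch_tika_request_options( string $file_bytes ): array {
    return array(
        'http' => array(
            'method'  => 'PUT',
            // Ask for plain text back; send a generic content type so Tika
            // detects the actual file type itself.
            'header'  => "Accept: text/plain\r\nContent-Type: application/octet-stream\r\n",
            'content' => $file_bytes,
            'timeout' => 60,
        ),
    );
}

// Send the file to Tika and return the extracted text.
// Returns null if the request fails or Tika responds with an error status.
function rlv_sketch_extract_text( string $tika_url, string $file_bytes ): ?string {
    $context = stream_context_create( rlv_sketch_tika_request_options( $file_bytes ) );
    $text    = @file_get_contents( $tika_url, false, $context );
    return false === $text ? null : $text;
}
```

For example, rlv_sketch_extract_text( 'http://localhost:9998/tika', file_get_contents( 'document.pdf' ) ) would return the text of document.pdf, assuming the Tika server is running at that address.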
3. Changing the server URL in Relevanssi
The attachment server URL is changed with a filter function on the relevanssi_attachment_server_url hook. Add this to your site (for example, in your theme’s functions.php or in a small custom plugin), swapping in your correct URL:
// Point Relevanssi at the intermediary script instead of the default server.
add_filter( 'relevanssi_attachment_server_url', 'rlv_personal_url' );
function rlv_personal_url( $url ) {
    return 'https://www.example.com/intermediary/';
}
Note that the URL does not include index.php at the end. The filename will be added automatically later if necessary.
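If the URL comes from a setting or a constant instead of being hard-coded, a small helper can make sure it ends up in the same form as in the filter above: no index.php, with a trailing slash. This is just a sketch, and the function name is made up for this example.

```php
<?php
// Sketch only: hypothetical helper, not part of Relevanssi.

// Strip a trailing "index.php" and make sure the URL ends with one slash.
function rlv_sketch_normalize_server_url( string $url ): string {
    $url = preg_replace( '~/index\.php$~i', '/', $url );
    return rtrim( $url, '/' ) . '/';
}
```

The filter function could then return rlv_sketch_normalize_server_url( $your_url ) instead of a literal string.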
With all these steps done, you should now be able to process attachments safely on your own private server!