Words with ampersands can’t be found

A lot of people on my site are looking for ‘H&M’. I have an H&M page but it does not show up when searching with Relevanssi. How can I change the plugin so that words with the &-sign get found?

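By default, Relevanssi strips ampersands (and other punctuation) and replaces them with spaces. To keep them, you’ll have to change the way punctuation is handled. A simple fix is to swap each ampersand for a placeholder word before the default punctuation removal runs (priority 9) and swap it back afterwards (priority 11):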

add_filter('relevanssi_remove_punctuation', 'saveampersands_1', 9);
function saveampersands_1($a) {
    $a = str_replace('&amp;', 'AMPERSAND', $a);
    $a = str_replace('&', 'AMPERSAND', $a);
    return $a;
}
 
add_filter('relevanssi_remove_punctuation', 'saveampersands_2', 11);
function saveampersands_2($a) {
    $a = str_replace('AMPERSAND', '&', $a);
    return $a;
}

Add this code to your theme’s functions.php file and rebuild the index. To protect a character other than the ampersand, replace the & in the code with that character. For more complicated modifications, it’s best to replace the whole relevanssi_remove_punct() function: unhook the default function, copy it, modify the copy as needed, and hook in the new function.
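If several characters need protecting at once, the same placeholder trick can be generalised. The following is only a sketch under assumptions: the myprefix_ function names and the HYPHEN and UNDERSCORE placeholders are made up for illustration, and only the relevanssi_remove_punctuation hook and the priorities 9 and 11 come from the article. The add_filter() calls are guarded so the snippet also runs outside WordPress.

```php
<?php
// Omit the opening tag when pasting into an existing functions.php.
// Hypothetical helper names; adjust the prefix to your own theme.

// Priority 9: replace each protected character with a letters-only
// placeholder so the default punctuation removal leaves it alone.
function myprefix_protect_chars($a) {
    $map = array(
        '&amp;' => 'AMPERSAND', // entity form first, before the plain '&'
        '&'     => 'AMPERSAND',
        '-'     => 'HYPHEN',
        '_'     => 'UNDERSCORE',
    );
    return str_replace(array_keys($map), array_values($map), $a);
}

// Priority 11: turn the placeholders back into the original characters.
function myprefix_restore_chars($a) {
    $map = array(
        'AMPERSAND'  => '&',
        'HYPHEN'     => '-',
        'UNDERSCORE' => '_',
    );
    return str_replace(array_keys($map), array_values($map), $a);
}

// Guarded so the file can be run on its own for testing.
if (function_exists('add_filter')) {
    add_filter('relevanssi_remove_punctuation', 'myprefix_protect_chars', 9);
    add_filter('relevanssi_remove_punctuation', 'myprefix_restore_chars', 11);
}
```

One caveat of this approach: text that literally contains a placeholder word such as AMPERSAND would be converted too, so pick placeholders that are unlikely to appear in your content. As with the code above, the index has to be rebuilt afterwards.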

Update 12.2.2014: Adding

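$a = str_replace('&amp;', 'AMPERSAND', $a);

to the first function also covers ampersands that are encoded as HTML entities. Put this line before the plain & replacement; otherwise the & inside the entity is replaced first and the entity no longer matches.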
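Why the order of the two replacements matters can be checked outside WordPress. In the sketch below, entity_demo_strip_punct() is an assumed stand-in for the default punctuation removal (a single regular expression, not the plugin’s actual function), and all the entity_demo_ names are hypothetical.

```php
<?php
// Assumed stand-in for the default step that runs at priority 10.
function entity_demo_strip_punct($a) {
    return trim(preg_replace('/[[:punct:]]+/u', ' ', $a));
}

// Entity form handled first: both '&amp;' and '&' become the placeholder.
function entity_demo_protect_entity_first($a) {
    $a = str_replace('&amp;', 'AMPERSAND', $a);
    $a = str_replace('&', 'AMPERSAND', $a);
    return $a;
}

// Plain '&' handled first: '&amp;' turns into 'AMPERSANDamp;',
// so the second replacement has nothing left to match.
function entity_demo_protect_plain_first($a) {
    $a = str_replace('&', 'AMPERSAND', $a);
    $a = str_replace('&amp;', 'AMPERSAND', $a);
    return $a;
}

function entity_demo_restore($a) {
    return str_replace('AMPERSAND', '&', $a);
}

$good = entity_demo_restore(entity_demo_strip_punct(entity_demo_protect_entity_first('H&amp;M')));
$bad  = entity_demo_restore(entity_demo_strip_punct(entity_demo_protect_plain_first('H&amp;M')));

echo $good, "\n"; // H&M
echo $bad, "\n";  // H&amp M
```

With the wrong order, the leftover “amp;” loses its semicolon to the punctuation removal and the word ends up split in two.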

In some cases there’s no need to keep the punctuation, but it makes sense to remove it completely instead of replacing it with a space. This simplifies the code a bit, because no placeholder is needed. For example, to have hyphenated words indexed as single words, add this code:

add_filter('relevanssi_remove_punctuation', 'remove_hyphens', 9);
function remove_hyphens($a) {
    $a = str_replace('-', '', $a);
    return $a;
}

Add this code to your functions.php file and rebuild the index.
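The difference between removing a hyphen and letting it become a space can be seen in a small standalone sketch. Here hyphen_demo_strip_punct() is an assumed stand-in for the default punctuation removal rather than the plugin’s own code, and hyphen_demo_remove_hyphens() repeats the filter above under a hypothetical name so that it can be run on its own.

```php
<?php
// Assumed stand-in for the default step that runs at priority 10.
function hyphen_demo_strip_punct($a) {
    return trim(preg_replace('/[[:punct:]]+/u', ' ', $a));
}

// Same logic as the remove_hyphens() filter above.
function hyphen_demo_remove_hyphens($a) {
    return str_replace('-', '', $a);
}

// Default behaviour: the hyphen becomes a space, giving two words.
$default = hyphen_demo_strip_punct('e-mail');

// With the filter at priority 9: the hyphen is gone before the default
// step runs, so the word stays in one piece.
$joined = hyphen_demo_strip_punct(hyphen_demo_remove_hyphens('e-mail'));

echo $default, "\n"; // e mail
echo $joined, "\n";  // email
```

Because the same filter is applied to search queries, a search for “e-mail” would be reduced to “email” as well and match the indexed form.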

This was originally asked at the WP support forum.

  • John

    Adding this code changes nothing at all; & is still treated as a space.

    • John

      The code in the WP forum works perfectly!

      • The sample code had amp entities where just plain & symbols should’ve been. Thanks for the heads up, the code is now correct.

  • Davide Prevosto

    Hi. Is this hook valid for a Multisite Installation?

    I tried (in our functions.php):

    add_filter('relevanssi_remove_punctuation', 'saveampersands_1', 9);
    function saveampersands_1($a) {
        $a = str_replace('&', 'AMPERSAND', $a);
        return $a;
    }
    add_filter('relevanssi_remove_punctuation', 'saveampersands_2', 11);
    function saveampersands_2($a) {
        $a = str_replace('AMPERSAND', '&', $a);
        return $a;
    }

    OR

    remove_filter('relevanssi_remove_punctuation', 'relevanssi_remove_punct');
    add_filter('relevanssi_remove_punctuation', 'enc_relevanssi_remove_punct');

    function enc_relevanssi_remove_punct($a) {
        $a = strip_tags($a);
        $a = stripslashes($a);
        $a = str_replace("·", '', $a);
        $a = str_replace("…", '', $a);
        $a = str_replace("€", '', $a);
        $a = str_replace("&shy;", '', $a);
        $a = str_replace(chr(194) . chr(160), ' ', $a);
        $a = str_replace("&nbsp;", ' ', $a);
        $a = str_replace('&#8217;', ' ', $a);
        $a = str_replace("'", ' ', $a);
        $a = str_replace("’", ' ', $a);
        $a = str_replace("‘", ' ', $a);
        $a = str_replace("”", ' ', $a);
        $a = str_replace("“", ' ', $a);
        $a = str_replace("„", ' ', $a);
        $a = str_replace("´", ' ', $a);
        $a = str_replace("—", ' ', $a);
        $a = str_replace("–", ' ', $a);
        $a = str_replace("×", ' ', $a);
        $a = str_replace('&', 'AMPERSAND', $a);
        $a = preg_replace('/[[:punct:]]+/u', ' ', $a);
        $a = str_replace('AMPERSAND', '&', $a);
        $a = preg_replace('/[[:space:]]+/', ' ', $a);
        $a = trim($a);
        return $a;
    }

    We rebuilt the index of all blogs… but a search like “D&G” or “Build&Beader” didn’t work.

    We are Premium users; could you help us?

    Thank you.

    • Multisite uses the same code, so yes, this is valid for Multisite installations as well. However, I’m not sure where the code should be added… You should check if the code is being executed in the first place: add an echo and an exit to the function and see if it’s even run.

      If it’s being executed, then it should work, so I’m guessing it’s just not being noticed. I’m not sure if the code should be on network level or in individual blog level, so try the different options.

      • Davide Prevosto

        Thank you! Do you mean I should try the first solution, the one suggested in your post? I will try.
        I am quite sure the code was executed, because I added “die($a);” and was able to see both myAMPERSANDquery and my&query (I tried with “D&G”).
        I will let you know. Thank you.

        • I recommend the first solution, but both should work.

          Do note that this has to work in two places: both in indexing and in searching. If the code executes when searching but not when indexing, or vice versa, searching won’t work.

          • Davide Prevosto

            Is it possible to contact you privately, later maybe?

          • Sure, just use the customer support form to contact me.

  • armandl

    Could you provide the code for protecting “-” punctuation?

    • add_filter('relevanssi_remove_punctuation', 'saveampersands_1', 9);
      function saveampersands_1($a) {
          $a = str_replace('-', 'HYPHEN', $a);
          return $a;
      }

      add_filter('relevanssi_remove_punctuation', 'saveampersands_2', 11);
      function saveampersands_2($a) {
          $a = str_replace('HYPHEN', '-', $a);
          return $a;
      }

  • Karin Bronwasser

    I don’t know what I am doing wrong. I added this code to my functions.php:

    add_filter('relevanssi_remove_punctuation', 'saveampersands_1', 9);
    function saveampersands_1($a) {
        $a = str_replace('_', 'UNDERSCORE', $a);
        return $a;
    }

    add_filter('relevanssi_remove_punctuation', 'saveampersands_2', 11);
    function saveampersands_2($a) {
        $a = str_replace('UNDERSCORE', '_', $a);
        return $a;
    }

    When I search for GE_0001, for example, I get 0 results, but I should get about 40.
    I tried all kinds of variations, such as taking the hyphen code and only replacing the - with _, all with the same result: nothing. I need to keep the _ in the results because the site is a shop whose SKUs contain _.

    • Karin, the code is correct, so I’d next check that Relevanssi is indexing the SKUs in the first place. If you create a product with a SKU without an underscore, for example “TESTSKU”, can you find it? If not, start by fixing that.

      • Karin Bronwasser

        Relevanssi was indexing the SKU. I had it working until the latest update of Relevanssi. I didn’t have the code in functions.php; I had just removed the lines in the plugin’s common.php. That worked, but it is not working anymore either. Is there a way to completely empty the index and start indexing all over again? Maybe that will help.

        • Just click “Build the index”, that’ll empty the index and rebuild it.

          Do note that the code requires reindexing the whole database before it works.

          • Karin Bronwasser

            I rebuilt the index over and over again, but when I search for ge_0001 I still get all the results for GE and 0001.

          • Hmm, hard to say. The code is correct. You might want to try and see that it runs, just to be sure. Add this in the first function before the “return $a;” line and then try to save a post:

            echo "it runs";
            exit();

            Now you should see a white page with the text “it runs”. If you don’t see a white page and the post is saved as usual, then your code does not run and that’s where the problem is.

          • Karin Bronwasser

            That works. I had to remove the functions because they gave “no results” where there should be results. I commented out the line for the underscore replacement in common.php, then emptied the Relevanssi database table and built it again. No result: the index is again full of loose components of the SKU.

          • Then I can’t help you further – if the code runs, but doesn’t do what it’s supposed to do, I don’t know what’s going on.

          • Karin Bronwasser

            It is working! I completely uninstalled Relevanssi from my website and installed it fresh again (after re-enabling the code in functions.php). I rebuilt the index, and when I search for LB_0001 I only get leatherbands… and nothing else 🙂 Thank you for thinking along with me. The code was fine; the installation had a bug somewhere.

  • Atef

    Hi Mikko,
    I have Arabic text that has punctuation. The old versions of Relevanssi used to ignore the punctuation correctly and find the text.
    I had Relevanssi off for around a year, then installed it again to resume using it.
    The newest version does not detect the text, so when I search using the “unpunctuated” text, I don’t get any search results.

    I tried to apply the same concept above using the following function:

    add_filter('relevanssi_remove_punctuation', 'arabic_filter', 9);
    function arabic_filter($a) {
        $tashkeeel = array("ّ", "َ", "ً", "ُ", "ٌ", "ِ", "ٍ", "ْ");
        $a = str_replace($tashkeeel, "", $a);
    }

    (The punctuation characters are between the quotes, but might not display properly.)

    However, when i do so and rebuild the index, the index does not detect any posts and says:

    Documents in the index: 0
    Terms in the index: 0
    Highest post ID indexed: 0

    I have to remove the function to be able to regain the indexing again.

    Am I doing something wrong in my functions?

    thank you very much

    • You’re eliminating all text, because you haven’t remembered to actually return any value. Your filter is a black hole that swallows everything… so just add a “return $a;” in the end and you’ll be fine =)
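The fix described in the last reply can be sketched as follows. The diacritics are written as Unicode escapes (PHP 7 or later) so they display reliably; the eight code points are an assumption based on reading the list in the comment above. The add_filter() call is guarded so the snippet can also be tested outside WordPress.

```php
<?php
// Omit the opening tag when pasting into an existing functions.php.
function arabic_filter($a) {
    // Shadda, fatha, fathatan, damma, dammatan, kasra, kasratan, sukun
    // (assumed to match the characters in the original comment).
    $tashkeel = array(
        "\u{0651}", "\u{064E}", "\u{064B}", "\u{064F}",
        "\u{064C}", "\u{0650}", "\u{064D}", "\u{0652}",
    );
    $a = str_replace($tashkeel, '', $a);

    // Without this line the filter returns null, every later filter
    // receives an empty value, and nothing ends up in the index.
    return $a;
}

if (function_exists('add_filter')) {
    add_filter('relevanssi_remove_punctuation', 'arabic_filter', 9);
}
```

Since the filter also runs on search queries, both diacritised and plain search terms would match the stripped text once the index has been rebuilt.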