German umlauts

By default, the search ignores German umlauts: searching for “uber” also finds “über”, and so on. This behavior is not a Relevanssi feature; it is governed by your database. The default database collation WordPress uses (utf8mb4_unicode_ci) ignores all accents, including umlauts.

wp_relevanssi database table, with collation set to utf8mb4_unicode_ci

If you want the search to care about umlauts, you can change the collation. For German, use utf8mb4_german2_ci (actually, use utf8mb4_swedish_ci instead; see the comment below):

wp_relevanssi database table, with collation set to utf8mb4_german2_ci
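
If you would rather change the collation in code than in a database tool, here is a minimal sketch. It assumes the index table is named with your WordPress table prefix plus `relevanssi` (as in `wp_relevanssi` above) and uses utf8mb4_swedish_ci, as the comment below recommends. The helper `umlaut_collation_sql()` is a hypothetical name for illustration, not part of Relevanssi or WordPress. Back up your database first, run the code once, and then remove it.

```php
<?php
/**
 * Builds the ALTER TABLE statement that converts a table to a new collation.
 * Hypothetical helper for illustration; not part of Relevanssi or WordPress.
 */
function umlaut_collation_sql( string $table, string $collation = 'utf8mb4_swedish_ci' ): string {
    // Table and collation names cannot be bound as query parameters, so
    // accept only letters, digits and underscores before interpolating them.
    foreach ( array( $table, $collation ) as $identifier ) {
        if ( ! preg_match( '/^[A-Za-z0-9_]+$/', $identifier ) ) {
            throw new InvalidArgumentException( "Unexpected identifier: {$identifier}" );
        }
    }
    return "ALTER TABLE `{$table}` CONVERT TO CHARACTER SET utf8mb4 COLLATE {$collation}";
}

// Inside WordPress, $wpdb is the global database object. Outside WordPress
// it is not set, and nothing is executed.
global $wpdb;
if ( isset( $wpdb ) ) {
    // Assumption: the index table is "<prefix>relevanssi", e.g. wp_relevanssi.
    $wpdb->query( umlaut_collation_sql( $wpdb->prefix . 'relevanssi' ) );
}
```

With the default `wp_` prefix, the statement sent to the database converts `wp_relevanssi` to utf8mb4_swedish_ci. You can also copy the generated statement and run it directly in phpMyAdmin or a similar tool.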

Now, the database will not ignore umlauts. Relevanssi also removes accents to match what the database does, so you must undo that as well. Add this to your site, for example in your theme's functions.php:

remove_filter( 'relevanssi_remove_punctuation', 'remove_accents', 9 );

Once you’ve done these steps, rebuild the index. Now, the search should not ignore the umlauts.

One comment on “German umlauts”

  1. Sorry, but the info “If you want the search to care about umlauts, you can change the collation. For German, use utf8mb4_german2_ci” is WRONG.

    You need to use utf8mb4_swedish_ci, like always.
    It might have worked in some specific use case, but with any other current MySQL or MariaDB server, swedish_ci is the better option, as you will ACTUALLY get umlauts – as one expects – in the term column.

    I repeat: NO, DO NOT use utf8mb4_german2_ci, because that doesn’t store umlauts. Yes, it allows for querying anything with “ue” as if it was “ü”, but it DOES NOT STORE words with umlauts or the sz-ligature (ß).
