
By default, Relevanssi tends to prefer longer posts. The default TF × IDF weights Relevanssi uses simply count the term frequency, i.e. how many times a word appears in the post. That favors longer posts, as the search term usually appears in them more often. However, a 500-word post with 15 search term appearances may well be a better match for the search than a 2000-word post with 20 search term appearances, because the density is much higher in the shorter post (3 appearances per 100 words versus 1).

One way to make Relevanssi give a boost to shorter posts is to add the document length as a factor in the calculations. Adding this function, hooked to the relevanssi_match filter hook, to your site will include the inverse document length in the weights:

add_filter( 'relevanssi_match', 'rlv_inverse_document_length', 10, 2 );
function rlv_inverse_document_length( $match, $idf ) {
    global $relevanssi_post_idl;
    if ( isset( $relevanssi_post_idl[ $match->doc ] ) ) {
        $idl = $relevanssi_post_idl[ $match->doc ];
    } else {
        $current_post_object = relevanssi_get_post( $match->doc );
        $minimum_doc_length  = 5000; // in characters
        if ( ! $current_post_object ) {
            $idl = 1;
        } else {
            $post_length = max( $minimum_doc_length, strlen( $current_post_object->post_content ) );
            $idl         = $minimum_doc_length / $post_length;
        }
        $relevanssi_post_idl[ $match->doc ] = $idl;
    }
    // Scale the existing weight by the inverse document length. This gives the
    // same result as dividing out and re-multiplying tf * idf, but avoids a
    // division by zero when tf * idf is 0.
    $match->weight *= $idl;
    return $match;
}

The function determines the post length (in characters rather than words, because counting words is slower; note that strlen() counts bytes, so multibyte characters count more than once) and then calculates the ratio between the minimum post length set in the function and the current post length. Here the minimum is set to 5000 characters, which means that all posts are treated as at least 5000 characters long: posts at or below that length get a multiplier of 1, and longer posts get a multiplier that goes down from 1 towards 0 as the post gets longer.
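To make the ratio concrete, here is a minimal standalone sketch of the same calculation, runnable outside WordPress. The helper name rlv_example_idl() is hypothetical and not part of Relevanssi; it only mirrors the max() and division steps from the filter.

```php
<?php
/**
 * Hypothetical helper for illustration only (not a Relevanssi function).
 * Returns the inverse document length multiplier for a given post length.
 */
function rlv_example_idl( int $post_length, int $minimum_doc_length = 5000 ): float {
    // Posts shorter than the minimum are treated as being exactly the minimum length.
    return $minimum_doc_length / max( $minimum_doc_length, $post_length );
}

// Print the multiplier for a few example lengths (in characters).
foreach ( array( 2000, 5000, 10000, 20000 ) as $length ) {
    printf( "%6d characters: multiplier %.2f\n", $length, rlv_example_idl( $length ) );
}
```

With the 5000-character minimum, the 2000- and 5000-character posts both get a multiplier of 1, the 10000-character post gets 0.5, and the 20000-character post gets 0.25.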

The filter gives shorter posts a relative boost, since their weights are left unchanged, and penalizes very long posts that rank high just for being long.
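The effect on ranking can be sketched with two made-up posts. The weights and lengths below are hypothetical values chosen for illustration; real weights come from Relevanssi and will differ.

```php
<?php
// Hypothetical example data: a long post that initially outranks a short one.
$minimum_doc_length = 5000; // Same floor as in the filter, in characters.
$example_posts      = array(
    'long post'  => array( 'weight' => 20.0, 'length' => 20000 ),
    'short post' => array( 'weight' => 15.0, 'length' => 4000 ),
);

$adjusted = array();
foreach ( $example_posts as $name => $post ) {
    // Same ratio as in the filter: 1 for short posts, less than 1 for long ones.
    $idl               = $minimum_doc_length / max( $minimum_doc_length, $post['length'] );
    $adjusted[ $name ] = $post['weight'] * $idl;
}

arsort( $adjusted ); // Sort by adjusted weight, highest first, keeping the keys.
print_r( $adjusted );
```

In this sketch the long post drops from a weight of 20 to 5, while the short post keeps its weight of 15 and ends up ranked first.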

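The first version of the function only considers the post content and does not count attachment content. This version includes the attachment content as well. Use it instead of the first version, not alongside it, as both declare the same function name: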

add_filter( 'relevanssi_match', 'rlv_inverse_document_length', 10, 2 );
function rlv_inverse_document_length( $match, $idf ) {
    global $relevanssi_post_idl;
    if ( isset( $relevanssi_post_idl[ $match->doc ] ) ) {
        $idl = $relevanssi_post_idl[ $match->doc ];
    } else {
        $current_post_object = relevanssi_get_post( $match->doc );
        $minimum_doc_length  = 5000; // in characters
        if ( ! $current_post_object ) {
            $idl = 1;
        } else {
            $post_content = $current_post_object->post_content;
            $pdf_content  = get_post_meta( $current_post_object->ID, '_relevanssi_pdf_content', true );
            $content      = $post_content . ' ' . $pdf_content;
            $post_length  = max( $minimum_doc_length, strlen( $content ) );
            $idl          = $minimum_doc_length / $post_length;
        }
        $relevanssi_post_idl[ $match->doc ] = $idl;
    }
    // Scale the existing weight by the inverse document length. This gives the
    // same result as dividing out and re-multiplying tf * idf, but avoids a
    // division by zero when tf * idf is 0.
    $match->weight *= $idl;
    return $match;
}
