
Searching for all descendants of a page

If you want to restrict a search to a page and all its children, you can add a post_parent parameter to the search, and that’s it. That only includes the direct children of the page, though. What if you want to include the page, its children, all the grandchildren and their children? In that case, just using post_parent isn’t enough.

Let’s add a new search parameter, parent, that will include all descendants. WordPress has a helpful function here, get_post_ancestors(), which we can use to easily fetch the lineage of each post to only include posts that are descendants of our parent post.

This parameter is easiest to implement as a relevanssi_hits_filter filter function, so instead of limiting the search results to certain posts, we fetch everything and then weed out the unwanted posts.

/**
 * Introduces our `parent` query variable.
 */
add_filter( 'query_vars', function( $qv ) { return array_merge( $qv, array( 'parent' ) ); } );

/**
 * Filters the search results to only include the descendants.
 */
add_filter( 'relevanssi_hits_filter', 'rlv_parent_filter' );
function rlv_parent_filter( $hits ) {
    global $wp_query;
    if ( ! isset( $wp_query->query_vars['parent'] ) ) {
        // No parent parameter set, do nothing.
        return $hits;
    }
    // Query variables arrive as strings, so cast once to make the
    // strict comparisons below work against integer post IDs.
    $parent = (int) $wp_query->query_vars['parent'];
    $clan   = array();
    foreach ( $hits[0] as $hit ) {
        // Loop through all the posts found.
        if ( (int) $hit->ID === $parent ) {
            // The page itself.
            $clan[] = $hit;
        } elseif ( (int) $hit->post_parent === $parent ) {
            // A direct descendant.
            $clan[] = $hit;
        } elseif ( $hit->post_parent > 0 ) {
            $ancestors = get_post_ancestors( $hit );
            if ( in_array( $parent, $ancestors, true ) ) {
                // One of the lower level descendants.
                $clan[] = $hit;
            }
        }
    }
    // Only include the filtered posts.
    $hits[0] = $clan;
    return $hits;
}
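
To try the filter out, the new parameter is passed in the search URL next to the usual `s` parameter. Here is a minimal sketch of building such a URL. The search term and the page ID 42 are made up for the example, and on a real site you would probably build the URL with `home_url()` and `add_query_arg()` instead of concatenating strings by hand.

```php
<?php
// Hypothetical values: search for "pricing" under the page with ID 42.
$args = array(
    's'      => 'pricing', // The regular WordPress search parameter.
    'parent' => 42,        // Our custom query variable.
);

// Pass the separator explicitly so the result does not depend on php.ini.
$search_url = '/?' . http_build_query( $args, '', '&' );

echo $search_url . "\n"; // Prints "/?s=pricing&parent=42".
```

With the filter active, a search like this should return only page 42 and its descendants.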

This solution works well in smaller databases (“small” meaning sites where the average search result set is under 500 posts). In larger cases, this can get problematic. Using a throttle may cause relevant posts to go missing: if the throttle leaves out some of the pages in the family tree, those cannot be found. On the other hand, without the throttle, running get_post_ancestors() on all the posts found can be a problem, especially if there are plenty of page hierarchies on your site.

In a larger database, it may be helpful to approach this in some other way. I’m not sure what that would be, as building a MySQL query that restricts the search using post_parent will also get very complicated.
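
One possible direction, offered only as a sketch: instead of asking for the ancestors of every hit, collect the IDs of the parent page and all its descendants once, and then check each hit against that set. The example below is plain PHP and assumes you already have a map of post IDs to their post_parent values. Both the map and the `rlv_collect_clan()` function are hypothetical and not part of WordPress or Relevanssi, and how best to load the map on a large site is left open.

```php
<?php
/**
 * Collects a page and all of its descendants from an ID => parent ID map.
 *
 * @param int   $root_id    The page whose family tree we want.
 * @param array $parent_map Post ID => post_parent pairs (hypothetical input).
 *
 * @return array The post IDs in the family tree as array keys.
 */
function rlv_collect_clan( $root_id, array $parent_map ) {
    // Group the children under their parents once.
    $children = array();
    foreach ( $parent_map as $id => $parent_id ) {
        $children[ (int) $parent_id ][] = (int) $id;
    }

    // Walk the tree breadth-first, starting from the root page.
    $clan  = array( (int) $root_id => true );
    $queue = array( (int) $root_id );
    while ( $queue ) {
        $current = array_shift( $queue );
        if ( empty( $children[ $current ] ) ) {
            continue;
        }
        foreach ( $children[ $current ] as $child_id ) {
            if ( ! isset( $clan[ $child_id ] ) ) {
                $clan[ $child_id ] = true;
                $queue[]           = $child_id;
            }
        }
    }
    return $clan;
}

// Made-up hierarchy: 10 > 11 > 12, and an unrelated branch 20 > 21.
$parent_map = array( 10 => 0, 11 => 10, 12 => 11, 20 => 0, 21 => 20 );
$clan       = rlv_collect_clan( 10, $parent_map );

// Made-up search hits, reduced to their IDs.
$hit_ids = array( 12, 21, 10 );
$kept    = array_values(
    array_filter(
        $hit_ids,
        function ( $id ) use ( $clan ) {
            return isset( $clan[ $id ] );
        }
    )
);
// $kept is now array( 12, 10 ): page 21 belongs to another branch.
```

The set is built once per search, so the check for each hit is a single isset() lookup. Whether this is actually faster in practice depends on how cheaply the map can be fetched, so treat it as a starting point rather than a tested solution.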
