Indexing image captions for the posts

The use case is I run several newspapers, and the “caption” field when uploading media is where the journalists put the photographer credit. We need to be able to index the photographer bylines, but ideally would want to return the story/commentary where their image was used, not the image itself.

If you want to index the image captions for the posts where the images are used, that’s easy if the captions appear on the page: Relevanssi will index those captions as well as the other text from the post. However, if the captions don’t appear on the page, you need extra steps in order to read in the captions.

The trick is to add a function to the relevanssi_content_to_index filter hook that finds all the attachments for a post, fetches the desired data from them and adds that data to the parent post content so Relevanssi can index it. Note that this covers only media attached to the post, that is, attachments that have the post as their parent.

The captions for the images in the Media Library are stored in the post excerpt of each attachment, so what needs to be done is to fetch those excerpts and include them.

The actual filter is straightforward:

add_filter( 'relevanssi_content_to_index', 'rlv_add_attachment_excerpts', 10, 2 );
/**
 * Indexes attachment excerpts for the parent post.
 *
 * This function reads in the attachment excerpts from the database and
 * adds them to the parent post content.
 *
 * @global wpdb $wpdb The WordPress database interface.
 *
 * @param string $content The added content.
 * @param object $post    The indexed post object.
 *
 * @return string The added content.
 */
function rlv_add_attachment_excerpts( $content, $post ) {
    global $wpdb;
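    // Restrict the query to attachments; otherwise other child posts,
    // such as revisions, would add their excerpts as well.
    $results = $wpdb->get_col(
        $wpdb->prepare(
            "SELECT post_excerpt FROM $wpdb->posts WHERE post_parent = %d AND post_type = 'attachment'",
            $post->ID
        )
    );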
    foreach ( $results as $excerpt ) {
        $content .= " $excerpt";
    }
    return $content;
}

Now configure Relevanssi so that the attachment post type is not indexed, and rebuild the index. Searching for a caption should then find the post the image is attached to.
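
As a variation on the filter above, the same lookup can probably be done with WordPress functions instead of a direct database query. The following is only a sketch under some assumptions: the function names rlv_append_captions and rlv_add_image_captions are made up for this example, only image attachments are wanted, and the function_exists() guard is there just so the snippet can also be loaded outside WordPress. It relies on get_children() and wp_get_attachment_caption(), both of which are part of WordPress core. Use it instead of the filter above, not in addition to it.

```php
/**
 * Appends non-empty captions to the content, separated by spaces.
 *
 * Hypothetical helper for this example, not part of Relevanssi or WordPress.
 *
 * @param string $content  The post content to be indexed.
 * @param array  $captions The caption strings (false and empty values are skipped).
 *
 * @return string The content with the captions appended.
 */
function rlv_append_captions( $content, array $captions ) {
    foreach ( $captions as $caption ) {
        $caption = trim( (string) $caption );
        if ( '' !== $caption ) {
            $content .= ' ' . $caption;
        }
    }
    return $content;
}

/**
 * Adds the captions of attached images to the parent post content.
 *
 * @param string $content The added content.
 * @param object $post    The indexed post object.
 *
 * @return string The added content.
 */
function rlv_add_image_captions( $content, $post ) {
    // Fetch only the image attachments that have this post as their parent.
    $images = get_children(
        array(
            'post_parent'    => $post->ID,
            'post_type'      => 'attachment',
            'post_mime_type' => 'image',
        )
    );
    $captions = array();
    foreach ( $images as $image ) {
        // Returns the caption (the attachment's post excerpt) or false.
        $captions[] = wp_get_attachment_caption( $image->ID );
    }
    return rlv_append_captions( $content, $captions );
}

// Register the filter only when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'relevanssi_content_to_index', 'rlv_add_image_captions', 10, 2 );
}
```

The direct query in the original filter is likely lighter on large sites, since it reads a single column. This version trades some of that for readability and for skipping attachments that have no caption.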