Controlling attachment types in index

Relevanssi lets you index attachments. But perhaps you only want to index a particular type of attachment? The Relevanssi settings don’t offer any control over that: it’s either all attachments or nothing.

You can still choose which kinds of attachments are indexed by using the relevanssi_indexing_restriction filter hook, which lets you control which posts are indexed. Check the attachment MIME type to see what kind of attachment it is, and use that information to weed out unwanted attachments.
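
Before writing a restriction, it can help to see which MIME types your media library actually contains. The sketch below is one possible way to check. The function name rlv_attachment_mime_counts and its optional $db parameter are made up for this example and are not part of Relevanssi; the code assumes the standard WordPress $wpdb object.

```php
/**
 * Count attachments per MIME type.
 *
 * Hypothetical helper, not part of Relevanssi. $db is an illustrative
 * parameter: pass nothing to use the global $wpdb object.
 */
function rlv_attachment_mime_counts( $db = null ) {
    // Fall back to the WordPress database object when none is given.
    $db = $db ?? $GLOBALS['wpdb'];

    $sql = "SELECT post_mime_type, COUNT(*) AS total
        FROM {$db->posts}
        WHERE post_type = 'attachment'
        GROUP BY post_mime_type
        ORDER BY total DESC";

    // Returns one row object per MIME type, with post_mime_type and total.
    return $db->get_results( $sql );
}
```

Looping over the result and logging each row's post_mime_type and total shows the exact strings to match in the restrictions that follow.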

No images

To remove all image attachments from the index, add this code to your site and rebuild the index. It will weed out all attachments that have a MIME type that begins with image.

add_filter( 'relevanssi_indexing_restriction', 'rlv_no_image_attachments' );
function rlv_no_image_attachments( $restriction ) {
    global $wpdb;
    $restriction['mysql']  .= " AND post.ID NOT IN (SELECT ID FROM $wpdb->posts WHERE post_type = 'attachment' AND post_mime_type LIKE 'image%' ) ";
    $restriction['reason'] .= ' No images';
    return $restriction;
}

Only PDFs

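This function restricts attachment indexing to PDFs: every attachment with a MIME type other than application/pdf is left out of the index. Other post types are not affected.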

add_filter( 'relevanssi_indexing_restriction', 'rlv_only_pdfs' );
function rlv_only_pdfs( $restriction ) {
    global $wpdb;
    $restriction['mysql']  .= " AND post.ID NOT IN (SELECT ID FROM $wpdb->posts WHERE post_type = 'attachment' AND post_mime_type != 'application/pdf' ) ";
    $restriction['reason'] .= ' Not a PDF';
    return $restriction;
}
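
The same approach extends to an allow-list of several MIME types. The following is a sketch, not something Relevanssi ships: the helper rlv_allowed_mime_sql, the function rlv_only_documents, and the chosen MIME types are assumptions for illustration. It follows the pattern of the snippet above, replacing the != comparison with NOT IN.

```php
/**
 * Build the SQL fragment that excludes attachments whose MIME type is
 * not on the allow-list. Hypothetical helper; $posts_table and $allowed
 * are illustrative parameters.
 */
function rlv_allowed_mime_sql( string $posts_table, array $allowed ): string {
    // An empty list would produce invalid SQL, so add no restriction.
    if ( empty( $allowed ) ) {
        return '';
    }
    // Quote each MIME type; esc_sql() is used when WordPress is loaded.
    $quoted = array_map(
        function ( $mime ) {
            $safe = function_exists( 'esc_sql' ) ? esc_sql( $mime ) : addslashes( $mime );
            return "'" . $safe . "'";
        },
        $allowed
    );
    $list = implode( ', ', $quoted );
    return " AND post.ID NOT IN (SELECT ID FROM $posts_table WHERE post_type = 'attachment' AND post_mime_type NOT IN ($list) ) ";
}

function rlv_only_documents( $restriction ) {
    global $wpdb;
    // Example allow-list: PDF and Word documents. Adjust to your needs.
    $allowed = array(
        'application/pdf',
        'application/msword',
        'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
    );
    $restriction['mysql']  .= rlv_allowed_mime_sql( $wpdb->posts, $allowed );
    $restriction['reason'] .= ' Not an allowed document type';
    return $restriction;
}

// Register the filter only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'relevanssi_indexing_restriction', 'rlv_only_documents' );
}
```

As with the other snippets, rebuild the index after adding the code so that the restriction takes effect.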

Only attached attachments

This function skips unattached attachments, so only attachments that have a parent post are indexed.

add_filter( 'relevanssi_indexing_restriction', 'rlv_only_attached' );
function rlv_only_attached( $restriction ) {
    global $wpdb;
    $restriction['mysql']  .= " AND post.ID NOT IN (SELECT ID FROM $wpdb->posts WHERE post_type = 'attachment' AND post_parent = 0 ) ";
    $restriction['reason'] .= ' Not attached';
    return $restriction;
}

16 comments on “Controlling attachment types in index”

  1. I tried this but it doesn’t seem to be working. Is there another way to remove them? They’re coming in first place in my search which is the last place I want them.

      1. Hi Mikko, just posts pages and downloads (from wpdownloadmanager). Also, is there a way to sort the results types, at the moment media is always on top and I want posts to be first, then downloads, then pages. (and no images).

  2. Actually… I think I might be talking about a different thing… I’m seeing images and such in the search box dropdown before I go to the whole page results… it’s in the search preview that I want to remove images.

    1. That probably isn’t coming from Relevanssi at all. As far as I can tell, the only Relevanssi-compatible search dropdown is SearchWP Live Ajax Search. If you’re using something else, it’s using the default WP search to get the results.

      1. oh… maybe that is something that came with my template. sorry to have bothered you. I’ll have to keep digging to see what’s going on.

    1. Eric, did you rebuild the index after adding the code? If you didn’t, do that and that should solve the problem. These are indexing filters, and only take action when you are indexing posts.

      If that doesn’t help, then I would recommend debugging this: take a look at the values get_post_mime_type() is returning.

  3. I’ve tried both types of the above code, then “reset all attachments…” and then “read all unread attachments” and then Relevanssi hangs, telling me “time elapsed 11:44:20 | time remaining about 20 minutes” (numbers changing; longest I let it run was nearly 20 hours). At the bottom of the log it displays “Failed to index attachment id 7760: cURL error 28: Operation timed out after 45007 milliseconds with 0 bytes received\n” which is obviously where it is hanging. I have no idea where to find out what attachment id 7760 is (or any attachment id, for that matter). So I suspect this code no longer works with your current version. True? Also, how to find attachment id? [Premium Relevanssi]

    1. Judy, if the error says “timed out”, then the problem is not in your code or in your attachments, it’s the server: it doesn’t respond in time. The indexing should respond quickly: if nothing happens in a few minutes, something’s wrong and there’s no reason to wait. The US server is slightly unstable at the moment, I’m investigating it. Meanwhile you can switch to the more reliable EU server, or simply try again later when the server has rebooted and probably responds better.

      To see which attachment the ID 7760 refers to, you can go to /wp-admin/post.php?post=7760&action=edit on your site.

    1. Ge, Relevanssi should not index the image URL, but if the image is embedded on the page in some nonstandard way, it’s possible the image URL gets in the excerpt. Nothing on this page has anything to do with excerpts, so these snippets are not the solution. The solution depends on what the post looks like in the WP editor.

    1. Zachary, no, it isn’t. It used to be a string, but it’s an array now. If you’re getting a string, you either use an old version of Relevanssi or have an old filter function that returns a string.
