relevanssi_post_content_before_tokenize

apply_filters( 'relevanssi_post_content_before_tokenize', string $content, object $post_object )

Filters the post content during indexing, just before it is tokenized.

Parameters

$content
(string) The post content.

$post_object
(object) The post object, usually a WP_Post object, but sometimes a stdClass object that looks like a WP_Post object.

More information

As Relevanssi indexes a post, the post content passes through several filter hooks. First it passes through relevanssi_post_content, then more content may be added with relevanssi_content_to_index. Next, the shortcodes are either expanded or removed, depending on the settings, and the content passes through the relevanssi_post_content_after_shortcodes filter hook.

After that, the invisible elements (<script>, <style>, <embed>, <object>, <applet>, <noscript>, <iframe>, <noembed> and <del> tags) are removed, internal links are processed, and all HTML tags are stripped. Only then does the content pass through this filter hook; afterwards it is tokenized and passed through relevanssi_indexing_tokens.

Generally, if you want to modify the post content, use relevanssi_post_content (or relevanssi_post_content_after_shortcodes if you need to modify the content after the shortcodes are expanded). If you want to add content, use relevanssi_content_to_index.
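
As a rough sketch of that division of labour, the two callbacks below modify content in relevanssi_post_content and add content in relevanssi_content_to_index. The function names, the <aside> markup and the idea of indexing the slug are hypothetical. The sketch also assumes both hooks pass the content string and the post object in the same order as this hook does; check each hook's own documentation before relying on that.

```php
<?php
/**
 * Hypothetical example: drop <aside> elements before the content is indexed.
 * Assumes relevanssi_post_content passes ( $content, $post_object ).
 */
function rlv_example_remove_asides( $content, $post_object ) {
    // Non-greedy match; nested <aside> elements are not handled by this sketch.
    $cleaned = preg_replace( '#<aside\b[^>]*>.*?</aside>#is', '', $content );
    // preg_replace() returns null on error, so fall back to the original content.
    return null === $cleaned ? $content : $cleaned;
}

/**
 * Hypothetical example: add the post slug, with hyphens turned into spaces,
 * to the indexed content.
 * Assumes relevanssi_content_to_index passes ( $content, $post_object ).
 */
function rlv_example_add_slug( $content, $post_object ) {
    if ( isset( $post_object->post_name ) && '' !== $post_object->post_name ) {
        $content .= ' ' . str_replace( '-', ' ', $post_object->post_name );
    }
    return $content;
}

// Register the callbacks only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'relevanssi_post_content', 'rlv_example_remove_asides', 10, 2 );
    add_filter( 'relevanssi_content_to_index', 'rlv_example_add_slug', 10, 2 );
}
```

Both callbacks read only properties of the post object, so they work whether Relevanssi passes a WP_Post object or a stdClass lookalike.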

If you need to access the post content after the invisible elements and the HTML tags have been stripped, this filter hook is the best fit. Most of the time, though, you’ll probably want to use one of the other filters.
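
A minimal sketch of a callback for this hook follows. Because the HTML tags are already stripped at this point, the callback can match plain text directly. The function name and the phrase being removed are hypothetical, chosen only for illustration.

```php
<?php
/**
 * Hypothetical example: remove a recurring plain-text phrase from the
 * already-stripped content so that it is never tokenized or indexed.
 */
function rlv_example_strip_phrase( $content, $post_object ) {
    // The phrase is an assumption for this sketch; replace it with your own text.
    $phrase = 'Click here to subscribe';
    return str_replace( $phrase, '', $content );
}

// Register the callback only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    // Priority 10, two accepted arguments: $content and $post_object.
    add_filter( 'relevanssi_post_content_before_tokenize', 'rlv_example_strip_phrase', 10, 2 );
}
```

The callback has to return the content string, since whatever it returns is what Relevanssi goes on to tokenize.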