Does wp_new_comment expect a comment in HTML?

Should the comment_content in this example contain HTML? Or should it contain plain text?

$comment_id = wp_new_comment([
    'comment_post_ID' => 1,
    'comment_content' => 'Tom & Jerry', // or should this be 'Tom &amp; Jerry'?
    'comment_type' => '',
    'user_id' => 1,
    // etc
]);

For completeness, does the same apply to wp_insert_comment too?

1 Answer

or should this be ‘Tom &amp; Jerry’?

It doesn’t really matter for a hard-coded string like this. The question matters more when you don’t know where the comment content is coming from: if you’re inserting user input as the comment content, it needs to be escaped.

wp_new_comment() escapes and sanitises the comment for you. It’s designed to take the user input from the comment form directly and sanitise it before passing it to wp_insert_comment().

So this includes encoding HTML entities (converting & to &amp;) and filtering out disallowed HTML. It also does things like add the commenter’s IP address to the comment.

So when using wp_new_comment() you shouldn’t need to do any escaping and should enter the comment the same way the user would, with one important exception.

Both wp_new_comment() and wp_insert_comment() expect data to be slashed. WordPress automatically runs addslashes() on everything in $_POST, $_GET and $_REQUEST and these functions are expecting the data to be slashed because they are made to use data from those variables.

So if any of the data for the comment you’re inserting includes quotes or slashes then you need to slash it with wp_slash() before inserting:

$args = [
    'comment_post_ID' => 1,
    'comment_content' => 'Tom\'s a cat',
    'comment_type' => '',
    'user_id' => 1,
];

$comment_id = wp_new_comment( wp_slash( $args ) );

Note that the backslash I already wrote in 'Tom\'s a cat' only escapes the apostrophe for PHP’s own string syntax. It isn’t part of the string’s value, so the content still needs to be slashed.
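To make the difference between those two kinds of backslash concrete, here is a small plain-PHP sketch. It doesn’t call WordPress at all: addslashes() is used as a stand-in for what wp_slash() does to a single string value, which is an assumption made to keep the example self-contained.

```php
<?php
// The backslash in the source code only escapes the quote for the PHP parser.
$content = 'Tom\'s a cat';

// bool(true): the actual string value contains no backslash.
var_dump( $content === "Tom's a cat" );

// Stand-in for what wp_slash() does to a string value.
$slashed = addslashes( $content );

// bool(true): the value now contains a real backslash before the apostrophe.
var_dump( $slashed === "Tom\\'s a cat" );
```

The second value is the form that wp_new_comment() and wp_insert_comment() expect to receive.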

The silly thing is that the first thing that happens in wp_insert_comment() is that it runs wp_unslash(). So we’re adding slashes just for them to be removed immediately. But it needs to be done for consistency with the $_POST variable and so that slashes that might be part of the content aren’t incorrectly removed.
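As a rough illustration of why the round trip matters, the sketch below uses PHP’s built-in addslashes() and stripslashes() as stand-ins for wp_slash() and wp_unslash() on a single string. It runs without WordPress, and the sample content is made up for the example.

```php
<?php
// Stand-ins: addslashes() for wp_slash(), stripslashes() for wp_unslash().

// Single quotes keep the backslash literal, so the text really contains
// a backslash followed by the letter n.
$content = 'Use \n for a newline';

// Passed in unslashed: the unslash step eats the backslash.
$without_slashing = stripslashes( $content );

// Slashed first: the unslash step restores the original text.
$with_slashing = stripslashes( addslashes( $content ) );

var_dump( $without_slashing === $content ); // bool(false)
var_dump( $with_slashing === $content );    // bool(true)
```

Without the slashing step the stored comment would read “Use n for a newline”, which is the kind of incorrect removal described above.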
