Default WordPress settings API data sanitization

It looks to me as though WordPress sanitizes data by default when it is saved to the database via the Settings API. By that I mean that if I look at the raw option values in the database, they have (at the very least) been through the WordPress equivalent of htmlentities().
Is there any documentation of the exact sanitization process? I don’t want to repeat any of it in my own validation function, and I want to make sure I’m using the data correctly when I read it back.

UPDATE:
In response to Christopher Davis’s great answer, here is a bit more detail.
I am using register_setting to register a group of settings, and the fields in this group are added using add_settings_field. The array of all settings is passed (using the register_setting callback) to a single validation method, which just checks that everything looks right (i.e. regular-expression checks that an email is an email, an integer is an integer, etc.). I am doing no sanitisation myself, and I am not referencing any of the WordPress sanitisation functions. However, one option value contains a tag, which when viewed in the database has been converted to HTML entities. I assumed WordPress was doing (at least) this by default for any options stored in the database. Perhaps it is just a side effect of the way it converts an array to a string to store it in the database?

1 Answer

WordPress will not do any data sanitization of your custom options for you. It only sanitizes/validates its own default (built-in) options.

You have to pass in the third argument of register_setting and either roll your own validation callback or use one of the built-ins.

If your option is only going to contain a string, you could do something like this, for instance.

<?php
register_setting('your_group', 'your_setting', 'esc_attr');
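
For the case in the question, where a single callback receives the whole settings array, a minimal sketch might look like the following. The group name, option name, field keys (email, limit, label) and function names are all hypothetical. Plain PHP functions are used for the checks so the callback can be read in isolation; WordPress helpers such as sanitize_email or absint would be natural substitutes.

```php
<?php
/**
 * Hypothetical sanitization callback for an option stored as an array.
 * The field keys 'email', 'limit' and 'label' are made up for this sketch.
 */
function my_plugin_sanitize_settings( $input ) {
    $input = is_array( $input ) ? $input : array();
    $clean = array();

    // Keep the email only if it validates; otherwise store an empty string.
    $email          = isset( $input['email'] ) ? trim( (string) $input['email'] ) : '';
    $clean['email'] = ( false !== filter_var( $email, FILTER_VALIDATE_EMAIL ) ) ? $email : '';

    // Force the limit to a non-negative integer.
    $clean['limit'] = isset( $input['limit'] ) ? abs( (int) $input['limit'] ) : 0;

    // Remove any tags from a plain-text label.
    $clean['label'] = isset( $input['label'] ) ? strip_tags( (string) $input['label'] ) : '';

    return $clean;
}

// Hypothetical registration function, hooked on admin_init.
function my_plugin_register_settings() {
    register_setting( 'my_plugin_group', 'my_plugin_settings', 'my_plugin_sanitize_settings' );
}

// Guarded so the file can also be loaded outside WordPress.
if ( function_exists( 'add_action' ) ) {
    add_action( 'admin_init', 'my_plugin_register_settings' );
}
```

Whatever the callback returns is what gets stored, so any key it does not copy into $clean is dropped.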

You can trace how WP saves an option by starting with the source for register_setting (in wp-admin/includes/plugin.php):

<?php
/**
 * Register a setting and its sanitization callback
 *
 * @since 2.7.0
 *
 * @param string $option_group A settings group name. Should correspond to a whitelisted option key name.
 *  Default whitelisted option key names include "general," "discussion," and "reading," among others.
 * @param string $option_name The name of an option to sanitize and save.
 * @param unknown_type $sanitize_callback A callback function that sanitizes the option's value.
 * @return unknown
 */
function register_setting( $option_group, $option_name, $sanitize_callback = '' ) {
    global $new_whitelist_options;

    if ( 'misc' == $option_group ) {
        _deprecated_argument( __FUNCTION__, '3.0', __( 'The miscellaneous options group has been removed. Use another settings group.' ) );
        $option_group = 'general';
    }

    $new_whitelist_options[ $option_group ][] = $option_name;
    if ( $sanitize_callback != '' )
        add_filter( "sanitize_option_{$option_name}", $sanitize_callback );
}

The key bit is the last few lines of the function. If there’s a sanitization callback, WP will add it to the filter sanitize_option_{$option_name}.
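
To make that concrete, here is a rough sketch of hooking the same filter by hand for an option named your_setting. The callback name is hypothetical. Note that this only reproduces the filter half of register_setting; it does not add the option to the whitelist, as the source above does.

```php
<?php
// Hypothetical callback: strips tags and surrounding whitespace from a string option.
function my_plugin_clean_text( $value ) {
    return trim( strip_tags( (string) $value ) );
}

// Inside WordPress, this attaches the callback to the same filter that
// register_setting() would use for an option named 'your_setting'.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'sanitize_option_your_setting', 'my_plugin_clean_text' );
}
```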

That filter gets applied in sanitize_option (in wp-includes/formatting.php):

<?php
/**
 * Sanitises various option values based on the nature of the option.
 *
 * This is basically a switch statement which will pass $value through a number
 * of functions depending on the $option.
 *
 * @since 2.0.5
 *
 * @param string $option The name of the option.
 * @param string $value The unsanitised value.
 * @return string Sanitized value.
 */
function sanitize_option($option, $value) {

    switch ( $option ) {
        case 'admin_email' :
        case 'new_admin_email' :
            $value = sanitize_email( $value );
            if ( ! is_email( $value ) ) {
                $value = get_option( $option ); // Resets option to stored value in the case of failed sanitization
                if ( function_exists( 'add_settings_error' ) )
                    add_settings_error( $option, 'invalid_admin_email', __( 'The email address entered did not appear to be a valid email address. Please enter a valid email address.' ) );
            }
            break;

        case 'thumbnail_size_w':
        case 'thumbnail_size_h':
        case 'medium_size_w':
        case 'medium_size_h':
        case 'large_size_w':
        case 'large_size_h':
        case 'embed_size_h':
        case 'default_post_edit_rows':
        case 'mailserver_port':
        case 'comment_max_links':
        case 'page_on_front':
        case 'page_for_posts':
        case 'rss_excerpt_length':
        case 'default_category':
        case 'default_email_category':
        case 'default_link_category':
        case 'close_comments_days_old':
        case 'comments_per_page':
        case 'thread_comments_depth':
        case 'users_can_register':
        case 'start_of_week':
            $value = absint( $value );
            break;

        case 'embed_size_w':
            if ( '' !== $value )
                $value = absint( $value );
            break;

        case 'posts_per_page':
        case 'posts_per_rss':
            $value = (int) $value;
            if ( empty($value) )
                $value = 1;
            if ( $value < -1 )
                $value = abs($value);
            break;

        case 'default_ping_status':
        case 'default_comment_status':
            // Options that if not there have 0 value but need to be something like "closed"
            if ( $value == '0' || $value == '')
                $value = 'closed';
            break;

        case 'blogdescription':
        case 'blogname':
            $value = addslashes($value);
            $value = wp_filter_post_kses( $value ); // calls stripslashes then addslashes
            $value = stripslashes($value);
            $value = esc_html( $value );
            break;

        case 'blog_charset':
            $value = preg_replace('/[^a-zA-Z0-9_-]/', '', $value); // strips slashes
            break;

        case 'date_format':
        case 'time_format':
        case 'mailserver_url':
        case 'mailserver_login':
        case 'mailserver_pass':
        case 'ping_sites':
        case 'upload_path':
            $value = strip_tags($value);
            $value = addslashes($value);
            $value = wp_filter_kses($value); // calls stripslashes then addslashes
            $value = stripslashes($value);
            break;

        case 'gmt_offset':
            $value = preg_replace('/[^0-9:.-]/', '', $value); // strips slashes
            break;

        case 'siteurl':
            if ( (bool)preg_match( '#http(s?)://(.+)#i', $value) ) {
                $value = esc_url_raw($value);
            } else {
                $value = get_option( $option ); // Resets option to stored value in the case of failed sanitization
                if ( function_exists('add_settings_error') )
                    add_settings_error('siteurl', 'invalid_siteurl', __('The WordPress address you entered did not appear to be a valid URL. Please enter a valid URL.'));
            }
            break;

        case 'home':
            if ( (bool)preg_match( '#http(s?)://(.+)#i', $value) ) {
                $value = esc_url_raw($value);
            } else {
                $value = get_option( $option ); // Resets option to stored value in the case of failed sanitization
                if ( function_exists('add_settings_error') )
                    add_settings_error('home', 'invalid_home', __('The Site address you entered did not appear to be a valid URL. Please enter a valid URL.'));
            }
            break;

        case 'WPLANG':
            $allowed = get_available_languages();
            if ( ! in_array( $value, $allowed ) && ! empty( $value ) )
                $value = get_option( $option );
            break;

        case 'timezone_string':
            $allowed_zones = timezone_identifiers_list();
            if ( ! in_array( $value, $allowed_zones ) && ! empty( $value ) ) {
                $value = get_option( $option ); // Resets option to stored value in the case of failed sanitization
                if ( function_exists('add_settings_error') )
                    add_settings_error('timezone_string', 'invalid_timezone_string', __('The timezone you have entered is not valid. Please select a valid timezone.') );
            }
            break;

        case 'permalink_structure':
        case 'category_base':
        case 'tag_base':
            $value = esc_url_raw( $value );
            $value = str_replace( 'http://', '', $value );
            break;
    }

    $value = apply_filters("sanitize_option_{$option}", $value, $option);

    return $value;
}

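As you can see, there are a lot of cases there to handle all the built-ins, but there is no default case: a custom option gets no sanitization beyond whatever is hooked into the sanitize_option_{$option} filter.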

sanitize_option is called in update_option to clean/validate values before they go into the database:

<?php
/**
 * Update the value of an option that was already added.
 *
 * You do not need to serialize values. If the value needs to be serialized, then
 * it will be serialized before it is inserted into the database. Remember,
 * resources can not be serialized or added as an option.
 *
 * If the option does not exist, then the option will be added with the option
 * value, but you will not be able to set whether it is autoloaded. If you want
 * to set whether an option is autoloaded, then you need to use the add_option().
 *
 * @since 1.0.0
 * @package WordPress
 * @subpackage Option
 *
 * @uses apply_filters() Calls 'pre_update_option_$option' hook to allow overwriting the
 *  option value to be stored.
 * @uses do_action() Calls 'update_option' hook before updating the option.
 * @uses do_action() Calls 'update_option_$option' and 'updated_option' hooks on success.
 *
 * @param string $option Option name. Expected to not be SQL-escaped.
 * @param mixed $newvalue Option value. Expected to not be SQL-escaped.
 * @return bool False if value was not updated and true if value was updated.
 */
function update_option( $option, $newvalue ) {
    global $wpdb;

    $option = trim($option);
    if ( empty($option) )
        return false;

    wp_protect_special_option( $option );

    if ( is_object($newvalue) )
        $newvalue = clone $newvalue;

    $newvalue = sanitize_option( $option, $newvalue );
    $oldvalue = get_option( $option );
    $newvalue = apply_filters( 'pre_update_option_' . $option, $newvalue, $oldvalue );

    // If the new and old values are the same, no need to update.
    if ( $newvalue === $oldvalue )
        return false;

    if ( false === $oldvalue )
        return add_option( $option, $newvalue );

    $notoptions = wp_cache_get( 'notoptions', 'options' );
    if ( is_array( $notoptions ) && isset( $notoptions[$option] ) ) {
        unset( $notoptions[$option] );
        wp_cache_set( 'notoptions', $notoptions, 'options' );
    }

    $_newvalue = $newvalue;
    $newvalue = maybe_serialize( $newvalue );

    do_action( 'update_option', $option, $oldvalue, $_newvalue );
    if ( ! defined( 'WP_INSTALLING' ) ) {
        $alloptions = wp_load_alloptions();
        if ( isset( $alloptions[$option] ) ) {
            $alloptions[$option] = $_newvalue;
            wp_cache_set( 'alloptions', $alloptions, 'options' );
        } else {
            wp_cache_set( $option, $_newvalue, 'options' );
        }
    }

    $result = $wpdb->update( $wpdb->options, array( 'option_value' => $newvalue ), array( 'option_name' => $option ) );

    if ( $result ) {
        do_action( "update_option_{$option}", $oldvalue, $_newvalue );
        do_action( 'updated_option', $option, $oldvalue, $_newvalue );
        return true;
    }
    return false;
}

That’s kind of long-winded, but I wanted to show the process by which you can go through the source and figure out this sort of thing.

Edit

It should be noted that arrays of options will be serialized (via maybe_serialize) before they are stored.
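
On the question of whether serialization itself could be turning tags into HTML entities: maybe_serialize passes arrays to PHP’s own serialize, so a quick check in plain PHP shows what that step does to markup. The array contents here are made up for illustration.

```php
<?php
// Hypothetical option value containing markup.
$settings = array( 'label' => '<b>Hello</b>', 'limit' => 5 );

// This is the call maybe_serialize() makes for an array.
$stored = serialize( $settings );

// Check whether the markup survived verbatim and whether any entities appeared.
$has_raw_tag  = ( false !== strpos( $stored, '<b>Hello</b>' ) );
$has_entities = ( false !== strpos( $stored, '&lt;' ) );

// Reading the value back restores the original array.
$restored = unserialize( $stored );

var_dump( $has_raw_tag, $has_entities, $restored === $settings );
```

Serialization leaves the markup untouched, so if a stored value shows HTML entities, the conversion happened somewhere else, for example in a callback such as esc_attr or in the code that built the value.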
