Correct regex for wp_embed_register_handler

THE GOAL

I’m trying to parse a URL and convert it into an embedded video player in my post’s content, but I think my regEx is off or I’m not creating a provider correctly.

THE CODE

Here is what I have set up now.

wp_embed_register_handler('brightcove', '/(players.brightcove.net/)([^/]+)/([^/]+)/index\.html\?videoId=([\d]+)/g', 'wp_embed_handler_brightcove');

function wp_embed_handler_brightcove($matches, $attr, $url, $rawattr) {

    // var_dump($matches, $attr, $url, $rawattr);

    $account  = esc_attr($matches[ 1 ]);
    $player   = esc_attr($matches[ 2 ]);
    $video_id = esc_attr($matches[ 3 ]);

    $embed = '<div style="display: block; position: relative; max-width: 142.86%;"><div style="padding-top: 39.3742%;"><iframe src="'
    . sprintf('http://players.brightcove.net/%1$s/%2$s/index.html?videoId=%3$s',
              $account,
              $player,
              $video_id
    ) . '" allowfullscreen="" webkitallowfullscreen="" mozallowfullscreen="" style="width: 100%; height: 100%; position: absolute; top: 0px; bottom: 0px; right: 0px; left: 0px;"></iframe></div></div>';

    return apply_filters('embed_brightcove', $embed, $matches, $attr, $url, $rawattr);
}

TESTING

$embed_code = wp_oembed_get('http://players.brightcove.net/1234567890123/default_default/index.html?videoId=1234567789012');

echo $embed_code; // nothing :(

I do have a regEx working on regexr which uses:

(players.brightcove.net/)([^/]+)/([^/]+)/index\.html\?videoId=([\d]+)

and looks for:

http://players.brightcove.net/{account}/{player_id}/index.html?videoId={video_Id}

RESOURCES

  • https://codex.wordpress.org/Embeds
  • https://codex.wordpress.org/Function_Reference/wp_embed_register_handler
  • https://codex.wordpress.org/Function_Reference/wp_oembed_get
  • http://php.net/manual/en/function.preg-match-all.php
  • http://regexr.com

THE QUESTION

How can I get this custom embed handler to function correctly? Can I fix my regEx or am I not calling the methods correctly?


UPDATE: Experiment

I ran into VerbalExpressions a while ago and I wanted to see how this might work. The PHP version is here, but this is using the JS version that I can test on RegExr or RegEx101:

var tester = VerEx()
        .startOfLine()
        .maybe("http")
        .maybe("s")
        .maybe(":")
        .then("//players.brightcove.net/")
        .word()
        .then("/")
        .word()
        .then('/index.html?videoId=')
        .word()
        .maybe('&')
        .anythingBut(" ")
        .endOfLine();

Although it compiles down to something that works, it still needs some cleanup: essentially, only capturing the groups I’m interested in and converting some of the original groups to non-capturing groups.

// Before

/^(http)(s)?(\:\/\/players\.brightcove\.net\/)\w+(\/)\w+(\/index\.html\?videoId\=)\w+(\&)?([^\ ]*)$/gm

// After

/^(?:http)?(?:s)?(?:\:)?\/\/players\.brightcove\.net\/(\d+)\/(\w+)\/index\.html\?videoId\=(\d+)\&?[^\ ]*$/gm

UPDATE: SOLUTION

Modified slightly from @birgire’s answer because the account will always be numeric.

wp_embed_register_handler(
    'brightcove',
    '#https?://players\.brightcove\.net/([\d]+)/([^/]+)/index\.html\?videoId=([\d]+)#', 
    'wp_embed_handler_brightcove');
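
As a quick sanity check outside WordPress, the same pattern can be run through plain preg_match() to see which capture group lands at which index of $matches. This is only a sketch: the URL is the sample from the testing section, and the variable names simply mirror the callback.

```php
<?php
// Same pattern as registered above; the # delimiters avoid escaping the slashes.
$pattern = '#https?://players\.brightcove\.net/([\d]+)/([^/]+)/index\.html\?videoId=([\d]+)#';

// Sample URL from the testing section.
$url = 'http://players.brightcove.net/1234567890123/default_default/index.html?videoId=1234567789012';

$matches = array();
if ( preg_match( $pattern, $url, $matches ) ) {
    // $matches[0] is the full match; the capture groups start at index 1.
    $account  = $matches[1];
    $player   = $matches[2];
    $video_id = $matches[3];
    printf( "%s | %s | %s\n", $account, $player, $video_id );
}
```

With this pattern the indices 1, 2 and 3 line up with the account, player and video ID that the callback reads.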


Just a few notes here:

  • We have to be careful using % within sprintf() to avoid confusion with the placeholders. Try to remove the CSS styles.

  • It’s sometimes easier to use the # or ~ delimiters in regular expressions, instead of the / delimiter.

  • Since (players.brightcove.net/) is the first capture group in your pattern, $matches[1] holds the host name rather than the account, so this assumption no longer holds:

    $account  = esc_attr( $matches[1] );
    

    Try instead:

    #https?://players\.brightcove\.net/([^/]+)/([^/]+)/index\.html\?videoId=([\d]+)#
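
To illustrate the first note: a literal % inside a sprintf() format string has to be written as %%, otherwise PHP tries to read it as the start of a placeholder. The sketch below uses made-up account, player and video values and a simplified wrapper. In the callback from the question the CSS sits outside the sprintf() call, so this only becomes relevant if the styles are moved into the format string.

```php
<?php
// Hypothetical values standing in for the captured groups.
$account  = '1234567890123';
$player   = 'default_default';
$video_id = '1234567789012';

// Literal percent signs are doubled (%%) so sprintf() does not treat them as placeholders.
$embed = sprintf(
    '<div style="max-width: 100%%;"><iframe src="http://players.brightcove.net/%1$s/%2$s/index.html?videoId=%3$s" style="width: 100%%; height: 100%%;"></iframe></div>',
    $account,
    $player,
    $video_id
);

echo $embed, "\n";
```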
    

This seems to work in the content editor:

add_action( 'init', function()
{
    wp_embed_register_handler( 
       'brightcove', 
       '#https?://players\.brightcove\.net/([^/]+)/([^/]+)/index\.html\?videoId=([\d]+)#', 
       'wp_embed_handler_brightcove' 
    );
} );

where wp_embed_handler_brightcove() is the callback function defined by @jgraup above.

Here’s a related answer I worked on recently.

Playing with PHPVerbalExpressions

@jgraup mentioned the library PHPVerbalExpressions – “PHP Regular expressions made easy”.

Here’s an attempt to use it:

$regex = new \VerbalExpressions\PHPVerbalExpressions\VerbalExpressions();

$regex->removeModifier( 'm' )
//      ->startOfLine()
      ->then( 'http' )
      ->maybe( 's' )
      ->then( '://players.brightcove.net/' )
      ->anythingBut( "/" )
      ->then( "/" )
      ->anythingBut( "/" )
      ->then( '/index.html?videoId=' )
      ->add( '([\d]+)' );
//      ->endOfLine();

This generates the following pattern:

/(?:http)(?:s)?(?:\:\/\/players\.brightcove\.net\/)(?:[^\/]*)(?:\/)(?:[^\/]*)(?:\/index\.html\?videoId\=)([\d]+)/

or if we expand it:

/
   (?:http)
   (?:s)?
   (?:\:\/\/players\.brightcove\.net\/)
   (?:[^\/]*)
   (?:\/)
   (?:[^\/]*)
   (?:\/index\.html\?videoId\=)
   ([\d]+)
/

We would then have to adjust the keys we read from $matches accordingly.
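
For example, the generated pattern has a single capturing group, so only the video ID ends up in $matches[1]. One possible adjustment, shown here as a hand-edited sketch rather than something the library produced, is to turn the two (?:[^\/]*) groups into capturing groups so the indices line up with the callback again. The variable names are illustrative.

```php
<?php
$url = 'http://players.brightcove.net/1234567890123/default_default/index.html?videoId=1234567789012';

// Pattern as generated above: only the video ID is captured.
$generated = '/(?:http)(?:s)?(?:\:\/\/players\.brightcove\.net\/)(?:[^\/]*)(?:\/)(?:[^\/]*)(?:\/index\.html\?videoId\=)([\d]+)/';
preg_match( $generated, $url, $m );
// $m[1] holds the video ID; there is no $m[2] or $m[3].

// Hand-edited variant (an assumption, not library output): capture account and player too.
$adjusted = '/(?:http)(?:s)?(?:\:\/\/players\.brightcove\.net\/)([^\/]*)(?:\/)([^\/]*)(?:\/index\.html\?videoId\=)([\d]+)/';
preg_match( $adjusted, $url, $n );
// $n[1] = account, $n[2] = player, $n[3] = video ID.

var_dump( count( $m ), count( $n ) );
```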
