wp_embed_register_handler to embed html files

I am creating a new WordPress website that will contain thousands of online books and articles to read, in addition to other media, like videos.

I don’t want to bloat my database by pasting in tens of thousands of pages of text, so I want to create the book chapters as HTML files, then embed those files in the WordPress posts. I believe this will reduce the database size significantly.

I uploaded a test HTML file here. Browsers can read the file, so there are no errors in it.

I installed the latest version of Gutenberg, and tried to embed this url in the Embed URL block, but when I click Embed, I get this message: Sorry, we could not embed that content.

I followed the answer to this StackExchange question (question #238330), and the answer to this other StackExchange question (question #277163), but without luck.

When I used the code from the answer to question #238330, I was able to embed videos from forbes.com successfully, using both Gutenberg and the default WordPress post editor.

However, when I changed the code to read from my website, nothing happens (i.e. the url shows as a line of text in the post), so obviously I did something wrong in the code.

Here is the code I modified:

/**
 * Embed support for Coptic Treasures Texts
 *
 * Usage Example:
 *
 *     https://coptic-treasures.com/html-test-filed/02.html
 */
add_action( 'init', function()
{
    wp_embed_register_handler( 
        'coptic-treasures', 
        '#http://www\.coptic-treasures\.com/?#i', 
        'wp_embed_handler_coptic_treasures' 
    );

} );

function wp_embed_handler_coptic_treasures( $matches, $attr, $url, $rawattr )
{
    $embed = sprintf(
        '<iframe class="coptic-treasures-texts" src="https://coptic-treasures.com/html-test-filed/%1$s.html" width="600" height="400" frameborder="0" scrolling="no"></iframe>',
        esc_attr( $matches[1] ) 
     );

    return apply_filters( 'embed_coptic_treasures', $embed, $matches, $attr, $url, $rawattr );
}

My questions are:

1- Can someone please point out the errors in the modified code?

2- Can I embed without an iframe? My website is responsive, and I want the text to be seamless in the page, and discoverable by Google.

I previously tried a plugin that achieves the required result, but I stopped using it because a bug in its code consumes all the server resources if a single URL is wrong. The plugin was also unmaintained.

Thanks in advance.

EDIT

I modified the code a bit, and it is now embedding the text. However, I am still facing a problem in the code.

The problem is that to embed this URL: https://coptic-treasures.com/html-test-filed/02.html, I have to enter this URL instead: https://coptic-treasures.com/02.html

This is because the html-test-filed/ part is hard-coded in the function I am using. I tried to replace it with a variable, but could not do so successfully.

I want to be able to change the URL of the embedded files.

Thanks.

/**
 * Embed support for Coptic Treasures Texts
 *
 * Usage Example:
 *
 *     https://coptic-treasures.com/html-test-filed/02.html
 */
add_action( 'init', function()
{
    wp_embed_register_handler( 
        'coptic-treasures', 
        '#https://coptic-treasures.com/([a-z0-9_-]+)\.html$#i', 
        'wp_embed_handler_coptic_treasures' 
    );

} );

function wp_embed_handler_coptic_treasures( $matches, $attr, $url, $rawattr )
{
    $embed = sprintf(
        '<iframe class="coptic-treasures-texts" src="https://coptic-treasures.com/html-test-filed/%1$s.html" width="1200" height="1200" frameborder="0" scrolling="no"></iframe>',
        esc_attr( $matches[1] ) 
     );

    return apply_filters( 'embed_coptic_treasures', $embed, $matches, $attr, $url, $rawattr );
}

1 Answer

Fixing the regex pattern

To match a URL of the form:

https://coptic-treasures.com/{Some string with a mix of a-z letters and hyphen}/{Some number}.html

like this example:

https://coptic-treasures.com/html-test-filed/02.html

you can try this kind of pattern:

'#https://coptic-treasures\.com/([a-z-]+)/([0-9]+)\.html$#i'

and then you have to update the iframe output accordingly:

... src="https://coptic-treasures.com/%1$s/%2$s.html" ...

with the corresponding matches.
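
Putting the two pieces together, the handler from the question might look roughly like the sketch below. It reuses the handler name, class name, iframe attributes and filter name from the question's code. The escaped dot in the host name and the leading `^` anchor are my additions to the pattern, and the URL shape (one directory of letters and hyphens, then a numeric file name) is an assumption taken from the example URL.

```php
<?php
/**
 * Embed support for Coptic Treasures texts (sketch).
 *
 * Assumed URL shape:
 *     https://coptic-treasures.com/{directory}/{number}.html
 */
add_action( 'init', function () {
    wp_embed_register_handler(
        'coptic-treasures',
        // Group 1: directory (letters and hyphens). Group 2: numeric file name.
        '#^https://coptic-treasures\.com/([a-z-]+)/([0-9]+)\.html$#i',
        'wp_embed_handler_coptic_treasures'
    );
} );

function wp_embed_handler_coptic_treasures( $matches, $attr, $url, $rawattr ) {
    // Both captured parts are placed back into the iframe source,
    // so nothing about the directory is hard-coded any more.
    $embed = sprintf(
        '<iframe class="coptic-treasures-texts" src="https://coptic-treasures.com/%1$s/%2$s.html" width="1200" height="1200" frameborder="0" scrolling="no"></iframe>',
        esc_attr( $matches[1] ),
        esc_attr( $matches[2] )
    );

    return apply_filters( 'embed_coptic_treasures', $embed, $matches, $attr, $url, $rawattr );
}
```

If the snippet goes into an existing functions.php or plugin file, the opening `<?php` tag should be dropped. File names that contain letters would need a wider second group, for example `([a-z0-9_-]+)` as in the question's edit.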

Demo

Here’s a test run:

[image: text]

that generates:

[image: embed]
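
The pattern can also be checked outside WordPress with the PHP command line, which helps when a URL silently stays as plain text in the post. The sketch below is only an illustration: `coptic_treasures_src()` is a hypothetical helper name, and the pattern is the one suggested above with the dot escaped and a `^` anchor added.

```php
<?php
// Hypothetical helper: returns the rebuilt iframe source for a matching URL,
// or null when the URL does not fit the expected shape.
function coptic_treasures_src( $url ) {
    $pattern = '#^https://coptic-treasures\.com/([a-z-]+)/([0-9]+)\.html$#i';

    if ( 1 !== preg_match( $pattern, $url, $matches ) ) {
        return null;
    }

    // $matches[1] is the directory, $matches[2] the numeric file name.
    return sprintf( 'https://coptic-treasures.com/%1$s/%2$s.html', $matches[1], $matches[2] );
}

// Matches: the directory and the number are both captured.
var_dump( coptic_treasures_src( 'https://coptic-treasures.com/html-test-filed/02.html' ) );

// Does not match: there is no directory segment, so the helper returns null.
var_dump( coptic_treasures_src( 'https://coptic-treasures.com/02.html' ) );
```

A URL that returns null here would not be picked up by an embed handler registered with the same pattern.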

Some notes

As far as I remember the embeds write some post meta data, so your approach does not avoid database actions.

But many WordPress installations have tens of thousands of non-hierarchical custom posts without major problems, so I would not rule that option out. Make sure you have a good hosting provider that gives you the latest PHP version, etc. If the search becomes too slow, there are third-party solutions and services available.

Note that it could be a problem with the WordPress admin interface having so many hierarchical posts, because the parent dropdowns fetch all pages.

It’s also probably much easier to edit the data in the WordPress backend (utilizing things like revisions and taxonomies) than to find the corresponding bare file each time, open it, and modify it that way, unless it’s exported from another system you have in place.

Another approach might be to use e.g. Vue/React/… JS to fetch the pure data as JSON(P) resources and format it as needed, but I’m not sure about the search engine visibility in that case.
