Importing a JSON feed: should the content be sanitized?

I’ve been doing a lot of research and had several discussions in chat about learning to take a JSON feed and add the content to a post in WordPress.

In the past I would create a page template and, through the settings admin panel, create a form field for a particular API’s key. When the template was used it would pull the JSON into a folder and loop through the results on a single page, with links to the site each item originated from. Now I would like to learn how to take the JSON and add it to the database to be used in a post.

Research and discussions in chat have turned up several tutorials and Q&As on the subject, such as:

  • Using the WP-API to Fetch Posts
  • how do i convert this JSON array to html then create a WordPress post with the collected data?
  • simple wp_insert_post example
  • wp_insert_post with a form
  • Programmatically Create a Post in WordPress

On importing the data, I was able to find:

  • Making Remote Requests with wp_remote_get
  • Import JSON feed to WordPress
  • How to import external JSON and display in php
  • JSON decoding from a URL

However, I’ve always been taught that any incoming data should be treated as bad input and should be tested and cleaned. When researching how to sanitize input I did run across:

  • Validating Sanitizing and Escaping User Data
  • Data Validation

However, I didn’t really see anything discussing whether JSON should be sanitized, and when I looked for a hook I didn’t find one. So this leads me to my question: when calling a JSON feed from a third-party source, should the JSON be sanitized before adding it to the database, and if so, does WordPress have a built-in way to do it? When I searched for ways to sanitize JSON I didn’t find anything built into WordPress that does it.

1 Answer

There are two aspects here:

  1. Obviously, all input should be sanitized.
  2. JSON is just a wrapper, no different from any other type of container used to aggregate data for transmission. You almost never sanitize the container itself, because if it is malformed you simply will not be able to extract the data from it. Each piece of data inside it, however, should be sanitized. Since sanitization depends on context (a title, a URL, and a block of HTML each need different treatment), it is impossible to have one generic rule that can be applied to all data in all possible contexts.
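To make the container/content distinction concrete, here is a minimal, language-neutral sketch written in Python rather than PHP. The field names (`title`, `link`, `views`) and all the helper names are made up for illustration and are not part of any real feed or of WordPress. In WordPress itself the rough analogues would be `sanitize_text_field()`, `esc_url_raw()` and `absint()`, but check their exact behaviour before relying on this mapping.

```python
import json
import re
from urllib.parse import urlparse

# Naive tag stripper, good enough for a sketch; not a full HTML sanitizer.
_TAG_RE = re.compile(r"<[^>]*>")


def clean_text(value):
    """Plain-text context: drop anything that looks like a tag, collapse whitespace."""
    if not isinstance(value, str):
        return ""
    return " ".join(_TAG_RE.sub("", value).split())


def clean_url(value):
    """URL context: accept only absolute http(s) URLs, otherwise return ''."""
    if not isinstance(value, str):
        return ""
    try:
        parsed = urlparse(value.strip())
    except ValueError:
        return ""
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return parsed.geturl()
    return ""


def clean_int(value):
    """Numeric context: coerce to a non-negative integer, defaulting to 0."""
    try:
        return max(0, int(value))
    except (TypeError, ValueError, OverflowError):
        return 0


def parse_feed_item(raw_json):
    """Decode the container, then sanitize each field for its own context."""
    try:
        data = json.loads(raw_json)  # the container is validated, not "sanitized"
    except json.JSONDecodeError:
        return None                  # malformed JSON: nothing to extract
    if not isinstance(data, dict):
        return None
    return {
        "title": clean_text(data.get("title")),
        "link": clean_url(data.get("link")),
        "views": clean_int(data.get("views")),
    }
```

Note that the JSON string itself is never cleaned: it either decodes or it is rejected, and only the extracted values get a context-specific cleaner.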

So you need to do sanitization, but can you just trust the core API to do it for you? Again, it depends on context. If you just need the data to be stored in the DB without breaking anything, then the DB access APIs will do everything for you. But if the content of a post has to be XHTML compliant, then you will have to write your own validation.
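The difference between "safe to store" and "valid for my purposes" can be sketched as follows, again in Python as a stand-in rather than WordPress code. The `posts` table, `open_demo_db`, `store_post` and `is_well_formed_fragment` are all hypothetical names, SQLite stands in for the real database, and the check only tests XML well-formedness, which is necessary for XHTML compliance but not sufficient (it will also reject named entities such as `&nbsp;`).

```python
import sqlite3
import xml.etree.ElementTree as ET


def open_demo_db():
    """In-memory database with a hypothetical posts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, content TEXT)"
    )
    return conn


def store_post(conn, title, content):
    """Storage safety: placeholders let the DB layer handle quoting."""
    cur = conn.execute(
        "INSERT INTO posts (title, content) VALUES (?, ?)", (title, content)
    )
    conn.commit()
    return cur.lastrowid


def is_well_formed_fragment(markup):
    """Content validity: this part is your own job, not the DB layer's.

    The fragment is wrapped in a single root element so it can be parsed;
    a DOCTYPE inside that wrapper is itself a parse error, so the feed
    cannot declare its own entities.
    """
    try:
        ET.fromstring("<div>" + markup + "</div>")
    except ET.ParseError:
        return False
    return True
```

A hostile-looking title such as `x'); DROP TABLE posts; --` is stored verbatim without harming the table, while `<p>unclosed` is stored just as happily unless you run your own check first and reject it.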

In one line: always sanitize input as much as possible. Once everything is sanitized at the “data model” level, you can trust the WordPress API not to generate additional gotcha moments.
