WordPressFeedToBook Google Apps Script to Create WordPress Blog Book (posts and comments) from Feed: Description and Stable Version Info

Last updated on 10 Sep. 2023

10 Sep. 2023 Update: I rarely use the software covered in this post now. To see the current software that I use for creating WordPress blogbooks, please visit: Generated blogbook of my Misc. Tech. WordPress blog using my VBA program WPBlogExportFileToBook, https://ravisiyermisc.blogspot.com/2023/08/generated-blogbook-of-my-misc-tech.html .

end-Update 10 Sep. 2023

The WordPressFeedToBook project has code (functions) that can be run from the Google Apps Script Editor to create a blog book (or blog books) containing posts and comments, but not pages, from a WordPress blog feed (XML feed). The number of posts WordPress returns for a feed request is controlled by the WordPress Admin settings of the associated blog; the default value is 10 but it can be increased (to at least 150). One can also retrieve the feed for a particular year or even a particular month of a year. The functions in the WordPressFeedToBook project make a Google Docs document (book) of the posts returned in the feed by WordPress. By making appropriate Admin settings and feed requests, one should be able to put the contents of all posts of a WordPress blog into books (or a single book if the number of posts is around 100, say).
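
A minimal sketch of how such feed URLs could be built is given below. buildFeedUrl is a hypothetical helper and is not part of the WordPressFeedToBook project. The whole-blog and year forms match the URLs used in Run-Driver.gs later in this post; the zero-padded month form is an assumption based on how WordPress date archive URLs usually look.

```javascript
// Hypothetical helper (not part of WordPressFeedToBook): builds a WordPress
// feed URL for the whole blog, for one year, or for one month of a year.
function buildFeedUrl(blogUrl, year, month) {
  // Make sure the blog URL ends with exactly one slash.
  var base = blogUrl.replace(/\/+$/, '') + '/';
  if (year == null) {
    return base + 'feed/';          // whole-blog feed
  }
  var path = String(year) + '/';
  if (month != null) {
    // Assumption: month archives use a two-digit month, e.g. 2022/07/.
    path += ('0' + month).slice(-2) + '/';
  }
  return base + path + 'feed/';
}
```

For example, buildFeedUrl('https://ravisiyer.wordpress.com/', 2022) gives 'https://ravisiyer.wordpress.com/2022/feed/', which is the URL used by makeBFBForOneYear(). The returned string can be passed as the blogFeedURL argument of makeWPBlogFeedBook().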

These generated blog book(s) will not have blog pages. However, blog page content can be added manually to the generated blog books, as can a table of contents and page numbers.

The current stable version of the WordPressFeedToBook project is 20230710-WordPressFeedToBook version. It consists of three code files:

1) Code.gs.txt: https://drive.google.com/file/d/1kpz1LHZ4DEQxLEvzITWVtYNo5yXnAZz4/view?usp=drive_link .

2) XMLToJSON.gs.txt: https://drive.google.com/file/d/1xiUEPZOtAOFWpm2M5vbQSF_OYU18bXRI/view?usp=drive_link

3) Run-Driver.gs.txt: https://drive.google.com/file/d/1MedXqF-qZ66GEP0YQY476AD1rqWgmLC9/view?usp=drive_link .

Note that a copy of file 1 and file 3 mentioned above (Code.gs.txt and Run-Driver.gs.txt, which have a lot of my code) is provided towards the bottom of this post. Further note that the code in these two files is free for others to use and modify. File 2, XMLToJSON.gs.txt, has code taken from https://gist.github.com/erickoledadevrel/6b1e9e2796e3c21f669f . Many thanks to the author for sharing that code.

The Drive API service has to be added/enabled for the WordPressFeedToBook project to run successfully. When the program is run for the first time, a lot of permissions are requested and these have to be granted. Also, the code is not verified by Google and hence it is deemed unsafe. To get around this issue, in the "Google hasn't verified this app" warning dialog, the Advanced button has to be clicked and then the button to run the unsafe code (button label: Go to app-name (unsafe)) has to be clicked. Note that in this dialog's message "Continue only if you understand the risks and trust the developer (dev-google-ac-id)", dev-google-ac-id will be the Google account id of the person running this code from the Google Apps Script Editor.

Short background 

This work builds up on my previous project(s) covered in the following posts: 

The projects described in the two abovementioned posts are based on "bloggerToEbook" project shared under MIT license here: https://github.com/hn-88/bloggerToEbook . I would like to thank the author of that project. 

Readers who are not familiar with Google Apps Script should go through the relevant parts of the two abovementioned posts. I am skipping that introductory information in this post.

8 Aug. 2023 Update

A minor change was made: the function makeRavisiyerWPCommentsBlogBook() was added to Run-Driver.gs. It creates a blog book having only the comments (no posts and no pages) of the ravisiyer.wordpress.com blog. No other changes were made to the code.

Ideally, we should have a single blog book having the posts and pages of the blog, with the comments on each post or page included after it. But there are some challenges in doing that in this GAS program.

https://wordpress.com/support/feeds/ tells us that posts and comments can be requested separately via RSS feed requests for a WordPress blog. But there is no mention of retrieving pages of the blog via an RSS feed request. In my quick reading of the above article, I did not come across any way to retrieve posts along with their comments. So I think one has to retrieve posts and comments separately, write code to link the retrieved comments to the retrieved posts, and then create the blogbook. While I could write that linking code, pages are not returned anyway, so I felt I should not invest the time to write it.
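
The linking step described above could be sketched roughly as follows. This is only an illustration and is not code from the project. postUrlOfComment and groupCommentsByPost are hypothetical names. The inputs are assumed to be arrays of feed items in the shape used in Code.gs (link.Text), and it is assumed that a comment's link is the post's link followed by an optional comment-page-N/ segment and a #comment-ID fragment.

```javascript
// Assumption: a comment link looks like
//   <post link>#comment-123   or   <post link>comment-page-1/#comment-123
function postUrlOfComment(commentLink) {
  return commentLink
    .replace(/#.*$/, '')                    // drop the '#comment-123' fragment
    .replace(/comment-page-\d+\/?$/, '');   // drop the paging segment, if present
}

// Groups comment items under the link of the post they belong to.
// Comments whose post is not in the posts list are collected in 'orphans'.
function groupCommentsByPost(posts, comments) {
  var byPost = {};
  posts.forEach(function (post) {
    byPost[post.link.Text] = [];
  });
  var orphans = [];
  comments.forEach(function (comment) {
    var key = postUrlOfComment(comment.link.Text);
    if (Object.prototype.hasOwnProperty.call(byPost, key)) {
      byPost[key].push(comment);
    } else {
      orphans.push(comment);
    }
  });
  return { byPost: byPost, orphans: orphans };
}
```

With this approach, comments made on blog pages would land in orphans, since pages are not returned in the posts feed.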

That meant that I could use this program to produce two blogbooks for a WordPress blog: a posts blogbook and a comments blogbook. For pages, I would have to explore another way.

makeRavisiyerWPCommentsBlogBook() is the function I added to Run-Driver.gs. It simply provides the appropriate feed request argument to the main function makeWPBlogFeedBook to create a blog comments book.
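
As a rough sketch only (this is not the code of makeRavisiyerWPCommentsBlogBook() itself), the feed request argument for a comments book could be built as shown below. It assumes that the blog-wide comments feed is available at /comments/feed/ under the blog URL; buildCommentsFeedUrl and the book title in the comment are hypothetical.

```javascript
// Hypothetical helper: returns the blog-wide comments feed URL of a WordPress blog.
// Assumption: the comments feed lives at <blog URL>/comments/feed/.
function buildCommentsFeedUrl(blogUrl) {
  return blogUrl.replace(/\/+$/, '') + '/comments/feed/';
}

// In the Apps Script project, the result could then be passed to the main function, e.g.:
//   makeWPBlogFeedBook(buildCommentsFeedUrl('https://ravisiyer.wordpress.com/'),
//                      'ravisiyer.wordpress.com Comments Book');  // hypothetical title
```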

Public share folder: 20230807-WordPressFeedToBook,  https://drive.google.com/drive/folders/1P1nZ1r2BAlkKJCeE6c9HbMhkD92T5Bzr?usp=drive_link .

Comments (only) blog book for ravisiyer.wordpress.com:  https://drive.google.com/file/d/1XL5aBfZhkYfdCUo-swLkRPT53brPk2EK/view?usp=drive_link .

RunInfo-ExecLog.txt for the associated run:  https://drive.google.com/file/d/1OtRWnyr14eX7Fs_PKYVBg6N3NvcdQe5Q/view?usp=drive_link .

Run-Driver.gs.txt having the modified code: https://drive.google.com/file/d/1u-JJYw_mDIb81uYjw072CDvVZthFNBWu/view?usp=drive_link .

8 Aug. 2023 Update end

About current stable version of WordPressFeedToBook (20230710-WordPressFeedToBook version)

Introductory information on WordPress feeds and related matters, along with detailed info. on test versions and the stable version, is covered in my post: Google Apps Script to Create WordPress Blog Book (or Book parts): Test Versions and Stable Version, https://ravisiyermisc.blogspot.com/2023/07/google-apps-script-to-create-wordpress.html

The current stable version is 20230710-WordPressFeedToBook version. The following part of this section gives concise info. about it based on the abovementioned post.


Key code files:

1) Code.gs.txt: https://drive.google.com/file/d/1kpz1LHZ4DEQxLEvzITWVtYNo5yXnAZz4/view?usp=drive_link . 

It has the function makeWPBlogFeedBook(blogFeedURL, bookTitle) which is the main function doing the blog feed book work.

2) XMLToJSON.gs.txt: https://drive.google.com/file/d/1xiUEPZOtAOFWpm2M5vbQSF_OYU18bXRI/view?usp=drive_link

It has the function XML_to_JSON() which converts the XML feed returned by WordPress to a JSON object.


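3) Run-Driver.gs.txt: https://drive.google.com/file/d/1MedXqF-qZ66GEP0YQY476AD1rqWgmLC9/view?usp=drive_link .

It has the following functions invoking the main function makeWPBlogFeedBook(blogFeedURL, bookTitle) of the Code.gs file. In the function names below, BFB is the acronym for Blog Feed Book: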
  • function makeBFBWithDefaultValues(): Invokes makeWPBlogFeedBook with default values which will make a book of the default blog feed. This function can be run from Script Editor to test makeWPBlogFeedBook function.
  • function makeRavisiyerWPBlogBook(): Invokes makeWPBlogFeedBook for ravisiyer.wordpress.com full blog. Requires WordPress Admin->Settings->Reading->Syndication feeds to be set higher than the number of published posts in ravisiyer.wordpress.com.
  • function makeBFBForOneYear(): Invokes makeWPBlogFeedBook for a particular blog feed (for a year).
  • function makeBFBYearWise(): Invokes makeWPBlogFeedBook for a particular blog feed with year parameter in a loop.

The test runs of this version are described in the post mentioned at the beginning of this section: Google Apps Script to Create WordPress Blog Book (or Book parts): Test Versions and Stable Version, https://ravisiyermisc.blogspot.com/2023/07/google-apps-script-to-create-wordpress.html . I have provided below only the links to files/folder associated with the runs. For details, readers are requested to visit the aforementioned post.

Run1
Run-Driver.gs function makeBFBWithDefaultValues() was run from Script Editor.
6 Aug. Update: Later, the issue of images exceeding the page size in the above output file was fixed, page numbers were inserted and a Table of Contents was generated. The Word document with these changes is: https://drive.google.com/file/d/1A36oevckYp5IhqiQpWaLrtjVtvfXR_wO/view?usp=drive_link (385 pages, 10 MB). And the PDF file created from it is: https://drive.google.com/file/d/1fmRi4HH5yi-mMKs5nqj3mG0XV9aVGQAC/view?usp=drive_link (385 pages, 8.5 MB). 6 Aug Update-end

Run2
Run-Driver.gs function makeRavisiyerWPBlogBook() was run from Script Editor.

Run3
Run-Driver.gs function makeBFBForOneYear() was run from Script Editor.

Run4
Run-Driver.gs function makeBFBYearWise() was run from Script Editor.

Some details about the image size issue in some of the output books of the test runs are given below:

The image issue is that some images are bigger than the page size. The images are NOT cropped: reducing an image's size manually shows the whole image on the page. A workaround is available in the Google Docs page settings. With the default page settings of Portrait, Letter and 1 inch margins (left, right, top and bottom), we have the image size issue for some images. Changing the page settings of Google Docs (File->Page Setup) to pageless seems to reduce the sizes of all images in the document to fit the view size! (For some documents this takes some time, so one has to wait after clicking OK in the dialog window.) So pageless is a solution for viewing all images in the Google Docs document on a computer!

Another point is that, as of now, these docs mainly serve me as a human-readable backup of the blog (as against the XML export), so the image issue is manageable. I can leave the images as is and, when I need to retrieve the data for a post, manually resize only those images of that post that do not fit the page. Anybody who does use these docs as a book to read will have to put in the effort to resize those images which do not fit the page and which he/she wants to see in full (or he/she can simply visit the blog post link provided along with the content for that post in the book).

The final set of runs has been successful, with some issues such as images being too big for the page in some cases. I am willing to live with these issues, as manual resizing is possible and the pageless setting in Google Docs is a workaround. So this 20230710-WordPressFeedToBook version can be viewed as a stable version.

Current Stable Version - 20230710-WordPressFeedToBook version - My Code Copy

My code is in two files: Code.gs and Run-Driver.gs (the third file being XMLToJSON.gs, whose original link has been shared earlier in this post). I felt it appropriate to provide a copy of these two files with my code in this post itself, and so have given them below.

--- Start Code.gs (copy-pasted on 11 Jul 2023) ---

// This code is based on the code available here: https://github.com/hn-88/bloggerToEbook which is
// licensed under "MIT License" .."Copyright (c) 2022 hn-88".
// The details of that license can be viewed here: https://github.com/hn-88/bloggerToEbook/blob/main/LICENSE
// As the license allows others "the rights to use, copy, modify, merge, publish" the software, I have used
// it in my code given below and associated web app.
// The license for this modified version (code below) (modifications done by me, Ravi S. Iyer) is provided
// in my blog post:
// All my blog data and books publicly accessible on Google Drive; Permission for free reuse,
// https://ravisiyer.blogspot.com/p/all-my-blogbooks-publicly-accessible-on.html .
// To know more about how I modified this code and tried it out, please visit my blog post:
// Google Apps Script to Create WordPress Blog Books: In-Progress Test Versions, 
// https://ravisiyermisc.blogspot.com/2023/07/google-apps-script-to-create-wordpress.html ,
// created on 9 Jul 2023.
//
/* I (Ravi S. Iyer) am now an obsolete software developer as I have stayed away from coding for around, if
  not over, a decade now, barring very tiny JavaScript tweaking of others' code to suit my needs. Prior to
  the recent and earlier exploration covered here: Google Apps Script to Create Blogger Blog Books: Test Versions 
  and Stable Version, https://ravisiyermisc.blogspot.com/2023/06/google-apps-script-to-create-blogger.html , I had
  never used Google Apps Script - so the development platform as well as the  development environment is new
  to me. I want to limit the time I spend on this work. So I am looking at quick fix modifications to the code
  which may be bad coding style. I also may not be calling the right API functions or calling them the right way
  - I just don't have the time to read up on the development platform / API barring just quick viewing of
  reference pages and help I get from Google Search result links, to figure out what I should try. I am not
  drastically changing the original code and am trying to stick to its style.  

  I am focusing on modifications for my specific needs and not for any general purpose needs. So my code may 
  have bugs and problems which do not come into play when I am using it for my specific needs but come into 
  play when others use it for different needs. I am not in a position now to help fixing such issues. Other 
  developers are absolutely welcome to do such fixes or other changes and re-publish their version of this 
  software. */


const runType = {
  NORMAL: 1,
  TEST: 2,
};

const thisRun = runType.NORMAL;  // Change to runType.TEST for test run

const testRunPostsLoopIterations = 10; // Used only if thisRun is runType.TEST, otherwise ignored

const DEF_BLOG_FEED_URL = 'https://ravisiyer.wordpress.com/feed/';
const DEF_BOOK_TITLE = 'Blog Feed Book'; 

// In function name below, WP stands for WordPress
function makeWPBlogFeedBook(blogFeedURL, bookTitle) {
  
  Logger.log("makeWPBlogFeedBook arguments: blogFeedURL = " + blogFeedURL + ", bookTitle = " + bookTitle);

  var options = {
    method: 'GET',
    muteHttpExceptions: true,
  };
  
  if (blogFeedURL == null)
    blogFeedURL = DEF_BLOG_FEED_URL;

  if (bookTitle == null)
    bookTitle = DEF_BOOK_TITLE;

  Logger.log("After arguments check for null values, blogFeedURL set to: " + blogFeedURL + ", bookTitle set to: " + bookTitle);

  var xmlFeed = UrlFetchApp.fetch(blogFeedURL, options);    
//  Logger.log("xmlFeed:");
//  Logger.log(xmlFeed);

  var json = XML_to_JSON(xmlFeed); 
//  Logger.log("json:");
//  Logger.log(json);

  var contenthtml = '';
  const now = new Date();
  var timeZone = Session.getScriptTimeZone(); 
  // 'h' is the 12-hour clock hour (1-12); 'K' (0-11) would show 12:30 PM as 0:30 PM
  var contentDate = Utilities.formatDate(now, timeZone, 'd-MMM-yyyy\' at \'h:mm a zzzz \'(GMT \'XXX)');

  contenthtml = '<h1>' + bookTitle + '</h1><br/>';
  contenthtml+= 'Book creation process started on ' + contentDate;

  Logger.log('Book Title and date data (in HTML): ' + contenthtml);

  if (thisRun == runType.NORMAL) {
    Logger.log("Normal run (not test run)");
  } else if (thisRun == runType.TEST) {
    var msg = '<br/><br/><br/><b>Test run with Posts Loop Iterations = ' + 
    testRunPostsLoopIterations + '.</b><br/><br/>';
    contenthtml+= msg;
    Logger.log("(In HTML): " + msg);
  } else {
    Logger.log("Unexpected value for thisRun variable. Aborting function");
    return -1;
  }

  contenthtml += '<br/><br/>Post contents follow:<br/>' +
    '=================================================================';

  for (var i in json.rss.channel.item) {  // declare i locally instead of creating an implicit global
    var pubDateOnly = json.rss.channel.item[i].pubDate.Text.substring(5,16); //0 based start to end, end exclusive
    var ListUrl = json.rss.channel.item[i].link.Text;

    contenthtml+='<h1>'+json.rss.channel.item[i].title.Text
          + '; Published: '+pubDateOnly
          +'</h1>'
          +'Post link (URL) on blog: <a href="' +ListUrl+'">' + ListUrl +'</a>'+'<br/><br/>'
          +json.rss.channel.item[i].encoded.Text
          +'<br/>===========================End of Post============================<br/>';

    Logger.log("Post Title (80 chars), Pub. date: '" +
      json.rss.channel.item[i].title.Text.substring(0,80) + "', " + pubDateOnly) ;

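    if (thisRun == runType.TEST) {
      // Test run break. In a for...in loop i is a string key, so convert it to a number;
      // i is 0-based, hence the + 1 to stop after exactly testRunPostsLoopIterations posts.
      if (Number(i) + 1 >= testRunPostsLoopIterations) {
        var msg = "Test run break out of loop of copy of posts data to contenthtml var after " +
          testRunPostsLoopIterations + " loop iterations.";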
        Logger.log(msg);
        contenthtml+= '<br/><br/><h1>' + msg + '</h1><br/>';
        break;
      }
    }
  }
  
  contenthtml+= '<br/>===========================<b>End of Book</b>============================<br/>';

//  Logger.log("contenthtml:");
//  Logger.log(contenthtml);

  try {
    var ablob = Utilities.newBlob(contenthtml, MimeType.HTML, "asset.html");
    var AssetGDocId = Drive.Files.insert(
      { title: bookTitle, 
      mimeType: MimeType.GOOGLE_DOCS },
      ablob
    ).id;
    Logger.log('Wrote "%s" to GDoc.', bookTitle );
  }
  catch(err){
    Logger.log('Error is %s', err);
    return -1;
  }
  return 0;
}

--- End Code.gs (copy-pasted on 11 Jul 2023) ---
....
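
The posts loop in Code.gs above reads title.Text, link.Text, pubDate.Text and encoded.Text of each feed item. The sketch below illustrates two details of that loop in plain JavaScript. Both helper names are hypothetical and are not part of the project. The single-item handling rests on an assumption about XML_to_JSON (that a lone item element comes back as an object rather than a one-element array), and the sample date in the comment is an assumed example of the usual RSS pubDate format.

```javascript
// Returns the channel items as an array whether the feed had no item,
// one item (assumed to arrive as a plain object) or many items (an array).
function itemsAsArray(json) {
  var item = json.rss.channel.item;
  if (item == null) {
    return [];
  }
  return Array.isArray(item) ? item : [item];
}

// Extracts the 'DD Mon YYYY' part of a pubDate such as
// 'Mon, 10 Jul 2023 12:34:56 +0000' (characters 5 to 15, end index exclusive).
function pubDateOnly(pubDateText) {
  return pubDateText.substring(5, 16);
}
```

For the sample date above, pubDateOnly returns '10 Jul 2023'. Looping over itemsAsArray(json) with a numeric index would also avoid the string loop index that for...in produces.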

--- Start Run-Driver.gs (copy-pasted on 11 Jul 2023) ---

// This file has various invocations of makeWPBlogFeedBook() function which can be run from Script Editor
// The parameters of makeWPBlogFeedBook() can be made out from the parameter names in function prototype below
  // function makeWPBlogFeedBook(blogFeedURL, bookTitle)

// The license for the code in this file (code below) (authored by me, Ravi S. Iyer) is provided
// in my blog post:
// All my blog data and books publicly accessible on Google Drive; Permission for free reuse,
// https://ravisiyer.blogspot.com/p/all-my-blogbooks-publicly-accessible-on.html .


// In function names below BFB is acronym for Blog Feed Book.
// Note that number of posts in feed for a WordPress blog is controlled by
// WordPress Admin->Settings->Reading->Syndication feeds for that blog and has default value of 10.
// As far as I know, number of posts cannot be specified as a parameter while making the blog feed request.
// I have been able to change WordPress Admin->Settings->Reading->Syndication feeds for one of my
// WordPress blogs to 150 after which the feed request seems to have returned all 113 published posts
// of that blog.


// Invokes makeWPBlogFeedBook with default values which will make a book of the default blog feed 
// This function can be run from Script Editor to test makeWPBlogFeedBook function.
function makeBFBWithDefaultValues() {
  makeWPBlogFeedBook(null, null);
}

// Invokes makeWPBlogFeedBook for ravisiyer.wordpress.com full blog
// Requires WordPress Admin->Settings->Reading->Syndication feeds to be set to higher than number
// of published posts in ravisiyer.wordpress.com. As of 10 Jul 2023, number of published posts
// in that blog is 113 and so 150 is a good value for Syndication feeds. I checked that 150 value is
// accepted by WordPress and that the WordPress feed seems to return all 113 posts.
function makeRavisiyerWPBlogBook() {
  makeWPBlogFeedBook('https://ravisiyer.wordpress.com/feed/', "ravisiyer.wordpress.com Blog Feed Book");
}

// Invokes makeWPBlogFeedBook for a particular blog feed (for a year)
function makeBFBForOneYear() {
  makeWPBlogFeedBook('https://ravisiyer.wordpress.com/2022/feed/', "ravisiyer.wordpress.com - Year 2022 Blog Feed Book");
}

// Invokes makeWPBlogFeedBook for a particular blog feed with year parameter in a loop.
function makeBFBYearWise() {
  var year;
  const blogURL = 'https://ravisiyer.wordpress.com/';
  const blogFeedBookTitleBase = 'ravisiyer.wordpress.com';

  var blogFeedURL = '';
  var blogFeedBookTitle = '';
  var rtnCode;
  for(year = 2023; year > 2010; year--){
    blogFeedURL = blogURL + year + '/feed/' ;
    blogFeedBookTitle = blogFeedBookTitleBase + ' - Year ' + year + ' Blog Feed Book';
    rtnCode=makeWPBlogFeedBook (blogFeedURL, blogFeedBookTitle); 
    if (rtnCode == 0) {
      Logger.log("Successfully returned from makeWPBlogFeedBook() for parameters: " + blogFeedURL +", " +       
        blogFeedBookTitle);
    } else {
      Logger.log ("Failure return from makeWPBlogFeedBook() for parameters: " + blogFeedURL +", " +           
        blogFeedBookTitle);
      Logger.log("Aborting!");
      return;
    }
  }
}

--- End Run-Driver.gs (copy-pasted on 11 Jul 2023) ---
