Google Apps Script to Create Blogger Blog Book (or Book parts): Test Versions and Stable Version
Last updated on 9 Sep. 2023
9 Sep. 2023 Update: I rarely use the software covered in this post now. To see the current software for creating Blogger blogbooks that I use, please visit: Short User Guide to creating Blogger Blogbooks from Backup/Export File using ExportFileFilterAndGenBook and another VBA projects' macros/code (free and open source), https://ravisiyermisc.blogspot.com/2023/09/short-user-guide-to-creating-blogger.html .
end-Update 9 Sep. 2023
8 Jul 2023 Update Note
I decided to put up and publish this post on 28 June 2023 while the Google Apps Script project work was in progress, as I wanted to ensure that whatever work I had done till then was available for interested readers. I updated this post in a big way over the next few days as I progressed towards a stable solution. Now, as of 8 Jul. 2023, the Google Apps Script project is stable.
I have also put up a post about the stable version:
BlogBooksMaker Google Apps Script to Create Blogger Blog Books: Description and Stable Version Info, https://ravisiyermisc.blogspot.com/2023/07/blogbooksmaker-google-apps-script-to.html .
End 8 Jul 2023 Update Note
I want to limit the time I spend on this app. I may try out a few more things and then carry over, from the test version to the main version, the features that I want and that work. With that I intend to stop my work on this project. Hopefully I will then have a version that can be conveniently used as a web app run from a browser for any blog, either to print the whole blog or to print a particular year's posts, without having to use the Google Apps Script editor and make changes to the code.
But if others want to take it further and make it a general purpose freeware app to create blog books for any Blogger blog, they are welcome to do so.
Firstly, I would like to thank the author of "bloggerToEbook" project shared under MIT license here: https://github.com/hn-88/bloggerToEbook . A related blog post of the author: https://hnsws.blogspot.com/2022/03/google-apps-script-to-turn-blog-posts_6.html . The author had kindly informed me of this software over a year ago, as he came to know of my needs for such software. But I had not got around to reading up on Google Apps Script and his code to figure out how to use it for my needs. Now I felt it was appropriate to invest the time to understand how to use his code.
After figuring out how to use his code and being delighted to see how it satisfied my basic needs to have a blog book (or booklets/book parts) of my Blogger blogs, I explored how I could alter it to suit my needs better. In the course of that, I created a test project to try out my alterations. In this post I am sharing this test project, and later test projects, code and data. Note that all this code is free for others to use and modify.
Some background about Google Apps Script is necessary - here's the Overview page: https://developers.google.com/apps-script/overview . The above mentioned author also has a blog post providing links for getting started on Google Apps Script: https://hnsws.blogspot.com/2021/05/getting-started-with-google-apps-script.html .
Now for my test project based on the above author's code. The test version uses query string parameters to make the code usable for multiple blogs and for printing a particular year's posts. As it is a test version, it prints out only a few posts (two by default). Once the test version stabilizes, I will convert it back to the full functionality version, which prints a whole blog as a set of documents each having 100 posts, with the exception of the last document which will typically have fewer posts.
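The "100 posts per document, with the last document having fewer" idea can be illustrated in plain JavaScript as below. This is only a sketch of the splitting arithmetic: the function name splitIntoParts and the dummy posts array are hypothetical, and the actual script works against the Blogger feed rather than an in-memory list.

```javascript
// Hypothetical helper (not from the project code): split a list of posts
// into book parts of at most postsPerPart posts each.
function splitIntoParts(posts, postsPerPart) {
  const parts = [];
  for (let i = 0; i < posts.length; i += postsPerPart) {
    // slice() stops at the end of the array, so the last part may be shorter.
    parts.push(posts.slice(i, i + postsPerPart));
  }
  return parts;
}

// 250 dummy posts at 100 posts per part give parts of 100, 100 and 50 posts.
const dummyPosts = Array.from({ length: 250 }, (_, i) => "post " + (i + 1));
const partSizes = splitIntoParts(dummyPosts, 100).map((part) => part.length);
console.log(partSizes); // [ 100, 100, 50 ]
```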
I am now an obsolete software developer as I have stayed away from coding for around, if not over, a decade now, barring very tiny JavaScript tweaking of others' code to suit my needs. Prior to this exploration I had never used Google Apps Script - so the development platform as well as the development environment is new to me. As written earlier, I want to limit the time I spend on this work. So I am looking at quick fix modifications to the code which may be bad coding style. I also may not be calling the right API functions or calling them the right way - I just don't have the time to read up on the development platform / API barring just quick viewing of reference pages and help I get from Google Search result links, to figure out what I should try. I am not drastically changing the original code and am trying to stick to its style.
I am focusing on modifications for my specific needs and not for any general purpose needs. So my code may have bugs and problems which do not come into play when I am using it for my specific needs but come into play when others use it for different needs. I am not in a position now to help fixing such issues. Other developers are absolutely welcome to do such fixes or other changes and re-publish their version of this software.
========================================================
[4th July 2023: The sections below cover two projects - BlogBooksMaker and BlogBookMakerTest, and a brief mention of an earlier project at the end. The sections as well as their sub-sections are usually in reverse chronological order. So the BlogBooksMaker project came later and progressed from the earlier BlogBookMakerTest project.]
BlogBooksMaker project
This section is in reverse chronological order.
The folder corresponding to this project is shared here: https://drive.google.com/drive/folders/1jipFEeigwZ4W_LBtEF_jiWZXzFY41Xcs?usp=drive_link .
11th July 2023 Update start:
v13 version of BlogBooksMaker Project
- In log entry, 'pubdate in this fetch' changed to 'Pub. date' (of post is understood from earlier part of log entry).
- Added Blog book title, creation date in book (part)
- Added end of book-part line in the last part (at the end) of book (part)
- RunInfo-ExecLog.txt: https://drive.google.com/file/d/1VLix9CvTw3aAXxuJbYSUwfy8MPq_ZYvx/view?usp=drive_link ,
- tnarayanasasthri.blogspot.com part 1: https://docs.google.com/document/d/1_E4Yb4RqVZGqiw4_WqKCtiDMUHUjZ_Hklfzaeftb8Io/edit?usp=drive_link
=====================================================
7th July 2023 Update start:
v12 version of BlogBooksMaker Project
Added functions to Run-Driver.gs. Did not add anything to Code.gs as far as I can recall.
7th July 2023 Update end
=====================================================
5th July 2023 Update start:
v11 version of BlogBooksMaker Project
Added functions to Run-Driver.gs. Did not add anything to Code.gs as far as I can recall.
v10 version of BlogBooksMaker Project
v8 version of BlogBooksMaker Project
v8 code: https://drive.google.com/file/d/1EZBnhv2BJjTK57yfih4T6ZrPPY75A_z4/view?usp=drive_link
v7 to v8 changes: Addition of mainbooktitle parameter; addition of function makeMySpiritualBlogBooksYearWise(); removal of some commented code about default value settings; adding code to add year string as part of book title if year has been specified; doGet() function being commented out as web app was behaving strangely.
The run info. and log file (Run3-RunInfo-ExecLog.txt) is here: https://drive.google.com/file/d/12iSxLmSAQ_nCMEzT7XDFphujLNrgU4D7/view?usp=drive_link
Given below are extracts from initial part of above file (slightly edited to improve formatting):
4 Jul 2023
Two runs:
1st run, 4 Jul 2023: Produced blog books from 2023 to 2016 - need to check them out. Execution timed out after 2015 part 1 before completion of part 2 though part 2 file also was created. The relevant execution log entry is: "Jul 4, 2023, 1:25:46 AM Error Exceeded maximum execution time". To prevent confusion I deleted 2015 parts 1 & 2 files of this run.
2nd run, 4 Jul 2023: Produced blog books from 2015 to 2013 - need to check them out. Program completed normally as blog starts from year 2013.
------------
These two runs used v8Code.gs.txt and its slightly earlier version (as calling of myFunction related code in function makeMySpiritualBlogBooksYearWise() had to change to handle timeout error in first run - that's all, no other changes).
===================================
Top-level info. in Execution Log for the 2 runs:
Deployment | Function | Type | Start Time | Duration | Status
Head | makeMySpiritualBlogBooksYearWise | Editor | Jul 4, 2023, 1:32:18 AM | 83.669 s | Completed
Head | makeMySpiritualBlogBooksYearWise | Editor | Jul 4, 2023, 1:19:46 AM | 360.164 s | Timed Out
==== end top-level info ====
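The timed-out run above stopped at about 360 seconds, partway through a year. One possible way to avoid leaving a half-written year behind is to stop cleanly before the limit and note which year to resume from. The sketch below is an assumption-laden illustration and not the project's code: runYearsWithinBudget, processYear and the injectable clock are all hypothetical names, and the 330-second budget is just an example value below the observed limit.

```javascript
// Hypothetical sketch: process years newest-first and stop before a time
// budget is exhausted, reporting the year from which to resume.
function runYearsWithinBudget(years, processYear, budgetMs, now = Date.now) {
  const start = now();
  let longestMs = 0; // duration of the slowest year processed so far
  const done = [];
  for (const year of years) {
    const elapsed = now() - start;
    // Stop if another year as slow as the slowest one so far might not fit.
    if (elapsed + longestMs > budgetMs) {
      return { done, resumeFrom: year };
    }
    const t0 = now();
    processYear(year);
    longestMs = Math.max(longestMs, now() - t0);
    done.push(year);
  }
  return { done, resumeFrom: null };
}

// Demo with a fake clock: each year "takes" 100 seconds, the budget is 330 seconds.
let fakeClockMs = 0;
const budgetResult = runYearsWithinBudget(
  [2023, 2022, 2021, 2020, 2019],
  () => { fakeClockMs += 100000; },
  330000,
  () => fakeClockMs
);
console.log(budgetResult); // { done: [ 2023, 2022, 2021 ], resumeFrom: 2020 }
```

A later run could then be started from the resumeFrom year, similar to how the second run above picked up from 2015.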
v7 code: https://drive.google.com/file/d/1e_wwcGiFIHYn7peK-3uomNXATpzS95VW/view?usp=drive_link
There are no v7 code changes from v6. There are some comments that are added in v7.
But v7 seems to have been used to run web app (instead of from script editor).
The run info. and log file (Run2-RunInfo-ExecLog.txt) is here: https://drive.google.com/file/d/1PASC3usCIwj-XPxaU9Ao-JMrKXg5fsiY/view?usp=drive_link
Given below are extracts from initial part of above file:
This run is through web deployment
url: https://script.google.com/macros/s/AKfycbwjfNbl6rtxsdhtU1luanxHgWXtg7SMq4HfDJZA4T-W/dev?blogurl=https://ravisiyer.blogspot.com/&maxtotalposts=1500
=========
Trips on same point as earlier run through scripteditor. Strangely the webapp tab does not abort but seems to be stuck. The execution log shows multiple webapp runs though I issued only one webapp run! Even after stopping the webapp tab, the execution log shows some webapp running entries which seem to start randomly in the posts sequence and write blog part files (within 1000 posts from current post of blog and so essentially recreating some files done earlier in same set of runs).
Finally after 5 to 10 minutes, the webapp runs are shown as "Completed" or "Timed Out"! The top-level execution info. for these runs is copy-pasted below and slightly edited for better formatting:
Deployment | Function | Type | Start Time | Duration | Status
Head | doGet | Web App | Jul 4, 2023, 12:05:28 AM | 306.172 s | Completed
Head | doGet | Web App | Jul 4, 2023, 12:03:28 AM | 314.474 s | Completed
Head | doGet | Web App | Jul 4, 2023, 12:01:28 AM | 322.765 s | Completed
Head | doGet | Web App | Jul 3, 2023, 11:59:27 PM | 360.193 s | Timed Out
Head | doGet | Web App | Jul 3, 2023, 11:57:27 PM | 360.222 s | Timed Out
Head | doGet | Web App | Jul 3, 2023, 11:55:27 PM | 319.482 s | Completed
--- end copy-paste of top level info ---
Google Drive shows many versions of Blog Book part (num) files! It is quite a mess! In some runs Blog Book part 24 to 28 have been created!
I think webapp stuff seems to be unstable. I should not use it but run from Script editor even for per year runs even if call code changes have to be made.
--- end extracts from file ---
The above run raised many questions about stability of web app as against running from script editor. I decided to stop further work on web app, and consider only using script editor runs for the near future for this project.
===================
v6 version of BlogBooksMaker Project
v6 code: https://drive.google.com/file/d/1UZSiaKG_KYpQLrsS1co9yM7rVC4G2Hi6/view?usp=drive_link
v6 code changes seem to be for making full Spiritual blog books using a Script Editor run.
Key code addition in function runFromScriptEditor():
//var Finmsg = myFunction(null, null, null, 1200, null); // Print whole Worldly blog with def. 50 posts per part
//3rd July 2023: above code seems to have run successfully created 22 Google doc files.
// Output has still to be checked thoroughly
var Finmsg = myFunction("https://ravisiyer.blogspot.com/", null, null, 1500, null); // Print whole Spiritual
// blog with def. 50 posts per part
--- end code addition ---
The code above, run on 3rd July 2023, seems to have run well for around 1100 posts but tripped up after that, with the run getting aborted. The output of 23 Google docs was not checked in detail. It was possible to specify a partial run from a particular date backwards using the fetchurlmainpart parameter, but I did not try that out.
The run info. and log file (Run1-RunInfo-ExecLog.txt) is here: https://drive.google.com/file/d/1-miuYK06-nR3wwWENlsF1pZvasSsztpy/view?usp=drive_link
Failure entry and nearby entries in the log file:
11:21:32 PM Info 50.0 in this fetch, 1150.0 overall
11:21:32 PM Info contenthtml.length = 478688
11:23:01 PM Info Error is GoogleJsonResponseException: API call to drive.files.insert failed with error: Internal Error
11:23:01 PM Info Failure to write output Google Docs file. Error Message: GoogleJsonResponseException: API call to drive.files.insert failed with error: Internal Error
11:23:02 PM Notice Execution completed
--- end log file entries extract ---
====================
v5 version of BlogBooksMaker Project
v5 code: https://drive.google.com/file/d/11zd8iybaVaRyaEoR7Rb8X-Y9GGaDzOK6/view?usp=drive_link
v5 code changes seem to be for making full Worldly blog books.
Key code addition in function runFromScriptEditor():
var Finmsg = myFunction(null, null, null, 1200, null); // Print whole Worldly blog with def. 50 posts per part
--- end code addition ---
v5CodeExecInfo-WholeWorldlyBlog.txt has the details of the associated run: https://drive.google.com/file/d/1U6ZsFunkYo78D-gAqb3qbbskW-zzyQ9O/view?usp=drive_link .
Given below are extracts from initial part of above file:
The run function call code line:
var Finmsg = myFunction(null, null, null, 1200, null); // Print whole Worldly blog with def. 50 posts per part
=== end code extract ====
22 output files created with name: 'BlogBook part (num)' with (num) running from 1 to 22. The folder "drive-download-20230703T165538Z-001" in same folder as this file, has the documents. Google Drive seems to auto convert it to .docx (Microsoft Word) format before download. I had a quick look at two or three of these docs. They seem to be OK. A detailed and proper check is required before using this set of docs as my main archive or backup copy. Further TOC also would have to be generated in Word, and perhaps PDF files created from each Word document.
--- end extracts from file ---
====================
5th July 2023 Update end
=====================================================
3rd July 2023, around 7 PM Update start:
version 4 of BlogBooksMaker Project - Drive API error
Restarted the work today, hoping that I could finish it quickly but ran into an error that happens for some data, which error I could not figure out. The link to this version (version 4 Drive API error) of my main code - BlogBooksMaker project code (filename: v4Code-DriveAPI-error.gs): https://drive.google.com/file/d/1fKQclxKQN0JFGbyugovCfeIAMBGbO0CE/view?usp=drive_link
An extract from the code explaining the issue:
//const DEF_MAX_POSTS_PER_PART = 100; // Release version default value
//const DEF_MAX_TOTAL_POSTS = 200; // Release version default value
// Fails for above values when run without arguments, with error:
/* "Jul 3, 2023, 1:59:49 PM Info Error is GoogleJsonResponseException: API call to drive.files.insert failed with error: Bad Request"
This failure is at the time of writing the first part of 100 posts.*/
--- end extract ---
I tried digging up info. on the error from the Internet including Google reference and guide pages but they were not giving me the solution.
Related reference & guide pages:
- Google Drive Reference: Method: files.insert: https://developers.google.com/drive/api/reference/rest/v2/files/insert . (This seems to be page for version 2 of the API but the code I am using (and original shared code) seems to be using an older version API. I could not get the reference page for the older version API.)
- Google Drive Guide: Resolve errors: https://developers.google.com/drive/api/guides/handle-errors .
I tried to get more details of the error by running the program in debug mode (using a new very simple function I wrote - runfromscripteditor() - which simply calls the main function - myFunction() - with hard-coded arguments). Given below are pics of the key info. from the debugger about the error joined into a single composite pic:
Very unfortunately, we have no additional info. about the error beyond it being a "Bad Request" with "global" domain. So I could not figure out what exactly it had tripped up on.
29 Jun. 2023 around 10 PM Update start:
v2 and v1 versions of BlogBooksMaker Project
I am now close to finishing the work. I need to do some more testing before going to release version. I thought of sharing the current features and status of the app.
I have to also mention that I am facing some personal issues due to which I am freezing this work and hope to restart it in the near future.
The web app takes the following arguments:
*) blogurl: name is self-explanatory
*) year: limits posts to those published in the specified year, with one extra day before and after as a cushion for time zone differences
*) maxpostsperpart: Max posts printed in one Google Docs part. It has a default value of 10 in test version and will be 100 in release version.
*) maxtotalposts: Max total number of posts printed across all Google Docs generated by program. To print the whole blog this number has to be set to a number larger than total number of posts in blog. It has a default value of 20 in test version and will be 200 in release version.
*) fetchurlmainpart: This allows the user to directly specify the main part of the url used in the UrlFetchApp.fetch() function call. If this is used, the blogurl, year and maxpostsperpart parameters are ignored. But the maxtotalposts parameter is used if the user has specified it. The user can use fetchurlmainpart to choose a sub-section of the blog to be put in the blogbooks, based on various conditions like publication date or update date etc. When using this parameter, some special characters like '&' have to be escaped - that can be seen from the examples given below which use the fetchurlmainpart parameter.
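The precedence rules in the argument list above can be sketched as follows. This is a simplified illustration in plain JavaScript and not the project's actual code: the function name resolveParams and the shape of the returned object are hypothetical, the params argument stands in for the query string parameters a web app receives, and only the test-version default values (10 and 20) are taken from the descriptions above.

```javascript
const DEF_MAX_POSTS_PER_PART = 10; // test-version default mentioned above
const DEF_MAX_TOTAL_POSTS = 20;    // test-version default mentioned above

// Hypothetical helper: turn raw query string parameters (all strings, any of
// which may be missing) into the settings used for a run.
function resolveParams(params) {
  // parseInt() of a missing value gives NaN, so "||" falls back to the default.
  const maxTotalPosts = parseInt(params.maxtotalposts, 10) || DEF_MAX_TOTAL_POSTS;

  if (params.fetchurlmainpart) {
    // fetchurlmainpart overrides blogurl, year and maxpostsperpart;
    // only maxtotalposts is still honoured.
    return { mode: "fetchurl", fetchUrlMainPart: params.fetchurlmainpart, maxTotalPosts };
  }
  return {
    mode: "blog",
    blogUrl: params.blogurl || null, // null: assumed to mean "use the script's default blog"
    year: params.year ? parseInt(params.year, 10) : null,
    maxPostsPerPart: parseInt(params.maxpostsperpart, 10) || DEF_MAX_POSTS_PER_PART,
    maxTotalPosts,
  };
}

console.log(resolveParams({ blogurl: "https://ravisiyer.blogspot.com/", year: "2022" }));
console.log(resolveParams({ fetchurlmainpart: "https://example.invalid/feed", year: "2022", maxtotalposts: "16" }));
```

In the second call the year value is deliberately ignored, matching the rule that fetchurlmainpart takes precedence.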
The code as a text file (version 2 - v2Code.gs of BlogBooksMaker project) is uploaded here: https://drive.google.com/file/d/1Znc9dtbGK4_XQYAplKc68JJcaBZgHKJQ/view?usp=drive_link .
An earlier version code (v1Code.gs) is shared here: https://drive.google.com/file/d/1dJkkNhh4ncCUOG7hxfPLZ3qS_sSyyIFi/view?usp=drive_link .
Links to files having web app parameters used in successful test runs yesterday (28 Jun. 2023), along with, at times, execution log data, are given below. The test version functionality improved step by step and so the execution log data is for both the above code versions.
v1Code-ExecLogs.txt: https://drive.google.com/file/d/1gPXg76xTrdtnwMM0Y_q4NwVfFzceHZeT/view?usp=drive_link .
Given below are fetchurlmainpart parameters used in successful test runs on evening/night of 28-Jun-2023. Other data is not included. This data is based on the shared file (v2Runs-fetchurl-testvalues.txt): https://drive.google.com/file/d/1P9BCw7nMcYZFGCWKaerqnUJpdkpZY07i/view?usp=drive_link .
*1) About escaping & in fetchurlmainpart:
[Useful link: https://www.w3schools.com/tags/ref_urlencode.asp ]
If we use the following query string for the web app:
?fetchurlmainpart=https://ravisiyermisc.blogspot.com/feeds/posts/default?max-results=10&alt=json
it trips up on the & character in last part of url, viewing it as a parameter separator
[Possible run start time for above: Jun 28, 2023, 11:39:43 PM]
Encoding of https://ravisiyermisc.blogspot.com/feeds/posts/default?max-results=10&alt=json gives us:
https%3A%2F%2Fravisiyermisc.blogspot.com%2Ffeeds%2Fposts%2Fdefault%3Fmax-results%3D10%26alt%3Djson
So we can use:
?fetchurlmainpart=https%3A%2F%2Fravisiyermisc.blogspot.com%2Ffeeds%2Fposts%2Fdefault%3Fmax-results%3D10%26alt%3Djson
Note that ?fetchurlmainpart= should NOT be encoded, as the = character seems to be required in that part for the query string parameter fetchurlmainpart to be set up.
The above query string worked (successful run of web app).
[Possible run start time: Jun 28, 2023, 11:54:49 PM]
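The effect described in this item can be reproduced in plain JavaScript. The sketch below uses the standard URLSearchParams class as a stand-in for query string parsing (an assumption; it is not how Apps Script itself is implemented) and the standard encodeURIComponent() function to do the encoding instead of an online encoder.

```javascript
const feedUrl = "https://ravisiyermisc.blogspot.com/feeds/posts/default?max-results=10&alt=json";

// Unescaped: a standard query string parser splits at the "&" inside the feed url.
const unescaped = new URLSearchParams("fetchurlmainpart=" + feedUrl);
console.log(unescaped.get("fetchurlmainpart")); // the url is cut off after max-results=10
console.log(unescaped.get("alt"));              // "json" has become a separate parameter

// Escaped: encode only the value, not the "?fetchurlmainpart=" prefix.
const queryString = "?fetchurlmainpart=" + encodeURIComponent(feedUrl);
console.log(queryString);

// Parsing the escaped query string gives back the original feed url.
const escaped = new URLSearchParams(queryString);
console.log(escaped.get("fetchurlmainpart") === feedUrl); // true
```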
*2) The below query string parameter works! So only & character needs to be escaped.
?fetchurlmainpart=https://ravisiyermisc.blogspot.com/feeds/posts/default?max-results=8%26alt=json
[Possible run start time: Jun 28, 2023, 11:57:53 PM]
*3) Further, to add the maxtotalposts parameter, an unescaped & can be used as the separator before it, as follows:
?fetchurlmainpart=https://ravisiyermisc.blogspot.com/feeds/posts/default?max-results=8%26alt=json&maxtotalposts=16
Tested the above: It works!
[Possible run start time: Jun 29, 2023, 12:02:48 AM]
*4) fetchurlmainpart parameter before escaping &:
https://ravisiyer.blogspot.com/feeds/posts/default?max-results=10&published-min=2021-12-31T00:00:00-08:00&published-max=2022-07-01T23:59:59-08:00&alt=json
After escaping & along with ?fetchurlmainpart= initial part, specified as parameter to web app:
?fetchurlmainpart=https://ravisiyer.blogspot.com/feeds/posts/default?max-results=7%26published-min=2021-12-31T00:00:00-08:00%26published-max=2022-07-01T23:59:59-08:00%26alt=json&maxtotalposts=14
Tested the above: It works!
[Possible run start time: Jun 29, 2023, 12:07:11 AM]
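Query strings like the ones in items *2) to *4) can also be put together programmatically. The sketch below is only an illustration: buildYearFetchUrl and toWebAppQuery are hypothetical helper names, the one-day cushion on either side of the year follows the description of the year parameter, and the -08:00 offset is simply copied from the example above rather than being a requirement.

```javascript
// Hypothetical helper: feed url for one year's posts, with a one-day cushion
// before and after the year (time zone offset copied from the example above).
function buildYearFetchUrl(blogUrl, year, maxResults) {
  return blogUrl + "feeds/posts/default?max-results=" + maxResults +
    "&published-min=" + (year - 1) + "-12-31T00:00:00-08:00" +
    "&published-max=" + (year + 1) + "-01-01T23:59:59-08:00" +
    "&alt=json";
}

// Hypothetical helper: escape only the & characters inside the feed url,
// then append maxtotalposts with a normal, unescaped & separator.
function toWebAppQuery(fetchUrl, maxTotalPosts) {
  return "?fetchurlmainpart=" + fetchUrl.replace(/&/g, "%26") +
    "&maxtotalposts=" + maxTotalPosts;
}

const yearQuery = toWebAppQuery(
  buildYearFetchUrl("https://ravisiyer.blogspot.com/", 2022, 7),
  14
);
console.log(yearQuery);
```

Escaping only & is the minimal approach shown in item *2); using encodeURIComponent() on the whole feed url, as in item *1), would be the more general choice if the url could contain other special characters.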
29 Jun. 2023 around 10 PM Update end
===================================================
BlogBookMakerTest project
This section is in reverse chronological order.
The files associated with this project are shared in this folder: https://drive.google.com/drive/folders/1kL1i3WW8IVGtJa_Ah-F_Ij6fhmZo70FK?usp=drive_link
28 Jun. 2023 around 6 PM Update start:
v4 version of BlogBookMakerTest Project
The code as a text file (v4Code.gs) is uploaded here: https://drive.google.com/file/d/1hd5gMhhHosmhMfpR7lt06hhN1-LmCAXK/view?usp=drive_link
28 Jun. 2023 around 6 PM Update end
v2 and v1 versions of BlogBookMakerTest Project
Google Drive API Service has to be enabled in (or added to) Google Apps Script editor for this project, for the code to run successfully (specifically, be able to create a Google Docs document on drive of user running the app). https://developers.google.com/apps-script/guides/services/advanced provides details of how to enable/add such services.
To run the program, one has to deploy it as a web app, at which time the URL for the deployed web app. is provided. Entering that URL in a browser (I used Chrome browser), with or without Query String parameters, results in execution of the web app. I changed the default permissions to allow any user to run the app (using the same URL which is given later on in this post, in a browser). But I have not tested it using another Google account, so far. Note that the app. creates a Google Docs document in default folder of the user's Google Drive (the user has to have a Google account, I think).
But a lot of permissions issues crop up, and all the requested permissions have to be granted. Also, there is a warning about the app not being verified by Google and so being unsafe! The Advanced button has to be clicked in this warning dialog and then a button to run the unsafe code has to be clicked! As I had read the code and understood it quite well (though not fully, earlier on), I did not feel I was taking any big risks by granting the permissions and running the "unsafe" code/app. But I do think that users who are not familiar with Google Apps Script and have not read the code for this test app. may be scared to grant all the permissions it asks for and then run the "unsafe" web app.
Run and execution log data for above v2Code-YearnblogurlnmaxpostsperpartAsargs.gs seems to be v2code-ExecLogs.txt shared here: https://drive.google.com/file/d/12Q6uMimoL6qSLb9M5AacOYEO90UV5v47/view?usp=drive_link .
=================================
An earlier version of the code is v1Code-YearnblogurlnmaxpostsAsargs.gs: https://drive.google.com/file/d/1creokoyAudAbeEQFUWF5bJ8XNi7ayTIR/view?usp=drive_link .
Run and execution log data for above v1Code-YearnblogurlnmaxpostsAsargs.gs seems to be v1code-ExecLogs.txt shared here: https://drive.google.com/file/d/10hHoMhhAs2BMHAbJ8GdV0rrmYOEiUnSV/view?usp=drive_link .
===================================================
RSIMiscBlogBook project
Miscellaneous
- Tried the DocumentApp.getUi().prompt() function to see if it can be used to introduce a dialog window in which users can provide parameters like blogurl, year, maxtotalposts etc. It did not work as it needs a Document context. Perhaps one can use a dummy document and use this function in its context. But I did not try that out.
- Tried out this code with one of my WordPress blogs. It failed as my WordPress blog does not have json plugins (which need a business plan). [None of my WordPress blogs have a business plan.]