99.9% of everything at the moment appears to be wrangling basic data ingest.
In a fit of semi-directionless curiosity, I decided to try expanding on my previous post by wiring up some kind of automatic blogging tool just to see if I could. This problem has many aspects. Let the yak shaving begin.
The first thing I did was start thinking about requirements a bit. I didn't, you know, write any down. But I did think about it.
I decided I needed an AI bot which could automatically write blog posts for me. I assume many people have tried and failed, or possibly even tried and succeeded, but I didn't want to let knowledge get in the way of the outcome and just ran at it.
Here is my design (a rough code sketch follows the list):
-- A module for downloading input source data to feed into the AI
-- A module for running learning jobs based on source data
-- A module to write blog articles
-- A module to publish / print out the blog articles
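To make that concrete, here's a rough skeleton of how those four modules might hang together. Every name here is a hypothetical placeholder, not working code from the project:

```python
# Hypothetical skeleton of the four-module pipeline described above.
def download_source_data(blog_id):
    """Fetch raw post text to feed into the AI."""
    raise NotImplementedError

def train_model(corpus):
    """Run a learning job (e.g. fit a Markov model) over the corpus."""
    raise NotImplementedError

def write_article(model, length=500):
    """Generate a blog article of roughly the required length."""
    raise NotImplementedError

def publish_article(article):
    """Publish the article, or just print it out for now."""
    print(article)

if __name__ == '__main__':
    corpus = download_source_data('my-blog')  # placeholder blog ID
    model = train_model(corpus)
    publish_article(write_article(model))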
I thought I'd start by wrangling some minimal implementation that could create text of the required length, like a Markov model trained on my old blog posts. I should have just trained it on Gutenberg text or something, because it turns out that downloading my old blog posts is Harder Than It First Appears. Not really difficult, but longer than the hour or so I thought it would take.
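For reference, here's a minimal sketch of the kind of Markov model I had in mind: a word-level chain that learns which words tend to follow which, then random-walks to generate text. The corpus filename is a placeholder.

```python
import random
from collections import defaultdict

def build_markov_model(text, order=2):
    """Map each tuple of `order` consecutive words to the words that follow it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        model[state].append(words[i + order])
    return model

def generate(model, length=100):
    """Random-walk the chain to produce roughly `length` words."""
    state = random.choice(list(model.keys()))
    output = list(state)
    for _ in range(length - len(state)):
        followers = model.get(state)
        if not followers:  # dead end: restart from a random state
            state = random.choice(list(model.keys()))
            followers = model[state]
        output.append(random.choice(followers))
        state = tuple(output[-len(state):])
    return ' '.join(output)

# corpus = open('old_posts.txt').read()  # hypothetical corpus file
# print(generate(build_markov_model(corpus)))
```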
There are basically two options: web scraping or the Blogger API. The tool du jour for scraping in Python is called scrapy. I decided to use the Blogger API, not because I thought it was inherently better than scraping, but because I expect I'll get a reasonably structured object in memory at the other end and won't have to do a lot of page interpretation. Also, if I want to publish directly to Blogger later down the track, it will probably use the same mechanism.
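For what it's worth, the read side of the Blogger v3 API via google-api-python-client looks roughly like this. The blog ID and API key are placeholders, and this assumes a public blog readable with just an API key; as it turned out (see below), the sample tooling I followed wanted the full OAuth2 flow.

```python
# Sketch of fetching all posts from a public blog with the Blogger v3
# API (pip install google-api-python-client). Values are placeholders.
from googleapiclient.discovery import build

BLOG_ID = '1234567890'          # hypothetical blog ID
API_KEY = 'your-api-key-here'   # hypothetical API key

service = build('blogger', 'v3', developerKey=API_KEY)
posts = []
request = service.posts().list(blogId=BLOG_ID, fetchBodies=True)
while request is not None:
    response = request.execute()
    posts.extend(response.get('items', []))
    # list_next handles the pagination token for us
    request = service.posts().list_next(request, response)

print('Fetched %d posts' % len(posts))
```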
Possibly because I'm bad at searching and terrible at web programming, this took me ages to get going. In fact, I haven't gotten going yet. I have merely jumped the first couple of an unknown number of hurdles. Programming is basically like playing an infinite scrolling computer game of randomly varying difficulty until you get to the end of the current level.
The plan was to create a minimal implementation which could then be bootstrapped into a smarter implementation. I thought people might like seeing it come together (I'd open source it, and then write about the process of building it here).
Here's where I've gotten:
-- I've installed the requisite API client library
-- I've discovered that I don't just need a 'project key'; I actually need a full OAuth2 secret and key
-- I read the source code and figured out that the filename supplied to the sample tool method in the library is only used to derive the directory in which to look for the "client_secrets.json" file, so I can supply a fake filename just to tell it which directory to search (see the sketch after this list). It should have just asked for a directory.
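For context, here is roughly what those few lines look like, sketched around the library's sample_tools helper. The listByUser call is my guess at the next step from the walkthrough, and the fake filename is the trick from the last bullet.

```python
"""List the authorized user's blogs via the Blogger v3 API (sketch)."""
import sys
from googleapiclient import sample_tools

def main(argv):
    # init() runs the OAuth2 flow using the client_secrets.json found
    # in the directory of the supplied filename -- so any fake filename
    # in the right directory (here, the current one) will do.
    service, flags = sample_tools.init(
        argv, 'blogger', 'v3', __doc__, './fake.py',
        scope='https://www.googleapis.com/auth/blogger')
    blogs = service.blogs().listByUser(userId='self').execute()
    for blog in blogs.get('items', []):
        print(blog['name'])

if __name__ == '__main__':
    main(sys.argv)
```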
So, instead of a really cool Markov model, I have gotten three lines into reproducing the online walkthrough. It only took a couple of hours...
More later! Wish me luck...