Trove bots for all! (Updated 2020)

I like Twitter bots. Not the evil, spammy, election-rigging type of Twitter bots. I like the experimental, artistic, political bots. The bots that surprise you, that make you laugh, make you think, or start you on a new journey. Most of all, I like the fact that in this ever more monetised world of massive online platforms we can carve out new spaces for expression through the creative use of code.

And anyone can do it.

In some of my undergraduate classes and workshops I ask participants to make simple Twitter bots using the site Cheap Bots Done Quick. Have a look at these four bots that were brought to life in my Random Acts of Meaning workshop at NLS8 last year. Cheap Bots Done Quick is very easy to use, but there’s also plenty of scope for creativity. For an excellent introduction to the possibilities, you should read Shawn Graham’s Programming Historian tutorial, ‘An Introduction to Twitterbots with Tracery’.

But bots can also be used to mobilise cultural heritage collections. Instead of languishing in catalogues and databases, collections can be set loose into spaces where people congregate. They can be given a life of their own, outside of the confines of the corporate website.

I created @TroveNewsBot back in 2013 to share Trove’s digitised newspapers. As well as tweeting random articles, @TroveNewsBot responds to queries — tweet some keywords at it and the bot will reply with the most relevant search result. @TroveBot followed soon after to liberate the contents of other Trove zones.

I also shared a simplified version of the @TroveBot code in the Trove Build-a-Bot repository (my kids were a bit obsessed with Build-a-Bear Workshop at the time). I hoped that Trove contributors might use it to create their own collection bots, and some did – @CurtinLibBot and @Kasparbot (from the NMA) have been busily tweeting since July 2013.

But the problem was that while the code itself was easy to configure, you still needed a server somewhere to run it on. That’s a significant hurdle to anyone who just wants to experiment.

Enter Glitch.

Screen capture of Glitch home page

Glitch is sort of like a combination of Heroku and GitHub with a bit of JSFiddle thrown in. It lets you collaborate on code like GitHub. But it also runs your applications in the cloud like Heroku. Most importantly, it encourages education and experimentation, making it easy to share your projects in a readily remixable form. (Oh yeah, and it’s free!)

You can create bots (and all sorts of other things!) on Glitch without having to worry about setting up a server. So over the last few weeks I’ve been creating a collection of Trove bot starter kits for everyone to play with.


Trove bot starter kits

UPDATE! In November 2019, Trove discontinued version 1 of its API and many existing Twitter bots stopped working. The new version of the API makes it difficult to select an item at random, so I’ve had to make some major changes to the Trove Twitter bot code. The links below now go to the new Twitter bots. I’ve also adjusted the customisation examples below. If you’re wanting to bring an old bot back to life, the upgrade process is fairly painless – just click on the links below for detailed instructions.

Here’s the current line-up:

Each bot comes with detailed instructions, so I won’t repeat them here. Once you’ve got your authorisation keys from Twitter, the rest is easy-peasy. Just click on one of the links to start.

Jump in and have a go!

New Trove bots using the starter kits have been popping up just about every day. There’s @AustWWBot, created by @bonniewildie, tweeting out articles from the Australian Women’s Weekly. Perhaps you’d enjoy @CatsofTrove, by @lib_idol, keeping up the internet’s quota of cats by sharing the contents of a Trove list. Then there’s @DoSonTrove and @suthlib sharing items from the Dictionary of Sydney and the Sutherland Library.

Here’s a list of all the Trove Twitter bots I know about – please tweet me any additions!

The folks from Glitch have even created a Trove page to share these bots and any other projects using the Trove API.


Hack my bots

There are lots of ways these simple bots could be improved and extended. While it’s great to see people making more bots, I’m also hoping that some will want to take things further and hack my code.

Once again Glitch makes this easy. You can remix projects multiple times (just choose ‘Remix this!’ from the dropdown menu on the project’s title). You can invite others to collaborate on a project. And of course you can share the results of your hacking for others to remix.

Screen capture of Glitch editor

Glitch comes with a browser-based editor that includes syntax highlighting and various other nifty features – try highlighting a word, then hit Cmd-D (or Ctrl-D) for multiple selections. Glitch also saves any edits automatically and relaunches your app so you can quickly pick up problems.

My bots are written in Python, which is a pretty friendly programming language for beginners. The inner workings of the bots are contained in a file called server.py. Just click on it in the Glitch sidebar to open it for editing.

Below you’ll find a few hints and suggestions for hacking your own Trove bots.


Looking behind the curtain

As you experiment, it’s a good idea to have Glitch’s Activity Log open. Just click on the ‘Logs’ button in the sidebar. If you happen to break something you should see an error message pop up in the Activity Log. And don’t worry — breaking things is an important part of learning to code!

You might need to scroll back a bit through the log to find the actual source of an error. Look at the example below. It might seem scary, but all it’s really saying is that there’s a problem with the indentation of your code on line 43 of server.py. Python uses indentation to group lines of code together into blocks – like functions or ‘if’ statements – so it will complain if you get it wrong.

Screen capture of Glitch activity log showing indentation error
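If you’d like to see an indentation error without breaking your bot, here’s a minimal sketch you can run anywhere Python 3 is installed. It uses Python’s built-in compile() function to check two made-up snippets. Neither snippet comes from server.py, and check_indentation() is just a name I’ve invented for this example.

```python
# Two tiny made-up snippets of code, stored as strings.
good_code = "if True:\n    print('Indented properly')\n"
bad_code = "if True:\nprint('Not indented')\n"


def check_indentation(source):
    """Return 'ok' if the source compiles, or a description of the error."""
    try:
        compile(source, 'example.py', 'exec')
        return 'ok'
    except IndentationError as error:
        return f'IndentationError on line {error.lineno}: {error.msg}'


print(check_indentation(good_code))
print(check_indentation(bad_code))
```

The second snippet fails because the line after the ‘if’ statement isn’t indented, so Python can’t tell which lines belong to the block.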

You can also send your own messages to the activity log. In Python, this is done using the print command. If you look in the server.py file of any of the bots you’ll see the line print(message). This writes the tweet to the log before actually tweeting it.

The print command can be really useful when you’re trying to track down problems with your code. You might notice there’s also a line in server.py that reads print(url). This writes the url that’s being used to retrieve data from the Trove API to the log. So if your code is failing and you suspect it’s got something to do with the data being delivered by Trove, you can grab the url, paste it into a new browser window, and inspect the result.

Here’s an example from the activity log for @TroveTribuneBot, showing the url and formatted tweet.

Screen capture of Glitch activity log showing message

Both url and message are variables — they’re containers that have been assigned values by the code. You can use print to view the value of any variable to make sure it has the value you expect.


Shh, I’m experimenting…

While you’re playing around with your bot’s code you probably want to stop it from trying to tweet. All you need to do is find the line of code that calls the tweet() function and add a # sign at the beginning of the line. So you’ll end up with something like:

# tweet(message)

What does the # do? In Python, a # indicates a comment, so by adding it at the start of the line, we’ve told Python to ignore what follows. Obviously, just delete it (and any spaces you’ve inserted after it) when you’re ready to tweet again. As noted above, the content of the tweet is sent to the app’s activity log before tweeting, so with tweeting disabled, you can experiment as much as you like and still view the results in the log.
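If you find yourself adding and removing that # a lot, another option is a simple on/off switch. This is only a sketch: DRY_RUN and tweet_or_log() are names I’ve made up, and the tweet() function here is a stand-in for the bot’s real one so the example can run on its own.

```python
DRY_RUN = True  # Hypothetical setting: change to False when you're ready to tweet


def tweet(message):
    # Stand-in for the bot's real tweet() function, which posts to Twitter
    print(f'TWEETING: {message}')
    return True


def tweet_or_log(message, dry_run=DRY_RUN):
    """Always write the message to the log, but only tweet if dry_run is False."""
    print(message)
    if dry_run:
        return False
    return tweet(message)


tweet_or_log('Another interesting article!')
```

With DRY_RUN set to True the message only appears in the activity log. In your own bot you’d call tweet_or_log(message) in place of tweet(message).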


Inject some personality

The bots have a rather limited vocabulary. Tweets of random items say something like ‘Another interesting item!’, while new items are announced with ‘Another new item!’. Bor-ing. Why not teach your bot some new phrases?

Open up server.py and look for the prepare_message() function. In Python functions are defined using the def keyword, so just find the line that starts with something like def prepare_message(item).

Now look for where the greeting variable is set. In trove-title-bot-2 you’ll see something like:

greeting = 'Another interesting article!'

If you’re experimenting with trove-tag-bot or trove-list-bot you’ll see that greeting is set to one of two values, depending on whether it’s a ‘random’ or ‘new’ tweet:

if message_type == 'new':
    greeting = 'New item added!'
elif message_type == 'random':
    greeting = 'Another interesting item!'

Remember, variables are just containers, so we can change them whenever we want. Try editing the value of greeting. How about:

greeting = 'Wacko! More goodness from Trove!'

or

greeting = 'OMG I didn\'t expect to find this in Trove!'

It’s just a matter of clicking on the text in Glitch and changing it, but there are a few things to be wary of:

  • Keep the basic structure
  • Be careful with apostrophes
  • Beware the Twitter character limit
  • Curly brackets for tags

Keep the basic structure

Don’t change the greeting = part of the line, and make sure the quotation marks are still there!

greeting = 'Message goes here in quotes!'

Be careful with apostrophes

We’re using single quotes around our message, so if you include a single quote inside the message it will break the code. You can either ‘escape’ the apostrophe using a backslash, or use double quotes around the whole message:

greeting = 'OMG I didn\'t expect to find this in Trove!'

or

greeting = "OMG I didn't expect to find this in Trove!"

Beware the Twitter character limit

Keep your new phrase under about 50 characters. That should mean you’re always within the 280 character limit.

Although the code truncates item titles to 200 characters, it doesn’t do any other checking for length. One way of improving the code might be to create a new function that checks the length of a message before tweeting.
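Here’s one rough way such a check might look. It’s a sketch rather than part of the starter kits: fits_in_tweet() and build_message() are names I’ve invented, and the article url is a made-up example. Twitter counts characters a little differently from Python (links, for example, are counted at a fixed length), so treat len() as a guide only.

```python
TWEET_LIMIT = 280  # The limit mentioned above


def fits_in_tweet(message, limit=TWEET_LIMIT):
    """Return True if the message is short enough to tweet."""
    return len(message) <= limit


def build_message(greeting, date, title, url, limit=TWEET_LIMIT):
    """Assemble the tweet, trimming the title if the whole thing is too long."""
    message = f"{greeting} {date}, '{title}': {url}"
    overflow = len(message) - limit
    if overflow > 0:
        # Cut the overflow from the title, plus one character for the ellipsis
        keep = max(len(title) - overflow - 1, 0)
        title = title[:keep].rstrip() + '…'
        message = f"{greeting} {date}, '{title}': {url}"
    return message


long_title = 'A very long headline ' * 20
message = build_message(
    'Wacko! More goodness from Trove!',
    '1 Jan 1901',
    long_title,
    'http://nla.gov.au/nla.news-article123',  # Made-up article id
)
print(len(message), fits_in_tweet(message))
```

The title is the safest thing to trim, because the greeting, date, and url all need to stay intact.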

Curly brackets for tags

The trove-tag-bot-2 greeting is slightly different. It includes a set of curly brackets where the value of the tag itself is inserted. The f makes this a special type of string – it’s like a template, where the value of a variable is inserted in place of the curly brackets.

if message_type == 'new':
    greeting = f"New item tagged '{tag}'!"
elif message_type == 'random':
    greeting = f"Another item tagged '{tag}'!"

You can change the text around the tag. Just make sure you keep the curly brackets and the f as they are, for example:

greeting = f"More '{tag}' from Trove to you!"

You should also count the tag in your 50 character limit.


Mixing things up

So we’ve seen how easy it is to change your bot’s default message, but it’s still going to be tweeting the same message every time. Let’s mix things up a bit.

This time, instead of manually adding a new phrase into the greeting variable, we’re going to make a random selection from a list of phrases and automatically add it to our tweet.

Let’s break down the tasks:

  • Think of some suitable phrases
  • Create a list containing the phrases
  • Select one of the phrases at random
  • Insert the selected phrase into our tweet

Think of some phrases

First of all you need to think of a few suitable phrases – remember to keep them under about 50 characters. For this example, let’s use:

  • ‘Hey check this out!’
  • ‘Trove never ceases to surprise!’
  • ‘Look what I found!’
  • ‘More Trove goodness!’

Create a list

The message variable we met above was a ‘string’ – it just contained text. Other variable types include integers, floats, lists, and dictionaries. In Python a ‘list’ is a variable that contains umm… a list of things. In other programming languages, lists can be called ‘arrays’.

Python lists can contain just about anything – strings, numbers, even other lists. In this case we’re going to create a list of strings. Here’s how to create a list called phrases that contains our examples.

phrases = ['Hey check this out!', 'Trove never ceases to surprise!', 'Look what I found!', 'More Trove goodness!']

As you can see, lists are created by using square brackets, with individual members separated by commas. Here are some other lists:

[1, 5, 73, 42]
['apple', 'pecan', 36, 'rhubarb', 3.14]

Select a phrase at random

Python includes lots of different modules that extend its core functionality. When you need to use one of these modules, you import it into your code. Near the top of server.py you’ll see a number of import statements, including import random. This makes the random module available in our code to help us do useful randomish stuff.

The random module includes a function called choice() that selects a random item from a list. So choosing one of our phrases at random is as simple as:

selected_phrase = random.choice(phrases)

As you can see, the way we call a function within a module is to use a dot – random.choice() calls the choice() function inside the random module. The choice() function expects a list (or something similar) and so we give it our phrases list. The function gives back one of the phrases which we then store in the selected_phrase variable.

Insert the phrase in our tweet

Let’s go back to the default message for random tweets:

greeting = 'Another interesting article!'

What we want to do is replace ‘Another interesting article!’ with our selected phrase.

greeting = selected_phrase

But what about tags?

As we saw above, the trove-tag-bot-2 messages include the tag itself. Of course you could just hard code the tag name into your phrases, but for something more re-usable just include a placeholder in your phrases. For example:

phrases = [f'Look what I found tagged {tag}!', f'More gold from the {tag} tag!']

Change the default message as above:

greeting = selected_phrase
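One thing to watch: the placeholders in an f-string are filled in at the moment the list is created, so tag needs to have a value before the phrases line runs. Here’s a small sketch of that approach next to an alternative that uses plain strings and Python’s built-in format() method to fill in the tag later. The tag value and the templates list are just examples I’ve made up.

```python
import random

tag = 'queensland'  # Example value; in the bot this comes from your settings

# Option 1: f-strings, filled in as soon as the list is created
phrases = [f'Look what I found tagged {tag}!', f'More gold from the {tag} tag!']
greeting = random.choice(phrases)
print(greeting)

# Option 2: plain strings with a placeholder, filled in later using format()
templates = ['Look what I found tagged {tag}!', 'More gold from the {tag} tag!']
greeting = random.choice(templates).format(tag=tag)
print(greeting)
```

The second option lets you define your phrases once, near the top of the file, even before the tag is known.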

The new function

Here’s the complete prepare_message() function from trove-title-bot incorporating all the steps above:

def prepare_message(item):
    # Our list of phrases
    phrases = ['Hey check this out!', 'Trove never ceases to surprise!', 'Look what I found!', 'More Trove goodness!']
    # Select one
    selected_phrase = random.choice(phrases)
    # Placeholder for our phrase
    greeting = selected_phrase
    details = None
    date = arrow.get(item['date'], 'YYYY-MM-DD').format('D MMM YYYY')
    title = truncate(item['heading'], 200)
    url = f'http://nla.gov.au/nla.news-article{item["id"]}'
    message = f"{greeting} {date}, '{title}': {url}"
    return message

This will be a little different if your bot tweets new as well as random items, but the steps are basically the same.
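For bots that tweet both new and random items, one possible approach is to keep a separate list of phrases for each type of tweet. This is a sketch only: the PHRASES dictionary, the choose_greeting() function, and the ‘new’ phrases are my own inventions, not part of the starter kits.

```python
import random

# Hypothetical phrase lists, one for each type of tweet
PHRASES = {
    'new': ['Fresh from Trove!', 'Just added!'],
    'random': ['Hey check this out!', 'Look what I found!'],
}


def choose_greeting(message_type):
    """Pick a random phrase that suits the type of tweet."""
    # Fall back to the 'random' phrases if the type isn't recognised
    options = PHRASES.get(message_type, PHRASES['random'])
    return random.choice(options)


print(choose_greeting('new'))
print(choose_greeting('random'))
```

Inside your bot you could then replace the if/elif lines that set greeting with something like greeting = choose_greeting(message_type).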


Would you like keywords with that?

All four starter kits work with a particular slice of Trove – a list, a tag, a collection, or a newspaper. But you might want to have more fine-grained control over your bot’s selections. One way of doing that is to add a few keywords to the mix.

For example, @CaddieBrain used the trove-title-bot kit to create @NTTimesGazette, sharing articles from the Northern Territory Times and Gazette. The problem was, while the articles were published in the NT, they weren’t about the NT – @CaddieBrain asked if there was a way of making the results more local.

One way of doing this is to think of some local place names and add them into the query the bot uses to get data from the Trove API. So the steps would be:

  • Construct a search using the Trove web interface that returns the results you want
  • Insert our query into the bot’s API url

In this case I’d start with a search limited to the Northern Territory Times and Gazette. Then I’d start adding some place names to the search box – let’s try “Port Darwin” or “Pine Creek”.

Screen capture of Trove search box

Note the double quotes around the names to make Trove treat them as phrases, and the ‘OR’ to show we’d be happy with one or the other or both. You could add as many ‘OR’ clauses as you want.

That seems to work pretty well. So all we need to do is copy the contents of the search box – ie the whole "port darwin" OR "pine creek" bit.
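If your list of place names starts to grow, typing all those quotes and ORs by hand gets fiddly. Here’s a little sketch that builds the same sort of query from a Python list. The build_or_query() function is my own invention, and ‘katherine’ is just an extra example place name.

```python
def build_or_query(phrases):
    """Wrap each phrase in double quotes and join them with OR."""
    quoted = [f'"{phrase}"' for phrase in phrases]
    return ' OR '.join(quoted)


places = ['port darwin', 'pine creek', 'katherine']
print(build_or_query(places))
```

You can paste the printed result into the Trove search box to check that it returns what you expect before using it in your bot.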

Insert our query

Open up server.py and look for the tweet_random() function. You should see a line where the article is set by calling the get_random_article() function:

article = get_random_article(title=title, category='Article')

In the trove-title-bot-2 the get_random_article() function expects you to specify a value for the title you’re interested in – that’s what the title=title parameter is doing. But the function will also accept a query parameter. To filter the pool of random articles by our keywords, it’s just a matter of passing them to the function using query:

article = get_random_article(title=title, query='"port darwin" OR "pine creek"', category='Article')

That’s it!

Bonus points for extra reusability

The approach above works perfectly well, but rather than directly editing the code when we want to change our query, it might be better to store it with the other configuration settings in .env.

Just open .env and add a new line:

QUERY="\"port darwin\"+OR+\"pine creek\""

Note that I’m using backslashes to escape the double quotes inside the query.

Settings in the .env file are added to the application’s ‘environment’. To grab the query from the environment we have to add this line to server.py:

QUERY = os.environ.get('QUERY')

This line saves the query to a variable named QUERY. Then we can feed it to the get_random_article() function as before.

article = get_random_article(title=title, query=QUERY, category='Article')
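What happens if the QUERY line is missing from .env? In that case os.environ.get() gives back None. The sketch below shows one way of coping with that, by only including the query when there is one. The search_options() function is hypothetical, the first line fakes the .env setting so the example runs on its own, and ‘11’ is just an example title id.

```python
import os

# For this demo only: pretend the setting has been loaded from .env
os.environ['QUERY'] = '"port darwin"+OR+"pine creek"'

QUERY = os.environ.get('QUERY')
# A setting that hasn't been defined comes back as None
MISSING = os.environ.get('SETTING_THAT_DOES_NOT_EXIST')


def search_options(title, query=None):
    """Collect the keyword arguments, leaving out the query if there isn't one."""
    options = {'title': title, 'category': 'Article'}
    if query:
        options['query'] = query
    return options


print(search_options('11', QUERY))
print(search_options('11', MISSING))
```

In the bot, the result could be handed on with something like article = get_random_article(**search_options(title, QUERY)), where the two asterisks unpack the dictionary into keyword arguments.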

The rise of the hybrid bot

If you compare the code of the different bot starter kits you’ll see that they have a lot in common. With a bit of experimenting, you should be able to mix and match various approaches to create hybrid bots.

For example, @follysantidote used the trove-tag-bot kit to make a bot that shared items with the tag ‘queensland’. But @follysantidote wanted to do something a bit different – to only tweet tagged items that came from the Canberra Times. How? The answer was to create a hybrid tag/title bot. @BotCBR_QLD was born!

Assuming that the tag bot is up and running, only a couple of minor tweaks are required.

First, open up the .env file, and change the line that says:

ZONES="article,work"

to just:

ZONES="article"

This means that the bot will only look for newspaper articles.

Next find the title identifier of the newspaper you’re interested in. See the trove-title-bot-2 documentation for more information on this.

Open up server.py and find the tweet_new() function. Look for the following lines:

if query_type == 'article':
    item = get_random_article(query=query, publictag=tag)

We just need to give the get_random_article() function the id of your newspaper. The id of the Canberra Times is 11, so in this case we’d change the code to:

if query_type == 'article':
    item = get_random_article(query=query, title='11', publictag=tag)

The tweet_random() function needs to be changed in the same way. From this:

if query_type == 'article':
    item = get_random_article(publictag=tag)

to this:

if query_type == 'article':
    item = get_random_article(title='11', publictag=tag)

Adding the title parameter is the equivalent of checking one of the title facets in the web interface – it limits results to that particular newspaper.

Bonus points

Using the keywords example above, you should be able to work out how to store the newspaper title id in the .env file.

More possibilities?

There are plenty of other bot recipes on Glitch to experiment with. If you already know some JavaScript and don’t want to play around in Python, have a look at Stefan Bohacek’s Node.js bots on Glitch.

If you need help with your Trove bots, tweet at @wragge. Remember, you don’t have to be a coder to make your own Trove bot.

Trove bots for all!

Article updated on 12 April 2020


Tim Sherratt

Historian and hacker
