The Tealhinator (occupation by RNN)

Becoming a Pinter, Garner, Pepytatil Nurse, Chief Infospation Officer or Tealhinator, as a career choice, seems to be completely fine in the eyes of a Recurrent Neural Network

The context? Generating new occupations based on existing ones

Wiki fetch

To get existing job titles, Wikipedia was consulted

Initially, the plan was to write a crawler, but the idea was dropped in favor of more “cavalier” code that uses the MediaWiki API

To retrieve all known professions, the initial lists of occupations have to be walked through first. Titles from the second level are subsequently fetched, and the special ones (e.g. Category:, Talk:) or references to nested lists are removed
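The walk described above could look roughly like the sketch below. It is not the original code: the English Wikipedia endpoint, the root page name "Lists of occupations", the User-Agent string, the namespace prefixes and all function names are assumptions made for illustration. Only the `action=query` / `prop=links` call and its `continue` mechanism are standard MediaWiki API behavior.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"  # assumption: English Wikipedia
# Assumption: a non-exhaustive set of namespace prefixes treated as "special".
SPECIAL_PREFIXES = ("Category:", "Talk:", "Wikipedia:", "Template:",
                    "Help:", "Portal:", "File:")


def is_list_page(title):
    """True for titles that point at another list (e.g. 'List of ...')."""
    return title.lower().startswith(("list of", "lists of"))


def is_job_title(title):
    """Keep plain article titles; drop special pages and nested lists."""
    return not title.startswith(SPECIAL_PREFIXES) and not is_list_page(title)


def fetch_links(page_title):
    """Return the titles of all pages linked from page_title."""
    params = {"action": "query", "prop": "links", "titles": page_title,
              "pllimit": "max", "format": "json"}
    titles = []
    while True:
        url = API_URL + "?" + urllib.parse.urlencode(params)
        request = urllib.request.Request(
            url, headers={"User-Agent": "occupation-fetch-sketch/0.1"})
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        for page in data.get("query", {}).get("pages", {}).values():
            titles.extend(link["title"] for link in page.get("links", []))
        if "continue" not in data:
            return titles
        params.update(data["continue"])  # carries plcontinue for the next batch


def collect_job_titles(root="Lists of occupations"):
    """Walk the first-level lists, then gather filtered second-level titles."""
    titles = set()
    for list_page in filter(is_list_page, fetch_links(root)):
        titles.update(t for t in fetch_links(list_page) if is_job_title(t))
    return sorted(titles)


if __name__ == "__main__":
    print(len(collect_job_titles()), "job titles collected")
```

Keeping the filter as a plain function, separate from the network call, makes it easy to adjust the list of prefixes without touching the fetch logic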

The ∑

Machine learning and neural networks thrive on large input datasets, while we’re dealing with fewer than 1k records here. Hence, to make the most of it, we’re forced to tweak heavily and try various combinations of hidden layers, dropout rates, RNN sizes and batch sizes

After testing multiple training setups on more than a dozen t2.medium instances running Amazon Linux 2, three configurations yielded the best results:

| rnn_size | num_layers | batch_size | dropout |
|----------|------------|------------|---------|
| 256      | 5          | 1          | 0.5     |
| 128      | 4          | 2          | 0.5     |
| 128      | 3          | 2          | 0.6     |
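Trying out such combinations can be organized as a small grid. The sketch below is only an illustration: the candidate values are the ones that appear in the three best configurations above, while the full range actually tried is not stated in this post, and the `GRID` and `configurations` names are made up for the example.

```python
from itertools import product

# Assumption: candidate values are limited to those seen in the best
# configurations; the real search space is not documented here.
GRID = {
    "rnn_size": [128, 256],
    "num_layers": [3, 4, 5],
    "batch_size": [1, 2],
    "dropout": [0.5, 0.6],
}


def configurations(grid):
    """Yield one dict per combination of hyperparameter values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


if __name__ == "__main__":
    for config in configurations(GRID):
        print(config)  # hand each config to the training run of your choice
```

Each yielded dict corresponds to one candidate training setup, so the runs can be spread across separate instances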

Automated job title generation on a new VM, using the last row’s parameters for the RNN, can be triggered via: