What is this?

This software crawls articles published on the front pages of various online
news outlets. For every article, it extracts the title, category, content,
links, and links to embedded media. The extracted data is stored in a
plaintext database, as a series of JSON files.
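
As an illustration of the storage format described above, here is a minimal
sketch of writing one extracted article to its own JSON file. The field names,
the "id" key used for the file name, and the save_article helper are all
assumptions made for this example; they are not taken from the project's code.

```python
import json
import os
import tempfile


def save_article(article, directory):
    """Write one article dict to its own JSON file and return the file path.

    Hypothetical helper: the real project may name and lay out files differently.
    """
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "{0}.json".format(article["id"]))
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps accented characters readable in the file.
        json.dump(article, f, ensure_ascii=False, indent=2)
    return path


# Illustrative record covering the fields listed above (names are assumed).
example_article = {
    "id": "example-0001",
    "title": "Example headline",
    "category": ["politics"],
    "content": "Body of the article as plain text.",
    "links": ["http://example.com/related-story"],
    "embedded_media_links": ["http://example.com/video/42"],
}

if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    print(save_article(example_article, out_dir))
```

Keeping one file per article means the database can be inspected with any text
editor and loaded back with json.load.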

Third-party dependencies

BeautifulSoup. This project currently uses a mix of BeautifulSoup versions 3
and 4. It's not pretty, but porting the old code was not a priority. The two
versions live in different namespaces (the BeautifulSoup module for version 3,
the bs4 package for version 4), so they do not conflict.
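
To show why the two versions can coexist, the sketch below looks up the
BeautifulSoup class in each namespace. The load_soup_class helper and the
SoupV3/SoupV4 names are hypothetical and not part of this project. The lookup
returns None when a version is not installed, so the snippet runs either way.

```python
import importlib


def load_soup_class(module_name):
    """Return the BeautifulSoup class from the named module, or None if absent."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "BeautifulSoup", None)


# Version 3 installs a top-level module named "BeautifulSoup".
SoupV3 = load_soup_class("BeautifulSoup")
# Version 4 installs a package named "bs4".
SoupV4 = load_soup_class("bs4")

if __name__ == "__main__":
    for label, cls in (("version 3", SoupV3), ("version 4", SoupV4)):
        print(label, "available" if cls is not None else "not installed")
```

Because the import paths differ, old code can keep using the version 3 class
while newer code uses the version 4 one under a different name.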