Python Web Crawler / MP3 Blog Aggregator

The media web crawler we need must be able to crawl a given set of URLs. From each page, the crawler must extract the audio file that contains the song. It must also index all metadata for that song, including the artist name, track name, album name, genre, track length, album cover/artwork, etc., as well as the website the song was pulled from. Once this information is collected, a streamable URL must be created so that the song can be streamed through an audio player built with SoundManager2. The primary function of our application is to produce playable search results: when a user types a query into the search bar, the matching playable results are displayed.
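
As a rough illustration of the extraction step described above, here is a minimal sketch that pulls audio file links out of a page's HTML using only the Python standard library. The names `AudioLinkParser`, `extract_audio_links`, and `AUDIO_EXTENSIONS`, along with the list of file extensions, are assumptions made for this example and are not part of the project requirements. Fetching pages, reading tag metadata, and handling embedded players (YouTube, SoundCloud, etc.) are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed set of audio file extensions; adjust to the formats actually needed.
AUDIO_EXTENSIONS = (".mp3", ".m4a", ".ogg", ".wav")


class AudioLinkParser(HTMLParser):
    """Collects absolute URLs of audio files referenced by a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.audio_urls = []

    def handle_starttag(self, tag, attrs):
        # Only look at tags that commonly point at audio files.
        if tag not in ("a", "audio", "source"):
            return
        for name, value in attrs:
            if name in ("href", "src") and value:
                absolute = urljoin(self.base_url, value)
                # Compare against the path only, so query strings are ignored.
                path = urlparse(absolute).path.lower()
                if path.endswith(AUDIO_EXTENSIONS) and absolute not in self.audio_urls:
                    self.audio_urls.append(absolute)


def extract_audio_links(html, page_url):
    """Return one record per audio link, keeping the page it was found on."""
    parser = AudioLinkParser(page_url)
    parser.feed(html)
    return [{"audio_url": url, "source_page": page_url} for url in parser.audio_urls]
```

Each record keeps the source page alongside the audio URL, since the index also needs to store which website the song was pulled from.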

Example: User searches “The Beatles”

Result: The search engine returns every playable result for a Beatles song found across a list of over 6,000 blogs/websites. These results can come from YouTube, SoundCloud, Zippyshare, Hulkshare, or any other site that hosts a Beatles song.
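
The search behavior in this example could look roughly like the sketch below. It uses an in-memory SQLite database purely as a stand-in for the MySQL database listed in the requirements. The table layout and the names `build_index` and `search` are hypothetical, chosen only to show the idea of matching a query against indexed artist and track names.

```python
import sqlite3

FIELDS = ("artist", "title", "stream_url", "source_site")


def build_index(tracks):
    """Load crawled track records (dicts) into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tracks (artist TEXT, title TEXT, stream_url TEXT, source_site TEXT)"
    )
    conn.executemany(
        "INSERT INTO tracks VALUES (:artist, :title, :stream_url, :source_site)",
        tracks,
    )
    return conn


def search(conn, query):
    """Return tracks whose artist or title contains the query text."""
    # SQLite's LIKE is case-insensitive for ASCII characters by default.
    pattern = f"%{query}%"
    rows = conn.execute(
        "SELECT artist, title, stream_url, source_site FROM tracks "
        "WHERE artist LIKE ? OR title LIKE ? ORDER BY artist, title",
        (pattern, pattern),
    )
    return [dict(zip(FIELDS, row)) for row in rows]
```

Calling `search(conn, "the beatles")` would return every indexed row whose artist or title contains that text, each with the streamable URL the player needs. A production index over thousands of sites would likely need full-text search rather than `LIKE`.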

Experience/Skills Required:

**Ability to speak English and communicate effectively and efficiently. Must be available on Skype, IM, phone, etc. to give daily updates.**

Python

Web/Server Side app development

Scalable applications (Large scale/Enterprise solutions)

AWS (Amazon Web Services)*

EC2 (Amazon Elastic Compute Cloud)*

SQS (Amazon Simple Queue Service)*

PHP

Database Programming Experience

CakePHP (Or Equivalent)

MySQL

* means that this item can be substituted with another hosting service. I ask that you have experience using AWS so that there are no problems moving forward.

For working examples of this type of technology please visit the following websites:
