Topic: Rotten Tomatoes Scraper in VBA

Offline mike_carton

Rotten Tomatoes Scraper in VBA
« on: February 07, 2016, 04:59:44 pm »
The code below is written in Microsoft VBA. It scrapes the Rotten Tomatoes web site to retrieve:

- All Critics' Score - Number
- All Critics' Opinion - rotten/fresh/certified
- Top Critics' Score - Number
- Top Critics' Opinion - rotten/fresh/certified
- Critical Consensus - descriptive text

It was developed in Excel's VBA development environment but should work anywhere VBA is supported, with minimal changes if any. Conversion to VB or VBScript should be straightforward. If you improve the code or convert it to C#, JavaScript or some other language, please post your code here for others to use.

By its very nature, web page scraping is fragile: even a minor change the site owner makes to the page template can break it. Since Rotten Tomatoes no longer supports searching by IMDb ID and does not hand out API keys liberally, scraping seems to be the only option. Please do not abuse this code (or any other, for that matter) to retrieve hundreds or thousands of movies in a day; they will simply identify the source of the calls and find a way to block them or break the scraper. Using this code (or a converted version of it) on a web site to show the retrieved information is likely to be discovered easily as well; for such use, request API keys and they'll happily oblige. Please don't do anything that would jeopardize the use of this code by legitimate, hobbyist Mede8er owners who just want this information for movies they own, to add to their metadata for personal use.
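
One simple way to stay on the polite side is to pause between lookups. The sketch below is not part of the attached module; PauseSeconds and the ScrapeMovie name in the usage comment are hypothetical, and the 5-second delay is only an assumed value, not a limit published by the site.

```vba
Option Explicit

' Hypothetical helper, not part of the attached module.
' Waits roughly the given number of seconds before returning.
' Uses only Now and DoEvents, so it does not rely on Excel-specific calls.
' Note: Now has one-second resolution, so the wait is approximate.
Public Sub PauseSeconds(ByVal seconds As Double)
    Dim finish As Date
    finish = Now + seconds / 86400#    ' 86400 seconds in a day
    Do While Now < finish
        DoEvents                        ' keep the host application responsive
    Loop
End Sub

' Usage sketch (ScrapeMovie is a placeholder for the actual lookup routine):
'
'   For Each movieTitle In titleList
'       ScrapeMovie movieTitle
'       PauseSeconds 5                  ' assumed delay, adjust to taste
'   Next
```

This is a busy-wait loop, so it keeps one CPU core occupied while waiting; for a handful of movies per session that should not matter.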

Sample Output
Movie Title: Dogma
Movie Year:  1999
All Critics: fresh at 67%
Top Critics: rotten at 48%
Critical Consensus: Provocative and audacious, Dogma is an uneven but thoughtful religious satire that's both respectful and irreverent.

Attachments
1. Sample Output in Debug Window (more movies; pdf)
2. Sample Output Message Box screenshot (a favorite movie)
3. modScraper.pdf (VBA code in a pdf to meet the forum requirements)

Limitations
1. Fragile and susceptible to failure on minor changes to the web site
2. Very little error handling
3. If an incorrect year is given, the code returns no information (it does not crash, though)
4. If there are multiple movies with the same name, the correct year becomes necessary; otherwise, a crash is likely

Potential Improvements
1. Add error handling, making it robust so it fails gracefully when needed.
2. When the year is not available as input, search for the movie and pull out the first (or most recent) movie from the search results. Retrieve the information for that movie.
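
A minimal sketch of improvement 2, assuming the search results have already been scraped into two parallel arrays (titles and years). PickMovieIndex is a hypothetical helper and is not in the attached module; the movie names in the demo are made up. It returns -1 instead of raising an error when nothing matches, which also covers part of improvement 1.

```vba
Option Explicit

' Hypothetical helper, not part of the attached module.
' titles and years are parallel arrays taken from a search-results page.
' Returns the index of the chosen movie, or -1 when there is no match,
' so the caller can fail gracefully instead of crashing.
Public Function PickMovieIndex(titles As Variant, years As Variant, _
                               wantedTitle As String, _
                               Optional wantedYear As Long = 0) As Long
    Dim i As Long
    Dim best As Long
    best = -1

    For i = LBound(titles) To UBound(titles)
        ' Case-insensitive title comparison
        If StrComp(CStr(titles(i)), wantedTitle, vbTextCompare) = 0 Then
            If wantedYear <> 0 Then
                ' Year supplied: accept only an exact year match
                If CLng(years(i)) = wantedYear Then
                    PickMovieIndex = i
                    Exit Function
                End If
            ElseIf best = -1 Then
                ' No year supplied: first matching title so far
                best = i
            ElseIf CLng(years(i)) > CLng(years(best)) Then
                ' No year supplied: prefer the most recent release
                best = i
            End If
        End If
    Next i

    PickMovieIndex = best
End Function

Public Sub DemoPickMovie()
    Dim titles As Variant, years As Variant
    ' Made-up search results, for illustration only
    titles = Array("Example Movie", "Other Movie", "Example Movie")
    years = Array(1990, 2005, 2012)

    Debug.Print PickMovieIndex(titles, years, "Example Movie")        ' most recent match
    Debug.Print PickMovieIndex(titles, years, "Example Movie", 1990)  ' exact year match
    Debug.Print PickMovieIndex(titles, years, "Example Movie", 2000)  ' no match: -1
End Sub
```

The caller would check for -1 before trying to read any scores, and show a "movie not found" message in that case.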

Offline OldskoolOrion

Re: Rotten Tomatoes Scraper in VBA
« Reply #1 on: March 01, 2016, 04:24:03 pm »
I like the output :) Nice and concise... something I should add to my XMLs :-)


And just when I thought I had my template just the way I wanted haha
Alphamale on betablockers !