Commit Graph

  • d98ee113bc Rewrite generic OCW parser, BeautifulSoup fix to allow exclusion of comments for string retrieval, and fix BS4 bug [develop] Sven Slootweg 2013-01-31 01:36:20 +0100
  • 98340b38a0 Rewrite University of Reddit crawler - now with less hacks! Sven Slootweg 2013-01-30 22:36:42 +0100
  • 8bbffb9429 Add topic_exists and item_exists methods to Scraper class Sven Slootweg 2013-01-30 22:30:13 +0100
  • 0e4df4549f No need to import oursql from within the scrapers Sven Slootweg 2013-01-30 22:03:55 +0100
  • 2c3bcc5418 Rewrite Khan Academy crawler Sven Slootweg 2013-01-30 20:42:46 +0100
  • d9034b6215 Consistently use row_id, and not itemid or rowid Sven Slootweg 2013-01-30 20:42:23 +0100
  • 8c0033074b Support both output logging and error logging in the Environment.log() method Sven Slootweg 2013-01-30 20:41:51 +0100
  • b3edd35ecf Add support for lectures and sandboxes Sven Slootweg 2013-01-30 20:41:11 +0100
  • d6d8eb70b9 Fix typo - it should be Khan Academy, not Khan University. Sven Slootweg 2013-01-30 20:07:50 +0100
  • fb6c43a38f Rewrite scraper to be more modular, and convert the Coursera crawler to the new model Sven Slootweg 2013-01-30 19:43:48 +0100
  • c2a8a66dac Update README to fix dependencies list Sven Slootweg 2013-01-30 14:17:32 +0100
  • a690cb2c8f Add rudimentary first version of the OCW scraper Sven Slootweg 2013-01-30 13:41:27 +0100
  • f188d443d1 Add README Sven Slootweg 2013-01-30 13:39:44 +0100
  • 43c700ac2b Add list of various OCW sources for parser development Sven Slootweg 2013-01-30 13:34:18 +0100
  • 26b68952fa Add table structure updates for new version of updater Sven Slootweg 2013-01-30 13:33:24 +0100
  • a4e744f892 Add list of sources for book data Sven Slootweg 2013-01-30 13:33:07 +0100
  • d3bd59f813 Add modified version of BeautifulSoup4 (nth-of-type pseudoselector and full-featured direct descendant support) Sven Slootweg 2013-01-30 13:30:18 +0100
  • 8e951f6b27 Add simple script for searching from a terminal Sven Slootweg 2013-01-30 13:28:21 +0100
  • d387541822 Support custom provider names Sven Slootweg 2013-01-30 13:27:59 +0100
  • a6e350c0d9 Add dumping script Sven Slootweg 2013-01-28 17:11:44 +0100
  • 0f5cade812 Simple dumper Sven Slootweg 2013-01-28 17:10:13 +0100
  • fa74d394a7 Filter _ search terms Sven Slootweg 2013-01-28 16:43:46 +0100
  • a9d2576eaf Add donation link Sven Slootweg 2013-01-28 16:39:38 +0100
  • f57d45fa53 Add header message Sven Slootweg 2013-01-28 16:34:25 +0100
  • 1503c1f75f Add 404 page Sven Slootweg 2013-01-28 16:32:52 +0100
  • bfbfd821b5 Include a small preview in the search results Sven Slootweg 2013-01-28 16:15:06 +0100
  • efeef5f70e Change search term requirements Sven Slootweg 2013-01-28 16:09:17 +0100
  • 3f02174ba3 Implement some very basic methods to prevent overloading Sven Slootweg 2013-01-28 16:07:48 +0100
  • 1fbb21e6d8 Properly use the password when connecting the crawlers Sven Slootweg 2013-01-28 15:48:37 +0100
  • dd4c62bc4e Very basic error handling Sven Slootweg 2013-01-28 15:43:39 +0100
  • 6ec1a2d90b Add crawlers for coursera and ureddit, get first quick and dirty version of frontend done, and fix bugs and stuff Sven Slootweg 2013-01-28 14:48:35 +0100
  • 703a34bfa2 Reorganize updater code and add first design idea for frontend [master] Sven Slootweg 2013-01-27 23:06:32 +0100
  • 8152ec8dca First version of update script Sven Slootweg 2013-01-27 22:31:57 +0100