Sven Slootweg
|
d98ee113bc
|
Rewrite generic OCW parser, BeautifulSoup fix to allow exclusion of comments for string retrieval, and fix BS4 bug
|
12 years ago |
Sven Slootweg
|
98340b38a0
|
Rewrite University of Reddit crawler - now with less hacks!
|
12 years ago |
Sven Slootweg
|
8bbffb9429
|
Add topic_exists and item_exists methods to Scraper class
|
12 years ago |
Sven Slootweg
|
0e4df4549f
|
No need to import oursql from within the scrapers
|
12 years ago |
Sven Slootweg
|
2c3bcc5418
|
Rewrite Khan Academy crawler
|
12 years ago |
Sven Slootweg
|
d9034b6215
|
Consistently use row_id, and not itemid or rowid
|
12 years ago |
Sven Slootweg
|
8c0033074b
|
Support both output logging and error logging in the Environment.log() method
|
12 years ago |
Sven Slootweg
|
b3edd35ecf
|
Add support for lectures and sandboxes
|
12 years ago |
Sven Slootweg
|
d6d8eb70b9
|
Fix typo - it should be Khan Academy, not Khan University.
|
12 years ago |
Sven Slootweg
|
fb6c43a38f
|
Rewrite scraper to be more modular, and convert the Coursera crawler to the new model
|
12 years ago |
Sven Slootweg
|
c2a8a66dac
|
Update README to fix dependencies list
|
12 years ago |
Sven Slootweg
|
a690cb2c8f
|
Add rudimentary first version of the OCW scraper
|
12 years ago |
Sven Slootweg
|
f188d443d1
|
Add README
|
12 years ago |
Sven Slootweg
|
43c700ac2b
|
Add list of various OCW sources for parser development
|
12 years ago |
Sven Slootweg
|
26b68952fa
|
Add table structure updates for new version of updater
|
12 years ago |
Sven Slootweg
|
a4e744f892
|
Add list of sources for book data
|
12 years ago |
Sven Slootweg
|
d3bd59f813
|
Add modified version of BeautifulSoup4 (nth-of-type pseudoselector and full-featured direct descendant support)
|
12 years ago |
Sven Slootweg
|
8e951f6b27
|
Add simple script for searching from a terminal
|
12 years ago |
Sven Slootweg
|
d387541822
|
Support custom provider names
|
12 years ago |
Sven Slootweg
|
a6e350c0d9
|
Add dumping script
|
12 years ago |
Sven Slootweg
|
0f5cade812
|
Simple dumper
|
12 years ago |
Sven Slootweg
|
fa74d394a7
|
Filter _ search terms
|
12 years ago |
Sven Slootweg
|
a9d2576eaf
|
Add donation link
|
12 years ago |
Sven Slootweg
|
f57d45fa53
|
Add header message
|
12 years ago |
Sven Slootweg
|
1503c1f75f
|
Add 404 page
|
12 years ago |
Sven Slootweg
|
bfbfd821b5
|
Include a small preview in the search results
|
12 years ago |
Sven Slootweg
|
efeef5f70e
|
Change search term requirements
|
12 years ago |
Sven Slootweg
|
3f02174ba3
|
Implement some very basic methods to prevent overloading
|
12 years ago |
Sven Slootweg
|
1fbb21e6d8
|
Properly use the password when connecting the crawlers
|
12 years ago |
Sven Slootweg
|
dd4c62bc4e
|
Very basic error handling
|
12 years ago |
Sven Slootweg
|
6ec1a2d90b
|
Add crawlers for coursera and ureddit, get first quick and dirty version of frontend done, and fix bugs and stuff
|
12 years ago |
Sven Slootweg
|
703a34bfa2
|
Reorganize updater code and add first design idea for frontend
|
12 years ago |
Sven Slootweg
|
8152ec8dca
|
First version of update script
|
12 years ago |