Sven Slootweg | 1f32cac481 | In the process of redoing RevisionedDict | 11 years ago
Sven Slootweg | 4927b5e7a3 | Bits and pieces of a new scraper and task distribution mechanism, and a (probably over-engineered) revisioned dict, also some ISBN scraping stuff | 11 years ago
Sven Slootweg | 6a0654b7cb | Use the proper exception name | 12 years ago
Sven Slootweg | 0a2d4fcb9f | Skip a server when it cannot be reached | 12 years ago
Sven Slootweg | 51916b8bbd | Let the Calibre crawler crawl more than one Google page | 12 years ago
Sven Slootweg | 090165e62f | Sleep for a bit after every read to not waste an entire core | 12 years ago
Sven Slootweg | 82e4826dea | Fall back to simplejson when the stdlib json module is not available | 12 years ago
Sven Slootweg | 727f150621 | Add updated data dump | 12 years ago
Sven Slootweg | e74b025219 | Use external configuration file | 12 years ago
Sven Slootweg | c274685179 | Make sure that the Google crawler runs as well | 12 years ago
Sven Slootweg | a11bf165a4 | Rewrite crawler | 12 years ago
Sven Slootweg | 8c255202c4 | Initial commit | 13 years ago