bootcat:release_notes:toolkit:0.18


Version 0.18 (TBA)

  • New tool: BootCaTExtractor.jar performs the same task as retrieve_and_clean_pages_from_url_list.pl but, unlike the Perl script, supports UTF-8, language filtering, and document size filtering;
  • UrlCollector.jar no longer requires the “market” parameter.

Last modified: 2013/06/17 16:00 by eros