BootCaT sometimes crashes when you download a very long list of URLs. We're trying to fix the problem, but for now here's a handy workaround.
Even if BootCaT crashed while downloading files, you can find a file called urls_list_final.txt in the folder created for the failed attempt at building your corpus: that's the list of all the URLs you collected in the first stage of the corpus creation process.
You can simply try again with that list using the Custom URLs corpus creation mode.
Another solution is to download the files with an external program and then clean them with BootCaT using the Local files corpus creation mode.
Here's a step-by-step guide to downloading files using the freeware external downloader WinWget and then turning them into a corpus with BootCaT.
wget.exe file you downloaded earlier
urls_list_final.txt file
There must be double quotes (") at the beginning and the end of the file path, so it should look something like "C:\Users\john\Desktop\urls_list_final.txt". The important part is that there must be double quotes at the beginning and at the end of the line.
Local files
Open
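If you would rather script the download than use WinWget, the same idea can be sketched in a few lines of Python. This is only a rough illustration, not part of BootCaT: the function names (read_urls, fetch, download_all) are made up for this example, the script assumes urls_list_final.txt holds one URL per line in UTF-8, and it saves every page with an .html extension, which is a simplification if your list also points to PDFs or other file types.

```python
import hashlib
import urllib.request
from pathlib import Path


def read_urls(list_path):
    """Return the non-empty, de-duplicated URLs from a one-URL-per-line file."""
    seen = set()
    urls = []
    # "utf-8-sig" reads plain UTF-8 and also tolerates a leading byte-order mark.
    for line in Path(list_path).read_text(encoding="utf-8-sig").splitlines():
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def fetch(url, timeout=30):
    """Download one URL and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, out_dir, fetch_func=fetch):
    """Save each URL into out_dir and return a list of (url, error) failures."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    failures = []
    for number, url in enumerate(urls, start=1):
        # Hash the URL so every file gets a unique, filesystem-safe name.
        digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:10]
        target = out / f"{number:05d}_{digest}.html"
        try:
            target.write_bytes(fetch_func(url))
        except Exception as error:  # one bad URL must not stop the whole run
            failures.append((url, str(error)))
    return failures
```

To try it, call download_all(read_urls(r"C:\Users\john\Desktop\urls_list_final.txt"), r"C:\Users\john\Desktop\downloaded_pages") with your own paths (the ones shown here are placeholders), then point the Local files corpus creation mode at the output folder. URLs that fail are collected and returned rather than stopping the run, so one unreachable page does not cost you the rest of the list.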