tutorials:b4b

Differences

This shows you the differences between two versions of the page.

tutorials:b4b [2019/10/31 09:30] – [Find yourself at home] albarron tutorials:b4b [2019/11/06 10:36] (current) – [Location and time] albarron
Line 5: Line 5:
 ====== Location and time ====== ====== Location and time ======
  
-The tutorial will be held in Room 4 of the PhD Lab on Wednesday 6 November from 9.15 to 10.45. +The tutorial was held in Room 4 of the PhD Lab on Wednesday 6 November from 9.15 to 10.45. 
  
 ===== Requirements ===== ===== Requirements =====
Line 15: Line 15:
   * **Linux**. Nothing extra. You are ready to go.   * **Linux**. Nothing extra. You are ready to go.
  
 +===== Resources =====
 +
 +We'll use a small subset of the English-Italian part of the Europarl parallel corpus.
 +
 +Download the two files here: {{:tutorials:b4b:en.zip|English}} and {{:tutorials:b4b:it.zip|Italian}}
  
 ===== Why is bash relevant? ===== ===== Why is bash relevant? =====
 +
 +  * Quick and easy text and data processing
 +  * The right way to interact with real computing software
 +  * One gate to Python and deep learning
 +===== Hands on Bash =====
  
  
Line 35: Line 45:
 Files can simply be displayed (without any modification) or opened for editing. You will learn to do both.  Files can simply be displayed (without any modification) or opened for editing. You will learn to do both. 
  
-Commands: ''cat'', ''more'', ''less'', ''most'', ''wc'', ''nano'' +Commands: ''cat'', ''more'', ''less'', ''most'', ''wc'', ''nano'', ''head'', ''shuf'' 
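
As a rough illustration of the non-interactive commands in this list, the sketch below displays a file without modifying it. The file ''sample.txt'' and its contents are made up for the example, and ''shuf'' is assumed to be available (it ships with GNU coreutils on Linux).

```shell
# Build a tiny sample file in a temporary directory (hypothetical content).
workdir=$(mktemp -d)
printf 'first line\nsecond line\nthird line\n' > "$workdir/sample.txt"

# cat prints the whole file.
cat "$workdir/sample.txt"

# head -n 2 prints only the first two lines.
first_two=$(head -n 2 "$workdir/sample.txt")
echo "$first_two"

# wc -l counts lines; reading from stdin keeps the file name out of the output.
line_count=$(wc -l < "$workdir/sample.txt")
echo "lines: $line_count"

# shuf prints the lines in random order; the file itself is left untouched.
shuf "$workdir/sample.txt"
```

''more'', ''less'', ''most'' and ''nano'' are interactive, so they are best tried directly in a terminal.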
 + 
 ==== Grabbing information in a file from the command line ==== ==== Grabbing information in a file from the command line ====
    
Line 46: Line 58:
 All the operations carried out so far show their results in the terminal, but they neither alter the file contents nor store the results anywhere. Now we learn how to store them. All the operations carried out so far show their results in the terminal, but they neither alter the file contents nor store the results anywhere. Now we learn how to store them.
  
-Commands: ''te'', ''>'', ''>>'' +Commands: ''>'', ''%%>>%%''
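
A minimal sketch of the difference between the two redirection operators: ''>'' creates or overwrites the target file, while ''%%>>%%'' appends to it. The file name ''result.txt'' is hypothetical.

```shell
# Work in a temporary directory so nothing real gets overwritten.
workdir=$(mktemp -d)
out="$workdir/result.txt"

# > creates the file (or overwrites it if it already exists).
echo "first run" > "$out"

# >> appends to the end of the existing file.
echo "second run" >> "$out"
lines_after_append=$(wc -l < "$out")
echo "lines after appending: $lines_after_append"

# Using > again discards the previous contents.
echo "fresh start" > "$out"
lines_after_overwrite=$(wc -l < "$out")
echo "lines after overwriting: $lines_after_overwrite"
cat "$out"
```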
  
 ==== Understanding the structure of the commands ==== ==== Understanding the structure of the commands ====
Line 62: Line 74:
  
 Commands: ''man'' Commands: ''man''
 +
 +==== Exercises ====
 +
 +**EXERCISE 1**. Let us "measure" a file: bytes, megabytes, lines, words, etc.
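
One possible starting point for this exercise, not necessarily the approach followed in the tutorial: ''wc'' reports bytes, lines and words, and the byte count can be converted to megabytes with a small ''awk'' calculation. The sample file below is made up; replace it with one of the corpus files.

```shell
# Hypothetical two-sentence sample standing in for a corpus file.
workdir=$(mktemp -d)
corpus="$workdir/sample.en"
printf 'resumption of the session\nthank you very much\n' > "$corpus"

# wc -c counts bytes, wc -l counts lines, wc -w counts words.
bytes=$(wc -c < "$corpus")
lines=$(wc -l < "$corpus")
words=$(wc -w < "$corpus")

# Convert bytes to megabytes (1 MB taken here as 1024 * 1024 bytes).
megabytes=$(awk -v b="$bytes" 'BEGIN { printf "%.6f", b / 1048576 }')

echo "bytes: $bytes"
echo "megabytes: $megabytes"
echo "lines: $lines"
echo "words: $words"
```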
 +
 +**EXERCISE 2**. Shuffle a parallel corpus in order to have sentences from different speeches. 
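
A sketch of one way to approach this, under two assumptions: the two sides are line-aligned plain-text files (the names ''en.txt'' and ''it.txt'' are hypothetical) and no sentence contains a tab character. Shuffling each file separately would break the alignment, so the sides are first glued together line by line, shuffled as pairs, and then split again.

```shell
# Hypothetical aligned sample: line N of en.txt translates line N of it.txt.
workdir=$(mktemp -d)
printf 'en 1\nen 2\nen 3\nen 4\n' > "$workdir/en.txt"
printf 'it 1\nit 2\nit 3\nit 4\n' > "$workdir/it.txt"

# paste joins the files line by line with a tab; shuf then shuffles the pairs.
paste "$workdir/en.txt" "$workdir/it.txt" | shuf > "$workdir/shuffled.tsv"

# cut splits the shuffled pairs back into one file per language.
cut -f1 "$workdir/shuffled.tsv" > "$workdir/en.shuf"
cut -f2 "$workdir/shuffled.tsv" > "$workdir/it.shuf"

# Show the result side by side to check that the pairs still match.
paste "$workdir/en.shuf" "$workdir/it.shuf"
```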
 +
 +**EXERCISE 3**. Find the most frequent tokens in the two parts of a parallel corpus and analyse them.
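
A common pipeline for a frequency list, shown here as a sketch on a made-up sample: put one token per line, sort so that identical tokens are adjacent, count them with ''uniq -c'', and sort by count. Tokens are assumed to be separated by whitespace; the same pipeline would be run once on each side of the corpus.

```shell
# Hypothetical sample text standing in for one side of the corpus.
workdir=$(mktemp -d)
printf 'the cat and the dog\nthe bird\n' > "$workdir/sample.en"

# tr puts each token on its own line, sort groups identical tokens,
# uniq -c counts each group, sort -rn orders by count (highest first).
top=$(tr -s '[:space:]' '\n' < "$workdir/sample.en" | sort | uniq -c | sort -rn | head -n 5)
echo "$top"

# Extract the most frequent token and its count from the first line.
top_token=$(echo "$top" | head -n 1 | awk '{ print $2 }')
top_count=$(echo "$top" | head -n 1 | awk '{ print $1 }')
echo "most frequent token: $top_token ($top_count occurrences)"
```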
 +
 +**EXERCISE 4**. Get all words that are cognates with respect to Italian from a TSV dictionary. Afterwards, count the number of tokens that belong to each family. 
 +
  • tutorials/b4b.1572514247.txt.gz
  • Last modified: 2019/10/31 09:30
  • by albarron