Bash for Beginners

This quick-and-dirty tutorial is an introduction to bash. The tutorial will be given in English.

Location and time

The tutorial was held in Room 4 of the PhD Lab on Wednesday 6 November from 9.15 to 10.45.

Requirements

To follow the tutorial, you will need a laptop. Depending on your operating system, you will need one of the following:

Resources

We'll use a small subset of the English-Italian part of the Europarl parallel corpus.

Download the two files here: English and Italian

Why is bash relevant?

Hands on Bash

Find yourself at home

You will first learn how to set up a remote connection to the machine.

Afterwards, you will see what it means to “live” in a multi-user setting. You will learn how to list files and directories, as well as how to move around.

Commands: ssh, ls, pwd, cd, mkdir
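
As a rough illustration of these commands, here is a minimal sketch. The user name and host in the ssh line are placeholders (the real ones are not given here), so that line is left as a comment. The directory names tutorial and data are made up for the example, and everything happens inside a scratch directory.

```shell
# Connecting to the remote machine would look roughly like this
# (your_user and remote.example.org are placeholders, not the real account or host):
#   ssh your_user@remote.example.org

workdir=$(mktemp -d)      # scratch directory, so nothing in your home is touched
cd "$workdir"
pwd                       # print the directory you are currently in

mkdir -p tutorial/data    # create a directory and a sub-directory in one go
ls                        # list the contents of the current directory

cd tutorial/data          # move down into the new sub-directory
pwd
cd ..                     # move one level up, back into tutorial
ls -l                     # long listing: permissions, owner, size, date
```

In a multi-user setting the owner column printed by ls -l is what tells you which files are yours and which belong to someone else.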

How to display and edit a file

Files can simply be displayed (without modifying them) or opened for editing. You will learn to do both.

Commands: cat, more, less, most, wc, nano, head, shuf
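
The following sketch shows the non-interactive commands on a tiny made-up file (sample.txt is a hypothetical name). The interactive viewers and the editor are only mentioned in comments, since they take over the terminal. shuf is assumed to be the GNU coreutils version.

```shell
cd "$(mktemp -d)"                     # work in a scratch directory
printf 'first line\nsecond line\nthird line\n' > sample.txt

cat sample.txt                        # print the whole file
head -n 2 sample.txt                  # print only the first two lines
wc -l sample.txt                      # count lines
wc -w < sample.txt                    # count words (reading from standard input)
shuf -n 1 sample.txt                  # print one line picked at random

# Interactive tools, to be tried by hand:
#   less sample.txt    (scroll with the arrow keys, quit with q)
#   nano sample.txt    (edit, save with Ctrl+O, exit with Ctrl+X)
```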

Grabbing information in a file from the command line

Until now, the operations you have performed are quite basic and not too different from what you can do with standard tools. Now we start to do interesting stuff. In this section you will learn how to sort, filter, modify, and combine the information in a file.

Commands: sort, grep, sed, column
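
To give an idea of what each command does, here is a small sketch on an invented two-column file (fruit.txt and its contents are made up for the example). None of these commands changes the file itself; they only print a transformed copy.

```shell
cd "$(mktemp -d)"                     # work in a scratch directory
printf 'pear 3\napple 10\nfig 7\n' > fruit.txt

sort fruit.txt                        # sort lines alphabetically
sort -k2,2n fruit.txt                 # sort numerically on the second field
grep 'p' fruit.txt                    # keep only the lines containing the letter p
sed 's/apple/orange/' fruit.txt       # replace apple with orange in the output
column -t fruit.txt                   # align the fields into neat columns
```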

Storing the results

All the operations carried out so far show their result in the terminal, but they neither alter the contents of the file nor store the result anywhere. Now we learn how to store the results.

Commands: >, >>
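
A minimal sketch of the two redirection operators, using invented file names (letters.txt and sorted.txt): > creates the target file or overwrites it, while >> appends to whatever is already there.

```shell
cd "$(mktemp -d)"                     # work in a scratch directory
printf 'b\na\n' > letters.txt         # > creates (or overwrites) letters.txt

sort letters.txt > sorted.txt         # store the sorted output in a new file
echo 'c' >> sorted.txt                # >> appends a line at the end
cat sorted.txt                        # the file now holds three lines
```

Note that the input file is left untouched; only the file named after the operator is written.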

Understanding the structure of the commands

We have played with quite a few commands already. Let us understand how commands are usually structured.
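
As a sketch of the usual pattern, most commands follow the shape "command, then options, then arguments". The example below uses sort on a made-up file (numbers.txt) to show the same command with no option, with separate options, and with bundled options.

```shell
cd "$(mktemp -d)"                     # work in a scratch directory
printf '3\n1\n2\n' > numbers.txt

# General shape:  command [options] [arguments]
sort numbers.txt                      # command + argument (the file to read)
sort -n numbers.txt                   # one option: compare as numbers
sort -n -r numbers.txt                # two options: numeric, reversed
sort -nr numbers.txt                  # the same two options written together
```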

Piping to combine commands

Let's start making things interesting: all these commands can be chained one after the other at no extra cost, with the output of one command becoming the input of the next. These are the so-called one-liners.

Commands: awk, |
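
Here is a small sketch of a one-liner on an invented file (text.txt). It also uses tr and uniq, which are not in the list above but are standard tools. The first pipeline counts how often each word occurs; the second uses awk to pick out the second word of every line before counting.

```shell
cd "$(mktemp -d)"                     # work in a scratch directory
printf 'the cat\nthe dog\na cat\n' > text.txt

# One word per line, then sort, count repeated lines, and sort by count.
tr ' ' '\n' < text.txt | sort | uniq -c | sort -k1,1nr

# awk splits each line into fields; $2 is the second one.
awk '{ print $2 }' text.txt | sort | uniq -c
```

The sort before uniq -c is needed because uniq only merges identical lines that are adjacent.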

How to find some help

Commands: man

Exercises

EXERCISE 1. Let us “measure” a file: bytes, megabytes, lines, words, etc.

EXERCISE 2. Shuffle a parallel corpus in order to have sentences from different speeches.

EXERCISE 3. Find the most frequent tokens in the two parts of a parallel corpus and analyse them.

EXERCISE 4. Get all words that are cognates with respect to Italian from a TSV dictionary. Afterwards, count the number of tokens that belong to each family.
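
As a hint for Exercise 2, here is one possible approach, sketched on two tiny invented files (toy.en and toy.it stand in for the real corpus files). The difficulty is that shuffling each side separately would break the alignment. The sketch glues the two sides together line by line, shuffles the pairs, and splits them again. It assumes that no sentence contains a tab character and that shuf is the GNU coreutils version.

```shell
cd "$(mktemp -d)"                     # work in a scratch directory
printf 'hello\ngood morning\nthank you\n' > toy.en
printf 'ciao\nbuongiorno\ngrazie\n' > toy.it

# Join line i of both files with a tab, then shuffle the joined lines.
paste toy.en toy.it | shuf > shuffled.tsv

# Split the shuffled pairs back into one file per language.
cut -f1 shuffled.tsv > shuffled.en
cut -f2 shuffled.tsv > shuffled.it

paste shuffled.en shuffled.it         # check that the pairs still match
```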