This short tutorial introduces a small number of Unix commands for manipulating big text files of data. It is not a substitute for a complete understanding of Unix or of programming in general, nor is it an exhaustive list of useful commands, but I hope that if you follow along, you'll learn enough simple file-editing skills to save yourself some time.
The first thing we'll want to do is open a terminal window. Go to the Utilities folder inside your Applications folder and open Terminal. A window appears with some text that should be similar to the following.
Last login: Tue Apr 20 09:43:13 on console
[rockfall:~] eliza%
The first line, "Last login: Tue Apr 20 09:43:13 on console", gives the date and time of the last time you logged into the system. On the second line, the word rockfall is the name of the machine that you are logged into. In this case, rockfall is my machine. The text after the : shows which directory I am in. The ~ means that I am currently in my home directory. So, [rockfall:~] means that I am in my home directory on my machine. Next, "eliza" is my userid, and the % marks the end of the prompt: the terminal is now waiting for your input.
An image of the window with the text referenced above is below.
Here are three commands to try:
Now let's take a text file and mess around with it using unix commands in the terminal window. Here is a link to a plain text file of ten days of aftershocks [1] following the 4 April 2010 Baja California earthquake. Put it in the new directory you called earth801/data1. Go to earth801/data1 and type ls to verify the file is there. Type more baja_neic.txt (in which baja_neic.txt is the actual name of the text file). The file should look like the screenshot below. If your terminal window is too small to show the whole file at once, you will get a black bar at the bottom that tells you what percentage of the file you are seeing. Hit the spacebar and you'll see another chunk of the file. Continue to hit the spacebar until you've seen the whole file and you are back at the terminal prompt. Alternatively, if you type cat baja_neic.txt the entire file will scroll by and leave you at the prompt when it's done.
The command head baja_neic.txt shows you exactly the first ten lines of the file. Try it. You can also modify the head command like this:
head -5 baja_neic.txt
The -5 tells head to show the first 5 lines. Ten lines is the default when head is not given a line count, so the following two commands are equivalent:
head baja_neic.txt
head -10 baja_neic.txt
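If you'd like to experiment with line counts before touching the real data, here is a minimal sketch. It assumes a made-up 12-line file called sample.txt (a stand-in for baja_neic.txt, not part of the tutorial data) built in a throwaway directory. It also uses the spelling head -n 5, which is the standard long form of head -5.

```shell
# Work in a throwaway directory so nothing in earth801/data1 is touched.
workdir=$(mktemp -d)
cd "$workdir"

# sample.txt is a hypothetical stand-in for baja_neic.txt: the numbers 1 to 12,
# one per line. (The > used to create it is explained later in the tutorial.)
seq 1 12 > sample.txt

head -n 5 sample.txt     # the first 5 lines: 1 through 5
head sample.txt          # no count given, so the default of 10 lines
head -n 10 sample.txt    # explicit count of 10, same output as the line above
```

Because the file holds 12 lines, the last two commands both stop at 10 and leave out 11 and 12.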
The command tail is similar to head but works on the end of the file instead of the beginning. The commands more, cat, head, and tail return their output to the screen by default but you can also have them create a new file and put their results in it instead. The way to do this is to redirect the output with the > symbol.
For example, do this:
head -5 baja_neic.txt > newfile
and you will create a new text file called "newfile" which contains exactly the first five lines of the original file baja_neic.txt. It is important to note here that performing this command has not changed the original file in any way. You can type ls to verify that you now have two files in your data1 directory. One of them is the original baja_neic.txt and the other one is called newfile and it is a copy of the first five lines of baja_neic.txt. Use the more command to look at your newfile file. Did you get what you were expecting? When the "head" command counts lines of a file, blank lines are counted just like lines that have text characters in them, so that's why newfile looks the way it does. At this point, if you have been following along, the following three commands should give you identical output:
head -5 baja_neic.txt
more newfile
cat newfile
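To convince yourself that redirecting really leaves the original alone, and that tail can be redirected the same way as head, you could try something like the sketch below. The file names sample.txt, first5, and last3 are made up for illustration, and wc -l (which counts lines) and cmp (which compares two inputs) are extra commands not covered in this tutorial.

```shell
# Work in a throwaway directory with a hypothetical 12-line stand-in file.
workdir=$(mktemp -d)
cd "$workdir"
seq 1 12 > sample.txt

wc -l sample.txt                  # 12 lines before redirecting anything
head -n 5 sample.txt > first5     # > creates first5, or replaces it if it exists
wc -l sample.txt                  # still 12 lines: the original is unchanged

# cmp prints nothing and succeeds when its two inputs are identical;
# the - tells cmp to read the output of head from the pipe.
head -n 5 sample.txt | cmp - first5 && echo "head output matches first5"

tail -n 3 sample.txt > last3      # the last 3 lines, saved to a new file
cat last3                         # shows 10, 11, 12
```

One thing to watch for: > does not ask before replacing an existing file, so double-check the name on the right-hand side.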
Okay, on to the next command of interest. The cp command copies one file to another but instead of using > you just specify the other filename. So, these two commands are equivalent ways of copying the entire file baja_neic.txt to a new file called baja_neic_copy.txt:
cp baja_neic.txt baja_neic_copy.txt
cat baja_neic.txt > baja_neic_copy.txt
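If you want to check for yourself that the two approaches produce the same contents, a small sketch along these lines should do it. The names sample.txt, copy_a.txt, and copy_b.txt are hypothetical, and cmp is an extra command (not covered in this tutorial) that compares two files byte by byte.

```shell
# Work in a throwaway directory with a hypothetical 12-line stand-in file.
workdir=$(mktemp -d)
cd "$workdir"
seq 1 12 > sample.txt

cp sample.txt copy_a.txt          # copy using cp
cat sample.txt > copy_b.txt       # copy using cat plus redirection

# cmp stays silent and succeeds only if the two files are identical.
cmp copy_a.txt copy_b.txt && echo "the two copies are identical"
cmp sample.txt copy_a.txt && echo "and both match the original"
```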
If you want to rename a file without changing its contents, use mv. Like cp, mv requires two filenames, the previous one and the new one.
mv newfile baja_neic_five.txt
The above command renames the file "newfile" to "baja_neic_five.txt". You can also use "mv" to change the location of a file. Try typing
mv baja_neic_copy.txt ../data2/baja.txt
This command takes the file "baja_neic_copy.txt" and moves it from the folder data1 to the folder data2 and renames it baja.txt. You can go to data2 (remember how?) and verify there is now a file in there called baja.txt and that it is a duplicate of baja_neic.txt.
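Here is a minimal sketch of both uses of mv, renaming in place and moving to another directory, that you can run without touching your real files. The directory layout and file names below are made up to mirror data1 and data2, and mkdir is used only to build the practice directories.

```shell
# Build a hypothetical practice area that mirrors the data1/data2 layout.
workdir=$(mktemp -d)
cd "$workdir"
mkdir data1 data2
cd data1
seq 1 5 > newfile                 # a small stand-in file
cp newfile copy.txt               # a second file to move later

mv newfile five.txt               # rename: same directory, new name
mv copy.txt ../data2/moved.txt    # move to data2 and rename in one step

ls                                # data1 now holds only five.txt
ls ../data2                       # data2 now holds moved.txt
```

Note that after mv the old name is gone: unlike cp, mv does not leave a copy behind.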
Another cool use of the cat command is to stick two or more files together and make one file. So,
cat baja_neic.txt baja_neic_five.txt > baja2.txt
will make a file called baja2.txt, which is a copy of baja_neic.txt with a copy of baja_neic_five.txt (the file you created as "newfile" and renamed above) stuck on the end.
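A quick way to see that the pieces really were stuck together in order is to count lines, as in the sketch below. The file names whole.txt, part.txt, and joined.txt are hypothetical, and wc -l is an extra command (not covered in this tutorial) that counts the lines in each file.

```shell
# Work in a throwaway directory with hypothetical stand-in files.
workdir=$(mktemp -d)
cd "$workdir"
seq 1 12 > whole.txt                  # 12 lines
head -n 5 whole.txt > part.txt        # its first 5 lines

cat whole.txt part.txt > joined.txt   # whole.txt first, then part.txt

wc -l whole.txt part.txt joined.txt   # 12, 5, and 17 lines
tail -n 5 joined.txt                  # the end of joined.txt is part.txt
```

The output file should have a different name from every input file. The shell empties the file named after > before cat starts reading, so reusing an input name there would destroy that input.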