14. How can I work on a remote server?#

Today we will connect to a remote server and learn new bash commands for working with the content of files.

14.1. What are remote servers and HPC systems?#

diagram illustrating a remote connection to a login node and compute cluster

14.2. Connecting to Seawulf#

We connect with secure shell (ssh) from our terminal (GitBash or PuTTY on Windows) to Seawulf, URI’s teaching High Performance Computing (HPC) cluster.

Your login is the part of your URI e-mail address before the @.

ssh -l brownsarahm seawulf.uri.edu
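The login rule above can be sketched in the shell itself. This is only an illustration: the address yourname@uri.edu is a hypothetical placeholder, and the commands are printed rather than run so that the example works without a network connection.

```shell
# Hypothetical e-mail address; replace it with your own
email="yourname@uri.edu"

# ${email%%@*} removes the longest suffix matching "@*",
# leaving only the text before the @
username="${email%%@*}"

# Print the command instead of running it, so this works offline
echo "ssh -l $username seawulf.uri.edu"

# ssh also accepts the user@host form, which is equivalent to -l
echo "ssh $username@seawulf.uri.edu"
```

Either form should connect you as the same user; which one to type is a matter of preference.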

When you log in for the first time, it looks like this and requires you to change your password, because accounts are created with a default password that is already expired. Please note that the command ssh -l uses a lowercase “L”, not the number 1!

Note

This block is a little unusual because it comes from an interactive terminal session. I have rendered it all as output, but broken it into separate chunks to add explanation.

The authenticity of host 'seawulf.uri.edu (131.128.217.210)' can't be established.
ECDSA key fingerprint is SHA256:RwhTUyjWLqwohXiRw+tYlTiJEbqX2n/drCpkIwQVCro.
Are you sure you want to continue connecting (yes/no/[fingerprint])? y
Please type 'yes', 'no' or the fingerprint: yes

Follow the instruction to type yes (typing just y is not accepted).

I will tell you how to find your default password if you missed class (I do not want to post it publicly). Comment on your experience report PR to ask for this information.

Warning: Permanently added 'seawulf.uri.edu,131.128.217.210' (ECDSA) to the list of known hosts.
brownsarahm@seawulf.uri.edu's password:

It does not show characters when you type your password, but it works when you press enter.

Then it requires you to change your password:

You are required to change your password immediately (root enforced)
WARNING: Your password has expired.
You must change your password now and login again!

To change it, it asks for your current (default) password first.

Important

You use the default password when prompted for your username’s password, and then again when it asks for the (current) UNIX password:. Then you must type the same new password twice.

Choose a new password you will remember; we will come back to this server.

Changing password for user brownsarahm.
Changing password for brownsarahm.
(current) UNIX password:

Then it asks for the new one twice:

New password:
Retype new password:
passwd: all authentication tokens updated successfully.
Connection to seawulf.uri.edu closed.

After you give it a new password, it logs you out and you have to log back in.

ssh -l brownsarahm seawulf.uri.edu

When you log in it shows you information about your recent logins.

brownsarahm@seawulf.uri.edu's password: 
Last failed login: Thu Oct 24 12:54:33 EDT 2024 from 172.20.105.68 on ssh:notty
There were 2 failed login attempts since the last successful login.
Last login: Thu Oct 24 12:48:08 2024 from 172.20.105.68

We will first make a new folder

mkdir example

and go into it

cd example/
ls

Notice that the prompt says uriusername@seawulf to indicate that you are logged into the server, not working locally.

14.3. Downloading files#

wget allows you to get files from the web.

wget http://www.hpc-carpentry.org/hpc-shell/files/bash-lesson.tar.gz

Note that this is a reasonably sized download and it finished very quickly. This is because the download happened on the remote server not your laptop. The server has a high quality hard-wired connection to the internet that is very fast, unlike the wifi in our classroom.

This is an advantage of using a remote system. If your connection is slow, but stable enough to connect, you can do the work on a different computer that has better connection.

Now we see we have the file.

--2024-10-24 12:57:50--  http://www.hpc-carpentry.org/hpc-shell/files/bash-lesson.tar.gz
Resolving www.hpc-carpentry.org (www.hpc-carpentry.org)... 172.67.146.136, 104.21.33.152
Connecting to www.hpc-carpentry.org (www.hpc-carpentry.org)|172.67.146.136|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12534006 (12M) [application/gzip]
Saving to: ‘bash-lesson.tar.gz’

100%[====================>] 12,534,006  --.-K/s   in 0.1s    

2024-10-24 12:57:51 (95.2 MB/s) - ‘bash-lesson.tar.gz’ saved [12534006/12534006]
ls
bash-lesson.tar.gz

14.4. Unzipping a file on the command line#

This file is compressed.

We can use man tar to see the manual (aka man file) of the tar program to learn how it works. You can also read man files online from GNU, where you can choose your format; this page shows the full version.

tar -xvf bash-lesson.tar.gz 

This command uses the tar program and:

  • v makes it verbose (I have cut this output here)

  • x makes it extract

  • f option accepts the file name to work on

We can see what it did with ls

dmel-all-r6.19.gtf
dmel_unique_protein_isoforms_fb_2016_01.tsv
gene_association.fb
SRR307023_1.fastq
SRR307023_2.fastq
SRR307024_1.fastq
SRR307024_2.fastq
SRR307025_1.fastq
SRR307025_2.fastq
SRR307026_1.fastq
SRR307026_2.fastq
SRR307027_1.fastq
SRR307027_2.fastq
SRR307028_1.fastq
SRR307028_2.fastq
SRR307029_1.fastq
SRR307029_2.fastq
SRR307030_1.fastq
SRR307030_2.fastq

Note: To extract files to a different directory, use the --directory option, for example tar -xvf bash-lesson.tar.gz --directory path/to/directory
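To practice the tar options above without downloading anything, here is a small sketch that builds its own archive first. The file and folder names (data, a.txt, practice.tar.gz, extracted) are made up for illustration, and -C is used as the short form of --directory.

```shell
# Work in a throwaway directory so nothing real is touched
workdir=$(mktemp -d)
cd "$workdir"

# Build a tiny compressed archive to practice on (hypothetical file names)
mkdir data
echo "first file" > data/a.txt
echo "second file" > data/b.txt
tar -czf practice.tar.gz data

# t lists the contents without extracting anything
tar -tzf practice.tar.gz

# x extracts; -C (short for --directory) chooses where the files go
mkdir extracted
tar -xzf practice.tar.gz -C extracted
ls extracted/data
```

Listing with t before extracting is a cheap way to check what an archive will create in your folder.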

14.5. Working with large files#

We can use a new option on ls to see the size of the objects: -l shows that information, and --block-size=M changes the unit of the size to MB.

ls -l --blocksize=M
ls: unrecognized option '--blocksize=M'
Try 'ls --help' for more information.

At first I had a typo; note that the error message gave me a tip.

ls -l --block-size=M
total 136M
-rw-r--r--. 1 brownsarahm spring2022-csc392 12M Apr 18  2021 bash-lesson.tar.gz
-rw-r--r--. 1 brownsarahm spring2022-csc392 74M Jan 16  2018 dmel-all-r6.19.gtf
-rw-r--r--. 1 brownsarahm spring2022-csc392  1M Jan 25  2016 dmel_unique_protein_isoforms_fb_2016_01.tsv
-rw-r--r--. 1 brownsarahm spring2022-csc392 24M Jan 25  2016 gene_association.fb
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307023_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307023_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307024_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307024_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307025_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307025_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307026_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307026_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307027_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307027_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307028_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307028_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307029_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307029_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307030_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307030_2.fastq
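Notice that sizes in whole megabytes are rounded up, which is why many different files all show as 2M. The sketch below illustrates this on a made-up file (sample.bin is a hypothetical name), assuming GNU ls as on Seawulf; the --block-size option may not exist in other versions of ls.

```shell
workdir=$(mktemp -d)

# Create a hypothetical file of exactly 1,500,000 bytes (about 1.43 MiB)
head -c 1500000 /dev/zero > "$workdir/sample.bin"

# Exact size in bytes (fifth column)
ls -l "$workdir/sample.bin"

# Whole-megabyte units: the size column is rounded up
ls -l --block-size=M "$workdir/sample.bin"

# -h picks a readable unit for each file automatically
ls -lh "$workdir/sample.bin"

# Keep just the size column from the megabyte listing
size_m=$(ls -l --block-size=M "$workdir/sample.bin" | awk '{print $5}')
echo "$size_m"
```

When the files in a folder vary a lot in size, -h is often easier to read than forcing a single unit.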

We see some of the files are very big.

Let's look at one of the smaller ones:

cat SRR30702
SRR307023_1.fastq  SRR307025_1.fastq  SRR307027_1.fastq  SRR307029_1.fastq
SRR307023_2.fastq  SRR307025_2.fastq  SRR307027_2.fastq  SRR307029_2.fastq
SRR307024_1.fastq  SRR307026_1.fastq  SRR307028_1.fastq  
SRR307024_2.fastq  SRR307026_2.fastq  SRR307028_2.fastq  
cat SRR307023_1.fastq 
@SRR307023.37289418.1 GA-C_0019:1:120:7087:20418 length=101
CGAGCGACTTTTGTATAACTATATTTTTCTCGTTCTTGGCTCCGACATCTATACAAATTCAGAAGGCAGTTTTGCGCGTGGAGGGACAATTACAAATTGAG
GGGGCACCCACATTATACANAACCANNNANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR307023.37294416.1 GA-C_0019:1:120:17598:20714 length=101
IDIG?IHBGIIDIHIDDDD#@=@##############################################################################
@SRR307023.37294417.1 GA-C_0019:1:120:18444:20714 length=101
GGGGCAGTGCTAAGGTACTNGAAAGNNNCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR307023.37294417.1 GA-C_0019:1:120:18444:20714 length=101
IIIGIIIHIHHIIIIEFFF#?################################################################################

We see that this actually takes a long time to output and is way too much information to actually read. In fact, in order to make the website work, I had to cut that content using command line tools; my text editor couldn’t open the file and GitHub was unhappy when I pushed it.

Note

To truncate the output above in my saved terminal output, I used:

grep -n cat 2024-10-24.md

to get the line number of the cat command, and then

head -n 156 2024-10-24.md > today.md

to keep the part above the cat, and then

grep -n head 2024-10-24.md

to find the next command, head, and then

tail -n +20150 2024-10-24.md > tmp.md

to keep the lines from line 20150 onward in a temp file. I repeated this to find the rest of the lines to cut, taking the head off and saving each piece.
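The same cut-out-the-middle idea can be tried on a small generated file. This is a sketch under assumptions: session.md, trimmed.md, and the START and END marker lines are all hypothetical stand-ins for the real saved output.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Build a hypothetical log: a short intro, a long noisy middle, and an ending
{
  echo "intro line"
  echo "START of long output"
  seq 1 1000
  echo "END of long output"
  echo "closing line"
} > session.md

# grep -n prefixes each match with its line number; cut keeps just the number
start=$(grep -n "START" session.md | cut -d: -f1)
end=$(grep -n "END" session.md | cut -d: -f1)

# Keep everything up to and including the START line...
head -n "$start" session.md > trimmed.md
# ...and everything from the END line to the bottom of the file
tail -n +"$end" session.md >> trimmed.md

cat trimmed.md
```

Storing the line numbers in variables avoids retyping them by hand, which is where mistakes tend to creep in.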

head dmel-all-r6.19.gtf 

X	FlyBase	gene	19961297	19969323	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3";
X	FlyBase	mRNA	19961689	19968479	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	5UTR	19961689	19961845	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	exon	19961689	19961845	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	exon	19963955	19964071	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	exon	19964782	19964944	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	exon	19965006	19965126	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	exon	19965197	19965511	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	exon	19965577	19966071	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	exon	19966183	19967012	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
head -n 5 dmel-all-r6.19.gtf 
X	FlyBase	gene	19961297	19969323	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3";
X	FlyBase	mRNA	19961689	19968479	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	5UTR	19961689	19961845	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	exon	19961689	19961845	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	exon	19963955	19964071	.	+	.	gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
ls -l --block-size=M
total 136M
-rw-r--r--. 1 brownsarahm spring2022-csc392 12M Apr 18  2021 bash-lesson.tar.gz
-rw-r--r--. 1 brownsarahm spring2022-csc392 74M Jan 16  2018 dmel-all-r6.19.gtf
-rw-r--r--. 1 brownsarahm spring2022-csc392  1M Jan 25  2016 dmel_unique_protein_isoforms_fb_2016_01.tsv
-rw-r--r--. 1 brownsarahm spring2022-csc392 24M Jan 25  2016 gene_association.fb
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307023_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307023_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307024_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307024_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307025_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307025_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307026_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307026_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307027_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307027_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307028_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307028_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307029_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307029_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307030_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  2M Jan 25  2016 SRR307030_2.fastq

and we can use tail to look at the end

tail -n 2 dmel-all-r6.19.gtf 
2L	FlyBase	stop_codon	782822	782824	.	+	0	gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
2L	FlyBase	3UTR	782825	782885	.	+	.	gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";

For a file like this, we don’t really want to read the whole file, but we do need to know how it is structured in order to design programs to work with it.

We can also see how much content is in the file. wc gives a word count, and its -l option counts lines instead:
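head and tail can also be combined to peek at a slice from the middle of a file. A minimal sketch, using a generated file (numbers.txt is a hypothetical name) in which every line holds its own line number; the | sends the output of the first command into the second.

```shell
workdir=$(mktemp -d)

# Hypothetical 100-line file where each line is its own line number
seq 1 100 > "$workdir/numbers.txt"

# First 15 lines, then the last 5 of those: lines 11 through 15
middle=$(head -n 15 "$workdir/numbers.txt" | tail -n 5)
echo "$middle"
```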

wc -l dmel-all-r6.19.gtf 
542048 dmel-all-r6.19.gtf
ls
bash-lesson.tar.gz                           SRR307024_2.fastq  SRR307028_1.fastq
dmel-all-r6.19.gtf                           SRR307025_1.fastq  SRR307028_2.fastq
dmel_unique_protein_isoforms_fb_2016_01.tsv  SRR307025_2.fastq  SRR307029_1.fastq
gene_association.fb                          SRR307026_1.fastq  SRR307029_2.fastq
SRR307023_1.fastq                            SRR307026_2.fastq  SRR307030_1.fastq
SRR307023_2.fastq                            SRR307027_1.fastq  SRR307030_2.fastq
SRR307024_1.fastq                            SRR307027_2.fastq

We can use wc with patterns

wc -l *.fastq
   20000 SRR307023_1.fastq
   20000 SRR307023_2.fastq
   20000 SRR307024_1.fastq
   20000 SRR307024_2.fastq
   20000 SRR307025_1.fastq
   20000 SRR307025_2.fastq
   20000 SRR307026_1.fastq
   20000 SRR307026_2.fastq
   20000 SRR307027_1.fastq
   20000 SRR307027_2.fastq
   20000 SRR307028_1.fastq
   20000 SRR307028_2.fastq
   20000 SRR307029_1.fastq
   20000 SRR307029_2.fastq
   20000 SRR307030_1.fastq
   20000 SRR307030_2.fastq
  320000 total

and it gives us each result, plus the total

We can save these:

wc -l *.fastq > linecounts.txt
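To see what the redirect actually stores, here is a small sketch with made-up files (a.dat, b.dat, c.dat are hypothetical). It also contrasts > with >>, which adds to a file instead of replacing it.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical data files with 1, 2, and 3 lines
seq 1 1 > a.dat
seq 1 2 > b.dat
seq 1 3 > c.dat

# > creates the file, or overwrites it if it already exists
wc -l *.dat > linecounts.txt
cat linecounts.txt

# >> appends instead, so earlier results are kept
wc -l a.dat >> linecounts.txt
cat linecounts.txt
```

Note that the output file here does not match the *.dat pattern; if it did, wc would try to count the file it is writing to.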

14.5.1. Searching file Contents#

14.5.2. How many times does mRNA appear in the file dmel-all-r6.19.gtf ?#

We can answer this two ways:

grep has a count option, -c:

grep -c mRNA dmel-all-r6.19.gtf 
34025

or we can use grep and wc together with a pipe:

grep mRNA dmel-all-r6.19.gtf | wc -l 
34025
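Both approaches above count matching lines, not individual occurrences. If a pattern could appear more than once on a line, the -o option of grep prints each match separately so that every occurrence is counted. A minimal sketch on a made-up file (sample.txt is hypothetical):

```shell
workdir=$(mktemp -d)

# Hypothetical file: "mRNA" appears twice on the first line, once on the third
printf 'mRNA and mRNA\nexon\nmRNA\n' > "$workdir/sample.txt"

# -c counts matching lines
line_count=$(grep -c mRNA "$workdir/sample.txt")

# -o prints each match on its own line, so wc -l counts every occurrence
match_count=$(grep -o mRNA "$workdir/sample.txt" | wc -l)

echo "lines: $line_count, occurrences: $match_count"
```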

14.6. Scripts and Permissions#

We will set up a very simple script:

echo "echo 'script works'" > demo.sh

so the file looks like:

echo 'script works'

If we want to run the file, we can use bash directly,

bash demo.sh 
script works

This works as expected, but it is limited relative to calling our script in other ways.

Another option is to run the script directly using ./

./demo.sh
-bash: ./demo.sh: Permission denied

Files have three types of permissions (read, write, and execute), set separately for the different users that can access them. By default, a new file is not executable. To view the permissions, we can use the -l option of ls.

ls -l
total 138452
-rw-r--r--. 1 brownsarahm spring2022-csc392 12534006 Apr 18  2021 bash-lesson.tar.gz
-rw-r--r--. 1 brownsarahm spring2022-csc392       20 Oct 24 13:18 demo.sh
-rw-r--r--. 1 brownsarahm spring2022-csc392 77426528 Jan 16  2018 dmel-all-r6.19.gtf
-rw-r--r--. 1 brownsarahm spring2022-csc392   721242 Jan 25  2016 dmel_unique_protein_isoforms_fb_2016_01.tsv
-rw-r--r--. 1 brownsarahm spring2022-csc392 25056938 Jan 25  2016 gene_association.fb
-rw-r--r--. 1 brownsarahm spring2022-csc392      447 Oct 24 13:15 linecounts.txt
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625262 Jan 25  2016 SRR307023_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625262 Jan 25  2016 SRR307023_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625376 Jan 25  2016 SRR307024_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625376 Jan 25  2016 SRR307024_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625286 Jan 25  2016 SRR307025_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625286 Jan 25  2016 SRR307025_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625302 Jan 25  2016 SRR307026_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625302 Jan 25  2016 SRR307026_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625312 Jan 25  2016 SRR307027_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625312 Jan 25  2016 SRR307027_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625338 Jan 25  2016 SRR307028_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625338 Jan 25  2016 SRR307028_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625390 Jan 25  2016 SRR307029_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625390 Jan 25  2016 SRR307029_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625318 Jan 25  2016 SRR307030_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625318 Jan 25  2016 SRR307030_2.fastq

For each file we get 10 characters in the first column that describe the permissions. The third column is the username of the owner, the fourth is the group, then come the size, the date last modified, and the file name.

We are most interested in the 10 character permissions. The first character indicates whether the entry is a directory (d) or a file (-). We have no directories here, so every line starts with -; a directory's line would start with a d instead.

The next nine characters indicate permission to Read, Write, and eXecute a file. Each position shows either the letter or a - for a permission not granted. They appear in three groups of three characters: one group each for the owner, the group, and anyone else with access.
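To make the layout of the permission string concrete, the sketch below creates a file and a folder (notes.txt and folder are hypothetical names), sets known permissions, and splits the string into its four parts.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical file and folder to inspect
touch notes.txt
mkdir folder

# Set known permissions: owner read+write, group read, others read
chmod 644 notes.txt

# -d lists the directory itself rather than its contents
ls -l notes.txt
ls -ld folder

# cut -c selects characters by position in each line
perms=$(ls -l notes.txt | cut -c1-10)
echo "type:   $(echo "$perms" | cut -c1)"
echo "owner:  $(echo "$perms" | cut -c2-4)"
echo "group:  $(echo "$perms" | cut -c5-7)"
echo "others: $(echo "$perms" | cut -c8-10)"
```

The line for folder starts with d, while the line for notes.txt starts with -.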

We can use grep to make the output shorter:

ls -l | grep demo.sh 
-rw-r--r--. 1 brownsarahm spring2022-csc392       20 Oct 24 13:18 demo.sh

To add execute permission, we can use chmod

chmod +x demo.sh 

now we look at the permissions again

ls -l | grep demo.sh 
-rwxr-xr-x. 1 brownsarahm spring2022-csc392       20 Oct 24 13:18 demo.sh

and can run the file

./demo.sh
script works
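chmod +x added execute permission for the owner, the group, and everyone else at once. chmod can also target one set of users by putting a letter before the +: u for the owner (user), g for the group, o for others, a for all. A small sketch on a fresh copy of the script, in a temporary folder so the real demo.sh is untouched:

```shell
workdir=$(mktemp -d)
cd "$workdir"

echo "echo 'script works'" > demo.sh
chmod 644 demo.sh          # start from -rw-r--r--

# u+x adds execute for the owner (user) only
chmod u+x demo.sh
after_user=$(ls -l demo.sh | cut -c1-10)
echo "$after_user"

# a+x adds execute for everyone: owner, group, and others
chmod a+x demo.sh
after_all=$(ls -l demo.sh | cut -c1-10)
echo "$after_all"
```

On a shared server, granting only what is needed (often just u+x) is a reasonable habit.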

Next we add more to the script:

nano demo.sh 

We set the script up to create a file that has the first 5 lines of each .fastq file:

echo 'script works'

for file in $(ls *.fastq)
do
echo $file >> fastq_head.txt
head -n 5 $file >> fastq_head.txt
done

and run it again:

./demo.sh
script works

it runs!

and then we can look at the file created

cat fastq_head.txt 
SRR307023_1.fastq
@SRR307023.37289418.1 GA-C_0019:1:120:7087:20418 length=101
CGAGCGACTTTTGTATAACTATATTTTTCTCGTTCTTGGCTCCGACATCTATACAAATTCAGAAGGCAGTTTTGCGCGTGGAGGGACAATTACAAATTGAG
+SRR307023.37289418.1 GA-C_0019:1:120:7087:20418 length=101
GGFGGFHHHHFEBH@G@EGEGHHEHGHHHHHHHHHBHDGHHGDGFHH?H:DFEDEGGGDGD:DBDA=@BBE@G?D?FD8FDGGDD<:BAB4;4;>CF@B3B
@SRR307023.37289419.1 GA-C_0019:1:120:7127:20414 length=101
SRR307023_2.fastq
@SRR307023.37289418.2 GA-C_0019:1:120:7087:20418 length=101
GGGTATGCGTAGAAACGAAAACAGAACAAGAATCTCAAAGTTCAGATTGTTCCTGACTTATTTATTTATTTTTGATTTTTCATAGTTCAAACCAACATACA
+SRR307023.37289418.2 GA-C_0019:1:120:7087:20418 length=101
FGG<FGGGDGIHIIDGEHIIHH@DGFIIGIHEFGG?EFFDEGGGG=FBF<G@@@<D@?GG@DDDGIFIE;7@@@-?D8@BG3EE<<AF-E?9>B8=B:@4G
@SRR307023.37289419.2 GA-C_0019:1:120:7127:20414 length=101
SRR307024_1.fastq
@SRR307024.42654158.1 GA-C_0019:2:120:12973:20432 length=101
GAGCAGGTAAACGCCAGAGAAAATTATGATACCAAAAATATGTGTGTGTGTGTTAAAGAAAGACCGTACAGTATATAGGATATAGGAGACGTGAGGTATAG
+SRR307024.42654158.1 GA-C_0019:2:120:12973:20432 length=101
IIIIIIIIIIIFHIIHIIIIIIIIGIIHHIIGIIHHIIIAIIGIHHHIFIEGGEDEEGDDDDGEEB<G@DD?ECBG>GB8?9A>=<?@?############
@SRR307024.42654159.1 GA-C_0019:2:120:13125:20421 length=101
SRR307024_2.fastq
@SRR307024.42654158.2 GA-C_0019:2:120:12973:20432 length=101
ATGTGGGGCAGCTGAGTTATAGCAGCGCCGATTTCATGGTTCGTAACTCTGGNGTGTAAAATAGTTTCATTCGCATATAGTTCTGTTTTGCTTGTATATAT
+SRR307024.42654158.2 GA-C_0019:2:120:12973:20432 length=101
IIIIIIIIIIIIIIIIIIIIIIIIIIIHIIHDIHIGIIIIIHHIIHHIIFEC#EAA=??>HH?FHIIGGDHIBDGG8D8ADDGEGBGECDDBEGEEEEBGI
@SRR307024.42654159.2 GA-C_0019:2:120:13125:20421 length=101
SRR307025_1.fastq
@SRR307025.41760981.1 GA-C_0019:3:120:2150:20421 length=101
ACCCAGAGCGTGACCGCGGTTACCATTTATTTTATGCTTCGAACAGTAAGTGGATATTAATAATACAGAATGACGGAGTCCGGAATAAAGAAACAATAACA
+SRR307025.41760981.1 GA-C_0019:3:120:2150:20421 length=101
EEEEE>EAEE5;@B@9=84<BBEE8??8??DDDA>=<D??:,:?=@</::/:-/,21'.1EE?BEDDD<>?##############################
@SRR307025.41760982.1 GA-C_0019:3:120:2572:20426 length=101
SRR307025_2.fastq
@SRR307025.41760981.2 GA-C_0019:3:120:2150:20421 length=101
AGGTGCCGATGGTGTAACGCAGGTAACAGTTGTTAACCTATATTCTTTTAATTTATTATTTGTAGAATCTCCCGCGAGTGGGGCAATGAGATCAAGTACAC
+SRR307025.41760981.2 GA-C_0019:3:120:2150:20421 length=101
GGGBGD=BB88+51<CBGEEC:A6>;;:1=9<<2?6243815355F;@?@DDG2G##############################################
@SRR307025.41760982.2 GA-C_0019:3:120:2572:20426 length=101
SRR307026_1.fastq
@SRR307026.44108070.1 GA-C_0019:4:120:15885:20406 length=101
ACCACACTTTTATGAACTCTTTAATAATGGTTTTGATAAAACCCGATTACCTTAATCATTATCCTTCGAAACGCCTTATCCTCTTGGTAAAATACACTTAA
+SRR307026.44108070.1 GA-C_0019:4:120:15885:20406 length=101
IIHHIIIIHIIEDFG@GGGGHHHHIGDCBEDGGGDBGGDDDDDBB2<<<>EFBEBGB>DBBBGB#####################################
@SRR307026.44108071.1 GA-C_0019:4:120:16150:20404 length=101
SRR307026_2.fastq
@SRR307026.44108070.2 GA-C_0019:4:120:15885:20406 length=101
ACCACACTTTTAAGATGCTTTAACTCTTTGTTTATTATGTGACGCGATTTTGAATTGGGAACGTTTTATATCACACCCTACCTATATGGATTTTATGTGTA
+SRR307026.44108070.2 GA-C_0019:4:120:15885:20406 length=101
GGG>GEGGGG@EGGADBGGEGGGG>GBGDCD>ADGADGGB8?>?CB+=A??;=3?C@A?@EEBBE1,=;;88<?=?;?8;<??@#################
@SRR307026.44108071.2 GA-C_0019:4:120:16150:20404 length=101
SRR307027_1.fastq
@SRR307027.37696234.1 GA-G_0028:1:120:17872:21167 length=101
GGAACTAGGTAAATCTTATTAAATTTTTTGTTGTTAATTGCAAACAATTCAATTTTTAAACGACCCCGAATATGTACTTTGATTTAATTGTTAGACAAATA
+SRR307027.37696234.1 GA-G_0028:1:120:17872:21167 length=101
DD?D:B?7C)BDDBDDDDDDBBB@DDDDD*==;DD7DBBBDD@DD:<<4;>?>B+:B4?06-9*;DB>5<;>>7,;;7B?<>>BDD8@D2<B*7<DD2D><
@SRR307027.37696235.1 GA-G_0028:1:120:18034:21172 length=101
SRR307027_2.fastq
@SRR307027.37696234.2 GA-G_0028:1:120:17872:21167 length=101
GGAACTAGGTAAATCTTATTAAATTTTTTGTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTAT
+SRR307027.37696234.2 GA-G_0028:1:120:17872:21167 length=101
HIBIIIIGIIFIIIFIGDIIIIIIIHIIIGFGIIIIHIIIIIDIIFIGIIIIIIAIIIAIIGIIIHIIIIIIIIIIIIIFIIIIIIHIDHIID########
@SRR307027.37696235.2 GA-G_0028:1:120:18034:21172 length=101
SRR307028_1.fastq
@SRR307028.32347990.1 GA-G_0028:2:120:15465:21120 length=101
GGTGGCTCGACTTAGATTAAGACACGTTCACTAAATTATTAAAGCTGACACTTATTCATGTACTTAGTCAATCGATTGCATAACAAATACGAGACCCATTT
+SRR307028.32347990.1 GA-G_0028:2:120:15465:21120 length=101
IIIIIIIIIIIIIIHDHIFHIIIIIIIIIIIIHIIIIIIIIIIIIHIDIIIIIIIIIIIIIIIGIIGHIHIIIBIIIIHIIIIGEHHFIIIHIIIEHHIDI
@SRR307028.32347991.1 GA-G_0028:2:120:15558:21125 length=101
SRR307028_2.fastq
@SRR307028.32347990.2 GA-G_0028:2:120:15465:21120 length=101
GAATTAAAAGTTAAAAGAAAGGAGTCCGAAACCAAAGCCAACAGGATTTGTTGGTCCCCAAATGGGTCTCGTATTTGTTATGCAATCGATTGACTAAGTAC
+SRR307028.32347990.2 GA-G_0028:2:120:15465:21120 length=101
IIIIIIIIIIIIIIIIIIIIIGFGIIIIIIIIIIIIIIIIIIIIIFIHIIHFHIHDIIIGHHGGIHEHIEHHEGFIGFFFGII>EIGII@IIF@EDDFCFF
@SRR307028.32347991.2 GA-G_0028:2:120:15558:21125 length=101
SRR307029_1.fastq
@SRR307029.43207317.1 GA-G_0028:3:120:15726:21171 length=101
GTGAGCGAGAGGGGGGCGAGCCACCTAGATAGAGAGGGGAGATACTACACGCGACTTTGACTTGGTTTTCGTTCGGTTCTGGTTCCTCCTCTATCAAAACT
+SRR307029.43207317.1 GA-G_0028:3:120:15726:21171 length=101
=>=2=@?B6CD:C=@<88+B;@@@@>4B4?5/-559:64+6+:=>F=B<:@=:=:FED<G@D@@GBGD@DEEEGE>G@DG3=?CA>GGGGGGG<G@@GGGG
@SRR307029.43207318.1 GA-G_0028:3:120:15858:21166 length=101
SRR307029_2.fastq
@SRR307029.43207317.2 GA-G_0028:3:120:15726:21171 length=101
GTTTGCTGGTTATTGCTATGGCTATATGGCTGTGAAGACAAAAGGCCCGGGACATGGCGTTTTGATAGAGGAGGAACCAGAACCGAACGCAAACCAAGTCA
+SRR307029.43207317.2 GA-G_0028:3:120:15726:21171 length=101
FAEEBBEBB=GGED>8=B7=<CCFBC8C8EE8=?AA=EEA173,5;AA2=B;?2?<<-255144829:<<;@22@2:5;6>>AAA################
@SRR307029.43207318.2 GA-G_0028:3:120:15858:21166 length=101
SRR307030_1.fastq
@SRR307030.42800974.1 GA-G_0028:4:120:4309:21177 length=101
TGAGGCCATACTCATATTAATTAGATTTATTGCGCTTAGAAAATAATAAATCAATGTAGACAGCTAAGTGATAGTTGAAGCTGGAACTTTGTTTTTTAACT
+SRR307030.42800974.1 GA-G_0028:4:120:4309:21177 length=101
IIFIIHIIGIHIIIIIIIIGIIIIIIIIHIIIIIIIIIIHIIIIHIIHIIIIIIIIIEIBHIIDGBDGGIGGDIGIIIGIIFHIGIDBIIIIIIIIIHFF@
@SRR307030.42800975.1 GA-G_0028:4:120:4580:21181 length=101
SRR307030_2.fastq
@SRR307030.42800974.2 GA-G_0028:4:120:4309:21177 length=101
GATAGTGATAGATAAAAATACATGTATACTTTTCGGCTGTCTAGGAAAATGGTATTTTAAATAAAAAAATCTATACAACTTATTTTCCGATACAAATTCAT
+SRR307030.42800974.2 GA-G_0028:4:120:4309:21177 length=101
IHIHIIIIIIIIIIIHGGIIIIHIIIIIIIIIIIIIIDIIIIIIIFHGHIIIEGIIIIIIFIIIIII@IIHIIIFGHGIIIFIIIIIIIIIIHIEHGIFHI
@SRR307030.42800975.2 GA-G_0028:4:120:4580:21181 length=101
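One thing to watch with the script as written: because it uses >>, running it a second time appends the same lines again. The variation below is a sketch, not the version from class. It empties the output file first, lets the shell expand *.fastq directly instead of going through ls, and quotes the variable. The sample_1 and sample_2 files are hypothetical stand-ins for the real data.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-ins for the .fastq files: 8 lines each
for name in sample_1 sample_2; do
  seq 1 8 | sed "s/^/$name line /" > "$name.fastq"
done

# Start from an empty output file so re-running does not append duplicates
: > fastq_head.txt

# Let the shell expand the pattern directly and quote the variable,
# so names containing spaces would still be handled as one file
for file in *.fastq
do
  echo "$file" >> fastq_head.txt
  head -n 5 "$file" >> fastq_head.txt
done

cat fastq_head.txt
```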

Editing the file does not impact the permissions:

ls -l | grep demo.sh 
-rwxr-xr-x. 1 brownsarahm spring2022-csc392      118 Oct 24 13:25 demo.sh
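
Next we request an interactive session on a compute node. On Seawulf the interactive command does this; the salloc messages show the job allocation and the node we were given (n005).

interactive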
salloc: Granted job allocation 27907
salloc: Waiting for resource configuration
salloc: Nodes n005 are ready for job
ls
bash-lesson.tar.gz                           SRR307023_2.fastq  SRR307027_2.fastq
demo.sh                                      SRR307024_1.fastq  SRR307028_1.fastq
dmel-all-r6.19.gtf                           SRR307024_2.fastq  SRR307028_2.fastq
dmel_unique_protein_isoforms_fb_2016_01.tsv  SRR307025_1.fastq  SRR307029_1.fastq
fastq_head.txt                               SRR307025_2.fastq  SRR307029_2.fastq
gene_association.fb                          SRR307026_1.fastq  SRR307030_1.fastq
linecounts.txt                               SRR307026_2.fastq  SRR307030_2.fastq
SRR307023_1.fastq                            SRR307027_1.fastq

lshw lists the hardware of the node we are on:

lshw

WARNING: you should run this program as super-user.
n005                        
    description: Computer
    width: 64 bits
    p00:04
          product: PnP device PNP0c02
          physical id: fd
          capabilities: pnp
          configuration: driver=system
WARNING: output may be incomplete or inaccurate, you should run this program as super-user.

We can go back to the login node:

exit 

logout
salloc: Relinquishing job allocation 27907

14.7. Closing a session#

We can use exit or logout to close the connection; it will also close if you lose internet.

exit
logout
Connection to seawulf.uri.edu closed.

14.8. Prepare for Next Class#

  1. install gcc locally

  2. ensure you can log into seawulf

14.9. Badges#

  1. Answer the following in hpc.md of your KWL repo: (to think about how the design of the system we used in class impacts programming and connect it to other ideas taught in CS)

    1. What kinds of things would your code need to do if you were going to run it on an HPC system? 
    2. What sbatch options seem the most helpful?
    3. How might you go about setting the time limits for a script? How could you estimate how long a script will take?
    

14.10. Experience Report Evidence#

14.11. Questions After Today’s Class#

Tip

You can learn more about working on a server for an explore badge, for example by connecting via ssh to AWS, Google Cloud, or similar, or by doing more work on Seawulf.

14.11.1. What is going on behind the scenes to get our GitBash terminal access to seawulf via our uri account?#

Your local terminal opens a secure shell (ssh) connection and then that serves as the interface to the shell on the remote server.

14.11.2. When we were working in the terminal, how were we not connected to the URI wifi? what wifi were we connected to? Why was it so much faster?#

Our laptops were using the URI wifi to talk to the server, but when we downloaded the file, the server used its faster, hardwired connection to the internet.

14.11.3. Is the future of HPC cloud based?#

Comparing and contrasting these is a good explore topic.