How can I work on a remote server?

Contents

16. How can I work on a remote server?#

Today we will connect to a remote server and learn new bash commands for working with the content of files.

16.1. What are remote servers and HPC systems?#

diagram illustrating a remote connection to a login node and compute cluster

16.2. Connecting to Seawulf#

We connect with secure shell or ssh from our terminal (GitBash or Putty on windows) to URI’s teaching High Performance Computing (HPC) Cluster Seawulf.

Our login is the part of your uri e-mail address before the @.

ssh -l brownsarahm seawulf.uri.edu

When it logs in it looks like this and requires you to change your password. They configure it with a default and with it past expired.

Note

This block is sort of weird, because it is interactive terminal. I have rendered it all as output, but broken it down to separate chunks to add explanation.

The authenticity of host 'seawulf.uri.edu (131.128.217.210)' can't be established.
ECDSA key fingerprint is SHA256:RwhTUyjWLqwohXiRw+tYlTiJEbqX2n/drCpkIwQVCro.
Are you sure you want to continue connecting (yes/no/[fingerprint])? y
Please type 'yes', 'no' or the fingerprint: yes

Follow the instruction to type yes

I will tell you how to find your default password if you missed class (do not want to post it publicly). Comment on your experience report PR to ask for this information.

Warning: Permanently added 'seawulf.uri.edu,131.128.217.210' (ECDSA) to the list of known hosts.
brownsarahm@seawulf.uri.edu's password:

It does not show charachters when you type your password, but it works when you press enter

Then it requires you to change your password

You are required to change your password immediately (root enforced)
WARNING: Your password has expired.
You must change your password now and login again!

To change, it asks for you current (default) password first,

Important

You use the default password when prompted for your username’s password. Then again when it asks for the (current) UNIX password:. Then you must type the same, new password twice.

Choose a new password you will remember, we will come back to this server

Changing password for user brownsarahm.
Changing password for brownsarahm.
(current) UNIX password:

then the new one twice

New password:
Retype new password:
passwd: all authentication tokens updated successfully.
Connection to seawulf.uri.edu closed.

after you give it a new password, then it logs you out and you have to log back in.

brownsarahm@~ $ ssh -l brownsarahm seawulf.uri.edu

When you log in it shows you information about your recent logins.

[brownsarahm@seawulf ~]$ ls
[brownsarahm@seawulf ~]$ pwd

Notice that the prompt says uriusername@seawulf to indicate that you are logged into the server, not working locally.

16.3. Downloading files#

wget allows you to get files from the web.

[brownsarahm@seawulf ~]$ wget http://www.hpc-carpentry.org/hpc-shell/files/bash-lesson.tar.gz
--2022-03-08 12:58:09--  http://www.hpc-carpentry.org/hpc-shell/files/bash-lesson.tar.gz
Resolving www.hpc-carpentry.org (www.hpc-carpentry.org)... 104.21.33.152, 172.67.146.136
Connecting to www.hpc-carpentry.org (www.hpc-carpentry.org)|104.21.33.152|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12534006 (12M) [application/gzip]
Saving to: ‘bash-lesson.tar.gz’

100%[======================================>] 12,534,006  4.19MB/s   in 2.9s   

2022-03-08 12:58:12 (4.19 MB/s) - ‘bash-lesson.tar.gz’ saved [12534006/12534006]

Note that this is a reasonably sized download and it finished very quickly. This is because the download happened on the remote server not your laptop. The server has a high quality hard-wired connection to the internet that is very fast, unlike the wifi in our classroom.

This is an advantage of using a remote system. If your connection is slow, but stable enough to connect, you can do the work on a different computer that has better connection.

Now we see we have the file.

[brownsarahm@seawulf ~]$ ls
bash-lesson.tar.gz

we see the new file!

16.4. Unzipping a file on the command line#

This file is compressed.

We can use man tar to see the manual aka man file of the tar program to learn how it works. You can also read man files online from GNU where you can choose your format, this page shows the full version.

[brownsarahm@seawulf ~]$ tar -xvf bash-lesson.tar.gz

This command uses the tar program and:

  • v makes it verbose (I have cut this output here)

  • x makes it extract

  • f option accepts the file name to work on

We can see what it did with ls

[brownsarahm@seawulf ~]$ ls
bash-lesson.tar.gz                           SRR307026_1.fastq
dmel-all-r6.19.gtf                           SRR307026_2.fastq
dmel_unique_protein_isoforms_fb_2016_01.tsv  SRR307027_1.fastq
gene_association.fb                          SRR307027_2.fastq
SRR307023_1.fastq                            SRR307028_1.fastq
SRR307023_2.fastq                            SRR307028_2.fastq
SRR307024_1.fastq                            SRR307029_1.fastq
SRR307024_2.fastq                            SRR307029_2.fastq
SRR307025_1.fastq                            SRR307030_1.fastq
SRR307025_2.fastq                            SRR307030_2.fastq

Note: To extract files to a different directory use the option --directory --directory path/to/directory

16.5. Working with large files#

[brownsarahm@seawulf ~]$ cat SRR307023_1.fastq 

Warning

this output is truncated for display purposes

@SRR307023.37289418.1 GA-C_0019:1:120:7087:20418 length=101
CGAGCGACTTTTGTATAACTATATTTTTCTCGTTCTTGGCTCCGACATCTATACAAATTCAGAAGGCAGTTTTGCGCGTGGAGGGACAATTACAAATTGAG
...
IDIG?IHBGIIDIHIDDDD#@=@##############################################################################
@SRR307023.37294417.1 GA-C_0019:1:120:18444:20714 length=101
GGGGCAGTGCTAAGGTACTNGAAAGNNNCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR307023.37294417.1 GA-C_0019:1:120:18444:20714 length=101
IIIGIIIHIHHIIIIEFFF#?################################################################################
[brownsarahm@seawulf ~]$ head SRR307023_1.fastq 
@SRR307023.37289418.1 GA-C_0019:1:120:7087:20418 length=101
CGAGCGACTTTTGTATAACTATATTTTTCTCGTTCTTGGCTCCGACATCTATACAAATTCAGAAGGCAGTTTTGCGCGTGGAGGGACAATTACAAATTGAG
+SRR307023.37289418.1 GA-C_0019:1:120:7087:20418 length=101
GGFGGFHHHHFEBH@G@EGEGHHEHGHHHHHHHHHBHDGHHGDGFHH?H:DFEDEGGGDGD:DBDA=@BBE@G?D?FD8FDGGDD<:BAB4;4;>CF@B3B
@SRR307023.37289419.1 GA-C_0019:1:120:7127:20414 length=101
GGGCAGGGCCGTAATCAATGCCCCTCAAACACAAAGTTACTGGGAATTCCAAGTTCATGTGAACAGTTTCAGTTCACAATCCCAAGCATGAAAGGGGTTCA
+SRR307023.37289419.1 GA-C_0019:1:120:7127:20414 length=101
HEHGHHHHH@GGBGBHEH@B=EGDGAADDDEFBEFF8:EB(/,257;.;;94/4.>@3@7?A=:C@72;1@;4;A688:<E@GGG<EG3A###########
@SRR307023.37289420.1 GA-C_0019:1:120:7204:20410 length=101
CCACCCACTTAGTGCTGCACTATCAAGCAACACGACTCTTTTGAAACATCATCTAGTAATCATTAACGTTATACGGGCCTGGCACCCTCTATGGGTAAATG
[brownsarahm@seawulf ~]$ tail SRR307023_1.fastq 
+SRR307023.37294415.1 GA-C_0019:1:120:17521:20714 length=101
IIIIIIHIIIHIIIIFFFF#?@;B#############################################################################
@SRR307023.37294416.1 GA-C_0019:1:120:17598:20714 length=101
GGGGCACCCACATTATACANAACCANNNANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR307023.37294416.1 GA-C_0019:1:120:17598:20714 length=101
IDIG?IHBGIIDIHIDDDD#@=@##############################################################################
@SRR307023.37294417.1 GA-C_0019:1:120:18444:20714 length=101
GGGGCAGTGCTAAGGTACTNGAAAGNNNCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR307023.37294417.1 GA-C_0019:1:120:18444:20714 length=101
IIIGIIIHIHHIIIIEFFF#?################################################################################

We see that this actually take a long time to output and is way tooo much information to actually read. In fact, in order to make the website work, I had to cut that content using command line tools, my text editor couldn’t open the file and GitHub was unhappy when I pushed it.

Note

I initially used head and tail to pick out the excerpts of the terminal output that I needed, but this year I reused what I had done before

For a file like this, we don’t really want to read the whole file but we do need to know what it’s strucutred like in order to design programs to work with it.

We can also see how much content is in the file wc give a word count,

[brownsarahm@seawulf ~]$ wc SRR307023_1.fastq 
  20000   40000 1625262 SRR307023_1.fastq

We see three numbers, we can use help to see more about that

[brownsarahm@seawulf ~]$ wc --help
Usage: wc [OPTION]... [FILE]...
  or:  wc [OPTION]... --files0-from=F
Print newline, word, and byte counts for each FILE, and a total line if
more than one FILE is specified.  With no FILE, or when FILE is -,
read standard input.  A word is a non-zero-length sequence of characters
delimited by white space.
The options below may be used to select which counts are printed, always in
the following order: newline, word, character, byte, maximum line length.
  -c, --bytes            print the byte counts
  -m, --chars            print the character counts
  -l, --lines            print the newline counts
      --files0-from=F    read input from the files specified by
                           NUL-terminated names in file F;
                           If F is - then read names from standard input
  -L, --max-line-length  print the length of the longest line
  -w, --words            print the word counts
      --help     display this help and exit
      --version  output version information and exit

GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
For complete documentation, run: info coreutils 'wc invocation'
[brownsarahm@seawulf ~]$ wc -l SRR307023_1.fastq 
20000 SRR307023_1.fastq

and then we get only the single output

We can use wc with patterns

[brownsarahm@seawulf ~]$ wc -l *.fastq
   20000 SRR307023_1.fastq
   20000 SRR307023_2.fastq
   20000 SRR307024_1.fastq
   20000 SRR307024_2.fastq
   20000 SRR307025_1.fastq
   20000 SRR307025_2.fastq
   20000 SRR307026_1.fastq
   20000 SRR307026_2.fastq
   20000 SRR307027_1.fastq
   20000 SRR307027_2.fastq
   20000 SRR307028_1.fastq
   20000 SRR307028_2.fastq
   20000 SRR307029_1.fastq
   20000 SRR307029_2.fastq
   20000 SRR307030_1.fastq
   20000 SRR307030_2.fastq
  320000 total

and it gives us each result, plus the total

16.5.1. Searching file Contents#

[brownsarahm@seawulf ~]$ grep --help
Usage: grep [OPTION]... PATTERN [FILE]...
Search for PATTERN in each FILE or standard input.
PATTERN is, by default, a basic regular expression (BRE).
Example: grep -i 'hello world' menu.h main.c

Regexp selection and interpretation:
  -E, --extended-regexp     PATTERN is an extended regular expression (ERE)
  -F, --fixed-strings       PATTERN is a set of newline-separated fixed strings
  -G, --basic-regexp        PATTERN is a basic regular expression (BRE)
  -P, --perl-regexp         PATTERN is a Perl regular expression
  -e, --regexp=PATTERN      use PATTERN for matching
  -f, --file=FILE           obtain PATTERN from FILE
  -i, --ignore-case         ignore case distinctions
  -w, --word-regexp         force PATTERN to match only whole words
  -x, --line-regexp         force PATTERN to match only whole lines
  -z, --null-data           a data line ends in 0 byte, not newline

Miscellaneous:
  -s, --no-messages         suppress error messages
  -v, --invert-match        select non-matching lines
  -V, --version             display version information and exit
      --help                display this help text and exit

Output control:
  -m, --max-count=NUM       stop after NUM matches
  -b, --byte-offset         print the byte offset with output lines
  -n, --line-number         print line number with output lines
      --line-buffered       flush output on every line
  -H, --with-filename       print the file name for each match
  -h, --no-filename         suppress the file name prefix on output
      --label=LABEL         use LABEL as the standard input file name prefix
  -o, --only-matching       show only the part of a line matching PATTERN
  -q, --quiet, --silent     suppress all normal output
      --binary-files=TYPE   assume that binary files are TYPE;
                            TYPE is 'binary', 'text', or 'without-match'
  -a, --text                equivalent to --binary-files=text
  -I                        equivalent to --binary-files=without-match
  -d, --directories=ACTION  how to handle directories;
                            ACTION is 'read', 'recurse', or 'skip'
  -D, --devices=ACTION      how to handle devices, FIFOs and sockets;
                            ACTION is 'read' or 'skip'
  -r, --recursive           like --directories=recurse
  -R, --dereference-recursive
                            likewise, but follow all symlinks
      --include=FILE_PATTERN
                            search only files that match FILE_PATTERN
      --exclude=FILE_PATTERN
                            skip files and directories matching FILE_PATTERN
      --exclude-from=FILE   skip files matching any file pattern from FILE
      --exclude-dir=PATTERN directories that match PATTERN will be skipped.
  -L, --files-without-match print only names of FILEs containing no match
  -l, --files-with-matches  print only names of FILEs containing matches
  -c, --count               print only a count of matching lines per FILE
  -T, --initial-tab         make tabs line up (if needed)
  -Z, --null                print 0 byte after FILE name

Context control:
  -B, --before-context=NUM  print NUM lines of leading context
  -A, --after-context=NUM   print NUM lines of trailing context
  -C, --context=NUM         print NUM lines of output context
  -NUM                      same as --context=NUM
      --group-separator=SEP use SEP as a group separator
      --no-group-separator  use empty string as a group separator
      --color[=WHEN],
      --colour[=WHEN]       use markers to highlight the matching strings;
                            WHEN is 'always', 'never', or 'auto'
  -U, --binary              do not strip CR characters at EOL (MSDOS/Windows)
  -u, --unix-byte-offsets   report offsets as if CRs were not there
                            (MSDOS/Windows)

'egrep' means 'grep -E'.  'fgrep' means 'grep -F'.
Direct invocation as either 'egrep' or 'fgrep' is deprecated.
When FILE is -, read standard input.  With no FILE, read . if a command-line
-r is given, - otherwise.  If fewer than two FILEs are given, assume -h.
Exit status is 0 if any line is selected, 1 otherwise;
if any error occurs and -q is not given, the exit status is 2.

Report bugs to: bug-grep@gnu.org
GNU Grep home page: <http://www.gnu.org/software/grep/>
General help using GNU software: <http://www.gnu.org/gethelp/>

So we can use it as follows:

[brownsarahm@seawulf ~]$ grep length SRR307023_1.fastq 
@SRR307023.37289418.1 GA-C_0019:1:120:7087:20418 length=101
+SRR307023.37289418.1 GA-C_0019:1:120:7087:20418 length=101
@SRR307023.37289419.1 GA-C_0019:1:120:7127:20414 length=101
+SRR307023.37289419.1 GA-C_0019:1:120:7127:20414 length=101
@SRR307023.37289420.1 GA-C_0019:1:120:7204:20410 length=101
+SRR307023.37289420.1 GA-C_0019:1:120:7204:20410 length=101
...
@SRR307023.37294416.1 GA-C_0019:1:120:17598:20714 length=101
+SRR307023.37294416.1 GA-C_0019:1:120:17598:20714 length=101
@SRR307023.37294417.1 GA-C_0019:1:120:18444:20714 length=101
+SRR307023.37294417.1 GA-C_0019:1:120:18444:20714 length=101

16.5.2. How many times does mRNA appear in the file dmel-all-r6.19.gtf ?#

We can answer this two ways:

grep has a count option, -c:

[brownsarahm@seawulf ~]$ grep -c mRNA dmel-all-r6.19.gtf 
34025

or we can use grep and wc together with a pipe:

[brownsarahm@seawulf ~]$ grep  mRNA dmel-all-r6.19.gtf |wc -l
34025

and we get the same answer.

16.6. Closing a session#

we can use exit or logout to clsoe the connection, it will also close if you lose internet.

[brownsarahm@seawulf ~]$ logout
Connection to seawulf.uri.edu closed.

16.7. Using ssh keys to authenticate#

  1. generate a key pair

  2. store the public key on the server

  3. Request to login, tell server your public key, get back a session ID from the server

  4. if it has that public key, then it generates a random string, encrypts it with your public key and sends it back to your computer

  5. On your computer, it decrypts the message + the session ID with your private key then hashes the message and sends it back

  6. the server then hashes its copy of the message and session ID and if the hash received and calculated match, then you are loggied in

a toy example

cheatsheet on ssh from julia evans

from wizardzines

Lots more networking detals in the full zine available for purchase or I have a copy if you want to borrow it.

16.8. Generating a Key Pair#

We can use ssh-keygen to create a keys.

  • -f option allows us to specify the file name of the keys.

  • -t option allows us to specify the encryption algorithm

  • -b option allows us to specify the size of the key in bits

ssh-keygen -f ~/seawulf -t rsa -b 1024
Generating public/private rsa key pair.
/Users/brownsarahm/seawulf already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /Users/brownsarahm/seawulf.
Your public key has been saved in /Users/brownsarahm/seawulf.pub.
The key fingerprint is:
SHA256:UExqbNzmhmxW/rCWvUxb77wU6O7CmUI57G3cBvbYKco brownsarahm@231.126.20.172.s.wireless.uri.edu
The key's randomart image is:
+---[RSA 1024]----+
|       oo        |
|     o +.        |
|      B +        |
|     + B     .   |
|      =.S.  . .  |
|     o .=*o.   . |
|       o+*+Oo..  |
|       ooo@=*+   |
|        E++=o.=. |
+----[SHA256]-----+

We can see it created a file

ls ~/seawulf
/Users/brownsarahm/seawulf

or even look at the public key

cat ~/seawulf.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDBurS41c5YmFybXWSDM7QQZdj0/5sCSJsEcS3Cqxm00veJK/c6gqlvftrjOGzmQH9t6VL8MA5scmL8IQqqubfKsVOjLbp7enQNwdxCA6z/v365XAUuvpyEfAxMyT2dxCpdnWdGnDABBAeIqdP4McoBNWQfKjeMwif0hK94vlhCzw== brownsarahm@231.126.20.172.s.wireless.uri.edu

16.9. Sending the public key to a server#

again -i to specify the file name

ssh-copy-id -i ~/seawulf brownsarahm@seawulf.uri.edu
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/Users/brownsarahm/seawulf.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
brownsarahm@seawulf.uri.edu's password: 

Number of key(s) added:        1

Now try logging into the machine, with:   "ssh 'brownsarahm@seawulf.uri.edu'"
and check to make sure that only the key(s) you wanted were added.

16.10. Logging in#

To login without usng a password you have to tell ssh which key to use:

ssh -i ~/seawulf brownsarahm@seawulf.uri.edu
Last login: Tue Mar 26 12:54:22 2024 from 172.20.126.231
[brownsarahm@seawulf ~]$ client_loop: send disconnect: Broken pipe

Or you can add the following to your ~/.ssh/config file

Host seawulf
  Hostname seawulf.uri.edu
  Username brownsarahm
  IdentityFile ~/.ssh/seawulf

16.11. Prepare for Next Class#

Make sure you have your ssh key configured to seawulf

16.12. Badges#

  1. Answer the following in hpc.md of your KWL repo: (to think about how the design of the system we used in class impacts programming and connect it to other ideas taught in CS)

    1. What kinds of things would your code need to do if you were going to run it on an HPC system? 
    1. What sbatch options seem the most helpful?
    1. How might you go about setting the time limits for a script? How could you estimate how long a script will take?
    
  2. Read through this rsa encryption demo site to review how it works. Find 2 other encyrption algorithms that could be used wiht ssh (hint: try man ssh or read it online) and compare them in encyryption_compare. Your comparison should be enough to advise someone which is best and why, but need not get into the details.

  3. Post your public key and then decrypte the message we give you back.

  1. Answer the following in hpc.md of your KWL repo: (to think about how the design of the system we used in class impacts programming and connect it to other ideas taught in CS)

    1. What kinds of things would your code need to do if you were going to run it on an HPC system? 
    2. What sbatch options seem the most helpful?
    3. How might you go about setting the time limits for a script? How could you estimate how long a script will take?
    
  2. configure ssh keys for your github account

  3. Find 2 other encyrption algorithms that could be used wiht ssh (hint: try man ssh or read it online) and compare them in encyryption_compare. Your comparison should be enough to advise someone which is best and why, but need not get into the details.

  4. Read through this rsa encryption demo site site and use it to show what, for a particular public/private key pair (based on small primes) it would look like to pass a message, “It is spring now!” encrypted. (that is encypt it, and decrypt it with the site, show it’s encrypted version and the key pair and describe what would go where). Put your example in rsademo.md.

16.13. Experience Report Evidence#

16.14. Questions After Today’s Class#

16.14.1. Can you create a server off any internet connected PC and connect through git bash akin to seawulf?#

You can create a server and connect through ssh from any terminal, yes. Not all will have all of the features of seawulf though.

16.14.2. Is using a remote server safe?#

Depends on the server itself. Using the ones from URI are in most senses and in most cases Amazon, Google and Microsoft ones are, depending on what type of safety. Using corporate ones might mean they have access to some of your data.

16.14.3. Also is a remote server like how a VPN does since you use a computer that is not yours to access things?#

It has that in common, but that is about the end of the similarity. This is a good explore topic.

16.14.4. What happens if you forget your password or lose the SSH key file?#

For seawulf, you have to email ITS to get your account reset.

Generally, you talk to someone who has admin rights. If that’s you… you could fully lose access.

16.14.5. do comapnys use their own private HPCs with their employees?#

Sometimes, othertimes they use Amazon, Google, or Microsoft ones. Those three companies made their own so big they sell access to them to inviduals and other companies alongside their cloud offerings.

Most universities host their own and some share theirs. See, for example HPC at URI which access to includes the Mass Green High performance computing center which is a collab between BU, MIT, Northeastern, UMASS, and Harvard initially that URI has joined.

16.14.6. are we always allowed to use the schools HPC that we used today in class?#

seawulf is specifically for teaching purposes, there are not good data policies for it and your account could be deactivated after the semester is over. If you are doing something for class you can use it, including badges or just tinkering to understand. If you are interested in a side proejct, let me know the purpose and I can help you determine and access a more appropriate HPC system.

16.14.7. can you use an ssh key to log in to your github account to allow you to create push requests?#

yes!

16.14.8. What are some of the other types of virtual machines and what are there uses?#

What we did today was a server not a virtual machine a virtual machine is an isolated system within another system; a server is what hosts virtual machines, and the web.

More broadly, this is a big open question, that is a good explore topic.

16.14.9. What are the capabalities of URI’s seawulf vm? What do people usually do with it?#

You can read the seawulf specs to see what hardward and software it has and the using seawulf page for general information about it.

It is used for learning HPC or doing HPC-requiring computations in courses.

16.14.10. What other ways can ssh keys be used for?#

They’re used for authentication, signatures, or message passing.

16.14.11. Are there any drawbacks to using an ssh key instead of a password?#

Not really, passwords are weak.

16.14.12. what does grep do?#

grep searches the contents of a file

16.14.13. Can ssh use a differnt encryption algorithm thats not RSA thats more secure?#

Yes, see its man file

16.14.14. why is sshing into a server useful?#

Until a few years ago it was the only way to access most servers. Now, more provide a browser based interface, but running that interface uses computational power. So it is more efficient to access with ssh. It also means you can, eg edit in your local IDE but then push and run on a remote server all in one window.

16.14.15. What is the algorithm/logic behind the two generated keys#

This is probably best understood through an rsa demo

16.14.16. Will these ssh keys be able to be decrypted with quantum computing?#

RSA is actually not the most secure already, it is simpler to understand though, so it is a good place to start. Quantum computing, however to the best of my knowledge does not improve encryption breaking, but this is also a good explore topic.

16.14.17. why is the private key so much longer than the public one?#

short answer: It contains extra information

Long answer: this stackexchange post explains it well