GridPACK On Cloud

Revision as of 17:29, 31 March 2020 by Bjpalmer (talk | contribs)

This capability is under development.

The GridPACK development team is currently working on making GridPACK available through the cloud. The reasons for doing this are twofold. One is to reduce the overhead of getting started with GridPACK by providing an out-of-the-box build that can be used immediately by anyone interested in investigating GridPACK or trying out some of its applications and features. The second reason is to make GridPACK available to users that do not currently have access to Linux workstations or clusters or would like to see what can be accomplished with these kinds of resources before making the investment in setting up a Linux system.

Once you have created an instance of GridPACK on the cloud from a machine image file, it behaves exactly like a remote Linux computer and you can access it using the same kinds of methods that you would use to access any other Linux workstation or cluster. The current GridPACK images come preloaded with the vi editor and GNU compilers. They also come with git, sftp, scp and svn, so any development you do can eventually be transferred to some other platform. Other packages can be installed with yum. You will have sudo privileges on any instance you create, so it is possible to configure it to suit your preferences.

The GridPACK images contain two directories. The software directory contains all the libraries needed by GridPACK, including PETSc, Boost, GA, and MPI. The GridPACK directory contains all of the GridPACK source code as well as two builds of GridPACK. The builds are located in GridPACK/src/build_ts and GridPACK/src/build_pr. Inside each of these directories is an install directory that can be linked to by new GridPACK applications. The build_ts and build_pr builds use the two-sided and progress ranks runtimes of GA, respectively. The two-sided build is probably sufficient for most runs using small numbers of processors; the progress ranks build should be used when running on large numbers of processors.
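The choice between the two builds can be written down as a small shell helper. This is only a sketch: the function name gridpack_install_dir and the default cutoff of 16 processes are assumptions made for illustration, not values recommended by the GridPACK team, and the location of the GridPACK directory on the instance must be supplied by the caller.

```shell
# Hypothetical helper: print the install directory of the GridPACK build
# suited to a given number of processes.
# The default cutoff (16) is a placeholder, not a documented GridPACK value.
gridpack_install_dir() {
    gridpack_root=$1      # path to the GridPACK directory on the instance
    nprocs=$2             # number of processes you plan to run on
    cutoff=${3:-16}       # assumed boundary between "small" and "large" runs
    if [ "$nprocs" -gt "$cutoff" ]; then
        # Build that uses the progress ranks runtime of GA
        printf '%s/src/build_pr/install\n' "$gridpack_root"
    else
        # Build that uses the two-sided runtime of GA
        printf '%s/src/build_ts/install\n' "$gridpack_root"
    fi
}
```

For example, gridpack_install_dir "$HOME/GridPACK" 4 prints the path of the two-sided install directory, assuming the GridPACK directory is in your home directory. Adjust the cutoff to match your own experience on the instance type you are using.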

Amazon Web Services: At present, GridPACK is only available through Amazon Web Services (AWS). The GridPACK development team is new to this type of computing environment and we encourage any users who run into difficulties to contact the development team. We will try to provide further clarification or make appropriate changes to our cloud distribution so that GridPACK will work properly. Our access to AWS is through a corporate account and we may have a different experience from users accessing the cloud from other environments. Again, if you are having problems and the instructions below do not appear to correspond with what you are seeing when you log in, please contact us and we will try to resolve whatever issues you may be having.

To use GridPACK via AWS it is first necessary to get an Amazon account. In the past we found that it was possible to get some of the applications and tests to run with an AWS instance type of t2.micro (which is free), but as GridPACK has grown this no longer seems to be the case. Compiling and running jobs on multiple processors generally requires more memory and disk than is available on the t2.micro instance, so users will need to set up a larger instance if they are interested in developing their own applications or investigating performance gains with larger numbers of processors.

Once users have set up an AWS account and logged in they should end up on the AWS Management Console page. This page will list a variety of AWS services available to users. If you have never logged in before or want to create a new instance from a GridPACK Amazon Machine Image (AMI), go to the "Build a solution" box and click on the "Launch a virtual machine" link (it will also mention EC2). Select this.

Finding and Launching an Amazon Machine Image: If you clicked on the "Launch a virtual machine" link you will end up on a page titled "Step 1: Choose an Amazon Machine Image (AMI)". Select "AWS Marketplace" from the list on the left hand side and type "gridpack" in the search field at the top. You should see a link show up for some number of results in Community AMIs. Select one of these images. The main difference between the different images is the operating system they represent. After selecting the configuration, click the "Next: Configure Instance Details" button at the bottom of the page. This page can be left as is, as long as a network appears in the Network block. Go to the bottom of the page and click on the "Next: Add Storage" button. Set the "Size (GiB)" field to at least 30 GiB. Then click on "Review and Launch" at the bottom of the page. Next click on the "Launch" button at the bottom of the page.

A dialog box will pop up asking you to "Select an existing key pair or create a new key pair". If you have created a key pair in the past, you can choose an existing key pair. Otherwise follow the instructions for creating a new key pair. Make sure to save the file containing the new key pair that is downloaded as part of this process. This file is required in order to log in to running instances.

The next page you will be taken to lists the different machine configurations that you can select for your new instance. Generally, the more CPUs (vCPUs) and the higher the network performance, the more expensive it should be. The default amount of memory for all of the instances is generally inadequate for GridPACK and should be increased significantly. After choosing "EC2", you will be taken to the EC2 page. On the left hand side of the page is a column labeled "EC2 Dashboard". In the middle of the page is a block labeled "Resources". The lower portion of this block has a partition labeled "Service Health" and under this is a field called "Service Status". Make sure that this is set to "US West (Oregon)". If it is not, go to the upper right hand corner of the page, which should have a pull-down menu currently set to a region named after either a US state (e.g. Ohio) or a city (e.g. Tokyo). Select "US West (Oregon)".

Once you have verified that you have selected the correct cloud network, you can create your own instance from the GridPACK image. Under the EC2 Dashboard column, go to the "Images" tab and select "AMIs". This will bring up the AMI page, which lists any AMIs that you may have created. At the top of this page is a search field. The label to the left of this search field is "Owned by me". Type "gridpack" or "gridpack_aws" into the search field. An AMI with the name "GridPACK_AWS_vX", where "X" is an integer, should appear in the list of AMIs. Choose this AMI by clicking on the square icon to the left of the AMI name. If more than one AMI shows up, select the one with the highest value of X.
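The rule of picking the AMI with the highest value of X can be expressed as a short shell filter. The sketch below is illustrative: the function name latest_gridpack_ami is an assumption, and the function expects the AMI names (for example, copied from the AMI page) on standard input, one per line. It compares X numerically, so v10 ranks above v9.

```shell
# Hypothetical helper: read AMI names on standard input, one per line, and
# print the GridPACK_AWS_vX name with the largest integer X.
# Names that do not match the GridPACK_AWS_vX pattern are ignored.
latest_gridpack_ami() {
    sed -n 's/^GridPACK_AWS_v\([0-9][0-9]*\)$/\1/p' |  # keep only the integer X
        sort -n |                                      # numeric order
        tail -n 1 |                                    # largest X
        sed 's/^/GridPACK_AWS_v/'                      # rebuild the full name
}
```

Usage: printf 'GridPACK_AWS_v2\nGridPACK_AWS_v10\n' | latest_gridpack_ami prints GridPACK_AWS_v10. If no name matches, nothing is printed.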

After selecting the GridPACK AMI, go to the top of the page, and click the "Launch" button. This will bring you to a page called "Step 2: Choose an Instance Type". The page will display a table of different virtual machines that you can select to run GridPACK on. The page also features information on the properties of the different instances (e.g. number of cores, memory, etc.). The t2.micro instance is free, but is too small to do much with. Charges apply for larger instances so you may want to start with a relatively small instance and increase the size as needed. It is also worth noting that you cannot change the size of an instance once you have created it, without first saving it as an AMI and then starting a new instance from that AMI.

Once you have decided what type of instance you want to use, select the square corresponding to that instance on the left hand side of the page and then click the "Review and Launch" button on the lower right hand side of the page. This will take you to a page labeled "Step 7: Review Instance Launch". You can hit the "Launch" button directly to start up an instance, but before doing that you should increase the amount of disk space associated with the instance. This can be done by clicking on the "Edit Storage" link on the right hand side of the page, near the bottom. This will allow you to modify the amount of disk available to this instance. After setting the storage amount, click on "Review and Launch" at the bottom of the page. This will return you to the "Review Instance Launch" page. Click on the "Launch" button at the bottom of the page.

When you launch the instance, a dialog box appears labeled "Select an existing key pair or create a new key pair". If you have used AWS instances previously, then a key pair file should show up in the selection menu; otherwise you will need to create a new one. You can also create a new key pair even if some old ones are available. To create a new key pair, select the "Create a new key pair" option in the menu list and type in a name for the key pair file. The dialog will warn you to make sure that you save the key pair file after you create it. After typing in the key pair file name, click the "Download Key Pair" button. The key file will now be in your download area with the name filename.pem, where "filename" is the name you put in the name field for the key file. You will need to copy this file to whatever platform you are planning to use to access your AWS instance. Once you have created the key file, click on the "Launch Instance" button. This will take you to the "Launch Status" page. At the bottom right of the page, click on the "View Instances" button. You will then be taken to the EC2 Dashboard page, except that now you will see the "Instances" that you have access to. You can also get to this page by going directly to the EC2 Dashboard page and clicking on the "Instances" link under the "Instances" tab. You should see a new instance appear and after a minute or two, it should have the status "Running". Note that once an instance is "Running", you are being charged for whatever resources are required by the instance. You can stop an instance by going to the "Actions" menu at the top of the page, selecting "Instance State", and then selecting "Stop" under the submenu. This will halt the instance, and you will only be charged for storage.

Once an instance has been created, you can start and stop it repeatedly by selecting the instance, going to the "Actions" menu and clicking "Start" or "Stop" under the "Instance State" submenu. Each time you start an instance, an IP address should appear in the "Public IP" field for that instance. This address can be used to SSH into the running instance from some other computer. Note that every time you restart an existing instance, you will get a new IP address. Once an instance is running and has an IP address, it can be accessed from another computer just like any Linux workstation or cluster. When you first create an instance, it will always have an account called "ec2-user". This user name can be used to log into the instance. You can use your privileges as superuser to add other user accounts, if desired. If you are using your instance to develop your own applications and decide that you want to use a different size instance, or perhaps you want to share it with others, you can save your instance as an AMI and then create a new instance from it on a different size virtual machine. You can also create AMIs as a way to save your work, although you should also look at tools such as git and svn to make a permanent record of your code development.
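Because the public IP address changes on every restart, it can be convenient to start, stop and query an instance from a terminal instead of the console. The sketch below assumes that the AWS command line interface (the aws command) is installed and configured with your credentials, which is not covered on this page; the function names are hypothetical, and the instance ID is the "Instance ID" value shown on the Instances page. The region us-west-2 is the identifier for US West (Oregon).

```shell
# Assumption: the AWS CLI ("aws") is installed and configured on your machine.
# Hypothetical helpers; pass the instance ID (e.g. i-0123456789abcdef0) as $1.

# Start a stopped instance (charges for the running instance resume).
start_instance() {
    aws ec2 start-instances --region us-west-2 --instance-ids "$1"
}

# Stop a running instance (only storage is charged while it is stopped).
stop_instance() {
    aws ec2 stop-instances --region us-west-2 --instance-ids "$1"
}

# Print the current public IP address of a running instance.
instance_public_ip() {
    aws ec2 describe-instances \
        --region us-west-2 \
        --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].PublicIpAddress' \
        --output text
}
```

After start_instance has been called and the instance reports "Running", instance_public_ip prints the address to use with SSH for that session.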

There are a number of ways to access an instance once it is running. A few are listed below.

From Linux: This is the simplest platform to use to log into your AWS instance. Copy the key pair file that you are going to use to a directory on your Linux platform that you would like to use for logging in to the AWS instance. Change the permissions on the file using the command

   chmod 600 keyfile.pem

where "keyfile.pem" is the key pair file name. SSH will not allow you to use a key file that is world readable. Then type

   ssh -i keyfile.pem -l ec2-user ip.add.re.ss

where ip.add.re.ss is the numerical IP address in the "Public IP" field on the EC2 Dashboard page. This will log you in as user ec2-user.
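The two steps above can be combined into one shell function. This is a minimal sketch; the function name connect_instance is an assumption, and the key file and IP address are whatever you obtained from AWS.

```shell
# Hypothetical helper: restrict the key file permissions, then log in to the
# instance as ec2-user.
#   $1 = path to the key pair file (e.g. keyfile.pem)
#   $2 = value of the "Public IP" field for the running instance
connect_instance() {
    key_file=$1
    ip_address=$2
    # SSH refuses key files that other users can read
    chmod 600 "$key_file" || return 1
    ssh -i "$key_file" -l ec2-user "$ip_address"
}
```

Usage: connect_instance keyfile.pem ip.add.re.ss, with ip.add.re.ss replaced by the current public IP address. Since the address changes on each restart, it has to be looked up again for every new session.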

From a Mac: Accessing an instance from a Mac is almost the same as for a Linux box. Bring up a terminal on the Mac and go to whatever directory contains your key pair file. Use the above SSH command to connect to the remote instance after changing the permissions on the key pair file.

From Windows: A running instance can be accessed using PuTTY. A detailed description of how to do this is available here. To use PuTTY you will need to download both the putty and puttygen executables. These are both freely available from the PuTTY website.

Instances can also be accessed using Cygwin. After installing Cygwin, bring up a Cygwin window. This will behave largely like a Linux terminal. We will assume that the .pem file created when you started your AWS instance is located somewhere on your Windows desktop. If you just type ls to get a directory listing, you will see a variety of directories such as Desktop, Downloads etc. that mimic the folders on Windows. At least initially, these folders will not have anything in them. To find the .pem file, type df in the Cygwin window. You will probably see something like

   Filesystem     1K-blocks      Used Available Use% Mounted on
   C:/cygwin64    209712124 205861652   3850472  99% /
   U:             524284924 390740552 133544372  75% /cygdrive/u

Your Windows desktop and other folders are located under the partition /cygdrive/u. If you type cd /cygdrive/u and then type ls you will again see a listing of folders such as Desktop, Downloads etc. but this time they will actually correspond to your Windows folders. Cd into the folder containing the .pem file and change the permissions on the .pem file using chmod 600 keyfile.pem. Use ssh as described above to log into your running instance.
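If you are not sure which folder holds the key file, a search with find can locate it. The sketch below is illustrative: the function name find_pem_files is an assumption, and the default of /cygdrive/u is taken from the df output shown above, so your drive letter may differ.

```shell
# Hypothetical helper: list .pem files up to three directory levels below a
# starting directory. The default /cygdrive/u matches the example df output
# above; pass a different directory as $1 if your drive letter differs.
find_pem_files() {
    search_root=${1:-/cygdrive/u}
    # Errors from unreadable folders are discarded
    find "$search_root" -maxdepth 3 -type f -name '*.pem' 2>/dev/null
}
```

Running find_pem_files with no argument searches the Windows folders mounted under /cygdrive/u; cd into the directory it reports before running chmod and ssh.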

AWS Accounts: Information on setting up an AWS account can be found here and information on billing can be found here. Additional information on the properties of different instances can be found here. The properties page also has some information about setting up a free trial account with AWS.