Easy start guide for DeepRacer for Cloud on AWS:
Search for EC2 in Services:
Launch Instance:
Select the Deep Learning AMI (Ubuntu 18.04). This is a template for the machine that comes with lots of good ML stuff preinstalled.
Select instance type g4dn from All instance families:
Now select g4dn.2xlarge specifically by clicking the checkbox next to it, as it’s a good fit:
Click Review and Launch:
Ignore these warnings. They are just telling you that you are allowing this machine to be accessed from the web (you need this to run the notebook for log analysis).
Scroll down and expand Storage then click Edit Storage:
Change the root volume size to 200 GiB (the default ran out of space for me while installing DeepRacer):
Click Review and Launch
Click Launch
Create a new keypair (name it anything you like):
Download the keypair:
Then click Launch Instances.
On the confirmation page that appears, scroll down and click View Instances:
Wait until the status check goes from Initializing to 2/2 checks passed. In my example I have a completed machine and a new one for the demo. You’ll have just one machine.
Click on the instance ID for your machine:
Copy the Public IPv4 address or the Public IPv4 DNS. In my case this is ec2-3-228-6-131.compute-1.amazonaws.com or 3.228.6.131
Open up a command prompt:
Run an ssh command using your instance’s IP address and the path to the pem file you downloaded a few steps before. The default user is called ubuntu.
In my case I used:
C:\Users\f_mac>ssh -i Downloads\testdeepracer.pem ubuntu@3.228.6.131
You’ll have a different IP address and potentially a different location and name for the pem file.
As it’s the first time you have connected, you’ll be asked to confirm the host with a prompt like this:
The authenticity of host '3.228.6.131 (3.228.6.131)' can't be established.
ECDSA key fingerprint is SHA256:smpNoKledFjgsMxlAREdIB4JLSQr4HhN7SPxuW9uJE8.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Type yes and press Enter, as in my example.
This will then give you a unix prompt:
Now clone the deepracer-for-cloud repository from GitHub by entering this command:
git clone https://github.com/aws-deepracer-community/deepracer-for-cloud.git
Now run the second command to do the installation:
cd deepracer-for-cloud && ./bin/prepare.sh
This step takes about 5-10 minutes. So go make a cup of tea or coffee :)
Still working….
Yay, it’s done :)
You must now reboot the instance. Easiest way to do it is with this command:
sudo reboot now
You’ll now see confirmation of the disconnection.
While it reboots, let’s make an S3 bucket and a user for you to connect to S3 (Amazon’s web storage).
Click on Services and type in S3:
If you have been using the DeepRacer console, it should have already created a bucket for you.
Take a note of that bucket’s name as you’ll need it later. It corresponds to the DR_UPLOAD_S3_BUCKET parameter you will set later.
Create a new bucket using the Create bucket button:
Your bucket name has to be unique across all of AWS, so something like deepracer followed by your full name will probably work. For me that was deepracer-finlay-macrae
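If you’d like to sanity-check a name before typing it into the console, here’s a minimal sketch of a shell function that tests a few of the S3 naming rules (3 to 63 characters; lowercase letters, digits, dots and hyphens only; starting and ending with a letter or digit). The function name valid_bucket_name is my own invention, it doesn’t cover every rule, and it can’t tell you whether the name is already taken; only AWS can.

```shell
# valid_bucket_name NAME: hypothetical helper, checks a subset of the S3 naming rules.
# Returns 0 if NAME passes these checks, 1 otherwise.
valid_bucket_name() {
    # Length must be between 3 and 63 characters.
    [ "${#1}" -ge 3 ] && [ "${#1}" -le 63 ] || return 1
    case "$1" in
        # Any character outside lowercase letters, digits, dot and hyphen is rejected.
        # Letters are listed explicitly so the check does not depend on the locale.
        *[!abcdefghijklmnopqrstuvwxyz0123456789.-]*) return 1 ;;
        # The name must not start or end with a dot or a hyphen.
        [-.]*|*[-.]) return 1 ;;
    esac
    return 0
}

if valid_bucket_name "deepracer-finlay-macrae"; then
    echo "bucket name passes the basic checks"
else
    echo "bucket name breaks a naming rule"
fi
```

Swap in your own name for deepracer-finlay-macrae. A pass here only means the format is plausible; the console will still tell you if somebody else already owns the name.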
Scroll down and click Create Bucket:
Now click on Services again and enter IAM:
Click Users:
Click Add User:
Select a username that makes sense and click the checkbox for Programmatic access:
Click Next: Permissions
Click Attach existing policies directly and then search for S3 in the filter box. For ease select AmazonS3FullAccess. This is more rights than it needs but I’ll leave creating roles and assigning granular bucket access as an exercise for the reader.
Click Next: Tags
Then click Next: Review
Then click Create user:
Take a note of your Access key ID and also a note of your Secret access key. You need both.
Mine were AKIASOXA435TSWLNXTV4 and UKwKPBGRzhQFobFFq8lXmJ4voxF0Igezuu8ZtoqD
Don’t share these with anyone you don’t want to be able to use your S3 account.
Your instance will have rebooted by now so let’s jump back to your command window:
Connect to the instance with the same command as before:
Now run the follow-up command:
cd deepracer-for-cloud && ./bin/init.sh -c aws -a gpu
It’ll download the Docker images and other things it needs.
This takes around 5 minutes:
Now we can start configuring the machine:
Edit the system.env file with nano, vi or vim (nano probably easiest):
Change DR_UPLOAD_S3_BUCKET to the name of the S3 bucket AWS created for you. For me it was aws-deepracer-8a1ef6af-8e70-4066-afa1-b7d113e431d9. You’ll have a different but similar one.
Change DR_UPLOAD_S3_ROLE to default
Change DR_LOCAL_S3_BUCKET to the name of the S3 bucket you created, so in my case deepracer-finlay-macrae
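If you’d rather not open an editor, the same changes can be scripted. Below is a rough sketch, not part of deepracer-for-cloud: set_env_var is a helper name I made up, and it assumes each setting sits on its own line in KEY=value form. I’ve written the local bucket key as DR_LOCAL_S3_BUCKET, which is my understanding of its name in system.env; check the spelling in your own file. The helper refuses to do anything if the key isn’t already there, so a mistyped name can’t silently add a junk line.

```shell
# set_env_var FILE KEY VALUE: hypothetical helper that rewrites an existing KEY=... line.
# Assumes VALUE contains no '|', '&' or backslash characters (bucket names never do).
set_env_var() {
    if ! grep -q "^$2=" "$1"; then
        echo "No $2= line found in $1; nothing changed" >&2
        return 1
    fi
    # Write to a temporary file first, then replace the original.
    sed "s|^$2=.*|$2=$3|" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Only touch system.env if it exists, i.e. when run from the deepracer-for-cloud directory.
if [ -f system.env ]; then
    set_env_var system.env DR_UPLOAD_S3_BUCKET "aws-deepracer-8a1ef6af-8e70-4066-afa1-b7d113e431d9"
    set_env_var system.env DR_UPLOAD_S3_ROLE "default"
    set_env_var system.env DR_LOCAL_S3_BUCKET "deepracer-finlay-macrae"
fi
```

Replace the two bucket names with your own. Opening the file in nano afterwards is still a good idea to confirm the three lines look right.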
Now we configure the default AWS credentials. Run
aws configure
Give it the access key ID and secret access key from the user you created and provide us-east-1 as the region.
Leave the default output format blank.
Now run
source bin/activate.sh
Run
dr-update
This applies our changes to the system.env file.
Run
dr-upload-custom-files
This helper function copies the files held in the custom_files directory into the S3 bucket.
The hyperparameters.json file holds details about the training, model_metadata.json contains the action space and reward_function.py contains the reward function. You can edit these and run dr-upload-custom-files again to replace the ones DeepRacer for Cloud is using.
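A stray comma in one of the JSON files is an easy mistake to make while editing, so it can be worth checking them before uploading. Here’s a small sketch that does this with Python’s built-in json.tool module. It assumes python3 is on the path and that the two JSON files are called hyperparameters.json and model_metadata.json inside custom_files; check_json is just a name I picked, not a deepracer-for-cloud command.

```shell
# check_json FILE: hypothetical helper, succeeds only if FILE parses as valid JSON.
check_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}

# Check each JSON file that exists; files that are missing are skipped quietly.
for f in custom_files/hyperparameters.json custom_files/model_metadata.json; do
    if [ -f "$f" ]; then
        if check_json "$f"; then
            echo "OK: $f"
        else
            echo "Invalid JSON: $f"
        fi
    fi
done
```

Run it from the deepracer-for-cloud directory. It only checks that the files are well-formed JSON, not that the values inside them make sense for training.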
Now run
dr-start-training
To actually start training:
Remember to always stop training using the dr-stop-training command, as not doing so leaves your Docker containers in a mess.
If you want to resume training, the easiest way to start a new session is the dr-increment-training command. If you run it while training is stopped, it increments your model name and updates your run.env file to select your old model as the pre-trained one to build on:
DR_LOCAL_S3_MODEL_PREFIX=rl-deepracer-2
DR_LOCAL_S3_PRETRAINED=True
DR_LOCAL_S3_PRETRAINED_PREFIX=rl-deepracer-1
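To make the renaming concrete, here’s a tiny sketch of the idea: take the old prefix, bump the number after the last hyphen, and point the pre-trained prefix at the old name. This is my own illustration and not the actual code behind dr-increment-training. The function name next_prefix is made up, it assumes the number has no leading zeros, and appending -1 when there is no number is simply my choice for the sketch.

```shell
# next_prefix PREFIX: hypothetical helper that prints PREFIX with its trailing number bumped by one.
next_prefix() {
    num="${1##*-}"     # text after the last hyphen (the whole string if there is no hyphen)
    base="${1%-*}"     # text before the last hyphen
    case "$num" in
        # Empty or containing a non-digit: no numeric suffix, so start one at 1.
        ''|*[!0-9]*) echo "$1-1" ;;
        *) echo "$base-$((num + 1))" ;;
    esac
}

old="rl-deepracer-1"
new="$(next_prefix "$old")"

# Print the three run.env settings for the new session.
printf 'DR_LOCAL_S3_MODEL_PREFIX=%s\n' "$new"
printf 'DR_LOCAL_S3_PRETRAINED=True\n'
printf 'DR_LOCAL_S3_PRETRAINED_PREFIX=%s\n' "$old"
```

With rl-deepracer-1 as the old prefix, this prints the same three settings shown above.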
The rest of the helper functions are documented here: https://aws-deepracer-community.github.io/deepracer-for-cloud/reference.html
OK, that’s got you started. Check out the S3 bucket and you should see logs etc. appearing. This is the end of the AWS install guide. Please send feedback if any changes are needed.
Check out my next article on how to get log analysis and KVS streams running.