Reading S3 via s3fs from Ubuntu 12.04 EC2 instance

Posted on Wednesday, November 14, 2012


Getting EC2 to read S3




I have a simple need: to be able to read/write data in my S3 from an EC2 instance.  You would think there would be a nice simple way of doing that, like allowing EC2 instances in a certain (EC2) security group to have permissions to a bucket, or a folder in a bucket, of S3.  But there is not, at least not from what I can see.  The good news is there is a way to do this, but it's more complex, and it gives you a lot more tools than you could ask for.

So with that let’s start diving into it all…


S3


First I am going to log into S3 via the web console and make a new bucket and place a few folders and files in it.

Click on My Account/Console -> AWS Management Console

Sign in with your Amazon account email and password

Click on S3

Click “Create Bucket”

I will call this bucket “pats-test-bucket”, then click Create

Select pats-test-bucket, then click on “Create Folder”

Call it test_folder

Open up test_folder and click on Upload, then upload a few files.




I uploaded two files: Test.xls and test_file.txt.


Get your keys


Get your main keys for your account.

Click on My Account/Console -> Security Credentials

Go down to the Access Credentials section and copy the Access Key ID; mine, of course, is blanked out for security.


But let’s suppose it’s

Access Key ID = KAIAA3478XOQQHNYF54A



Now click on the Show button under the Secret Access Key

Copy the Secret Access Key; again, I blanked mine out for security reasons.

Let’s assume it’s xdg4x/26cfr9+XqVInnf438Hdd34PjiLzhAi43Dd.

For example purposes we have

Access Key ID       = KAIAA3478XOQQHNYF54A
Secret Access Key = xdg4x/26cfr9+XqVInnf438Hdd34PjiLzhAi43Dd



If you do not have an Ubuntu 12.04 EC2 instance running, here is the command line to create one.  This assumes you have the AWS command line tools set up on your system and have a keypair created.  If not, you can use the AWS web console.


         > ec2-run-instances ami-9c78c0f5 -b /dev/sda1=:8:true -k my-keypair -t t1.micro -g default  --availability-zone us-east-1a

(use your own keypair)


Now log into your Ubuntu 12.04 EC2 instance
In my case it’s at ec2-184-72-175-14.compute-1.amazonaws.com


         >  ssh -i .ec2/my-keypair.pem ubuntu@ec2-184-72-175-14.compute-1.amazonaws.com

(again this assumes you have your keypair in the given location)



s3fs


Now install s3fs on the Ubuntu instance.  The main site for s3fs is at http://code.google.com/p/s3fs/ [1].



         >  sudo apt-get install build-essential
         >  sudo apt-get install libfuse-dev
         >  sudo apt-get install fuse-utils
         >  sudo apt-get install libcurl4-openssl-dev
         >  sudo apt-get install libxml2-dev
         >  sudo apt-get install mime-support


Download the tar file from the source


         >  wget http://s3fs.googlecode.com/files/s3fs-1.61.tar.gz


Untar it and make it


         >  tar xzvf s3fs-1.61.tar.gz
         >  cd s3fs-1.61
         >  sudo ./configure
         >  sudo make
         >  sudo make install


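At this point you can sanity-check that the binary landed where you expect.  The path below is just the ./configure default (/usr/local), so adjust if you changed the prefix:


         >  which s3fs
         /usr/local/bin/s3fs
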

Edit /etc/fuse.conf


         >  sudo vi /etc/fuse.conf


Uncomment the user_allow_other line so the file looks like this:


# Set the maximum number of FUSE mounts allowed to non-root users.
# The default is 1000.
#
#mount_max = 1000

# Allow non-root users to specify the 'allow_other' or 'allow_root'
# mount options.
#
user_allow_other






Set up ~/.passwd-s3fs


Create this file.


         >  vi ~/.passwd-s3fs


Put the following contents in the file.  The format of this file is

Access Key ID:Secret Access Key
(separated by a colon, all on one line; remember to use your own keys!)


KAIAA3478XOQQHNYF54A:xdg4x/26cfr9+XqVInnf438Hdd34PjiLzhAi43Dd


Update the permissions on the passwd-s3fs file


         >  chmod 600 ~/.passwd-s3fs



Now run the following commands


         >    cd
         >    mkdir s3
         >    s3fs pats-test-bucket s3 -ouse_cache=/tmp




         >    cd s3
         >    ls

But you see nothing.  Even though there is actually a folder in S3, you can’t see it.  Why?

Folders in S3 are not really folders, and s3fs handles them a bit differently than the S3 console does.  As a result, s3fs cannot see folders made via the S3 web console, so any folder structure you want to see via s3fs needs to be made via s3fs.

Run the following command


         >    df -h


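Your output should include a line for the s3fs mount, roughly like this (the sizes and paths shown are illustrative):


Filesystem      Size  Used Avail Use% Mounted on
s3fs            256T     0  256T   0% /home/ubuntu/s3
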
Here you can see the mounted S3 drive has 256T of space, which is basically unlimited.


Run the following commands from the s3 directory


         >    mkdir test
         >    cd test
         >    touch bob_test.txt


Log back into the Amazon S3 web console and you will see

The test folder we just made via s3fs, and a 0-byte test file.  As I understand it, s3fs needs to make these 0-byte files so it can see the “folder” structure.

Open the test folder and you can see bob_test.txt.

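One more thing worth knowing: since this is a FUSE mount, you can unmount it again without root using fusermount:


         >  fusermount -u ~/s3
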
Setting up auto mount of s3 drive


It would be nice to just auto mount the drive so it is there when we reboot the machine.

Here is one way of doing that.

Create this file as root.


         >  sudo vi /etc/passwd-s3fs


Put the following contents in the file.  The format of this file is

Access Key ID:Secret Access Key
(separated by a colon, all on one line; remember to use your own keys!)


KAIAA3478XOQQHNYF54A:xdg4x/26cfr9+XqVInnf438Hdd34PjiLzhAi43Dd


Update the permissions on the passwd-s3fs file


         >  sudo chmod 600 /etc/passwd-s3fs



Make a directory to mount to



         >    sudo mkdir /mnt/s3
         >    sudo chmod 777 /mnt/s3



Edit fstab


         >    sudo vi /etc/fstab


Add the following line to the end of the file.  (The https URL gives you a secure connection.)


s3fs#pats-test-bucket /mnt/s3 fuse allow_other,url=https://s3.amazonaws.com 0 0

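If you want the same local caching we used on the command line earlier, the use_cache option can go in the fstab options list as well.  This is just a variation of the line above, not required:


s3fs#pats-test-bucket /mnt/s3 fuse allow_other,use_cache=/tmp,url=https://s3.amazonaws.com 0 0
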

Now run this command


         >    sudo mount /mnt/s3


Then this one to see that it has mounted just fine


         >    ls /mnt/s3/test/

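If everything worked you should see the file we created earlier:


bob_test.txt
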

I had issues in the past auto-mounting EBS drives on EC2, so I created a script to run during startup to mount them.  Here is my solution to mount the S3 drive during the startup process.

Set up a startup script (to mount the drive)


         >  sudo vi /etc/init.d/mountHD


Then place the following in it.


#!/bin/sh
mount /mnt/s3


Make it executable


         >  sudo chmod 755 /etc/init.d/mountHD



Add it to autostart


         >  sudo update-rc.d mountHD defaults

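update-rc.d creates the rc*.d symlinks for you.  If you want to double-check it is registered, something like this should show it (S20 is just the default priority update-rc.d assigns):


         >  ls /etc/rc2.d/ | grep mountHD
         S20mountHD
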

Reboot to test the auto mount of the drive


         >  sudo reboot now



The problem


The big problem with this setup is the keys.  These keys, which were obtained on the Security Credentials page, have access to everything on AWS for your account.  In the wrong hands these keys can wreak havoc, so I for one do not want them on my running EC2 instance.

The good news is that AWS provides a tool for still using keys while giving them limited permissions.  The tool is called Identity and Access Management (IAM): http://aws.amazon.com/iam/ [2]



IAM


This is my first journey into using IAM, so bear with me if I make a few mistakes or can’t fully explain what I am doing. :)


In this example I am going to create a user called test_bob and give that user special permissions that only allow him to read from a specific S3 bucket.




Log into the web console and click on IAM

Click on Users.

Click on Create New Users

Enter the name test_bob and click Create.

Click on Download Credentials

This file will contain something like this. 

"User Name","Access Key Id","Secret Access Key"
"test_bob","AKIAJBFSHWME4UTQDXHQ","knR6B8Slm8sHFZ6URhZtgvwlfzWoVOPRlV6jjON9"


Access Key Id     = AKIAJBFSHWME4UTQDXHQ
Secret Access Key = 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

Click on Close Window

Select the user, then click on Permissions -> Attach User Policy





Scroll down and select “Amazon S3 Read Only Access” and click on Select.

Review it, then click on Apply Policy.  (You can always change it later.)

Here is the full policy



{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:Get*",
        "s3:List*"
      ],
      "Resource": "*"
    }
  ]
}


This policy will allow this user to get and list all documents in all of your S3 buckets.  Let’s keep it that way for now so we can confirm access before we limit it later.

Go back to the EC2 instance and update the keys in /etc/passwd-s3fs


         >  sudo vi /etc/passwd-s3fs


Enter test_bob’s keys

Access Key Id     = AKIAJBFSHWME4UTQDXHQ
Secret Access Key = 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


AKIAJBFSHWME4UTQDXHQ:knR6B8Slm8sHFZ6URhZtgvwlfzWoVOPRlV6jjON9


Reboot (just a simple way to use the new keys)


         >  sudo reboot now

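If you would rather not reboot, unmounting and remounting should also pick up the new keys:


         >  sudo umount /mnt/s3
         >  sudo mount /mnt/s3
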

Now test it out: you should be able to see files within the S3 bucket but not be able to write to them.  This is the result of my attempt to write a file.

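A write attempt should fail with something like the following.  (The file name here is just an example, and the exact error text may vary by s3fs version.)


         >  touch /mnt/s3/test/new_file.txt
         touch: cannot touch `/mnt/s3/test/new_file.txt': Permission denied
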
If you update the policy to the following, then the user can only see files within the listed bucket “pats-test-bucket”:



{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets"
      ],
      "Resource": "arn:aws:s3:::*"
    },
    {
      "Effect": "Allow",
      "Action": [
          "s3:Get*",
          "s3:List*"
      ],
      "Resource": [
          "arn:aws:s3:::pats-test-bucket",
          "arn:aws:s3:::pats-test-bucket/*"
      ]
    }
  ]
}
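
After applying this policy and remounting, pats-test-bucket should still be readable, while listing or mounting any other bucket with test_bob’s keys should now fail with access errors.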




References
[1] s3fs main site. Visited 1/2012.
[2] Installation notes. Visited 11/2012.
[3] Installing FUSE. Visited 11/2012.
[4] Not seeing directories / files unless created via s3fs. Visited 11/2012.
