Monday, March 29, 2010

Introducing cloud-init's cloud-config syntax

Since this is my first post on a brand new blog, *and* I'm fairly new to the Ubuntu community, I figure I should introduce myself. My name is Scott Moser, and I've been a member of the Ubuntu Server team for the past 9 months or so. The majority of my time has been focused on Ubuntu's cloud efforts, both on EC2 and on the Ubuntu Enterprise Cloud (UEC).

That's enough personal introduction; now on to the content.

The cloud-init package provides "first boot" functionality for the Ubuntu UEC images. It is in charge of taking the generic filesystem image that is booting and customizing it for this particular instance. That includes things like:
  • setting the hostname
  • putting the provided ssh public keys into ~ubuntu/.ssh/authorized_keys
  • running a user provided script or otherwise modifying the image
If you have used the Official Ubuntu Images for Hardy or Karmic, you may be aware that the above functionality was previously provided by ec2-init. The cloud-init package is largely a "cloud agnostic" replacement for ec2-init. The AWS-specific portion of the name didn't fit with UEC, and seemed limiting for the future. We hope to have it working on other cloud offerings as well.

Setting hostname and configuring a system so the person who launched it can actually log into it are not terribly interesting. The interesting things that can be done with cloud-init are made possible by data provided at launch time called user-data.

ec2-init, cloud-init, and the Alestic images support customization through user-data in one very simple yet effective manner. If the user-data starts with '#!', then it will be stored and executed as root late in the boot process of the instance's first boot (similar to a traditional 'rc.local' script). Output from the script is directed to the console. For example:

$ cat ud.txt
#!/bin/sh
echo ========== Hello World: $(date) ==========
echo "I have been up for $(cut -d\  -f 1 < /proc/uptime) sec"

$ ec2-run-instances ami-a908e7c0 --key mykey.us-east-1 \
   --user-data-file=ud.txt
# wait now for the system to come up and console to be available

$ ec2-get-console-output i-97fc7afc | grep --after-context=1 Hello
========== Hello World: Mon Mar 29 18:05:05 UTC 2010 ==========
I have been up for 28.26 sec

The simple approach shown above gives a great deal of power. The user-data can contain a script in any language where an interpreter already exists in the image (#!/bin/sh, #!/usr/bin/python, #!/usr/bin/perl, #!/usr/bin/awk ... ).
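Because a mistake in the script is only discovered after an instance has booted, it can help to syntax-check the file locally first. The following is a minimal sketch, assuming the user-data is a POSIX shell script; 'syntax_check' is a hypothetical helper name and not part of cloud-init or the EC2 tools.

```shell
#!/bin/sh
# Sketch: syntax-check a '#!/bin/sh' user-data file locally before launch.
# syntax_check is a hypothetical helper name, not part of cloud-init.
syntax_check() {
    # 'sh -n' reads and parses the commands without executing them.
    sh -n "$1"
}

# Recreate ud.txt from the example above in a temporary file.
ud=$(mktemp)
cat > "$ud" <<'EOF'
#!/bin/sh
echo ========== Hello World: $(date) ==========
echo "I have been up for $(cut -d\  -f 1 < /proc/uptime) sec"
EOF

if syntax_check "$ud"; then
    echo "syntax ok"
else
    echo "syntax error in $ud" >&2
fi
```

This only catches parse errors such as an unterminated 'if'; a mistyped command or package name will still only show up at boot.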

In many cases, the user may not be interested in writing a program. For this case, cloud-init provides "cloud-config", a configuration-based approach to customization. To use the cloud-config syntax, the supplied user-data must start with '#cloud-config'. For example:

$ cat cloud-config.txt
#cloud-config
apt_upgrade: true
apt_sources:
- source: "ppa:smoser/ppa"

packages:
- build-essential
- pastebinit

runcmd:
- echo ======= Hello World =====
- echo "I have been up for $(cut -d\  -f 1 < /proc/uptime) sec"

$ ec2-run-instances ami-a908e7c0 --key mykey.us-east-1 \
   --user-data-file=cloud-config.txt

Now, when the above system is booted, it will have:
  • added my personal ppa
  • run an upgrade to get all updates available
  • installed the 'build-essential' and 'pastebinit' packages
  • printed a similar message to the script above

The 'runcmd' commands are run at the same point in boot that the '#!' script would run in the previous example. It is present to allow you to get the full power of a scripting language if you need it without abandoning cloud-config.
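Which of the two forms is used depends only on how the user-data begins, so a typo in the first line changes how the whole file is treated. As a rough local check, the sketch below reports which form a file would be taken as. It only mirrors the two prefixes described in this post; 'classify_user_data' is a hypothetical helper, not cloud-init's own code.

```shell
#!/bin/sh
# Sketch: report how a user-data file would be treated, based on its first line.
# classify_user_data is a hypothetical helper, not cloud-init's implementation.
classify_user_data() {
    first_line=""
    # read returns non-zero on an empty file or a missing final newline;
    # first_line still holds whatever was read, so the status is ignored.
    IFS= read -r first_line < "$1" || true
    case "$first_line" in
        '#!'*)            echo "script" ;;
        '#cloud-config'*) echo "cloud-config" ;;
        *)                echo "unknown" ;;
    esac
}

# Classify every file named on the command line.
for f in "$@"; do
    echo "$f: $(classify_user_data "$f")"
done
```

Saved under a name of your choosing, it could be run against both ud.txt and cloud-config.txt before calling ec2-run-instances.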

Note that in this case the fairly large amount of output to the console from 'apt-get upgrade' ended up scrolling our 'Hello World' message off the ec2-console buffer, so it didn't appear there. That is something that will need to be addressed in lucid+1.
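Until then, one possible workaround is to have the script keep a copy of its messages in a file on the instance, so they can be read after logging in even if they have scrolled off the console. This is a minimal sketch for a '#!' user-data script; the log path and the 'say' function are illustrative choices, not cloud-init settings.

```shell
#!/bin/sh
# Sketch: keep a copy of first-boot messages in a file as well as on the console.
# LOG and say are illustrative names, not cloud-init settings.
LOG="${LOG:-/tmp/user-data-hello.log}"

say() {
    # tee -a appends to $LOG and still passes the text through to stdout.
    echo "$*" | tee -a "$LOG"
}

say "======= Hello World ====="
say "user-data script finished at $(date)"
```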

For more information on what kinds of things can be done with cloud-config, see doc/examples in the source.

cloud-init supports a couple other formats of user-data which provide more customization possibilities. I hope to write another blog entry covering those other formats soon.

15 comments:

  1. Great first post Scott! I know you did some great work on XC2 commands as well, any chance for you to tell us about it?

  2. Great stuff! I use my own scripts and thought about Puppet. This is more tidy than the former and Puppet seemed like an overkill for my needs.

    Do you know if other distros are going to port this? It'd be convenient to use the same config system everywhere.

  3. Igor,
    Right now cloud-init is both Ubuntu-specific and Ubuntu-only. I agree that it would be wonderful if we had a consistent, cross-OS cloud configuration syntax.

    I'm not at all opposed to cloud-init being ported elsewhere, and would definitely help anyone interested in doing so.

  4. Is there a way to provide the timezone via direct syntax? That would be another thing that often needs to get customized on first boot.

  5. Great work! Very useful: http://blog.topicbranch.net/2010/08/xubuntu-and-neatx-on-ec2.html

    Now I noticed that the runcmd commands are executed immediately. I had hoped they would be executed last after all packages were installed. Where do you accept feature requests?

    Also, what is the "user" option supposed to do? I tried it hoping the default user (id 1000) with sudo rights would get that user name, but it is still ubuntu (which makes it impossible to log in :-)

  6. What is the best way to test user-data scripts?
    If I mistype a package name or something like that, my cloud-init script fails; I fix that and then encounter another error. Is there a better way to give it a trial run than terminating and restarting a new instance each time?

    I like cloud-init's simplicity but I'm thinking of investing some time learning puppet and using cloud-init to bootstrap puppet as it seems easier to test.

  7. @M03hr3,
    There is currently no option in cloud-config to set the timezone. This could be easily added. I went ahead and opened a placeholder bug at https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/645458 .

  8. @igor
    The Amazon Linux Images that were just announced have included cloud-init. Hopefully eventually we could get it into other distributions as well. The Amazon folks did some work to make it less distro specific.

  9. @Mark,
    The 'user' option just indicates which (existing) user will get the ssh keys that the instance was launched with added to their account. You're right that it will fail miserably if there is no such user.

    It might be useful to allow user-data to configure what the UID 1000 user's name is. Care to open a bug/feature request against cloud-init?

  10. @Kurt,
    I just posted a somewhat lengthy response to a similar question on the ec2ubuntu mailing list. See the "Fast way to test user-data" thread at http://groups.google.com/group/ec2ubuntu/browse_thread/thread/d4d51238a2afb55b?pli=1

  11. @Mark, regarding runcmd and ordering. In 10.04 and 10.10, runcmd should run after all packages were installed. This was fixed under bug 613309.

  12. While setting the hostname may not be terribly interesting, it is necessary.

    I looked at all the examples and did not see a means to set the hostname via #cloud-config.

    Thanks

  13. @Phillip,
    What do you mean by setting the hostname? cloud-init should set the hostname to the one provided by EC2's metadata service (i.e., write it to /etc/hostname and call 'hostname').
    If you think that setting the hostname to a user-supplied value would be useful, please open a bug and explain your use case.

  14. Hi, apologies for what might seem an obvious question (on an old thread), to those who know, but I seem unable to find a 'dummies' explanation for 'user-data'. For example, on Ubuntu 11.10 is 'user-data' a single script/file? Or a collection of scripts? And where does it (they) reside? I think I have a handle on running things via the tools from the cli but what if I want a pre-bundled AMI responding to auto-scaling to run a script at boot and update itself from files held in an S3? Is this possible?

    Any advice appreciated, I'm sure I'm just missing a basic concept here so please excuse my ignorance.

    Richard.

  15. What should I do if cloud-init is setting the hostname to an incorrect value?
