stillhq.com : Mikal, a geek from Canberra living in Silicon Valley (http://www.stillhq.com)
The life, times, travel and software of Michael Still. Copyright (c) Michael Still 2000 - 2006.

<b>So you want to setup a Ceph dev environment using OSA</b>
Sat, 27 May 2017 18:30:00 PST
Support for installing and configuring Ceph was added to openstack-ansible in Ocata, so now that I have a need for a Ceph development environment it seemed logical to build one as an openstack-ansible Ocata AIO. There were a few gotchas, so I want to explain the process I used.
<br/><br/>
First off, Ceph is enabled in an openstack-ansible AIO using a thing I've never seen before called a "Scenario". Basically this means that you need to export an environment variable called "SCENARIO" before running the AIO install. Something like this will do the trick:
<br/><br/>
<ul><pre>
export SCENARIO=ceph
</pre></ul>
<br/><br/>
Next you need to set the global pg_num in the ceph role or the install will fail. I did that with this patch:
<br/><br/>
<ul><pre>
--- /etc/ansible/roles/ceph.ceph-common/defaults/main.yml 2017-05-26 08:55:07.803635173 +1000
+++ /etc/ansible/roles/ceph.ceph-common/defaults/main.yml 2017-05-26 08:58:30.417019878 +1000
@@ -338,7 +338,9 @@
 # foo: 1234
 # bar: 5678
 #
-ceph_conf_overrides: {}
+ceph_conf_overrides:
+  global:
+    osd_pool_default_pg_num: 8
 #############
@@ -373,4 +375,4 @@
 # Set this to true to enable File access via NFS. Requires an MDS role.
 nfs_file_gw: true
 # Set this to true to enable Object access via NFS. Requires an RGW role.
-nfs_obj_gw: false
\ No newline at end of file
+nfs_obj_gw: false
</pre></ul>
<br/><br/>
That of course needs to be done after the Ceph role has been fetched, but before it is executed -- in other words, after the AIO bootstrap but before the install.
<br/><br/>
And that was about it (although of course it took a fair while to work out). I have this automated in my little install helper thing, so I'll never need to think about it again, which is nice.
<br/><br/>
Once Ceph is installed, you interact with it via the monitor container, not the utility container, which is a bit odd. That said, all you really need is the Ceph config file and the Ceph utilities, so you could move those elsewhere.
<br/><br/>
<ul><pre>
root@labosa:/etc/openstack_deploy# <b>lxc-attach -n aio1_ceph-mon_container-a3d8b8b1</b>
root@aio1-ceph-mon-container-a3d8b8b1:/# <b>ceph -s</b>
    cluster 24424319-b5e9-49d2-a57a-6087ab7f45bd
     health HEALTH_OK
     monmap e1: 1 mons at {aio1-ceph-mon-container-a3d8b8b1=172.29.239.114:6789/0}
            election epoch 3, quorum 0 aio1-ceph-mon-container-a3d8b8b1
     osdmap e20: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v36: 40 pgs, 5 pools, 0 bytes data, 0 objects
            102156 kB used, 3070 GB / 3070 GB avail
                  40 active+clean
root@aio1-ceph-mon-container-a3d8b8b1:/# <b>ceph osd tree</b>
ID WEIGHT  TYPE NAME        UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 2.99817 root default
-2 2.99817     host labosa
 0 0.99939         osd.0         up  1.00000          1.00000
 1 0.99939         osd.1         up  1.00000          1.00000
 2 0.99939         osd.2         up  1.00000          1.00000
</pre></ul>
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/osa.html">osa</a> <a href="http://www.stillhq.com/tags/ceph.html">ceph</a> <a href="http://www.stillhq.com/tags/openstack-ansible.html">openstack-ansible</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/docker/000001.html">Configuring docker to use rexray and Ceph for persistent storage</a></i>
<a href="http://www.stillhq.com/openstack/osa/000001.commentform.html">Comment</a>
http://www.stillhq.com/openstack/osa/000001.html
<b>Things I read today: the best description I've seen of metadata routing in neutron</b>
Sun, 07 May 2017 17:52:00 PST
I happened upon a thread about OVN's proposal for how to handle nova metadata traffic, which linked to <a href="https://www.suse.com/communities/blog/vms-get-access-metadata-neutron/">this very good Suse blog post about how metadata traffic is routed in neutron</a>. I'm just adding the link here because I think it will be useful to others. The <a href="https://review.openstack.org/#/c/452811/6/doc/source/design/metadata_api.rst">OVN proposal</a> is also an interesting read.
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a> <a href="http://www.stillhq.com/tags/neutron.html">neutron</a> <a href="http://www.stillhq.com/tags/metadata.html">metadata</a> <a href="http://www.stillhq.com/tags/ovn.html">ovn</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>; <a href="http://www.stillhq.com/openstack/000022.html">Nova vendordata deployment, an excessively detailed guide</a>; <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/kilo/000005.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/juno/000001.html">Juno Nova PTL Candidacy</a>; <a href="http://www.stillhq.com/openstack/juno/000012.html">Juno nova mid-cycle meetup summary: scheduler</a></i>
<a href="http://www.stillhq.com/openstack/000023.commentform.html">Comment</a>
http://www.stillhq.com/openstack/000023.html
<b>Nova vendordata deployment, an excessively detailed guide</b>
Thu, 02 Feb 2017 19:49:00 PST
Nova presents configuration information to instances it starts via a mechanism
called metadata. This metadata is made available via either a configdrive, or
the metadata service. These mechanisms are widely used via helpers such as
cloud-init to specify things like the root password the instance should use.
There are three separate groups of people who need to be able to specify
metadata for an instance.
<br/><br/>
<b>User provided data</b>
<br/><br/>
The user who booted the instance can pass metadata to the instance in several
ways. For authentication keypairs, the keypairs functionality of the Nova APIs
can be used to upload a key and then specify that key during the Nova boot API
request. For less structured data, a small opaque blob of data may be passed
via the user-data feature of the Nova API. Examples of such unstructured data
would be the puppet role that the instance should use, or the HTTP address of a
server to fetch post-boot configuration information from.
<br/><br/>
<b>Nova provided data</b>
<br/><br/>
Nova itself needs to pass information to the instance via its internal
implementation of the metadata system. Such information includes the network
configuration for the instance, as well as the requested hostname for the
instance. This happens by default and requires no configuration by the user or
deployer.
<br/><br/>
<b>Deployer provided data</b>
<br/><br/>
There is however a third type of data. It is possible that the deployer of
OpenStack needs to pass data to an instance. It is also possible that this data
is not known to the user starting the instance. An example might be a
cryptographic token to be used to register the instance with Active Directory
post boot -- the user starting the instance should not have access to Active
Directory to create this token, but the Nova deployment might have permissions
to generate the token on the user's behalf.
<br/><br/>
Nova supports a mechanism to add "vendordata" to the metadata handed to
instances. This is done by loading named modules, which must appear in the nova
source code. We provide two such modules:
<br/><br/>
<ul>
<li>StaticJSON: a module which can include the contents of a static JSON file loaded from disk. This can be used for things which don't change between instances, such as the location of the corporate puppet server.
<li>DynamicJSON: a module which will make a request to an external REST service to determine what metadata to add to an instance. This is how we recommend you generate things like Active Directory tokens which change per instance.
</ul>
<br/><br/>
<b>Tell me more about DynamicJSON</b>
<br/><br/>
Having said all that, this post is about how to configure the DynamicJSON plugin, as I think it's the most interesting bit here.
<br/><br/>
To use DynamicJSON, you configure it like this:
<br/><br/>
<ul>
<li>Add "DynamicJSON" to the vendordata_providers configuration option. This can also include "StaticJSON" if you'd like.
<li>Specify the REST services to be contacted to generate metadata in the vendordata_dynamic_targets configuration option. There can be more than one of these, but note that they will be queried once per metadata request from the instance, which can mean a fair bit of traffic depending on your configuration and the configuration of the instance.
</ul>
<br/><br/>
The format for an entry in vendordata_dynamic_targets is like this:
<br/><br/>
<pre>
&lt;name&gt;@&lt;url&gt;
</pre>
<br/><br/>
Where name is a short string not including the '@' character, and where the
URL can include a port number if so required. An example would be:
<br/><br/>
<pre>
testing@http://127.0.0.1:125
</pre>
<br/><br/>
Metadata fetched from this target will appear in the metadata service at a
new file called vendordata2.json, with a path (either in the metadata service
URL or in the configdrive) like this:
<br/><br/>
<pre>
openstack/2016-10-06/vendor_data2.json
</pre>
<br/><br/>
For each dynamic target, there will be an entry in the JSON file named after
that target. For example:
<br/><br/>
<pre>
{
    "testing": {
        "value1": 1,
        "value2": 2,
        "value3": "three"
    }
}
</pre>
<br/><br/>
Do not specify the same name more than once. If you do, we will ignore
subsequent uses of a previously used name.
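<br/><br/>
The target parsing rules above can be sketched in python (a hypothetical helper of my own, not nova's actual implementation): split each entry at the first '@', since the name must not contain that character, and honour only the first use of each name.

```python
def parse_targets(entries):
    """Parse vendordata_dynamic_targets entries of the form <name>@<url>."""
    targets = {}
    for entry in entries:
        # Split at the first '@'; the URL after it may itself contain ':' or '/'.
        name, sep, url = entry.partition('@')
        if not sep or not name:
            continue  # malformed entry with no '@' separator
        if name in targets:
            continue  # subsequent uses of a previously used name are ignored
        targets[name] = url
    return targets

print(parse_targets(['testing@http://127.0.0.1:125']))
# {'testing': 'http://127.0.0.1:125'}
```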
<br/><br/>
The following data is passed to your REST service as a JSON encoded POST:
<br/><br/>
<ul>
<li>project-id: the UUID of the project that owns the instance
<li>instance-id: the UUID of the instance
<li>image-id: the UUID of the image used to boot this instance
<li>user-data: as specified by the user at boot time
<li>hostname: the hostname of the instance
<li>metadata: as specified by the user at boot time
</ul>
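<br/><br/>
As a sketch, a vendordata service's handler just consumes that JSON POST and returns the metadata to add. Everything below (the function name, the reply fields, the placeholder values) is my own illustration, not part of the nova API; a real service would mint per-instance secrets here, such as an Active Directory token.

```python
import json

def handle_request(body):
    """Consume the JSON POST nova sends and return metadata for the instance."""
    request = json.loads(body)
    return json.dumps({
        'instance': request['instance-id'],
        'hostname_seen': request['hostname'],
    })

# An example POST body with the fields listed above (values are placeholders).
example_post = json.dumps({
    'project-id': 'project-uuid',
    'instance-id': 'instance-uuid',
    'image-id': 'image-uuid',
    'user-data': None,
    'hostname': 'foo',
    'metadata': {},
})
print(handle_request(example_post))
```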
<br/><br/>
<b>Deployment considerations</b>
<br/><br/>
Nova provides authentication to external metadata services in order to provide
some level of certainty that the request came from nova. This is done by
providing a service token with the request -- you can then just deploy your
metadata service with the keystone authentication WSGI middleware. This is
configured using the keystone authentication parameters in the
vendordata_dynamic_auth configuration group.
<br/><br/>
This behavior is optional, however: if you do not configure a service user, nova will not authenticate with the external metadata service.
<br/><br/>
<b>Deploying the sample vendordata service</b>
<br/><br/>
There is a sample vendordata service that is meant to model what a deployer would use for their custom metadata at <a href="http://github.com/mikalstill/vendordata">http://github.com/mikalstill/vendordata</a>. Deploying that service is relatively simple:
<br/><br/>
<pre>
$ git clone http://github.com/mikalstill/vendordata
$ cd vendordata
$ apt-get install virtualenvwrapper
$ . /etc/bash_completion.d/virtualenvwrapper <i>(only needed if virtualenvwrapper wasn't already installed)</i>
$ mkvirtualenv vendordata
$ pip install -r requirements.txt
</pre>
<br/><br/>
We need to configure the keystone WSGI middleware to authenticate against the right keystone service. There is a sample configuration file in git, but it's configured to work with an openstack-ansible all-in-one install that I set up for my private testing, which probably isn't what you're using:
<br/><br/>
<pre>
[keystone_authtoken]
insecure = False
auth_plugin = password
auth_url = http://172.29.236.100:35357
auth_uri = http://172.29.236.100:5000
project_domain_id = default
user_domain_id = default
project_name = service
username = nova
password = 5dff06ac0c43685de108cc799300ba36dfaf29e4
region_name = RegionOne
</pre>
<br/><br/>
Per the README file in the vendordata sample repository, you can test the vendordata server standalone by generating a token manually from keystone:
<br/><br/>
<pre>
$ curl -d @credentials.json -H "Content-Type: application/json" http://172.29.236.100:5000/v2.0/tokens &gt; token.json
$ token=`cat token.json | python -c "import sys, json; print json.loads(sys.stdin.read())['access']['token']['id'];"`
</pre>
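<br/><br/>
The python 2 one-liner above can also be written as a small python 3 helper. The ['access']['token']['id'] path is the keystone v2.0 tokens response layout that the curl command relies on; the function name is just mine.

```python
import json

def extract_token(token_json):
    """Pull the token id out of a keystone v2.0 tokens response."""
    return json.loads(token_json)['access']['token']['id']

# Usage, assuming token.json was fetched as above:
#   token = extract_token(open('token.json').read())
```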
<br/><br/>
We then include that token in a test request to the vendordata service:
<br/><br/>
<pre>
curl -H "X-Auth-Token: $token" http://127.0.0.1:8888/
</pre>
<br/><br/>
<b>Configuring nova to use the external metadata service</b>
<br/><br/>
Now we're ready to wire up the sample metadata service with nova. You do that by adding something like this to the nova.conf configuration file:
<br/><br/>
<pre>
[api]
vendordata_providers=DynamicJSON
vendordata_dynamic_targets=testing@http://metadatathingie.example.com:8888
</pre>
<br/><br/>
Where <i>metadatathingie.example.com</i> is the IP address or hostname of the server running the external metadata service. Now if we boot an instance like this:
<br/><br/>
<pre>
nova boot --image 2f6e96ca-9f58-4832-9136-21ed6c1e3b1f --flavor tempest1 --nic net-name=public --config-drive true foo
</pre>
<br/><br/>
We end up with a config drive which contains the information our external metadata service returned (in this example, handy Carrie Fisher quotes):
<br/><br/>
<pre>
# cat openstack/latest/vendor_data2.json | python -m json.tool
{
    "testing": {
        "carrie_says": "I really love the internet. They say chat-rooms are the trailer park of the internet but I find it amazing."
    }
}
</pre>
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a> <a href="http://www.stillhq.com/tags/metadata.html">metadata</a> <a href="http://www.stillhq.com/tags/vendordata.html">vendordata</a> <a href="http://www.stillhq.com/tags/configdrive.html">configdrive</a> <a href="http://www.stillhq.com/tags/cloud-init.html">cloud-init</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/000023.html">Things I read today: the best description I've seen of metadata routing in neutron</a>; <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/kilo/000005.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>; <a href="http://www.stillhq.com/openstack/juno/000001.html">Juno Nova PTL Candidacy</a>; <a href="http://www.stillhq.com/openstack/juno/000012.html">Juno nova mid-cycle meetup summary: scheduler</a></i>
<a href="http://www.stillhq.com/openstack/000022.commentform.html">Comment</a>
http://www.stillhq.com/openstack/000022.html
http://www.stillhq.com/openstack/000022.htmlSydney Developer Bugs Smash/openstack/mitakaSun, 14 Feb 2016 14:00:00 PSTThe OpenStack community is arranging a series of bug smash events globally, with one in Sydney. These events are aimed at closing bugs related to enterprise pain points in OpenStack, although as self guided events there isn't anyone in the room ordering you to do a certain thing. There will however be no presentations -- this is a group working session.
<br/><br/>
The global event etherpad is at <a href="https://etherpad.openstack.org/p/OpenStack-Bug-Smash-Mitaka">https://etherpad.openstack.org/p/OpenStack-Bug-Smash-Mitaka</a>.
<br/><br/>
The Sydney event is being hosted by Rackspace Australia, and has its own signup etherpad at <a href="https://etherpad.openstack.org/p/OpenStack-Bug-Smash-Mitaka-Sydney">https://etherpad.openstack.org/p/OpenStack-Bug-Smash-Mitaka-Sydney</a>.
<br/><br/>
Please note this event is not aimed at end users, deployers or administrators. It is aimed at developers of OpenStack. So, if you're an OpenStack developer please consider coming along!
<br/><br/>
RSVP is on the Sydney event etherpad.
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/mitaka.html">mitaka</a></i>
<a href="http://www.stillhq.com/openstack/mitaka/000001.commentform.html">Comment</a>
http://www.stillhq.com/openstack/mitaka/000001.html
<b>How we got to test_init_instance_retries_reboot_pending_soft_became_hard</b>
Wed, 23 Sep 2015 23:30:00 PST
I've been asked some questions about <a href="https://review.openstack.org/#/c/219980">a recent change to nova that I am responsible for</a>, and I thought it would be easier to address those in this format than trying to explain what's happening in IRC. That way whenever someone compliments me on possibly the longest unit test name ever written, I can point them here.
<br/><br/>
Let's start with some definitions. What is the difference between a soft reboot and a hard reboot in Nova? The short answer is that a soft reboot gives the operating system running in the instance an opportunity to respond to an ACPI power event gracefully before the rug is pulled out from under the instance, whereas a hard reboot just punches the instance in the face immediately.
<br/><br/>
There is a bit more complexity than that of course, because this is OpenStack. A hard reboot also re-fetches image meta-data, rebuilds the XML description of the instance that we hand to libvirt, and re-populates any missing backing files. It then ensures that the networking is configured correctly and boots the instance again. In other words, a hard reboot is kind of like an initial instance boot, in that it makes fewer assumptions about how much you can trust the current state of the instance on the hypervisor node. Finally, a soft reboot which fails (probably because the instance operating system didn't respond to the ACPI event in a timely manner) is turned into a hard reboot after libvirt.wait_soft_reboot_seconds. So, we already perform hard reboots when a user asked for a soft reboot in certain error cases.
<br/><br/>
It's important to note that the actual reboot mechanism is similar though -- it's just how patient we are and what side effects we create that change. In libvirt, both end up as a shutdown of the virtual machine followed by a startup.
<br/><br/>
<a href="https://launchpad.net/bugs/1072751">Bug 1072751</a> reported an interesting edge case with a soft reboot though. If nova-compute crashes after shutting down the virtual machine, but before the virtual machine is started again, then the instance is left in an inconsistent state. We can demonstrate this with a devstack installation:
<br/><br/>
<pre><ul>
<b>Setup the right version of nova</b>
cd /opt/stack/nova
git checkout dc6942c1218279097cda98bb5ebe4f273720115d
<b>Patch nova so it crashes on a soft reboot</b>
cat - &gt; /tmp/patch &lt;&lt;EOF
> diff --git a/nova/virt/libvirt/driver.py b/nova/virt/libvirt/driver.py
> index ce19f22..6c565be 100644
> --- a/nova/virt/libvirt/driver.py
> +++ b/nova/virt/libvirt/driver.py
> @@ -34,6 +34,7 @@ import itertools
> import mmap
> import operator
> import os
> +import sys
> import shutil
> import tempfile
> import time
> @@ -2082,6 +2083,10 @@ class LibvirtDriver(driver.ComputeDriver):
> # is already shutdown.
> if state == power_state.RUNNING:
> dom.shutdown()
> +
> + # NOTE(mikal): temporarily crash
> + sys.exit(1)
> +
> # NOTE(vish): This actually could take slightly longer than the
> # FLAG defines depending on how long the get_info
> # call takes to return.
> EOF
patch -p1 &lt; /tmp/patch
<i>...now restart nova-compute inside devstack to make sure you're running
the patched version...</i>
<b>Boot a victim instance</b>
cd ~/devstack
source openrc admin
glance image-list
nova boot --image=cirros-0.3.4-x86_64-uec --flavor=1 foo
<b>Soft reboot, and verify it's gone</b>
nova list
nova reboot cacf99de-117d-4ab7-bd12-32cc2265e906
sudo virsh list
<i>...virsh list should now show no virtual machines running as nova-compute
crashed before it could start the instance again. However, nova-api knows that
the instance should be rebooting...</i>
$ nova list
+--------------------------------------+------+--------+----------------+-------------+------------------+
| ID                                   | Name | Status | Task State     | Power State | Networks         |
+--------------------------------------+------+--------+----------------+-------------+------------------+
| cacf99de-117d-4ab7-bd12-32cc2265e906 | foo  | REBOOT | reboot_started | Running     | private=10.0.0.3 |
+--------------------------------------+------+--------+----------------+-------------+------------------+
<i>...now start nova-compute again, nova-compute detects the missing
instance on boot, and tries to start it up again...</i>
sg libvirtd '/usr/local/bin/nova-compute --config-file /etc/nova/nova.conf' \
> & echo $! >/opt/stack/status/stack/n-cpu.pid; fg || \
> echo "n-cpu failed to start" | tee "/opt/stack/status/stack/n-cpu.failure"
[...snip...]
Traceback (most recent call last):
  File "/opt/stack/nova/nova/conductor/manager.py", line 444, in _object_dispatch
    return getattr(target, method)(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 213, in wrapper
    return fn(self, *args, **kwargs)
  File "/opt/stack/nova/nova/objects/instance.py", line 728, in save
    columns_to_join=_expected_cols(expected_attrs))
  File "/opt/stack/nova/nova/db/api.py", line 764, in instance_update_and_get_original
    expected=expected)
  File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 216, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 146, in wrapper
    ectxt.value = e.inner_exc
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 195, in __exit__
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 136, in wrapper
    return f(*args, **kwargs)
  File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 2464, in instance_update_and_get_original
    expected, original=instance_ref))
  File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 2602, in _instance_update
    raise exc(**exc_props)
UnexpectedTaskStateError: Conflict updating instance cacf99de-117d-4ab7-bd12-32cc2265e906.
Expected: {'task_state': [u'rebooting_hard', u'reboot_pending_hard', u'reboot_started_hard']}.
Actual: {'task_state': u'reboot_started'}
</ul></pre>
<br/><br/>
So what happened here? This is a bit confusing because we asked for a soft reboot of the instance, but the error we are seeing is that a hard reboot was attempted: we are trying to update an instance object, and all the task states we expect the instance to be in relate to a hard reboot, but the task state we are actually in is for a soft reboot.
<br/><br/>
We need to take a tour of the compute manager code to understand what happened here. nova-compute is implemented at nova/compute/manager.py in the nova code base. Specifically, ComputeVirtAPI.init_host() sets up the service to start handling compute requests for a specific hypervisor node. As part of startup, this method calls ComputeVirtAPI._init_instance() once per instance on the hypervisor node. This method tries to do some sanity checking for each instance that nova thinks should be on the hypervisor:
<br/><br/>
<ul>
<li>Detecting if the instance was part of a failed evacuation.
<li>Detecting instances that are soft deleted, deleting, or in an error state and ignoring them apart from a log message.
<li>Detecting instances which we think are fully deleted but aren't in fact gone.
<li>Moving instances we thought were booting, but which never completed into an error state. This happens if nova-compute crashes during the instance startup process.
<li>Similarly, instances which were rebuilding are moved to an error state as well.
<li>Clearing the task state for uncompleted tasks like snapshots or preparing for resize.
<li>Finishing the deletion of instances which were partially deleted the last time we saw them.
<li>And finally, trying to reboot instances which should be running but aren't.
</ul>
<br/><br/>
It is this final state which is relevant in this case -- we think the instance should be running and its not, so we're going to reboot it. We do that by calling ComputeVirtAPI.reboot_instance(). The code which does this work looks like this:
<br/><br/>
<pre><ul>
try_reboot, reboot_type = self._retry_reboot(context, instance)
current_power_state = self._get_power_state(context, instance)

if try_reboot:
    LOG.debug("Instance in transitional state (%(task_state)s) at "
              "start-up and power state is (%(power_state)s), "
              "triggering reboot",
              {'task_state': instance.task_state,
               'power_state': current_power_state},
              instance=instance)
    self.reboot_instance(context, instance, block_device_info=None,
                         reboot_type=reboot_type)
    return

[...snip...]

def _retry_reboot(self, context, instance):
    current_power_state = self._get_power_state(context, instance)
    current_task_state = instance.task_state
    retry_reboot = False
    reboot_type = compute_utils.get_reboot_type(current_task_state,
                                                current_power_state)

    pending_soft = (current_task_state == task_states.REBOOT_PENDING and
                    instance.vm_state in vm_states.ALLOW_SOFT_REBOOT)
    pending_hard = (current_task_state == task_states.REBOOT_PENDING_HARD
                    and instance.vm_state in vm_states.ALLOW_HARD_REBOOT)
    started_not_running = (current_task_state in
                           [task_states.REBOOT_STARTED,
                            task_states.REBOOT_STARTED_HARD] and
                           current_power_state != power_state.RUNNING)

    if pending_soft or pending_hard or started_not_running:
        retry_reboot = True

    return retry_reboot, reboot_type
</ul></pre>
<br/><br/>
So, we ask ComputeVirtAPI._retry_reboot() if a reboot is required, and if so what type. ComputeVirtAPI._retry_reboot() just uses nova.compute.utils.get_reboot_type() (aliased as compute_utils.get_reboot_type) to determine what type of reboot to use. This is the crux of the matter. Read on for a surprising discovery!
<br/><br/>
nova.compute.utils.get_reboot_type() looks like this:
<br/><br/>
<pre><ul>
def get_reboot_type(task_state, current_power_state):
    """Checks if the current instance state requires a HARD reboot."""
    if current_power_state != power_state.RUNNING:
        return 'HARD'
    soft_types = [task_states.REBOOT_STARTED, task_states.REBOOT_PENDING,
                  task_states.REBOOTING]
    reboot_type = 'SOFT' if task_state in soft_types else 'HARD'
    return reboot_type
</ul></pre>
<br/><br/>
So, after all that it comes down to this: if the instance isn't running, then it's a hard reboot. In our case, we shut down the instance but haven't started it yet, so it's not running. This will therefore be a hard reboot. This is where our problem lies -- we chose a hard reboot. The code doesn't blow up until later though -- when we try to do the reboot itself.
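<br/><br/>
You can see this decision in miniature with stand-in constants. The real values live in nova.compute.task_states and nova.compute.power_state; this is just the logic above extracted so it can be run in isolation.

```python
# Stand-in constants; nova's real constants carry different values.
class power_state(object):
    RUNNING = 1
    SHUTDOWN = 4

class task_states(object):
    REBOOTING = 'rebooting'
    REBOOT_PENDING = 'reboot_pending'
    REBOOT_STARTED = 'reboot_started'

def get_reboot_type(task_state, current_power_state):
    """Checks if the current instance state requires a HARD reboot."""
    if current_power_state != power_state.RUNNING:
        return 'HARD'
    soft_types = [task_states.REBOOT_STARTED, task_states.REBOOT_PENDING,
                  task_states.REBOOTING]
    return 'SOFT' if task_state in soft_types else 'HARD'

# Our crash case: the task state records a soft reboot, but the instance
# is not running, so a hard reboot is chosen.
print(get_reboot_type(task_states.REBOOT_STARTED, power_state.SHUTDOWN))  # HARD
```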
<br/><br/>
<pre><ul>
@wrap_exception()
@reverts_task_state
@wrap_instance_event
@wrap_instance_fault
def reboot_instance(self, context, instance, block_device_info,
                    reboot_type):
    """Reboot an instance on this host."""
    # acknowledge the request made it to the manager
    if reboot_type == "SOFT":
        instance.task_state = task_states.REBOOT_PENDING
        expected_states = (task_states.REBOOTING,
                           task_states.REBOOT_PENDING,
                           task_states.REBOOT_STARTED)
    else:
        instance.task_state = task_states.REBOOT_PENDING_HARD
        expected_states = (task_states.REBOOTING_HARD,
                           task_states.REBOOT_PENDING_HARD,
                           task_states.REBOOT_STARTED_HARD)

    context = context.elevated()
    LOG.info(_LI("Rebooting instance"), context=context, instance=instance)

    block_device_info = self._get_instance_block_device_info(context,
                                                             instance)
    network_info = self.network_api.get_instance_nw_info(context, instance)

    self._notify_about_instance_usage(context, instance, "reboot.start")

    instance.power_state = self._get_power_state(context, instance)
    instance.save(expected_task_state=expected_states)

[...snip...]
</ul></pre>
<br/><br/>
And there's our problem. We have a reboot_type of HARD, which means we set the expected_states to those matching a hard reboot. However, the state the instance is actually in will be one correlating to a soft reboot, because that's what the user requested. We therefore experience an exception when we try to save our changes to the instance. This is the exception we saw above.
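<br/><br/>
A toy reproduction of that conflict, with plain strings standing in for nova's task state constants and a bare function standing in for the instance object's save():

```python
def save(current_task_state, expected_task_states):
    """Raise, as instance.save() does, if the task state isn't expected."""
    if current_task_state not in expected_task_states:
        raise RuntimeError(
            'Conflict updating instance. Expected: %s. Actual: %s'
            % (list(expected_task_states), current_task_state))

# The hard reboot path sets hard expected states...
hard_expected = ('rebooting_hard', 'reboot_pending_hard', 'reboot_started_hard')

# ...but the instance still carries the soft reboot task state.
try:
    save('reboot_started', hard_expected)
except RuntimeError as e:
    print(e)
```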
<br/><br/>
The fix in <a href="https://review.openstack.org/#/c/219980/1/nova/compute/manager.py,cm">my patch</a> is simply to change the current task state for an instance in this situation to one matching a hard reboot. It all just works then.
<br/><br/>
So why do we decide to use a hard reboot if the current power state is not RUNNING? This code was introduced in <a href="https://review.openstack.org/#/c/57967/">this patch</a> and there isn't much discussion in the review comments as to why a hard reboot is the right choice here. That said, we already fall back to a hard reboot in error cases of a soft reboot inside the libvirt driver, and a hard reboot requires less trust of the surrounding state for the instance (block device mappings, networks and all those side effects mentioned at the very beginning), so I think it is the right call.
<br/><br/>
In conclusion, we use a hard reboot for soft reboots that fail, and a nova-compute crash during a soft reboot counts as one of those failure cases. So, when nova-compute detects a failed soft reboot, it converts it to a hard reboot and tries again.
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/reboot.html">reboot</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a> <a href="http://www.stillhq.com/tags/nova-compute.html">nova-compute</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/kilo/000005.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>; <a href="http://www.stillhq.com/openstack/juno/000001.html">Juno Nova PTL Candidacy</a>; <a href="http://www.stillhq.com/openstack/juno/000012.html">Juno nova mid-cycle meetup summary: scheduler</a>; <a href="http://www.stillhq.com/openstack/juno/000007.html">Juno nova mid-cycle meetup summary: ironic</a></i>
<a href="http://www.stillhq.com/openstack/000021.commentform.html">Comment</a>
http://www.stillhq.com/openstack/000021.html
<b>The linux.conf.au 2016 Call For Proposals is open!</b>
Sun, 31 May 2015 22:44:00 PST
The OpenStack community has been well represented at linux.conf.au over the last few years, which I think is reflective of both the growing level of interest in OpenStack in the general Linux community, as well as the fact that OpenStack is one of the largest Python projects around these days. linux.conf.au is one of the region's biggest Open Source conferences, and has a solid reputation for deep technical content.
<br/><br/>
It's time to make it all happen again, with the <a href="http://lca2016.linux.org.au/cfp">linux.conf.au 2016 Call For Proposals</a> opening today! I'm especially keen to encourage talk proposals which are somehow more than introductions to various components of OpenStack. It's time to talk in detail about how people's networking deployments work, what container solutions we're using, and how we're deploying OpenStack in the real world to do seriously cool stuff.
<br/><br/>
The conference is in the first week of February in Geelong, Australia. I'd be happy to chat with anyone who has questions about the CFP process.
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/conference.html">conference</a> <a href="http://www.stillhq.com/tags/linux.conf.au.html">linux.conf.au</a> <a href="http://www.stillhq.com/tags/lca2016.html">lca2016</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/conference/lca2007/000003.html">LCA 2007 Video: CFQ IO</a>; <a href="http://www.stillhq.com/linux/conference/opensource/lca2006/000003.html">LCA 2006: CFP closes today</a>; <a href="http://www.stillhq.com/presentations/000019.html">I just noticed...</a>; <a href="http://www.stillhq.com/linux/conference/opensource/lca2006/000001.html">LCA2006 -- CFP opens soon!</a>; <a href="http://www.stillhq.com/mythtv/000015.html">I just noticed...</a>; <a href="http://www.stillhq.com/mythtv/tutorial/lca2007/000002.html">Updated: linux.conf.au 2007 MythTV tutorial homework</a></i>
<a href="http://www.stillhq.com/openstack/000020.commentform.html">Comment</a>
http://www.stillhq.com/openstack/000020.html
http://www.stillhq.com/openstack/000020.htmlAnother Nova spec update/openstack/kiloThu, 15 Jan 2015 19:16:00 PSTI started chasing down the list of spec freeze exceptions that had been requested, and that resulted in the list of specs for Kilo being updated. That updated list is below, but I'll do a separate post with the exception requests highlighted soon as well.
<br/><br/><b>API</b><br/><br/>
<ul>
<li>Add more detailed network information to the metadata server: <a href="https://review.openstack.org/#/c/85673">review 85673</a> <b>(approved)</b>.
<li>Add separated policy rule for each v2.1 api: <a href="https://review.openstack.org/#/c/127863">review 127863</a> <b>(requested a spec exception)</b>.
<li>Add user limits to the limits API (as well as project limits): <a href="https://review.openstack.org/#/c/127094">review 127094</a>.
<li>Allow all printable characters in resource names: <a href="https://review.openstack.org/#/c/126696">review 126696</a> <b>(approved)</b>.
<li>Consolidate all console access APIs into one: <a href="https://review.openstack.org/#/c/141065">review 141065</a> <b>(approved)</b>.
<li>Expose the lock status of an instance as a queryable item: <a href="https://review.openstack.org/#/c/127139">review 127139</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/85928">review 85928</a> <b>(approved)</b>.
<li>Extend api to allow specifying vnic_type: <a href="https://review.openstack.org/#/c/138808">review 138808</a> <b>(requested a spec exception)</b>.
<li>Implement instance tagging: <a href="https://review.openstack.org/#/c/127281">review 127281</a> <b>(fast tracked, approved)</b>.
<li>Implement the v2.1 API: <a href="https://review.openstack.org/#/c/126452">review 126452</a> <b>(fast tracked, approved)</b>.
<li>Improve the return codes for the instance lock APIs: <a href="https://review.openstack.org/#/c/135506">review 135506</a>.
<li>Microversion support: <a href="https://review.openstack.org/#/c/127127">review 127127</a> <b>(approved)</b>.
<li>Move policy validation to just the API layer: <a href="https://review.openstack.org/#/c/127160">review 127160</a> <b>(approved)</b>.
<li>Nova Server Count API Extension: <a href="https://review.openstack.org/#/c/134279">review 134279</a> <b>(fast tracked)</b>.
<li>Provide a policy statement on the goals of our API policies: <a href="https://review.openstack.org/#/c/128560">review 128560</a> <b>(abandoned)</b>.
<li>Sorting enhancements: <a href="https://review.openstack.org/#/c/131868">review 131868</a> <b>(fast tracked, approved, implemented)</b>.
<li>Support JSON-Home for API extension discovery: <a href="https://review.openstack.org/#/c/130715">review 130715</a> <b>(requested a spec exception)</b>.
<li>Support X509 keypairs: <a href="https://review.openstack.org/#/c/105034">review 105034</a> <b>(approved)</b>.
</ul>
<br/><br/><b>API (EC2)</b><br/><br/>
<ul>
<li>Expand support for volume filtering in the EC2 API: <a href="https://review.openstack.org/#/c/104450">review 104450</a>.
<li>Implement tags for volumes and snapshots with the EC2 API: <a href="https://review.openstack.org/#/c/126553">review 126553</a> <b>(fast tracked, approved)</b>.
</ul>
<br/><br/><b>Administrative</b><br/><br/>
<ul>
<li>Actively hunt for orphan instances and remove them: <a href="https://review.openstack.org/#/c/137996">review 137996</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/138627">review 138627</a>.
<li>Add totalSecurityGroupRulesUsed to the quota limits: <a href="https://review.openstack.org/#/c/145689">review 145689</a>.
<li>Check that a service isn't running before deleting it: <a href="https://review.openstack.org/#/c/131633">review 131633</a>.
<li>Enable the nova metadata cache to be a shared resource to improve the hit rate: <a href="https://review.openstack.org/#/c/126705">review 126705</a> <b>(abandoned)</b>.
<li>Implement a daemon version of rootwrap: <a href="https://review.openstack.org/#/c/105404">review 105404</a> <b>(requested a spec exception)</b>.
<li>Log request id mappings: <a href="https://review.openstack.org/#/c/132819">review 132819</a> <b>(fast tracked)</b>.
<li>Monitor the health of hypervisor hosts: <a href="https://review.openstack.org/#/c/137768">review 137768</a>.
<li>Remove the assumption that there is a single endpoint for services that nova talks to: <a href="https://review.openstack.org/#/c/132623">review 132623</a>.
</ul>
<br/><br/><b>Block Storage</b><br/><br/>
<ul>
<li>Allow direct access to LVM volumes if supported by Cinder: <a href="https://review.openstack.org/#/c/127318">review 127318</a>.
<li>Cache data from volumes on local disk: <a href="https://review.openstack.org/#/c/138292">review 138292</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/138619">review 138619</a>.
<li>Enhance iSCSI volume multipath support: <a href="https://review.openstack.org/#/c/134299">review 134299</a> <b>(requested a spec exception)</b>.
<li>Failover to alternative iSCSI portals on login failure: <a href="https://review.openstack.org/#/c/137468">review 137468</a> <b>(requested a spec exception)</b>.
<li>Give additional info in BDM when source type is "blank": <a href="https://review.openstack.org/#/c/140133">review 140133</a>.
<li>Implement support for a DRBD driver for Cinder block device access: <a href="https://review.openstack.org/#/c/134153">review 134153</a> <b>(requested a spec exception)</b>.
<li>Poll volume status: <a href="https://review.openstack.org/#/c/142828">review 142828</a> <b>(abandoned)</b>.
<li>Refactor ISCSIDriver to support other iSCSI transports besides TCP: <a href="https://review.openstack.org/#/c/130721">review 130721</a> <b>(approved)</b>.
<li>StorPool volume attachment support: <a href="https://review.openstack.org/#/c/115716">review 115716</a> <b>(approved, requested a spec exception)</b>.
<li>Support Cinder Volume Multi-attach: <a href="https://review.openstack.org/#/c/139580">review 139580</a> <b>(approved)</b>.
<li>Support iSCSI live migration for different iSCSI target: <a href="https://review.openstack.org/#/c/132323">review 132323</a> <b>(approved)</b>.
</ul>
<br/><br/><b>Cells</b><br/><br/>
<ul>
<li>Cells Scheduling: <a href="https://review.openstack.org/#/c/141486">review 141486</a>.
<li>Create an instance mapping database: <a href="https://review.openstack.org/#/c/135644">review 135644</a> <b>(approved)</b>.
<li>Flexible cell selection: <a href="https://review.openstack.org/#/c/140031">review 140031</a>.
<li>Implement instance mapping: <a href="https://review.openstack.org/#/c/135424">review 135424</a> <b>(approved)</b>.
<li>Populate the instance mapping database: <a href="https://review.openstack.org/#/c/136490">review 136490</a> <b>(requested a spec exception)</b>.
</ul>
<br/><br/><b>Containers Service</b><br/><br/>
<ul>
<li>Initial specification: <a href="https://review.openstack.org/#/c/114044">review 114044</a> <b>(abandoned)</b>.
</ul>
<br/><br/><b>Database</b><br/><br/>
<ul>
<li>Develop and implement a profiler for SQL requests: <a href="https://review.openstack.org/#/c/142078">review 142078</a> <b>(abandoned)</b>.
<li>Enforce instance uuid uniqueness in the SQL database: <a href="https://review.openstack.org/#/c/128097">review 128097</a> <b>(fast tracked, approved, implemented)</b>.
<li>Nova db purge utility: <a href="https://review.openstack.org/#/c/132656">review 132656</a>.
<li>Online schema change options: <a href="https://review.openstack.org/#/c/102545">review 102545</a> <b>(approved)</b>.
<li>Support DB2 as a SQL database: <a href="https://review.openstack.org/#/c/141097">review 141097</a> <b>(fast tracked, approved)</b>.
<li>Validate database migrations and models: <a href="https://review.openstack.org/#/c/134984">review 134984</a> <b>(approved)</b>.
</ul>
<br/><br/><b>Hypervisor: Docker</b><br/><br/>
<ul>
<li>Migrate the Docker Driver into Nova: <a href="https://review.openstack.org/#/c/128753">review 128753</a>.
</ul>
<br/><br/><b>Hypervisor: FreeBSD</b><br/><br/>
<ul>
<li>Implement support for FreeBSD networking in nova-network: <a href="https://review.openstack.org/#/c/127827">review 127827</a>.
</ul>
<br/><br/><b>Hypervisor: Hyper-V</b><br/><br/>
<ul>
<li>Allow volumes to be stored on SMB shares instead of just iSCSI: <a href="https://review.openstack.org/#/c/102190">review 102190</a> <b>(approved, implemented)</b>.
<li>Instance hot resize: <a href="https://review.openstack.org/#/c/141219">review 141219</a>.
</ul>
<br/><br/><b>Hypervisor: Ironic</b><br/><br/>
<ul>
<li>Add config drive support: <a href="https://review.openstack.org/#/c/98930">review 98930</a> <b>(approved)</b>.
<li>Pass through flavor capabilities to ironic: <a href="https://review.openstack.org/#/c/136104">review 136104</a> <b>(approved)</b>.
</ul>
<br/><br/><b>Hypervisor: VMWare</b><br/><br/>
<ul>
<li>Add ephemeral disk support to the VMware driver: <a href="https://review.openstack.org/#/c/126527">review 126527</a> <b>(fast tracked, approved)</b>.
<li>Add support for the HTML5 console: <a href="https://review.openstack.org/#/c/127283">review 127283</a> <b>(requested a spec exception)</b>.
<li>Allow Nova to access a VMWare image store over NFS: <a href="https://review.openstack.org/#/c/126866">review 126866</a>.
<li>Enable administrators and tenants to take advantage of backend storage policies: <a href="https://review.openstack.org/#/c/126547">review 126547</a> <b>(fast tracked, approved)</b>.
<li>Enable the mapping of raw cinder devices to instances: <a href="https://review.openstack.org/#/c/128697">review 128697</a>.
<li>Implement vSAN support: <a href="https://review.openstack.org/#/c/128600">review 128600</a> <b>(fast tracked, approved)</b>.
<li>Support multiple disks inside a single OVA file: <a href="https://review.openstack.org/#/c/128691">review 128691</a>.
<li>Support the OVA image format: <a href="https://review.openstack.org/#/c/127054">review 127054</a> <b>(fast tracked, approved)</b>.
</ul>
<br/><br/><b>Hypervisor: libvirt</b><br/><br/>
<ul>
<li>Add Quobyte USP support: <a href="https://review.openstack.org/#/c/138372">review 138372</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/138373">review 138373</a> <b>(approved)</b>.
<li>Add VIF_VHOSTUSER vif type: <a href="https://review.openstack.org/#/c/138736">review 138736</a> <b>(approved)</b>.
<li>Add a Quobyte Volume Driver: <a href="https://review.openstack.org/#/c/138375">review 138375</a> <b>(abandoned)</b>.
<li>Add finetunable configuration settings for virtio-scsi: <a href="https://review.openstack.org/#/c/103797">review 103797</a> <b>(abandoned)</b>.
<li>Add large page support: <a href="https://review.openstack.org/#/c/129608">review 129608</a> <b>(approved)</b>.
<li>Add support for SMBFS as an image storage backend: <a href="https://review.openstack.org/#/c/103203">review 103203</a> <b>(approved, implemented)</b>.
<li>Allow scheduling of instances such that PCI passthrough devices are co-located on the same NUMA node as other instance resources: <a href="https://review.openstack.org/#/c/128344">review 128344</a> <b>(fast tracked, approved)</b>.
<li>Allow specification of the device boot order for instances: <a href="https://review.openstack.org/#/c/133254">review 133254</a>.
<li>Allow the administrator to explicitly set the version of the qemu emulator to use: <a href="https://review.openstack.org/#/c/138731">review 138731</a> <b>(abandoned)</b>.
<li>Consider PCI offload capabilities when scheduling instances: <a href="https://review.openstack.org/#/c/135331">review 135331</a>.
<li>Convert to using built in libvirt disk copy mechanisms for cold migrations on non-shared storage: <a href="https://review.openstack.org/#/c/126979">review 126979</a> <b>(fast tracked)</b>.
<li>Derive hardware policy from libosinfo: <a href="https://review.openstack.org/#/c/133945">review 133945</a> <b>(approved)</b>.
<li>Implement COW volumes via VMThunder to allow fast boot of large numbers of instances: <a href="https://review.openstack.org/#/c/128810">review 128810</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128813">review 128813</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128830">review 128830</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128845">review 128845</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129093">review 129093</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129108">review 129108</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129110">review 129110</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129113">review 129113</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129116">review 129116</a>; <a href="https://review.openstack.org/#/c/137617">review 137617</a>.
<li>Implement configurable policy over where virtual CPUs should be placed on physical CPUs: <a href="https://review.openstack.org/#/c/129606">review 129606</a> <b>(approved)</b>.
<li>Implement support for Parallels Cloud Server: <a href="https://review.openstack.org/#/c/111335">review 111335</a> <b>(approved)</b>; <a href="https://review.openstack.org/#/c/128990">review 128990</a> <b>(abandoned)</b>.
<li>Implement support for zkvm as a libvirt hypervisor: <a href="https://review.openstack.org/#/c/130447">review 130447</a> <b>(approved)</b>.
<li>Improve total network throughput by supporting virtio-net multiqueue: <a href="https://review.openstack.org/#/c/128825">review 128825</a> <b>(requested a spec exception)</b>.
<li>Improvements to the cinder integration for snapshots: <a href="https://review.openstack.org/#/c/134517">review 134517</a>.
<li>Quiesce instance disks during snapshot: <a href="https://review.openstack.org/#/c/128112">review 128112</a>; <a href="https://review.openstack.org/#/c/131587">review 131587</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/131597">review 131597</a>.
<li>Real time instances: <a href="https://review.openstack.org/#/c/139688">review 139688</a>.
<li>Stop dm-crypt device when an encrypted instance is suspended or stopped: <a href="https://review.openstack.org/#/c/140847">review 140847</a> <b>(approved)</b>.
<li>Support SR-IOV interface attach and detach: <a href="https://review.openstack.org/#/c/139910">review 139910</a> <b>(requested a spec exception)</b>.
<li>Support StorPool as a storage backend: <a href="https://review.openstack.org/#/c/137830">review 137830</a>.
<li>Support for live block device IO tuning: <a href="https://review.openstack.org/#/c/136704">review 136704</a>.
<li>Support libvirt storage pools: <a href="https://review.openstack.org/#/c/126978">review 126978</a> <b>(fast tracked, approved)</b>.
<li>Support live migration with macvtap SR-IOV: <a href="https://review.openstack.org/#/c/136077">review 136077</a>.
<li>Support quiesce filesystems during snapshot: <a href="https://review.openstack.org/#/c/126966">review 126966</a> <b>(fast tracked, approved)</b>.
<li>Support using qemu's built in iSCSI initiator: <a href="https://review.openstack.org/#/c/133048">review 133048</a> <b>(approved)</b>.
<li>Volume driver for Huawei SDSHypervisor: <a href="https://review.openstack.org/#/c/130919">review 130919</a>.
</ul>
<br/><br/><b>Instance features</b><br/><br/>
<ul>
<li>Allow portions of an instance's uuid to be configurable: <a href="https://review.openstack.org/#/c/130451">review 130451</a>.
<li>Allow the resize of ephemeral disks during resize: <a href="https://review.openstack.org/#/c/145736">review 145736</a>.
<li>Attempt to schedule cinder volumes "close" to instances: <a href="https://review.openstack.org/#/c/130851">review 130851</a>; <a href="https://review.openstack.org/#/c/131050">review 131050</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/131051">review 131051</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/131151">review 131151</a> <b>(abandoned)</b>.
<li>Dynamic server groups: <a href="https://review.openstack.org/#/c/130005">review 130005</a> <b>(abandoned)</b>.
<li>Improve the performance of unshelve for those using shared storage for instance disks: <a href="https://review.openstack.org/#/c/135387">review 135387</a> <b>(requested a spec exception)</b>.
</ul>
<br/><br/><b>Internal</b><br/><br/>
<ul>
<li>A lock-free quota implementation: <a href="https://review.openstack.org/#/c/135296">review 135296</a> <b>(approved)</b>.
<li>Automate the documentation of the virtual machine state transition graph: <a href="https://review.openstack.org/#/c/94835">review 94835</a>.
<li>Fake Libvirt driver for simulating HW testing: <a href="https://review.openstack.org/#/c/139927">review 139927</a> <b>(abandoned)</b>.
<li>Flatten Aggregate Metadata in the DB: <a href="https://review.openstack.org/#/c/134573">review 134573</a> <b>(abandoned)</b>.
<li>Flatten Instance Metadata in the DB: <a href="https://review.openstack.org/#/c/134945">review 134945</a> <b>(abandoned)</b>.
<li>Implement a new code coverage API extension: <a href="https://review.openstack.org/#/c/130855">review 130855</a>.
<li>Move flavor data out of the system_metadata table in the SQL database: <a href="https://review.openstack.org/#/c/126620">review 126620</a> <b>(approved)</b>.
<li>Move to polling for cinder operations: <a href="https://review.openstack.org/#/c/135367">review 135367</a>.
<li>PCI test cases for third party CI: <a href="https://review.openstack.org/#/c/141270">review 141270</a>.
<li>Transition Nova to using the Glance v2 API: <a href="https://review.openstack.org/#/c/84887">review 84887</a> <b>(abandoned)</b>.
<li>Transition to using glanceclient instead of our own home grown wrapper: <a href="https://review.openstack.org/#/c/133485">review 133485</a> <b>(approved)</b>.
</ul>
<br/><br/><b>Internationalization</b><br/><br/>
<ul>
<li>Enable lazy translations of strings: <a href="https://review.openstack.org/#/c/126717">review 126717</a> <b>(fast tracked, approved)</b>.
</ul>
<br/><br/><b>Networking</b><br/><br/>
<ul>
<li>Add a new linuxbridge VIF type, macvtap: <a href="https://review.openstack.org/#/c/117465">review 117465</a> <b>(abandoned)</b>.
<li>Add a plugin mechanism for VIF drivers: <a href="https://review.openstack.org/#/c/136827">review 136827</a> <b>(abandoned)</b>.
<li>Add support for InfiniBand SR-IOV VIF Driver: <a href="https://review.openstack.org/#/c/131729">review 131729</a> <b>(requested a spec exception)</b>.
<li>Neutron DNS Using Nova Hostname: <a href="https://review.openstack.org/#/c/90150">review 90150</a> <b>(abandoned)</b>.
<li>New VIF type to allow routing VM data instead of bridging it: <a href="https://review.openstack.org/#/c/130732">review 130732</a> <b>(approved, requested a spec exception)</b>.
<li>Nova Plugin for OpenContrail: <a href="https://review.openstack.org/#/c/126446">review 126446</a> <b>(approved)</b>.
<li>Refactor of the Neutron network adapter to be more maintainable: <a href="https://review.openstack.org/#/c/131413">review 131413</a>.
<li>Use the Nova hostname in Neutron DNS: <a href="https://review.openstack.org/#/c/137669">review 137669</a>.
<li>Wrap the Python NeutronClient: <a href="https://review.openstack.org/#/c/141108">review 141108</a>.
</ul>
<br/><br/><b>Performance</b><br/><br/>
<ul>
<li>Dynamically alter the interval nova polls components at based on load and expected time for an operation to complete: <a href="https://review.openstack.org/#/c/122705">review 122705</a>.
</ul>
<br/><br/><b>Scheduler</b><br/><br/>
<ul>
<li>A nested quota driver API: <a href="https://review.openstack.org/#/c/129420">review 129420</a>.
<li>Add a filter to take into account hypervisor type and version when scheduling: <a href="https://review.openstack.org/#/c/137714">review 137714</a>.
<li>Add an IOPS weigher: <a href="https://review.openstack.org/#/c/127123">review 127123</a> <b>(approved, implemented)</b>; <a href="https://review.openstack.org/#/c/132614">review 132614</a>.
<li>Add instance count on the hypervisor as a weight: <a href="https://review.openstack.org/#/c/127871">review 127871</a> <b>(abandoned)</b>.
<li>Add soft affinity support for server group: <a href="https://review.openstack.org/#/c/140017">review 140017</a> <b>(approved)</b>.
<li>Allow extra spec to match all values in a list by adding the ALL-IN operator: <a href="https://review.openstack.org/#/c/138698">review 138698</a> <b>(fast tracked, approved)</b>.
<li>Allow limiting the flavors that can be scheduled on certain host aggregates: <a href="https://review.openstack.org/#/c/122530">review 122530</a> <b>(abandoned)</b>.
<li>Allow the removal of servers from server groups: <a href="https://review.openstack.org/#/c/136487">review 136487</a>.
<li>Cache aggregate metadata: <a href="https://review.openstack.org/#/c/141846">review 141846</a>.
<li>Convert get_available_resources to use an object instead of dict: <a href="https://review.openstack.org/#/c/133728">review 133728</a> <b>(abandoned)</b>.
<li>Convert the resource tracker to objects: <a href="https://review.openstack.org/#/c/128964">review 128964</a> <b>(fast tracked, approved)</b>.
<li>Create an object model to represent a request to boot an instance: <a href="https://review.openstack.org/#/c/127610">review 127610</a> <b>(approved)</b>.
<li>Decouple services and compute nodes in the SQL database: <a href="https://review.openstack.org/#/c/126895">review 126895</a> <b>(approved)</b>.
<li>Distribute PCI Requests Across Multiple Devices: <a href="https://review.openstack.org/#/c/142094">review 142094</a>.
<li>Enable adding new scheduler hints to already booted instances: <a href="https://review.openstack.org/#/c/134746">review 134746</a>.
<li>Fix the race conditions when migration with server-group: <a href="https://review.openstack.org/#/c/135527">review 135527</a> <b>(abandoned)</b>.
<li>Implement resource objects in the resource tracker: <a href="https://review.openstack.org/#/c/127609">review 127609</a> <b>(approved, requested a spec exception)</b>.
<li>Improve the ComputeCapabilities filter: <a href="https://review.openstack.org/#/c/133534">review 133534</a> <b>(requested a spec exception)</b>.
<li>Isolate Scheduler DB for Filters: <a href="https://review.openstack.org/#/c/138444">review 138444</a> <b>(requested a spec exception)</b>.
<li>Isolate the scheduler's use of the Nova SQL database: <a href="https://review.openstack.org/#/c/89893">review 89893</a> <b>(approved)</b>.
<li>Let schedulers reuse filter and weigher objects: <a href="https://review.openstack.org/#/c/134506">review 134506</a> <b>(abandoned)</b>.
<li>Move select_destinations() to using a request object: <a href="https://review.openstack.org/#/c/127612">review 127612</a> <b>(approved)</b>.
<li>Persist scheduler hints: <a href="https://review.openstack.org/#/c/88983">review 88983</a>.
<li>Refactor allocate_for_instance: <a href="https://review.openstack.org/#/c/141129">review 141129</a>.
<li>Stop direct lookup for host aggregates in the Nova database: <a href="https://review.openstack.org/#/c/132065">review 132065</a> <b>(abandoned)</b>.
<li>Stop direct lookup for instance groups in the Nova database: <a href="https://review.openstack.org/#/c/131553">review 131553</a> <b>(abandoned)</b>.
<li>Support scheduling based on more image properties: <a href="https://review.openstack.org/#/c/138937">review 138937</a>.
<li>Trusted computing support: <a href="https://review.openstack.org/#/c/133106">review 133106</a>.
</ul>
<br/><br/><b>Scheduling</b><br/><br/>
<ul>
<li>Dynamic Management of Server Groups: <a href="https://review.openstack.org/#/c/139272">review 139272</a>.
</ul>
<br/><br/><b>Security</b><br/><br/>
<ul>
<li>Make key manager interface interoperable with Barbican: <a href="https://review.openstack.org/#/c/140144">review 140144</a> <b>(fast tracked, approved)</b>.
<li>Provide a reference implementation for console proxies that uses TLS: <a href="https://review.openstack.org/#/c/126958">review 126958</a> <b>(fast tracked, approved)</b>.
<li>Strongly validate the tenant and user for quota consuming requests with keystone: <a href="https://review.openstack.org/#/c/92507">review 92507</a> <b>(approved)</b>.
</ul>
<br/><br/><b>Service Groups</b><br/><br/>
<ul>
<li>Pacemaker service group driver: <a href="https://review.openstack.org/#/c/139991">review 139991</a>.
<li>Transition service groups to using the new oslo Tooz library: <a href="https://review.openstack.org/#/c/138607">review 138607</a>.
</ul>
<a href="http://www.stillhq.com/openstack/kilo/000008.commentform.html">Comment</a>
http://www.stillhq.com/openstack/kilo/000008.html
http://www.stillhq.com/openstack/kilo/000008.htmlKilo Nova deploy recommendations/openstackMon, 12 Jan 2015 14:11:00 PSTWhat would a Nova developer tell a deployer to think about before their first OpenStack install? This was the question I wanted to answer for my <a href="http://linux.conf.au">linux.conf.au</a> OpenStack miniconf talk, and writing this essay seemed like a reasonable way to take the bullet point list of ideas we generated and turn it into something that was a cohesive story. Hopefully this essay is also useful to people who couldn't make the conference talk.
<br/><br/>
Please understand that none of these are hard rules -- what I seek is for you to consider your options and make informed decisions. It's really up to you how you deploy Nova.
<br/><br/>
<i>Operating environment</i>
<br/><br/>
<ul>
<li><b>Consider what base OS you use for your hypervisor nodes if you're using Linux</b>. I know that many environments have standardized on a given distribution, and that many have a preference for a long term supported release. However, Nova is at its most basic level a way of orchestrating tools packaged by your distribution via APIs. If those underlying tools are buggy, then your Nova experience will suffer as well. Sometimes we can work around known issues in older versions of our dependencies, but often those work-arounds are hard to implement (and therefore likely to be less than perfect) or have performance impacts. There are many examples of the problems you can encounter; hypervisor kernel panics and disk image corruption are just two. We are trying to work with distributions to ensure they back port fixes, but the distributions might not always be willing to do that. Sometimes upgrading the base OS on your hypervisor nodes might be a better call.
<li><b>The version of Python you use matters</b>. The OpenStack project only tests with specific versions of Python, and there can be bugs between releases. This is especially true for very old versions of Python (anything older than 2.7) and new versions of Python (Python 3 is not supported for example). Your choice of base OS will affect the versions of Python available, so this is related to the previous point.
<li><b>There are existing configuration management recipes for most configuration management systems</b>. I'd avoid reinventing the wheel here and use the community supported recipes. There are definitely resources available for chef, puppet, juju, ansible and salt. If you're building a very large deployment from scratch, consider triple-o as well. Please please please don't fork the community recipes. I know it's tempting, but contribute to upstream instead. Invariably upstream will continue developing their stuff, and if you fork you'll spend a lot of effort keeping in sync.
<li><b>Have a good plan for log collection and retention at your intended scale</b>. The hard reality at the moment is that diagnosing Nova often requires that you turn on debug logging, which is very chatty. Whilst we're happy to take bug reports where we've gotten the log level wrong, we haven't had a lot of success at systematically fixing this issue. Your log infrastructure therefore needs to be able to handle the demands of debug logging when it's turned on. If you're using central log servers, think seriously about how much disk they require. If you're not doing centralized syslog logging, perhaps consider something like logstash.
<li><b>Pay attention to memory usage on your controller nodes</b>. OpenStack python processes can often consume hundreds of megabytes of virtual memory space. If you run many controller services on the same node, make sure you have enough RAM to deal with the number of processes that will, by default, be spawned for the many service endpoints. After a day or so of running a controller node, check in on the virtual memory used by the python processes and make any adjustments needed to your "workers" configuration settings.
</ul>
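The memory advice above is easy to sanity check with a back-of-the-envelope calculation before you deploy. This is only an illustration -- the per-worker figure and the service names are assumptions, not measured OpenStack numbers; measure your own processes and substitute real values:
<br/><br/>
```python
def estimated_ram_mb(services, per_worker_mb=150):
    """Rough RAM estimate for controller python processes.

    services maps a service name to its configured worker count.
    per_worker_mb is an assumed average resident size per worker;
    150 MB is an illustrative placeholder, not a measured figure.
    """
    return sum(workers * per_worker_mb for workers in services.values())


# A hypothetical controller running three services with 4 workers each.
print(estimated_ram_mb({"nova-api": 4, "nova-conductor": 4, "glance-api": 4}))
```
<br/><br/>
If the estimate comes out near (or over) the node's physical RAM, that's your cue to trim the various "workers" settings below their CPU-count defaults.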
<br/><br/>
<i>Scale</i>
<ul>
<li><b>Estimate your final scale now</b>. Sure, you're building a proof of concept, but these things have a habit of becoming entrenched. If you are planning a deployment that is likely to end up being thousands of nodes, then you are going to need to deploy with cells. This is also possibly true if you're going to have more than one hypervisor or hardware platform in your deployment -- it's very common to have a cell per hypervisor type or per hardware platform. Cells is relatively cheap to deploy for your proof of concept, and it helps when that initial deploy grows into a bigger thing, so deploying cells from the beginning is worth considering. It should be noted however that not all features are currently implemented in cells; we are working on this at the moment.
<li><b>Consider carefully what SQL database to use</b>. Nova supports many SQL databases via sqlalchemy, but some are better tested and more widely deployed than others. For example, the Postgres back end is rarely deployed and is less tested. I'd recommend a variant of MySQL for your deployment. Personally I've seen good performance on Percona, but I know that many use the stock MySQL as well. There are known issues at the moment with Galera as well, so show caution there. There is active development happening on the select-for-update problems with Galera at the moment, so that might change by the time you get around to deploying in production. You can read more about our current Galera problems <a href="http://www.joinfu.com/2015/01/understanding-reservations-concurrency-locking-in-nova/">on Jay Pipes' blog</a>.
<li><b>We support read only replicas of the SQL database</b>. Nova supports offloading read only SQL traffic to read only replicas of the main SQL database, but I do not believe this is widely deployed. It might be of interest to you though.
<li><b>Expect a lot of SQL database connections</b>. While Nova has the nova-conductor service to control the number of connections to the database server, other OpenStack services do not, and you will quickly outpace the default number of connections allowed, at least for a MySQL deployment. Actively monitor your SQL database connection counts so you know before you run out. Additionally, there are many places in Nova where a user request will block on a database query, so if your SQL back end isn't keeping up this will affect performance of your entire Nova deployment.
<li><b>There are options with message queues as well</b>. We currently support rabbitmq, zeromq and qpid. However, rabbitmq is the original and by far the most widely deployed. rabbitmq is therefore a reasonable default choice for deployment.
</ul>
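To see why the connection counts climb so quickly, here is a hedged worst-case sketch. The pool numbers below mirror common SQLAlchemy defaults (pool_size=5, max_overflow=10) rather than anything OpenStack-specific -- check what your services actually configure:
<br/><br/>
```python
def db_connection_budget(nodes, services_per_node, workers_per_service,
                         pool_size=5, max_overflow=10):
    """Worst-case SQL connections a deployment can open.

    Every worker process keeps its own connection pool, so the
    total scales multiplicatively with nodes, services and workers.
    """
    processes = nodes * services_per_node * workers_per_service
    return processes * (pool_size + max_overflow)


# Two controller nodes, five services each, four workers per service.
print(db_connection_budget(2, 5, 4))
```
<br/><br/>
Even a modest deployment like the example can theoretically open several hundred connections, which is well past the historical MySQL max_connections default, so raise that limit and monitor it.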
<br/><br/>
<i>Hypervisors</i>
<ul>
<li><b>Not all hypervisor drivers are created equal</b>. Let's be frank here -- some hypervisor drivers just aren't as actively developed as others. This is especially true for drivers which aren't in the Nova code base -- at least the ones the Nova team manage are updated when we change the internals of Nova. I'm not a hypervisor bigot -- there is a place in the world for many different hypervisor options. However, the start of a Nova deploy might be the right time to consider what hypervisor you want to use. I'd personally recommend drivers in the Nova code base with active development teams and good continuous integration, but ultimately you have to select a driver based on its merits in your situation. I've included some more detailed thoughts on how to evaluate hypervisor drivers later in this post, as I don't want to go off on a big tangent during my nicely formatted bullet list.
<li><b>Remember that the hypervisor state is interesting debugging information</b>. For example with the libvirt hypervisor, the contents of /var/lib/nova/instances are super useful for debugging misbehaving instances. Additionally, all of the existing libvirt tools work, so you can use those to investigate as well. However, I strongly recommend you only change instance state via Nova, and not go directly to the hypervisor.
</ul>
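<br/><br/>
To make the debugging point concrete, here is the sort of read-only inspection I mean on a libvirt compute node. The domain name is a made-up example (libvirt names like instance-0000001e are not Nova UUIDs), and the instances path can vary between deployments:

```shell
#!/bin/sh
# Read-only inspection of instances on a libvirt compute node.
# Look but don't touch -- make state changes through Nova, not libvirt.

virsh list --all                  # all domains on this host and their states
virsh dominfo instance-0000001e   # CPU, memory and state for one domain
virsh dumpxml instance-0000001e   # the full libvirt XML for that domain
ls -lh /var/lib/nova/instances/   # per-instance disks and console logs
```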
<br/><br/>
<i>Networking</i>
<ul>
<li><b>Avoid new deployments of nova-network</b>. nova-network has been on the deprecation path for a very long time now, and we're currently working on the final steps of a migration plan for nova-network users to neutron. If you're a new deployment of Nova and therefore don't yet depend on any of the features of nova-network, I'd start with Neutron from the beginning. This will save you a possibly troublesome migration to Neutron later.
</ul>
<br/><br/>
<i>Testing and upgrades</i>
<ul>
<li><b>You need a test lab</b>. For a non-trivial deployment, you need a realistic test environment. It's expected that you test all upgrades before you do them in production, and rollbacks can sometimes be problematic. For example, some database migrations are very hard to roll back, especially if new instances have been created in the time it took you to decide to roll back. Perhaps consider turning off API access (or putting the API into a read only state) while you are validating a production deploy post upgrade, that way you can restore a database snapshot if you need to undo the upgrade. We know this isn't perfect and are working on a better upgrade strategy for information stored in the database, but we will always expect you to test upgrades before deploying them.
<li><b>Test database migrations on a copy of your production database before doing them for real</b>. Another reason to test upgrades before doing them in production is because some database migrations can be very slow. It's hard for the Nova developers to predict which migrations will be slow, but we do try to test for this and minimize the pain. However, aspects of your deployment can affect this in ways we don't expect -- for example if you have large numbers of volumes per instance, then that could result in database tables being larger than we expect. You should always test database migrations in a lab and report any problems you see.
<li><b>Think about your upgrade strategy in general</b>. While we now support having the control infrastructure running a newer release than the services on hypervisor nodes, we only support that for one release (so you could have your control plane running Kilo for example while you are still running Juno on your hypervisors, you couldn't run Icehouse on the hypervisors though). Are you going to upgrade every six months? Or are you going to do it less frequently but step through a series of upgrades in one session? I suspect the latter option is more risky -- if you encounter a bug in a previous release we would need to back port a fix, which is a much slower process than fixing the most recent release. There are also deployments which choose to "continuously deploy" from trunk. This gets access to features as they're added, but means that the deployments need to have more operational skill and a closer association with the upstream developers. In general continuous deployers are larger public clouds, as best I can tell.
</ul>
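<br/><br/>
A minimal sketch of that migration rehearsal, assuming a MySQL back end; the host and database names are placeholders, and lab-nova.conf stands in for a lab configuration pointed at the copy:

```shell
#!/bin/sh
# Rehearse a Nova database migration against a copy of production data.
# prod-db, nova_rehearsal and lab-nova.conf are placeholder names.
set -e

# 1. Snapshot production and load it into a throwaway database.
mysqldump -h prod-db --single-transaction nova > nova-prod.sql
mysql -e "CREATE DATABASE nova_rehearsal"
mysql nova_rehearsal < nova-prod.sql

# 2. Time the migration against the copy to catch slow migrations early.
time nova-manage --config-file lab-nova.conf db sync

# 3. Check the result looks sane, then throw the copy away.
mysql -e "DROP DATABASE nova_rehearsal"
```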
<br/><br/>
<i>libvirt specific considerations</i>
<ul>
<li><b>For those intending to run the libvirt hypervisor driver, not all libvirt hypervisors are created equal</b>. libvirt implements pluggable hypervisors, so if you select the Nova libvirt hypervisor driver, you then need to select what hypervisor to use with libvirt as well. It should be noted however that some hypervisors work better than others, with kvm being the most widely deployed.
<li><b>There are two types of storage for instances</b>. There is "instance storage", which is block devices that exist for the life of the instance and are then cleaned up when the instance is destroyed. There is also block storage provided by Cinder, which is persistent and arguably easier to manage than instance storage. I won't discuss storage provided by Cinder any further however, because it is outside the scope of this post. Instance storage is provided by a plug-in layer in the libvirt hypervisor driver, which presents you with another set of deployment decisions.
<li><b>Shared instance storage is attractive, but it comes at a cost</b>. Shared instance storage isn't required for live migration of instances using the libvirt hypervisor driver. Think about the costs of shared storage though -- for example putting everything on network attached storage is likely to be expensive, especially if most of your instances don't need the facility. There are other options such as Ceph, but the storage interface layer in libvirt is one of the areas of code where we need to improve testing, so be wary of bugs before relying on those storage back ends.
</ul>
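<br/><br/>
As an illustration of that plug-in layer, the libvirt driver selects an instance storage back end via the images_type option in nova.conf. The option names here match recent releases and may differ in yours; crudini is just a convenient ini file editor:

```shell
#!/bin/sh
# Illustration only: choosing the libvirt instance storage back end.
# Check the option names against your release's documentation.

# The default is qcow2 files under the instances path:
crudini --set /etc/nova/nova.conf libvirt images_type default

# Alternatives include local LVM volumes or Ceph RBD, for example:
#   crudini --set /etc/nova/nova.conf libvirt images_type rbd
#   crudini --set /etc/nova/nova.conf libvirt images_rbd_pool vms

# nova-compute needs a restart to pick up the change.
service nova-compute restart
```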
<br/><br/>
<b>Thoughts on how to evaluate hypervisor drivers</b>
<br/><br/>
As promised, I also have some thoughts on how to evaluate which hypervisor driver is the right choice for you. First off, if your organization has a lot of experience with a particular hypervisor, then there is always value in that. If that is the case, then you should seriously consider running the hypervisor you already have experience with, as long as that hypervisor has a driver for Nova which meets the criteria below.
<br/><br/>
What's important is to be looking for a driver which works well with Nova, and a good measure of that is how well the driver development team works with the Nova development team. The obvious best case here is where both teams are the same people -- which is true for drivers that are in the Nova code base. I am aware there are drivers that live outside of Nova's code repository, but you need to remember that the interface these drivers plug into isn't a stable or versioned interface. The risk of those drivers being broken by the ongoing development of Nova is very high. Additionally, only a very small number of those "out of tree" drivers contribute to our continuous integration testing. That means that the Nova team also doesn't know when those drivers are broken. The breakages can also be subtle, so if your vendor isn't at the very least doing tempest runs against their out of tree driver before shipping it to you then I'd be very worried.
<br/><br/>
You should also check out how many bugs are open in LaunchPad for your chosen driver (this assumes the Nova team is aware of the existence of the driver I suppose). Here's an example link to the libvirt driver bugs currently open. As well as total bug count, I'd be looking for bug close activity -- it's nice if there is a very small number of bugs filed, but perhaps that's because there aren't many users, and it doesn't necessarily mean the team for that driver is super awesome at closing bugs. The easiest way to look into bug close rates (and general code activity) would be to check out the code for Nova and then look at the log for your chosen driver. For example for the libvirt driver again:
<br/><br/>
<pre>
$ git clone http://git.openstack.org/openstack/nova
$ cd nova/nova/virt/libvirt
$ git log .
</pre>
<br/><br/>
That will give you a report on all the commits ever for that driver. You don't need to read the entire report, but it will give you an idea of what the driver authors have recently been thinking about.
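<br/><br/>
To turn that log into a rough number, something like the following counts recent commits and the most active authors for the driver (the six month window is an arbitrary choice of mine):

```shell
#!/bin/sh
# Rough code-activity metrics for one driver directory in the Nova tree.
cd nova/nova/virt/libvirt
git log --oneline --since="6 months ago" . | wc -l     # recent commit count
git shortlog -sn --since="6 months ago" . | head -5    # most active authors
```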
<br/><br/>
Another good metric is the specification activity for your driver. Specifications are the formal design documents that Nova adopted for the Juno release, and they document all the features that we're currently working on. I write summaries of the current state of Nova specs regularly, which you can see posted at stillhq.com with this being the most recent summary at the time of writing this post. You should also check how much your driver authors interact with the core Nova team. The easiest way to do that is probably to keep an eye on the Nova team meeting minutes, which are posted online.
<br/><br/>
Finally, the OpenStack project believes strongly in continuous integration testing. Testing has clear value in the number of bugs it finds in code before our users experience them, and I would be very wary of driver code which isn't continuously integrated with Nova. Thus, you need to ensure that your driver has well maintained continuous integration testing. This is easy for "in tree" drivers, as we do that for all of them. For out of tree drivers, continuous integration testing is done with a thing called "third party CI".
<br/><br/>
How do you determine if a third party CI system is well maintained? First off, I'd start by determining if a third party CI system actually exists by looking at OpenStack's list of known third party CI systems. If the third party isn't listed on that page, then that's a very big warning sign. Next you can use Joe Gordon's lastcomment tool to see when a given CI system last reported a result:
<br/><br/>
<pre>
$ git clone https://github.com/jogo/lastcomment
$ ./lastcomment.py --name "DB Datasets CI"
last 5 comments from 'DB Datasets CI'
[0] 2015-01-07 00:46:33 (1:35:13 old) https://review.openstack.org/145378 'Ignore 'dynamic' addr flag on gateway initialization'
[1] 2015-01-07 00:37:24 (1:44:22 old) https://review.openstack.org/136931 'Use session with neutronclient'
[2] 2015-01-07 00:35:33 (1:46:13 old) https://review.openstack.org/145377 'libvirt: Expanded test libvirt driver'
[3] 2015-01-07 00:29:50 (1:51:56 old) https://review.openstack.org/142450 'ephemeral file names should reflect fs type and mkfs command'
[4] 2015-01-07 00:15:59 (2:05:47 old) https://review.openstack.org/142534 'Support for ext4 as default filesystem for ephemeral disks'
</pre>
<br/><br/>
You can see here that the most recent run is 1 hour 35 minutes old when I ran this command. That's actually pretty good given that I wrote this while most of America was asleep. If the most recent run is days old, that's another warning sign. If you're left in doubt, then I'd recommend appearing in the OpenStack IRC channels on freenode and asking for advice. OpenStack has a number of requirements for third party CI systems, and I haven't discussed many of them here. There is more detail on what OpenStack considers a "well run CI system" on the OpenStack Infrastructure documentation page.
<br/><br/>
<b>General operational advice</b>
<br/><br/>
Finally, I have some general advice for operators of OpenStack. There is an active community of operators who discuss their use of the various OpenStack components on the openstack-operators mailing list; if you're deploying Nova, you should consider joining it. While you're welcome to ask questions about deploying OpenStack on that list, you can also ask at the more general OpenStack mailing list if you want to.
<br/><br/>
There are also many companies now which will offer to operate an OpenStack cloud for you. For some organizations engaging a subject matter expert will be the right decision. Probably the most obvious way to evaluate which of those companies to use is to look at their track record of successful deployments, as well as their overall involvement in the OpenStack community. You need a partner who can advocate for you with the OpenStack developers, as well as keeping an eye on what's happening upstream to ensure it meets your needs.
<br/><br/>
<b>Conclusion</b>
<br/><br/>
Thanks for reading so far! I hope this document is useful to someone out there. I'd love to hear your feedback -- are there other things we wish deployers would consider before committing to a plan? Am I simply wrong somewhere? Finally, this is the first time that I've posted an essay form of a conference talk instead of just the slide deck, and I'd be interested in whether people find this format more useful than a YouTube video posted after the conference. Please drop me a line and let me know if you find this useful!
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/kilo/000005.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>; <a href="http://www.stillhq.com/openstack/juno/000001.html">Juno Nova PTL Candidacy</a>; <a href="http://www.stillhq.com/openstack/juno/000012.html">Juno nova mid-cycle meetup summary: scheduler</a>; <a href="http://www.stillhq.com/openstack/juno/000007.html">Juno nova mid-cycle meetup summary: ironic</a></i>
<a href="http://www.stillhq.com/openstack/000019.commentform.html">Comment</a>
http://www.stillhq.com/openstack/000019.html
http://www.stillhq.com/openstack/000019.htmlHow are we going with Nova Kilo specs after our review day?/openstack/kiloSun, 14 Dec 2014 15:15:00 PSTTime for another summary I think, because announcing the review day seems to have caused a rush of new specs to be filed (which wasn't really my intention, but hey). We did approve a fair few specs on the review day, so I think overall it was a success. Here's an updated summary of the state of play:
<br/><br/>
<br/><br/><b>API</b><br/><br/>
<ul>
<li>Add more detailed network information to the metadata server: <a href="https://review.openstack.org/#/c/85673">review 85673</a>.
<li>Add separated policy rule for each v2.1 api: <a href="https://review.openstack.org/#/c/127863">review 127863</a>.
<li>Add user limits to the limits API (as well as project limits): <a href="https://review.openstack.org/#/c/127094">review 127094</a>.
<li>Allow all printable characters in resource names: <a href="https://review.openstack.org/#/c/126696">review 126696</a>.
<li>Consolidate all console access APIs into one: <a href="https://review.openstack.org/#/c/141065">review 141065</a>.
<li>Expose the lock status of an instance as a queryable item: <a href="https://review.openstack.org/#/c/127139">review 127139</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/85928">review 85928</a> <b>(approved)</b>.
<li>Extend api to allow specifying vnic_type: <a href="https://review.openstack.org/#/c/138808">review 138808</a>.
<li>Implement instance tagging: <a href="https://review.openstack.org/#/c/127281">review 127281</a> <b>(fast tracked, approved)</b>.
<li>Implement the v2.1 API: <a href="https://review.openstack.org/#/c/126452">review 126452</a> <b>(fast tracked, approved)</b>.
<li>Improve the return codes for the instance lock APIs: <a href="https://review.openstack.org/#/c/135506">review 135506</a>.
<li>Microversion support: <a href="https://review.openstack.org/#/c/127127">review 127127</a> <b>(approved)</b>.
<li>Move policy validation to just the API layer: <a href="https://review.openstack.org/#/c/127160">review 127160</a>.
<li>Nova Server Count API Extension: <a href="https://review.openstack.org/#/c/134279">review 134279</a> <b>(fast tracked)</b>.
<li>Provide a policy statement on the goals of our API policies: <a href="https://review.openstack.org/#/c/128560">review 128560</a> <b>(abandoned)</b>.
<li>Sorting enhancements: <a href="https://review.openstack.org/#/c/131868">review 131868</a> <b>(fast tracked, approved)</b>.
<li>Support JSON-Home for API extension discovery: <a href="https://review.openstack.org/#/c/130715">review 130715</a>.
<li>Support X509 keypairs: <a href="https://review.openstack.org/#/c/105034">review 105034</a> <b>(approved)</b>.
</ul>
<br/><br/><b>API (EC2)</b><br/><br/>
<ul>
<li>Expand support for volume filtering in the EC2 API: <a href="https://review.openstack.org/#/c/104450">review 104450</a>.
<li>Implement tags for volumes and snapshots with the EC2 API: <a href="https://review.openstack.org/#/c/126553">review 126553</a> <b>(fast tracked, approved)</b>.
</ul>
<br/><br/><b>Administrative</b><br/><br/>
<ul>
<li>Actively hunt for orphan instances and remove them: <a href="https://review.openstack.org/#/c/137996">review 137996</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/138627">review 138627</a>.
<li>Check that a service isn't running before deleting it: <a href="https://review.openstack.org/#/c/131633">review 131633</a>.
<li>Enable the nova metadata cache to be a shared resource to improve the hit rate: <a href="https://review.openstack.org/#/c/126705">review 126705</a> <b>(abandoned)</b>.
<li>Implement a daemon version of rootwrap: <a href="https://review.openstack.org/#/c/105404">review 105404</a>.
<li>Log request id mappings: <a href="https://review.openstack.org/#/c/132819">review 132819</a> <b>(fast tracked)</b>.
<li>Monitor the health of hypervisor hosts: <a href="https://review.openstack.org/#/c/137768">review 137768</a>.
<li>Remove the assumption that there is a single endpoint for services that nova talks to: <a href="https://review.openstack.org/#/c/132623">review 132623</a>.
</ul>
<br/><br/><b>Block Storage</b><br/><br/>
<ul>
<li>Allow direct access to LVM volumes if supported by Cinder: <a href="https://review.openstack.org/#/c/127318">review 127318</a>.
<li>Cache data from volumes on local disk: <a href="https://review.openstack.org/#/c/138292">review 138292</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/138619">review 138619</a>.
<li>Enhance iSCSI volume multipath support: <a href="https://review.openstack.org/#/c/134299">review 134299</a>.
<li>Failover to alternative iSCSI portals on login failure: <a href="https://review.openstack.org/#/c/137468">review 137468</a>.
<li>Give additional info in BDM when source type is "blank": <a href="https://review.openstack.org/#/c/140133">review 140133</a>.
<li>Implement support for a DRBD driver for Cinder block device access: <a href="https://review.openstack.org/#/c/134153">review 134153</a>.
<li>Refactor ISCSIDriver to support other iSCSI transports besides TCP: <a href="https://review.openstack.org/#/c/130721">review 130721</a> <b>(approved)</b>.
<li>StorPool volume attachment support: <a href="https://review.openstack.org/#/c/115716">review 115716</a>.
<li>Support Cinder Volume Multi-attach: <a href="https://review.openstack.org/#/c/139580">review 139580</a> <b>(approved)</b>.
<li>Support iSCSI live migration for different iSCSI target: <a href="https://review.openstack.org/#/c/132323">review 132323</a> <b>(approved)</b>.
</ul>
<br/><br/><b>Cells</b><br/><br/>
<ul>
<li>Cells Scheduling: <a href="https://review.openstack.org/#/c/141486">review 141486</a>.
<li>Create an instance mapping database: <a href="https://review.openstack.org/#/c/135644">review 135644</a>.
<li>Flexible cell selection: <a href="https://review.openstack.org/#/c/140031">review 140031</a>.
<li>Implement instance mapping: <a href="https://review.openstack.org/#/c/135424">review 135424</a> <b>(approved)</b>.
<li>Populate the instance mapping database: <a href="https://review.openstack.org/#/c/136490">review 136490</a>.
</ul>
<br/><br/><b>Containers Service</b><br/><br/>
<ul>
<li>Initial specification: <a href="https://review.openstack.org/#/c/114044">review 114044</a> <b>(abandoned)</b>.
</ul>
<br/><br/><b>Database</b><br/><br/>
<ul>
<li>Enforce instance uuid uniqueness in the SQL database: <a href="https://review.openstack.org/#/c/128097">review 128097</a> <b>(fast tracked, approved)</b>.
<li>Nova db purge utility: <a href="https://review.openstack.org/#/c/132656">review 132656</a>.
<li>Online schema change options: <a href="https://review.openstack.org/#/c/102545">review 102545</a>.
<li>Support DB2 as a SQL database: <a href="https://review.openstack.org/#/c/141097">review 141097</a> <b>(fast tracked, approved)</b>.
<li>Validate database migrations and model: <a href="https://review.openstack.org/#/c/134984">review 134984</a> <b>(approved)</b>.
</ul>
<br/><br/><b>Hypervisor: Docker</b><br/><br/>
<ul>
<li>Migrate the Docker Driver into Nova: <a href="https://review.openstack.org/#/c/128753">review 128753</a>.
</ul>
<br/><br/><b>Hypervisor: FreeBSD</b><br/><br/>
<ul>
<li>Implement support for FreeBSD networking in nova-network: <a href="https://review.openstack.org/#/c/127827">review 127827</a>.
</ul>
<br/><br/><b>Hypervisor: Hyper-V</b><br/><br/>
<ul>
<li>Allow volumes to be stored on SMB shares instead of just iSCSI: <a href="https://review.openstack.org/#/c/102190">review 102190</a> <b>(approved)</b>.
<li>Instance hot resize: <a href="https://review.openstack.org/#/c/141219">review 141219</a>.
</ul>
<br/><br/><b>Hypervisor: Ironic</b><br/><br/>
<ul>
<li>Add config drive support: <a href="https://review.openstack.org/#/c/98930">review 98930</a> <b>(approved)</b>.
<li>Pass through flavor capabilities to ironic: <a href="https://review.openstack.org/#/c/136104">review 136104</a>.
</ul>
<br/><br/><b>Hypervisor: VMWare</b><br/><br/>
<ul>
<li>Add ephemeral disk support to the VMware driver: <a href="https://review.openstack.org/#/c/126527">review 126527</a> <b>(fast tracked, approved)</b>.
<li>Add support for the HTML5 console: <a href="https://review.openstack.org/#/c/127283">review 127283</a>.
<li>Allow Nova to access a VMWare image store over NFS: <a href="https://review.openstack.org/#/c/126866">review 126866</a>.
<li>Enable administrators and tenants to take advantage of backend storage policies: <a href="https://review.openstack.org/#/c/126547">review 126547</a> <b>(fast tracked, approved)</b>.
<li>Enable the mapping of raw cinder devices to instances: <a href="https://review.openstack.org/#/c/128697">review 128697</a>.
<li>Implement vSAN support: <a href="https://review.openstack.org/#/c/128600">review 128600</a> <b>(fast tracked, approved)</b>.
<li>Support multiple disks inside a single OVA file: <a href="https://review.openstack.org/#/c/128691">review 128691</a>.
<li>Support the OVA image format: <a href="https://review.openstack.org/#/c/127054">review 127054</a> <b>(fast tracked, approved)</b>.
</ul>
<br/><br/><b>Hypervisor: libvirt</b><br/><br/>
<ul>
<li>Add Quobyte USP support: <a href="https://review.openstack.org/#/c/138372">review 138372</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/138373">review 138373</a> <b>(approved)</b>.
<li>Add VIF_VHOSTUSER vif type: <a href="https://review.openstack.org/#/c/138736">review 138736</a> <b>(approved)</b>.
<li>Add a Quobyte Volume Driver: <a href="https://review.openstack.org/#/c/138375">review 138375</a> <b>(abandoned)</b>.
<li>Add finetunable configuration settings for virtio-scsi: <a href="https://review.openstack.org/#/c/103797">review 103797</a> <b>(abandoned)</b>.
<li>Add large page support: <a href="https://review.openstack.org/#/c/129608">review 129608</a> <b>(approved)</b>.
<li>Add support for SMBFS as a image storage backend: <a href="https://review.openstack.org/#/c/103203">review 103203</a> <b>(approved)</b>.
<li>Allow scheduling of instances such that PCI passthrough devices are co-located on the same NUMA node as other instance resources: <a href="https://review.openstack.org/#/c/128344">review 128344</a> <b>(fast tracked, approved)</b>.
<li>Allow specification of the device boot order for instances: <a href="https://review.openstack.org/#/c/133254">review 133254</a>.
<li>Allow the administrator to explicitly set the version of the qemu emulator to use: <a href="https://review.openstack.org/#/c/138731">review 138731</a> <b>(abandoned)</b>.
<li>Consider PCI offload capabilities when scheduling instances: <a href="https://review.openstack.org/#/c/135331">review 135331</a>.
<li>Convert to using built in libvirt disk copy mechanisms for cold migrations on non-shared storage: <a href="https://review.openstack.org/#/c/126979">review 126979</a> <b>(fast tracked)</b>.
<li>Derive hardware policy from libosinfo: <a href="https://review.openstack.org/#/c/133945">review 133945</a>.
<li>Implement COW volumes via VMThunder to allow fast boot of large numbers of instances: <a href="https://review.openstack.org/#/c/128810">review 128810</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128813">review 128813</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128830">review 128830</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128845">review 128845</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129093">review 129093</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129108">review 129108</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129110">review 129110</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129113">review 129113</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129116">review 129116</a>; <a href="https://review.openstack.org/#/c/137617">review 137617</a>.
<li>Implement configurable policy over where virtual CPUs should be placed on physical CPUs: <a href="https://review.openstack.org/#/c/129606">review 129606</a> <b>(approved)</b>.
<li>Implement support for Parallels Cloud Server: <a href="https://review.openstack.org/#/c/111335">review 111335</a> <b>(approved)</b>; <a href="https://review.openstack.org/#/c/128990">review 128990</a> <b>(abandoned)</b>.
<li>Implement support for zkvm as a libvirt hypervisor: <a href="https://review.openstack.org/#/c/130447">review 130447</a> <b>(approved)</b>.
<li>Improve total network throughput by supporting virtio-net multiqueue: <a href="https://review.openstack.org/#/c/128825">review 128825</a>.
<li>Improvements to the cinder integration for snapshots: <a href="https://review.openstack.org/#/c/134517">review 134517</a>.
<li>Quiesce instance disks during snapshot: <a href="https://review.openstack.org/#/c/128112">review 128112</a>; <a href="https://review.openstack.org/#/c/131587">review 131587</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/131597">review 131597</a>.
<li>Real time instances: <a href="https://review.openstack.org/#/c/139688">review 139688</a>.
<li>Stop dm-crypt device when an encrypted instance is suspended or stopped: <a href="https://review.openstack.org/#/c/140847">review 140847</a> <b>(approved)</b>.
<li>Support SR-IOV interface attach and detach: <a href="https://review.openstack.org/#/c/139910">review 139910</a>.
<li>Support StorPool as a storage backend: <a href="https://review.openstack.org/#/c/137830">review 137830</a>.
<li>Support for live block device IO tuning: <a href="https://review.openstack.org/#/c/136704">review 136704</a>.
<li>Support libvirt storage pools: <a href="https://review.openstack.org/#/c/126978">review 126978</a> <b>(fast tracked, approved)</b>.
<li>Support live migration with macvtap SR-IOV: <a href="https://review.openstack.org/#/c/136077">review 136077</a>.
<li>Support quiesce filesystems during snapshot: <a href="https://review.openstack.org/#/c/126966">review 126966</a> <b>(fast tracked, approved)</b>.
<li>Support using qemu's built in iSCSI initiator: <a href="https://review.openstack.org/#/c/133048">review 133048</a> <b>(approved)</b>.
<li>Volume driver for Huawei SDSHypervisor: <a href="https://review.openstack.org/#/c/130919">review 130919</a>.
</ul>
<br/><br/><b>Instance features</b><br/><br/>
<ul>
<li>Allow portions of an instance's uuid to be configurable: <a href="https://review.openstack.org/#/c/130451">review 130451</a>.
<li>Attempt to schedule cinder volumes "close" to instances: <a href="https://review.openstack.org/#/c/130851">review 130851</a>; <a href="https://review.openstack.org/#/c/131050">review 131050</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/131051">review 131051</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/131151">review 131151</a> <b>(abandoned)</b>.
<li>Dynamic server groups: <a href="https://review.openstack.org/#/c/130005">review 130005</a> <b>(abandoned)</b>.
<li>Improve the performance of unshelve for those using shared storage for instance disks: <a href="https://review.openstack.org/#/c/135387">review 135387</a>.
</ul>
<br/><br/><b>Internal</b><br/><br/>
<ul>
<li>A lock-free quota implementation: <a href="https://review.openstack.org/#/c/135296">review 135296</a>.
<li>Automate the documentation of the virtual machine state transition graph: <a href="https://review.openstack.org/#/c/94835">review 94835</a>.
<li>Fake Libvirt driver for simulating HW testing: <a href="https://review.openstack.org/#/c/139927">review 139927</a> <b>(abandoned)</b>.
<li>Flatten Aggregate Metadata in the DB: <a href="https://review.openstack.org/#/c/134573">review 134573</a> <b>(abandoned)</b>.
<li>Flatten Instance Metadata in the DB: <a href="https://review.openstack.org/#/c/134945">review 134945</a> <b>(abandoned)</b>.
<li>Implement a new code coverage API extension: <a href="https://review.openstack.org/#/c/130855">review 130855</a>.
<li>Move flavor data out of the system_metadata table in the SQL database: <a href="https://review.openstack.org/#/c/126620">review 126620</a> <b>(approved)</b>.
<li>Move to polling for cinder operations: <a href="https://review.openstack.org/#/c/135367">review 135367</a>.
<li>PCI test cases for third party CI: <a href="https://review.openstack.org/#/c/141270">review 141270</a>.
<li>Transition Nova to using the Glance v2 API: <a href="https://review.openstack.org/#/c/84887">review 84887</a>.
<li>Transition to using glanceclient instead of our own home grown wrapper: <a href="https://review.openstack.org/#/c/133485">review 133485</a> <b>(approved)</b>.
</ul>
<br/><br/><b>Internationalization</b><br/><br/>
<ul>
<li>Enable lazy translations of strings: <a href="https://review.openstack.org/#/c/126717">review 126717</a> <b>(fast tracked)</b>.
</ul>
<br/><br/><b>Networking</b><br/><br/>
<ul>
<li>Add a new linuxbridge VIF type, macvtap: <a href="https://review.openstack.org/#/c/117465">review 117465</a> <b>(abandoned)</b>.
<li>Add a plugin mechanism for VIF drivers: <a href="https://review.openstack.org/#/c/136827">review 136827</a>.
<li>Add support for InfiniBand SR-IOV VIF Driver: <a href="https://review.openstack.org/#/c/131729">review 131729</a>.
<li>Neutron DNS Using Nova Hostname: <a href="https://review.openstack.org/#/c/90150">review 90150</a> <b>(abandoned)</b>.
<li>New VIF type to allow routing VM data instead of bridging it: <a href="https://review.openstack.org/#/c/130732">review 130732</a>.
<li>Nova Plugin for OpenContrail: <a href="https://review.openstack.org/#/c/126446">review 126446</a> <b>(approved)</b>.
<li>Refactor of the Neutron network adapter to be more maintainable: <a href="https://review.openstack.org/#/c/131413">review 131413</a>.
<li>Use the Nova hostname in Neutron DNS: <a href="https://review.openstack.org/#/c/137669">review 137669</a>.
<li>Wrap the Python NeutronClient: <a href="https://review.openstack.org/#/c/141108">review 141108</a>.
</ul>
<br/><br/><b>Performance</b><br/><br/>
<ul>
<li>Dynamically alter the interval nova polls components at based on load and expected time for an operation to complete: <a href="https://review.openstack.org/#/c/122705">review 122705</a>.
</ul>
<br/><br/><b>Scheduler</b><br/><br/>
<ul>
<li>A nested quota driver API: <a href="https://review.openstack.org/#/c/129420">review 129420</a>.
<li>Add a filter to take into account hypervisor type and version when scheduling: <a href="https://review.openstack.org/#/c/137714">review 137714</a>.
<li>Add an IOPS weigher: <a href="https://review.openstack.org/#/c/127123">review 127123</a> <b>(approved, implemented)</b>; <a href="https://review.openstack.org/#/c/132614">review 132614</a>.
<li>Add instance count on the hypervisor as a weight: <a href="https://review.openstack.org/#/c/127871">review 127871</a> <b>(abandoned)</b>.
<li>Allow extra spec to match all values in a list by adding the ALL-IN operator: <a href="https://review.openstack.org/#/c/138698">review 138698</a> <b>(fast tracked, approved)</b>.
<li>Allow limiting the flavors that can be scheduled on certain host aggregates: <a href="https://review.openstack.org/#/c/122530">review 122530</a> <b>(abandoned)</b>.
<li>Allow the removal of servers from server groups: <a href="https://review.openstack.org/#/c/136487">review 136487</a>.
<li>Convert get_available_resources to use an object instead of dict: <a href="https://review.openstack.org/#/c/133728">review 133728</a> <b>(abandoned)</b>.
<li>Convert the resource tracker to objects: <a href="https://review.openstack.org/#/c/128964">review 128964</a> <b>(fast tracked, approved)</b>.
<li>Create an object model to represent a request to boot an instance: <a href="https://review.openstack.org/#/c/127610">review 127610</a> <b>(approved)</b>.
<li>Decouple services and compute nodes in the SQL database: <a href="https://review.openstack.org/#/c/126895">review 126895</a> <b>(approved)</b>.
<li>Enable adding new scheduler hints to already booted instances: <a href="https://review.openstack.org/#/c/134746">review 134746</a>.
<li>Fix the race conditions when migration with server-group: <a href="https://review.openstack.org/#/c/135527">review 135527</a> <b>(abandoned)</b>.
<li>Implement resource objects in the resource tracker: <a href="https://review.openstack.org/#/c/127609">review 127609</a>.
<li>Improve the ComputeCapabilities filter: <a href="https://review.openstack.org/#/c/133534">review 133534</a>.
<li>Isolate Scheduler DB for Filters: <a href="https://review.openstack.org/#/c/138444">review 138444</a>.
<li>Isolate the scheduler's use of the Nova SQL database: <a href="https://review.openstack.org/#/c/89893">review 89893</a>.
<li>Let schedulers reuse filter and weigher objects: <a href="https://review.openstack.org/#/c/134506">review 134506</a> <b>(abandoned)</b>.
<li>Move select_destinations() to using a request object: <a href="https://review.openstack.org/#/c/127612">review 127612</a> <b>(approved)</b>.
<li>Persist scheduler hints: <a href="https://review.openstack.org/#/c/88983">review 88983</a>.
<li>Refactor allocate_for_instance: <a href="https://review.openstack.org/#/c/141129">review 141129</a>.
<li>Stop direct lookup for host aggregates in the Nova database: <a href="https://review.openstack.org/#/c/132065">review 132065</a> <b>(abandoned)</b>.
<li>Stop direct lookup for instance groups in the Nova database: <a href="https://review.openstack.org/#/c/131553">review 131553</a> <b>(abandoned)</b>.
<li>Support scheduling based on more image properties: <a href="https://review.openstack.org/#/c/138937">review 138937</a>.
<li>Trusted computing support: <a href="https://review.openstack.org/#/c/133106">review 133106</a>.
</ul>
<br/><br/><b>Scheduling</b><br/><br/>
<ul>
<li>Dynamic Management of Server Groups: <a href="https://review.openstack.org/#/c/139272">review 139272</a>.
</ul>
<br/><br/><b>Security</b><br/><br/>
<ul>
<li>Make key manager interface interoperable with Barbican: <a href="https://review.openstack.org/#/c/140144">review 140144</a> <b>(fast tracked, approved)</b>.
<li>Provide a reference implementation for console proxies that uses TLS: <a href="https://review.openstack.org/#/c/126958">review 126958</a> <b>(fast tracked, approved)</b>.
<li>Strongly validate the tenant and user for quota consuming requests with keystone: <a href="https://review.openstack.org/#/c/92507">review 92507</a>.
</ul>
<br/><br/><b>Service Groups</b><br/><br/>
<ul>
<li>Pacemaker service group driver: <a href="https://review.openstack.org/#/c/139991">review 139991</a>.
<li>Transition service groups to using the new oslo Tooz library: <a href="https://review.openstack.org/#/c/138607">review 138607</a>.
</ul>
<br/><br/><b>Scheduler</b><br/><br/>
<ul>
<li>Add soft affinity support for server group: <a href="https://review.openstack.org/#/c/140017">review 140017</a> <b>(approved)</b>.
</ul>
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/kilo.html">kilo</a> <a href="http://www.stillhq.com/tags/blueprint.html">blueprint</a> <a href="http://www.stillhq.com/tags/spec.html">spec</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/kilo/000005.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/kilo/000003.html">Compute Kilo specs are open</a>; <a href="http://www.stillhq.com/openstack/kilo/000006.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/juno/000013.html">Juno nova mid-cycle meetup summary: slots</a>; <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a></i>
<a href="http://www.stillhq.com/openstack/kilo/000007.commentform.html">Comment</a>
http://www.stillhq.com/openstack/kilo/000007.html
http://www.stillhq.com/openstack/kilo/000007.htmlSoft deleting instances and the reclaim_instance_interval in Nova/openstackSun, 14 Dec 2014 13:51:00 PSTI got asked the other day how the reclaim_instance_interval in Nova works, so I thought I'd write it up here in case it's useful to other people.
<br/><br/>
First off, there is a periodic task run by the nova-compute process (or the compute manager, as a developer would know it), which runs every reclaim_instance_interval seconds. It looks for instances in the SOFT_DELETED state which don't have any tasks running at the moment, on the hypervisor node that nova-compute is running on.
<br/><br/>
For each instance it finds, it checks if the instance has been soft deleted for at least reclaim_instance_interval seconds. From my reading of the code, this has the side effect that an instance needs to be deleted for at least reclaim_instance_interval seconds before it will be removed from disk, but the instance might be up to approximately twice that age (if it was deleted just as the periodic task ran, it would skip the next run and therefore not be deleted for two intervals).
<br/><br/>
Once these conditions are met, the instance is deleted from disk.
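<br/><br/>
The check described above can be sketched in Python. This is an illustrative approximation only, not the actual nova.compute.manager code: the instance dictionaries, the state strings, and the reclaim_queued_deletes helper are all hypothetical names I've made up for the sketch.

```python
import time

# Illustrative value; in a real deployment this comes from nova.conf.
RECLAIM_INSTANCE_INTERVAL = 3600  # seconds

def reclaim_queued_deletes(instances, now=None):
    """Return the instances whose disks are due for reclaim.

    Mirrors the behaviour described above: an instance must be
    SOFT_DELETED, have no task in flight, and have been deleted for
    at least RECLAIM_INSTANCE_INTERVAL seconds.
    """
    now = time.time() if now is None else now
    due = []
    for inst in instances:
        if inst["vm_state"] != "soft-delete":
            continue  # only soft-deleted instances are candidates
        if inst["task_state"] is not None:
            continue  # skip instances with an operation in progress
        if now - inst["deleted_at"] >= RECLAIM_INSTANCE_INTERVAL:
            due.append(inst)
    return due

# One instance deleted two intervals ago, one deleted 200 seconds ago.
instances = [
    {"uuid": "a", "vm_state": "soft-delete", "task_state": None,
     "deleted_at": 0},
    {"uuid": "b", "vm_state": "soft-delete", "task_state": None,
     "deleted_at": 7000},
]
print([i["uuid"] for i in reclaim_queued_deletes(instances, now=7200)])
```

Note how instance "b" is skipped even though it is soft deleted: it hasn't yet aged past the interval, which is exactly the up-to-two-intervals delay discussed above.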
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a> <a href="http://www.stillhq.com/tags/instance.html">instance</a> <a href="http://www.stillhq.com/tags/delete.html">delete</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/kilo/000005.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>; <a href="http://www.stillhq.com/diary/001052.html">Historical revisionism</a>; <a href="http://www.stillhq.com/openstack/juno/000001.html">Juno Nova PTL Candidacy</a>; <a href="http://www.stillhq.com/openstack/juno/000012.html">Juno nova mid-cycle meetup summary: scheduler</a></i>
<a href="http://www.stillhq.com/openstack/000018.commentform.html">Comment</a>
http://www.stillhq.com/openstack/000018.html
http://www.stillhq.com/openstack/000018.htmlSpecs for Kilo, an update/openstack/kiloMon, 01 Dec 2014 20:13:00 PSTWe're now a few weeks away from the kilo-1 milestone, so I thought it was time to update my summary of the Nova specifications that have been proposed so far. So here we go...
<br/><br/><b>API</b><br/><br/>
<ul>
<li>Add more detailed network information to the metadata server: <a href="https://review.openstack.org/#/c/85673">review 85673</a>.
<li>Add separated policy rule for each v2.1 api: <a href="https://review.openstack.org/#/c/127863">review 127863</a>.
<li>Add user limits to the limits API (as well as project limits): <a href="https://review.openstack.org/#/c/127094">review 127094</a>.
<li>Allow all printable characters in resource names: <a href="https://review.openstack.org/#/c/126696">review 126696</a>.
<li>Expose the lock status of an instance as a queryable item: <a href="https://review.openstack.org/#/c/127139">review 127139</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/85928">review 85928</a> <b>(approved)</b>.
<li>Implement instance tagging: <a href="https://review.openstack.org/#/c/127281">review 127281</a> <b>(fast tracked, approved)</b>.
<li>Implement the v2.1 API: <a href="https://review.openstack.org/#/c/126452">review 126452</a> <b>(fast tracked, approved)</b>.
<li>Improve the return codes for the instance lock APIs: <a href="https://review.openstack.org/#/c/135506">review 135506</a>.
<li>Microversion support: <a href="https://review.openstack.org/#/c/127127">review 127127</a> <b>(approved)</b>.
<li>Move policy validation to just the API layer: <a href="https://review.openstack.org/#/c/127160">review 127160</a>.
<li>Nova Server Count API Extension: <a href="https://review.openstack.org/#/c/134279">review 134279</a> <b>(fast tracked)</b>.
<li>Provide a policy statement on the goals of our API policies: <a href="https://review.openstack.org/#/c/128560">review 128560</a>.
<li>Sorting enhancements: <a href="https://review.openstack.org/#/c/131868">review 131868</a> <b>(fast tracked, approved)</b>.
<li>Support JSON-Home for API extension discovery: <a href="https://review.openstack.org/#/c/130715">review 130715</a>.
<li>Support X509 keypairs: <a href="https://review.openstack.org/#/c/105034">review 105034</a> <b>(approved)</b>.
</ul>
<br/><br/><b>API (EC2)</b><br/><br/>
<ul>
<li>Expand support for volume filtering in the EC2 API: <a href="https://review.openstack.org/#/c/104450">review 104450</a>.
<li>Implement tags for volumes and snapshots with the EC2 API: <a href="https://review.openstack.org/#/c/126553">review 126553</a> <b>(fast tracked, approved)</b>.
</ul>
<br/><br/><b>Administrative</b><br/><br/>
<ul>
<li>Check that a service isn't running before deleting it: <a href="https://review.openstack.org/#/c/131633">review 131633</a>.
<li>Enable the nova metadata cache to be a shared resource to improve the hit rate: <a href="https://review.openstack.org/#/c/126705">review 126705</a> <b>(abandoned)</b>.
<li>Enforce instance uuid uniqueness in the SQL database: <a href="https://review.openstack.org/#/c/128097">review 128097</a> <b>(fast tracked, approved)</b>.
<li>Implement a daemon version of rootwrap: <a href="https://review.openstack.org/#/c/105404">review 105404</a>.
<li>Log request id mappings: <a href="https://review.openstack.org/#/c/132819">review 132819</a> <b>(fast tracked)</b>.
<li>Monitor the health of hypervisor hosts: <a href="https://review.openstack.org/#/c/137768">review 137768</a>.
<li>Remove the assumption that there is a single endpoint for services that nova talks to: <a href="https://review.openstack.org/#/c/132623">review 132623</a>.
</ul>
<br/><br/><b>Cells</b><br/><br/>
<ul>
<li>Create an instance mapping database: <a href="https://review.openstack.org/#/c/135644">review 135644</a>.
<li>Implement instance mapping: <a href="https://review.openstack.org/#/c/135424">review 135424</a>.
<li>Populate the instance mapping database: <a href="https://review.openstack.org/#/c/136490">review 136490</a>.
</ul>
<br/><br/><b>Containers Service</b><br/><br/>
<ul>
<li>Initial specification: <a href="https://review.openstack.org/#/c/114044">review 114044</a> <b>(abandoned)</b>.
</ul>
<br/><br/><b>Database</b><br/><br/>
<ul>
<li>Nova db purge utility: <a href="https://review.openstack.org/#/c/132656">review 132656</a>.
<li>Online schema change options: <a href="https://review.openstack.org/#/c/102545">review 102545</a>.
<li>Validate database migrations and models: <a href="https://review.openstack.org/#/c/134984">review 134984</a> <b>(approved)</b>.
</ul>
<br/><br/><b>Hypervisor: Docker</b><br/><br/>
<ul>
<li>Migrate the Docker Driver into Nova: <a href="https://review.openstack.org/#/c/128753">review 128753</a>.
</ul>
<br/><br/><b>Hypervisor: FreeBSD</b><br/><br/>
<ul>
<li>Implement support for FreeBSD networking in nova-network: <a href="https://review.openstack.org/#/c/127827">review 127827</a>.
</ul>
<br/><br/><b>Hypervisor: Hyper-V</b><br/><br/>
<ul>
<li>Allow volumes to be stored on SMB shares instead of just iSCSI: <a href="https://review.openstack.org/#/c/102190">review 102190</a> <b>(approved)</b>.
</ul>
<br/><br/><b>Hypervisor: Ironic</b><br/><br/>
<ul>
<li>Add config drive support: <a href="https://review.openstack.org/#/c/98930">review 98930</a>.
</ul>
<br/><br/><b>Hypervisor: VMWare</b><br/><br/>
<ul>
<li>Add ephemeral disk support to the VMware driver: <a href="https://review.openstack.org/#/c/126527">review 126527</a> <b>(fast tracked, approved)</b>.
<li>Add support for the HTML5 console: <a href="https://review.openstack.org/#/c/127283">review 127283</a>.
<li>Allow Nova to access a VMWare image store over NFS: <a href="https://review.openstack.org/#/c/126866">review 126866</a>.
<li>Enable administrators and tenants to take advantage of backend storage policies: <a href="https://review.openstack.org/#/c/126547">review 126547</a> <b>(fast tracked, approved)</b>.
<li>Enable the mapping of raw cinder devices to instances: <a href="https://review.openstack.org/#/c/128697">review 128697</a>.
<li>Implement vSAN support: <a href="https://review.openstack.org/#/c/128600">review 128600</a> <b>(fast tracked, approved)</b>.
<li>Support multiple disks inside a single OVA file: <a href="https://review.openstack.org/#/c/128691">review 128691</a>.
<li>Support the OVA image format: <a href="https://review.openstack.org/#/c/127054">review 127054</a> <b>(fast tracked, approved)</b>.
</ul>
<br/><br/><b>Hypervisor: ironic</b><br/><br/>
<ul>
<li>Pass through flavor capabilities to ironic: <a href="https://review.openstack.org/#/c/136104">review 136104</a>.
</ul>
<br/><br/><b>Hypervisor: libvirt</b><br/><br/>
<ul>
<li>Add finetunable configuration settings for virtio-scsi: <a href="https://review.openstack.org/#/c/103797">review 103797</a> <b>(abandoned)</b>.
<li>Add large page support: <a href="https://review.openstack.org/#/c/129608">review 129608</a> <b>(approved)</b>.
<li>Add support for SMBFS as an image storage backend: <a href="https://review.openstack.org/#/c/103203">review 103203</a> <b>(approved)</b>.
<li>Allow scheduling of instances such that PCI passthrough devices are co-located on the same NUMA node as other instance resources: <a href="https://review.openstack.org/#/c/128344">review 128344</a> <b>(fast tracked, approved)</b>.
<li>Allow specification of the device boot order for instances: <a href="https://review.openstack.org/#/c/133254">review 133254</a>.
<li>Consider PCI offload capabilities when scheduling instances: <a href="https://review.openstack.org/#/c/135331">review 135331</a>.
<li>Convert to using built in libvirt disk copy mechanisms for cold migrations on non-shared storage: <a href="https://review.openstack.org/#/c/126979">review 126979</a> <b>(fast tracked)</b>.
<li>Derive hardware policy from libosinfo: <a href="https://review.openstack.org/#/c/133945">review 133945</a>.
<li>Implement COW volumes via VMThunder to allow fast boot of large numbers of instances: <a href="https://review.openstack.org/#/c/128810">review 128810</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128813">review 128813</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128830">review 128830</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128845">review 128845</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129093">review 129093</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129108">review 129108</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129110">review 129110</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129113">review 129113</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129116">review 129116</a>; <a href="https://review.openstack.org/#/c/137617">review 137617</a>.
<li>Implement configurable policy over where virtual CPUs should be placed on physical CPUs: <a href="https://review.openstack.org/#/c/129606">review 129606</a> <b>(approved)</b>.
<li>Implement support for Parallels Cloud Server: <a href="https://review.openstack.org/#/c/111335">review 111335</a> <b>(approved)</b>; <a href="https://review.openstack.org/#/c/128990">review 128990</a> <b>(abandoned)</b>.
<li>Implement support for zkvm as a libvirt hypervisor: <a href="https://review.openstack.org/#/c/130447">review 130447</a> <b>(approved)</b>.
<li>Improve total network throughput by supporting virtio-net multiqueue: <a href="https://review.openstack.org/#/c/128825">review 128825</a>.
<li>Improvements to the cinder integration for snapshots: <a href="https://review.openstack.org/#/c/134517">review 134517</a>.
<li>Quiesce instance disks during snapshot: <a href="https://review.openstack.org/#/c/128112">review 128112</a>; <a href="https://review.openstack.org/#/c/131587">review 131587</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/131597">review 131597</a>.
<li>Support StorPool as a storage backend: <a href="https://review.openstack.org/#/c/137830">review 137830</a>.
<li>Support for live block device IO tuning: <a href="https://review.openstack.org/#/c/136704">review 136704</a>.
<li>Support libvirt storage pools: <a href="https://review.openstack.org/#/c/126978">review 126978</a> <b>(fast tracked, approved)</b>.
<li>Support live migration with macvtap SR-IOV: <a href="https://review.openstack.org/#/c/136077">review 136077</a>.
<li>Support quiesce filesystems during snapshot: <a href="https://review.openstack.org/#/c/126966">review 126966</a> <b>(fast tracked, approved)</b>.
<li>Support using qemu's built in iSCSI initiator: <a href="https://review.openstack.org/#/c/133048">review 133048</a> <b>(approved)</b>.
<li>Volume driver for Huawei SDSHypervisor: <a href="https://review.openstack.org/#/c/130919">review 130919</a>.
</ul>
<br/><br/><b>Instance features</b><br/><br/>
<ul>
<li>Allow portions of an instance's uuid to be configurable: <a href="https://review.openstack.org/#/c/130451">review 130451</a>.
<li>Attempt to schedule cinder volumes "close" to instances: <a href="https://review.openstack.org/#/c/130851">review 130851</a>; <a href="https://review.openstack.org/#/c/131050">review 131050</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/131051">review 131051</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/131151">review 131151</a> <b>(abandoned)</b>.
<li>Dynamic server groups: <a href="https://review.openstack.org/#/c/130005">review 130005</a> <b>(abandoned)</b>.
<li>Improve the performance of unshelve for those using shared storage for instance disks: <a href="https://review.openstack.org/#/c/135387">review 135387</a>.
</ul>
<br/><br/><b>Internal</b><br/><br/>
<ul>
<li>A lock-free quota implementation: <a href="https://review.openstack.org/#/c/135296">review 135296</a>.
<li>Automate the documentation of the virtual machine state transition graph: <a href="https://review.openstack.org/#/c/94835">review 94835</a>.
<li>Flatten Aggregate Metadata in the DB: <a href="https://review.openstack.org/#/c/134573">review 134573</a>.
<li>Flatten Instance Metadata in the DB: <a href="https://review.openstack.org/#/c/134945">review 134945</a>.
<li>Implement a new code coverage API extension: <a href="https://review.openstack.org/#/c/130855">review 130855</a>.
<li>Move flavor data out of the system_metadata table in the SQL database: <a href="https://review.openstack.org/#/c/126620">review 126620</a> <b>(approved)</b>.
<li>Move to polling for cinder operations: <a href="https://review.openstack.org/#/c/135367">review 135367</a>.
<li>Transition Nova to using the Glance v2 API: <a href="https://review.openstack.org/#/c/84887">review 84887</a>.
<li>Transition to using glanceclient instead of our own home grown wrapper: <a href="https://review.openstack.org/#/c/133485">review 133485</a>.
</ul>
<br/><br/><b>Internationalization</b><br/><br/>
<ul>
<li>Enable lazy translations of strings: <a href="https://review.openstack.org/#/c/126717">review 126717</a> <b>(fast tracked)</b>.
</ul>
<br/><br/><b>Networking</b><br/><br/>
<ul>
<li>Add a new linuxbridge VIF type, macvtap: <a href="https://review.openstack.org/#/c/117465">review 117465</a> <b>(abandoned)</b>.
<li>Add a plugin mechanism for VIF drivers: <a href="https://review.openstack.org/#/c/136827">review 136827</a>.
<li>Add support for InfiniBand SR-IOV VIF Driver: <a href="https://review.openstack.org/#/c/131729">review 131729</a>.
<li>Neutron DNS Using Nova Hostname: <a href="https://review.openstack.org/#/c/90150">review 90150</a>.
<li>New VIF type to allow routing VM data instead of bridging it: <a href="https://review.openstack.org/#/c/130732">review 130732</a>.
<li>Nova Plugin for OpenContrail: <a href="https://review.openstack.org/#/c/126446">review 126446</a>.
<li>Refactor of the Neutron network adapter to be more maintainable: <a href="https://review.openstack.org/#/c/131413">review 131413</a>.
<li>Use the Nova hostname in Neutron DNS: <a href="https://review.openstack.org/#/c/137669">review 137669</a>.
</ul>
<br/><br/><b>Performance</b><br/><br/>
<ul>
<li>Dynamically alter the interval nova polls components at based on load and expected time for an operation to complete: <a href="https://review.openstack.org/#/c/122705">review 122705</a>.
</ul>
<br/><br/><b>Scheduler</b><br/><br/>
<ul>
<li>Add a filter to take into account hypervisor type and version when scheduling: <a href="https://review.openstack.org/#/c/137714">review 137714</a>.
<li>Add an IOPS weigher: <a href="https://review.openstack.org/#/c/127123">review 127123</a> <b>(approved, implemented)</b>; <a href="https://review.openstack.org/#/c/132614">review 132614</a>.
<li>Add instance count on the hypervisor as a weight: <a href="https://review.openstack.org/#/c/127871">review 127871</a> <b>(abandoned)</b>.
<li>Allow limiting the flavors that can be scheduled on certain host aggregates: <a href="https://review.openstack.org/#/c/122530">review 122530</a> <b>(abandoned)</b>.
<li>Allow the removal of servers from server groups: <a href="https://review.openstack.org/#/c/136487">review 136487</a>.
<li>Convert get_available_resources to use an object instead of dict: <a href="https://review.openstack.org/#/c/133728">review 133728</a>.
<li>Convert the resource tracker to objects: <a href="https://review.openstack.org/#/c/128964">review 128964</a> <b>(fast tracked, approved)</b>.
<li>Create an object model to represent a request to boot an instance: <a href="https://review.openstack.org/#/c/127610">review 127610</a>.
<li>Decouple services and compute nodes in the SQL database: <a href="https://review.openstack.org/#/c/126895">review 126895</a> <b>(approved)</b>.
<li>Enable adding new scheduler hints to already booted instances: <a href="https://review.openstack.org/#/c/134746">review 134746</a>.
<li>Fix the race conditions when migration with server-group: <a href="https://review.openstack.org/#/c/135527">review 135527</a> <b>(abandoned)</b>.
<li>Implement resource objects in the resource tracker: <a href="https://review.openstack.org/#/c/127609">review 127609</a>.
<li>Improve the ComputeCapabilities filter: <a href="https://review.openstack.org/#/c/133534">review 133534</a>.
<li>Isolate the scheduler's use of the Nova SQL database: <a href="https://review.openstack.org/#/c/89893">review 89893</a>.
<li>Let schedulers reuse filter and weigher objects: <a href="https://review.openstack.org/#/c/134506">review 134506</a> <b>(abandoned)</b>.
<li>Move select_destinations() to using a request object: <a href="https://review.openstack.org/#/c/127612">review 127612</a>.
<li>Persist scheduler hints: <a href="https://review.openstack.org/#/c/88983">review 88983</a>.
<li>Stop direct lookup for host aggregates in the Nova database: <a href="https://review.openstack.org/#/c/132065">review 132065</a> <b>(abandoned)</b>.
<li>Stop direct lookup for instance groups in the Nova database: <a href="https://review.openstack.org/#/c/131553">review 131553</a>.
</ul>
<br/><br/><b>Security</b><br/><br/>
<ul>
<li>Provide a reference implementation for console proxies that uses TLS: <a href="https://review.openstack.org/#/c/126958">review 126958</a> <b>(fast tracked, approved)</b>.
<li>Strongly validate the tenant and user for quota consuming requests with keystone: <a href="https://review.openstack.org/#/c/92507">review 92507</a>.
</ul>
<br/><br/><b>Storage</b><br/><br/>
<ul>
<li>Allow direct access to LVM volumes if supported by Cinder: <a href="https://review.openstack.org/#/c/127318">review 127318</a>.
<li>Enhance iSCSI volume multipath support: <a href="https://review.openstack.org/#/c/134299">review 134299</a>.
<li>Failover to alternative iSCSI portals on login failure: <a href="https://review.openstack.org/#/c/137468">review 137468</a>.
<li>Implement support for a DRBD driver for Cinder block device access: <a href="https://review.openstack.org/#/c/134153">review 134153</a>.
<li>Refactor ISCSIDriver to support other iSCSI transports besides TCP: <a href="https://review.openstack.org/#/c/130721">review 130721</a>.
<li>StorPool volume attachment support: <a href="https://review.openstack.org/#/c/115716">review 115716</a>.
<li>Support iSCSI live migration for different iSCSI target: <a href="https://review.openstack.org/#/c/132323">review 132323</a> <b>(approved)</b>.
</ul>
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/kilo.html">kilo</a> <a href="http://www.stillhq.com/tags/blueprint.html">blueprint</a> <a href="http://www.stillhq.com/tags/spec.html">spec</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/kilo/000005.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/kilo/000007.html">How are we going with Nova Kilo specs after our review day?</a>; <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/kilo/000003.html">Compute Kilo specs are open</a>; <a href="http://www.stillhq.com/openstack/juno/000013.html">Juno nova mid-cycle meetup summary: slots</a>; <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a></i>
<a href="http://www.stillhq.com/openstack/kilo/000006.commentform.html">Comment</a>
http://www.stillhq.com/openstack/kilo/000006.html
http://www.stillhq.com/openstack/kilo/000006.htmlSpecs for Kilo/openstack/kiloThu, 23 Oct 2014 19:27:00 PSTHere's an updated list of the specs currently proposed for Kilo. I wanted to produce this before I start travelling for the summit in the next couple of days because I think many of these will be required reading for the Nova track at the summit.
<br/><br/><b>API</b><br/><br/>
<ul>
<li>Add instance administrative lock status to the instance detail results: <a href="https://review.openstack.org/#/c/127139">review 127139</a> <b>(abandoned)</b>.
<li>Add more detailed network information to the metadata server: <a href="https://review.openstack.org/#/c/85673">review 85673</a>.
<li>Add separated policy rule for each v2.1 api: <a href="https://review.openstack.org/#/c/127863">review 127863</a>.
<li>Add user limits to the limits API (as well as project limits): <a href="https://review.openstack.org/#/c/127094">review 127094</a>.
<li>Allow all printable characters in resource names: <a href="https://review.openstack.org/#/c/126696">review 126696</a>.
<li>Expose the lock status of an instance as a queryable item: <a href="https://review.openstack.org/#/c/85928">review 85928</a> <b>(approved)</b>.
<li>Implement instance tagging: <a href="https://review.openstack.org/#/c/127281">review 127281</a> <b>(fast tracked, approved)</b>.
<li>Implement tags for volumes and snapshots with the EC2 API: <a href="https://review.openstack.org/#/c/126553">review 126553</a> <b>(fast tracked, approved)</b>.
<li>Implement the v2.1 API: <a href="https://review.openstack.org/#/c/126452">review 126452</a> <b>(fast tracked, approved)</b>.
<li>Microversion support: <a href="https://review.openstack.org/#/c/127127">review 127127</a>.
<li>Move policy validation to just the API layer: <a href="https://review.openstack.org/#/c/127160">review 127160</a>.
<li>Provide a policy statement on the goals of our API policies: <a href="https://review.openstack.org/#/c/128560">review 128560</a>.
<li>Support X509 keypairs: <a href="https://review.openstack.org/#/c/105034">review 105034</a>.
</ul>
<br/><br/><b>Administrative</b><br/><br/>
<ul>
<li>Enable the nova metadata cache to be a shared resource to improve the hit rate: <a href="https://review.openstack.org/#/c/126705">review 126705</a> <b>(abandoned)</b>.
<li>Enforce instance uuid uniqueness in the SQL database: <a href="https://review.openstack.org/#/c/128097">review 128097</a> <b>(fast tracked, approved)</b>.
</ul>
<br/><br/><b>Containers Service</b><br/><br/>
<ul>
<li>Initial specification: <a href="https://review.openstack.org/#/c/114044">review 114044</a>.
</ul>
<br/><br/><b>Hypervisor: Docker</b><br/><br/>
<ul>
<li>Migrate the Docker Driver into Nova: <a href="https://review.openstack.org/#/c/128753">review 128753</a>.
</ul>
<br/><br/><b>Hypervisor: FreeBSD</b><br/><br/>
<ul>
<li>Implement support for FreeBSD networking in nova-network: <a href="https://review.openstack.org/#/c/127827">review 127827</a>.
</ul>
<br/><br/><b>Hypervisor: Hyper-V</b><br/><br/>
<ul>
<li>Allow volumes to be stored on SMB shares instead of just iSCSI: <a href="https://review.openstack.org/#/c/102190">review 102190</a> <b>(approved)</b>.
</ul>
<br/><br/><b>Hypervisor: Ironic</b><br/><br/>
<ul>
<li>Add config drive support: <a href="https://review.openstack.org/#/c/98930">review 98930</a>.
</ul>
<br/><br/><b>Hypervisor: VMWare</b><br/><br/>
<ul>
<li>Add ephemeral disk support to the VMware driver: <a href="https://review.openstack.org/#/c/126527">review 126527</a> <b>(fast tracked, approved)</b>.
<li>Add support for the HTML5 console: <a href="https://review.openstack.org/#/c/127283">review 127283</a>.
<li>Allow Nova to access a VMWare image store over NFS: <a href="https://review.openstack.org/#/c/126866">review 126866</a>.
<li>Enable administrators and tenants to take advantage of backend storage policies: <a href="https://review.openstack.org/#/c/126547">review 126547</a> <b>(fast tracked, approved)</b>.
<li>Enable the mapping of raw cinder devices to instances: <a href="https://review.openstack.org/#/c/128697">review 128697</a>.
<li>Implement vSAN support: <a href="https://review.openstack.org/#/c/128600">review 128600</a> <b>(fast tracked, approved)</b>.
<li>Support multiple disks inside a single OVA file: <a href="https://review.openstack.org/#/c/128691">review 128691</a>.
<li>Support the OVA image format: <a href="https://review.openstack.org/#/c/127054">review 127054</a> <b>(fast tracked, approved)</b>.
</ul>
<br/><br/><b>Hypervisor: libvirt</b><br/><br/>
<ul>
<li>Add a new linuxbridge VIF type, macvtap: <a href="https://review.openstack.org/#/c/117465">review 117465</a> <b>(abandoned)</b>.
<li>Add finetunable configuration settings for virtio-scsi: <a href="https://review.openstack.org/#/c/103797">review 103797</a>.
<li>Add large page support: <a href="https://review.openstack.org/#/c/129608">review 129608</a> <b>(approved)</b>.
<li>Add support for SMBFS as an image storage backend: <a href="https://review.openstack.org/#/c/103203">review 103203</a>.
<li>Allow scheduling of instances such that PCI passthrough devices are co-located on the same NUMA node as other instance resources: <a href="https://review.openstack.org/#/c/128344">review 128344</a> <b>(fast tracked, approved)</b>.
<li>Convert to using built in libvirt disk copy mechanisms for cold migrations on non-shared storage: <a href="https://review.openstack.org/#/c/126979">review 126979</a> <b>(fast tracked)</b>.
<li>Implement COW volumes via VMThunder to allow fast boot of large numbers of instances: <a href="https://review.openstack.org/#/c/128810">review 128810</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128813">review 128813</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128830">review 128830</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/128845">review 128845</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129093">review 129093</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129108">review 129108</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129110">review 129110</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129113">review 129113</a> <b>(abandoned)</b>; <a href="https://review.openstack.org/#/c/129116">review 129116</a>.
<li>Implement configurable policy over where virtual CPUs should be placed on physical CPUs: <a href="https://review.openstack.org/#/c/129606">review 129606</a>.
<li>Implement support for Parallels Cloud Server: <a href="https://review.openstack.org/#/c/111335">review 111335</a>; <a href="https://review.openstack.org/#/c/128990">review 128990</a> <b>(abandoned)</b>.
<li>Implement support for zkvm as a libvirt hypervisor: <a href="https://review.openstack.org/#/c/130447">review 130447</a>.
<li>Improve total network throughput by supporting virtio-net multiqueue: <a href="https://review.openstack.org/#/c/128825">review 128825</a>.
<li>Quiesce instance disks during snapshot: <a href="https://review.openstack.org/#/c/128112">review 128112</a>.
<li>Support libvirt storage pools: <a href="https://review.openstack.org/#/c/126978">review 126978</a> <b>(fast tracked)</b>.
<li>Support quiesce filesystems during snapshot: <a href="https://review.openstack.org/#/c/126966">review 126966</a> <b>(fast tracked)</b>.
</ul>
<br/><br/><b>Instance features</b><br/><br/>
<ul>
<li>Allow direct access to LVM volumes if supported by Cinder: <a href="https://review.openstack.org/#/c/127318">review 127318</a>.
<li>Allow portions of an instance's uuid to be configurable: <a href="https://review.openstack.org/#/c/130451">review 130451</a>.
<li>Dynamic server groups: <a href="https://review.openstack.org/#/c/130005">review 130005</a>.
</ul>
<br/><br/><b>Internal</b><br/><br/>
<ul>
<li>Move flavor data out of the system_metadata table in the SQL database: <a href="https://review.openstack.org/#/c/126620">review 126620</a> <b>(approved)</b>.
<li>Transition Nova to using the Glance v2 API: <a href="https://review.openstack.org/#/c/84887">review 84887</a>.
</ul>
<br/><br/><b>Internationalization</b><br/><br/>
<ul>
<li>Enable lazy translations of strings: <a href="https://review.openstack.org/#/c/126717">review 126717</a> <b>(fast tracked)</b>.
</ul>
<br/><br/><b>Performance</b><br/><br/>
<ul>
<li>Dynamically alter the interval at which nova polls components, based on load and the expected time for an operation to complete: <a href="https://review.openstack.org/#/c/122705">review 122705</a>.
</ul>
<br/><br/><b>Scheduler</b><br/><br/>
<ul>
<li>Add an IOPS weigher: <a href="https://review.openstack.org/#/c/127123">review 127123</a> <b>(approved)</b>.
<li>Add instance count on the hypervisor as a weight: <a href="https://review.openstack.org/#/c/127871">review 127871</a> <b>(abandoned)</b>.
<li>Allow limiting the flavors that can be scheduled on certain host aggregates: <a href="https://review.openstack.org/#/c/122530">review 122530</a> <b>(abandoned)</b>.
<li>Convert the resource tracker to objects: <a href="https://review.openstack.org/#/c/128964">review 128964</a> <b>(fast tracked, approved)</b>.
<li>Create an object model to represent a request to boot an instance: <a href="https://review.openstack.org/#/c/127610">review 127610</a>.
<li>Decouple services and compute nodes in the SQL database: <a href="https://review.openstack.org/#/c/126895">review 126895</a>.
<li>Implement resource objects in the resource tracker: <a href="https://review.openstack.org/#/c/127609">review 127609</a>.
<li>Isolate the scheduler's use of the Nova SQL database: <a href="https://review.openstack.org/#/c/89893">review 89893</a>.
<li>Move select_destinations() to using a request object: <a href="https://review.openstack.org/#/c/127612">review 127612</a>.
</ul>
<br/><br/><b>Security</b><br/><br/>
<ul>
<li>Provide a reference implementation for console proxies that uses TLS: <a href="https://review.openstack.org/#/c/126958">review 126958</a> <b>(fast tracked)</b>.
<li>Strongly validate the tenant and user for quota consuming requests with keystone: <a href="https://review.openstack.org/#/c/92507">review 92507</a>.
</ul>
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/kilo.html">kilo</a> <a href="http://www.stillhq.com/tags/blueprint.html">blueprint</a> <a href="http://www.stillhq.com/tags/spec.html">spec</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/kilo/000007.html">How are we going with Nova Kilo specs after our review day?</a>; <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/kilo/000003.html">Compute Kilo specs are open</a>; <a href="http://www.stillhq.com/openstack/kilo/000006.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/juno/000013.html">Juno nova mid-cycle meetup summary: slots</a>; <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a></i>
<a href="http://www.stillhq.com/openstack/kilo/000005.commentform.html">Comment</a>
http://www.stillhq.com/openstack/kilo/000005.html
<br/><br/>
<b>One week of Nova Kilo specifications</b> (Mon, 13 Oct 2014 03:27:00 PST)
<br/><br/>
It's been one week of specifications for Nova in Kilo. What are we seeing proposed so far? Here's a summary...
<br/><br/><b>API</b><br/><br/>
<ul>
<li>Add instance administrative lock status to the instance detail results: <a href="https://review.openstack.org/#/c/127139">review 127139</a>.
<li>Add more detailed network information to the metadata server: <a href="https://review.openstack.org/#/c/85673">review 85673</a>.
<li>Add separated policy rule for each v2.1 api: <a href="https://review.openstack.org/#/c/127863">review 127863</a>.
<li>Add user limits to the limits API (as well as project limits): <a href="https://review.openstack.org/#/c/127094">review 127094</a>.
<li>Allow all printable characters in resource names: <a href="https://review.openstack.org/#/c/126696">review 126696</a>.
<li>Implement instance tagging: <a href="https://review.openstack.org/#/c/127281">review 127281</a>.
<li>Implement tags for volumes and snapshots with the EC2 API: <a href="https://review.openstack.org/#/c/126553">review 126553</a> <b>(spec approved)</b>.
<li>Implement the v2.1 API: <a href="https://review.openstack.org/#/c/126452">review 126452</a> <b>(spec approved)</b>.
<li>Microversion support: <a href="https://review.openstack.org/#/c/127127">review 127127</a>.
<li>Move policy validation to just the API layer: <a href="https://review.openstack.org/#/c/127160">review 127160</a>.
<li>Support X509 keypairs: <a href="https://review.openstack.org/#/c/105034">review 105034</a>.
</ul>
<br/><br/><b>Administrative</b><br/><br/>
<ul>
<li>Enable the nova metadata cache to be a shared resource to improve the hit rate: <a href="https://review.openstack.org/#/c/126705">review 126705</a>.
</ul>
<br/><br/><b>Containers Service</b><br/><br/>
<ul>
<li>Initial specification: <a href="https://review.openstack.org/#/c/114044">review 114044</a>.
</ul>
<br/><br/><b>Hypervisor: FreeBSD</b><br/><br/>
<ul>
<li>Implement support for FreeBSD networking in nova-network: <a href="https://review.openstack.org/#/c/127827">review 127827</a>.
</ul>
<br/><br/><b>Hypervisor: Hyper-V</b><br/><br/>
<ul>
<li>Allow volumes to be stored on SMB shares instead of just iSCSI: <a href="https://review.openstack.org/#/c/102190">review 102190</a>.
</ul>
<br/><br/><b>Hypervisor: VMWare</b><br/><br/>
<ul>
<li>Add ephemeral disk support to the VMware driver: <a href="https://review.openstack.org/#/c/126527">review 126527</a> <b>(spec approved)</b>.
<li>Add support for the HTML5 console: <a href="https://review.openstack.org/#/c/127283">review 127283</a>.
<li>Allow Nova to access a VMWare image store over NFS: <a href="https://review.openstack.org/#/c/126866">review 126866</a>.
<li>Enable administrators and tenants to take advantage of backend storage policies: <a href="https://review.openstack.org/#/c/126547">review 126547</a> <b>(spec approved)</b>.
<li>Support the OVA image format: <a href="https://review.openstack.org/#/c/127054">review 127054</a>.
</ul>
<br/><br/><b>Hypervisor: libvirt</b><br/><br/>
<ul>
<li>Add a new linuxbridge VIF type, macvtap: <a href="https://review.openstack.org/#/c/117465">review 117465</a>.
<li>Add support for SMBFS as an image storage backend: <a href="https://review.openstack.org/#/c/103203">review 103203</a>.
<li>Convert to using built in libvirt disk copy mechanisms for cold migrations on non-shared storage: <a href="https://review.openstack.org/#/c/126979">review 126979</a>.
<li>Support libvirt storage pools: <a href="https://review.openstack.org/#/c/126978">review 126978</a>.
<li>Support quiesce filesystems during snapshot: <a href="https://review.openstack.org/#/c/126966">review 126966</a>.
</ul>
<br/><br/><b>Instance features</b><br/><br/>
<ul>
<li>Allow direct access to LVM volumes if supported by Cinder: <a href="https://review.openstack.org/#/c/127318">review 127318</a>.
</ul>
<br/><br/><b>Internal</b><br/><br/>
<ul>
<li>Move flavor data out of the system_metadata table in the SQL database: <a href="https://review.openstack.org/#/c/126620">review 126620</a>.
</ul>
<br/><br/><b>Internationalization</b><br/><br/>
<ul>
<li>Enable lazy translations of strings: <a href="https://review.openstack.org/#/c/126717">review 126717</a>.
</ul>
<br/><br/><b>Scheduler</b><br/><br/>
<ul>
<li>Add an IOPS weigher: <a href="https://review.openstack.org/#/c/127123">review 127123</a> <b>(spec approved)</b>.
<li>Allow limiting the flavors that can be scheduled on certain host aggregates: <a href="https://review.openstack.org/#/c/122530">review 122530</a>.
<li>Create an object model to represent a request to boot an instance: <a href="https://review.openstack.org/#/c/127610">review 127610</a>.
<li>Decouple services and compute nodes in the SQL database: <a href="https://review.openstack.org/#/c/126895">review 126895</a>.
<li>Implement resource objects in the resource tracker: <a href="https://review.openstack.org/#/c/127609">review 127609</a>.
<li>Move select_destinations() to using a request object: <a href="https://review.openstack.org/#/c/127612">review 127612</a>.
</ul>
<br/><br/><b>Scheduling</b><br/><br/>
<ul>
<li>Add instance count on the hypervisor as a weight: <a href="https://review.openstack.org/#/c/127871">review 127871</a>.
</ul>
<br/><br/><b>Security</b><br/><br/>
<ul>
<li>Provide a reference implementation for console proxies that uses TLS: <a href="https://review.openstack.org/#/c/126958">review 126958</a>.
<li>Strongly validate the tenant and user for quota consuming requests with keystone: <a href="https://review.openstack.org/#/c/92507">review 92507</a>.
</ul>
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/kilo.html">kilo</a> <a href="http://www.stillhq.com/tags/blueprints.html">blueprints</a> <a href="http://www.stillhq.com/tags/spec.html">spec</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/kilo/000003.html">Compute Kilo specs are open</a>; <a href="http://www.stillhq.com/openstack/kilo/000005.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/kilo/000007.html">How are we going with Nova Kilo specs after our review day?</a>; <a href="http://www.stillhq.com/openstack/kilo/000006.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>; <a href="http://www.stillhq.com/openstack/juno/000001.html">Juno Nova PTL Candidacy</a></i>
<a href="http://www.stillhq.com/openstack/kilo/000004.commentform.html">Comment</a>
http://www.stillhq.com/openstack/kilo/000004.html
<br/><br/>
<b>Compute Kilo specs are open</b> (Sun, 12 Oct 2014 16:39:00 PST)
<br/><br/>
From my email last week on the topic:
<pre>
I am pleased to announce that the specs process for nova in kilo is
now open. There are some tweaks to the previous process, so please
read this entire email before uploading your spec!
Blueprints approved in Juno
===========================
For specs approved in Juno, there is a fast track approval process for
Kilo. The steps to get your spec re-approved are:
- Copy your spec from the specs/juno/approved directory to the
specs/kilo/approved directory. Note that if we declared your spec to
be a "partial" implementation in Juno, it might be in the implemented
directory. This was rare however.
- Update the spec to match the new template
- Commit, with the "Previously-approved: juno" commit message tag
- Upload using git review as normal
Reviewers will still do a full review of the spec, we are not offering
a rubber stamp of previously approved specs. However, we are requiring
only one +2 to merge these previously approved specs, so the process
should be a lot faster.
A note for core reviewers here -- please include a short note on why
you're doing a single +2 approval on the spec so future generations
remember why.
Trivial blueprints
==================
We are not requiring specs for trivial blueprints in Kilo. Instead,
create a blueprint in Launchpad
at https://blueprints.launchpad.net/nova/+addspec and target the
specification to Kilo. New, targeted, unapproved specs will be
reviewed in weekly nova meetings. If it is agreed they are indeed
trivial in the meeting, they will be approved.
Other proposals
===============
For other proposals, the process is the same as Juno... Propose a spec
review against the specs/kilo/approved directory and we'll review it
from there.
</pre>
<br/><br/>
After a week I'm seeing something interesting. In Juno the specs process was new, and we saw a pause in the development cycle while people actually wrote down their designs before sending the code. This time around people know what to expect, and there are leftover specs from Juno lying around. We're therefore seeing specs approved much faster than in Juno. This should reduce the effect of the "pipeline flush" that we saw in Juno.
<br/><br/>
So far we have five approved specs after only a week.
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/kilo.html">kilo</a> <a href="http://www.stillhq.com/tags/blueprints.html">blueprints</a> <a href="http://www.stillhq.com/tags/spec.html">spec</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/kilo/000005.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/kilo/000007.html">How are we going with Nova Kilo specs after our review day?</a>; <a href="http://www.stillhq.com/openstack/kilo/000006.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>; <a href="http://www.stillhq.com/openstack/juno/000001.html">Juno Nova PTL Candidacy</a></i>
<a href="http://www.stillhq.com/openstack/kilo/000003.commentform.html">Comment</a>
http://www.stillhq.com/openstack/kilo/000003.html
<br/><br/>
<b>On layers</b> (Tue, 30 Sep 2014 18:57:00 PST)
<br/><br/>
There's been a lot of talk recently about what we should include in OpenStack and what is out of scope. This is interesting, in that many of us used to believe that we should do ''everything''. I think what's changed is that we're learning that solving all the problems in the world is hard, and that we need to re-focus on our core products. In this post I want to talk through the various "layers" proposals that have been made in the last month or so. Layers don't directly address what we should include in OpenStack or not, but they are a useful mechanism for breaking OpenStack up into chunks that are simpler to examine, and I think that makes them useful in their own right.
<br/><br/>
I would address what I believe the scope of the OpenStack project should be, but I feel that it makes this post so long that no one will ever actually read it. Instead, I'll cover that in a later post in this series. For now, let's explore what people are proposing as a layering model for OpenStack.
<br/><br/>
<b>What are layers?</b>
<br/><br/>
Dean Troyer did a good job of describing a layers model for the OpenStack project <a href="http://hackstack.org/x/blog/2013/09/05/openstack-seven-layer-dip-as-a-service/">on his blog</a> quite a while ago. He proposed the following layers (this is a summary, you should really read his post):
<br/><br/>
<ul>
<li>layer 0: operating system and Oslo
<li>layer 1: basic services -- Keystone, Glance, Nova
<li>layer 2: extended basics -- Neutron, Cinder, Swift, Ironic
<li>layer 3: optional services -- Horizon and Ceilometer
<li>layer 4: turtles all the way up -- Heat, Trove, Moniker / Designate, Marconi / Zaqar
</ul>
<br/><br/>
Dean notes that Neutron would move to layer 1 when nova-network goes away and Neutron becomes required for all compute deployments. Dean's post was also over a year ago, so it misses services like Barbican that have appeared since then. Services are only allowed to require services from lower numbered layers, but can use services from higher numbered layers as optional add-ins. So Nova for example can use Neutron, but cannot require it until it moves into layer 1. Similarly, there have been proposals to add Ceilometer as a dependency to schedule instances in Nova, and if we were to do that then we would need to move Ceilometer down to layer 1 as well. (I think doing that would be a mistake by the way, and have argued against it during at least two summits).
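<br/><br/>
Dean's rule -- hard-require only from below, optionally use from above -- can be made concrete with a small sketch. This is purely my own illustration (the service-to-layer mapping just encodes the summary above; none of this code comes from the projects themselves):
<br/><br/>

```python
# Dean's layer model, encoded as service -> layer number.
LAYERS = {
    "oslo": 0,
    "keystone": 1, "glance": 1, "nova": 1,
    "neutron": 2, "cinder": 2, "swift": 2, "ironic": 2,
    "horizon": 3, "ceilometer": 3,
    "heat": 4, "trove": 4, "designate": 4, "zaqar": 4,
}

def may_require(service: str, dependency: str) -> bool:
    """A service may only hard-require services from strictly lower layers."""
    return LAYERS[dependency] < LAYERS[service]

# Nova (layer 1) can use Neutron as an optional add-in, but cannot
# require it while Neutron sits in layer 2...
assert not may_require("nova", "neutron")
# ...whereas Heat (layer 4) is free to require Nova (layer 1).
assert may_require("heat", "nova")
```

Under this encoding the Ceilometer-as-a-scheduling-dependency proposal fails the same check, which is exactly why accepting it would force Ceilometer down to layer 1.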
<br/><br/>
Sean Dague re-ignited this discussion with his own blog post <a href="https://dague.net/2014/08/26/openstack-as-layers/">relatively recently</a>. Sean proposes new names for most of the layers, but the intent remains the same -- a compute-centric view of the services that are required to build a working OpenStack deployment. Sean and Dean's layer definitions are otherwise strongly aligned, and Sean notes that the probability of seeing something deployed at a given installation reduces as the layer count increases -- so for example Trove is way less commonly deployed than Nova, because the set of people who want a managed database as a service is smaller than the set of people who just want to be able to boot instances.
<br/><br/>
Now, I'm not sure I agree with the compute centric nature of the two layers proposals mentioned so far. I see people installing just Swift to solve a storage problem, and I think that's a completely valid use of OpenStack and should be supported as a first class citizen. On the other hand, resolving my concern with the layers model there is trivial -- we just move Swift to layer 1.
<br/><br/>
<b>What do layers give us?</b>
<br/><br/>
Sean makes a good point about the complexity of OpenStack installs and how we scare away new users. I agree completely -- we show people our architecture diagrams which are deliberately confusing, and then we wonder why they're not impressed. I think we do it because we're proud of the scope of the thing we've built, but I think our audiences walk away thinking that we don't really know what problem we're trying to solve. Do I really need to deploy Horizon to have working compute? No of course not, but our architecture diagrams don't make that obvious. I gave a talk along these lines at pyconau, and I think as a community we need to be better at explaining to people what we're trying to do, while remembering that not everyone is as excited about writing a whole heap of cloud infrastructure code as we are. This is also why the OpenStack miniconf at <a href="http://lca2015.linux.org.au/programme/miniconfs">linux.conf.au 2015</a> has pivoted from being a generic OpenStack chatfest to being something more solidly focussed on issues of interest to deployers -- we're just not great at talking to our users and we need to reboot the conversation at community conferences until it's something which meets their needs.
<br/><br/>
<div align=center>
<img src="/openstack/kilo/we_intend_this_diagram_to_amaze_and_confuse.jpg"><br/>
<i>We intend this diagram to amaze and confuse our victims</i>
</div>
<br/><br/>
Agreeing on a set of layers gives us a framework within which to describe OpenStack to our users. It lets us communicate the services we think are basic and always required, versus those which are icing on the cake. It also lets us explain the dependencies between projects better, and that helps deployers work out what order to deploy things in.
<br/><br/>
<b>Do layers help us work out what OpenStack should focus on?</b>
<br/><br/>
Sean's blog post then pivots and starts talking about the size of the OpenStack ecosystem -- or the "size of our tent" as he phrases it. While I agree that we need to shrink the number of projects we're working on at the moment, I feel that the blog post is missing a logical link between the previous layers discussion and the tent size conundrum. It feels to me that Sean wanted to propose that OpenStack focus on a specific set of layers, but didn't quite get there for whatever reason.
<br/><br/>
Next Monty Taylor had a go at furthering this conversation with his own <a href="http://inaugust.com/post/108">blog post on the topic</a>. Monty starts by making a very important point -- he, like everyone involved, wants the OpenStack community to be as inclusive as possible. I want lots of interesting people at the design summits, even if they don't work directly on projects that OpenStack ships. You can be a part of the OpenStack community without having our logo on your product.
<br/><br/>
A concrete example of including non-OpenStack projects in our wider community was visible at the Atlanta summit -- I know for a fact that there were software engineers at the summit who work on Google Compute Engine. I know this because I used to work with them at Google when I was a SRE there. I have no problem with people working on competing products being at our summits, as long as they are there to contribute meaningfully in the sessions, and not just take from us. It needs to be a two way street. Another concrete example is Ceph. I think Ceph is cool, and I'm completely fine with people using it as part of their OpenStack deploy. What upsets me is when people conflate Ceph with OpenStack. They are different. They're separate. And that is fine. Let's just not confuse people by saying Ceph is part of the OpenStack project -- it simply isn't because it doesn't fall under our governance model. Ceph is still a valued member of our community and more than welcome at our summits.
<br/><br/>
Do layers help us work out what to focus OpenStack on for now? I think they do. Should we simply say that we're only going to work on a single layer? Absolutely not. What we've tried to do up until now is have OpenStack be a single big thing, what we call "the integrated release". I think layers gives us a tool to find logical ways to break that thing up. Perhaps we need a smaller integrated release, but then continue with the other projects but on their own release cycles? Or perhaps they release at the same time, but we don't block the release of a layer 1 service on the basis of release critical bugs in a layer 4 service?
<br/><br/>
<b>Is there consensus on what sits in each layer?</b>
<br/><br/>
Looking at the posts I can find on this topic so far, I'd have to say the answer is no. We're close, but we're not aligned yet. For example, one proposal has a tweak to the previously proposed layer model that adds Cinder, Designate and Neutron down into layer 1 (basic services). The author argues that this is because stateless cloud isn't particularly useful to users of OpenStack. However, I think this is wrong to be honest. I can see that stateless cloud isn't super useful by itself, but that argument assumes OpenStack is the only piece of infrastructure that a given organization has. Perhaps that's true for the public cloud case, but the vast majority of OpenStack deployments at this point are private clouds. So, you're an existing IT organization and you're deploying OpenStack to increase the level of flexibility in compute resources. You don't need to deploy Cinder or Designate to do that. Let's take the storage case for a second -- our hypothetical IT organization probably already has some form of storage -- a SAN, or NFS appliances, or something like that. So stateful cloud is easy for them -- they just have their instances mount resources from those existing storage pools like they would on any other machine. Eventually they'll decide that hand managing that is horrible and move to Cinder, but that's probably later, once they've gotten through the initial baby step of deploying Nova, Glance and Keystone.
<br/><br/>
The first step to using layers to decide what we should focus on is to decide what is in each layer. I think the conversation needs to revolve around that for now, because if we drift off into whether sitting in a given layer means you're voted off the OpenStack island, then we'll never even come up with a set of agreed layers.
<br/><br/>
<b>Let's ignore tents for now</b>
<br/><br/>
The size of the OpenStack "tent" is the metaphor being used at the moment for working out what to include in OpenStack. As I say above, I think we need to reach agreement on what is in each layer before we can move on to that very important conversation.
<br/><br/>
<b>Conclusion</b>
<br/><br/>
Given the focus of this post is the layers model, I want to stop introducing new concepts here for now. Instead let me summarize where I stand so far -- I think the layers model is useful. I also think the layers should be an inverted pyramid -- layer 1 should be as small as possible for example. This is because of the dependency model that the layers model proposes -- it is important to keep the list of things that a layer 2 service must use as small and coherent as possible. Another reason to keep the lower layers as small as possible is because each layer represents the smallest possible increment of an OpenStack deployment that we think is reasonable. We believe it is currently reasonable to deploy Nova without Cinder or Neutron for example.
<br/><br/>
Most importantly of all, having those incremental stages of OpenStack deployment gives us a framework we have been missing in talking to our deployers and users. It makes OpenStack less confusing to outsiders, as it gives them bite sized morsels to consume one at a time.
<br/><br/>
So here are the layers as I see them for now:
<br/><br/>
<ul>
<li>layer 0: operating system, and Oslo
<li>layer 1: basic services -- Keystone, Glance, Nova, and Swift
<li>layer 2: extended basics -- Neutron, Cinder, and Ironic
<li>layer 3: optional services -- Horizon, and Ceilometer
<li>layer 4: application services -- Heat, Trove, Designate, and Zaqar
</ul>
<br/><br/>
I am not saying that everything inside a single layer is required to be deployed simultaneously, but I do think it's reasonable for Ceilometer to assume that Swift is installed and functioning. The big difference here between my view of layers and that of Dean, Sean and Monty is that I think that Swift is a layer 1 service -- it provides basic functionality that may be assumed to exist by services above it in the model.
<br/><br/>
I believe that when projects come to the Technical Committee requesting incubation or integration, they should specify what layer they see their project sitting at, and the justification for a lower layer number should be harder than that for a higher layer. So for example, we should be reasonably willing to accept proposals at layer 4, whilst we should be super concerned about the implications of adding another project at layer 1.
<br/><br/>
In the next post in this series I'll try to address the size of the OpenStack "tent", and what projects we should be focussing on.
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/kilo.html">kilo</a> <a href="http://www.stillhq.com/tags/technical.html">technical</a> <a href="http://www.stillhq.com/tags/committee.html">committee</a> <a href="http://www.stillhq.com/tags/tc.html">tc</a> <a href="http://www.stillhq.com/tags/layers.html">layers</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/kilo/000005.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/imagemagick/book/000008.html">Working on review comments for Chapters 2, 3 and 4 tonight</a>; <a href="http://www.stillhq.com/work/000011.html">What do you do when you care about a standard...</a>; <a href="http://www.stillhq.com/openstack/kilo/000003.html">Compute Kilo specs are open</a>; <a href="http://www.stillhq.com/openstack/kilo/000006.html">Specs for Kilo</a></i>
<a href="http://www.stillhq.com/openstack/kilo/000002.commentform.html">Comment</a>
http://www.stillhq.com/openstack/kilo/000002.html
<br/><br/>
<b>Blueprints implemented in Nova during Juno</b> (Tue, 30 Sep 2014 13:56:00 PST)
<br/><br/>
As we get closer to releasing the RC1 of Nova for Juno, I've started collecting a list of all the blueprints we implemented in Juno. This was mostly done because it helps me write the release notes, but I am posting it here because I am sure that others will find it handy too.
<br/><br/>
<b>Process</b>
<br/><br/>
<ul>
<li>Reserve 10 sql schema version numbers for back ports of Juno migrations to Icehouse. <a href="https://blueprints.launchpad.net/nova/+spec/backportable-db-migrations-juno">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/backportable-db-migrations-juno">specification</a>
</ul>
<br/><br/>
<b>Ongoing behind the scenes work</b>
<br/><br/>
<i>Object conversion</i>
<ul>
<li>Convert the compute manager to use nova objects. <a href="https://blueprints.launchpad.net/nova/+spec/compute-manager-objects-juno">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/compute-manager-objects-juno">specification</a>
<li>Convert EC2 API to use nova objects. <a href="https://blueprints.launchpad.net/nova/+spec/convert-ec2-api-to-use-nova-objects">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/convert_ec2_api_to_use_nova_objects">specification</a>
<li>Start converting hypervisor drivers to use objects. <a href="https://blueprints.launchpad.net/nova/+spec/virt-objects-juno">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/virt-objects-juno">specification</a>
</ul>
<br/><br/>
<i>Scheduler</i>
<ul>
<li>Support sub-classing objects. <a href="https://blueprints.launchpad.net/nova/+spec/object-subclassing">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/object-subclassing">specification</a>
<li>Stop using the scheduler run_instance method. Previously the scheduler would select a host, and then boot the instance. Instead, let the scheduler select hosts, but then return those so the caller boots the instance. This will make it easier to move the scheduler to being a generic service instead of being internal to nova. <a href="https://blueprints.launchpad.net/nova/+spec/remove-cast-to-schedule-run-instance">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/remove-cast-to-schedule-run-instance">specification</a>
<li>Refactor the nova scheduler into being a library. This will make splitting the scheduler out into its own service later easier. <a href="https://blueprints.launchpad.net/nova/+spec/scheduler-lib">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/scheduler-lib">specification</a>
<li>Move nova to using the v2 cinder API. <a href="https://blueprints.launchpad.net/nova/+spec/support-cinderclient-v2">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/support-cinderclient-v2">specification</a>
<li>Move prep_resize to conductor in preparation for splitting out the scheduler. <a href="https://blueprints.launchpad.net/nova/+spec/move-prep-resize-to-conductor">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/move-prep-resize-to-conductor">specification</a>
</ul>
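<br/><br/>
The run_instance change is easiest to picture as a small refactor: the scheduler only selects hosts and hands them back, and the caller performs the boot. The sketch below is a self-contained toy of that split; all names are illustrative, not the real Nova interfaces.
<br/><br/>

```python
# Toy sketch: the scheduler selects hosts and returns them; the caller
# (e.g. conductor) performs the boot. Illustrative names only.

def select_destinations(hosts, request_specs):
    """Pick the least-loaded host for each requested instance."""
    choices = []
    for _ in request_specs:
        host = min(hosts, key=lambda h: h["instances"])
        host["instances"] += 1  # track the claim while selecting
        choices.append(host["name"])
    return choices

def boot_instances(request_specs, scheduler_select):
    # The caller, not the scheduler, performs the boot. This keeps the
    # scheduler free of compute-specific logic, which is what makes a
    # standalone scheduler service possible later.
    placements = scheduler_select(request_specs)
    return [f"boot {spec} on {host}"
            for spec, host in zip(request_specs, placements)]

hosts = [{"name": "node1", "instances": 3}, {"name": "node2", "instances": 1}]
actions = boot_instances(["vm-a", "vm-b"],
                         lambda specs: select_destinations(hosts, specs))
print(actions)  # ['boot vm-a on node2', 'boot vm-b on node2']
```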
<br/><br/>
<i>API</i>
<ul>
<li>Use JSON schema to strongly validate v3 API request bodies. Please note this work will later be released as v2.1 of the Nova API. <a href="https://blueprints.launchpad.net/nova/+spec/v3-api-schema">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/v3-api-schema">specification</a>
<li>Provide a standard format for the output of the VM diagnostics call. This work will be exposed by a later version of the v2.1 API. <a href="https://blueprints.launchpad.net/nova/+spec/v3-diagnostics">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/v3-diagnostics">specification</a>
<li>Move to the OpenStack standard name for the request id header, in a backward compatible manner. <a href="https://blueprints.launchpad.net/nova/+spec/cross-service-request-id">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/cross-service-request-id">specification</a>
<li>Implement the v2.1 API on the V3 API code base. This work is not yet complete. <a href="https://blueprints.launchpad.net/nova/+spec/v2-on-v3-api">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/v2-on-v3-api">specification</a>
</ul>
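<br/><br/>
The JSON schema validation work above is what gives the API strong "input validation": malformed request bodies are rejected with an explanation rather than silently ignored. A toy, self-contained sketch of the idea follows; Nova itself uses the jsonschema library, so the schema format and helper here are simplified stand-ins.
<br/><br/>

```python
# Toy illustration of strong request-body validation. Nova uses JSON
# Schema for this; the schema shape and helper below are simplified
# stand-ins for illustration only.

SERVER_CREATE_SCHEMA = {
    "name": str,       # required string
    "flavorRef": str,  # required string
    "imageRef": str,   # required string
}

def validate_body(body, schema):
    """Reject missing fields, wrong types and unknown keys."""
    errors = []
    for key, expected in schema.items():
        if key not in body:
            errors.append(f"missing required field: {key}")
        elif not isinstance(body[key], expected):
            errors.append(f"{key} must be {expected.__name__}")
    for key in body:
        if key not in schema:
            errors.append(f"unexpected field: {key}")
    return errors

good = {"name": "vm1", "flavorRef": "m1.small", "imageRef": "cirros"}
bad = {"name": 42, "flavourRef": "m1.small"}  # wrong type, misspelled key
print(validate_body(good, SERVER_CREATE_SCHEMA))  # []
errs = validate_body(bad, SERVER_CREATE_SCHEMA)
print(errs)  # four errors instead of silent acceptance
```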
<br/><br/>
<i>Other</i>
<ul>
<li>Refactor the internal nova API to make the nova-network and neutron implementations more consistent. <a href="https://blueprints.launchpad.net/nova/+spec/refactor-network-api">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/refactor-network-api">specification</a>
</ul>
<br/><br/>
<b>General features</b>
<br/><br/>
<i>Instance features</i>
<ul>
<li>Allow users to specify an image to use for rescue instead of the original base image. <a href="https://blueprints.launchpad.net/nova/+spec/allow-image-to-be-specified-during-rescue">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/allow-image-to-be-specified-during-rescue">specification</a>
<li>Allow images to specify if a config drive should be used. <a href="https://blueprints.launchpad.net/nova/+spec/config-drive-image-property">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/config-drive-image-property">specification</a>
<li>Give users and administrators the ability to control the vCPU topology exposed to guests via flavors. <a href="https://blueprints.launchpad.net/nova/+spec/virt-driver-vcpu-topology">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/virt-driver-vcpu-topology">specification</a>
<li>Attach All Local Disks During Rescue. <a href="https://blueprints.launchpad.net/nova/+spec/rescue-attach-all-disks">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/rescue-attach-all-disks">specification</a>
</ul>
<br/><br/>
<i>Networking</i>
<ul>
<li>Improve the nova-network code to allow per-network settings. <a href="https://blueprints.launchpad.net/nova/+spec/better-support-for-multiple-networks">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/better-support-for-multiple-networks">specification</a>
<li>Allow deployers to add hooks which are informed as soon as networking information for an instance is changed. <a href="https://blueprints.launchpad.net/nova/+spec/instance-network-info-hook">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/instance-network-info-hook">specification</a>
<li>Enable nova instances to be booted up with SR-IOV neutron ports. <a href="https://blueprints.launchpad.net/nova/+spec/pci-passthrough-sriov">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/pci-passthrough-sriov">specification</a>
<li>Permit VMs to attach multiple interfaces to one network. <a href="https://blueprints.launchpad.net/nova/+spec/multiple-if-1-net">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/nfv-multiple-if-1-net">specification</a>
</ul>
<br/><br/>
<i>Scheduling</i>
<ul>
<li>Extensible Resource Tracking. The set of resources tracked by nova is hard coded; this change makes it extensible, which will allow plug-ins to track new types of resources for scheduling. <a href="https://blueprints.launchpad.net/nova/+spec/extensible-resource-tracking">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/extensible-resource-tracking">specification</a>
<li>Allow a host to be evacuated, but with the scheduler selecting destination hosts for the instances moved. <a href="https://blueprints.launchpad.net/nova/+spec/find-host-and-evacuate-instance">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/find-host-and-evacuate-instance">specification</a>
<li>Add support for host aggregates to scheduler filters. launchpad: <a href="https://blueprints.launchpad.net/nova/+spec/per-aggregate-disk-allocation-ratio">disk</a>; <a href="https://blueprints.launchpad.net/nova/+spec/per-aggregate-max-instances-per-host">instances</a>; and <a href="https://blueprints.launchpad.net/nova/+spec/per-aggregate-max-io-ops-per-host">IO ops</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/per-aggregate-filters">specification</a>
</ul>
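<br/><br/>
The per-aggregate filter idea above is simple to sketch: a host aggregate may override a global allocation ratio, so overcommit policy can vary between groups of hosts. The toy below illustrates that for the disk case; the data shapes are made up and this is not the real nova.scheduler.filters code.
<br/><br/>

```python
# Illustrative sketch of a per-aggregate disk allocation ratio. The
# host dictionaries stand in for the scheduler's host state objects.

DEFAULT_DISK_RATIO = 1.0

def disk_filter_passes(host, requested_gb):
    # An aggregate may override the global ratio, e.g. to allow
    # overcommit on aggregates backed by thin-provisioned storage.
    ratio = host.get("aggregate_disk_ratio", DEFAULT_DISK_RATIO)
    usable = host["total_disk_gb"] * ratio - host["used_disk_gb"]
    return requested_gb <= usable

host_a = {"total_disk_gb": 100, "used_disk_gb": 90}
host_b = {"total_disk_gb": 100, "used_disk_gb": 90,
          "aggregate_disk_ratio": 1.5}
print(disk_filter_passes(host_a, 20))  # False: only 10GB usable
print(disk_filter_passes(host_b, 20))  # True: 150 - 90 = 60GB usable
```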
<br/><br/>
<i>Other</i>
<ul>
<li>i18n enablement for Nova: turn on lazy translation support from Oslo i18n, and update Nova to adhere to the restrictions this adds to translatable strings. <a href="https://blueprints.launchpad.net/nova/+spec/i18n-enablement">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/i18n-enablement">specification</a>
<li>Offload periodic task sql query load to a slave sql server if one is configured. <a href="https://blueprints.launchpad.net/nova/+spec/juno-slaveification">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/juno-slaveification">specification</a>
<li>Only update the status of a host in the sql database when the status changes, instead of every 60 seconds. <a href="https://blueprints.launchpad.net/nova/+spec/on-demand-compute-update">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/on-demand-compute-update">specification</a>
<li>Include status information in API listings of hypervisor hosts. <a href="https://blueprints.launchpad.net/nova/+spec/return-status-for-hypervisor-node">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/return-status-for-hypervisor-node">specification</a>
<li>Allow API callers to specify more than one status to filter by when listing services. <a href="https://blueprints.launchpad.net/nova/+spec/servers-list-support-multi-status">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/servers-list-support-multi-status">specification</a>
<li>Add quota values to constrain the number and size of server groups a user can create. <a href="https://blueprints.launchpad.net/nova/+spec/server-group-quotas">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/server-group-quotas">specification</a>
</ul>
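<br/><br/>
The on-demand compute update above is easy to picture: skip the database write whenever the reported status has not changed, rather than writing on every periodic tick. A toy sketch, with illustrative names:
<br/><br/>

```python
# Toy sketch of on-demand updates: only write when the status changes.

class ServiceRecord:
    def __init__(self):
        self.status = None
        self.writes = 0  # stands in for SQL UPDATE statements issued

    def report(self, status):
        if status != self.status:  # skip the write when nothing changed
            self.status = status
            self.writes += 1

svc = ServiceRecord()
for tick in ["up", "up", "up", "down", "up"]:
    svc.report(tick)
print(svc.writes)  # 3 writes instead of 5
```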
<br/><br/>
<b>Hypervisor driver specific</b>
<br/><br/>
<i>Hyper-V</i>
<ul>
<li>Support for differencing vhdx images. <a href="https://blueprints.launchpad.net/nova/+spec/add-differencing-vhdx-resize-support">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/add-differencing-vhdx-resize-support">specification</a>
<li>Support for console serial logs. <a href="https://blueprints.launchpad.net/nova/+spec/hyper-v-console-log">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/hyper-v-console-log">specification</a>
<li>Support soft reboot. <a href="https://blueprints.launchpad.net/nova/+spec/hyper-v-soft-reboot">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/hyper-v-soft-reboot">specification</a>
</ul>
<br/><br/>
<i>Ironic</i>
<ul>
<li>Add a virt driver for Ironic. <a href="https://blueprints.launchpad.net/nova/+spec/add-ironic-driver">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/add-ironic-driver">specification</a>
</ul>
<br/><br/>
<i>libvirt</i>
<ul>
<li>Performance improvements to listing instances on modern libvirts. <a href="https://blueprints.launchpad.net/nova/+spec/libvirt-domain-listing-speedup">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/libvirt-domain-listing-speedup">specification</a>
<li>Allow snapshots of network backed disks. <a href="https://blueprints.launchpad.net/nova/+spec/libvirt-volume-snap-network-disk">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/libvirt-volume-snap-network-disk">specification</a>
<li>Enable qemu memory balloon statistics for ceilometer reporting. <a href="https://blueprints.launchpad.net/nova/+spec/enabled-qemu-memballoon-stats">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/enabled-qemu-memballoon-stats">specification</a>
<li>Add support for handing back unused disk blocks to the underlying storage system. <a href="https://blueprints.launchpad.net/nova/+spec/libvirt-disk-discard-option">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/libvirt-disk-discard-option">specification</a>
<li>Meta-data about an instance is now recorded in the libvirt domain XML. This is intended to help administrators while debugging problems. <a href="https://blueprints.launchpad.net/nova/+spec/libvirt-driver-domain-metadata">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/libvirt-driver-domain-metadata">specification</a>
<li>Support user namespaces for LXC containers. <a href="https://blueprints.launchpad.net/nova/+spec/libvirt-lxc-user-namespaces">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/libvirt-lxc-user-namespaces">specification</a>
<li>Copy-on-write cloning for RBD-backed disks. <a href="https://blueprints.launchpad.net/nova/+spec/rbd-clone-image-handler">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/rbd-clone-image-handler">specification</a>
<li>Expose interactive serial consoles. <a href="https://blueprints.launchpad.net/nova/+spec/serial-ports">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/serial-ports">specification</a>
<li>Allow controlled shutdown of guest operating systems during VM power off. <a href="https://blueprints.launchpad.net/nova/+spec/user-defined-shutdown">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/user-defined-shutdown">specification</a>
<li>Intelligent NUMA node placement for guests. <a href="https://blueprints.launchpad.net/nova/+spec/virt-driver-numa-placement">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/virt-driver-numa-placement">specification</a>
</ul>
<br/><br/>
<i>vmware</i>
<ul>
<li>Move the vmware driver to using the oslo vmware helper library. <a href="https://blueprints.launchpad.net/nova/+spec/use-oslo-vmware">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/use-oslo-vmware">specification</a>
<li>Add support for network interface hot plugging to vmware. <a href="https://blueprints.launchpad.net/nova/+spec/vmware-hot-plug">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/vmware-hot-plug">specification</a>
<li>Refactor the vmware driver's spawn functionality to be more maintainable. This work was internal, but is mentioned here because it significantly improves the supportability of the VMware driver. <a href="https://blueprints.launchpad.net/nova/+spec/vmware-spawn-refactor">launchpad</a> <a href="http://specs.openstack.org/openstack/nova-specs/specs/juno/vmware-spawn-refactor">specification</a>
</ul>
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/juno.html">juno</a> <a href="http://www.stillhq.com/tags/blueprints.html">blueprints</a> <a href="http://www.stillhq.com/tags/implemented.html">implemented</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>; <a href="http://www.stillhq.com/openstack/juno/000001.html">Juno Nova PTL Candidacy</a>; <a href="http://www.stillhq.com/openstack/juno/000012.html">Juno nova mid-cycle meetup summary: scheduler</a>; <a href="http://www.stillhq.com/openstack/juno/000007.html">Juno nova mid-cycle meetup summary: ironic</a>; <a href="http://www.stillhq.com/openstack/juno/000008.html">Review priorities as we approach juno-3</a></i>
<a href="http://www.stillhq.com/openstack/juno/000018.commentform.html">Comment</a>
http://www.stillhq.com/openstack/juno/000018.html
http://www.stillhq.com/openstack/juno/000018.htmlChronological list of Juno Nova mid-cycle meetup posts/openstack/junoMon, 29 Sep 2014 23:10:00 PSTThis is just a quick list of the posts I wrote summarizing the Juno Nova mid-cycle meetup, in the right order to read them, because I didn't actually have that online before. I needed to send this list to someone, so I figured it's easier to just post it here.
<br/><br/>
<ul>
<li><a href="/openstack/juno/000005.html">Juno nova mid-cycle meetup summary: social issues</a>
<li><a href="/openstack/juno/000006.html">Juno nova mid-cycle meetup summary: containers</a>
<li><a href="/openstack/juno/000007.html">Juno nova mid-cycle meetup summary: ironic</a>
<li><a href="/openstack/juno/000009.html">Juno nova mid-cycle meetup summary: DB2 support</a>
<li><a href="/openstack/juno/000010.html">Juno nova mid-cycle meetup summary: cells</a>
<li><a href="/openstack/juno/000012.html">Juno nova mid-cycle meetup summary: scheduler</a>
<li><a href="/openstack/juno/000013.html">Juno nova mid-cycle meetup summary: slots</a>
<li><a href="/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>
<li><a href="/openstack/juno/000015.html">Juno nova mid-cycle meetup summary: the next generation Nova API</a>
<li><a href="/openstack/juno/000016.html">Juno nova mid-cycle meetup summary: conclusion</a>
</ul>
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/juno.html">juno</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a> <a href="http://www.stillhq.com/tags/mid-cycle.html">mid-cycle</a> <a href="http://www.stillhq.com/tags/summary.html">summary</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>; <a href="http://www.stillhq.com/openstack/juno/000012.html">Juno nova mid-cycle meetup summary: scheduler</a>; <a href="http://www.stillhq.com/openstack/juno/000007.html">Juno nova mid-cycle meetup summary: ironic</a>; <a href="http://www.stillhq.com/openstack/juno/000016.html">Juno nova mid-cycle meetup summary: conclusion</a>; <a href="http://www.stillhq.com/openstack/juno/000009.html">Juno nova mid-cycle meetup summary: DB2 support</a>; <a href="http://www.stillhq.com/openstack/juno/000005.html">Juno nova mid-cycle meetup summary: social issues</a></i>
<a href="http://www.stillhq.com/openstack/juno/000017.commentform.html">Comment</a>
http://www.stillhq.com/openstack/juno/000017.html
http://www.stillhq.com/openstack/juno/000017.htmlMy candidacy for Kilo Compute PTL/openstack/kiloMon, 29 Sep 2014 18:34:00 PSTThis is mostly historical at this point, but I forgot to post it here when I emailed it a week or so ago. So, for future reference:
<br/><br/>
<pre>
I'd like another term as Compute PTL, if you'll have me.
We live in interesting times. openstack has clearly gained a large
amount of mind share in the open cloud marketplace, with Nova being a
very commonly deployed component. Yet, we don't have a fantastic
container solution, which is our biggest feature gap at this point.
Worse -- we have a code base with a huge number of bugs filed against
it, an unreliable gate because of subtle bugs in our code and
interactions with other openstack code, and have a continued need to
add features to stay relevant. These are hard problems to solve.
Interestingly, I think the solution to these problems calls for a
social approach, much like I argued for in my Juno PTL candidacy
email. The problems we face aren't purely technical -- we need to work
out how to pay down our technical debt without blocking all new
features. We also need to ask for understanding and patience from
those feature authors as we try and improve the foundation they are
building on.
The specifications process we used in Juno helped with these problems,
but one of the things we've learned from the experiment is that we
don't require specifications for all changes. Let's take an approach
where trivial changes (no API changes, only one review to implement)
don't require a specification. There will of course sometimes be
variations on that rule if we discover something, but it means that
many micro-features will be unblocked.
In terms of technical debt, I don't personally believe that pulling
all hypervisor drivers out of Nova fixes the problems we face, it just
moves the technical debt to a different repository. However, we
clearly need to discuss the way forward at the summit, and come up
with some sort of plan. If we do something like this, then I am not
sure that the hypervisor driver interface is the right place to do
that work -- I'd rather see something closer to the hypervisor itself
so that the Nova business logic stays with Nova.
Kilo is also the release where we need to get the v2.1 API work done
now that we finally have a shared vision for how to progress. It took
us a long time to get to a good shared vision there, so we need to
ensure that we see that work through to the end.
We live in interesting times, but they're also exciting as well.
</pre>
<br/><br/>
I have since been elected unopposed, so thanks for that!
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/kilo.html">kilo</a> <a href="http://www.stillhq.com/tags/compute.html">compute</a> <a href="http://www.stillhq.com/tags/ptl.html">ptl</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/kilo/000004.html">One week of Nova Kilo specifications</a>; <a href="http://www.stillhq.com/openstack/kilo/000005.html">Specs for Kilo</a>; <a href="http://www.stillhq.com/openstack/juno/000001.html">Juno Nova PTL Candidacy</a>; <a href="http://www.stillhq.com/openstack/juno/000008.html">Review priorities as we approach juno-3</a>; <a href="http://www.stillhq.com/openstack/juno/000003.html">Thoughts from the PTL</a>; <a href="http://www.stillhq.com/openstack/kilo/000003.html">Compute Kilo specs are open</a></i>
<a href="http://www.stillhq.com/openstack/kilo/000001.commentform.html">Comment</a>
http://www.stillhq.com/openstack/kilo/000001.html
http://www.stillhq.com/openstack/kilo/000001.htmlJuno nova mid-cycle meetup summary: conclusion/openstack/junoThu, 21 Aug 2014 23:47:00 PSTThere's been a lot of content in this series about the Juno Nova mid-cycle meetup, so thanks to those who followed along with me. I've also received a lot of positive feedback about the posts, so I am thinking the exercise is worthwhile, and will try to be more organized for the next mid-cycle (and therefore get these posts out earlier). To recap quickly, here's what was covered in the series:
<br/><br/>
The <a href="http://www.stillhq.com/openstack/juno/000005.html">first post in the series</a> covered social issues: things like how we organized the mid-cycle meetup, how we should address core reviewer burnout, and the current state of play of the Juno release. Bug management has been an ongoing issue for Nova for a while, so we talked about <a href="http://www.stillhq.com/openstack/juno/000011.html">bug management</a>. We are making progress on this issue, but more needs to be done, and it's going to take a lot of help from everyone to get there. There was also discussion about <a href="http://www.stillhq.com/openstack/juno/000013.html">proposals on how to handle review workload in the Kilo release</a>, although nothing has been finalized yet.
<br/><br/>
The <a href="http://www.stillhq.com/openstack/juno/000006.html">second post</a> covered the current state of play for containers in Nova, as well as our future direction. Unexpectedly, this was by far the most read post in the series if Google Analytics is to be believed. There is clear interest in support for containers in Nova. I expect this to be a hot topic at the Paris summit as well. Another new feature we're working on is <a href="http://www.stillhq.com/openstack/juno/000007.html">the Ironic driver merge into Nova</a>. This is progressing well, and we hope to have it fully merged by the end of the Juno release cycle.
<br/><br/>
At a superficial level the post about <a href="http://www.stillhq.com/openstack/juno/000009.html">DB2 support in Nova</a> is a simple tale of IBM's desire to have people use their database. However, to the skilled observer it's deeper than that -- it's a tale of love and loss, as well as a discussion of how to safely move our schema forward without causing undue pain for our large deployments. We also covered the state of <a href="http://www.stillhq.com/openstack/juno/000010.html">cells support in Nova</a>, with the main issue being that we really need cells to be feature complete. Hopefully people are working on a plan for this now. Another internal refactoring is the current <a href="http://www.stillhq.com/openstack/juno/000012.html">scheduler work</a>, which is important because it positions us for the future.
<br/><br/>
We also discussed the <a href="http://www.stillhq.com/openstack/juno/000015.html">next gen Nova API</a>, and talked through the <a href="http://www.stillhq.com/openstack/juno/000014.html">proposed upgrade path for the transition from nova-network to neutron</a>.
<br/><br/>
For those who are curious, there are 8,259 words (not that I am counting or anything) in this post series including this summary post. I estimate it took me about four working days to write <i>(ED: and about two days for his trained team of technical writers to edit into mostly coherent English)</i>. I would love to get your feedback on whether you found the series useful, as it's a pretty big investment in time.
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/juno.html">juno</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a> <a href="http://www.stillhq.com/tags/mid-cycle.html">mid-cycle</a> <a href="http://www.stillhq.com/tags/summary.html">summary</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>; <a href="http://www.stillhq.com/openstack/juno/000012.html">Juno nova mid-cycle meetup summary: scheduler</a>; <a href="http://www.stillhq.com/openstack/juno/000007.html">Juno nova mid-cycle meetup summary: ironic</a>; <a href="http://www.stillhq.com/openstack/juno/000009.html">Juno nova mid-cycle meetup summary: DB2 support</a>; <a href="http://www.stillhq.com/openstack/juno/000005.html">Juno nova mid-cycle meetup summary: social issues</a>; <a href="http://www.stillhq.com/openstack/juno/000013.html">Juno nova mid-cycle meetup summary: slots</a></i>
<a href="http://www.stillhq.com/openstack/juno/000016.commentform.html">Comment</a>
http://www.stillhq.com/openstack/juno/000016.html
http://www.stillhq.com/openstack/juno/000016.htmlJuno nova mid-cycle meetup summary: the next generation Nova API/openstack/junoThu, 21 Aug 2014 16:52:00 PSTThis is the final post in my series covering the highlights from the Juno Nova mid-cycle meetup. In this post I will cover our next generation API, which used to be called the v3 API but is largely now referred to as the v2.1 API. Getting to this point has been one of the more painful processes I think I've ever seen in Nova's development history, and I think we've learnt some important things about how large distributed projects operate along the way. My hope is that we remember these lessons next time we hit something as contentious as our API re-write has been.
<br/><br/>
Now on to the API itself. It started out as an attempt to improve our current API to be more maintainable and less confusing to our users. We deliberately decided that we would not focus on adding features, but would instead attempt to reduce as much technical debt as possible. This development effort went on for about a year before we realized we'd made a mistake: we had assumed that our users would agree it was trivial to move to a new API, and that they'd do so even if there weren't compelling new features. That turned out to be entirely incorrect.
<br/><br/>
I want to make it clear that this wasn't a mistake on the part of the v3 API team. They implemented what the technical leadership of Nova at the time asked for, and were very surprised when we discovered our mistake. We've now spent over a release cycle trying to recover from that mistake as gracefully as possible, but the upside is that the API we will be delivering is significantly more future proof than what we have in the current v2 API.
<br/><br/>
At the Atlanta Juno summit, it was agreed that the v3 API would never ship in its current form, and that we would instead provide a v2.1 API. This API will be 99% compatible with the current v2 API, with the incompatibilities limited to what we call 'input validation': if you pass a malformed parameter to the API, we will now tell you instead of silently ignoring it. The other thing we are going to add in the v2.1 API is a system of 'micro-versions', which allow a client to specify what version of the API it understands, and the server to gracefully degrade to older versions if required.
<br/><br/>
This micro-version system is important, because the next step is to start adding the v3 cleanups and fixes into the v2.1 API as a series of micro-versions. That way we can drag the majority of our users with us into a better future, without abandoning users of older API versions. I should note at this point that the mechanics for deciding the minimum micro-version a given release of Nova will support are largely undefined at the moment. My instinct is that we will tie it to stable release versions in some way; if your client dates back to a release of Nova that we no longer support, then we might expect you to upgrade. However, that hasn't been debated yet, so don't take my thoughts on that as rigid truth.
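<br/><br/>
To make the negotiation concrete, here is a toy sketch of the graceful degradation described above. Since the real mechanics were still undefined at the time of writing, the version rules and names below are purely illustrative.
<br/><br/>

```python
# Toy sketch of micro-version negotiation: the client states the
# highest version it understands and the server degrades gracefully.
# The bounds and rules here are illustrative, not the real scheme.

SERVER_MIN = (2, 1)
SERVER_MAX = (2, 4)

def negotiate(requested):
    """Return the version the server will speak, or None if unsupported."""
    want = tuple(int(part) for part in requested.split("."))
    if want < SERVER_MIN:
        return None  # client is older than anything still supported
    return min(want, SERVER_MAX)  # degrade to the server's newest version

print(negotiate("2.3"))  # (2, 3): exact match within range
print(negotiate("2.9"))  # (2, 4): server degrades to its maximum
print(negotiate("2.0"))  # None: below the supported minimum
```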
<br/><br/>
Frustratingly, the intent of the v2.1 API has been agreed and unchanged since the Atlanta summit, yet we're late in the Juno release and most of the work isn't done yet. This is because we got bogged down in the mechanics of how micro-versions will work, and how the translation for older API versions will work inside the Nova code later on. We finally unblocked this at the mid-cycle meetup, which means this work can finally progress again.
<br/><br/>
The main concern that we needed to resolve at the mid-cycle was the belief that if the v2.1 API was implemented as a series of translations on top of the v3 code, then the translation layer would be quite thick and complicated. This raises issues of maintainability, as well as the amount of code we need to understand. The API team has now agreed to produce an API implementation that is just the v2.1 functionality, and will then layer things on top of that. This is actually invisible to users of the API, but it leaves us with an implementation where changes after v2.1 are additive, which should be easier to maintain.
<br/><br/>
One of the other changes in the original v3 code is that we stopped proxying functionality for Neutron, Cinder and Glance. With the decision to implement a v2.1 API instead, we will need to rebuild that proxying implementation. To unblock v2.1, and based on advice from the HP and Rackspace public cloud teams, we have decided to delay implementing these proxies. So, the first version of the v2.1 API we ship will not have proxies, but later versions will add them in. The current v2 API implementation will not be removed until all the proxies have been added to v2.1. This is prompted by the belief that many advanced API users don't use the Nova API proxies, and therefore could move to v2.1 without them being implemented.
<br/><br/>
Finally, I want to thank the Nova API team, especially Chris Yeoh and Kenichi Oomichi for their patience with us while we have worked through these complicated issues. It's much appreciated, and I find them a consistent pleasure to work with.
<br/><br/>
That brings us to the end of my summary of the Nova Juno mid-cycle meetup. I'll write up a quick summary post that ties all of the posts together, but apart from that this series is now finished. Thanks for following along.
<br/><br/><i>Tags for this post: <a href="http://www.stillhq.com/tags/openstack.html">openstack</a> <a href="http://www.stillhq.com/tags/juno.html">juno</a> <a href="http://www.stillhq.com/tags/nova.html">nova</a> <a href="http://www.stillhq.com/tags/mid-cycle.html">mid-cycle</a> <a href="http://www.stillhq.com/tags/summary.html">summary</a> <a href="http://www.stillhq.com/tags/api.html">api</a> <a href="http://www.stillhq.com/tags/v3.html">v3</a> <a href="http://www.stillhq.com/tags/v2.1.html">v2.1</a></i><br/><i>Related posts: <a href="http://www.stillhq.com/openstack/juno/000014.html">Juno nova mid-cycle meetup summary: nova-network to Neutron migration</a>; <a href="http://www.stillhq.com/openstack/juno/000012.html">Juno nova mid-cycle meetup summary: scheduler</a>; <a href="http://www.stillhq.com/openstack/juno/000007.html">Juno nova mid-cycle meetup summary: ironic</a>; <a href="http://www.stillhq.com/openstack/juno/000016.html">Juno nova mid-cycle meetup summary: conclusion</a>; <a href="http://www.stillhq.com/openstack/juno/000009.html">Juno nova mid-cycle meetup summary: DB2 support</a>; <a href="http://www.stillhq.com/openstack/juno/000005.html">Juno nova mid-cycle meetup summary: social issues</a></i>
<a href="http://www.stillhq.com/openstack/juno/000015.commentform.html">Comment</a>
http://www.stillhq.com/openstack/juno/000015.html
http://www.stillhq.com/openstack/juno/000015.html