Ayende @ Rahien
https://ayende.com/blog/
Copyright (C) Ayende Rahien 2004 - 2018

Rejection, dejection and resurrection, oh my!

<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Rejection-dejection-and_9712/image_2.png"><img width="321" height="335" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Rejection-dejection-and_9712/image_thumb.png" border="0"></a>Regardless of how good your software is, there is always a point where we can put more load on the system than it is capable of handling.</p><p>One such case is when you are firing about a hundred requests per second, regardless of whether the previous requests have completed, while at the same time throttling the I/O so the requests can’t complete fast enough.</p><p>What happens then is known as a convoy. Requests start piling up; as more and more work is waiting to be done, we fall further and further behind. The typical way this ends is when you run out of resources completely. If you are using thread per request, you end up with all your threads blocked on some lock. If you are using async operations, you start consuming more and more memory as you hold the async state of the request until it is completed. </p><p>We put a lot of pressure on the system, and we want to know that it responds well. And the way to do that is to recognize that there is a convoy in progress and handle it. But how can you do that?</p><p>The problem is that you are currently in the middle of processing a set of operations in a transaction. We can obviously abort it and roll back everything, but the complication is that we are now in the <em>second</em> stage. 
We have a transaction that we wrote <em>to the disk</em>, and we are waiting for the disk to come back and confirm that the write is successful, while already speculatively executing the current transaction. And we can’t abort the transaction that we are currently writing to disk, because there is no way to know at what stage the write is. </p><p>So we now need to decide what to do, and we chose the following set of behaviors. When running a speculative transaction (a transaction that is run while the previous transaction is being committed to disk), we observe the amount of memory that is used by this transaction. If the amount of memory being used is too high, we stop processing incoming operations and wait for the previous transaction to come back from the disk.</p><p>At the same time, we might <em>still</em> be getting new operations to complete, but we can’t process them. At this point, after we have waited long enough to be worried, we start proactively rejecting requests, telling the client immediately that we are in a timeout situation and that they should fail over to another node.</p><p>The key problem is that I/O is, by its nature, highly unpredictable, and may be impacted by many things. On the cloud, you might hit your IOPS limits and see a <em>drastic </em>drop in performance all of a sudden. We considered a lot of ways to actually manage it ourselves, by limiting what kind of I/O operations we’ll send at any given time, queuing and optimizing things, but we can only control the things that we do. So we decided to just measure what is going on and react accordingly. 
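The admission logic described above can be sketched in a few lines. This is a minimal Python illustration, not RavenDB’s actual code; the threshold, the timeout, and all the names here are hypothetical:

```python
import threading

MAX_SPECULATIVE_BYTES = 256 * 1024 * 1024  # hypothetical memory threshold
MAX_WAIT_SECONDS = 5.0                     # hypothetical "worrying" timeout

class RejectedError(Exception):
    """Tells the client to fail over to another node immediately."""

def admit_operation(txn_memory_used, prev_commit_done: threading.Event,
                    max_wait=MAX_WAIT_SECONDS):
    """Decide whether an incoming operation may join the speculative txn."""
    if txn_memory_used <= MAX_SPECULATIVE_BYTES:
        return "process"
    # The speculative transaction grew too large: stop accepting work
    # and wait for the previous transaction's disk write to complete.
    if prev_commit_done.wait(timeout=max_wait):
        return "process"
    # We waited long enough to be worried: proactively reject.
    raise RejectedError("I/O is lagging, fail over to another node")
```

The point is the three distinct outcomes: process normally, stall until the previous commit lands, or reject early so the client can fail over instead of piling onto the convoy.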
</p><p>Beyond being proactive about incoming requests, we are also making sure that we’ll surface these kinds of details to the user:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Rejection-dejection-and_9712/image_4.png"><img width="298" height="219" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Rejection-dejection-and_9712/image_thumb_1.png" border="0"></a></p><p>Knowing that the I/O system may be giving us this kind of response can be invaluable when you are trying to figure out what is going on. And we made sure that this is very clearly displayed to the admin.</p>
https://ayende.com/blog/181793-C/rejection-dejection-and-resurrection-oh-my?Key=cdc63d7b-22ce-4ad3-b0b3-cc52d878fe8c
Fri, 16 Feb 2018 10:00:00 GMT

Memory management as the operating system sees it

<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Memory-management-as-the-operating-syste_D121/image_2.png"><img width="344" height="211" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Memory-management-as-the-operating-syste_D121/image_thumb.png" border="0"></a>About 15 years ago I got a few Operating Systems books and started reading them cover to cover. They were quite interesting to someone who was just starting to learn that there <em>is</em> something under the covers. 
I remember thinking that this was a pretty complex topic and that the operating system had to do a lot of work to make everything appear seamless.</p><p>The discussion of memory was especially enlightening, since the details of what goes on behind the scenes of the flat memory model we usually take for granted are fascinating. In this post, I’m going to lay out a few terms and try to explain how the operating system sees things and how that impacts your application.</p><p>The operating system needs to manage the RAM, and typically there is also some swap space available to spill things to. There are also memory mapped files, which come with their own backing store, but I’m jumping ahead a bit.</p><p><strong>Physical memory</strong> – The amount of RAM on the device. This is probably the simplest to grasp here.</p><p><strong>Virtual memory</strong> – The amount of virtual memory <em>each</em> <em>process</em> can access. This differs between processes and is quite different from how much memory is actually in use.</p><ul><li>Reserved virtual memory – a section of virtual memory that was reserved by the process. The only thing that the operating system needs to do is not allocate anything within this range of memory. It comes with no other costs. Trying to access this memory without first committing it will cause a fault.</li><li>Committed virtual memory – a section of virtual memory that the process has told the operating system that it intends to use. The operating system commits to <em>having</em> this memory when the process actually uses it. The system can also refuse to commit memory if it chooses to do so (for example, because it doesn’t have enough memory for that).</li><li>Used virtual memory – a memory section that was previously committed from the operating system and is actually in use. When you commit memory, nothing actually happens yet. 
Only when you access the memory will the OS actually assign a physical page to the memory you just accessed. The distinction between the last two is quite important. It is very common to commit far more memory than is actually in use. By not actually taking any space until it is used, the OS can save a lot of work. </li></ul><p><strong>Memory mapped files</strong> – a section of the virtual address space that uses a particular file as its backing store.</p><p><strong>Shared memory</strong> – a named piece of memory that may be mapped into more than a single process. </p><p>All of these interact with one another in non-trivial ways, so it can sometimes be hard to figure out what is going on.</p><p>The interesting case happens when the amount of memory we want to access is higher than the amount of physical RAM on the machine. At this point, the operating system needs to start juggling things around and actually making decisions.</p><p>Reserving virtual memory is mostly a very cheap operation. This can be used if you want a contiguous memory range but don’t need all of it right now. On 32 bits, the address space is quite constrained, so that can fail, but on 64 bits, you typically have enough address space that you don’t have to worry about it.</p><p>Committing virtual memory is where we start getting into interesting issues. We ask the operating system to ensure that we can access this memory, and it will typically say yes. But in order to make that commitment, the OS needs to look at its global state. How many <em>other</em> commitments has it made? In general, the amount of memory commitments that the OS can safely make is limited to the size of the RAM plus the size of the swap. Windows will simply refuse to allocate more (though it can dynamically increase the size of the swap as load grows), but Linux will happily ignore the limit, relying on the fact that applications will rarely actually make use of all the memory they are committing. 
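The committed-versus-used distinction is visible from plain user code. Here is a small Python sketch (Linux semantics; the sizes are arbitrary): creating an anonymous mapping adds to the commit charge, but physical pages are only assigned as they are first touched:

```python
import mmap

SIZE = 256 * 1024 * 1024        # commit 256 MB of anonymous memory

# Creating the mapping is essentially a ledger entry for the OS:
# no physical pages are assigned yet, so resident memory barely moves.
m = mmap.mmap(-1, SIZE)

# "Used" starts here: the first write to each page triggers a page
# fault, and only then does the OS assign a physical page to it.
page = mmap.PAGESIZE
for offset in range(0, 16 * page, page):
    m[offset] = 1               # touch 16 pages; roughly 64 KB becomes resident

m.close()
```

Reserving without committing has no direct stdlib equivalent; on Linux it corresponds to a PROT_NONE mapping, and on Windows to VirtualAlloc with MEM_RESERVE.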
</p><p>So committed memory is counted against the limit, but it isn’t memory that is actually in use. Only when a process accesses the memory will the OS actually allocate it; until then, it is just a ledger entry.</p><p>But the memory on your machine is not just stuff that processes allocated. There is a bunch of other stuff that may make use of the physical memory, such as I/O bound devices, which we’ll ignore because they don’t matter for us at this point.</p><p>Of much more interest to us at this point is the notion of memory mapped files. These are most certainly memory resident, but they aren’t counted against the commit size of the system. Why is that? Because when we use a memory mapped file, by definition, we are also supplying a file that will be the backing store for this memory. That, in turn, means that we don’t need to worry about where we’ll put this memory if we need to evict some for other purposes; we have the actual file.</p><p>All of this, of course, revolves around the issue of what will actually reside in physical memory. And that leads us to another very important term: </p><p><strong>Active working set</strong> – The portion of the process memory that resides in physical RAM. Some portions of the process memory may have been paged to disk (if the system is overloaded or if it has just mapped a file and hasn’t accessed the contents yet). The term refers to the amount of memory that the process has recently been using, and under load, the working set may be larger than the amount of physical memory available, leading to thrashing. The OS will keep evicting pages to the page file and then loading them again, in a vicious cycle that typically kills the machine. 
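A memory mapped file in miniature (a Python sketch using a throwaway temp file, not RavenDB code): the file itself is the backing store, so the OS can always write dirty pages back to it, or simply drop clean ones, without touching the swap:

```python
import mmap
import os
import tempfile

# Create a 4 MB file to serve as the mapping's backing store.
fd, path = tempfile.mkstemp()
os.truncate(path, 4 * 1024 * 1024)

with os.fdopen(fd, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:
        m[0:5] = b"hello"   # dirties a page; the OS owes it to the file
        m.flush()           # force the write-back now instead of lazily

# Under memory pressure the OS can evict clean mapped pages for free:
# the data is already in the file and can be re-read on demand.
with open(path, "rb") as f:
    assert f.read(5) == b"hello"
os.remove(path)
```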
</p><p>Now that we know all these terms, let’s take a look at what RavenDB reports in one case:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Memory-management-as-the-operating-syste_D121/image_4.png"><img width="516" height="162" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Memory-management-as-the-operating-syste_D121/image_thumb_1.png" border="0"></a></p><p>The total system memory is 8 GB (about 200MB are reserved for the hardware). RavenDB is using 5.96GB, and the machine’s entire memory usage is 1.95GB. How can a single process on the machine use more memory than the entire machine?</p><p>The reason for that is that we aren’t always talking about the same thing. Here is the output of pertinent memory information from this machine (<em>cat /proc/meminfo</em>):</p><blockquote><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Memory-management-as-the-operating-syste_D121/image_6.png"><img width="359" height="184" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Memory-management-as-the-operating-syste_D121/image_thumb_2.png" border="0"></a></p></blockquote><p>You can see that we have a total memory of 8GB, but only 140MB are free. In active use we have 2.2GB, and a lot of stuff is marked inactive.</p><p>There is also the MemAvailable field, which says that we have 6.2GB available. But what does this mean? It is a guesstimate of how much memory we can start using without starting to swap. 
Taking the values from <em>top</em>, it might be easier to understand:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Memory-management-as-the-operating-syste_D121/image_8.png"><img width="1209" height="41" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Memory-management-as-the-operating-syste_D121/image_thumb_3.png" border="0"></a></p><p>There are about 6GB of cached data, but what is it caching? The answer is that RavenDB is making use of memory mapped files, so we gave the system extra swap space, so to speak. Here is what this looks like when looking at the RavenDB process:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Memory-management-as-the-operating-syste_D121/image_10.png"><img width="1337" height="78" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Memory-management-as-the-operating-syste_D121/image_thumb_4.png" border="0"></a></p><p>In other words, large parts of our working set are composed of memory mapped files, and we don’t want to count those against the actual memory in use in the system. 
It is very common for us to operate with almost no free memory, because that memory is being used (by the memory mapped files) and the OS knows that it can just repurpose this memory if new demands come in.</p>
https://ayende.com/blog/181825-A/memory-management-as-the-operating-system-sees-it?Key=491c6405-43e0-4169-960f-72d2625e687e
Wed, 14 Feb 2018 10:00:00 GMT

Handling resource disposal on constrained machines

<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Handling-state_8447/image_2.png"><img width="203" height="259" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Handling-state_8447/image_thumb.png" border="0"></a>A user reported that when running a set of unit tests against a RavenDB service running on a 1 CPU, 512MB Docker machine instance, they were able to reliably reproduce an out of memory exception that would kill the server.</p><p>They were able to do that by simply running a long series of pretty simple unit tests. In some cases, this crashed the server.</p><p>It took a while to figure out what the root cause was. RavenDB unit tests are run on isolated databases, with each test having its own db. Running a lot of these tests, especially short ones, will effectively create and delete a <em>lot</em> of databases on the server.</p><p>So we were able to reproduce this independently of anything else by just running database create/delete in a loop, and the problem became obvious. </p><p>Spinning up and tearing down a database are pretty big operations. Unit tests aside, this is not something that you’ll typically do very often. But with unit tests, you may be creating thousands of databases very rapidly. 
And it looked like that caused a problem, but why?</p><p>Well, we had hanging references to these databases. There were two major areas that caused this:</p><ul><li>Timers of various kinds that might hang around for a while, waiting to actually fire. We had a few cases where we weren’t actually stopping the timer on db teardown, just checking whether the db was disposed when the timer fired. </li><li>Queues for background operations, which might be delayed because we want to optimize I/O globally. In particular, flushing data to disk is expensive, so we defer it as late as possible. But we didn’t remove the db entry from the flush queue on shutdown, relying on the flusher to check whether the db was disposed.</li></ul><p>None of these are actually really leaks. In both cases, we will clean up everything eventually. But while that happens, this keeps a reference to the database instance and prevents us from fully releasing all the resources associated with it. </p><p>Being more explicit about freeing all these resources <em>now</em>, rather than waiting a few seconds and having them released automatically, made a major difference in this scenario. This was only a problem on such small machines because the amount of memory they had was small enough that they ran out before they could start clearing the backlog. On machines with a bit more resources, everything hums along nicely, because there is enough spare capacity.</p>
https://ayende.com/blog/181795-C/handling-resource-disposal-on-constrained-machines?Key=50ccd1c3-9dc9-49c2-8323-deea080bf9d0
Tue, 13 Feb 2018 10:00:00 GMT

Understanding memory utilization in RavenDB

<p>This is a snapshot from our production server at this very moment. As you can see, the system is now using every little bit of RAM that it has at its disposal. You can also see that the CPU is kinda busy and the network is quite active. 
</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Understanding-memory-utilization-in-Rave_7643/image_2.png"><img width="558" height="522" title="image" style="border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Understanding-memory-utilization-in-Rave_7643/image_thumb.png" border="0"></a></p><p>In most cases, an admin seeing this will immediately hit the alarm bell and start figuring out what is causing the system to use all available memory. This looks like a system that is on the precipice of doom, after all.</p><p>However, this is a system that is actually:</p><ul><li>Working as intended</li><li>Quite fast</li><li>Has plenty of extra headroom to spare</li></ul><p>The problem is that the numbers, as you see them here, are lying to you. The system is using 3.9 GB of memory, but look at the Committed value. Only 2.6GB are actually committed. Memory internals are complex, but basically, this means that the system needs to find room in RAM and in the page file for 2.6GB. But what about all the other stuff? That is being used, but it isn’t held back from the system if we need it. RavenDB is making active use of all the memory that is available in the system, by way of memory mapped files. Let’s see what the RAMMap tool (which is great at diagnosing such issues) tells us:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Understanding-memory-utilization-in-Rave_7643/image_4.png"><img width="625" height="105" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Understanding-memory-utilization-in-Rave_7643/image_thumb_1.png" border="0"></a></p><p>So out of a total of 4GB on this machine, about 2.5 GB of what is actually in memory is memory mapped files. 
Of these, 1.3 GB are in the active set (so they are actively worked on), but we also have 1.1GB in the Standby column.</p><p>What this means is that this is memory the OS can just repurpose without any extra cost attached. Note that the amount of modified memory (which would require us to write it out to disk) is really small.</p><p>This is a good place to be, even if at first glance it was quite surprising to see.</p>
https://ayende.com/blog/181729-C/understanding-memory-utilization-in-ravendb?Key=ce51837b-f52d-4a0e-a13d-64f0a9712265
Tue, 06 Feb 2018 10:00:00 GMT

Transactions, request processing and convoys, oh my!

<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Transactions-request-processing-and-conv_1547/image_2.png"><img width="492" height="509" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Transactions-request-processing-and-conv_1547/image_thumb.png" border="0"></a>You might have noticed that we have been testing, quite heavily, what RavenDB can do. Part of that testing focused on running on barely adequate hardware at ridiculous loads. </p><p>Because we set up the production cluster like drunk monkeys, a single high load database was able to affect all other databases. To be fair, I explicitly gave the booze to the monkeys and then didn’t look when they made a hash of it, because we wanted to see how RavenDB does when set up by someone who has little interest in how to properly set things up and just wants to Get Things Done.</p><p>The most interesting part of this experiment was that we had a wide variety of mixed workloads on the system. 
Some databases are primarily read heavy, some are used infrequently, some databases are write heavy, and some are both write heavy and indexing heavy. Oh, and all the nodes in the cluster were set up identically, with each write and index going to all the nodes. </p><p>That was really interesting when we started very heavy indexing loads and then pushed a lot of writes. Remember, we intentionally under-provisioned: these machines are 2 cores with 4GB RAM, and they were doing heavy indexing and processing a lot of reads and writes, the works. </p><p>Here is the traffic summary from one of the instances:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Transactions-request-processing-and-conv_1547/image_4.png"><img width="515" height="284" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Transactions-request-processing-and-conv_1547/image_thumb_1.png" border="0"></a></p><p>A wrinkle that I didn’t mention is that this setup has a really interesting property that we never tested. It has fast I/O, but the number of tasks that are waiting for the two cores is high enough that they don’t get a lot of CPU time on an individual basis. What <em>that</em> means is that it looks like we have I/O that is faster than the CPU.</p><p>A core concept of RavenDB performance is that we can send work to the disk and, in that timeframe, complete more operations in memory, then send a batch of them to disk. Rinse, repeat, etc.</p><p>This wasn’t the case here. By the time we finished a <em>single</em> operation, the previous operation had already completed, so we’d proceed with a single operation every time. 
That <em>killed</em> our performance, and it meant that the transaction merger queue would grow and grow.</p><p>We fixed things so we take the load on the system into account when this happens, and we gave the transaction merging thread a higher priority than normal request processing or indexing. This is because write requests can’t complete before the transaction has been committed, so obviously we don’t want to accept further requests at the expense of completing the current ones. </p><p>This problem can only occur when we are competing heavily for CPU time, something that we don’t typically see. We are usually constrained a lot more by network or disk. With enough CPU capacity, there is never an issue of the requests and the transaction merger competing for the same core and interrupting each other, so we hadn’t considered this scenario.</p><p>Another problem we had was the kind of assumptions we made about processing power. Because we tested on higher end machines, we tested with some ridiculous performance numbers, including <em>hundreds of thousands </em>of writes per second, indexing, mixed read / write load, etc. But we tested that on high end hardware, which means that we got requests that completed fairly quickly. And that led to a particular pattern of resource utilization. Because we reuse buffers between requests, it is typically better to grab a single big buffer and keep using it, rather than having to enlarge it between requests. </p><p>If your requests are short, the number of requests you have in flight is small, so you get great locality of reference and don’t have to allocate memory from the OS so often. But if that isn’t the case…</p><p>We had a <em>lot</em> of requests in flight. A lot of them were waiting for the transaction merger to complete its work, while it was having to fight the incoming requests for CPU time. 
So we have a lot of in-flight requests, and they intentionally got a bigger buffer than they actually needed (pre-allocating). You can continue down this line of thought, but I’ll give you a hint: it ends with KABOOM.</p><p>All in all, I think that it was a very profitable experiment. This is going to be very relevant for <a href="https://ayende.com/blog/181251-A/there-is-a-docker-in-your-assumptions?key=4bd3e12435c64ea5ab8fe67f223176d4">users on the low end</a>, especially those running Docker instances, etc. But it should also help if you are running production-worthy systems and can benefit from higher utilization of the system resources.</p>
https://ayende.com/blog/181697-C/transactions-request-processing-and-convoys-oh-my?Key=a119bec5-7f82-4938-ab8c-531b4da13212
Fri, 02 Feb 2018 10:00:00 GMT

The cost of finalizers

<p>A common pattern we use is to wrap objects that require disposing of resources with a dispose / finalizer. Here is a simple example:</p><blockquote><script src="https://gist.github.com/ayende/49aaf0f52b9551382a0e22555ac31cfd.js"></script></blockquote><p>This is a pretty simple case, and obviously we try to avoid these allocations, so we try to reuse the SecretItem class.</p><p>However, when peeking into a high memory utilization issue, we ran into a really strange situation. We had a <em>lot</em> of buffers in memory, and most of them were held by items that were held in turn by the Finalization Queue. </p><p>I started to write up what I figured out about finalizers in .NET, but <a href="https://nabacg.wordpress.com/2012/03/11/what-do-you-know-about-freachable-queue/">this post does a great job of explaining everything</a>. The issue here is that we retain a reference to the byte array after the dispose. Because the object is finalizable, the GC isn’t going to be able to remove it on the spot. 
That means that it is reachable, and anything that it holds is also reachable. </p><p>Now, we are trying <em>very </em>hard to be nice to the GC and not allocate too much, which means that the finalizer thread will only wake up occasionally, but that means that the buffer will be held in memory for a much longer duration. </p><p>The fix was to set the Buffer to null during the Dispose call, which means that the GC can notice much faster that it isn’t actually being used.</p>
https://ayende.com/blog/181665-A/the-cost-of-finalizers?Key=664d22a5-0349-4e9a-a069-73f0b1be707a
Thu, 01 Feb 2018 10:00:00 GMT

Production Test Run: Rowhammer in Voron

<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Production-Test-Run_146DE/image_2.png"><img width="411" height="167" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Production-Test-Run_146DE/image_thumb.png" border="0"></a><a href="https://en.wikipedia.org/wiki/Row_hammer">Rowhammer</a> is a type of attack on the way DRAM is built. A specific write pattern to a specific memory cell can cause “leakage” to nearby memory cells, causing bit flips. The issue we were facing in production ended up being very similar. </p><p>The initial concern was that a database’s on-disk size was very large. Enough that we started seriously investigating what exactly was taking all this space. Our internal reporting said: nothing. And that was very strange. Looking at the actual file usage, we had a <em>lot</em> of transaction journals taking up a lot of space there. By a lot, I mean that the data file in question was 32 MB and the journals were taking a total of over 15GB. 
To be fair, some of them were kept around to be reused, but that was <em>bad</em>.</p><p>It took a while to figure out what was going on. This particular Voron instance was used for a map/reduce index on a database that had <em>high</em> write throughput. Because of that, the indexing was <em>always</em> active. So far, so good; we have several other such instances, and they don’t exhibit this behavior. What was different about this index is that, due to the shape of the index and the nature of the data, we would always modify the same (small) set of the data. </p><p>This index sums up a number of events and aggregates them based on when they happened. This particular system handles about a hundred updates a second on average, and can peak at about five to seven times that. The index gives us things such as “how many events of each type happened today” and things like that. This means that there is a <em>lot</em> of locality in the updates. And that was the key. </p><p>Even though this index (and the Voron storage that backed it) was experiencing a lot of writes, these writes almost always happened to the same set of data (basically updating the counters). That means that there was actually only a very small set of pages in the data that were modified. And that set off a chain of behaviors that resulted in a lot of disk space being used. </p><ul><li>A lot of data is modified, meaning that we need to write a lot to the journal on each transaction.</li><li>Because the same data is constantly modified, the total amount of modified bytes is smaller than a certain threshold.</li><li>Writes are constant.</li><li>Flushing the data to disk is <em>expensive</em>, so we try to avoid it.</li><li>We can only delete journals after we have flushed the data.</li></ul><p>Because we try to avoid flushing to disk if we can, we only do that when there is enough idle time or when enough data has been modified. 
In this case, there <em>was</em> no idle time, and the amount of data that was modified was too small to hit the limit.</p><p>The system would actually balance itself out eventually (which is why it stopped at around ~15GB of journals). At some point we would either hit an idle spot or the buffer would hit the limit and we’d flush to disk, which allowed us to free the journals, but that happened only after we had quite a few. The fix was to add a limit on how long we’ll delay flushing to disk in such a case, and it was quite trivial once we figured out what exactly all the different parts were doing.</p>
https://ayende.com/blog/181409-A/production-test-run-rowhammer-in-voron?Key=78d06309-34fb-4dd8-9d39-42072a903740
Mon, 22 Jan 2018 10:00:00 GMT

The TCP Inversion Proposal

<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The-TCP-Inversal-Prosopal_817F/image_2.png"><img width="691" height="482" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The-TCP-Inversal-Prosopal_817F/image_thumb.png" border="0"></a>A customer asked for an interesting feature. Given two servers that need to replicate data between themselves, and the following network topology:</p><ul><li>The Red DB is on the internal network; it is able to connect to the Gray DB.</li><li>The Gray DB is in the DMZ; it is <em>not</em> able to connect to the Red DB.</li></ul><p>They are set up (with RavenDB) to replicate to one another. With RavenDB, this is done by each server opening a TCP connection to the other and sending all the updates this way. 
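This relies on a basic TCP property: once the connection exists, either side can send, no matter which side dialed. A quick Python illustration, with socketpair() standing in for an established connection between the two servers:

```python
import socket

# socketpair() stands in for an already-established TCP connection;
# which side dialed is irrelevant once the connection exists.
red, gray = socket.socketpair()

red.sendall(b"replication batch")      # the side that connected sends...
assert gray.recv(64) == b"replication batch"

gray.sendall(b"ack")                   # ...and the other side can reply
assert red.recv(64) == b"ack"

red.close()
gray.close()
```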
</p><p>Now, this is a simple example of a one-way network topology, but there are many other cases where you might get into a situation where two nodes need to talk to each other, but only one node is able to connect to the other. However, once a TCP connection is established, communication is bidirectional. </p><p>The customer asked if we could add a configuration to support reversing the communication mode for replication. Instead of the source server initiating the connection, the destination server will do that, and then the source server will use the already established TCP connection henceforth. </p><p>This <em>works</em>, at least in theory, but there are many subtle issues that you’ll need to deal with:</p><ul><li>This means that the source server (now confusingly the one that <em>accepts</em> requests) is not in control of sending the data. Conversely, it means that the destination side must always keep the connection open, retrying immediately if there was a failure and never getting a chance to actually idle. This is a minor concern.</li><li>Security is another problem. Replication is usually set up by the admin on the source server, but now we have to set it up on both ends, and make sure that the destination server has the ability to connect to the source server. That might carry with it more permissions than we want to grant to the destination (such as the ability to <em>modify</em> data, not just get it). </li><li>Configuration is now more complex, because replication has a bunch of options that need to be set, and now we need to set these on the source server, then somehow have the destination server let the source know which replication configuration it is interested in.
What happens if the configuration differs between the two nodes?</li><li>Failover in a distributed system made of distributed systems is <em>hard.</em> So far we have talked about nodes, but that isn’t actually the case; the Red and Gray DBs may be clusters in their own right, composed of multiple independent nodes each. When using replication in the usual manner, the source cluster will select a node to be in charge of the replication task, and this will replicate the data to a node on the other side. This can have multiple failure modes: a source node can be down, a destination node can be down, etc. That is all handled, but it will need to be handled <em>anew</em> for the other side.</li><li>Concurrency is yet another issue. Replication is now controlled by the source, so it can assume a certain sequence of operations. If the destination can initiate the connection, it can initiate <em>multiple</em> connections (or different destination nodes will open connections to the same / different source nodes at the same time), resulting in a sequential code path suddenly needing to deal with concurrency, locking, etc. </li></ul><p>In short, even though it looks like a simple feature, the amount of complexity it brings is <em>quite</em> high. </p><p>Luckily for us, we don’t <em>need</em> to do all that. If what we want is just to have the connection be initiated by the other side, that is quite easy. Set things up <em>the other way</em>, at the TCP level. We’ll call our solution Routy, because it’s funny. </p><p>First, you’ll need a Routy service at the destination node; this will just open a TCP connection to the source node. Because this is initiated by the destination, this works fine. This TCP connection does not go directly to the DB on the source; instead, it goes to the Routy service on the source, which will accept the connection and keep it open. </p><p>On the source side, you’ll configure the database to connect to the source-side service.
At this point, the Routy service on the source has <em>two</em> TCP connections: one that came from the source and one that came from the remote Routy service on the destination. From then on, it will basically copy the data between the two sockets and we’ll have a connection to the other side. On the destination side, the Routy service will start getting data from the source, at which point it will initiate its own connection to the database on the destination, ending up with two connections of its own that it can then tie together.</p><p>From the point of view of the databases, the source server initiated the connection and the destination server accepted it, as usual. From the point of view of the network, this is a TCP connection that came from the destination to the source, and then a bunch of connections locally on either end. </p><p>You can write such a Routy service in a day or so, although the error handling is probably the trickiest part. However, you probably don’t need to do that. This is called TCP reverse tunneling, and you can just use <a href="https://toic.org/blog/2009/reverse-ssh-port-forwarding/">SSH to do that</a>. There are also many other tools that can do the same. </p><p>Oh, and you might want to talk to your network admin first; it is possible that this would be easier if they just change the firewall settings. And if they don’t do that, remember that this is <em>effectively</em> doing the same thing, so if they refuse, there might be a good reason for it.
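</p>

<p>The byte-pumping core of such a service is small. Here is a sketch in Python (names invented; error handling, which as noted above is the hard part, is omitted):</p>

```python
# Sketch of the socket-splicing core of a "Routy"-style relay: once both
# connections exist, pump bytes in both directions until either side
# closes. Real-world error handling and retry logic are omitted.

import socket
import threading

def pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches EOF."""
    try:
        while True:
            data = src.recv(64 * 1024)
            if not data:
                break
            dst.sendall(data)
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # propagate EOF to the other side
        except OSError:
            pass

def splice(a: socket.socket, b: socket.socket) -> None:
    """Tie two established connections together, bidirectionally."""
    t = threading.Thread(target=pump, args=(b, a), daemon=True)
    t.start()
    pump(a, b)
    t.join()
```

<p>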
</p>https://ayende.com/blog/181377-B/the-tcp-inversion-proposal?Key=a8b41f20-05f6-4f96-adff-a355b0fc0034https://ayende.com/blog/181377-B/the-tcp-inversion-proposal?Key=a8b41f20-05f6-4f96-adff-a355b0fc0034Fri, 19 Jan 2018 10:00:00 GMTProduction Test Run: The worst is yet to come<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Production-Test-Run-The_128E2/image_2.png"><img width="288" height="343" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Production-Test-Run-The_128E2/image_thumb.png" border="0"></a>Before stamping RavenDB with the RTM marker, we decided that we wanted to push it to our production systems. That is something that we have been doing for quite a while, obviously, dogfooding our own infrastructure. But this time was different. While before we had a pretty simple deployment and stable pace, this time we decided to mix things up.</p><p>In other words, we decided to go ahead with the IT version of the stooges, for our production systems. In particular, that means this blog, the internal systems that run our business, all our websites, external services that are exposed to customers, etc. As I’m writing this, one of the nodes in our cluster has run out of disk space; it has been doing that since last week. Another node has been torn down and rebuilt at least twice during this run. </p><p>We also did a few rounds of “if it compiles, it fits production”.
In other words, we basically <a href="https://twitter.com/ExpertBeginner1">read this guy’s twitter stream and did what he said.</a> This resulted in an infinite loop in production on two nodes, and <em>that</em> issue was handled by someone who didn’t know what the problem was, wasn’t part of the change that caused it, was able to figure it out, and then had to work around it with no code changes.</p><p>We also had two different teams upgrade their (interdependent) systems at the same time, which included both upgrading the software and adding new features. I also had two guys with the ability to manage machines, and a whole brigade of people who were uploading things to production. That meant that we had a distinct lack of knowledge across the board, so the people managing the machines weren’t always aware of what the system was experiencing and the people deploying software weren’t aware of the actual state of the system. At some point I’m pretty sure that we had two concurrent (and opposing) rolling upgrades to the database servers.</p><p>No, I didn’t spike my coffee with anything but extra sugar. This mess of a production deployment was quite carefully <em>planned</em>. I’ll admit that I wanted to do that a few months earlier, but it looks like my shipment of additional time was delayed in the mail, so we do what we can.</p><p>We need to support this software for a minimum of five years, likely longer, which means that we really need to see where all the potholes are and patch them as best we can. This means that we need to test it in bad situations. And there is only so much that a chaos monkey can do. I don’t just mean seeing what happens when the network fails. That is easy enough to simulate and certainly something that we are thinking about. But being able to diagnose a live production system with an infinite loop caused by bad error handling, and recovering from that.
That is the kind of stuff that I want to know we can do in order to properly support things in production.</p><p>And while we had a few glitches, for the most part I don’t think that any of them were really observed externally. The reason for that is the reliability mechanisms in RavenDB 4.0; we need just a single server to remain functional, for the most part, which meant that we could just run without issue even if most of the cluster was flat out broken for an extended period of time.</p><p>We got a <em>lot</em> of really interesting results from this experience, and I’ll be posting about some of them in the near future. I don’t think that I would recommend doing that for any customer, but the problem is that we <em>have</em> seen systems that are managed about as poorly, and we want to be able to survive in such (hostile) environments and also be able to support customers that have partial or even misleading ideas about what their own systems look like and how they behave. </p>https://ayende.com/blog/181345-C/production-test-run-the-worst-is-yet-to-come?Key=e182a5a6-126a-420f-b36b-7b3e8b9b5b03https://ayende.com/blog/181345-C/production-test-run-the-worst-is-yet-to-come?Key=e182a5a6-126a-420f-b36b-7b3e8b9b5b03Tue, 16 Jan 2018 10:00:00 GMTThere is a Docker in your assumptions<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/There-is-a-docker-in-your-assumptions_E97B/image_4.png"><img width="378" height="220" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/There-is-a-docker-in-your-assumptions_E97B/image_thumb_1.png" border="0"></a>About a decade ago I remember getting a call from a customer that was very upset with RavenDB. They had just deployed a brand new system to production; they were ready for high load, so they went with a 32 GB, 16 core system (which was a <em>lot</em> at the time).
</p><p>The gist of the issue: RavenDB was using 15% of the CPU and about 3 GB of RAM to serve requests. When I inquired further about how fast it was doing that I got a grudging “a millisecond or three, not more”. I ended the call wondering if I should add a thread that would do nothing but allocate memory and compute primes. That was a long time ago, since the idea of having a thread do crypto mining didn’t occur to me <img class="wlEmoticon wlEmoticon-smile" style="" alt="Smile" src="https://ayende.com/blog/Images/Open-Live-Writer/There-is-a-docker-in-your-assumptions_E97B/wlEmoticon-smile_2.png">.</p><p>This is a funny story, but it shows a real problem. Users really want you to be able to make full use of their system, and one of the design goals for RavenDB has been to do just that. This means making use of as much memory as we can and as much CPU as we need. We did that with an eye toward common production machines, with many GB of memory and cores to spare.</p><p>And then came Docker, and suddenly it was the age of the 512MB machine with a single core all over again. That caused… issues for us. In particular, our usual configuration is meant for a much stronger machine, so we now need to also deploy a separate configuration for lower-end machines.
Luckily for us, we were always planning on running on low-end hardware, for POS and embedded scenarios, but it is funny to see the resurgence of the small machine in production again.</p>https://ayende.com/blog/181251-A/there-is-a-docker-in-your-assumptions?Key=4bd3e124-35c6-4ea5-ab8f-e67f223176d4https://ayende.com/blog/181251-A/there-is-a-docker-in-your-assumptions?Key=4bd3e124-35c6-4ea5-ab8f-e67f223176d4Thu, 11 Jan 2018 10:00:00 GMTThe five requirements for the design of all major RavenDB features<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The-five-pillars-of-all-major-RavenDB-fe_121BA/image_2.png"><img width="452" height="245" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The-five-pillars-of-all-major-RavenDB-fe_121BA/image_thumb.png" border="0"></a>We started some (minor) design work for the <em>next</em> set of features for RavenDB (as we discussed in the roadmap) and a few interesting things came out of that. In particular, the concept of the five pillars any major feature needs to stand on.</p><p>By major I mean something that impacts the persistent state of the system as a whole. For example, attachments, <a href="https://ayende.com/blog/180067/ravendb-4-0-interlocked-distributed-operations">cmpxchng</a>, revisions and conflicts are quite obvious in this manner, while a query is local and transient.</p><p>Here they are, in no particular order of importance:</p><ul><li>Client API</li><li>Cluster</li><li>Backup</li><li>Studio</li><li>Disaster Recovery</li></ul><p>The client API is how a feature is exposed to clients, obviously. This can be explicit, as in the case of attachments, or more subtle, like the CmpXchg usage, which can be either the low level calls or using it directly from RQL.</p><p>The cluster is how a particular feature operates in the cluster.
In the case of attachments, it means that attachments flow across the network as part of the replication behavior between nodes. For CmpXchg, it means that the values are directly stored in the cluster state machine and are managed by the Raft cluster. The actual <em>way</em> it works doesn’t matter; what matters is that the implications of this feature in a distributed environment have been thought through.</p><p>Backup is subtle. It is easy to implement a feature and forget that we actually need to support backup and restore until very late in the game. RavenDB has a few backup strategies (full snapshot or regular), and this also includes migrating data from another instance, long term behavior, etc. The feature needs to work across all of them. </p><p>The studio refers to how we are actually going to expose a feature to the user in the studio. A good example where we failed is the CmpXchg values, which are currently not exposed in the studio (there are endpoints for that, but we haven’t got around to this). We are feeling the lack and it is on the fast track for new features for the next minor release. If a feature isn’t in the studio, how do we expect a user to discover, manage or work with it?</p><p>Finally, we have <a href="https://ayende.com/blog/179713/when-disk-and-hardware-fall">disaster recovery</a>. We are taking data integrity very seriously, and one of the things we do is to make sure that even in the case of disk failure or some other data corruption, we can still get the data out. This is done by laying out the data on disk in such a way that there are multiple ways to access it. First, by reading the data normally and assuming a valid structure. This is what we usually do. Second, by reading one byte at a time and still being able to reconstruct the data back, even if some parts of it have been corrupted.
This requires us to plan in advance how we store the data for a feature so we can support recovery.</p><p>There is other stuff as well, anything from monitoring to debugging to performance, but usually they aren’t so important at the <em>design</em> phase of a feature.</p>https://ayende.com/blog/181185-C/the-five-requirements-for-the-design-of-all-major-ravendb-features?Key=dfe37b13-2ca8-4e08-b3ad-d0e0ca841b0dhttps://ayende.com/blog/181185-C/the-five-requirements-for-the-design-of-all-major-ravendb-features?Key=dfe37b13-2ca8-4e08-b3ad-d0e0ca841b0dThu, 04 Jan 2018 10:00:00 GMTInvisible race conditions: The cache has poisoned us<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Inisibile_D600/image_2.png"><img style="border: 0px currentcolor; float: right; display: inline; background-image: none;" title="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Inisibile_D600/image_thumb.png" alt="image" width="333" height="275" align="right" border="0" /></a>We got a memory corruption error the other day that was quite interesting. It was in a place where we previously <em>fixed </em>a memory corruption error and it was, at a glance, quite impossible.</p>
<p>The code would check out an item from the cache and increment its ref count, which would keep it alive for as long as we were using it. But <em>something </em>made it fail, and quite horribly, too. We finally tracked the problem down to this piece of code, which is run when we update the cache:</p>
<blockquote>
<script src="https://gist.github.com/ayende/bcf2b2c147e7cf3f37148bad785a8dbe.js"></script>
</blockquote>
<p>When the ref count goes to zero, we&rsquo;ll release the memory, and _items is a ConcurrentDictionary.</p>
<p>Do you see the error?</p>
<p>The AddOrUpdate method will call the updateValueFactory when it needs to update a value, but it makes <em>no promises </em>with regards to its atomicity. In other words, if you have two threads calling this method, the update lambda will be called <em>twice</em> with the same item, resulting in early release of the value and hence memory corruption.</p>
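<p>To see why, it helps to model the optimistic retry loop that an AddOrUpdate-style method runs internally. The following toy model is in Python with invented names (the real code is C#); it shows the update factory running twice for a single logical update when another writer sneaks in between the factory call and the compare-and-swap:</p>

```python
# Toy model (invented names, Python rather than C#) of an optimistic
# AddOrUpdate-style loop. The stored result is consistent, but the update
# factory may run more than once per logical update, so side effects in it
# (like releasing a ref count) can happen twice.

def add_or_update(d, key, add_factory, update_factory, interloper=None):
    while True:
        if key not in d:
            d[key] = add_factory(key)
            return d[key]
        current = d[key]
        new_value = update_factory(key, current)  # may run several times!
        if interloper is not None:                # simulate a racing writer
            interloper()
            interloper = None
        if d.get(key) is current:                 # the "CAS" check
            d[key] = new_value
            return new_value

factory_calls = []
cache = {"k": "v0"}

def update(key, old):
    factory_calls.append(old)  # imagine: old.Release() -- a double release!
    return old + "+"

# A concurrent writer sneaks in between the factory call and the CAS:
add_or_update(cache, "k", lambda k: "v0", update,
              interloper=lambda: cache.__setitem__("k", "v1"))
```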
<p>This can be <a href="https://github.com/dotnet/corefx/blob/780ad74c8cfae78a7eaba65530e436ddda63142e/src/System.Collections.Concurrent/src/System/Collections/Concurrent/ConcurrentDictionary.cs#L1080-L1100">seen here</a>:</p>
<blockquote>
<script src="https://gist.github.com/ayende/233f23152b33c23fa478ebb92e9cbd5b.js"></script>
</blockquote>
<p>As you can see, we are looking at a loop that may be executed several times. As such, the updateValueFactory can be called several times, and the only guarantee we have is that after the method has returned, the last value the factory was called with was the value that was in the cache and that we replaced.</p>
<p>Here is the fix:</p>
<blockquote>
<script src="https://gist.github.com/ayende/38590ab1d37a585b011f08727e18c5c3.js"></script>
</blockquote>
<p>That was quite hard to figure out, because at a glance, this looks just fine.</p>https://ayende.com/blog/181121-A/invisible-race-conditions-the-cache-has-poisoned-us?Key=d75cf796-5004-4a42-8344-1ba8b5c4af1dhttps://ayende.com/blog/181121-A/invisible-race-conditions-the-cache-has-poisoned-us?Key=d75cf796-5004-4a42-8344-1ba8b5c4af1dTue, 02 Jan 2018 10:00:00 GMTWhen letting the user put the system into an invalid state is a desirable property<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/When-letting-the-user-put-the-system-int_C3D0/image_2.png"><img width="648" height="187" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/When-letting-the-user-put-the-system-int_C3D0/image_thumb.png" border="0"></a>A bug report was opened against RavenDB by one of our developers: “We should prevent error if the user told us to use a URL whose hostname isn’t matching the certificate provided.”</p><p>The intent is clear: we have a setup in which we have a certificate, and the only valid URLs in this case are hostnames that are in the certificate. If the user configured the system so we’ll be listening on: https://my-rvn-one but the certificate has hostname “their-rvn-two”, then we <em>know</em> that this is going to cause issues for clients. They will try to connect but fail because of certificate validation. The hostname in the URL and the hostnames in the certificate don’t match, therefore, an error.</p><p>I closed this bug as Won’t Fix, and that deserves an explanation. I usually <em>care</em> very deeply about the kind of errors that we generate, and we <em>want</em> to catch things as early as possible. </p><p>But sometimes, that is the worst possible thing.
By preventing the user from doing the “wrong” thing, we also prevent them from doing something that is required if you got yourself into a bad state.</p><p>Consider the following case: a node is down, and we provisioned another one. We got a different IP, but we need to update the DNS record. That is going to take 24 hours to propagate properly, but we need to be up <em>now</em>. So I change the system configuration to use a different URL, but I can’t get a certificate for the new one yet for whatever reason. Now the validation kicks in, and I’m dead in the water. I might just want to be able to peek into the system, or configure the clients to ignore the certificate error, or something.</p><p>In this case, putting the system into an invalid state (such as a mismatch between hostname and certificate) is desirable. An admin may want to do this for a host of reasons, mostly because they are under the gun and need things to <em>work</em>. There are a surprisingly large number of such cases, where you know that the situation is invalid, but you allow it because <em>not</em> doing so will lead to blocking off important scenarios.
</p>https://ayende.com/blog/180993/when-letting-the-user-put-the-system-into-an-invalid-state-is-a-desirable-property?Key=356dbc05-362f-4d34-98ed-c29329af7810https://ayende.com/blog/180993/when-letting-the-user-put-the-system-into-an-invalid-state-is-a-desirable-property?Key=356dbc05-362f-4d34-98ed-c29329af7810Fri, 22 Dec 2017 10:00:00 GMTThe married couple component design pattern<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The-married-couple-design-decision_CCE4/image_2.png"><img width="385" height="334" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The-married-couple-design-decision_CCE4/image_thumb.png" border="0"></a>One of the tough problems in distributed programming is how to deal with a component that is made up of multiple nodes. Consider a reservation service made up of a few nodes; it needs to ensure that regardless of which node you’re talking to, if you made a reservation, you’ll have a spot. There are a lot of ways to solve this from the inside, but that isn’t what I want to talk about right now. I want to talk about the overall approach to modeling such systems. </p><p>Instead of focusing on how you would implement such a system, consider this to be an internal problem for this particular component. A good parallel for this problem is making plans with a couple for a meetup. You might be talking to both of them or just one, but you don’t care. The person you are talking to is the one that is going to give you a decision that is valid for the couple.</p><p><em>How</em> they do that is not relevant. It can be that one of them is in charge of the social calendar or that they flip based on the day of the week or whoever got out of bed first this morning or whatever his mother called her dog last year or… you don’t care. Furthermore, you probably <em>don’t want to know</em>.
That is an internal problem, and sticking your nose into the internal decision making is a Bad Idea that may lead to someone sleeping on <em>your</em> couch for an indefinite period of time.</p><p>But, and this is important, you can walk to either one of them and they will make such a decision. It may be something on the order of “Let me talk to my significant other and I’ll get back to you by tomorrow” or it can be “Sure, how about 8 PM on Sunday?” or even “I don’t like you, so nope, nope nope” but you’ll get some sort of an answer.</p><p>Taking this back to distributed component design, that kind of decision is internal to the component, and the mechanics of how it is handled internally shouldn’t be exposed outside. Let’s take a look at why this is the case.</p><p>Starting out, we ran all such decisions as a consensus that required a majority of the nodes. But a couple of nodes went down and took down the system in a bad way, so in the next iteration we moved to reserving some spots for each node, which it owns and can hand off to others on its own, without consulting any other nodes.&nbsp; This sort of change shouldn’t matter to callers of this component, but it is very common to have outside parties take notice of how you are doing things and take a dependency on that.</p><p>The main reason I call it the married couple design problem is that it should immediately cause you to consider how you should stay <em>away</em> from the decision making there. Of course, if you don’t, I’m going to call your design the Mother In Law Architecture.</p>
In one of the RavenDB conferences a few years ago we had a fantastic <a href="https://www.google.co.il/url?sa=t&amp;rct=j&amp;q=&amp;esrc=s&amp;source=web&amp;cd=1&amp;cad=rja&amp;uact=8&amp;ved=0ahUKEwi1lP6Bj6LXAhUHHxoKHQXSCPYQtwIIJDAA&amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3Du9Q4qbbJ0Vw&amp;usg=AOvVaw1fD3rOeH1MpvZz8LotyOTL">talk, over an hour long, about just that.</a>&nbsp; It sucks because what a computer thinks of as time and what we think about as time are two very different things. This usually applies to applications, since that is where you are typically working with dates &amp; times in front of the users, but we had an interesting case with backup scheduling inside RavenDB.</p><p>RavenDB allows you to schedule full and incremental backups, and it uses the CRON format to set things up. This makes things very easy to set up and is highly configurable. </p><p>It is also very confusing. Consider the following initial user interface:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_8.png"><img width="633" height="116" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_thumb_3.png" border="0"></a></p><p>It’s functional, does exactly what it needs to do and allows the administrator complete freedom.
It is also pretty much opaque, requiring the admin to either know the CRON format by heart (possible, but not something that we want to rely on) or find a webpage that would translate that.</p><p>The next thing we did was to avoid the extra step and let you know explicitly what this definition meant.</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_6.png"><img width="637" height="156" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_thumb_2.png" border="0"></a></p><p>This is much better, but we can still do better. Instead of just an abstract description, let us let the user know <em>when </em>the next backup is going to happen. If you run backups each Friday, you probably want to change that to before or after <em>Black</em> Friday, for example. So date &amp; time matter. </p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_4.png"><img width="654" height="209" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_thumb_1.png" border="0"></a></p><p>This led us to the next issue, <em>what</em> time? In particular, backups are done on the server’s local time, on the assumption that most of the time this is what the administrator will expect. This makes it easier to do things like schedule backups to happen in the off hours. We thought about doing that always in UTC, but this would require you to always do date math in your head.</p><p>That does lead to the question of what to do when the admin’s clock and the server clock are out of sync.
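</p>

<p>Our answer was to show both clocks. A minimal sketch of that dual display (the zone names, the message format and the fixed “next run” value here are invented for illustration; the real studio computes the next occurrence from the CRON expression):</p>

```python
# Sketch of rendering a next-run time in both the server's clock and the
# user's clock. Zones, format and the fixed datetime are illustrative only.

from datetime import datetime
from zoneinfo import ZoneInfo

def describe_next_backup(next_run_server: datetime, user_zone: str) -> str:
    user_time = next_run_server.astimezone(ZoneInfo(user_zone))
    return (f"Next backup: {next_run_server:%Y-%m-%d %H:%M %Z} (server time), "
            f"{user_time:%Y-%m-%d %H:%M %Z} (your time)")

# Server in New York, admin browsing from Jerusalem:
next_run = datetime(2018, 1, 26, 2, 0, tzinfo=ZoneInfo("America/New_York"))
message = describe_next_backup(next_run, "Asia/Jerusalem")
```

<p>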
This is how it will look in that case.</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_2.png"><img width="636" height="263" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_thumb.png" border="0"></a></p><p>We let the user know that the backup will run in the local server time and when that will happen in the user’s time. </p><p>We also provide on-the-fly translation from the CRON format to a human-readable form.</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_10.png"><img width="580" height="112" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_thumb_4.png" border="0"></a></p><p>And finally, to make sure that we cover all the bases, in addition to giving you the time specification, the server time and local time, we also give you the time duration to the next backup. </p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_12.png"><img width="576" height="141" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/image_thumb_5.png" border="0"></a></p><p>I think that this covers pretty much every scenario that I can think of. </p><p>Except getting the administrator to do practice restores to ensure that they are familiar with how to do this.
<img class="wlEmoticon wlEmoticon-smile" alt="Smile" src="https://ayende.com/blog/Images/Open-Live-Writer/Handling-time_14714/wlEmoticon-smile_2.png"></p><p><strong>Update: </strong>Here is how the field looks when empty:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Time-handling-and-user-experience-Backup_FAAD/image_2.png"><img width="629" height="67" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Time-handling-and-user-experience-Backup_FAAD/image_thumb.png" border="0"></a></p>https://ayende.com/blog/180482/time-handling-and-user-experience-backup-scheduling-in-ravendb-as-a-case-study?Key=be482cc8-cf0d-4066-bc75-46fb06e43d56https://ayende.com/blog/180482/time-handling-and-user-experience-backup-scheduling-in-ravendb-as-a-case-study?Key=be482cc8-cf0d-4066-bc75-46fb06e43d56Tue, 05 Dec 2017 10:00:00 GMTAPI Design: The lack of a method was intentional forethought<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/API-Design-The-lack-of-a-method-was_7ECB/image_2.png"><img width="461" height="274" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/API-Design-The-lack-of-a-method-was_7ECB/image_thumb.png" border="0"></a>One of the benefits of having a product in the market for a decade is that you gain some experience in how people are using it. This led to interesting design decisions over time. Some of them are obvious, such as the setup process for RavenDB. Some aren’t, such as the surface of the session. It is kept small and focused on CRUD operations to make it easy to understand and use in the common cases.</p><p>And sometimes, the design is in the fact that the code isn’t there at all.
Case in point, the notion of connection strings in RavenDB 4.0. This feature was removed in its entirety in this release, and users are expected to provide the connection parameters to the document store on their own. How they do that is not something that we concern ourselves with. A large part of the reasoning behind this decision was around our use of X509 certificates for authentication. In many environments there are strict rules about the usage and deployment of certificates, and having a connection string facility would force us to keep chasing the latest requirements. For that matter, where you <em>store</em> the connection string is also a problem. We have seen configuration stored in app.config, environment variables, JSON configuration, DI configuration and more. Each time, we were expected to support the new method of getting the connection string. By not having any such mechanism, we are able to circumvent the problem entirely. </p><p>This sounds like a cop-out, but it isn’t. Consider <a href="https://groups.google.com/forum/#!topic/ravendb/5E8iYCnjKgQ">this thread in the RavenDB mailing list</a>. It talks about how to set up RavenDB 4.0 in Azure in a secure manner. Just reading the title of the thread made me cringe, thinking that this was going to be a question that would take a long time to answer (setup time, mostly). But that isn’t it at all. Instead, it is a walkthrough showing you how to set things up properly in an environment where you cannot load a certificate from a file and need to do so directly from the Azure certificate store. </p><p>This is quite important, since this is one of the things that I keep having to explain to team members. We want a very clear demarcation between the kinds of things we support and the kinds we don’t. Mostly because I’m not willing to do a half-assed job of supporting things. 
So saying something like: “Oh, we’ll just support a file path and let the user do the rest for more complex stuff” is not going to fly with this design philosophy.</p><p>If we do something, a user reasonably expects us to do a complete job of it, and the entire onus of responsibility falls on us. On the other hand, if you don’t do something, there is usually no expectation that you’ll handle it. There is also the issue that in many cases, solving the general problem is nearly impossible while solving a particular user scenario is trivial. So letting users have full responsibility works much better. At a minimum, they don’t need to circumvent the things we do for the stuff that we do support, but can start from clear ground.</p><p>Coming back to the certificate example, if we had a Certificate property and a CertificatePath property, each covering a common setup scenario, then it would be easy down the line to just assume that CertificatePath is set whenever we have a certificate, and suddenly a user who doesn’t load their certificate from a file would need to be aware of this and handle the issue. If there is no such property, the behavior is always going to be correct. </p>https://ayende.com/blog/180834/api-design-the-lack-of-a-method-was-intentional-forethought?Key=1026cb7b-fd76-4e6c-b4f8-4d2233ae3700https://ayende.com/blog/180834/api-design-the-lack-of-a-method-was-intentional-forethought?Key=1026cb7b-fd76-4e6c-b4f8-4d2233ae3700Mon, 04 Dec 2017 10:00:00 GMTThe bare minimum a distributed system developer should know about: Binding to IP addresses<p>It is easy to think about a service that listens to the network as just that: it listens to <em>the </em>network. 
In practice, this is often quite a bit more complex than that.</p><p>For example, what happens when I’m doing something like this?</p><blockquote><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The_13B8B/image_6.png"><img width="586" height="166" title="image" style="border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The_13B8B/image_thumb_2.png" border="0"></a></p></blockquote><p>In this case, we are setting up a web server bound to the local machine name. But that isn’t actually how it works.</p><p>At the TCP level, there is no such thing as a machine name. So how can this even work? </p><p>Here is what is going on. When we specify a server URL in this manner, we are actually doing something like this:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The_13B8B/image_2.png"><img width="797" height="209" title="image" style="border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The_13B8B/image_thumb.png" border="0"></a></p><p>And then the server is going to bind <em>to each and every one of them</em>. Here is an interesting tidbit:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The_13B8B/image_4.png"><img width="599" height="392" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The_13B8B/image_thumb_1.png" border="0"></a></p><p>What this means is that this service doesn’t have a single entry point; you can reach it through multiple distinct IP addresses.</p><p>But why would my machine have so many IP addresses? Well, let us take a look. 
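</p><p>You can see the same fan-out programmatically: binding to a host name really means resolving it and creating one listening socket per resulting address. A minimal Python sketch of what the server is doing under the hood (the real code above is C#/Kestrel):</p>

```python
import socket

def bind_all(host, port):
    # Resolve the host name to every address it maps to, and bind one
    # listening socket per unique address -- this is what "listen on
    # http://machine-name:8080" actually expands to.
    listeners, seen = [], set()
    for family, socktype, proto, _, addr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        if addr in seen:
            continue
        seen.add(addr)
        sock = socket.socket(family, socktype, proto)
        try:
            sock.bind(addr)
            sock.listen(1)
        except OSError:
            # Skip address families the host doesn't support (e.g. no IPv6).
            sock.close()
            continue
        listeners.append(sock)
    return listeners

listeners = bind_all("localhost", 0)  # port 0: let the OS pick one
print([sock.getsockname() for sock in listeners])
```

<p>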
It looks like this machine has quite a few network adapters:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The_13B8B/image_8.png"><img width="919" height="228" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The_13B8B/image_thumb_3.png" border="0"></a></p><p>I got a bunch of virtual ones for Docker and VMs, and then the Wi-Fi (writing on my laptop) and wired network.</p><p>Each one of these represents a way to bind to the network. In fact, there are also over 16 <em>million </em>additional IP addresses that I’m not showing: the entire 127.x.x.x range. (You probably know that 127.0.0.1 is loopback, right? But so is 127.127.127.127, and so on.)</p><p>All of this is not really that interesting, until you realize that this has real world implications for you. Consider a server that has multiple network cards, such as this one:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The_13B8B/image_12.png"><img width="1112" height="582" title="image" style="border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The_13B8B/image_thumb_5.png" border="0"></a></p><p>What we have here is a server that has two totally separate network cards. One to talk to the outside world and one to talk to the internal network.</p><p>When is this useful? With pretty much every cloud provider, you’ll have very different networks. On Amazon, the internal network gives you effectively free bandwidth, while you pay for the external one. And that is leaving aside the security implications.</p><p>It is also common to have different things bound to different interfaces. For example, your admin API endpoint isn’t even listening on the public internet; it will only process packets coming from the internal network. 
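</p><p>Binding is what decides which of those interfaces a socket is reachable on. A small Python sketch of the split just described, with loopback standing in for the internal interface:</p>

```python
import socket

# The "admin" endpoint binds to loopback only; the public endpoint
# binds to the wildcard address and is reachable on every interface.
admin = socket.socket()
admin.bind(("127.0.0.1", 0))   # internal traffic only
admin.listen(1)

public = socket.socket()
public.bind(("0.0.0.0", 0))    # all interfaces, including external
public.listen(1)

# A packet arriving on an external interface can never reach the
# loopback-bound socket; the kernel routes it by destination address.
print(admin.getsockname(), public.getsockname())
```

<p>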
That adds a bit more security and isolation (you still need encryption, authentication, etc., of course).</p><p>Another deployment mode (which has gone out of fashion) was to hook both network cards to the public internet, using different routes. This way, if one went down, you could still respond to requests, and usually you could also handle more traffic. This was in the days when the network was often the bottleneck; nowadays we have enough network bandwidth that program efficiency matters more, and this practice fell out of favor.</p>https://ayende.com/blog/180609/the-bare-minimum-a-distributed-system-developer-should-know-about-binding-to-ip-addresses?Key=783fed9c-fbd9-4977-be86-3f43873b385ahttps://ayende.com/blog/180609/the-bare-minimum-a-distributed-system-developer-should-know-about-binding-to-ip-addresses?Key=783fed9c-fbd9-4977-be86-3f43873b385aMon, 20 Nov 2017 10:00:00 GMTThe best features are the ones you never knew were there: Protocol fix-ups<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The-best-features-are-the-ones-you-never_AC9B/image_6.png"><img width="626" height="366" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The-best-features-are-the-ones-you-never_AC9B/image_thumb_2.png" border="0"></a>RavenDB uses HTTP for most of its communication. It can be used in unsecured mode, over HTTP, or in secured mode, over HTTPS. So far, this is pretty standard. Let us look at a couple of URLs:</p><ul><li>http://github.com</li><li>https://github.com</li></ul><p>If you try to go to GitHub using HTTP, it will redirect you to the HTTPS site. 
It is very easy to do, because the URLs above are actually:</p><ul><li>http://github.com:80</li><li>https://github.com:443</li></ul><p>In other words, by default when you are using HTTP, you’ll use port 80, while HTTPS will default to port 443. This means that the server on port 80 can just read the request and redirect you immediately to the HTTPS endpoint.</p><p>RavenDB, however, is usually used in environments where you will explicitly specify a port. So the URL would look something like this:</p><ul><li>http://a.orders.raven.local:8080</li><li>https://a.orders.raven.local:8080</li></ul><p>It is very common for our users to start running with port 8080 in an unsecured mode, then later move to a secure mode with HTTPS but retain the same port. That can lead to some complications. For example, here is what happens in a similar situation if I’m trying to connect to an HTTPS endpoint using HTTP or vice versa.</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The-best-features-are-the-ones-you-never_AC9B/image_2.png"><img width="1030" height="60" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The-best-features-are-the-ones-you-never_AC9B/image_thumb.png" border="0"></a></p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The-best-features-are-the-ones-you-never_AC9B/image_4.png"><img width="1016" height="46" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The-best-features-are-the-ones-you-never_AC9B/image_thumb_1.png" border="0"></a></p><p>This means that a common scenario (running on a non-native port and using the wrong protocol) will lead to a nasty error. We call this a nasty error because the user has no real way to figure out what the issue is from the error. 
In many cases, this will trigger an escalation to the network admin or a support ticket. This is the kind of issue that I hate: it is plainly obvious in hindsight, but it is so hard to figure out, and then you feel <em>stupid</em> for not realizing it upfront.</p><p>Let us see how we can resolve such an issue. I already gave some hints on how to do it <a href="https://ayende.com/blog/180513/the-bare-minimum-a-distributed-system-developer-should-know-about-https-negotiation?key=6329dd9044194fbcb7017d018491f05b">earlier</a>, but the technique in that post wasn’t suitable for production use in our codebase. In particular, we introduced another Stream wrapping instance and another allocation that would affect all input/output calls over the network. We would really want to avoid that.</p><p>So we cheat (but we do that a lot, so this is fine). Kestrel allows us to define connection adapters, which give us a hook very early in the process of how the TCP connection is managed. However, that leads to another problem. We want to sniff the first byte of the raw TCP request, but Stream doesn’t provide a way to Peek at a byte; any such attempt will consume it, which would result in the same additional indirection that we wanted to avoid. </p><p>Therefore, we decided to take advantage of the way Kestrel is handling things. It is buffering data in memory, and if you dig a bit you can access that in some very useful ways. Here is how we are able to sniff HTTP vs. HTTPS:</p><blockquote><script src="https://gist.github.com/ayende/5beb5dd25c0fe71bac668a21bbcedeb9.js"></script></blockquote><p>The key here is that we use a bit of reflection emit magic to get the inner <em>IPipeReader</em> instance from Kestrel. We have to do it this way because that value isn’t exposed externally. 
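</p><p>The byte-level test itself is independent of Kestrel. Here is a Python sketch of the classification that the code above performs on the first buffered byte:</p>

```python
def classify_first_byte(b: int) -> str:
    # A TLS record starts with a content-type byte of 22 (handshake),
    # and legacy SSLv2 hellos set the high bit, so the value is > 127.
    # An HTTP request line starts with an uppercase ASCII method letter
    # (G in GET, P in POST/PUT, ...), so the two ranges never overlap.
    if b == 22 or b > 127:
        return "tls"
    if ord("A") <= b <= ord("Z"):
        return "http"
    return "unknown"

print(classify_first_byte(b"GET / HTTP/1.1"[0]))  # http
print(classify_first_byte(0x16))                  # tls
```

<p>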
Once we do have the pipe reader instance, we borrow the already-read buffer and inspect it. If the first character is a capital letter (G from GET, P from PUT, etc.), this is an HTTP connection (an SSL connection’s first byte is either 22 or greater than 127, so there is no overlap). We then return the buffer to the stream and carry on. Kestrel will parse the request normally, but another portion of the pipeline will get the wrong-protocol message and return that to the user. And obviously we’ll skip doing the SSL negotiation. </p><p>This is important, because the client is speaking HTTP, and we can’t magically upgrade it to HTTPS without causing errors such as the one above. We need to speak the same protocol as the client expects. </p><p>With this code, trying to use the wrong protocol gives us this error:</p><p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The-best-features-are-the-ones-you-never_AC9B/image_8.png"><img width="782" height="145" title="image" style="margin: 0px; border: 0px currentcolor; border-image: none; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The-best-features-are-the-ones-you-never_AC9B/image_thumb_3.png" border="0"></a></p><p>Now, if you are not reading the error message that might still mean a support call, but it should be resolved as soon as someone actually reads the error message.</p>https://ayende.com/blog/180675/the-best-features-are-the-ones-you-never-knew-were-there-protocol-fix-ups?Key=94a59bbf-0784-4725-9d5c-88af569468b0https://ayende.com/blog/180675/the-best-features-are-the-ones-you-never-knew-were-there-protocol-fix-ups?Key=94a59bbf-0784-4725-9d5c-88af569468b0Thu, 16 Nov 2017 10:00:00 GMTThe bare minimum a distributed system developer should know about: HTTPS Negotiation<p>I mentioned in <a href="https://ayende.com/blog/180388/the-bare-minimum-a-distributed-system-developer-should-know-about-certificates?key=de30db814fea4a7fa8145c50c3d8cc75">a previous 
post</a> that an SSL connection will typically use a Server Name Indication in the initial (unencrypted) packet to let the server know which address it is interested in. This allows the server to do things such as select the appropriate certificate to answer this initial challenge. </p><p>A more interesting scenario is when you want to force your users to always use HTTPS. That is pretty trivial: you set up a website to listen on port 80 and port 443 and redirect all HTTP traffic from port 80 to port 443 as HTTPS. Pretty much any web server under the sun already has some sort of easy-to-use configuration for that. Let us see how this would look if we were writing it using bare-bones Kestrel.</p><blockquote><script src="https://gist.github.com/ayende/6f6bff9d6af7819a1d46144ab0ac89d0.js"></script></blockquote><p>This is pretty easy, right? We set up a connection adapter on port 80, so we can detect that this is using the wrong port and then just redirect it. Notice that there is some magic that we need to apply here. At the connection adapter level, we deal with the raw TCP socket, but we don’t want to mess around with that, so we just pass the decision up the chain until we get to the part that deals with HTTP and let it send the redirect. </p><p>Pretty easy, right? But what about when a user does something like this?</p><blockquote><p><strong>http</strong>://my-awesome-service:443</p></blockquote><p>Note that in this case, we are using the HTTP protocol and <em>not </em>the HTTPS protocol. At that point, things are a mess. A client will make a request and send a TCP packet containing HTTP request data, but the server is trying to parse that as an SSL client hello message. What will usually happen is that the server will look at the incoming packet, decide that this is garbage and just close the connection. 
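</p><p>You can see why the packet looks like garbage by reading the first bytes of an HTTP request the way a TLS parser would:</p>

```python
# The first byte of a TLS record is a content type (valid values are
# 20-24); the next two bytes are the protocol version, expected to be
# (3, x). The start of an HTTP request satisfies neither expectation,
# so the handshake is rejected outright.
request = b"GET / HTTP/1.1\r\n"
content_type = request[0]           # 'G' == 71, not a valid record type
version = (request[1], request[2])  # 'E', 'T' == (69, 84), not (3, x)
print(content_type, version)
```

<p>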
That leads to some really hard-to-figure-out errors and much forehead slapping when you figure out what the issue is.</p><p>Now, I’m sure that you’ll agree that anyone seeing a URL as listed above will be a <em>bit </em>suspicious. But what about these ones?</p><ul><li>http://my-awesome-service:8080 </li><li>https://my-awesome-service:8080</li></ul><p>Unlike before, where we would probably notice that :443 is the HTTPS port and we are using HTTP, here there is no additional indication of what the problem is. So we need to try both. And if a user is getting a connection dropped error when trying the connection, there is very little chance that they’ll consider switching to HTTPS. It is far more likely that they will start looking at the firewall rules. </p><p>So now we need to do protocol sniffing and figure out what to do from there. Let us see how this looks in code:</p><blockquote><script src="https://gist.github.com/ayende/71675e42802f852653186ecab5898093.js"></script></blockquote><p>We read the first few bytes of the request and see if this is the start of an SSL TCP connection. If it is, we forward the call to the usual Kestrel HTTPS behavior. If it isn’t, we mark the request as requiring a redirect and pass it, as is, to the request parser; once it is parsed and ready for action, we send the redirect back.</p><p>In this way, any request on port 80 will be sent to port 443, and an HTTP request on a port that listens for HTTPS will be told that it needs to switch. </p><p>One note about the code in this post. This was written at 1:30 AM as a proof of concept only. I’m pretty sure that I’m heavily abusing the connection adapter system, especially with regards to the reflection bits there. 
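</p><p>The redirect itself is ordinary HTTP. A sketch of the response the port-80 side would send back (the host name is made up, matching the examples above):</p>

```python
def https_redirect(host: str, path: str) -> bytes:
    # Same host and path, with the scheme switched to https; the
    # client then retries on its own against the secure endpoint.
    return (
        "HTTP/1.1 301 Moved Permanently\r\n"
        f"Location: https://{host}{path}\r\n"
        "Content-Length: 0\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")

print(https_redirect("my-awesome-service", "/"))
```

<p>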
</p>https://ayende.com/blog/180513/the-bare-minimum-a-distributed-system-developer-should-know-about-https-negotiation?Key=6329dd90-4419-4fbc-b701-7d018491f05bhttps://ayende.com/blog/180513/the-bare-minimum-a-distributed-system-developer-should-know-about-https-negotiation?Key=6329dd90-4419-4fbc-b701-7d018491f05bWed, 15 Nov 2017 10:00:00 GMTThe best features are the ones you never knew were there: Company culture and incentive structure<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/The-best-features-are-the-ones-you-never_AFFC/image_2.png"><img width="407" height="407" title="image" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="image" src="https://ayende.com/blog/Images/Open-Live-Writer/The-best-features-are-the-ones-you-never_AFFC/image_thumb.png" border="0"></a>I introduced the notion of <a href="https://ayende.com/blog/180673/the-best-features-are-the-ones-you-never-knew-were-there-comfortable-shoes-friction-removal?key=abe1ad9cb3aa48168f7a493462468c18">frictionless software</a> in the previous post, but I wanted to dedicate some time to the deeper meaning behind this kind of thinking. RavenDB is an open source product. There are a lot of business models around OSS projects, and the most common ones include charging for support and services.</p><p>Hibernating Rhinos was founded because I wanted to write code. And the way we structured the company is primarily to write software and the tooling around it. We provide support and consulting services, certainly, but we aren’t looking at them as the money makers. From my perspective, we want to sell people RavenDB licenses, not to have them pay us to help them do things with RavenDB.</p><p>That means that from the company perspective, support is a <em>cost</em> center, not a revenue center. In other words, the more support calls I have, the sadder I become. </p><p>This meshes well with my professional pride. 
I want to create stuff that is useful, awesome, and friction-free. I want our users to take what we do and blast off, not to have them double-check that their support contracts are up to date and that the support lines are open. I did a lot of study around this early on, and similar to Conway’s law, the structure of the company and its culture have a deep impact on the software it produces.</p><p>With support seen as a cost center, this has a ripple effect on the structure of the software. It means that error messages are clearer, because if you give the user a good error message, maybe with some indication of how to fix the issue, they can resolve things on their own, without having to call support. It means that configuration and tuning should be minimal and mostly self-service, instead of having users open a support ticket with “what should my configuration settings be for this or that scenario?”</p><p>It also means that we want to reduce as much as possible anything that might trip users up as they set up and use our software. You can see that with the RavenDB Studio, where we spend a tremendous amount of time and effort to make information accessible and actionable for the user. Be it the overall dashboard, the deep insight into the internals, the various graphs and metrics we expose, etc. The whole idea is to make sure that the users and admins have all the information and tooling they need in order to make things work without having to call support. </p><p>Now, to be clear, we have a support hotline with 24/7 availability, because at our scale and with the kind of software that we provide, you <em>need</em> that. But we are able to reduce the support load by an order of magnitude with such techniques. And it means that by and large, our support, when you need it, is going to be excellent (because we don’t need to deal with a lot of low level support issues). 
That means that we don’t need a many-tiered support system, and it takes very little time to actually get to an engineer who has deep familiarity with the system and how to troubleshoot it.</p><p>There are a bunch of reasons why we went this route, treating support as a necessary overhead that needs to be reduced as much as possible. Building new features is much more interesting than fielding support calls, so we do our best to develop things so that we won’t have to spend much time on support. But mostly, it is about creating a product that is well rounded and <em>complete</em>. It’s about taking pride in not only having all the bells and whistles but also taking care to ensure that things work and that the level of friction you’ll run into using our products is as low as possible. </p>https://ayende.com/blog/180674/the-best-features-are-the-ones-you-never-knew-were-there-company-culture-and-incentive-structure?Key=9dd77f8b-7d1e-4d24-accc-e7fc35c4f004https://ayende.com/blog/180674/the-best-features-are-the-ones-you-never-knew-were-there-company-culture-and-incentive-structure?Key=9dd77f8b-7d1e-4d24-accc-e7fc35c4f004Tue, 14 Nov 2017 10:00:00 GMT