1. He started by stating that the efficiency of certain container and algorithm implementations matters a lot at Facebook.

These operations are performed very often, on "big data", and need to be efficient and scalable:

- Dictionary lookup: hash maps (unordered_map) vs maps.

- Depending on the scenario, one may want to pick one over another.

- Hash maps are generally faster, but hash key computation can be expensive.

- In the case of maps, it may be beneficial to scramble the keys, to avoid wasting time comparing many strings with similar prefixes.

- For small data sets ("max 7 to 17" elements) just plain arrays with linear search may be better.

- Set operations: union and intersection - using search engine terms as an example.

- Set intersection on sorted arrays can be improved via binary search.

- This, in turn, can be improved via so-called "gallop search" (going from the start and probing in exponential increments: 2, 4, 8, 16...).
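The gallop-search idea for intersecting sorted arrays can be sketched roughly as follows. This is only an illustration under my own assumptions (both inputs sorted and free of duplicates, the first one much smaller than the second); the names `gallop_lower_bound` and `intersect_sorted` are made up for this example and are not from the talk.

```cpp
#include <algorithm>
#include <vector>

// Find the first position in [first, last) whose value is not less than `key`.
// Probes at exponentially growing distances from `first`, then finishes with
// a binary search inside the bracketed range.
template <class It, class T>
It gallop_lower_bound(It first, It last, const T& key) {
    auto n = last - first;
    decltype(n) lo = 0, step = 1;
    // Invariant: every element before index `lo` is less than `key`.
    while (lo + step <= n && first[lo + step - 1] < key) {
        lo += step;
        step *= 2;
    }
    auto hi = std::min(lo + step, n);
    return std::lower_bound(first + lo, first + hi, key);
}

// Intersect two sorted vectors; `small` is expected to be the shorter one.
std::vector<int> intersect_sorted(const std::vector<int>& small,
                                  const std::vector<int>& large) {
    std::vector<int> out;
    auto pos = large.begin();
    for (int x : small) {
        // Resume from the previous position: everything before it is < x.
        pos = gallop_lower_bound(pos, large.end(), x);
        if (pos == large.end()) break;
        if (*pos == x) out.push_back(x);
    }
    return out;
}
```

Galloping pays off when the matches are close to the current position: the cost of each lookup depends on the distance to the match rather than on the size of the whole array.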

2. There was an interlude about the benefits of API design based on generic type parameters (templates) vs. design based on interfaces (abstract base classes/inheritance).

The main argument for the former is performance and more compile-time checking.

The latter is generally more flexible (as behaviours can be modified at runtime).

The rest of the seminar was about the details of template-based APIs/implementations, mostly from the perspective of a library designer (which apparently has been Alexandrescu's primary occupation all these years).
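To make the template vs. interface trade-off above more concrete, here is a small sketch of the same function written both ways. The types (`Shape`, `Square`, `Rect`) and function names are invented for illustration and were not shown at the seminar.

```cpp
// Interface-based: behaviour is selected at run time through a virtual call.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square : Shape {
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
    double side;
};

double total_area_dynamic(const Shape& a, const Shape& b) {
    return a.area() + b.area();  // two virtual calls
}

// Template-based: any type with an area() member works, related or not.
// The calls are resolved at compile time and can be inlined; a type without
// area() is rejected by the compiler.
struct Rect {
    double w, h;
    double area() const { return w * h; }
};

template <class A, class B>
double total_area_static(const A& a, const B& b) {
    return a.area() + b.area();
}
```

`total_area_static(Rect{1, 2}, Square(2))` compiles even though `Rect` does not derive from `Shape`, while `total_area_dynamic` can accept objects whose concrete type is only known at run time.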

3. He talked rather a lot about the new C++11 feature, the so-called variadic templates.

- It's a generally obscure feature, which found its way into the standard almost by accident, and many people are still not sure why they would need it at all.

Unfortunately, there was no clear explanation of why one would need typelists or tuples in the first place.

There was one typelist success story from the audience, which went along the lines of: "There are clients, and there are numbers, and there are money, and all of them together - we'd need to write a lot of code, but we used typelists, and then BAAM - magic happens!".

The take-away is that it's probably a good thing to know about (just in case), but better avoid doing it at home.

Or do it at home, but don't do it at work. Or whatever.
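For reference, here is about the smallest useful variadic template I can think of: a function that accepts any number of arguments of any streamable types. It is my own example, not one from the seminar, and the names `append_all`, `concat` and `count_types` are made up.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Base case: nothing left to append.
inline void append_all(std::ostringstream&) {}

// Recursive case: peel off the first argument, recurse on the rest of the pack.
template <class Head, class... Tail>
void append_all(std::ostringstream& os, const Head& head, const Tail&... tail) {
    os << head;
    append_all(os, tail...);
}

// Concatenates the textual form of all arguments, whatever their types.
template <class... Args>
std::string concat(const Args&... args) {
    std::ostringstream os;
    append_all(os, args...);
    return os.str();
}

// sizeof... yields the number of elements in a parameter pack.
template <class... Ts>
constexpr std::size_t count_types() {
    return sizeof...(Ts);
}
```

A call such as `concat("id=", 42, ' ', 1.5)` is type-checked at compile time, unlike a printf-style function with a format string.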

4. Policy-based design:

- This is one of the things Alexandrescu is famous for, and there was a bit of talk about it.

- In a sense, it has been a cornerstone of the standard C++ library design (with things like predicates and allocators as type parameters).

- Very powerful idiom, albeit sometimes hard to understand.

- "Pull the implementation down, but the policy upwards, to the client" - allow the client to decide separate aspects of the class behaviour, but provide good defaults in case they do not care.

- So, classes can be parametrized by different types, each of them representing a separate behaviour aspect (policy): e.g. a memory allocation policy, a locking policy, a comparison policy etc.

- His Loki C++ library was very policy-heavy, but the C++ standard designers decided that having too many of those is too complicated.

- The variadic template hacks could be used to simplify the syntax for the user (e.g. the user could specify policy classes in any order and skip the defaults). Not worth it anyway.

- Generally, all this stuff sends compile times through the roof, but this consideration was somehow swept under the rug in the course of the discussion.
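A minimal sketch of what a locking policy with a default might look like. This is my own illustration, not code from Loki or from the seminar; `NoLock` and `Stack` are made-up names, and the only assumption about a policy type is that it provides `lock()` and `unlock()`.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Default policy: no locking at all, for single-threaded clients.
struct NoLock {
    void lock() {}
    void unlock() {}
};

// The client picks the locking behaviour through the second type parameter.
template <class T, class LockPolicy = NoLock>
class Stack {
public:
    void push(const T& v) {
        std::lock_guard<LockPolicy> guard(lock_);
        items_.push_back(v);
    }
    // Returns false if the stack is empty, otherwise pops into `out`.
    bool pop(T& out) {
        std::lock_guard<LockPolicy> guard(lock_);
        if (items_.empty()) return false;
        out = items_.back();
        items_.pop_back();
        return true;
    }
    std::size_t size() const {
        std::lock_guard<LockPolicy> guard(lock_);
        return items_.size();
    }

private:
    mutable LockPolicy lock_;
    std::vector<T> items_;
};
```

`Stack<int>` pays nothing for locking, while `Stack<int, std::mutex>` is safe to share between threads; `std::mutex` fits the policy as it is because it already has `lock()` and `unlock()`.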

5. Many Standard C++ library implementations still suck from the point of view of performance.

- Vector resizing: not specified exactly by the standard, and almost all compilers have been doing it wrong.

- Growing the capacity by a factor of two has been prevalent (except for the recent Microsoft C++ library implementation). A factor near the golden ratio (about 1.6) was said to be optimal: after reallocation, the freed memory chunks have a higher chance of being reused, which reduces the probability of heap fragmentation.

- Substring search: there are better algorithms than the default one being used, e.g. Rabin-Karp (a smart "floating character code sum" idea, i.e. a rolling hash).
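The Rabin-Karp idea mentioned above can be sketched like this. It is a textbook-style version written for these notes, not the code shown in the talk; the base 257 and the function name `rabin_karp_find` are arbitrary choices of mine, and the hash simply wraps around modulo 2^64.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns the index of the first occurrence of `needle` in `haystack`,
// or std::string::npos if there is none.
std::size_t rabin_karp_find(const std::string& haystack, const std::string& needle) {
    const std::size_t n = haystack.size(), m = needle.size();
    if (m == 0) return 0;
    if (m > n) return std::string::npos;

    const std::uint64_t base = 257;
    std::uint64_t high = 1;  // base^(m-1), weight of the leftmost character
    for (std::size_t i = 1; i < m; ++i) high *= base;

    // Hash of the needle and of the first window of the haystack.
    std::uint64_t hn = 0, hw = 0;
    for (std::size_t i = 0; i < m; ++i) {
        hn = hn * base + static_cast<unsigned char>(needle[i]);
        hw = hw * base + static_cast<unsigned char>(haystack[i]);
    }

    for (std::size_t i = 0;; ++i) {
        // Compare characters only when the hashes agree (collisions are possible).
        if (hn == hw && haystack.compare(i, m, needle) == 0) return i;
        if (i + m >= n) return std::string::npos;
        // Slide the window in O(1): drop haystack[i], append haystack[i + m].
        hw = (hw - static_cast<unsigned char>(haystack[i]) * high) * base +
             static_cast<unsigned char>(haystack[i + m]);
    }
}
```

The point is that moving the window by one character updates the hash in constant time, so most positions are rejected without comparing any characters.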

6. Memory allocation.

- Things like freelists and other custom allocation strategies.

- The underlying memory allocator performance still matters a lot.

- Most memory allocator implementations are reluctant to release memory back to the system, which is a problem on a machine running many services.

- Facebook uses jemalloc, which fixes that problem. They claim they could not find anything better so far. Worth trying.
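For those who have not seen a freelist before, a bare-bones version for fixed-size blocks might look like the sketch below. It is my own minimal illustration (the class name `FreeList` is made up, and there is no thread safety or alignment tuning), not something taken from the slides.

```cpp
#include <cstddef>
#include <new>

// Keeps released blocks of one fixed size in a singly linked list and hands
// them out again instead of going back to the underlying allocator.
class FreeList {
public:
    explicit FreeList(std::size_t block_size)
        : block_size_(block_size < sizeof(Node) ? sizeof(Node) : block_size) {}
    FreeList(const FreeList&) = delete;
    FreeList& operator=(const FreeList&) = delete;

    ~FreeList() {
        // Only here do the cached blocks go back to the allocator.
        while (head_) {
            Node* next = head_->next;
            ::operator delete(head_);
            head_ = next;
        }
    }

    void* allocate() {
        if (head_) {  // reuse a cached block
            Node* n = head_;
            head_ = n->next;
            return n;
        }
        return ::operator new(block_size_);
    }

    void deallocate(void* p) {
        // The released block itself stores the link to the next free block.
        head_ = new (p) Node{head_};
    }

private:
    struct Node { Node* next; };
    std::size_t block_size_;
    Node* head_ = nullptr;
};
```

Note that this sketch has exactly the habit complained about above: memory handed to `deallocate` stays cached until the `FreeList` is destroyed.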

7. Efficient/scalable STL usage.

- Matters even more in a multithreaded environment.

- Mind the memory allocation: locks should be held for as short a period of time as possible, and calling into the memory allocator while holding a lock should be avoided at all costs.

- The double-checked locking trick can be used.

- Another trick: edit a copy of the container on the side, lock the original container, swap, unlock.

- C++11 has nice threading primitives that can be used.
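The "edit a copy, lock, swap, unlock" trick might look roughly like this with the C++11 primitives. This is a sketch under one important assumption of mine: only a single writer thread ever calls `update()`, so the writer may read the container without the lock. The class name `Registry` and its members are hypothetical.

```cpp
#include <algorithm>
#include <mutex>
#include <vector>

class Registry {
public:
    // Readers take the lock only for the duration of a lookup.
    bool contains(int x) const {
        std::lock_guard<std::mutex> guard(mutex_);
        return std::find(data_.begin(), data_.end(), x) != data_.end();
    }

    // ASSUMPTION: called from one writer thread only.
    template <class Edit>
    void update(Edit edit) {
        std::vector<int> copy(data_);  // allocation and copying, no lock held
        edit(copy);                    // possibly slow edits, no lock held
        {
            std::lock_guard<std::mutex> guard(mutex_);
            data_.swap(copy);          // O(1) pointer swap, no allocation
        }
        // `copy` now owns the old contents and frees them here, after unlock.
    }

private:
    mutable std::mutex mutex_;
    std::vector<int> data_;
};
```

Both the allocation of the new contents and the deallocation of the old ones happen outside the critical section, which is the whole point of the trick. With several writers this simple version would lose updates.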

8. C++ design by committee is rather unfortunate.

- For example, they removed COW (copy on write) from strings, trying to "fix" the potential threading problems.

- Alexandrescu seems to be missing his cow a lot. There are sound ways to fix these problems without killing the cow.

- Facebook does that in its Folly library (which is open source and available on GitHub).

- There are even more nice (and/or obscure) features coming up with C++14, so one can use/abuse them to their liking.

Day 2.

------

There were two main topics this day: error handling and low-level C++ optimization.

- One can try some tricks to get the best of both worlds: error codes and exceptions.

- Expected<T> is built as a container holding either an actual value of T or a std::exception_ptr (a C++11 feature, kind of a shared pointer to an exception).

- Example usage: "Expected<int> parseInt(const string&);" (instead of just returning an int, with a "special" value in case of error)

- Other languages have similar stuff: Haskell "Maybe T", Scala "Option[T]", C# "Nullable<T>" etc. But we want to embed the error information as well (which those other language types do not provide).

- Kind of like promise<T>/future<T> in C++11 (and other languages), but we don't need async with all the overhead.

- Performance benefits - creates the exception and encapsulates it, but does not actually throw anything, no try/catch.

- Btw, in C++11 there is this new notion of move constructors for rvalues (with the "&&" syntax); it allows for better efficiency in move scenarios.

- Exception values - watch for slicing. Exceptions are not your regular objects; they need to be passed by reference etc.

- ...followed by a bunch of gritty implementation details of this using C++11.

- According to Alexandrescu, "this is a very new thing, just starting to be used at Facebook".
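To give an idea of the shape of such a type, here is a heavily simplified sketch reconstructed from the notes above. It is not Alexandrescu's implementation: the member names (`fromException`, `valid`, `get`) are my guesses at a plausible interface, assignment is left out, and `parseInt` ignores signs and overflow.

```cpp
#include <exception>
#include <new>
#include <stdexcept>
#include <string>
#include <utility>

template <class T>
class Expected {
    using EP = std::exception_ptr;

public:
    Expected(T value) : has_value_(true) { new (&value_) T(std::move(value)); }
    Expected(const Expected& o) : has_value_(o.has_value_) {
        if (has_value_) new (&value_) T(o.value_);
        else new (&error_) EP(o.error_);
    }
    Expected& operator=(const Expected&) = delete;  // omitted for brevity
    ~Expected() {
        if (has_value_) value_.~T();
        else error_.~EP();
    }

    // Stores the exception without throwing it.
    static Expected fromException(EP e) {
        Expected r;  // private constructor leaves the union empty
        new (&r.error_) EP(std::move(e));
        return r;
    }
    // Pass the most derived type here: make_exception_ptr copies by static
    // type, so passing a base class reference would slice.
    template <class E>
    static Expected fromException(const E& e) {
        return fromException(std::make_exception_ptr(e));
    }

    bool valid() const { return has_value_; }
    // Returns the value, or throws the stored exception.
    T& get() {
        if (!has_value_) std::rethrow_exception(error_);
        return value_;
    }

private:
    Expected() : has_value_(false) {}
    union {
        T value_;
        EP error_;
    };
    bool has_value_;
};

// Simplified parser: digits only, no sign or overflow handling.
Expected<int> parseInt(const std::string& s) {
    if (s.empty())
        return Expected<int>::fromException(std::invalid_argument("empty input"));
    int result = 0;
    for (char c : s) {
        if (c < '0' || c > '9')
            return Expected<int>::fromException(
                std::invalid_argument("not a number: " + s));
        result = result * 10 + (c - '0');
    }
    return result;
}
```

The caller can test `valid()` like an error code and never pay for a throw, or just call `get()` and let the stored exception propagate.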

4. ScopeGuard.

- There was his article in Dr. Dobb's, written in 2000.

- Encapsulates the action/cleanup/rollback-on-action-fail pattern.

- Allows for better composition (e.g. when nesting these kinds of actions).

- Improvements in C++11: use lambdas and get rid of nesting. Move constructor for ScopeGuard (and explicitly prohibit, via "= delete", all other constructors).

- Can do some macro tricks to wrap the scope guard, to use auto-generated anonymous variable names etc.

- All of this is available in the Facebook Folly library on GitHub.
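A stripped-down C++11 version of the idea, as I understood it, is sketched below. This is a standalone illustration and not Folly's actual code; `addToBoth` is a made-up example of an operation with rollback.

```cpp
#include <utility>
#include <vector>

template <class F>
class ScopeGuard {
public:
    explicit ScopeGuard(F f) : f_(std::move(f)), active_(true) {}
    // Moving transfers responsibility for the cleanup to the new guard.
    ScopeGuard(ScopeGuard&& other)
        : f_(std::move(other.f_)), active_(other.active_) {
        other.dismiss();
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

    // Runs the cleanup unless dismissed; the cleanup must not throw.
    ~ScopeGuard() {
        if (active_) f_();
    }
    void dismiss() { active_ = false; }

private:
    F f_;
    bool active_;
};

// Deduces the lambda type, so the caller can just write `auto g = makeGuard(...)`.
template <class F>
ScopeGuard<F> makeGuard(F f) {
    return ScopeGuard<F>(std::move(f));
}

// Hypothetical two-step operation: the first step is undone if the second fails.
bool addToBoth(std::vector<int>& a, std::vector<int>& b, int x, bool failSecond) {
    a.push_back(x);
    auto rollback = makeGuard([&] { a.pop_back(); });
    if (failSecond) return false;  // guard fires here and restores `a`
    b.push_back(x);
    rollback.dismiss();            // success: keep both changes
    return true;
}
```

The rollback code sits right next to the action it undoes, and adding a third step means adding another guard rather than another level of nested try/catch.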

5. Code optimization.

- Mind Amdahl's law (the overall speedup is limited by the share of the running time spent in the part being optimized); optimize the big spenders first.

- Remember about the operations that are "negligible" but are spread all over the place.
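Amdahl's law is easy to put into numbers. The helper below is just the standard formula, with `p` being the fraction of the running time affected by the optimization and `s` the speedup of that part; the function name is mine.

```cpp
// Overall speedup when a fraction `p` (0..1) of the running time
// is made `s` times faster and the rest is left unchanged.
double amdahl_speedup(double p, double s) {
    return 1.0 / ((1.0 - p) + p / s);
}
```

For example, making a part that takes 5% of the time almost infinitely fast (`amdahl_speedup(0.05, 1e9)`) gives barely more than a 1.05x overall gain, while doubling the speed of a part that takes 50% of the time gives about 1.33x.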