Still trying to get it all out

May 10, 2016

Somewhere around the beginning of this year, Google suffered a traumatic brain injury and forgot how to stop spam on Blogger. Probably forgot the product existed entirely, knowing Google.
So I've had to disable comments completely. Feel free to reach out to me on Twitter @SaraMG.

May 2, 2016

Last week, my wife found a new glucose meter via reddit or something, and it was only $40 so it sounded like it was worth giving a shot. It just arrived today, so I figured I'd share my initial thoughts below:

Ordering

My first annoyance was with buying the product. While the intro page tells you all about the meter and how much it costs, the pricing on strips was suspiciously absent. After a few minutes of clicking around the site, I finally tried hitting the [BUY] button, and was treated to actual numbers. In fairness, that could have been my first guess, but I'm easily annoyed.

After placing my order, I was given a stark white page saying "Your order has been placed." No order number, no tracking info, no estimated delivery date. Just a brief affirmation. Okay, whatevs, it'll come when it comes.

Unboxing

And it did come after only a few days, so point for promptness. The meter is just what it says on the tin, a compact housing for test strips, lancet device, and the small square dongle used for actually reading a strip and pumping it into one's phone.

One small quirk did pop up fairly early. The instructions indicate that the test strip cartridge should "snap" into the casing. It doesn't. In fact, when removing the outer cap, it pulls the cartridge out of the casing, then I have to pry the cap off the cartridge, so I can open the cartidge, and fish out the strip. For a device which hinges on convenience and ease of use, this isn't convenient or easy to use. Hopefully this is just an early model engineering defect and not indicative of the poor quality of the device overall.

Using the app

Unsurprisingly, the app wants you to register/login to their site. That's not unexpected. What surprised me here is that the account I created last week (while buying the device) wasn't usable for the actual app. Maybe that's a good thing, if the store is run by a third-party, principle of least privileges and all. But it's certainly inconvenient from a user standpoint.

But finally, I'm ready to take a reading...

Fuuuuuuuudge...

Now, don't judge me on the hipster wood case, I'm very clumsy and it helps me not break phones. I used to have plastic/vinyl wallet cases, but they fall apart in a matter of weeks, and... look, my point is, just a SLIGHTLY longer neck would have made this a lot more usable. Square knows this problem exists, which is why their 2nd gen readers don't match their 1st gen readers.

But it works well enough, the data matches my other meter within tolerance... Wait, no... one more gripe: The app doesn't talk to iOS's health data stream. My current go-to logbook app, MySugr (highly recommend, btw) is able to share data with my phone's OS, which means my apps can work better together.

Conclusion

Not a terrible little device, I'll probably keep it in my hoodie-of-many-pockets for those random occasions where I used to use OneTouch minis, but it's not quite ready for me yet.

Jan 8, 2015

Hopefully you've had some time to digest the first three parts of my HHVM Extension Writing series, and you're ready to embark into the world of Resources. Not quite as exciting as objects, they're certainly less commonly used in new extensions since Objects are so much more versatile, but there were an integral part of PHP's history prior to PHP5, and are still in heavy use by such features as streams, curl, and most database connectors.

Resource

Defining a new resource type shares some similarity with Objects, except that you won't be defining anything in systemlib initially, because a resource doesn't have an API in and of itself. It's just an opaque pointer used by seemingly unrelated functions and methods. We'll define some global functions in the second half of this post when we start accessing the resource. Any wonder Objects are winning out?

As with the Object example, we'll be implementing a bit of filesystem access to demonstrate how one might use resources.

The first three lines of our class are basically boilerplate. DECLARE_RESOURCE_ALLOCATION_NO_SWEEP handles some specifics about the MemoryManager, because unlike objects which are allocated from userspace, resources are allocated from C++, and a bare c++ new won't do a "smart" allocation unless told to do so by this macro. The CLASSNAME_IS macro, and the o_getClassNameHook() virtual below it, define the "resource name" as seen from PHP when you var_dump() or call get_resource_type.

As with the Object version of this example, the class destructor is automatically called when the resource variable falls out of scope, meanwhile sweep() is invoked if the variable is still "live" at the end of a request. Since in this case we want the same behavior to occur, we simple chain one through the other, and on to their actual purpose; Closing the file.

isInvalid() exists for Resource types because it's very common to invoke functions like fclose() who's purpose is to make the resource no longer usable as its normal type. HHVM's mechanism for dealing with this is very different from PHP's, but the end result is the same. So long as your class has some way of knowing that it's "dead", HHVM can detect it, and report accordingly in var_dump() and other calls.

So now that we've defined a Resource type, let's make a function to create instances and do things with them:

As you can see, allocating a resource is literally as simple as newing up a C++ class with the newres<T>() template (or if you're using an older version of HHVM, the NEWOBJ() macro) and passing it as an argument to Resource's constructor.

For getting at that C++ instance, I've presented two different, equally valid methods. The first is certainly more concise, but you may not want to throw exceptions (you've probably eschewed making an Object for a reason, after all). On the other hand, while the second form allows you to handle your error cases more explicitly, this particularly common case forced us to change our return type from int to the far less precise mixed. Take this into account when designing your APIs. Many extensions follow the latter pattern, using a simple macro to avoid excessive copypasta from one function to the next. I've added an example of that to the repo.

What's with the nullOkay arg? TL;DR version: It's a bit of legacy logic and doesn't really have a place in modern HHVM extensions. You can usually set it to the same value as you pass for badTypeOkay. Note that both of these args are false by default.

Update: In the initial version of this post, I used NEWOBJ() exclusively to allocate a new resource instance, but as @maide pointed out in IRC, that's been removed from the newest versions of HHVM and replaced with newres<T>(). We use an #ifdef to figure out which version we're dealing with because the HHVM team is lousy at updating the API version constant. ;p

What's next...

We've covered all the userspace data types, declaration of functions, classes, and resources. Next up, in Part V, we'll back up for a few moments and look at the build system so that we can start linking in external libraries. If you're already familiar with CMake, then you can probably skip that chapter. Part VI is TBD at the moment, but I'll probably pick up a lot of the parts I glossed over in Parts I-IV, we'll see what comes out...

Jan 7, 2015

In Part I of this series, we looked at the basic building blocks of a simple HHVM extension skeleton. Part II continued with three of the five "smart" datatypes used throughout the API. But now we get to have some fun. It's time to start looking into declaring classes with methods and properties and constants (oh my!).

As you should expect by now, compiling your extension with this bit of code should mean that the Example1_Greeter class is now available to all requests and may be invoked like any other class definition. Let's apply what we already know from making native functions, and see how it works with methods...

While you're probably tempted to rush off and add an HHVM_FE() and HHVM_FUNCTION() implementation to the C++ file, you'd only be half right. These aren't functions, they're methods, and as such have a different set of macros.

Similarly, properties and constants may be declared directly on the Hack definition of the class in your systemlib file. I won't bother showing it here, since you all know how to write PHP code, but I'll put an example or two in the git repo.

What's marginally more interesting, and something you can probably guess at from the coverage in Part I, is that you can declare class constants from C++, meaning that they can take on values defined in external headers or computed values. Let's add one from moduleInit().

Binding internal data

What we have so far is all well and good for simple classes, but most extension classes will need to store some opaque pointer from an external library somewhere on the class that can be easily referenced later on. For that, things start to get a little bit more complicated. To help clarify what we're doing, I'm going to wipe the slate clean by moving example1 to its own subdirectory, and starting fresh with example2.

Yeah, kinda messy changing my whole directory structure around midway, but would you rather I ran `git push --force`? Yeah, I thought not.

The code skeleton we're starting out with is at commit: 20dc4ef824 in example2/.

To illustrate something slightly less contrived than earlier examples, I'll be wrapping the POSIX FILE* object in a simple PHP class. Let's start with something basic in our systemlib, containing just a constructor:

A new user attribute has appeared! <<__NativeData("Example2_File")>> tells the runtime that this is no ordinary object. This object should be over-allocated with enough to space to handle some internal C++ object, identified by the quoted name. In practice this is usually the name of the class it goes with, but it doesn't have to be. How does this hook up to internals? That comes next, within ext_example2.cpp by adding an #include "hphp/runtime/vm/native-data.h" and the following code:

We're only opening (and ultimately closing) our file at this point, but these are really important stages in an object's lifecycle, so this is worth going though slowly. When a PHP script calls $o = new Example2_File(__FILE__, "r");, the first thing the engine does is allocate space for the object. This is done by adding sizeof(ObjectData) (the standard, base object size) to the size given by any NativeDataInfo associated with it. We made that association by calling Native::registerNativeDataInfo(StringData* id);, where T is the C++ class type to allocate with the ObjectData, and id is the symbolic name we gave it in the syetemlib file using <<__NativeData("id")>>.

Next, the engine invokes the constructor, providing a pointer to the object via a hidden ObjectData* this_ property in the C++ method's signature. From here, we can get access to our private data structure by using the Native::data() accessor to jump to the correct offset from this_. At this point, we have access to a normal C++ object which just so happens to be bound to a PHP object, and we have constructor parameters as well!

From here, there are two reasons a PHP object might die. In the expected case, it runs out of references and is destructed during the course of request's runtime. In this case, our auxiliary object has its destructor called as well, so that external pointers can be cleaned up nicely. The other time a PHP object can die is when the request is shutting down. This is somewhat more exceptional since the memory manager is sweeping ALL request-local data, not necessarily in the most ideal order. It's up to your auxiliary class to deal with non-sweepable resources, but trust that the runtime will deal with resources which are. This is resolved by having a secondary psuedo-destructor called sweep();. For simple implementations like ours, we want the regular destructor and sweep to do the same time, since the only members of this C++ class are external pointers. If we have sweepable resources such as an HPHP::String however, we'd want to avoid the implicit member destruction which comes with calling ~Example2_File(). It's entirely possible that the internal state of that String is no longer valid because it was sweeped first. Hence the need for a separate sweep() function.

TL;DR? - Just have your destructor call sweep(), and deal with external pointers in sweep(). That's good 90% of the time.

You might also have noticed that I'm throwing an exception in the assignment operator. This is normally used for handling a clone, where you'd probably duplicate the FILE* handle, but I realized midway that POSIX file streams don't really have that notion, so I took the easy way out and threw a standard exception. In practice, the implementation would probably look something like:

Jan 6, 2015

In our last installment I walked through setting up a dev environment and creating a simple HHVM extension which exposed some constants and global scope functions. Today, we'll expand on that by delving deeper into three of the five "Smart" types: String, Array, and Variant. The other two "Smart" types will be covered in Parts III(Objects) and IV(Resources) since they require a bit more explaining.

The String class

HPHP::String resembles C++'s std::string class in many ways, but also builds in several assumptions about how PHP strings should behave, is able to be encapsulated in a Variant (mixed) object, and performs common string related tasks, such as numeric conversion.

This post is going to highlight the most common features of the String class, but you should look through the header file yourself for a more in-depth exploration.

The constructors, as you can see, are generally built around making new runtime string values from an existing string or numeric value, and are again straight-forward to use. String::FromCStr() is a somewhat special case in that it creates a StaticString, rather than a String. While a String is cleaned up at the end of the request it was created in, StaticStrings live forever, and can even be shared between multiple requests. Because overuse of StaticString could easily lead to memory bloat, they're typically only used for defining persistent features (such as constant names/values) as seen in Part I.

The most interesting part of this API is the ReserveStringMode and MutableSlice. Ordinarily, you shouldn't save the pointer you get from String::c_str() as it can potentially change between calls, and you generally shouldn't go modifying a String unless you know you own it anyway. If you do have need to modify a string, call bufferSlice() on it. The MutableSlice structure you get back will contain a pointer to a (relatively) stable block of memory which can be populated. Here's an example:

This contrived example allocates enough space for 10 single-digit numbers, and a comma and space following them. It uses snprintf() to fill that buffer up, then it truncates it as 28 characters, since the final ', ' wasn't actually necessary. You'll find this pattern in use anywhere an API expects you to provide it with a buffer for it to fill, such as in the intl extension where it calls into ICU.

Another approach to building up a string from parts would be to use the operator+ overload which allows you to simply concatenate Strings such as in the following:

There are costs and benefits to both versions. The former is more efficient as it only does one allocation, as opposed to the latter which does at least 11, and far less copying around. On the other hand, the second version is far more readable and far less error prone. For the contrived example, I'd call the second version "better", but there are certainly cases where the first version is superior.
The code so far is at commit: fa82b3cd70

The Array Class

Arrays are the do-all bucket of "stuff" of the PHP language. They can behave like vectors, maps, sets, or weird hybrid hodgepodge containers without rhyme or reason. You already know how to interact with them from userspace, so let's take a look at how to interact with them from C++. As with Strings, we're only going to go into the most common API calls here, check out the header for the full story.

As you can see, the Array APIs mirror PHP's userspace API very closely, down to the read API using square-bracket notation just like PHP code. Let's write a couple new methods dealing with arrays as arguments and return values.

The Variant Class

The last "smart" class doesn't represent a single PHP type, rather it represents all types in a sort of meta-container which knows what it's holding, and knows how to convert between the concrete types. Variant is useful when you need to accept and/or return multiple possible types. For a start, let's list out the core API. Remember that there are far more methods than I'll cover here, and you can find the reset in the header file.

These seemingly incompatible return types (const char* and bool) work because they are implicitly constructed into a Variant instance. Explicit types are generally preferred, because the IR can make better assumptions during optimization, but sometimes you just want your return values to be adaptable like that.

These APIs allow pulling a concrete data type out of a Variant so they can be operated on directly. Note that the to*() APIs will convert the type if necessary, even if the is*() call returned false, but that not all conversions make sense. Let's make a contrived example by implementing a simplistic var_dump():

I've written a number of blogposts and even one book over the years on writing extensions for PHP, but very little documentation is available for writing HHVM extensions. This is kinda sad since I built a good portion of the latter's API. Let's fix that, starting with this article.

Setting up a build environment

The first thing you need to do is get all the dependencies in place. I'm going to start from a clean install of Ubuntu 14.04 LTS and use the prebuilt HHVM binaries for Ubuntu. Other distros should work (with varying degrees of success), but sorting them all out is beyond the scope of this blog entry.

First, let's trust HHVM's package repo and pull in its package list. Then we can install the hhvm-dev package to pull in the binary along with all needed headers.

Creating an extension skeleton

The most basic, no-nothing extension imaginable requires two files. A C++ source file to declare itself, and a config.cmake file to describe what's being built. Let's start with the build file, which is a simple, single line:

HHVM_EXTENSION(example1 ext_example1.cpp)

This macro declares a new extension named "example1" with a single source file named "ext_example1.cpp". If we had multiple source files, we'd delimit them with a space (HHVM_EXTENSION(example1 ext_example1.cpp ex1lib.cpp utilex1.cpp etc.cpp))

The source file has a little more boilerplate, but fortunately it's also just a handful of lines:

All we're doing here is exposing a specialization of the "Extension" class which gives itself the name "example1". It doesn't do anything more than declare itself into the runtime environment. Those familiar with PHP extension development can think of this as the zend_module_entry struct, with all callbacks and the function table set to NULL.

Building an extension and testing it out

To build an extension, first run hphpize to generate a CMakeLists.txt file, then cmake . to generate a Makefile from that. Finally, issue make to actually build it. You should see output like the following:

Now we're ready to load it into our runtime. Start by creating a simple test file:

<?php
var_dump(extension_loaded('example1.php'));

Then fire up hhvm: hhvm -d extension_dir=. -d hhvm.extensions[]=example1.so tests/loaded.php and you should see bool(true).

Adding functionality

The simplest way to add functionality is to write some Hack code. You could write straight PHP code, but you'll see in a few moments why Hack is preferable for extension systemlibs. Let's introduce a new file: ext_example1.php and link it into our project:

Bridging the gap

If all you wanted to do was write PHP code implementations you could create a normal library for that. Extensions are for bridging PHP-script into native code, so let's do that. Make a new entry in your systemlib file using some hack specific syntax:

The <<__Native>> UserAttribute tells HHVM that this is the declaration for an internal function. The hack types tell the runtime what C++ type to pair them with, and the usual rules for default arguments apply.

To pair it with an internal implementation, we'll add the following to ext_example1.cpp:

And link it to the systemlib by adding HHVM_FE(example1_greet); to moduleInit().

As you can see, internal functions are declared with the HHVM_FUNCTION() macro where the first arg is the name of the function, as exposed to userspace, and the remaining map to the userspace functions argument signature. The argument types map according the following table:

Hack type

C++ type (argument)

C++ type (return type)

void

N/A

void

bool

bool

bool

int

int64_t

int64_t

float

double

double

string

const String&

String

array

const Array&

Array

resource

const Resource&

Resource

object

const Object&

Object

ClassName

const Object&

Object

mixed

const Variant&

Variant

mixed&

VRefParam

N/A

Since this is Hack syntax, you may declare the types as soft (with an @) or nullable (with a question mark), but since these types are not limited to a primitive, they need to be represented internally as the more generic const Variant& for arguments or Variant or return types (essentially, mixed).

Reference arguments use the VRefParam type noted above. An example of which can be seen below:

Constants

Constants, like any other bit of PHP, may be declared in the systemlib file, or if they depend on some native value (such as a define from an external library), they may be declare in moduleInit() using the Native::registerConstant() template as with the following:

The use of this function should be mostly obvious, in that it takes the name of a constant as a StringData* (which comes from a StaticString's .get() accessor), and a value appropriate to the constant's type. The type, in turn, is given as the function's template parameter and is one of the DataType enum values. The kinds correspond roughly to the basic PHP data types.

Jun 11, 2009

About a day and a half ago, Reddit user stderr posted a link to the PHP documentation showing that GOTO will be a part of the language as of version 5.3. Somewhere in the reddit comments for this post, a link was pasted to an entry on this blog where I announced the feature being added years earlier. This naturally brought out the usual flame wars that circle around something like GOTO and drove some new traffic to my blog. No big deal, my server is pretty low-traffic, it can handle a few extra hits.

Okay, more than a little incendiary, but I'll get to that in a moment...

The lesson

You would think, given that I'm the Architect for Yahoo WebSearch Front-end Engineering, that I would know something about configuring a server to not fall over under load. And in fact, I spend a good portion of my time on making sure that unexpected traffic spikes aren't enough to make a server get overloaded and trigger a chain-reaction of front-ends falling over. I really have no excuse for not preparing my server for what happened.

When I woke up Wed morning, I found that my server just didn't seem to respond to SSH or HTTP attempts. A reboot request didn't help, and the colo folks insisted that the server was simply running slow, but it was running. So I left my ssh connection attempt running and eventually I did get a login prompt. Several pained minutes later I managed to get an iptables rule in place to block off the flood of traffic (quicker than trying to stop the webserver). Suddenly, the CPU load was gone! Turns out I had my MaxClients setting much too high, and after a certain degree of concurrency, enough web-server children had spawned off to use up the available memory, which triggered disk swap, and made the CPU load 10x worse. Again, I know this effect exists, I really should have set up this server better.

A few tweaks to the config later and I got my server running smoothly. I also took the time to re-run my access log statistics. Of the aproximately 3 years of stats I've got, a full 2% of my hits were logged yesterday. Impressive reddit.... you win this round...

Prior to yesterday's traffic

Current stats

Jan

Feb

Mar

Apr

May

Jun

Jul

Aug

Sep

Oct

Nov

Dec

01

02

03

04

05

06

07

08

09

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

My flickr views...

Clarifications

Despite the title of the reddit entry, I have nothing against Christians. Further, I count myself as one. So if you really want to take a piss at people who are calling Christians evil, don't aim at me. Thanks.

Moreover, I don't think those people were evil, or even horrible (though I did use that word in the heat of the moment). A loved one had just died, and everyone was in pain. Death SUCKS, and it was a bad situation no matter how you slice it. I don't hate those people for trying to erase my friend though, I pity them. I pity the fact that they turned down the chance to mourn the passing of their loved one by crying on another loved one's shoulder. I pity the fact that they didn't get nearly the closure that my friend did. I pity them for willingly becoming victims of their own grief.