Objective-C Generated Code

This page describes exactly what Objective-C code the protocol buffer compiler generates for any given protocol definition. Any differences between proto2 and proto3 generated code are highlighted. You should read the proto2 language guide and/or proto3 language guide before reading this document.

Compiler invocation

The protocol buffer compiler produces Objective-C output when invoked with the --objc_out= command-line flag. The parameter to the --objc_out= option is the directory where you want the compiler to write your Objective-C output. The compiler creates a header file and an implementation file for each .proto file input. The names of the output files are computed by taking the name of the .proto file and making the following changes:

The file name is determined by converting the .proto file base name to camel case. For example, foo_bar.proto will become FooBar.

The extension (.proto) is replaced with either .pbobjc.h or .pbobjc.m for the header or implementation file, respectively.

The proto path (specified with the --proto_path= or -I command-line flag) is replaced with the output path (specified with the --objc_out= flag).

So, for example, if you invoke the compiler as follows:

protoc --proto_path=src --objc_out=build/gen src/foo.proto src/bar/baz.proto

the compiler will read the files src/foo.proto and src/bar/baz.proto and produce four output files: build/gen/Foo.pbobjc.h, build/gen/Foo.pbobjc.m, build/gen/bar/Baz.pbobjc.h, and build/gen/bar/Baz.pbobjc.m. The compiler will automatically create the directory build/gen/bar if necessary, but it will not create build or build/gen; they must already exist.
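The three renaming rules above can be sketched as a small C function. This is an illustration only, not the compiler's actual implementation: objc_output_path and all of its parameter names are hypothetical, and the sketch assumes '/'-separated paths and a proto path given without a trailing slash.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of protoc): derives the generated file path
 * from a .proto path by applying the three rules described above.
 * Returns 0 on success, -1 if the input does not fit the assumptions. */
static int objc_output_path(const char *proto_file, const char *proto_path,
                            const char *out_dir, const char *ext,
                            char *out, size_t out_size) {
    /* Rule 3: the proto path prefix is replaced with the output path. */
    size_t root_len = strlen(proto_path);
    if (strncmp(proto_file, proto_path, root_len) != 0 ||
        proto_file[root_len] != '/')
        return -1;
    const char *rel = proto_file + root_len + 1;      /* e.g. "bar/baz.proto" */

    const char *slash = strrchr(rel, '/');
    const char *base = slash ? slash + 1 : rel;       /* e.g. "baz.proto" */
    const char *dot = strrchr(base, '.');
    if (dot == NULL || strcmp(dot, ".proto") != 0)
        return -1;

    /* Copy the output directory plus any subdirectory of the input file. */
    int n = snprintf(out, out_size, "%s/%.*s", out_dir, (int)(base - rel), rel);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    size_t len = (size_t)n;

    /* Rule 1: camel-case the base name (foo_bar -> FooBar). */
    int upper_next = 1;
    for (const char *p = base; p < dot; ++p) {
        if (*p == '_') { upper_next = 1; continue; }
        if (len + 1 >= out_size)
            return -1;
        out[len++] = upper_next ? (char)toupper((unsigned char)*p) : *p;
        upper_next = 0;
    }

    /* Rule 2: replace the .proto extension with the given one. */
    if (len + strlen(ext) >= out_size)
        return -1;
    strcpy(out + len, ext);
    return 0;
}
```

Called with "src/bar/baz.proto", "src", "build/gen", and ".pbobjc.h", the sketch produces build/gen/bar/Baz.pbobjc.h, which matches the example output listed above.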

Packages

The Objective-C code generated by the protocol buffer compiler is completely unaffected by the package name defined in the .proto file, as Objective-C has no language-enforced namespacing. Instead, Objective-C class names are distinguished using prefixes, which are described in the next section.

Class prefix

Given the following file option:

option objc_class_prefix = "CGOOP";

The specified string - in this case, CGOOP - is prefixed in front of all Objective-C classes generated for this .proto file. Use prefixes that are 3 or more characters long, as recommended by Apple. Note that all 2-letter prefixes are reserved by Apple.

Camel case conversion

Idiomatic Objective-C uses camel case for all identifiers.

Messages will not have their names converted because the standard for proto files is to name messages in camel case already. If a message name is not in camel case, it is assumed that the user has bypassed the convention for good reason, and the implementation will conform with their intentions.

Methods generated from field names and oneofs, enum declarations, and extension accessors will have their names camel cased. In general, to convert from a proto name to a camel cased Objective-C name:

The first letter is converted to uppercase (except for fields, which always start with a lowercase letter).

For each underscore in the name, the underscore is removed, and the following letter is capitalized.

So, for example, the field foo_bar_baz becomes fooBarBaz. The field FOO_bar becomes fooBar.
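The conversion can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the compiler's code: objc_camel_case and its parameters are hypothetical names, and lowercasing every letter that does not start a word is inferred from the FOO_bar example above rather than stated as a rule. The real compiler may apply further rules that this sketch ignores.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: converts a proto name to a camel-cased name.
 * is_field selects the field variant, which starts with a lowercase letter.
 * Returns 0 on success, -1 if the output buffer is too small. */
static int objc_camel_case(const char *name, int is_field,
                           char *out, size_t out_size) {
    size_t len = 0;
    int start_of_word = 1;

    for (const char *p = name; *p != '\0'; ++p) {
        unsigned char c = (unsigned char)*p;
        if (c == '_') {              /* drop the underscore ... */
            start_of_word = 1;       /* ... and capitalize what follows */
            continue;
        }
        if (len + 1 >= out_size)
            return -1;
        /* Fields keep their very first letter lowercase. */
        int lower_first = is_field && len == 0;
        if (start_of_word && !lower_first)
            out[len++] = (char)toupper(c);
        else
            out[len++] = (char)tolower(c);   /* assumption: rest is lowercased */
        start_of_word = 0;
    }
    if (out_size == 0)
        return -1;
    out[len] = '\0';
    return 0;
}
```

With is_field set, foo_bar_baz becomes fooBarBaz and FOO_bar becomes fooBar, as in the examples above. With is_field cleared, a name such as VALUE_A becomes ValueA.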

Messages

Given a simple message declaration:

message Foo {}

The protocol buffer compiler generates a class called Foo. If you specify an objc_class_prefix file option, the value of this option is prepended to the generated class name.

In the case of outer messages that have names matching any C/C++ or Objective-C keywords:

message static {}

the generated interfaces are suffixed by _Class, as follows:

@interface static_Class : GPBMessage
@end

Note that as per the camel case conversion rules the name static is not converted. In the case of an inner message that has a camel cased name that is FieldNumber or OneOfCase, the generated interface will be the camel cased name suffixed by _Class to make sure that the generated names do not conflict with the FieldNumber enumerations or OneOfCase enumerations.

A message can also be declared inside another message.

message Foo {
  message Bar {}
}

This generates:

@interface Foo_Bar : GPBMessage
@end

As you can see, the generated nested message name is the name of the generated containing message name (Foo) appended with underscore (_) and the nested message name (Bar).

While we have tried to ensure that conflicts are kept to a minimum, there are still potential cases where message names may conflict due to the conversion between underscores and camel case. As an example:

message foo_bar {}
message foo { message bar {} }

will both generate @interface foo_bar and will conflict. The most pragmatic solution may be to rename the conflicting messages.
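The naming rule and the resulting conflict can be reproduced with a short C sketch. The function objc_message_class_name is a hypothetical name used only for illustration. It joins the class prefix, the containing message names, and the nested message name exactly as described above, and ignores the _Class suffix cases.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: builds the generated class name for a message.
 * path lists the message names from outermost to innermost.
 * Returns 0 on success, -1 if the output buffer is too small. */
static int objc_message_class_name(const char *prefix,
                                   const char *const *path, size_t depth,
                                   char *out, size_t out_size) {
    /* The objc_class_prefix value, if any, comes first. */
    int n = snprintf(out, out_size, "%s", prefix);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    size_t len = (size_t)n;

    /* Each nesting level is joined to its parent with an underscore. */
    for (size_t i = 0; i < depth; ++i) {
        n = snprintf(out + len, out_size - len, "%s%s",
                     i ? "_" : "", path[i]);
        if (n < 0 || (size_t)n >= out_size - len)
            return -1;
        len += (size_t)n;
    }
    return 0;
}
```

Passing the single name foo_bar and the two-level path foo, bar both yield foo_bar, which is the collision described above. With a prefix of CGOOP and the path Foo, Bar, the sketch yields CGOOPFoo_Bar.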

GPBMessage interface

GPBMessage is the superclass of all generated message classes. It is required to support a superset of the following interface:

Unknown fields (proto2 only)

If a message created with an older version of your .proto definition is parsed with code generated from a newer version (or vice versa), the message may contain optional or repeated fields that the "new" code does not recognize. In proto2 generated code, these fields are not discarded and are stored in the message's unknownFields property.

You can use the GPBUnknownFieldSet interface to fetch these fields by number or loop over them as an array.

In proto3, unknown fields are simply discarded when a message is parsed.

Fields

The following sections describe the code generated by the protocol buffer compiler for message fields.

Singular fields (proto3)

For every singular field the compiler generates a property to store data and an integer constant containing the field number. Message type fields also get a has.. property that lets you check if the field is set in the encoded message. So, for example, given the following message:

Special naming cases

There are cases where the field name generation rules may result in name conflicts and names will need to be "uniqued". Such conflicts are resolved by appending _p to the end of the field (_p was selected because it's pretty unique, and stands for "property").

typedef GPB_ENUM(Foo_FieldNumber) {
  // If a non-repeatable field name ends with "Array" it will be suffixed
  // with "_p" to keep the name distinct from repeated types.
  Foo_FieldNumber_FooArray_p = 1,
  // If a field name ends with "OneOfCase" it will be suffixed with "_p" to
  // keep the name distinct from OneOfCase properties.
  Foo_FieldNumber_BarOneOfCase_p = 2,
  // If a field name is a C/C++/Objective-C keyword it will be suffixed with
  // "_p" to allow it to compile.
  Foo_FieldNumber_Id_p = 3,
};

@interface Foo : GPBMessage
@property(nonatomic, readwrite) int32_t fooArray_p;
@property(nonatomic, readwrite) int32_t barOneOfCase_p;
@property(nonatomic, readwrite) int32_t id_p;
@end
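The three cases in the comments above can be expressed as one predicate. The following C sketch is an illustration under assumptions: needs_p_suffix and ends_with are hypothetical names, and the reserved-word table is a tiny illustrative subset, not the list the compiler actually uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if s ends with suffix. */
static bool ends_with(const char *s, const char *suffix) {
    size_t n = strlen(s), m = strlen(suffix);
    return n >= m && strcmp(s + n - m, suffix) == 0;
}

/* Hypothetical helper: decides whether a camel-cased field name would need
 * the "_p" suffix according to the three cases described above. */
static bool needs_p_suffix(const char *objc_name, bool is_repeated) {
    /* Illustrative subset only; the real reserved-word list is longer. */
    static const char *const kReserved[] = {"id", "static", "int", "void"};

    /* Case 1: a non-repeated name ending in "Array". */
    if (!is_repeated && ends_with(objc_name, "Array"))
        return true;
    /* Case 2: a name ending in "OneOfCase". */
    if (ends_with(objc_name, "OneOfCase"))
        return true;
    /* Case 3: a name that is a reserved word. */
    for (size_t i = 0; i < sizeof kReserved / sizeof kReserved[0]; ++i) {
        if (strcmp(objc_name, kReserved[i]) == 0)
            return true;
    }
    return false;
}
```

Under these assumptions, fooArray (non-repeated), barOneOfCase, and id all need the suffix, while an ordinary name such as name does not.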

Default values

The default value for strings is @"", and the default value for bytes is [NSData data].

Assigning nil to a string field will assert in debug builds and set the field to @"" in release builds. Assigning nil to a bytes field will assert in debug builds and set the field to [NSData data] in release builds. To test whether a bytes or string field is set, check whether its length property is nonzero.

The default "empty" value for a message is an instance of the default message. To clear a message value it should be set to nil. Accessing a cleared message will return an instance of the default message and the hasFoo method will return false.

The default message returned for a field is a local instance. The reason behind returning a default message instead of nil is that in the case of:

message Foo {
  message Bar {
    int32 b = 1;
  }
  Bar a = 1;
}

The implementation will support:

Foo *foo = [[Foo alloc] init];
foo.a.b = 2;

where a will be automatically created via the accessors if necessary. If foo.a returned nil, the foo.a.b setter pattern would not work.
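The lazy creation described above can be modeled in plain C. This is a conceptual sketch only and says nothing about how the runtime is implemented: every name in it is hypothetical, and the detail that the parent's has flag flips on the first write to the child is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Foo Foo;

typedef struct Bar {
    int32_t b;
    Foo *autocreator;   /* parent that created this instance lazily, if any */
} Bar;

struct Foo {
    Bar *a;
    bool has_a;
};

/* Getter: never returns NULL; creates a default Bar on first access.
 * Reading alone does not mark the field as set. */
static Bar *foo_a(Foo *foo) {
    if (foo->a == NULL) {
        foo->a = calloc(1, sizeof(Bar));
        if (foo->a == NULL)
            abort();
        foo->a->autocreator = foo;
    }
    return foo->a;
}

/* Setter on the child: the first write makes the field count as set
 * in the parent (assumption of this sketch). */
static void bar_set_b(Bar *bar, int32_t value) {
    bar->b = value;
    if (bar->autocreator != NULL) {
        bar->autocreator->has_a = true;
        bar->autocreator = NULL;
    }
}

/* Clearing the field, the counterpart of assigning nil. */
static void foo_clear_a(Foo *foo) {
    free(foo->a);
    foo->a = NULL;
    foo->has_a = false;
}
```

In this model, bar_set_b(foo_a(&foo), 2) plays the role of foo.a.b = 2: the getter hands back a usable instance instead of a null pointer, so the chained write has something to act on.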

Singular fields (proto2)

For every singular field the compiler generates a property to store data, an integer constant containing the field number, and a has.. property that lets you check if the field is set in the encoded message. So, for example, given the following message:

Special naming cases

There are cases where the field name generation rules may result in name conflicts and names will need to be "uniqued". Such conflicts are resolved by appending _p to the end of the field (_p was selected because it's pretty unique, and stands for "property").

typedef GPB_ENUM(Foo_FieldNumber) {
  // If a non-repeatable field name ends with "Array" it will be suffixed
  // with "_p" to keep the name distinct from repeated types.
  Foo_FieldNumber_FooArray_p = 1,
  // If a field name ends with "OneOfCase" it will be suffixed with "_p" to
  // keep the name distinct from OneOfCase properties.
  Foo_FieldNumber_BarOneOfCase_p = 2,
  // If a field name is a C/C++/Objective-C keyword it will be suffixed with
  // "_p" to allow it to compile.
  Foo_FieldNumber_Id_p = 3,
};

@interface Foo : GPBMessage
@property(nonatomic, readwrite) int32_t fooArray_p;
@property(nonatomic, readwrite) int32_t barOneOfCase_p;
@property(nonatomic, readwrite) int32_t id_p;
@end

Default values (optional fields only)

The default value for numeric types, if no explicit default was specified by the user, is 0.

The default value for strings is @"", and the default value for bytes is [NSData data].

Assigning nil to a string field will assert in debug builds and set the field to @"" in release builds. Assigning nil to a bytes field will assert in debug builds and set the field to [NSData data] in release builds. To test whether a bytes or string field is set, check whether its length property is nonzero.

The default "empty" value for a message is an instance of the default message. To clear a message value it should be set to nil. Accessing a cleared message will return an instance of the default message and the hasFoo method will return false.

The default message returned for a field is a local instance. The reason behind returning a default message instead of nil is that in the case of:

message Foo {
  message Bar {
    optional int32 b = 1;
  }
  optional Bar a = 1;
}

The implementation will support:

Foo *foo = [[Foo alloc] init];
foo.a.b = 2;

where a will be automatically created via the accessors if necessary. If foo.a returned nil, the foo.a.b setter pattern would not work.

Repeated fields

Like singular fields, the protocol buffer compiler generates one data property for each repeated field. Depending on the field type, this data property is a GPB<VALUE>Array, where <VALUE> can be one of UInt32, Int32, UInt64, Int64, Bool, Float, Double, or Enum. NSMutableArray is used for string, bytes, and message types. Field names for repeated types have Array appended to them. Array is appended in the Objective-C interface to make the code more readable: repeated fields in proto files tend to have singular names, which do not read well in standard Objective-C usage. Making the singular names plural would be more idiomatic Objective-C; however, pluralization rules are too complex to support in the compiler.

For string, bytes and message fields, elements of the array are NSString*, NSData* and pointers to subclasses of GPBMessage respectively.
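The type-to-class mapping for repeated fields can be summarized as a lookup. The C sketch below is illustrative: ProtoFieldType and repeated_field_class are hypothetical names, and folding the sint, fixed, and sfixed variants into the Int32, UInt32, Int64, and UInt64 classes is an assumption based on their value types rather than something the text above spells out.

```c
#include <stddef.h>

/* Hypothetical classification of proto field types. */
typedef enum {
    PROTO_INT32, PROTO_SINT32, PROTO_SFIXED32,
    PROTO_UINT32, PROTO_FIXED32,
    PROTO_INT64, PROTO_SINT64, PROTO_SFIXED64,
    PROTO_UINT64, PROTO_FIXED64,
    PROTO_BOOL, PROTO_FLOAT, PROTO_DOUBLE, PROTO_ENUM,
    PROTO_STRING, PROTO_BYTES, PROTO_MESSAGE
} ProtoFieldType;

/* Returns the class name used for a repeated field of the given type,
 * or NULL for a value outside the enumeration. */
static const char *repeated_field_class(ProtoFieldType type) {
    switch (type) {
        case PROTO_INT32: case PROTO_SINT32: case PROTO_SFIXED32:
            return "GPBInt32Array";
        case PROTO_UINT32: case PROTO_FIXED32:
            return "GPBUInt32Array";
        case PROTO_INT64: case PROTO_SINT64: case PROTO_SFIXED64:
            return "GPBInt64Array";
        case PROTO_UINT64: case PROTO_FIXED64:
            return "GPBUInt64Array";
        case PROTO_BOOL:   return "GPBBoolArray";
        case PROTO_FLOAT:  return "GPBFloatArray";
        case PROTO_DOUBLE: return "GPBDoubleArray";
        case PROTO_ENUM:   return "GPBEnumArray";
        /* Object-valued types share the Foundation container. */
        case PROTO_STRING: case PROTO_BYTES: case PROTO_MESSAGE:
            return "NSMutableArray";
    }
    return NULL;
}
```

For example, a repeated int32 field maps to GPBInt32Array, while repeated string, bytes, and message fields all map to NSMutableArray.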

Default values

The default value for a repeated field is to be empty. In Objective-C generated code, this is an empty GPB<VALUE>Array. If you access an empty repeated field, you'll get back an empty array that you can update like any other repeated field array.

Map fields

For map fields, cases where keys are strings and values are strings, bytes, or messages are handled by NSMutableDictionary.

Other cases are:

GPB<KEY><VALUE>Dictionary

where:

<KEY> is UInt32, Int32, UInt64, Int64, Bool or String.

<VALUE> is UInt32, Int32, UInt64, Int64, Bool, Float, Double, Enum, or Object. Object is used for values of type string, bytes, or message to cut down on the number of classes and is in line with how Objective-C works with NSMutableDictionary.
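The way the dictionary class name is composed from the key and value kinds can be sketched in C. The function map_field_class is a hypothetical name, and it expects the key and value already expressed as the <KEY> and <VALUE> tokens listed above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: composes the container class name for a map field
 * from its <KEY> and <VALUE> tokens.
 * Returns 0 on success, -1 if the output buffer is too small. */
static int map_field_class(const char *key, const char *value,
                           char *out, size_t out_size) {
    int n;
    if (strcmp(key, "String") == 0 && strcmp(value, "Object") == 0) {
        /* String keys with string, bytes, or message values. */
        n = snprintf(out, out_size, "NSMutableDictionary");
    } else {
        /* Every other combination gets a dedicated class. */
        n = snprintf(out, out_size, "GPB%s%sDictionary", key, value);
    }
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For instance, the tokens Int32 and Object give GPBInt32ObjectDictionary, String and Int32 give GPBStringInt32Dictionary, and String and Object fall back to NSMutableDictionary.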

Default values

The default value for a map field is empty. In Objective-C generated code, this is an empty GPB<KEY><VALUE>Dictionary. If you access an empty map field, you'll get back an empty dictionary that you can update like any other map field.

You can also use the provided <mapField>_Count property to check if a particular map is empty:

if (myFoo.myMap_Count) {
  // There is something in the map...
}

GPB<KEY><VALUE>Dictionary interface

The GPB<KEY><VALUE>Dictionary (apart from GPB<KEY>ObjectDictionary and GPB<KEY>EnumDictionary) interface is as follows:

Enumerations

Given an enum definition like:

enum Foo {
  VALUE_A = 0;
  VALUE_B = 1;
  VALUE_C = 5;
}

the generated code will be:

// The generated enum value name will be the enumeration name followed by
// an underscore and then the enumerator name converted to camel case.
// GPB_ENUM is a macro defined in the Objective-C Protocol Buffer headers
// that enforces all enum values to be int32 and aids in Swift Enumeration
// support.
typedef GPB_ENUM(Foo) {
  Foo_GPBUnrecognizedEnumeratorValue = kGPBUnrecognizedEnumeratorValue,  // proto3 only
  Foo_ValueA = 0,
  Foo_ValueB = 1,
  Foo_ValueC = 5,
};
// Returns information about what values this enum type defines.
GPBEnumDescriptor *Foo_EnumDescriptor(void);

// GPBEnumDescriptor is defined in the runtime and contains information
// about the enum definition, such as the enum name, enum value and enum value
// validation function.
typedef GPBEnumDescriptor *(*GPBEnumDescriptorAccessorFunc)(void);

The enum descriptor accessor functions are C functions, as opposed to methods on the enumeration class, because they are rarely used by client software. This cuts down on the amount of Objective-C runtime information generated and potentially allows the linker to dead-strip them.

In the case of outer enums that have names matching any C/C++ or Objective-C keywords, such as:

enum method {}

the generated enumerations are suffixed with _Enum, as follows:

// The generated enumeration name is the keyword suffixed by _Enum.
typedef GPB_ENUM(Method_Enum) {}

All enumeration fields have the ability to access the value as a typed enumerator (Foo_Bar in the example above), or, if using proto3, as a raw int32_t value (using the accessor functions in the example above). This is to support the case where the server returns values that the client may not recognize due to the client and server being compiled with different versions of the proto file.

Unrecognized enum values are treated differently depending on which protocol buffers version you are using. In proto3, kGPBUnrecognizedEnumeratorValue is returned for the typed enumerator value if the enumerator value in the parsed message data is not one that the code reading it was compiled to support. If the actual value is desired, use the raw value accessors to get the value as an int32_t. If you are using proto2, unrecognized enum values are treated as unknown fields.

kGPBUnrecognizedEnumeratorValue is defined as 0xFBADBEEF, and it will be an error if any enumerator in an enumeration has this value. Attempting to set any enumeration field to this value is a runtime error. Similarly, attempting to set any enumeration field to an enumerator not defined by its enumeration type using the typed accessors is a runtime error. In both error cases, debug builds will cause an assertion and release builds will log and set the field to its default value (0).
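The difference between the typed and raw views of an enum field in proto3 can be modeled with a short C sketch. This is a simplified stand-in, not the generated code: Message, foo_is_valid_value, message_foo, message_foo_raw_value, and message_set_foo are hypothetical names, the macro is a local copy of the constant named above, and the setter models only the release-build behavior (the debug assertion and the logging are omitted).

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the constant named above: the same bit pattern,
 * stored as an int32_t. */
#define kUnrecognizedEnumeratorValue ((int32_t)0xFBADBEEF)

typedef enum {
    Foo_GPBUnrecognizedEnumeratorValue = kUnrecognizedEnumeratorValue,
    Foo_ValueA = 0,
    Foo_ValueB = 1,
    Foo_ValueC = 5,
} Foo;

/* True only for enumerators that the enum definition declares. */
static bool foo_is_valid_value(int32_t raw) {
    return raw == Foo_ValueA || raw == Foo_ValueB || raw == Foo_ValueC;
}

/* Stand-in for a message that stores the value as it appeared on the wire. */
typedef struct {
    int32_t foo_raw;
} Message;

/* Typed view: unknown wire values collapse to the unrecognized marker. */
static Foo message_foo(const Message *m) {
    return foo_is_valid_value(m->foo_raw)
               ? (Foo)m->foo_raw
               : Foo_GPBUnrecognizedEnumeratorValue;
}

/* Raw view: always returns the stored int32_t unchanged. */
static int32_t message_foo_raw_value(const Message *m) {
    return m->foo_raw;
}

/* Typed setter, release-build behavior only: an invalid value (including
 * the unrecognized marker) resets the field to its default of 0. */
static void message_set_foo(Message *m, Foo value) {
    m->foo_raw = foo_is_valid_value((int32_t)value) ? (int32_t)value
                                                    : (int32_t)Foo_ValueA;
}
```

If the stored value is 7, which the enum does not define, the typed view reports Foo_GPBUnrecognizedEnumeratorValue while the raw view still returns 7, so a client built against an older .proto file can detect and forward values it does not know.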

The raw value accessors are defined as C functions instead of as Objective-C methods because they are not used in most cases. Declaring them as C functions cuts down on wasted Objective-C runtime information and allows the linker to potentially dead-strip them.

Well-known types (proto3 only)

If you use any of the message types provided with proto3, they will in general just use their proto definitions in generated Objective-C code, though we supply some basic conversion methods in categories to make using them simpler. Note that we do not yet have special APIs for all well-known types, including Any (there is currently no helper method to convert an Any's message value into a message of the appropriate type).