> I am surprised to have just learned that we do not need to declare instance
> variables explicitly when we declare a property and use @synthesize.
>
> That is, we do not need an instance variable declared that corresponds to
> the property declaration.
>
>
>
> look at this blog:
> http://cocoawithlove.com/2010/03/dynamic-ivars-solving-fragile-base.html
>
>
>
> However, I just wonder: is it really true that there is no difference at all
> between explicitly declaring iVars and not declaring them?
>
> If so, is it a better approach to just declare the property and let
> @synthesize generate the iVars itself?
>
>
>
> Any further explanation and clarification would be appreciated.

The only difference is that you have to explicitly declare the ivar to be compatible with the 32-bit runtime.

If you’re 64-bit only (or if you require Lion or better), there’s no real reason to explicitly declare the ivars these days.

Explicitly declaring your ivars makes them easier to see in the
debugger. Otherwise you have to call the property accessor from gdb's
command line to look at it.

I feel that it's helpful to declare them in any case, so I can
more easily tell how much memory each of my objects is taking up by
looking at its list of ivars.

Not every property is backed by the ivar it acts as an accessor for.
It's perfectly legal to implement custom accessors that do other
things than set or get instance variables. They could instead run
some sort of calculation, or set some other combination of ivars.

The synthesized ivars are private to the class that declares them. So you can use them as an ivar in that class's implementation, but not in a subclass implementation. Usually this is not a problem, but if you already have code that passes the ivar's address in a function call, for example, that won't work with a property.

On 2011-11-13, at 2:16 AM, ico wrote:

> However, I just wonder: is it really true that there is no difference at all
> between explicitly declaring iVars and not declaring them?

For some reason, sometimes my synthesized ivars show up in the debugger as ivars, and sometimes they don't. Can anyone explain this? I can't remember whether this is with GDB or LLDB.

On 2011-11-13, at 5:24 AM, Don Quixote de la Mancha wrote:

> Explicitly declaring your ivars makes them easier to see in the
> debugger. Otherwise you have to call the property accessor from gdb's
> command line to look at it.

> On Nov 13, 2011, at 1:16 AM, ico wrote:
>
>> However, I just wonder: is it really true that there is no difference at all
>> between explicitly declaring iVars and not declaring them?
>
> If you're 64-bit only (or if you require Lion or better), there's no real reason to explicitly declare the ivars these days.

As others have pointed out, this is not true. There are practical differences between declaring and not declaring the ivar explicitly. I almost never declare the ivar explicitly, but once in a while I need it to show up in the debugger or to be available in a subclass, and then I must declare it explicitly. m.

>> If you’re 64-bit only (or if you require Lion or better), there’s no real reason to explicitly declare the ivars these days.
>
> As others have pointed out, this is not true. There are practical differences between declaring and not declaring the ivar explicitly. I almost never declare the ivar explicitly, but once in a while I need it to show up in the debugger or to be available in a subclass, and then I must declare it explicitly. m.

TBH, what I don't get is why this cannot be changed in LLVM instead -
then we would not have the 64-bit/10.7 restrictions.

Of course it would not change what's happening in the runtime, but I
guess most people only care about what they need to type anyway.

>>> If you’re 64-bit only (or if you require Lion or better), there’s no real reason to explicitly declare the ivars these days.
>>
>> As others have pointed out, this is not true. There are practical differences between declaring and not declaring the ivar explicitly. I almost never declare the ivar explicitly, but once in a while I need it to show up in the debugger or to be available in a subclass, and then I must declare it explicitly. m.
>
> TBH, what I don't get is why this cannot be changed in LLVM instead -
> then we would not have the 64-bit/10.7 restrictions.
>
> Of course it would not change what's happening in the runtime, but I
> guess most people only care about what they need to type anyway.

In four words: Fragile Base Class Problem.

The problem is that a subclass (in 32-bit OS X) needs to know the size of its superclass so it knows how to lay out its own ivars. If there are no explicit ivars, there is no way for the compiler to know the size (since when it is compiling the subclass, it doesn't see all the files that may potentially contain the parent's ivars).

Glenn Andreas <gandreas...>
The most merciful thing in the world ... is the inability of the human mind to correlate all its contents - HPL

> In four words: Fragile Base Class Problem.
>
> The problem is that a subclass (in 32-bit OS X) needs to know the size of its superclass so it knows how to lay out its own ivars. If there are no explicit ivars, there is no way for the compiler to know the size (since when it is compiling the subclass, it doesn't see all the files that may potentially contain the parent's ivars).

Think of it as if the compiler generates the ivars from the property
declarations. The ivars would then indeed be explicit ivars - just not
declared as such in the classic way.

>> In four words: Fragile Base Class Problem.
>>
>> The problem is that a subclass (in 32-bit OS X) needs to know the size of its superclass so it knows how to lay out its own ivars. If there are no explicit ivars, there is no way for the compiler to know the size (since when it is compiling the subclass, it doesn't see all the files that may potentially contain the parent's ivars).
>
> Think of it as if the compiler generates the ivars from the property
> declarations. The ivars would then indeed be explicit ivars - just not
> declared as such in the classic way.

Doesn't matter. The subclass still needs to know the size of its superclass so that it can generate the proper ivar offsets. If the ivar declarations are not visible to the compiler, it cannot know this information.

The modern runtime sidesteps this issue by storing the offset of each ivar in a global variable and indirecting all ivar access through that.
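That indirection can be sketched in plain C (a toy model; the names here are made up for illustration, not the real runtime's symbols). Instead of hardcoding the offset of `_foo` into every load and store, the compiler emits accesses through a global offset variable that the runtime fills in once the superclass's true size is known:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { void *isa; double a; } Parent;
typedef struct { Parent inherited; int _foo; } Bar;

/* Stand-in for the real runtime's per-ivar offset global
   (something like OBJC_IVAR_$_Bar._foo); patched at class-load
   time, so compiled code never bakes in a fixed offset. */
static size_t ivar_offset_Bar__foo;

static void runtime_fixup(void) {
    /* Here we just use the compile-time layout; the real runtime
       computes this from the superclass's actual instance size. */
    ivar_offset_Bar__foo = offsetof(Bar, _foo);
}

/* Every access to self->_foo indirects through the global. */
static int Bar_getFoo(Bar *self) {
    return *(int *)((char *)self + ivar_offset_Bar__foo);
}

static void Bar_setFoo(Bar *self, int v) {
    *(int *)((char *)self + ivar_offset_Bar__foo) = v;
}
```

If the superclass grows in a later OS release, only the value stored in the offset global changes; the subclass's compiled code keeps working unmodified.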

>> Think of it as if the compiler generates the ivars from the property
>> declarations. The ivars would then indeed be explicit ivars - just not
>> declared as such in the classic way.
>
> Doesn't matter. The subclass still needs to know the size of its superclass so that it can generate the proper ivar offsets. If the ivar declarations are not visible to the compiler, it cannot know this information.

But the compiler could derive that information from the property declaration.

FWIW, I checked on the LLVM IRC channel and the response was "I wouldn't
be surprised if there are annoying edge cases, but offhand I don't see
any reason it couldn't be done."

>>> Think of it as if the compiler generates the ivars from the property
>>> declarations. The ivars would then indeed be explicit ivars - just not
>>> declared as such in the classic way.
>>
>> Doesn't matter. The subclass still needs to know the size of its superclass so that it can generate the proper ivar offsets. If the ivar declarations are not visible to the compiler, it cannot know this information.
>
> But the compiler could derive that information from the property declaration.

No it can't. @property only says "I have methods named -foo and -setFoo:". It implies absolutely nothing about storage.

> FWIW, I checked on the LLVM IRC channel and the response was "I wouldn't
> be surprised if there are annoying edge cases, but offhand I don't see
> any reason it couldn't be done."

If it could've been done, they would have done it. The fragile base class problem is a well-understood aspect of language design, and the compiler team is full of smart people.

> No it can't. @property only says "I have methods named -foo and -setFoo:". It implies absolutely nothing about storage.

How does

@property (nonatomic, assign) IBOutlet NSWindow *window;

not have the information that there would need to be an ivar

NSWindow *window;

on 32-bit?

>> FWIW, I checked on the LLVM IRC channel and the response was "I wouldn't
>> be surprised if there are annoying edge cases, but offhand I don't see
>> any reason it couldn't be done."
>
> If it could've been done, they would have done it. The fragile base class problem is a well-understood aspect of language design, and the compiler team is full of smart people.

FWIW, the guy was from Apple. Quoting further: "as far as I know the
decision to tie omitted ivars to non-fragile ivars was more one of
available engineering time and source compatibility than one of
technical possibility."

>> FWIW, I checked on the LLVM IRC channel and the response was "I wouldn't
>> be surprised if there are annoying edge cases, but offhand I don't see
>> any reason it couldn't be done."
>
> If it could've been done, they would have done it. The fragile base class problem is a well-understood aspect of language design, and the compiler team is full of smart people.

Yup. Rather, to be pedantic, it can be done, and they did do it, but at the cost of changes to object layout and the ABI. So to avoid breaking binary compatibility with all existing apps, they didn’t make these changes in the 32-bit x86 runtime, only newer runtimes like 64-bit and iOS.

(In a nutshell: if a class doesn’t know the size of its base class’s instance data at compile time, it can’t look up its own instance variables as constant offsets from ‘self’, so this changes the machine instructions for getting and setting instance variables.)

>> No it can't. @property only says "I have methods named -foo and -setFoo:". It implies absolutely nothing about storage.
>
> How does
>
> @property (nonatomic, assign) IBOutlet NSWindow *window;
>
> not have the information that there would need to be an ivar
>
> NSWindow *window;
>
> on 32-bit?

Because you could implement this property without any such ivar at all - for example, with a custom accessor that computes the value or forwards to another object.

> How does
>
> @property (nonatomic, assign) IBOutlet NSWindow *window;
>
> not have the information that there would need to be an ivar
>
> NSWindow *window;
>
> on 32-bit?

There’s no requirement that there be such an ivar, only a method named -window that returns an NSWindow*. The implementation of that method is arbitrary. For example it might just look like
- (NSWindow*) window { return [_parent window];}

Separating interface from implementation is a good thing and a keystone of OOP.

> There’s no requirement that there be such an ivar, only a method named
> -window that returns an NSWindow*. The implementation of that method is
> arbitrary. For example it might just look like
> - (NSWindow*) window { return [_parent window];}

But then again the compiler would know about these implementations.

> Separating interface from implementation is a good thing and a keystone of
> OOP.

No one is questioning that :)

Anyway, not sure this discussion is really useful for the list anymore.
Happy to discuss further off list if someone is interested.

>> There’s no requirement that there be such an ivar, only a method named
>> -window that returns an NSWindow*. The implementation of that method is
>> arbitrary. For example it might just look like
>> - (NSWindow*) window { return [_parent window];}
>
> But then again the compiler would know about these implementations.

You have some magical compiler that can not only access superclasses'
method definitions when all it sees is a header file, but has also
solved the halting problem in a way that allows it to determine how
the method uses instance storage?

No, it wouldn’t. The compiler has no idea how NSDictionary or NSWindow are implemented. All it knows about them is what’s given in their header files. (Worse, even if it did grope into the framework’s binary code to decompile the implementation, that implementation is guaranteed to change in the next OS release, and that might include the size of the instance data.)

The fragile base class problem in a nutshell:

- I’m generating 32-bit Mac OS machine code to read instance variable “self->_foo” in a method of class Bar.
- I have an imaginary internal struct that defines the data layout of Bar. It looks like:

    struct BarInstanceData {
        Class isa;
        BarParentInstanceData inheritedData;
        int _foo;
    };

- The compiler can now interpret “self->_foo” as a regular C struct access and emits an instruction that loads an int from a hardcoded offset from the register holding ‘self’. Let’s say the offset is 48.
- In the next release of the OS, one of the base classes of Bar has added some instance variables, adding 8 bytes to its instance size.
- This means that at runtime the true offset of self->_foo is now 48+8 = 56.
- Unfortunately the old offset 48 is baked into the machine code of the app/library containing class Bar.
- This means that the implementation of Bar will read and write the wrong locations. Kaboom.

Note that if the compiler can’t work out the instance size of all the base classes of Bar, it can’t work out the size of that BarParentInstanceData struct in step 2, meaning it can’t compile Bar.
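The walkthrough above can be reproduced with two C structs standing in for the two OS releases (the layouts are hypothetical, chosen only so the offsets move the way the steps describe):

```c
#include <assert.h>
#include <stddef.h>

typedef struct { void *isa; long a; } ParentV1;          /* this year's OS */
typedef struct { void *isa; long a; long b; } ParentV2;  /* next year's OS:
                                                            base class grew */

typedef struct { ParentV1 inherited; int _foo; } BarV1;
typedef struct { ParentV2 inherited; int _foo; } BarV2;

/* An app compiled against V1 bakes offsetof(BarV1, _foo) into its
   machine code. Run against the V2 layout, that hardcoded offset no
   longer points at _foo -- the "kaboom" in the last step above. */
```

On a typical 64-bit build the baked-in offset is 16 while the true offset under the new layout is 24, so every compiled read and write of `_foo` lands 8 bytes short.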

Now, just removing one of the two getter calls by caching its result won't
have that much effect on binary size, but the other night I went
through my code and did this exhaustively. The size decrease was quite
significant.

Using properties when a simple iVar would do is not justified. One
wants to use properties only when the generated code significantly
reduces the amount of work one has to do as a coder, for example by
automagically taking care of retain counts.

Calling accessors is also quite slow compared to a direct iVar access,
because it has to go through Objective-C's message dispatch mechanism.

> Using properties significantly increased the size of my executable
> file. If I had something like this:
>
> float a = [self foo: self.scale];
> float b = [self bar: self.scale];
>
> I could cut down the size of my code quite a bit by caching the return
> value of self.scale:
>
> float theScale = self.scale;
> float a = [self foo: theScale];
> float b = [self bar: theScale];
>
> Now, just removing one of the two getter calls by caching its result won't
> have that much effect on binary size, but the other night I went
> through my code and did this exhaustively. The size decrease was quite
> significant.
>
> Using properties when a simple iVar would do is not justified. One
> wants to use properties only when the generated code significantly
> reduces the amount of work one has to do as a coder, for example by
> automagically taking care of retain counts.

Once again you're using a bogus argument. There's nothing wrong (in most cases) with using stack variables to "cache" property values or return values as you have demonstrated. However, these are not ivars.

Within a class implementation, there's often nothing wrong with reading ivars directly. There's also nothing wrong with writing ivars directly, except that there is generally memory management to consider (though, I guess, not when you're using ARC).

> Calling accessors is also quite slow compared to a direct iVar access,
> because it has to go through Objective-C's message dispatch mechanism.

If you're talking about access from outside the class (that is, from clients of the class), then the overwhelming consensus of developers who've been using Obj-C for a while is that properties are a huge win, because they provide useful encapsulation -- class implementation details, such as the backing store (aka ivar) for a public property, are hidden within the class's implementation.

If you're talking about access within the class implementation, then the best approach depends on the circumstances. In many cases, the self-discipline of encapsulating property values yields greater robustness for subclasses, code readability, etc. In many other cases, internal encapsulation is of no benefit and direct ivar access is perfectly fine.

I'll tell you, though, that (at least pre-ARC) the trend over the last several years has been strongly in favor of using the properties, for practical rather than theoretical reasons. Whether ARC might reverse this trend isn't obvious yet.

> Focussing on interface is no excuse for weighty, bloated code!

I'm not sure what exactly this has to do with "interface".

One mantra that gets repeated a lot on this list is 'No Premature Optimization'. If the "weighty, bloated" accessors produce no significant *measurable* effect on your app, there's no reason to avoid them. If there is a measurable effect, and if the measurable benefits of optimizing your code outweigh the development costs of doing so, then by all means the optimized route is the way to go.

One other advantage to using properties over direct ivar access internally is KVO compliance.

Usually I prefer to declare properties backed by ivars of a different name, then use getters/setters everywhere except inside initializers and dealloc. Frees me from having to worry about willChangeValueForKey:, etc.

I do this even if I currently don't observe a property in order to foster forward compatibility.

(Sent from my iPhone.)

--
Conrad Shultz

On Nov 16, 2011, at 2:19, Quincey Morris <quinceymorris...> wrote:

> On Nov 16, 2011, at 01:00 , Don Quixote de la Mancha wrote:
>
>> Using properties when a simple iVar would do is not justified.
>
> Once again you're using a bogus argument. There's nothing wrong (in most cases) with using stack variables to "cache" property values or return values as you have demonstrated. However, these are not ivars.

> Using properties significantly increased the size of my executable
> file. If I had something like this:
>
> float a = [self foo: self.scale];
> float b = [self bar: self.scale];
>
> I could cut down the size of my code quite a bit by caching the return
> value of self.scale:
>
> float theScale = self.scale;
> float a = [self foo: theScale];
> float b = [self bar: theScale];
>
> Now, just removing one of the two getter calls by caching its result won't
> have that much effect on binary size, but the other night I went
> through my code and did this exhaustively. The size decrease was quite
> significant.

This isn't an argument against properties; it's an argument against redundant method calls. The same debate existed long before the @property keyword and dot syntax were introduced.

>
> Using properties when a simple iVar would do is not justified. One
> wants to use properties only when the generated code significantly
> reduces the amount of work one has to do as a coder, for example by
> automagically taking care of retain counts.

Or if the superclass anticipates a subclass overriding the getter. Designing for extensibility versus speed is a common tradeoff.

Some people always use the property outside of -init and -dealloc. (Brent Simmons recently tweeted about doing this, and there was a mixed set of replies.) I typically don't, unless I'm writing framework code that I expect other developers will subclass. But because it's all our own code, it's very easy to move to direct ivar access if we need more speed, or to calling accessors for correctness.

>
> Calling accessors is also quite slow compared to a direct iVar access,
> because it has to go through Objective-C's message dispatch mechanism.

objc_msgSend isn't very slow. What measurements have you done that indicate objc_msgSend is taking any appreciable amount of time?

>
> Focussing on interface is no excuse for weighty, bloated code!

I would argue that you have your priorities confused. Focusing on interface recognizes that programmer time is far more expensive than computer time.

> On Nov 16, 2011, at 1:00 AM, Don Quixote de la Mancha <quixote...> wrote:
>
>> Calling accessors is also quite slow compared to a direct iVar access,
>> because it has to go through Objective-C's message dispatch mechanism.
>
> objc_msgSend isn't very slow. What measurements have you done that indicate objc_msgSend is taking any appreciable amount of time?

objc_msgSend is slow as Alaskan Molasses compared to a simple C function call.

According to Instruments, my iOS App now spends about half its time in
a C (not Objective-C) void function that updates the state of a
Cellular Automaton grid, and the other half in a low-level iOS Core
Graphics routine that fills rectangles with constant colors.

Even though I implemented my most time-critical routine in C,
objc_msgSend still takes up two percent of my run time. I expect it
would be a lot more if I hadn't implemented that grid update routine in C.

>> Focussing on interface is no excuse for weighty, bloated code!
>
> I would argue that you have your priorities confused. Focusing on interface recognizes that programmer time is far more expensive than computer time.

End-user time is even more expensive than programmer time.

Large executables take up more disk or Flash storage space, so they
load slower and require more electrical power to load.

For desktop and server machines, large executables are less likely to
be completely resident in physical memory, so the user's hard drive
will spin down less often. For iOS and other mobile devices, your App
will be slower to launch and, after having been suspended, slower to
resume. Large executables also shorten battery life significantly.

It is worthwhile to reduce the size even of the portions of an executable
that don't seem to impact run time. The reason is that code takes up
space in the CPU's code cache. That displaces your time-critical code
from the cache, so your app has to slow down by hitting main memory to
reload the cache.

Calling any Objective-C method also loads data into the data cache, at
the very least to refer to self, as well as to search the selector
table.

That selector table is fscking HUGE! My iOS App isn't particularly
big, but Instruments tells me that just about the very first thing
my App does is malloc() 300,000 CFStrings right at startup. I
expect each such CFString is either a selector, or one component of a
selector that takes multiple parameters.

While the Mach-O executable format is designed to speed up selector table
searching, that search reads stuff into both the code and data caches.
If the code and data are not already cached, then objc_msgSend will be
very, very slow compared to a C subroutine call.

I expect that all Objective-C methods are implemented just like C
subroutines that take "self" as a parameter that's not named
explicitly by the coder. So objc_msgSend blows away some cache lines
to determine which C implementation to call, as well as to check that
the object to which we are sending the message is not nil. It then
calls the C subroutine, only at that point becoming as efficient as C
code.
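That description can be modeled in a few lines of C (purely illustrative: the real objc_msgSend uses cached hash lookups and hand-tuned assembly, not a linear string search, and these names are made up):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Obj Obj;
typedef int (*ImpModel)(Obj *self);  /* a method is a C function taking self */

struct Obj { int foo; };

static int getFoo(Obj *self) { return self->foo; }

/* Toy selector table: selector name -> C implementation. */
static struct { const char *sel; ImpModel imp; } table[] = {
    { "foo", getFoo },
};

static int msgSend_model(Obj *receiver, const char *sel) {
    if (receiver == NULL)                   /* messages to nil return zero */
        return 0;
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].sel, sel) == 0)
            return table[i].imp(receiver);  /* only now as cheap as a C call */
    return 0;                               /* real runtime would raise here */
}
```

The lookup and the nil check are the overhead being debated; once the implementation is found, the call is an ordinary C function call with `self` as the hidden first argument.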

Not all Objective-C methods refer to self, not even implicitly. But
self is always passed by objc_msgSend, even if you don't need it.
While the compiler can optimize away the register self is stored in,
the runtime has no choice but to always pass self to methods. That
means method parameters take up space for self that's not always
necessary.

On ARM, the first four parameters are, more or less, passed in
registers. One of those will always be self, even if the called
method doesn't use it.

In C++, one can declare such functions as "static", so that "this" -
the C++ equivalent of self - is not passed. But while Objective-C
supports Class Methods, those whose signatures start with "+" instead
of "-", using a Class Method instead of an Instance Method doesn't
save you anything. Rather than self being passed, a pointer to the
class object will be passed.

I've been trying to beat these arguments into the heads of my fellow
coders since most of you lot were in diapers. Just about always the
responses are that "Premature Optimization is the Root of All Evil,"
as well as that programmer time is too valuable to optimize.

But at the same time, the vast majority of the software I use on a
daily basis is huge, bloated and slow. It has been my experience for
many years that it is simply not possible for me to purchase a
computer with a fast enough CPU, memory, network or disk drive, or
with enough memory or disk that the computer remains useful for any
length of time.

How much do you all know about how hardware engineers try to make our
software run faster?

Moore's "Law" claims that the number of transistors on any given type
of chip doubles every eighteen months. Most such chips also double in
their maximum throughput. To design each new generation of chip, as
well as to construct and equip the wafer fabs that make them, is
colossally expensive. A low-end wafer fab that makes chips that
aren't particularly fancy costs at least a billion dollars. A fab for
any kind of really interesting chip, like a high-end microprocessor or
really large and fast memory, costs quite a lot more than that.

But the woeful industry practice of just assuming that memory, CPU
power, disk storage and network bandwidth are infinite more than
reverses the speed and capacity gains developed at colossal expense by
the hardware people.

You all speak as if you think I'm a clueless newbie, but I was a
"White Badge" Senior Engineer in Apple's Traditional OS Integration
team from 1995 through 1996. For most of that time I worked as a
"Debug Meister", isolating and fixing the most serious
bugs and performance problems in the Classic Mac OS System 7.5.2 - the
System for the first PCI PowerPC Macs - and 7.5.3.

One of my tasks was to find a way to speed up a new hardware product
that wasn't as fast as it needed to be to compete well against similar
products from other vendors. After tinkering around with a prototype
unit for a while, I rewrote an important code path in the Resource
Manager in such a way that it would use less of both the code and data
caches. That particular code path was quite commonly taken by every
Classic App, as well as the System software, so it improved the
performance of the entire system.

Even so, our product wasn't going to sell well unless most if not all
of the code paths in the entire Classic System software improved its
cache utilization, so I wrote and distributed a Word document that
pointed out that the code and data caches in our product's CPU were
very limited resources.

Rather than writing our software with the assumption that we had the
use of - at the time - dozens to hundreds of megabytes of very fast
memory, this document asserted that one should focus instead on cache
utilization. I illustrated my point with a rectangular array of bullet
characters, one for each of the 32-byte data or code cache lines in
the PowerPC chips of the day.

Let me give you such a diagram for the ARM Cortex A8 CPUs that are
used by the iPhone 4, first-generation iPad, and the third and - I
think - fourth generation iPod Touch.

The Cortex A8 has 64 bytes in each cache line, which you might think
is a good thing, but it might not be if your memory access patterns
aren't harmonious with the Cortex's cache design. Specifically, if
you read or write so much as one byte in a cache line without then
using the remaining 63 bytes somehow, you are wasting the user's time
and draining their battery in a way that you should not have to.

The ARM Holdings company doesn't manufacture chips itself, it just
designs them, then sells the designs to other companies who use the
"IP" or "cores" in the designs for what are usually more complex
chips. In the case of the first-gen iPad and iPhone 4, while the
design is based on the Cortex A8, the proper name for the chips is the
Apple A4. The A4 is different in some respects from other Cortex A8
implementations. From Wikipedia:

... the Apple A4 has a 32 KB L1 code cache and a 32 KB data cache. At
64 bytes per cache line, this gives us just 512 cache lines for each
of code and data.
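The arithmetic is worth spelling out, because those 512 lines are the entire budget you get to work with:

```c
/* Apple A4 (Cortex A8 based) L1 cache geometry, per the figures above. */
enum {
    kCacheBytes    = 32 * 1024,                /* 32 KB each, code and data */
    kLineBytes     = 64,                       /* Cortex A8 cache line      */
    kLinesPerCache = kCacheBytes / kLineBytes  /* = 512 lines per cache     */
};
```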

Thus, rather than assuming that programmer time is too valuable to
take pride in your work, you should be assuming that your code and
data must make the very best use possible of just 512 available
64-byte cache lines for each of your code and data. Here is a
graphical diagram of how many cache lines are available in each cache:

I didn't have much of a clue about Objective-C, Cocoa or Cocoa Touch
when I first started writing Warp Life. I learned all about them as I
went along. Much of my work has focussed on making Warp Life run
faster, as well as to use less memory. But because I didn't really
know what I was doing when I started coding, my early implementations
were quite half-baked.

A few nights ago I decided to put quite a lot of time and effort into
refactoring the code so as to reduce my executable size. I know a lot
of algorithmic improvements that will speed it up dramatically, but
I'm putting those off until my refactoring is complete.

At the start of refactoring, Warp Life's executable was about 395 kb.
I've done most of the refactoring that I can, with the result being
that its executable is about 360 kb. That's about a nine percent
reduction in code size that resulted from twelve hours of work or so.
I assert that is a good use of my time.

One more thing to consider is that unless you use some kind of
profile-directed optimization, or you create a linker script manually,
the linker isn't that intelligent about laying out your code in your
executable file.

What that means is that uncommonly-used code will quite often be
placed in the same virtual memory page as commonly used code. Even if
that uncommon code is never called, so that it doesn't blow away your
code cache, your program will be using more physical memory than would
be the case if your executable were smaller.

The simple fix is to refactor your source so that as many of your
methods as possible compile to less executable code. Not as easy, but
far more effective, is to create a linker script that places all your
less-commonly used code together at the end of your executable file,
with more commonly-used code at the beginning.
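On Apple platforms the usual mechanism for this is not a full linker
script but an order file handed to ld with the -order_file flag (the
ORDER_FILE build setting in Xcode). A sketch - the symbol names here
are made up for illustration:

```
# Hot paths first: these land in the first pages of __TEXT,__text.
_updateGrid
_drawCells
# Rarely-run code last, so its pages may never be faulted in at all.
_exportToPasteboard
-[SettingsController resetDefaults]
```

Symbols not listed in the order file are laid out after the listed
ones, in the linker's default order.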

With most of that code at the end of your file, large portions of your
program will never be paged in from disk or Flash storage at all.
Also, despite the fact that iOS doesn't have backing store, it DOES
have virtual memory.

The Cortex A8 has hardware memory management, so executable code is
read into memory by attempting to jump into an unmapped memory region.
That causes a page fault, which saves all the registers on the stack -
the A8 has quite a few registers - and enters the kernel. The kernel
eventually figures out that the faulting access really is valid, so a
page table entry is allocated in the kernel's memory, the Flash
storage page that entry refers to is read into physical memory, and
the page fault exception returns, with all of those registers being
restored from the stack.

If your code is smaller, and you place less-frequently used code all
in one place, the colossal overhead that results from reading
executable code by generating a page fault won't happen so much, your
user's device will spend more of its time and battery power running
your App instead of executing kernel code, the battery charge will
last longer, and the kernel will allocate fewer memory page tables.

> I've been trying to beat these arguments into the heads of my fellow
> coders since most of you lot were in diapers. Just about always the
> responses are that "Premature Optimization is the Root of All Evil,"
> as well as that programmer time is too valuable to optimize.

I haven't really been following this, but there are definitely good insights in your response. To add to your comment, premature is in the timing of the developer, and it never hurts to be thinking along those lines.
--
Gary L. Wade (Sent from my iPhone)
http://www.garywade.com/

> I've been trying to beat these arguments into the heads of my fellow
> coders since most of you lot were in diapers. Just about always the
> responses are that "Premature Optimization is the Root of All Evil,"
> as well as that programmer time is too valuable to optimize.
>
> But at the same time, the vast majority of the software I use on a
> daily basis is huge, bloated and slow. It has been my experience for
> many years that it is simply not possible for me to purchase a
> computer with a fast enough CPU, memory, network or disk drive, or
> with enough memory or disk that the computer remains useful for any
> length of time.

Can you confirm that the reason the particular software you use is huge, bloated, and slow is specifically because of objc_msgSend() and not shortcomings in the application's design or the addition of new, resource-intensive features to take advantage of the more powerful hardware? The "premature optimization" response is a colloquial expression of the philosophy that effort should be spent where performance metrics dictate it should.

> How much do you all know about how hardware engineers try to make our
> software run faster?

Probably more than you realize, going by your well-meaning lecturing, interesting and informative as it is. :) These are topics that have been mulled ad nauseam. The precarious balance between efficiency and productivity is as old as software development itself. The choice of Objective-C is an acceptable tradeoff of performance for productivity and flexibility.

If there is no direct evidence of user experience being negatively impacted, programmer productivity wins, because shipping software on time is as important a concern as anything else in commercial development. You're fretting over optimal memory access patterns for A8 cache lines and executable size when there is a vast spectrum of apps with varying resource requirements, for many of which such low-level information is ultimately meaningless. This goes back to the anti-premature optimization philosophy. You make specific determinations for each project and respond accordingly, because it's not going to matter if some fart app or note-taking app doesn't cache a property value. They may even use KVO extensively, bringing further overhead. If it's not sufficiently measurable in real-world usage of the app, it doesn't actually matter.

> Can you confirm that the reason the particular software you use is huge, bloated, and slow is specifically because of objc_msgSend() and not shortcomings in the application's design or the addition of new, resource-intensive features to take advantage of the more powerful hardware? The "premature optimization" response is a colloquial expression of the philosophy that effort should be spent where performance metrics dictate it should.

Did you ever use any of the early releases of Mac OS X? It was so
slow as to be largely unusable until 10.3 (Panther). Mac OS X had
been shipping to end-users for quite a long time before most end-users
gave up Classic completely. Professional audio in particular just
didn't work in OS X for quite a long time, so serious audio people had
to stick with Classic.

In particular, Xcode's text editor used to drive me absolutely
bananas. I was told that was because it used the Cocoa text widget.
By contrast, the CodeWarrior and TextWrangler text editors have always
been snappy and responsive. It is only recently that the
responsiveness of Xcode's text editor doesn't make me feel like I'm
pounding nails with my fists.

Having a harmoniously-designed architecture with easily understandable
class interfaces is nice and all, but it's no substitute for
performant code. There's more to that "Premature Optimization" quote,
let me Google it up for y'all:

"We should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil. Yet we should not pass
up our opportunities in that critical 3%. A good programmer will not
be lulled into complacency by such reasoning, he will be wise to look
carefully at the critical code; but only after that code has been
identified" — Donald Knuth

So many quote Knuth's advice as the reason one should not optimize,
but his classic Art of Computer Programming texts from the 1960s
examine a whole bunch of algorithms in a lot more detail than do most
modern algorithms books.

It is my experience that very few of the coders I have ever met pay
any attention at all to hardware or system architecture when profiling
or optimizing their code.

For example, when Bjarne Stroustrup first designed the C++ language,
it was thought that easy-to-use inline functions would greatly speed
up any program. But they actually slow down your code when they
interfere with cache utilization. I have just as many gripes -
actually more - with my colleagues' C++ as I do with the Objective-C
that I've seen. Overuse of inlines, as well as poor implementations
of them, is a major gripe, but just one of many.
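The failure mode is easy to sketch in plain C (these functions are
invented for illustration): inline pastes a copy of the body at every
call site, so inlining anything but small, hot helpers multiplies the
code the instruction cache has to hold.

```c
/* Good inline candidate: a tiny body, called from hot loops, where
 * the call overhead may exceed the cost of the work itself. */
static inline int clamp_byte(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

/* Poor inline candidate: if a large body like this is marked inline
 * and called from many sites, it gets duplicated at each one,
 * crowding real working-set code out of those 512 cache lines.
 * Left as a plain out-of-line function, there is exactly one copy. */
static int expensive_transform(int v) {
    int acc = v;
    for (int i = 0; i < 16; i++)  /* stand-in for a big body */
        acc = clamp_byte(acc * 3 - i);
    return acc;
}
```

Modern compilers make their own inlining decisions anyway; the gripe
is with code that forces large bodies inline in headers and defeats
the heuristics.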

Many real-time systems do not have memory cache at all. While cached
memory makes it cheaper to make your program go faster if you access
it the right way, it also makes it a lot harder to predict how long
your tasks will take to complete. That's not cool if someone will get
killed if your device doesn't respond according to the designer's
specifications!

Embedded developers deal with that problem by not using cached memory,
preferring hand-rolled assembly code instead. The ARM Holdings
website has oodles of technical articles about how to do that, as do
Intel's and AMD's websites.
--
Don Quixote de la Mancha
Dulcinea Technologies Corporation
Software of Elegance and Beauty
http://www.dulcineatech.com
<quixote...>

> Despite that I implemented my own most time-critical routine in C,
> objc_msgSend takes up two percent of my run time. I expect it would
> be a lot more if I didn't implement that grid update routine in C.

Definitely. Time-critical inner loops should be written to be as fast as possible, and avoiding message dispatch is a good idea there.

The tricky part is that, in a lot of apps, it’s not always clear what part of the code is going to be time-critical. The answer is obvious in a cellular automata simulator, but not in many other things. Thus Knuth’s dictum that “premature optimization is the root of all evil”. Typically you initially code without considering optimization (unless you’re really sure, as in your case), then profile and optimize based on that.

> End-user time is even more expensive than programmer time.

Yes, but that only comes into play once an end user installs your app. Programmer time is very significant _before_ that, if it either prevents the project from shipping at all, or delays shipping enough that you can’t keep up with competitors, or causes the resulting product to have insufficient testing. There are many, many examples of products that failed to get customers because of this.

(A classic example is WriteNow, an early word processor for the Mac. It was coded in assembly by a team who could and did count clock cycles in their heads, so it was insanely small and fast. Unfortunately they couldn’t keep up with other word processors (especially Word), back when they were still adding genuinely useful features like paragraph styles and image wrapping, so they lost out in the market after a few good years.)

> That selector table is fscking HUGE! My iOS App isn't particularly
> big, but Instruments tells me that just about the very first thing
> that my App does is malloc() 300,000 CFStrings right at startup.

Selectors are not CFStrings. They’re uniqued C strings and they’re mapped directly from the read-only TEXT section of the executable.

> If the code and data are not already cached, then objc_msgSend will be
> very, very slow compared to a C subroutines call.

It’s slower than a C call or C++ dispatch, but it’s surprisingly fast. It has been obsessively optimized for a very long time by some very smart people. (Recent OS’s do in fact use a C++-like vtable dispatch for some common classes like NSString.)

I do agree that it’s worth avoiding message-send overhead in some common situations, and I’m not fond of the “use property accessors everywhere” approach. I’ll use the ivar directly if it’s a RHS access or if it’s a scalar value.

> I've been trying to beat these arguments into the heads of my fellow
> coders since most of you lot were in diapers.

Cards on the table: I’m 46 and have been coding since the days of the Altair and IMSAI. But I try to avoid the lengthy “back in my day we had to walk uphill both directions in the snow” rants, even if I’m entitled to them ;-) No one reads all the way to the end, anyway.

Cocoa uses Objective-C for I/O and user-interaction libraries, to respond to events that take place over periods of milliseconds.

For those purposes, I can't weep for calls that take hundreds of nanoseconds instead of tens. If you need C/C++ speeds, by all means use those languages and write your ivars directly.

In my experience, obsessing on objc_msgSend for performance optimization is a fool's errand. Filter out system-library samples (they get reattributed to your own code). That will tell you where to optimize _your_ code. If that means factoring message sends out of inner loops, at least you'll know.

> Did you ever use any of the early releases of Mac OS X? It was so
> slow as to be largely unusable until 10.3 (Panther).

I was working on OS X at that time. The main reasons for poor performance weren’t micro-level stuff like objc_msgsend. They were very large-scale issues that only became apparent after doing major profiling. A lot of this was because many Carbon calls that had been extremely cheap in the classic OS were now expensive (such as font stuff, because it had to make IPC calls to the font server) and the code that called them hadn’t yet been updated to take that into account. Lots of that optimization went on in Puma (10.1) which was dramatically faster than the unfortunately-named Cheetah (10.0).

It also took a while for RAM availability to catch up with what the OS was comfortable with; early versions of the OS really required at least twice as much RAM as it said on the box, or they’d spend a lot of time paging.

> Professional audio in particular just
> didn't work in OS X for quite a long time, so serious audio people had
> to stick with Classic.

That was more about bugs and the overhead of developers switching to the all-new CoreAudio API than performance. The CoreAudio folks had already been plenty obsessive about latency during OS X development, all the way down to the kernel level; I used to hang out with the tech lead who had all kinds of interesting stories about it.

I appreciate the arguments you’re making, but you’re claiming more authority than you seem to have, based on factual accuracy about OS X and Obj-C.

Clearly if you are writing an operating system or other
performance-critical software, it's exceedingly important to optimize
things as far as possible. One reason that the iPhone and iPad have
fared so well against competition has to be the UI responsiveness, which
in no small part derives from the performance engineering the good folks
at Apple have put into the OS, Core Animation, etc.

One app I maintain has an interactive animated interface that involves
on-the-fly shadows, reflections, etc. You can bet that I spent a lot of
time addressing threading, graphics processing, etc., so that everything
responds smoothly even on an original iPhone. Yes, I rewrote large
portions in straight C.

But it's all about trade-offs and comparative advantage. I agree that
user time is more expensive than developer time. However, "user time"
is not always optimized by increasing an app's raw performance. Along
the lines of comments I made to April in a different thread, if the user
is better served by the developer devoting resources to issues such as
interface improvement, new features, smarter application logic, even
customer support, that is where resources should be allocated.

Or suppose you have an app backed by a subscription web service; time
spent *server-side* optimizing bandwidth and storage lowers hosting
cost, allowing a lower subscription fee, AND may also pay dividends to
the user in speed and battery life of the client device.

In other words, in a world of finite resources (a fact with which you
clearly agree), for a great majority of developers, dwelling on the
performance characteristics of objc_msgSend() is likely not the best way
to create the happiest users.

> According to Instruments, my iOS App now spends about half its time in
> a C (not Objective-C) void function that updates the state of a
> Cellular Automaton grid, and the other half in a low-level iOS Core
> Graphics routine that fills rectangles with constant colors.
>
> Despite that I implemented my own most time-critical routine in C,
> objc_msgSend takes up two percent of my run time.

Do you know how much of your run time is in the overhead of preparing and making C function calls?

> I expect it would be a lot more if I didn't implement that grid update routine in C.

Probably true. The tight integration with C and C++ code is one of Objective-C's great strengths over other high-level object-oriented languages.

> That selector table is fscking HUGE! My iOS App isn't particularly
> big, but Instruments tells me that just about the very first thing
> that my App does is malloc() 300,000 CFStrings right at startup. I
> expect each such CFString is either a selector, or one component of a
> selector that takes multiple paramaters.

The selector table has no CFStrings in it. The selector table is typically smaller than 300,000 entries. The selector table is unused in nearly all Objective-C method calls. You should use Instruments to find out why your app is allocating 300,000 CFStrings at launch.

> A few nights ago I decided to put quite a lot of time and effort into
> refactoring the code so as to reduce my executable size. I know a lot
> of algorithmic improvements that will speed it up dramatically, but
> I'm putting those off until my refactoring is complete.
>
> At the start of refactoring, Warp Life's executable was about 395 kb.
> I've done most of the refactoring that I can, with the result being
> that its executable is about 360 kb. That's about a nine percent
> reduction in code size that resulted from twelve hours of work or so.
> I assert that is a good use of my time.

Were you able to measure a performance improvement?

The dyld shared cache on my Mac is 300 MEGABYTES. And that's just the system's shared libraries for a single CPU architecture, never mind the kernel or drivers or applications. Is it bloated and inefficient? No, it's big because it does lots of stuff. Do you propose that we hand-tune the assembly code and memory cache behavior of all of it?

We do in fact perform that level of performance optimization on the most time-sensitive code. For example, objc_msgSend on x86_64 is carefully arranged to fit as well as possible into current CPUs' decode and execution units. Much of the rest of the system also gets performance attention, just not at that level of detail. But for almost everything, "make the computer do something that it already does but faster" needs to be balanced against "make the computer do something new".

> On Wed, Nov 16, 2011 at 10:29 AM, Preston Sumner <preston.sumner...> wrote:
>> Can you confirm that the reason the particular software you use is huge, bloated, and slow is specifically because of objc_msgSend() and not shortcomings in the application's design or the addition of new, resource-intensive features to take advantage of the more powerful hardware? The "premature optimization" response is a colloquial expression of the philosophy that effort should be spent where performance metrics dictate it should.
>
> Did you ever use any of the early releases of Mac OS X? It was so
> slow as to be largely unusable until 10.3 (Panther). Mac OS X had
> been shipping to end-users for quite a long time before most end-users
> gave up Classic completely. Professional audio in particular just
> didn't work in OS X for quite a long time, so serious audio people had
> to stick with Classic.

Can you confirm that those issues were due to objc_msgSend() and not the immaturity and newness of system frameworks and subsystems? Professional audio adoption wasn't an Objective-C issue; DAWs were Carbon apps anyway. Concern over message dispatch performance is premature unless there's a perceptible impact. If your code is time critical, you'll be avoiding high-level features regardless of language.

> You all speak as if you think I'm a clueless newbie, but I was a
> "White Badge" Senior Engineer in Apple's Traditional OS Integration
> team from 1995 through 1996. For most of that time I worked as a
> "Debug Meister", in which I isolated and fixed the very-most serious
> bugs and performance problems in the Classic Mac OS System 7.5.2 - the
> System for the first PCI PowerPC macs - and 7.5.3.

So which were you, Kon or Bal?

(Those of us with extremely long memories might get the reference ;-)