Is there a generally agreed upon definition for what a programming abstraction is, as used by programmers? [Note, programming abstraction is not to be confused with dictionary definitions for the word "abstraction."] Is there an unambiguous, or even mathematical definition? What are some clear examples of abstractions?


8

What do you mean by "mathematically"? I wouldn't really think of abstraction as a mathematical concept.
– Fishtoaster, Nov 1 '10 at 17:23

2

@mlvljr: I'm sorry, I'm still not sure I follow. Abstraction is just the practice of providing a simpler way of dealing with something. I don't see how formal tools/methods have anything to do with it.
– Fishtoaster, Nov 1 '10 at 17:32

1

@mlvljr, do you want a math example, or math that covers all programming abstractions? I don't think the latter exists.
– C. Ross, Nov 1 '10 at 17:33

25 Answers
25

The answer to "Can you define what a programming abstraction is more or less mathematically?" is "no." Abstraction is not a mathematical concept. It would be like asking someone to explain the color of a lemon mathematically.

If you want a good definition though: abstraction is the process of moving from a specific idea to a more general one. For example, take a look at your mouse. Is it wireless? What kind of sensor does it have? How many buttons? Is it ergonomic? How big is it? The answers to all of these questions can precisely describe your mouse, but regardless of what the answers are, it's still a mouse, because it's a pointing device with buttons. That's all it takes to be a mouse. "Silver Logitech MX518" is a concrete, specific item, and "mouse" is an abstraction of that. An important thing to think about is that there's no such concrete object as a "mouse", it's just an idea. The mouse on your desk is always something more specific - it's an Apple Magic Mouse or a Dell Optical Mouse or a Microsoft IntelliMouse - "mouse" is just an abstract concept.

Abstraction can be layered and as fine- or coarse-grained as you like (an MX518 is a mouse, which is a pointing object, which is a computer peripheral, which is an object powered by electricity), can go as far as you want, and in virtually any direction you want (my mouse has a wire, meaning I could categorize it as an object with a wire. It's also flat on the bottom, so I could categorize it as a kind of object that won't roll when placed upright on an inclined plane).

Object-oriented programming is built on the concept of abstractions and families or groups of them. Good OOP means choosing good abstractions at the appropriate level of detail that make sense in the domain of your program and don't "leak". The former means that classifying a mouse as an object that won't roll on an inclined plane doesn't make sense for an application that inventories computer equipment, but it might make sense for a physics simulator. The latter means that you should try to avoid "boxing yourself in" to a hierarchy that doesn't make sense for some kinds of objects. For example, in my hierarchy above, are we sure that all computer peripherals are powered by electricity? What about a stylus? If we want to group a stylus into the "peripheral" category, we'd have a problem, because it doesn't use electricity, and we defined computer peripherals as objects that use electricity. The circle-ellipse problem is the best known example of this conundrum.
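As a rough illustration of the circle-ellipse problem mentioned above, here is a minimal Python sketch. The class and method names (`Ellipse`, `Circle`, `stretch_x`, `is_round`) are made up for this example; the point is only that an operation that is sensible for the general abstraction can break a property the more specific one relies on.

```python
class Ellipse:
    """An ellipse described by its two semi-axes (hypothetical example class)."""

    def __init__(self, semi_x: float, semi_y: float) -> None:
        self.semi_x = semi_x
        self.semi_y = semi_y

    def stretch_x(self, factor: float) -> None:
        # Perfectly reasonable for a general ellipse.
        self.semi_x *= factor


class Circle(Ellipse):
    """Modelled as 'an ellipse whose semi-axes are equal'."""

    def __init__(self, radius: float) -> None:
        super().__init__(radius, radius)

    def is_round(self) -> bool:
        # The property every circle is supposed to keep.
        return self.semi_x == self.semi_y


c = Circle(1.0)
c.stretch_x(2.0)     # inherited from Ellipse
print(c.is_round())  # prints False: the object is no longer a circle
```

Whether to fix this by making the shapes immutable, dropping the inheritance, or something else is a design decision that depends on the domain, which is the answer's point about choosing abstractions that fit.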

I think I see; you are specifically talking about referring to a method or function in an abstract way, i.e. by its contract as opposed to its implementation. Your statements are correct, and this is a perfectly valid use of the term; a method call is an abstraction of some kind of concrete behavior.
– nlawalker, Nov 1 '10 at 22:35

3

@nlawalker: You're mixing up abstraction with generalization. They're not the same thing. What you're describing is the latter ('moving from a specific idea to a more general one'). Abstraction is moving from concrete things to abstract things, e.g. having 7 blue and 7 red marbles, and saying 'I have two sets with the same number of same-colored marbles': here I am moving from concrete things (marbles) to abstract things (classes and equivalent sets). BTW, a natural number n is the class of all equivalent sets of cardinality n, defined non-circularly by one-to-one mappings between those sets.
– pillmuncher, Nov 13 '10 at 0:52

19

The color of a lemon, mathematically, is light with a wavelength of approximately 570 nm.
– Erik, Dec 14 '10 at 22:04

Given two partially ordered sets G and H, a Galois connection (alpha, beta) can be defined between them, and one can be said to be a concretization of the other; reverse the connection, and one is an abstraction of the other. The functions are a concretization function and an abstraction function.

This is from the theory of abstract interpretation of computer programs, which to date has mostly been used as a static analysis approach.
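To make the abstraction and concretization functions a little more tangible, here is a small Python sketch of the "sign" domain, a common textbook example in abstract interpretation. All names here (`Sign`, `alpha`, `in_gamma`, `abstract_mul`) are chosen for this illustration and are not taken from the answer; the concretization is represented only as a membership test, because the concrete sets are usually infinite.

```python
from enum import Enum
from typing import Iterable


class Sign(Enum):
    BOTTOM = "bottom"  # describes no integers at all
    NEG = "-"
    ZERO = "0"
    POS = "+"
    TOP = "top"        # describes every integer


def sign_of(n: int) -> Sign:
    if n < 0:
        return Sign.NEG
    if n > 0:
        return Sign.POS
    return Sign.ZERO


def alpha(values: Iterable[int]) -> Sign:
    """Abstraction function: the most precise Sign covering every value."""
    signs = {sign_of(v) for v in values}
    if not signs:
        return Sign.BOTTOM
    if len(signs) == 1:
        return signs.pop()
    return Sign.TOP


def in_gamma(n: int, s: Sign) -> bool:
    """Membership test for the concretization of s."""
    if s is Sign.TOP:
        return True
    if s is Sign.BOTTOM:
        return False
    return sign_of(n) is s


def abstract_mul(a: Sign, b: Sign) -> Sign:
    """Multiplication carried out on signs instead of on numbers."""
    if Sign.BOTTOM in (a, b):
        return Sign.BOTTOM
    if Sign.ZERO in (a, b):
        return Sign.ZERO
    if Sign.TOP in (a, b):
        return Sign.TOP
    return Sign.POS if a is b else Sign.NEG
```

A static analyser built this way can conclude, for instance, that the product of two negative values is positive without ever knowing the actual numbers, at the price of sometimes answering `TOP` ("don't know").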

Abstraction means focusing more on What and less on How. Or, you can say: know only the things you need to, and trust the provider for all other services. It sometimes even hides the identity of the service provider.

For example, this site provides a system for asking questions and answering them. Almost everyone here knows the procedures for asking, answering, voting, and the other things this site offers. But very few know what the underlying technologies are, such as whether the site was developed with ASP.NET MVC or Python, or whether it runs on a Windows or a Linux server. That is none of our business. So this site keeps an abstraction layer between us and the underlying mechanism that provides the service.

Some other examples:

A car hides all its mechanisms but provides its owner with a way to drive, refuel and maintain it.

Any API hides all its implementation details providing the service to other programmers.

A class in OOP hides its private members and implementation of public members providing the service to call the public members.

When you use an object through an interface or abstract class type in Java or C++, the real implementation is hidden. And not just hidden: the implementations of the methods declared in the interface are also likely to differ across the implementing/inheriting classes. But as you are getting the same service, you don't need to bother about How it is implemented or exactly Who/What is providing the service.

Identity Hiding: For the sentence "I know Sam can write computer programs," the abstraction can be: "Sam is a programmer. Programmers know how to write computer programs." In the second statement, the person is not important. But his ability to do programming is important.

You have a formal system, you propose a thought about that system. You do a proof of it, and if it works out, then you have a theorem. Knowing that your theorem holds, you can then use it in further proofs about the system. Primitives provided by the system (like if statements and int value types) would typically be seen as axioms, although that isn't strictly true since anything that isn't CPU instructions written in machine code is a kind of abstraction.

In functional programming, the idea of a program as a mathematical statement is very strong, and often the type system (in a strongly, statically typed language like Haskell, F#, or OCaml) can be used for testing theoremhood through proofs.

For example: let us say that we have addition and equality checking as primitive operations, and integers and booleans as primitive data types. These are our axioms. So we can say that 1 + 3 == 2 + 2 is a theorem, then use the rules of addition and integers and equality to see if that's a true statement.

Now let us suppose we want multiplication, and our primitives (for brevity's sake) include a looping construct and a means to assign symbolic references. We could suggest that

ref x (*) y := (loop y times {x +}) 0

I'm going to pretend I proved that, demonstrating that multiplication holds. Now I can use multiplication to do more stuff with my system (the programming language).

I can also check my type system. (*) has a type of int -> int -> int. It takes 2 ints and outputs an int. Addition has a type of int -> int -> int so the 0 + (rest) holds as long as (rest) results in an int. My loop could be doing all kinds of things, but I'm saying it outputs a chain of curried functions such that (x + (x + (x... + 0))) is the result. The form of that addition chain is just (int -> (int -> (int ... -> int))) so I know my final output will be an int. So my type system held up the results of my other proof!
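The pseudocode definition of multiplication above can be sketched in Python, using only addition and a loop as the "primitives". The function name `multiply` and the restriction to non-negative `y` are assumptions made for this illustration; the type hints play the role of the `int -> int -> int` signature discussed above.

```python
def multiply(x: int, y: int) -> int:
    """Multiply x by a non-negative y using only addition and a loop."""
    if y < 0:
        raise ValueError("this sketch only handles y >= 0")
    total = 0
    for _ in range(y):
        # Each pass wraps one more "x +" around the running result,
        # building x + (x + (... + 0)).
        total = x + total
    return total


print(multiply(3, 4))  # prints 12
```

Once `multiply` is trusted, later code can use it without re-reading the loop, which is the sense in which it behaves like a proven theorem in the answer's analogy.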

Compound this sort of idea over many years, many programmers, and many lines of code, and you have modern programming languages: a hearty set of primitives and huge libraries of "proven" code abstractions.

For example, a TCP/IP connection is an abstraction over sending data. You merely give an IP address and a port number and hand the data off to the API. You aren't concerned with all the details of the wires, signals, message formats, and failures.
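A minimal sketch of that idea with Python's standard `socket` module follows. To keep it self-contained it starts a throwaway echo server on the loopback interface; the helper name `echo_once` is invented for this example. The client side only ever deals with a host, a port, and bytes.

```python
import socket
import threading


def echo_once(server: socket.socket) -> None:
    """Accept one connection and send back whatever arrives first."""
    conn, _addr = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))


# A throwaway server on the loopback interface; port 0 lets the OS pick a free port.
server = socket.create_server(("127.0.0.1", 0))
host, port = server.getsockname()
threading.Thread(target=echo_once, args=(server,), daemon=True).start()

# The client side: an address, a port, and bytes.
# Nothing here mentions wires, packets, or retransmission.
with socket.create_connection((host, port), timeout=5) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

server.close()
print(reply)
```

The abstraction still leaks in places (timeouts, partial reads, dropped connections), which is why real code usually loops on `recv` and handles errors.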

Well, mathematically, "integer" is an abstraction. And when you do formal proofs like that x+y = y+x for all integers, you're working with the abstraction "integer" rather than specific numbers like 3 or 4. That same thing happens in software development when you interact with the machine at a level above registers and memory locations. You can think more powerful thoughts at a more abstract level, in most cases.
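As a small illustration of proving something about the abstraction "integer" rather than about particular numbers, here is how the commutativity claim might be stated in Lean 4. The theorem name `add_comm_int` is made up for this sketch; it simply reuses the core library lemma `Int.add_comm`.

```lean
-- Stated once for all integers, not for 3 and 4 in particular.
theorem add_comm_int (x y : Int) : x + y = y + x :=
  Int.add_comm x y

-- Any specific pair is then just an instance of the general statement.
example : (3 : Int) + 4 = 4 + 3 := add_comm_int 3 4
```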

You're getting good answers here. I would only caution - people think abstraction is somehow this wonderful thing that needs to be put on a pedestal, and that you can't get enough of. It is not. It is just common sense. It is just recognizing the similarities between things, so you can apply a problem solution to a range of problems.

Permit me a peeve...

High on my list of annoyances is when people speak of "layers of abstraction" as if that's a good thing. They make "wrappers" around classes or routines they don't like, and call them "more abstract", as if that will make them better. Remember the fable of the "Princess and the Pea"? The princess was so delicate that if there was a pea under her mattress she wouldn't be able to sleep, and adding more layers of mattresses would not help. The idea that adding more layers of "abstraction" will help is just like that - usually it doesn't. It just means that any change to the base entity has to be rippled through multiple layers of code.

If a change in one place makes you have to make multiple changes elsewhere, then your abstractions are bad. In essence, I won't say that I or anyone else has never erred on the side of too much abstraction, but there's a method to that madness. Good abstractions are the cornerstone of loosely coupled code. Given the right abstraction, changes become ridiculously simple. So yes, I do put abstractions on a pedestal and I spend an inordinate amount of time finding the right ones.
– Jason Baker, Nov 2 '10 at 2:12

1

@Jason: "If a change in one place makes you have to make multiple changes elsewhere, then your abstractions are bad." I'm with you there. I seem to be surrounded by bad ones.
– Mike Dunlavey, Nov 2 '10 at 11:50

1

It sounds like you've worked in a place where the devs had grand visions, and there wasn't a strong boss keeping the team focused. When I find myself in an environment like that I start looking for another job (project budgets always over, or small enough company => bankrupt). I saw a tweet recently: 'spaghetti code' vs. 'lasagna code', the latter is when there are too many layers.
– yzorg, Dec 18 '10 at 6:59

I think you might find a blog post of mine on leaky abstractions useful. Here is the relevant background:

Abstraction is a mechanism to help take what is common among a set of related program fragments, remove their differences, and enable programmers to work directly with a construct representing that abstract concept. This new construct (virtually) always has parameterizations: a means to customize the use of the construct to fit your specific needs.

For example, a List class can abstract away the details of a linked-list implementation-- where instead of thinking in terms of manipulating next and previous pointers, you can think on the level of adding or removing values to a sequence. Abstraction is an essential tool for creating useful, rich, and sometimes complex features out of a much smaller set of more primitive concepts.

Abstraction is related to encapsulation and modularity, and these concepts are often misunderstood.

In the List example, encapsulation can be used to hide the implementation details of a linked-list; in an object-oriented language, for instance, you can make the next and previous pointers private, where only the List implementation is allowed access to these fields.

Encapsulation is not enough for abstraction, because it does not necessarily imply you have a new or different conception of the constructs. If all a List class did was give you 'getNext'/'setNext' style accessor methods, it would encapsulate you from the implementation details (e.g., did you name the field 'prev' or 'previous'? what was its static type?), but it would have a very low degree of abstraction.
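The List example can be sketched in Python as follows. The names `_Node` and `LinkedList` are illustrative, and Python's leading-underscore convention stands in for `private`. Callers work with "a sequence of values" (append, remove, iterate), while the next/prev pointer manipulation stays inside the class.

```python
from typing import Iterator, Optional


class _Node:
    """Implementation detail: a doubly linked node."""

    def __init__(self, value: object) -> None:
        self.value = value
        self.prev: Optional["_Node"] = None
        self.next: Optional["_Node"] = None


class LinkedList:
    """Callers think in terms of a sequence of values, not pointers."""

    def __init__(self) -> None:
        self._head: Optional[_Node] = None
        self._tail: Optional[_Node] = None

    def append(self, value: object) -> None:
        node = _Node(value)
        if self._tail is None:
            self._head = self._tail = node
        else:
            node.prev = self._tail
            self._tail.next = node
            self._tail = node

    def remove(self, value: object) -> bool:
        """Remove the first occurrence of value; report whether one was found."""
        node = self._head
        while node is not None and node.value != value:
            node = node.next
        if node is None:
            return False
        # Re-link the neighbours around the removed node.
        if node.prev is None:
            self._head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self._tail = node.prev
        else:
            node.next.prev = node.prev
        return True

    def __iter__(self) -> Iterator[object]:
        node = self._head
        while node is not None:
            yield node.value
            node = node.next
```

A class that only exposed `get_next`/`set_next` on the nodes would hide the field names but would still force callers to think in pointers, which is the low-abstraction case described above.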

Modularity is concerned with information hiding: Stable properties are specified in an interface, and a module implements that interface, keeping all implementation details within the module. Modularity helps programmers cope with change, because other modules depend only on the stable interface.

Information hiding is aided by encapsulation (so that your code does not depend on unstable implementation details), but encapsulation is not necessary for modularity. For example, you can implement a List structure in C, exposing the 'next' and 'prev' pointers to the world, but also provide an interface, containing initList(), addToList(), and removeFromList() functions. Provided that the rules of the interface are followed, you can make guarantees that certain properties will always hold, such as ensuring the data-structure is always in a valid state. [Parnas's classic paper on modularity, for example, was written with an example in assembly. The interface is a contract and a form of communication about the design, it does not necessarily have to be mechanically checked, although that's what we rely on today.]

Although terms like abstract, modular, and encapsulated are used as positive design descriptions, it's important to realize that the presence of any of these qualities does not automatically give you good design:

If an n^3 algorithm is "nicely encapsulated" it will still perform worse than an improved n log n algorithm.

If an interface commits to a specific operating system, none of the benefits of a modular design will be realized when, say, a video game needs to be ported from Windows to the iPad.

If the abstraction created exposes too many inessential details, it will fail to create a new construct with its own operations: It will simply be another name for the same thing.

Ok, I think I figured out what you're asking: "What is a mathematically rigorous definition of 'An Abstraction.'"

If that's the case, I think you're out of luck- 'abstraction' is a software architecture/design term, and has no mathematical backing to it as far as I'm aware (maybe someone better versed in theoretical CS will correct me here), any more than "coupling" or "information hiding" have mathematical definitions.

That said, I don't think his "Abstractness" is the same as yours. His is more a measure of "lack of implementation on a class" meaning use of interfaces/abstract classes. Instability and Distance from Main Sequence probably play more into what you're looking for.

Abstraction is when you ignore details deemed irrelevant in favor of those deemed relevant.

Abstraction encompasses encapsulation, information hiding, and generalization. It does not encompass analogies, metaphors, or heuristics.

Any mathematical formalism for the concept of abstraction would itself be an abstraction, as it would necessarily require the underlying thing to be abstracted into a set of mathematical properties! The category-theory notion of a morphism is probably closest to what you're looking for.

Abstraction is not something you declare, it is something that you do.

As a way of explaining it to another person I would go the other way around, from the outcomes back:

Abstraction in computer programming is the act of generalizing something to the point that more than one similar thing can be generally treated as the same and handled the same.

If you want to expand on that you can add:

Sometimes this is done to achieve polymorphic behavior (interfaces and inheritance) to cut down on repetitive code up front, other times it is done so that the inner workings of something could be replaced at a future date with a similar solution without having to alter the code on the other side of the abstracted container or wrapper, hopefully cutting down on rework in the future.

Merriam-Webster defines abstract as an adjective meaning: disassociated from any specific instance.

An abstraction is a model of some system. They often list a group of assumptions that have to be met for a real system to be able to be modeled by the abstraction, and they are often used to allow us to conceptualize increasingly complicated systems. Going from a real system to an abstraction does not have any formal mathematical method for doing so. That's up to the judgment of whoever is defining the abstraction, and what the purpose of the abstraction is.

Oftentimes, though, abstractions are defined in terms of mathematical constructs. That's probably because they are so often used in science and engineering.

An example is Newtonian mechanics. It assumes everything is infinitesimally small, and all energy is conserved. The interactions between objects are clearly defined by mathematical formulas. Now, as we know, the universe doesn't quite work that way, and in many situations the abstraction leaks through. But in a lot of situations, it works very well.

Another abstract model is typical linear circuit elements, resistors, capacitors, and inductors. Again the interactions are clearly defined by mathematical formulas. For low frequency circuits, or simple relay drivers, and other things, RLC analysis works well and provides very good results. But other situations, like microwave radio circuits, the elements are too big, and the interactions are finer, and the simple RLC abstractions don't hold up. What to do at that point is up to the judgment of the engineer. Some engineers have created another abstraction on top of the others, some replacing ideal op-amps with new mathematical formulas for how they work, others replace ideal op amps with simulated real op-amps, which in turn are simulated with a complex network of smaller ideal elements.

As others have said, it is a simplified model. It is a tool used to better understand complex systems.

You're not making any sense. Providing hints on designing a real system is not a purpose for an abstraction. If you don't know anything about what is being modeled, that's not a problem for an abstraction to solve.
– whatsisname, Nov 1 '10 at 22:11

An abstraction is representing something (e.g. a concept, a data structure, a function) in terms of something else. For example we use words to communicate. A word is an abstract entity that can be represented in terms of sounds (speech) or in terms of graphical symbols (writing). The key idea of an abstraction is that the entity in question is distinct from the underlying representation, just as a word is not the sounds that are used to utter it or the letters that are used to write it.

Thus, at least in theory, the underlying representation of an abstraction can be replaced by a different representation. In practice, however, the abstraction is rarely entirely distinct from the underlying representation, and sometimes the representation "leaks" through. For example, speech carries emotional undertones which are very difficult to convey in writing. Because of this an audio recording and a transcript of the same words may have very different effect on the audience. In other words the abstraction of words often leaks.

Abstractions typically come in layers. Words are abstractions that can be represented by letters, which in turn are themselves abstractions of sounds, which are in their turn abstractions of the pattern of motion of the particles of air that are created by one's vocal cords and that are detected by one's eardrums.

In computer science bits are typically the lowest level of representation. Bytes, memory locations, assembly instructions, and CPU registers are the next level of abstraction. Then we have primitive data types and instructions of a higher level language, that are implemented in terms of bytes, memory locations, and assembly instructions. Then functions and classes (assuming an OO language) that are implemented in terms of primitive data types and built in language instructions. Then more complex functions and classes are implemented in terms of the simpler ones. Some of these functions and classes implement data structures, such as lists, stacks, queues, etc. Those in turn are used to represent more specific entities such as a queue of processes, or a list of employees, or a hash table of book titles. In this scheme of things each level is an abstraction with respect to its predecessor.

If it's leaky, then it's obviously not an abstraction... enough, let's be honest.
– mlvljr, Nov 1 '10 at 19:43

2

@mlvljr Computers aren't math problems, sadly, so you have to allow for some level of leaking. If nothing else, the fact that calculations are being performed on a physical device implies certain constraints against the scope of problems that can be modeled. Technically, the incompleteness theorems imply certain things can't be proven about mathematical systems internally, so even math has "leaky abstractions."
– CodexArcanum, Nov 1 '10 at 19:59

2

You can always find a situation where an abstraction will leak. There is no such thing as a perfect abstraction. It is simply a matter of how much it leaks and whether you can live with it.
– Dima, Nov 1 '10 at 19:59

I would argue that an abstraction is something that hides unnecessary details. One of the most basic units of abstraction is the procedure. For instance, I don't want to worry about how I'm saving data to the database when I'm reading that data in from a file. So I create a save_to_database function.
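Here is one way the `save_to_database` idea might look in Python, using the standard `sqlite3` module. The table name `records`, its single column, and the helper `import_file` are all assumptions made for this sketch. The file-reading code never sees any SQL.

```python
import sqlite3
from typing import Iterable


def save_to_database(conn: sqlite3.Connection, lines: Iterable[str]) -> int:
    """Store each non-empty line; return how many rows were written."""
    conn.execute("CREATE TABLE IF NOT EXISTS records (line TEXT NOT NULL)")
    rows = [(line.strip(),) for line in lines if line.strip()]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO records (line) VALUES (?)", rows)
    return len(rows)


def import_file(path: str, conn: sqlite3.Connection) -> int:
    """Read a text file and hand its lines over; no database details here."""
    with open(path, encoding="utf-8") as handle:
        return save_to_database(conn, handle)
```

If the storage later moves from SQLite to something else, only `save_to_database` has to change, as long as its signature stays the same.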

Abstractions can also be joined together to form bigger abstractions. For instance, functions may be put together in a class, classes can be put together to form a program, programs can be put together to form a distributed system, etc.

I always think of abstraction in programming as hiding details and providing a simplified interface. It is the main reason programmers can break down monumental tasks into manageable pieces. With abstraction, you can create the solution to a part of the problem, including all of the gritty details, and then provide a simple interface to use the solution. Then you can in effect "forget" about the details. This is important because there is no way a person can keep all of the details of a super complex system in their minds at once. This is not to say that the details underneath the abstraction will never have to be revisited, but for the time being, only the interface must be remembered.

In programming, this simplified interface can be anything from a variable (abstracts away a group of bits and provides a simpler mathematical interface) to a function (abstracts away any amount of processing into a single line call) to a class and beyond.

In the end, programmers' main job is usually to abstract away all the computational details and provide a simple interface like a GUI that someone who doesn't know one thing about how computers work can make use of.

Some of the advantages of abstraction are:

Allows a big problem to be broken up into manageable pieces. When adding a person's records to a database, you don't want to have to be messing with inserting and balancing index trees on the database. This work may have been done at some point, but now has been abstracted away and you no longer have to worry about it.

Allows multiple people to work well together on a project. I don't want to have to know all the ins-and-outs of my colleague's code. I just want to know how to use it, what it does, and how to fit it together with my work (the interface).

Allows people who don't have the required knowledge to perform a complex task to do so. My mom can update her Facebook and people she knows all over the country can see it. Without the abstraction of an insanely complex system to a simple web interface, there is no way she would be able to begin to do something similar (nor would I for that matter).

Abstraction can, however, have the reverse effect of making things less manageable if it is overused. By breaking up a problem into too many small pieces, the number of interfaces you have to remember increases and it gets harder to understand what is really going on. Like most things, a balance must be found.

You don't want to care whether the object you're using is a Cat or a Dog, so you go through a virtual function table to find the right makeNoise() function.
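A short Python sketch of the Cat/Dog case follows. The names `Animal`, `make_noise`, and `chorus` are chosen for this example. Python resolves the method by dynamic lookup on the object rather than through a C++-style virtual function table, but the effect for the caller is the same: it never asks which concrete class it has.

```python
from abc import ABC, abstractmethod
from typing import List


class Animal(ABC):
    @abstractmethod
    def make_noise(self) -> str:
        """Return the sound this animal makes."""


class Cat(Animal):
    def make_noise(self) -> str:
        return "Meow"


class Dog(Animal):
    def make_noise(self) -> str:
        return "Woof"


def chorus(animals: List[Animal]) -> List[str]:
    # No isinstance checks: the right make_noise is picked per object.
    return [animal.make_noise() for animal in animals]


print(chorus([Cat(), Dog()]))  # prints ['Meow', 'Woof']
```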

I'm sure this could be applied to 'lower' and 'higher' levels as well - think of a compiler looking up the right instruction to use for a given processor or Haskell's Monads abstracting over computational effects by calling everything return and >>=.
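As a very rough sketch of the monad remark, the two operations (Haskell's `return` and `>>=`) can be imitated in Python for two different "effects". Everything here is an illustrative assumption: the function names, and in particular the use of `None` to stand for failure, which is cruder than Haskell's `Maybe`. The shape of the calling code is the same in both cases even though what happens between the steps differs.

```python
from typing import Callable, List, Optional


# Effect 1: a computation that may fail (None means "failed").
def maybe_unit(x: float) -> Optional[float]:
    return x


def maybe_bind(
    m: Optional[float], f: Callable[[float], Optional[float]]
) -> Optional[float]:
    # Skip the remaining steps once something has failed.
    return None if m is None else f(m)


# Effect 2: a computation with several possible results.
def list_unit(x: int) -> List[int]:
    return [x]


def list_bind(m: List[int], f: Callable[[int], List[int]]) -> List[int]:
    # Run the next step for every possible result so far.
    return [y for x in m for y in f(x)]


def safe_div(a: float, b: float) -> Optional[float]:
    return None if b == 0 else a / b


ok = maybe_bind(
    maybe_bind(maybe_unit(12.0), lambda x: safe_div(x, 3)),
    lambda x: safe_div(x, 2),
)
failed = maybe_bind(
    maybe_bind(maybe_unit(12.0), lambda x: safe_div(x, 0)),
    lambda x: safe_div(x, 2),
)
pairs = list_bind(list_unit(1), lambda x: [x, x + 10])
```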

This is something I actually wanted to blog about for a longer time, but I never got to it. Luckily, I am a rep zombie and there's even a bounty. My post turned out rather lengthy, but here's the essence:

Abstraction in programming is about understanding the essence of an object within a given context.

[...]

Abstraction is not only mistaken with generalization, but also with encapsulation, but these are the two orthogonal parts of information hiding: The service module decides what it is willing to show and the client module decides what it is willing to see. Encapsulation is the first part and abstraction the latter. Only both together constitute full information hiding.

Interesting question. I don't know of a single definition of abstraction that is considered authoritative when it comes to programming. Though other folks have provided links to some definitions from various branches of CS theory or math; I like to think of it in a similar way to "supervenience" see http://en.wikipedia.org/wiki/Supervenience

When we talk about abstraction in programming, we are essentially comparing two descriptions of a system. Your code is a description of a program. An abstraction of your code would also be a description of that program but at a "higher" level. Of course you could have an even higher level abstraction of your original abstraction (e.g. a description of the program in a high-level system architecture vs. the description of the program in its detailed design).

Now what makes one description "higher-level" than another? The key is "multiple realizability" -- your abstraction of the program could be realized in many ways in many languages. Now you might say that one could produce multiple designs for a single program as well -- two people could produce two different high-level designs that both accurately describe the program. The equivalence of the realizations makes the difference.

When comparing programs or designs you must do so in a way that allows you to identify the key properties of the description at that level. You can get into complicated ways of saying a design is equivalent to another, but the easiest way to think about it is this -- can a single binary program satisfy the constraints of both descriptions?

So what makes one level of description higher than the other? Let's say that we have one level of description A (e.g. design documents) and another level of description B (e.g. source code). A is higher level than B because if A1 and A2 are two non-equivalent descriptions at level A, then realizations of those descriptions, B1 and B2 must also be non-equivalent at level B. However, the reverse doesn't necessarily hold true.

So if I can't produce a single binary program that satisfies two distinct design documents (i.e. the constraints of those designs would contradict each other), then source code that implements those designs must be different. But on the other hand, if I take two sets of source code that couldn't possibly be compiled into the same binary program, it still could be the case that the binaries resulting from compiling those two sets of source code both satisfy the same design document. Thus the design document is an "abstraction" of the source code.
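The "multiple realizability" idea can be sketched in Python by treating a checkable specification as the higher-level description and two different functions as its realizations. The names `satisfies_sort_spec`, `insertion_sort`, and `builtin_sort` are invented for this illustration, and a test on sample inputs is of course much weaker than a real design document.

```python
from collections import Counter
from typing import Callable, List, Sequence


def satisfies_sort_spec(
    sort_fn: Callable[[List[int]], List[int]], sample: Sequence[int]
) -> bool:
    """The 'design': output is ordered and is a permutation of the input."""
    result = sort_fn(list(sample))
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and Counter(result) == Counter(sample)


def insertion_sort(items: List[int]) -> List[int]:
    """Realization 1: build the output by inserting each item in place."""
    out: List[int] = []
    for item in items:
        i = len(out)
        while i > 0 and out[i - 1] > item:
            i -= 1
        out.insert(i, item)
    return out


def builtin_sort(items: List[int]) -> List[int]:
    """Realization 2: delegate to the language's own sort."""
    return sorted(items)


sample = [3, 1, 2, 1]
print(satisfies_sort_spec(insertion_sort, sample))  # prints True
print(satisfies_sort_spec(builtin_sort, sample))    # prints True
```

The two functions are clearly different source code, yet both satisfy the same higher-level description, which is the asymmetry the answer uses to say the description is the abstraction.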

+1, But not always I think (think of truly unnecessary details :) ). The initial question is: "Is it about removing details forever, forgetting them temporarily or even utilizing them in some way not knowing that?"
– mlvljr, Nov 6 '10 at 9:16

1

I do not care how many photons have hit my customers today.
– flamingpenguin, Dec 20 '10 at 12:31

For me abstraction is something that doesn't exist "literally", it's something like an idea.
If you express it mathematically, it's not abstract anymore, because mathematics is a language to express what happens in your brain so it can be understood by somebody else's brain. So you can't structure your ideas, because if you do, it's not an idea anymore: you would need to understand how a brain works to express an idea model.

Abstraction is something that allows you to interpret reality into something that can be independent from it. You can abstract both a beach and a dream, yet the beach exists and the dream doesn't. You could say that both exist, but that's not true.

The hardest thing in abstraction is finding a way to express it so that other people can understand it so it can turn into reality. That's the toughest job, and it can't really be done alone: you have to invent a relative model that works on your ideas and that can be understood by somebody else.

To me, abstraction in a computer language should be named "mathematicating" the model. It's about reusing ideas that can be communicated, and that is an enormous constraint compared to what can be achieved abstractly.

To put it simply, atoms are next to each other, but they don't care.
A large set of molecules organised into a human being can understand that he is next to somebody, but he can't understand how; it's just how the atoms positioned themselves into some pattern.

One object that is ruled by a concept, in general, cannot "understand" itself. That's why we try to believe in God and why we have a hard time understanding our brain.

Programming abstractions are abstractions made by someone on a programmatic element. Let's say you know how to build a Menu with its items and stuff. Then someone saw that piece of code and thought, hey, that could be useful in other kinds of hierarchy-like structures, and defined the Component Design Pattern, which is an abstraction of the first piece of code.

Object Oriented Design Patterns are quite a good example of what abstraction is, and I don't mean the real implementation but the way we should approach a solution.

So, to sum up, programming abstraction is an approach that allows us to understand a problem. It is the means to get something, but it isn't the real thing.
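A menu-with-items structure generalised to any tree-like hierarchy might look like the Python sketch below. This assumes the pattern the answer has in mind is the one the Gang of Four catalogue calls Composite; the class and method names (`MenuItem`, `Menu`, `add`, `render`) are made up for the illustration.

```python
from typing import List, Union


class MenuItem:
    """A leaf: something with a label and no children."""

    def __init__(self, label: str) -> None:
        self.label = label

    def render(self, depth: int = 0) -> List[str]:
        return ["  " * depth + self.label]


class Menu:
    """A container that is rendered through the same method as a leaf."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.children: List[Union["Menu", MenuItem]] = []

    def add(self, child: Union["Menu", MenuItem]) -> "Menu":
        self.children.append(child)
        return self  # allow chained calls

    def render(self, depth: int = 0) -> List[str]:
        lines = ["  " * depth + self.label]
        for child in self.children:
            # Leaves and sub-menus are treated uniformly.
            lines.extend(child.render(depth + 1))
        return lines


menu = Menu("File").add(MenuItem("Open")).add(
    Menu("Recent").add(MenuItem("a.txt"))
)
print("\n".join(menu.render()))
```

The same shape works for file systems, organisation charts, or GUI widget trees, which is the step from "a piece of menu code" to a reusable approach.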