What’s Eating OOP?
Repost from altdevblogaday. This was also, as far as I know, my first blog post to be reposted on reddit and Hacker News, and on reddit especially the comments were sort of brutal… oh, internets. Anyway, I’d suggest heading over to altdevblogaday to read the comments when you’re done with the article.
It has been commonplace over the past few years to bash Object Oriented Programming. Functional programming going mainstream. Data oriented design becoming commonplace for performance. The resurgence of dynamic languages. OO bastions going multi-paradigm. Why is everything going wrong for traditional OOP?
Because, although it took a while, we realized that canonical OOP sucks. Let’s look at .NET’s humble System.Diagnostics.Process class.
- The sin of statefulness. ProcessStartInfo, the mutable type that represents the filename, args, std io, and other state of the Process, has 20 mutable (get/set) properties. The Process type itself has over 50 properties (mostly immutable). The problem here is that the Process itself transitions between three states (not started, running, and finished), and only a subset of properties is valid at any given time. This whole situation is impossible to reason about: you either need to read the extensive tests that would have to be written to cover all the combinations of state, or step through it in the debugger to know what’s going on.
- Inheritance. This situation is bad enough. But have you ever seen someone subclass Process? I have, a few times, and it makes things even more impossible to reason about. You presumably subclass it to ensure certain state is set up by default, such as FileName. What if someone mutates that default, though? You can allow it, which makes your class sort of pointless and breaks its invariant (FileName won’t change). Or you can disallow it, by raising an exception or, even worse, by silently returning. Either way of disallowing it breaks the fundamental contract of your base class and the Liskov Substitution Principle, because you are quite clearly changing the behavior the base class promises. There’s no point in inheriting from stateful objects like this, but that is canonical OOP.
- Code reuse through inheritance/polymorphism. Obviously code reuse is a good thing. The problem is the way OOP encourages it: through polymorphism via inheritance. Process does not implement any interfaces. You could not pass Process to a method or class that, say, is responsible for managing IO and std streams in general, not just for Process. Actually, this isn’t a big problem: either wrap the Process in something (don’t subclass it!), or pass in only the actual data/methods needed. The ease of getting around this quite clearly demonstrates that, if you were to take away inheritance, it really wouldn’t be such a big deal, would it?
- Messy contracts and abstractions. What are the contracts on Process? Good luck trying to figure them out by reading the documentation (which is extensive). I think everyone has put an asynchronous process into a deadlock, even when following Microsoft’s directions. Understanding how to use Process still requires a pretty thorough understanding of the underlying system, and it ends up in a no-man’s land between simplicity and power. These messy (not just leaky) abstractions are the major problem when consuming other people’s code. I can’t count how many 3rd party modules I’ve seen crash or cause problems, and that is if they have a reasonable enough API to figure out in the first place.
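As a rough illustration of the alternative to the points above, here is a minimal Python sketch (not .NET's API, and not how Process is actually implemented). Each of the three states gets its own immutable type, so a property such as an exit code exists only where it is valid, and a default file name comes from a plain function instead of a subclass. The names NotStarted, Running, Finished, and python_snippet are hypothetical and made up for this example; only subprocess.Popen and its communicate() method are real standard-library calls.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NotStarted:
    """Immutable launch description (hypothetical type, for illustration)."""
    filename: str
    args: Tuple[str, ...] = ()

    def start(self) -> "Running":
        popen = subprocess.Popen(
            [self.filename, *self.args],
            stdout=subprocess.PIPE,
            text=True,
        )
        return Running(spec=self, popen=popen)


@dataclass(frozen=True)
class Running:
    """Only a running process has a pid or can be waited on."""
    spec: NotStarted
    popen: subprocess.Popen

    @property
    def pid(self) -> int:
        return self.popen.pid

    def wait(self) -> "Finished":
        # communicate() drains stdout while waiting, so a full pipe
        # cannot leave parent and child blocked on each other.
        output, _ = self.popen.communicate()
        return Finished(
            spec=self.spec,
            exit_code=self.popen.returncode,
            output=output,
        )


@dataclass(frozen=True)
class Finished:
    """Only a finished process has an exit code and captured output."""
    spec: NotStarted
    exit_code: int
    output: str


def python_snippet(code: str) -> NotStarted:
    """A 'default file name' without subclassing: a function that builds a value."""
    return NotStarted(filename=sys.executable, args=("-c", code))
```

Under these assumptions, `python_snippet("print('hi')").start().wait()` yields a Finished value, while asking a NotStarted value for an exit code fails immediately with an AttributeError instead of depending on hidden state.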
I’m aware I’m picking on Process here. It is a .NET 1.0 type, and the .NET framework (and programming in general) has matured immeasurably since. I’m sure if the team were to do it again, they would do it quite differently. Launching a process sounds simple but is obviously not easy technically: look at the dozens of ways Python had to launch a process, until subprocess.Popen consolidated them into something wonderfully simple yet powerful. But that’s another good point, isn’t it? Even Microsoft, who are supposed to be the leaders in these things (they are the ones training people and publishing the guides), ‘get it wrong,’ if it’s even possible to get right (it isn’t). How is Sammy the Scripter supposed to learn these lessons easily? He won’t. It will take him years, and he’s not going to learn it from OOP; he’s going to learn it (like the C# team did) from other languages and concepts. But this whole time, we’re telling him these fallacies about the wonders of OOP, with inheritance, polymorphism, code reuse, abstraction, patterns, and every other buzzword.
So what are we gonna do? Well, the first thing is to throw out ideological purity when it comes to OOP. The language designers are way ahead of us. Dynamic languages like Python and Ruby have long been multi-paradigm. C# has been making big strides in the area, with anonymous methods in 2.0, lambdas in 3.0 (shipped with .NET 3.5), and even dynamic typing support in 4.0. Java and even C++ are following suit. On the opposite end of the spectrum, people are also taking hints from Eiffel, the most thorough and pure OO language around, with things like .NET’s Code Contracts.
We’re still lagging behind with education (the education we give at work, not just universities). We need to expand our toolbox by looking at other languages and other concepts. We need to throw out much of the traditional OOP approach we’ve taken that hasn’t worked. (As a commenter pointed out, ideological purity is an aid for new people, but we too often label it as best practices.) But I also don’t want to throw the baby out with the bathwater and start declaring that OOP is dead, or all-around inferior. The practical applications of OOP languages (and not necessarily their ideological underpinnings) make them natural for multi-paradigm implementations, and this is something I think it’d be hard to say of procedural, or even functional, languages.
I’d love to see us start to branch out in how we educate and teach to include these non-OO concepts, so we can better use the generally excellent OO languages available. Let’s take the lack of state from functional programming. That’s easy enough to do. Let’s take the modularity and specificity of data oriented design solutions. Not everything has to fit into some grand, reusable abstraction. Let’s be honest about the fact that most of our code does a particular thing and isn’t reused. Let’s take design by contract from Eiffel, and stress how important contracts are for a clear and well abstracted API. Let’s take duck typing from dynamic languages, so we don’t have to write a new interface to use our code somewhere. Interfaces are great, except when you want some small overlap or subset of functionality: look at how, even though .Add isn’t part of .NET’s IEnumerable, it gets special treatment by the compiler. On the other hand, let’s not forget that formal interfaces are important, and make sure we have those (like ABCs in Python).
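To make a few of these borrowings concrete, here is a small Python sketch under stated assumptions. It shows duck typing through a structural typing.Protocol, a formal interface through an abstract base class, and a stateless function whose contract is checked explicitly. SupportsAdd, add_all, Tally, Collector, ListCollector, and clamp are all hypothetical names invented for illustration; Protocol, ABC, and abstractmethod are the real standard-library pieces.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List, Protocol


class SupportsAdd(Protocol):
    """Duck type: any object with a compatible add() method qualifies."""

    def add(self, item: int) -> None: ...


def add_all(target: SupportsAdd, items: Iterable[int]) -> None:
    """Works with a built-in set or any class that happens to define add()."""
    for item in items:
        target.add(item)


class Tally:
    """Never mentions SupportsAdd, yet satisfies it structurally."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, item: int) -> None:
        self.total += item


class Collector(ABC):
    """Formal interface: instantiation fails unless add() is implemented."""

    @abstractmethod
    def add(self, item: int) -> None: ...


class ListCollector(Collector):
    def __init__(self) -> None:
        self.items: List[int] = []

    def add(self, item: int) -> None:
        self.items.append(item)


def clamp(value: int, low: int, high: int) -> int:
    """Stateless function with its contract spelled out in code."""
    if low > high:  # precondition
        raise ValueError("low must not exceed high")
    result = max(low, min(value, high))
    assert low <= result <= high  # postcondition
    return result
```

The same add_all call accepts a set, a Tally, or a ListCollector, which is the "small overlap of functionality" case, while Collector remains available for the places where a declared, enforced interface is worth the ceremony.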
We have most of these things already, because the language designers are really quite smart people and are way ahead of where the mainstream usage and understanding of these concepts are. We just need to start using and teaching them more intelligently. Maybe it is a PR thing? Stop calling our languages ‘object oriented’ and take the focus off of the ‘4 principles’, and start teaching people how to program effectively using a variety of paradigms.
Likewise, I’d like to see caution when talking about the style-a-la-mode, whether that’s AOP, DOD, FP, whatever, so we don’t start treating it as a golden hammer. As modern programmers, we live in a complex world, and it is our duty to continually educate ourselves and others using all the information we can find.
looking forward to your next post! it’s been a while :)
Thanks Grak. Work’s been crazy so I haven’t been able to keep up, but it should be slowing down soon and I can get back to blogging. It means a lot when people say they like my posts, so thanks!
https://bugs.php.net/bug.php?id=55617
In my mind, we PHP developers can throw away the concept of the object (class instance) entirely; we only need arrays and mode classes:
All arrays in the initial mode support any array function as a method:
array_flip(this);
?>
Use “->mode()” to validate the minimal data set, and then switch mode class:
mode('class1', $success);
?>
A mode class has no “construct()”, but it has “validate()” to validate the minimal data set.
An array in a mode can still use array functions as its methods, but after using any of them the array is switched back into basic array mode, and we need to use “->mode('class1', $success);” to switch the mode back.
The radical idea is data-centric programming: we need to separate the data (array) from the activity (class method).
We could modify the PHP engine to get rid of parts of OO (object orientation) and support mode classes; we could call it MyPHP.
For example, $array_man1 could be set into two modes, cls_normal_man and cls_crazy_man:
mode('cls_normal_man')->normal_method1()->mode('cls_crazy_man')->crazy_method1();
?>
Diyism, that’s, um, quite a strange idea. I’m not sure what it would buy you beyond a new paradigm people would be unfamiliar with, and I’m not sure how many practical benefits it would really bring. Isn’t this just the pendulum swinging far back and ‘throwing the baby out with the bathwater’? Just because you can figure out a way to get rid of OO doesn’t mean you should; it is quite useful when used correctly.