To library or not to library, that is the question

I wasn't really sure whether to put this in Coding Help or not, since I wasn't sure where it would otherwise go.

We have a dev that is a strong evangelist of Composer and Packagist. For those not familiar with the PHP ecosystem (thank fuck, you're all thinking), these are the principal package manager and the major repository, respectively, for third-party libraries of PHP code.

Unlike NPM, packages might have dependencies but are generally sane about it - no left-pad here (mostly because we have that shit built in) - so we have actual packages with actual uses and actual sanity. For the most part.

Now my question comes back to this development. I'm generally fairly against bringing in swathes of third party code into an otherwise bespoke platform. I don't object so much to very specialised dependencies that anything I write is going to be a poor imitation of (e.g. PHPExcel), but I have objections about introducing random dependencies for handling tasks.

For example, I discovered recently that a library was brought in to handle CSV files. It hilariously broke on actual CSV files that had a backslash in the data, because this library is just a huge wrapper around PHP's own built-in CSV functions, which assume conventions like backslash-escaping rather than the actual CSV spec. (Having stupid defaults is not the problem here.) So we have a fairly huge - I think 1500 lines - library that wraps around core functionality and provides nothing of use that I can see, unless you wanted to abstract it away for mocking purposes, like, say, unit testing your CSV code. But you don't need 1500 lines for that, you really don't.
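To make the backslash problem concrete, here is a minimal sketch of the same class of failure. It uses Python's standard csv module as a stand-in, not the PHP library or PHP's own functions, so treat it as an analogy: the `raw` record and the `parse` helper are made up for illustration.

```python
import csv
import io

# One CSV record whose second field ends in a backslash. It is quoted the
# RFC 4180 way, where a quote inside a quoted field is escaped by doubling
# it, never with a backslash.
raw = '1,"C:\\data\\",ok\n'

def parse(text, **dialect_options):
    """Parse CSV text and return the list of rows."""
    return list(csv.reader(io.StringIO(text), **dialect_options))

# Spec-style parsing: the backslash is ordinary data.
rfc_rows = parse(raw)

# Backslash-as-escape parsing: the backslash in front of the closing quote
# is treated as escaping that quote, so the record no longer splits into
# the same three fields.
escaped_rows = parse(raw, escapechar="\\")

print(rfc_rows)
print(escaped_rows)
```

The point is only that "backslash escapes things" and "the CSV spec" are two different grammars, and a wrapper that hard-codes the first will mangle valid files written to the second.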

And then we get to the fun that was Slim. For those who don't know it, it's a URL routing/front-side controller tool. The fact that you can easily end up with a stack trace 100 layers deep when anything actually occurs is just hilarious. The fact that in our development environments we've had to up the limit in Xdebug because it actually exceeds Xdebug's default stack limit is just... something else. But should we be bringing in such libraries when we could do it ourselves in a fraction of the code/complexity?

So in general... is bringing libraries in a good thing? Is bringing libraries whose relevant functionality can be replicated with core functions a good thing? Or for some jobs, is it worth bringing in libraries and frameworks when rolling your own (even securely) isn't either that time consuming or that complex?

At what point is bringing in third party libraries a good thing? Should it be your first port of call before trying to write it yourself (as some people here seem to think)?

In general, yes, but only if the libraries are good quality. I find it especially useful when dealing with things like database access, parsing complex file formats, or handling file transfers and other streams, as the libraries hide away a lot of the boilerplate and faffing about, so I can get on with focussing on the stuff I need.

If the library is bad, however, you may as well just write the damn thing yourself.

Is bringing libraries whose relevant functionality can be replicated with core functions a good thing?

Depends on the complexity. For example, that left-pad NPM package is utterly ridiculous, as any competent coder can write their own in just a half-dozen lines. However, an FTP package is worth using, as you can avoid writing hundreds of lines of boilerplate stream writing and reading, and connection management.
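For scale, a rough sketch of what "a half-dozen lines" looks like, written in Python rather than JavaScript. The function name and parameters are my own, not the npm package's API, and Python's built-in `str.rjust` already does the same job, much as PHP has `str_pad`.

```python
def left_pad(text, width, fill=" "):
    """Pad text on the left with fill until it is at least width characters."""
    if len(fill) != 1:
        raise ValueError("fill must be exactly one character")
    missing = width - len(text)
    if missing <= 0:
        return text  # already wide enough, nothing to do
    return fill * missing + text

# The hand-rolled version and the built-in agree.
print(left_pad("5", 3, "0"), "5".rjust(3, "0"))
```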

Or for some jobs, is it worth bringing in libraries and frameworks when rolling your own (even securely) isn't either that time consuming or that complex?

I'd work out if you even need a framework in the first place, then evaluate the options. Too many people immediately reach for the flavour of the week without working out if they actually need it or not, and they end up with bloated unmaintainable crap.

When it lets you do something complicated more easily and more correctly with less effort. For example, writing a CSV parser that is capable of handling all the annoying edge cases is quite hard, so people are usually recommended to use a library for that. People are definitely recommended to use a library instead of writing their own security code (because that's hilariously easy to fuck up in non-obvious ways). Obviously, the simpler the functionality, the less of a benefit there is in using a library to achieve it.

OTOH, if you've already solved the hard stuff then you're into the space of exchanging the potential to have someone else fix bugs so you don't have to, against the fact that switching to something external might actually reintroduce bugs that you've already solved. No easy answer there.

In general, yes, but only if the libraries are good quality. I find it especially useful when dealing with things like database access, parsing complex file formats, or handling file transfers and other streams, as the libraries hide away a lot of the boilerplate and faffing about, so I can get on with focussing on the stuff I need.

How does one judge what is good without really using it in the first place? Especially given that the language has a number of these things built in?

The examples I'm thinking of that are questionable are libraries to handle CSV files, libraries to abstract the file system away (despite the fact that we don't have any use beyond the regular physical file system, we're not abstracting away to S3 for example), libraries to handle URL routing + middleware.

Especially when the abstracted form brings more complexity than we actually use.

When it represents a significant and worthy reduction in time and effort required to implement the solution.

This is kind of the crux of my issue. The amount of time the developer in question is using to implement these libraries is actually about the time it would take to do it without the libraries - except we're now on the library treadmill and now need to think about updating the libraries or not.

As a fan of ASP.NET MVC, I'm tempted to say this is a good use case for libraries, but then, is ASP.NET MVC a library or a framework? I think some would favour calling it the latter. I guess really it depends on whether you favour being able to add new routes easily with minimal effort, or having a well-defined set of routes that's tightly managed.

If you do use third-party libraries, which I'm generally in favor of if the task is sufficiently complex, you have to consider the extra effort to keep a curated repository. You want reproducible packages, and few things are more annoying than randomly breaking shit because someone introduced a bug in something you depend on and what worked perfectly yesterday goes down in flames today just because you pulled the current package from a public mirror instead of a known-good one. That curation can be a significant amount of work, too.

Bringing in a third-party library, especially one that is open source* and might need frequent updates (think: something that interacts with Facebook's API, which changes frequently and usually without warning) should always be an absolute last resort.

It doesn't take long at all to get to the point where you've spent more time dicking around to get the library to work than it would have taken you to write the functionality you need (probably 10% of what the library does) in-house.

If you do end up using a third-party library, for God's sake, keep your own copy of it so you have reproducible builds. Do not have your build system pull the newest shiny version off NPM or whatever each time it builds the product.

* I should clarify here I mean "open source development methodology" not "the code is open source". C++'s STL was carefully designed for years by experts to be efficient and high-quality, the fact that the current implementations are open source is incidental. Ditto that with the various frameworks in .net. They could not contrast more with the "Facebook API Doodz" library on GitHub which spends like 80% of its time broken due to "release early, release often". We really need better language here.
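On the "keep your own copy" point, one way to notice that a vendored archive has silently changed is to record its SHA-256 digest and check it at build time. This is only a sketch under assumptions: `sha256_of` and `verify_vendored` are hypothetical helper names, and where the archive lives or where the digest is recorded is up to your build.

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_vendored(path, expected_sha256):
    """Raise if the vendored archive no longer matches the recorded digest."""
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise RuntimeError(f"{path}: expected {expected_sha256}, got {actual}")
    return True
```

A build script would call `verify_vendored` for each archive it unpacks and fail loudly on a mismatch, instead of quietly building against whatever happens to be there today.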

How does one judge what is good without really using it in the first place?

Most people don't bother. They just hitch their horse to the new shiny thing and off they go without thinking 6 months down the road. There's zero planning, high-level design, or any kind of foresight in modern software development.

And then we get to the fun that was Slim. For those who don't know it, it's a URL routing/front-side controller tool. The fact that you can easily end up with a stack trace 100 layers deep when anything actually occurs is just hilarious. The fact that in our development environments we've had to up the limit in Xdebug because it actually exceeds Xdebug's default stack limit is just... something else. But should we be bringing in such libraries when we could do it ourselves in a fraction of the code/complexity?

I did some work with Slim, and AFAIR, it was pretty bare bones little library. I wouldn't expect to see more than 4-5 function calls before it reaches your controller.

Are you sure you didn't end up with some endless redirect loop or something?

So in general... is bringing libraries in a good thing? Is bringing libraries whose relevant functionality can be replicated with core functions a good thing? Or for some jobs, is it worth bringing in libraries and frameworks when rolling your own (even securely) isn't either that time consuming or that complex?

At what point is bringing in third party libraries a good thing? Should it be your first port of call before trying to write it yourself (as some people here seem to think)?

As always, it depends. Some OSS libraries are great, they cover all sorts of edge cases you'd learn about the hard way and can generally make things a lot easier. Others are like your CSV library.

I also download libraries from npm, examine the code and usually get disappointed. I feel like everything is written by monkeys and I could do better myself. And sometimes I do just that.

But here are a few important points I try to keep in mind.

Unlike 3rd party libs, my code doesn't get improved or fixed on its own. A year down the line, that ugly lib has all sorts of new features and edge case fixes, while my once superior lib is now a pile of crap in comparison.

Coding paradigms change. My code meshes well with my other code, so it all calcifies into one hodge-podge framework that is hard to tear apart. OSS code is by design modular and interchangeable.

Will my colleagues be able to maintain my code without me? Will they understand my vision for the code's organization and general usage pattern? Or will the next guy just bolt on their crap on top of mine? A 3rd party lib will generally have well established patterns, so there will not be problems like that.

Finally, will other coders even want to work on my artisanally made in-house framework? I mean, even if my thing is superior, once the time comes to update their CV, I bet they'd rather have "2 years of express+mongo+slim+whatevers-hip" there, than "2 years of cartman's internal framework no one's ever heard of before".

Fight the instinct and let the modules in. Unless they absolutely suck and you have no other option.

OTOH, if you've already solved the hard stuff then you're into the space of exchanging the potential to have someone else fix bugs so you don't have to, against the fact that switching to something external might actually reintroduce bugs that you've already solved. No easy answer there.

Will my colleagues be able to maintain my code without me? Will they understand my vision for the code's organization and general usage pattern? Or will the next guy just bolt on their crap on top of mine?

The real shame is that some of the coolest parts of our tech stack aren't frameworks or libraries and couldn't really be made libraries without compromising what they were made for.

I guess I'm just pissy about people just adding more and more crap on top of the crap we already have especially when we have enough crap that we don't really need. I honestly don't feel that '3rd party library' should be the first suggestion to any problem that we encounter.

But if you guys want to bring the hellish library community of npm to the toxic hellstew of PHP, feel free.

@sloosecannon Packagist is vastly more sane than npm. And I'm not the one in our company who asks "is there a library for that?", because I could easily be the person writing a library rather than consuming one. It's not like I can't write it myself in just about every case where we currently use a third party dependency - the exception is PHPExcel, and I'd rather have my sanity first.

The fact that we are now on a treadmill where almost every library has upgrades we currently cannot deploy, because our platform does not yet work on PHP 7 and our libraries all now mandate PHP 7, is just wonderful.

Ow...

It's like you took a post from the  thread and compressed all the ow in it into one sentence...

@Magus You got me curious, so I'm digging into how they measure quality, and right off the bat, they admit it's a flawed premise:

Any objective measurements of quality are going to be flawed one way or another. package-quality only attempts to give some indications about quality, not be an absolute rating on which to bet your farm. If you don't agree with our ratings, please help us improve them!

They also have some odd measures:

More versions means higher quality.
I can imagine that being pulled apart quite spectacularly here.

More downloads means higher quality.
Same again. Just because a package is popular, doesn't automatically mean it's good. Still, it's a better metric than version count.

Repo quality.
This is a complex one. There are three measures:

Total Factor: 1 - 1/total_number_of_issues

Open Factor: 1.2 - open_issues/total_number_of_issues ('healthy' is 20% or less of issues open)

Long Open Factor: 1 - long_open_issues/total_number_of_issues ('long' means 'over a year')

I'll be honest: I'm not exactly sure I understand the reasoning behind that last measure.

Then there's this:

New packages will always appear with low stars, until they get enough momentum. Also, packages that "just work" and get no issues will be underrated by our system.

Which just throws more doubt over their measurements.
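To get a feel for the three repo-quality factors quoted above, here is a small sketch that just evaluates them as written. The issue counts are made up, and how package-quality combines, clamps or weights the factors is not stated in what I quoted, so nothing here claims to reproduce their final rating.

```python
def repo_factors(total_issues, open_issues, long_open_issues):
    """Evaluate the three quoted repo-quality factors as written.

    Returns None when there are no issues at all, because every formula
    divides by total_number_of_issues.
    """
    if total_issues == 0:
        return None
    return {
        "total": 1 - 1 / total_issues,
        "open": 1.2 - open_issues / total_issues,
        "long_open": 1 - long_open_issues / total_issues,
    }

# Made-up counts, purely for illustration.
busy = repo_factors(total_issues=200, open_issues=40, long_open_issues=10)
quiet = repo_factors(total_issues=1, open_issues=0, long_open_issues=0)

print({name: round(value, 3) for name, value in busy.items()})
print({name: round(value, 3) for name, value in quiet.items()})
```

The `quiet` case is the interesting one: a package with a single, already closed issue gets a Total Factor of exactly 0, and a package with no issues at all cannot be scored by these formulas. That lines up with their own admission that packages that "just work" will be underrated.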

I just PR'd adding shields for this to both SockBot and SockMafia. Now I'm wondering if we got a bit carried away.

Long Open Factor 1-long_open_issues/total_number_of_issues ('long' means 'over a year')

The “Not-Jeff Factor” should be the number of issues that have been opened without being closed as fixed in a way that the submitter finds acceptable, as that's one which is fairly hard to game. Except nobody collects the critical metrics in the first place, so tricks like deleting everything after 10 days that hasn't been fixed let you get a hugely high acceptability metric while not making anyone actually happy.

Coding paradigms change. My code meshes well with my other code, so it all calcifies into one hodge-podge framework that is hard to tear apart. OSS code is by design modular and interchangeable.

Will my colleagues be able to maintain my code without me? Will they understand my vision for the code's organization and general usage pattern? Or will the next guy just bolt on their crap on top of mine? A 3rd party lib will generally have well established patterns, so there will not be problems like that.

Finally, will other coders even want to work on my artisanally made in-house framework? I mean, even if my thing is superior, once the time comes to update their CV, I bet they'd rather have "2 years of express+mongo+slim+whatevers-hip" there, than "2 years of cartman's internal framework no one's ever heard of before".

Pretty much sums up my feelings on it.

If it is a decent open source library that is actively maintained, used, popular and has unit tests, then use it.

At the moment me and another dev are making a .NET web app. I need to do some image processing; I could use System.Drawing and write a load of code for cropping, rotation and a few other things we have to do with incoming images.