
Programming with passion

My Worst Mistakes in Programming

I’m in the middle of refactoring a big infrastructure piece in our product PopScan. It’s very early code, rarely touched since its inception in 2004, so I’m dealing mainly with my sins of the past.

This time like no time before, I’m feeling the two biggest mistakes I have ever made in designing a program, so I thought I’d write this post to help others not fall into the same trap.

Remember this: Once you are no longer alone working on your project, the code you have written sets an example. Mistakes you have made are copied - either verbatim or in spirit. The design you have chosen lives on in the code that others write (rightfully so - you should strive to keep code consistent).

This makes it even more important not to screw up.

Back in 2004 I failed badly in two places.

  • I chose a completely wrong abstraction in class design, mixing two things that should be separate.

  • I chose - in a foolhardy wish to save on CPU time - to create a ton of internal state instead of fetching the data when it’s needed (I could still have cached it then, but I missed that).

So here’s the story.

One is the architectural issue.

Let me tell you, dear reader: should you ever be in the position of having to do anything even remotely related to an ecommerce solution dealing with products and orders, repeat after me:

Product lists are not the same thing as orders. Orders are not the same thing as baskets.

and even more importantly:

A product and a line item are two completely different things.

A line item describes how a specific product is placed in a list, so at best, a product is contained in a line item. A product doesn’t have a quantity. A product doesn’t have a total price.

A line item does.

And while we are at it: «quantity» is not a number. It is the entity that describes how many times the product is contained within the line item. As such, a quantity usually consists of an amount and a unit. If you change the unit, you change the quantity. If you change the amount, you change the quantity.
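The separation described above can be sketched roughly like this. This is a minimal illustration, not PopScan’s actual model: the class names, the fields (sku, name, unit_price) and the pricing rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Product:
    # A product only knows about itself: no quantity, no total price.
    # All field names here are hypothetical.
    sku: str
    name: str
    unit_price: Decimal


@dataclass(frozen=True)
class Quantity:
    # A quantity is an amount plus a unit; changing either one changes the quantity.
    amount: Decimal
    unit: str


@dataclass(frozen=True)
class LineItem:
    # A line item places one specific product in a list, with a quantity.
    product: Product
    quantity: Quantity

    @property
    def total_price(self) -> Decimal:
        # Simplifying assumption: unit_price is quoted per one quantity.unit.
        return self.product.unit_price * self.quantity.amount


coffee = Product(sku="C-1", name="Coffee beans", unit_price=Decimal("7.50"))
item = LineItem(product=coffee, quantity=Quantity(Decimal("3"), "bag"))
print(item.total_price)  # 22.50
```

The point of the sketch is only that the total price lives on the line item, and that two quantities with the same amount but different units compare as different.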

Anyways - sitting down and thinking about the entities in the feature that you are implementing is an essential part of the work that you do. Even if it seems “kinda right” at the time, even if it works “right” for years - once you make a mistake at a bad place, you are stuck with it.

PopScan is about products and ordering them. Missing the distinction between a product and a line item back in 2004 worked fine until now. But this is a core component of PopScan, so it has grown the most over the years, intertwining product and line item functionality more and more, to the point where it’s too late to fix this now, or at least fixing it would require countless hours of work.

Work that will have to be done sooner rather than later. Work that deeply affects a core component of the product. Work that will change the API greatly and as such can only be tested for correctness in integration tests. Unit tests become useless as the units that are being tested won’t exist any more in the future.

Painful work.

If only I had had more time and experience eight years ago.

The other issue is about state

Let’s say you have a class FooBar with a property Foo that is exposed as part of the public API via a getFoo method.

That Foo relies on some external data - let’s call it foodata.

Now you have two options of dealing with that foodata:

  1. You could read foodata into an internal foo field at construction time. Then, whenever your getFoo() is called, you return the value you stored in foo.

  2. Or you could read nothing until getFoo() is called and then read foodata and return that (optionally caching it for the next call to getFoo())
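The two options can be sketched side by side as follows. This is only an illustration under assumptions: the class names EagerFooBar and LazyFooBar and the read_foodata callable are made up for the example, and the reader simply counts how often it is called.

```python
from typing import Callable, Optional


class EagerFooBar:
    """Option 1: read foodata at construction time."""

    def __init__(self, read_foodata: Callable[[], str]) -> None:
        # The cost is paid here, even if get_foo() is never called.
        self._foo = read_foodata()

    def get_foo(self) -> str:
        return self._foo


class LazyFooBar:
    """Option 2: read foodata on first use, then cache it."""

    def __init__(self, read_foodata: Callable[[], str]) -> None:
        self._read_foodata = read_foodata
        self._foo: Optional[str] = None
        self._loaded = False

    def get_foo(self) -> str:
        if not self._loaded:
            # First call: fetch the data and remember it for later calls.
            self._foo = self._read_foodata()
            self._loaded = True
        return self._foo


calls = []


def read_foodata() -> str:
    # Hypothetical stand-in for the external data source.
    calls.append("read")
    return "foo from foodata"


lazy = LazyFooBar(read_foodata)
print(len(calls))  # 0: nothing has been read yet
lazy.get_foo()
lazy.get_foo()
print(len(calls))  # 1: read once, then served from the cache
```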

Choosing the first design for most of the models back in 2004 was the second biggest coding mistake I have ever made in my life.

Aside from the fact that constructing one of these FooBar objects becomes more and more expensive the more stuff you preload (much of it likely never to be used during the lifetime of the object), you have also added a huge amount of internal state to the object.

The temptation to write a getBar() method that has a side effect of also altering the internal foo field is just too big. And now you end up with a getBar() that suddenly also depends on the internal state of foo which suddenly is disconnected from the initial foodata.

Worse, suddenly calling code will see different results depending on whether it calls getBar() before it’s calling getFoo(). Which will of course lead to code depending on that fact, so fixing it becomes very hard (but at least caught by unit tests).
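A small sketch of that order dependence, with a deliberately silly side effect (upper-casing) standing in for whatever a real getBar() might do to foo. The behaviour of get_bar() here is an assumption made purely for illustration.

```python
class FooBar:
    def __init__(self, foodata: str) -> None:
        # foo is preloaded into internal state at construction time.
        self._foo = foodata

    def get_foo(self) -> str:
        return self._foo

    def get_bar(self) -> str:
        # The tempting shortcut: a getter that also rewrites the internal foo field.
        self._foo = self._foo.upper()
        return "bar for " + self._foo


a = FooBar("foodata")
foo_first = a.get_foo()  # get_foo() before get_bar(): "foodata"
a.get_bar()

b = FooBar("foodata")
b.get_bar()
bar_first = b.get_foo()  # get_foo() after get_bar(): "FOODATA"

print(foo_first == bar_first)  # False
```

Both objects were built from the same foodata, yet get_foo() returns different values depending on the call order.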

Having the internal fields also leads to FooBar’s implementation preferring these fields over the public methods, which is totally fine, as long as FooBar stands alone.

But the moment there’s a FooFooBar which inherits from FooBar, you lose all the advantages of polymorphism. FooBar’s implementation will always only use its own private fields. It’s impossible for FooFooBar to affect FooBar’s implementation, causing the need to override many more methods than what would have been needed if FooBar used its own public API.
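Here is a rough sketch of the difference. The two describe_* methods are hypothetical and exist only to contrast field access with access through the public getter. Python does not enforce private fields, so the leading underscore is merely a convention in this example.

```python
class FooBar:
    def __init__(self) -> None:
        self._foo = "foo"

    def get_foo(self) -> str:
        return self._foo

    def describe_via_field(self) -> str:
        # Reads the internal field directly, so an overridden getter is ignored.
        return "I have " + self._foo

    def describe_via_api(self) -> str:
        # Goes through the public getter, so an override is picked up.
        return "I have " + self.get_foo()


class FooFooBar(FooBar):
    def get_foo(self) -> str:
        return "foo" + super().get_foo()


ffb = FooFooBar()
print(ffb.describe_via_field())  # I have foo
print(ffb.describe_via_api())    # I have foofoo
```

With the field-based variant, FooFooBar would have to override describe_via_field() as well to get the behaviour it wants; with the API-based variant, overriding get_foo() is enough.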

Conclusion

These two mistakes cost us hours and hours of working around our inability to do what we want. They cost us hours of debugging, and they cause new features to come out much clunkier than they need to be.

I have done so many bad things in my professional life. A shutdown -h instead of -r on a remote server. A mem=512 boot parameter (yes. That number is/was interpreted as bytes. And yes. Linux needs more than 512 bytes of RAM to boot), an update without where clause - I’ve screwed up so badly in my life.

But all of this is nothing compared to these two mistakes.

These are not just inconveniencing myself. These are inconveniencing my coworkers and our customers (because we need more time to implement features).

Shutting down a server by accident means 30 minutes of downtime at worst (none since we heavily use VMWare). Screwing up a class design twice is the gift that keeps on giving.

I’m so sorry for you guys having to put up with OrderSet of doom.

Sorry guys.

Abusing LiveConnect for Fun and Profit

On December 20th I gave a talk at the JSZurich user group meeting in Zürich. The talk is about a decade-old technology which can be abused to get full, unrestricted access to a client machine from JavaScript and HTML.

I was showing how you would script a Java Applet (which is completely hidden from the user) to do the dirty work for you while you are creating a very nice user interface using JavaScript and HTML.

The slides are available in PDF format too.

While it’s a very cool tech demo, it’s IMHO also a very bad security issue which browser vendors and Oracle need to have a look at. The user sees nothing but a dialog like this:

and once they click OK, they are completely owned.

Even worse, while this dialog shows the case of a valid certificate, the dialog in case of an invalid (self-signed or expired) certificate isn’t much different, so users can easily be tricked into clicking allow.

The source code of the demo application is on github and I’ve already written about this on this blog here, but back then I was mainly interested in getting it to work.

By now though, I’m really concerned about putting an end to this, or at least raising the hurdle the end-user has to clear before this goes off - maybe force them to click a visible Applet. Or just remove the LiveConnect feature from browsers altogether, thus forcing applets to be visible.

But aside from the security issues, I still think that this is a very interesting case of long-forgotten technology. If you are interested, do have a look at the talk and travel back in time to when stuff like this was only half as scary as it is now.

Updated Sacy - Now With External Tools

I’ve just updated the sacy repository again and tagged a v0.3-beta1 release.

The main feature since yesterday is support for the official compilers and tools if you can provide them on the target machine.

The drawback is that these things come with hefty dependencies at times (I don’t think you’d find a shared hoster willing to install node.js or Ruby for you), but if you can provide the tools, you can get some really nice advantages over the PHP ports of the various compilers:

  • the PHP port of sass has an issue that prevents @import from working. sacy’s build script does patch that, but the way they were parsing the file names doesn’t inspire confidence in the library. You might get a more robust solution by using the official tool.

  • uglifier-js is a bit faster than JSMin, produces significantly smaller output and comes with a better license (JSMin isn’t strictly free software as it has this “do no evil” clause)

  • coffee script is under very heavy development, so I’d much rather use the upstream source than some experimental fun project. So far I haven’t seen issues with coffeescript-php, but then I haven’t been using it much yet.

Absent from the list you’ll find less and css minification:

  • the PHP native CSSMin is really good and there’s no single official external tool out there that is demonstrably better (maybe the YUI compressor, but I’m not going to support something that requires me to deal with Java)

  • lessphp is very lightweight and yet very full featured and very actively developed. It also has a nice advantage over the native solution in that the currently released native compiler does not support reading its input from STDIN, so if you want to use the official less, you have to go with the git HEAD.

Feel free to try this out (and/or send me a patch)!

Oh and by the way: If you want to use uglifier or the original coffee script and you need node but can’t install it, have a look at the static binary I created.

Updated Sacy - Now With More Coffee

I’ve just updated the sacy repository to now also provide support for compiling Coffee Script.

{asset_compile}
<script type="text/coffeescript" src="/file1.coffee"></script>
<script type="text/javascript" src="/file2.js"></script>
{/asset_compile}

will now compile file1.coffee into JS before creating and linking one big chunk of minified JavaScript.

<script type="text/javascript" src="/assetcache/file2-deadbeef1234.js"></script>

As always, the support is seamless - this is all you have to do.

Again, in order to keep deployment simple, I decided to go with a pure PHP solution (coffeescript-php).

I do see some advantages in the native solutions though (performance, better output), so I’m actively looking into a solution to detect the availability of native converters that I could shell out to without having to hit the file system on every request.
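One possible shape for such a detection, sketched in Python rather than PHP and in no way sacy’s actual implementation: look the tool up once and remember the answer. The executable name "coffee" and the "native" / "php-port" labels are assumptions made for the example.

```python
import shutil
from functools import lru_cache
from typing import Optional


@lru_cache(maxsize=None)
def find_tool(name: str) -> Optional[str]:
    # shutil.which searches PATH; lru_cache remembers the answer per process,
    # so the file system is only consulted on the first call for each name.
    return shutil.which(name)


def pick_coffee_compiler() -> str:
    # "coffee" is the assumed executable name of the native compiler.
    return "native" if find_tool("coffee") else "php-port"


print(pick_coffee_compiler())
```

Note that this memo only lives as long as the process. In a typical PHP setup, where state does not survive a request, the cached answer would have to be stored somewhere more durable, which this sketch does not address.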

Also, when adding the coffee support, I noticed that the architecture of sacy isn’t perfect for doing this transformation stuff. Too much code had to be duplicated between CSS and JavaScript, so I will do a bit of refactoring there.

Once both the support for external tools and the refactoring of the transformation is completed, I’m going to release v0.3, but if you want/need coffee support right now, go ahead and clone the repository.

Node to Go

Having node.js around on your machine can be very useful - not just if you are building your new fun project, but also for quite real world applications.

For me it was coffee script.

After reading some incredibly beautiful coffee code by @brainlock (work related, so I can’t link the code), I decided that I wanted to use coffee in PopScan and as such I need coffee support in sacy which handles asset compilation for us.

This means that I need node.js on the server (sacy allows us a very cool checkout-and-forget deployment without any build scripts, so I’d like to keep it that way).

On servers we manage, this isn’t an issue, but some customers insist on hosting PopScan within their DMZ and provide a pre-configured Linux machine running OS versions that weren’t quite current a decade ago.

Have fun compiling node.js for these: There are so many dependencies to meet (a recent Python for example) to build it - if you even manage to get it to compile with the ancient C compilers available for these ancient systems.

But I really wanted coffee.

So here you go: Here’s a statically linked (this required a bit of trickery) binary of node.js v0.4.7 compiled for 32bit Linux. This runs even on an ancient RedHat Enterprise 3 installation, so I’m quite confident that it runs everywhere running at least Linux 2.2:

node-x86-v0.4.7.bz2 (SHA256: 142085682187a57f312d095499e7d8b2b7677815c783b3a6751a846f102ac7b9)

pilif@miscweb ~ % file node-x86-v0.4.7 
node-x86: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, statically linked, for GNU/Linux 2.2.5, not stripped
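To check a download against the SHA256 value given above, something along these lines should do. It is a small sketch using only Python’s standard hashlib; the helper names are made up, and the example call assumes the archive sits in the current directory under the name shown above.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    # Hash the file in chunks so large downloads need not fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected_hex: str) -> bool:
    # Compare case-insensitively, since hex digests are sometimes printed in upper case.
    return sha256_of_file(path) == expected_hex.lower()


# Checksum published above for node-x86-v0.4.7.bz2.
EXPECTED = "142085682187a57f312d095499e7d8b2b7677815c783b3a6751a846f102ac7b9"

# Example call, assuming the archive is in the current directory:
# matches_checksum("node-x86-v0.4.7.bz2", EXPECTED)
```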

The binary can be placed wherever you want and executed from there - node doesn’t require any external files (which is very cool).

I’ll update the file from time to time and provide an updated post. 0.4.7 is good enough to run coffee script though.