foobarcat

Thursday, 16 February 2017

On Golang and Maintainability

I have talked a bit before, mainly in this post, about how Golang as a language tends to expose complexity and excludes some features that while useful can serve to hide complexity. In this post I'm going to explore this topic in more depth and explain why I think this contributes to Golang being a language better suited to writing maintainable code than Python.

Any sufficiently advanced technology is indistinguishable from magic - Arthur C Clarke

Where Python favours the implicit, Golang favours the explicit. And, where Python hides complexity in 'magic' language features, Golang forces you to go the long way round. Some language features in Python that I consider suitably magic are: decorators, properties and list comprehensions. Decorators and properties are mechanisms of indirection, and all these listed features provide handy shortcuts for developers. List comprehensions themselves are fine but nesting or using them for their side effects can quickly result in difficult to read code.

Short cuts make long delays - Frodo Baggins

The interactive capabilities of the Python interpreter can encourage a user to build up multiple lines of Python code into a single complex expression. Case in point: nested list comprehensions, which are usually the result of condensing a couple of loops into a one-line wonder. And programmers tend to love one-line wonders; they exude elegance, and removing all those lines makes you feel warm and fuzzy inside, because readability and conciseness are easily confused.

Given the fact that it took some thought and tinkering to determine how to compress some readable for loops into such a concise representation, it is likely that the next person to come along, in the absence of the context of the expression's formation, will struggle to decode the compressed representation. In fact they may even try and rewrite it long-form in order to unravel its secrets. List comprehensions that are used for their side effects are full of even more implicit nastiness.
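As a rough illustration of the long way round, here is how a Python one-liner such as `[x * x for row in grid for x in row if x % 2 == 0]` might be spelled out in Go. Both the comprehension and the `evenSquares` helper are hypothetical examples made up for this post, not taken from any real code base.

```go
package main

import "fmt"

// evenSquares is a hypothetical helper: it flattens grid, keeps the even
// values and squares them, one explicit step at a time.
func evenSquares(grid [][]int) []int {
	var out []int
	for _, row := range grid {
		for _, x := range row {
			if x%2 != 0 {
				continue // skip odd values
			}
			out = append(out, x*x)
		}
	}
	return out
}

func main() {
	grid := [][]int{{1, 2, 3}, {4, 5, 6}}
	fmt.Println(evenSquares(grid)) // [4 16 36]
}
```

It is several times longer, but each step (iterate, filter, transform, collect) sits on its own line where the next reader can find it.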

Maintainability comprises a number of factors but a key one is the ability of another programmer (or even you!) to come along and understand the intention of your program. Readability is not inversely proportional to LoC (number of lines of code); in the mistaken belief that it is, programmers can be inclined to do things in complex rather than intelligible ways. The problem is that it can be difficult to distinguish the two. Perhaps a misunderstanding of the code is an indicator of a flaw in the reader, or perhaps it is because a simpler representation would suffice. In the former case the writer could be forced to write for the lowest common denominator. In the latter case it pays to consider a language feature's potential cost as well as its benefits.

Language features are like power tools, we come up with excuses just to use them

Golang forgoes many shortcut features, resulting in more explicit and maintainable code. I have found that, whilst by no means necessary, static typing also helps manage complexity and thus improve maintainability in a large application. And optimising for maintenance can be a good idea as this is often where we spend most of our time as developers.

Monday, 6 February 2017

Improvements in go 1.8

This post represents notes collected on the new go release and from the state of go talk of Feb 2017, on changes in go 1.8.

Video of the talk can be found here.
Slides of the talk can be found here.

General Improvements

ignore struct tags in type conversions (easier type conversions)
32-bit mips support
osx 10.8+ supported
go 1.8 is last version to support ARMv5E and ARMv6 processors
go 1.9 will require ARMv6K
go vet (sort of compiler warnings) now detects closing http.Response.Body before checking error
default gopath $HOME/go on unix
go bug command opens a bug on github.com/golang/go with version/machine information
Compiler backend improvements (SSA) see cpu usage reductions of 20-30% on arm and up to 10% on x86 (SSA was already part-implemented on x86).
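To make the go vet item above concrete, here is a minimal sketch of the ordering the check cares about. The `fetchStatus` helper and the deliberately malformed URL are hypothetical; only `http.Get` and `resp.Body.Close` come from the standard library.

```go
package main

import (
	"fmt"
	"net/http"
)

// fetchStatus is a hypothetical helper that returns the HTTP status code for url.
func fetchStatus(url string) (int, error) {
	resp, err := http.Get(url)
	// Check the error first: when err is non-nil, resp may be nil, so a
	// `defer resp.Body.Close()` placed above this check could panic.
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	// A malformed URL fails before any network traffic is attempted.
	if _, err := fetchStatus("://not-a-url"); err != nil {
		fmt.Println("request failed:", err)
	}
}
```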

Performance Improvements

build times faster than go 1.7 but slower than go 1.4
improved -race detection
mutex contention profiling `go test -bench=. -mutexprofile=mutex.out`, can provide data on whether you should lock in a less or more granular manner; a sequential approach could even be faster.
sub-millisecond (~100 microsecond) GC pause times, costing an extra 1/2% cpu.
defer is a 1/10th to a 1/3rd faster, but still not that fast, for example...
cgo is 50% faster, mostly due to removing high frequency defer calls

Additions to the Standard Library

sort.Slice() introduced, provides easier slice sorting
plugins introduced (Linux only atm), load shared libraries at runtime, enables hot code swapping
added Shutdown method to http.Server, it was previously very hard to stop, personally I had to resort to https://github.com/hydrogen18/stoppableListener
HTTP/2 server push support introduced (HTTP/2 itself has been supported since go 1.6)
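As a small taste of the additions listed above, this is roughly what sorting with sort.Slice looks like. The `person` type and `sortByAge` helper are hypothetical; the point is that you no longer need to define a type with Len, Less and Swap methods just to sort a slice.

```go
package main

import (
	"fmt"
	"sort"
)

// person is a hypothetical record type used only for this example.
type person struct {
	name string
	age  int
}

// sortByAge sorts people in place, youngest first.
func sortByAge(people []person) {
	sort.Slice(people, func(i, j int) bool {
		return people[i].age < people[j].age
	})
}

func main() {
	people := []person{{"Carol", 41}, {"Alice", 29}, {"Bob", 35}}
	sortByAge(people)
	fmt.Println(people) // [{Alice 29} {Bob 35} {Carol 41}]
}
```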

Full go 1.8 release notes are here.

go 1.8 is set to be released on February 16th 2017.

Golang UK conference is on August 16th to 18th 2017.

Friday, 3 February 2017

Thoughts on Two Years in Golang

In my last post, I talked/ranted a little bit about not being swept up in new trends or languages without proper analysis of their pros/cons and suitability for use in certain scenarios. Hence, having learnt Golang from scratch two years ago and having programmed in it day in, day out since, it's about time that I collected my thoughts on it.

Now a lot can be said about the cost of learning a new language: the time spent learning the basics, making the rite of passage mistakes and getting up to speed with the tooling. However, I think that Golang recognises these costs and does what it can to mitigate them for a new developer, which is not to say that there isn't still a cost. But I know that for many companies, mine included, the ease with which a programmer can be converted to Golang is a significant consideration in the choice of the language.

C and Python had a love child and they called it Golang

Strict, opinionated and boring

I think of Golang as a strict, opinionated and boring language. Now, I know that the word 'boring' has many negative connotations. But when I invoke it here I mean that it lacks many of the features that titillate academics and occupy the minds of advanced programmers. I discussed the exclusion of exceptions in a previous article. Other non-existent features include some I miss: generics, operator overloading, primitive sets, assertions. And some I don't: nested functions, inheritance.

I have often heard people say Golang ignores the last X years of language development. Of course there are some useful features missing but in order to keep the language small and simple you have to be strict, and evaluate the costs and benefits of adding a new feature. Terseness can be considered as a feature in and of itself. In other languages the plethora of features can be bewildering and take an age to master, with the extra folds hiding more pitfalls and stumbling blocks.

Inheritance is a big ticket item but I have found that interfaces get you most of the benefits of duck typing without dragging in the massive amount of complexity and metadata fiddling that inheritance brings.

Golang has some really nice features. Goroutines are great: they are lightweight concurrency primitives, basically multiplexed onto OS threads. There are also channels for communicating between goroutines. It is really great that Go can do concurrency so well out of the box and I find it much clearer than Python's generators.
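For anyone who hasn't seen them, here is a minimal sketch of goroutines and a channel working together. The `sumSquares` function is a made-up example: each input is squared in its own goroutine and the results are collected over a channel.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares squares each input in its own goroutine and sums the results,
// which are sent back over a channel.
func sumSquares(nums []int) int {
	results := make(chan int)
	var wg sync.WaitGroup
	for _, n := range nums {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			results <- n * n
		}(n)
	}
	// Close the channel once every worker has sent its result.
	go func() {
		wg.Wait()
		close(results)
	}()
	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4})) // 30
}
```

The workers finish in no particular order, which is fine here because addition doesn't care.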

Importantly Golang is very quick to compile and run, easily out-performing CPython and many other Python implementations. This is an oft cited reason for switching from Python to Golang. Most of my work with Golang has been on embedded devices and this was the reason Python was never in the running. There were concerns about its GC (Garbage Collection) latency but great work has been done to bring this to sub-millisecond levels in go 1.8.

It has nice concise syntax, something akin to a cross between Python and C, which is nice as I am fond of Python syntax, Java syntax makes me queasy.

Ecosystem

Probably the best feature is the tooling available and the strength of the ecosystem in general, it is fairly comprehensive and has a strong standard library which is something I really miss in Python. It tries hard to get things right the first time and mostly succeeds.

go vet and golint are great static analysis tools and gofmt and goimports can format your source code on save in compliance with the style guide, saving time and bikeshedding. Golang really benefits from the strictness here, introduced at such an early stage that everyone is forced to get on board. I am so used to auto code formatting that I also set up auto pep8 formatting in Python and didn't look back.

The source tree layout and the build process are also standardised and there are great tools for running, building, testing and generating coverage stats in a standardised way with very little effort. You get deployable static binaries with little hassle which I always found a struggle with Python. This layout and process is strictly dictated which I know will rub some people the wrong way but in my opinion it saves a lot of turmoil for a little sacrifice in freedom.

It is very easy to pull dependencies: `go get github.com/username/foo`, and you're there. However the lack of versioning and the absence of any way of telling how popular a library is are problematic. There are some third party solutions to the former problem, personally I use godep, and there was some attempt to fix versioning with vendoring, but I don't feel that this is a complete solution and it poses its own questions. However I am always a bit horrified by the multitude of tools when I have to pull dependencies in Python {pip, easy_install, setuptools}, so I don't think go does too badly in comparison.

Gripes

Now for some gripes.

Non-pointer receiver methods, this is a common pitfall for new go programmers. When a method has a non-pointer receiver, the receiver is copied by value, meaning that changes made to the receiver inside the method are not visible to the caller once the method returns. See this code example.
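A minimal sketch of the pitfall, using a hypothetical `counter` type with one method of each kind:

```go
package main

import "fmt"

// counter is a hypothetical type used to contrast the two receiver kinds.
type counter struct {
	n int
}

// incrByValue has a value receiver: it increments a copy of the counter.
func (c counter) incrByValue() { c.n++ }

// incrByPointer has a pointer receiver: it increments the caller's counter.
func (c *counter) incrByPointer() { c.n++ }

func main() {
	c := counter{}
	c.incrByValue()
	fmt.Println(c.n) // 0, the increment was applied to a copy
	c.incrByPointer()
	fmt.Println(c.n) // 1
}
```

Both calls look identical at the call site, which is exactly why this one bites.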

Lack of a generic max function, this is quite embarrassing for the language as it is something that newcomers will run into fairly early. Due to the lack of generics there is no max function for all numeric types and, seemingly as a result of this, no max function for the integer types at all (math.Max only accepts float64), err, yea, I know.
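In practice this means writing something like the following by hand, once per integer type you care about. `maxInt` is a hypothetical helper name; `math.Max` is the standard library function and only takes float64 arguments.

```go
package main

import (
	"fmt"
	"math"
)

// maxInt is a hypothetical helper of the kind you end up writing by hand.
func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

func main() {
	fmt.Println(maxInt(3, 7))       // 7
	fmt.Println(math.Max(2.5, 1.5)) // 2.5
}
```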

Sensible slicing syntax, now the syntax we have is quite nice for some use cases and is appreciated but I still have to resort to slice tricks.
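By slice tricks I mean idioms like the two below, which remove and insert an element using only append and copy. The `removeAt` and `insertAt` names are my own for this sketch; neither does any bounds checking.

```go
package main

import "fmt"

// removeAt returns s without the element at index i, preserving order.
// It reuses the backing array of s, so the caller's original slice is modified.
func removeAt(s []int, i int) []int {
	return append(s[:i], s[i+1:]...)
}

// insertAt returns s with v inserted at index i.
func insertAt(s []int, i int, v int) []int {
	s = append(s, 0)     // grow by one element
	copy(s[i+1:], s[i:]) // shift the tail right
	s[i] = v
	return s
}

func main() {
	s := []int{10, 20, 30, 40}
	s = removeAt(s, 1)
	fmt.Println(s) // [10 30 40]
	s = insertAt(s, 1, 25)
	fmt.Println(s) // [10 25 30 40]
}
```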

Being strict and opinionated has downsides, on some issues the exclusion of certain features and lack of support for certain use cases makes it seem as though some problems are being wilfully ignored, namely, generics and dependency versioning.

Summary

I find Golang to sit at a great place on the ladder of abstraction: garbage collected and statically typed. I can develop faster in Python but I am more confident of my Golang code's correctness, as Python hides complexity, tries to be smart and lacks the safety of the compiler. However Golang does lack some of the libraries and stacks needed for widespread adoption on the server, though this is improving every day. And its memory requirements may be too demanding for some extremely resource constrained embedded environments, however it has performed admirably for our embedded use case thus far. After two years I like Golang as a language, and there's much, much more that I have to say about it. But it suffices to say that it's a language that I am now very comfortable with and productive in, and one in which I feel more confident writing maintainable and efficient code than in Python.

Saturday, 21 January 2017

On Programming and Pragmatism

You know when someone wants to invoke feelings of humility and humbleness they show you that graph. You know the one, it shows that Dinosaurs lived for ages in comparison to us and how we are merely an insignificant blip on our planet's mammoth (geddit!) timeline. Well we can see the software industry in a similar position to man in this example, being about sixty years old and fledgling in comparison to traditional engineering. Take the Institution of Civil Engineers in the UK, two centuries old, with established practices and a commitment to professional review, conduct, and a collective commitment to studying and analysing past works. Morality is a separate topic, but just imagine if we as a community of engineers had reached the maturity whereby we saw each failure as a learning opportunity and seriously analysed case studies.

I have always found that there is comfort in tradition, I think that this partly explains a few bizarre ongoing phenomena and anachronisms such as constitutional monarchy. There is comfort in tracing an unbroken line back, and knowing that your ancestors encountered similar difficulties yet persevered. However this is a comfort that the software industry is visibly bereft of. Perhaps this goes some way to explaining our identity crises, the continual rocking of the boat every few years when 'THE NEXT BIG THING'™ comes along and all those goddamn wood-working craftsman metaphors that everyone is so fond of. I think that it is a sign of industrial immaturity that a dogmatic view that the next big thing will solve all our problems is so alive and well. New technologies have pros and cons and are designed for certain use cases over others, we should be able to evaluate their merits level-headedly.

There is that constant desire to seek that silver bullet: OOP, functional programming, test-driven development, agile methodologies, they all promise to cure all ills yet come with their own set of potential abuses and weaknesses. I read a Steve Yegge post where he compared a programmer's progression to that of a child. At first the bewildering exploration of the early years, then the overconfidence of adolescence, followed by the humility of adulthood, admitting that complexity and flaws exist and always will. I see the software industry as in those heady teenage years, still chasing absolute truths.

'I know that I know nothing' - Socrates

I think that some of the best programmers are the ones who realise their limitations and check overconfidence. They program defensively, realise the human brain will never be up to the task of perfectly modelling and building these complex systems, these castles in the sky, and don't try and solve that problem by weaving more layers of abstraction, UML and object hierarchies. They behave conservatively and understand the importance of testing and don't overreach.

Have you ever found some code and thought, this is crap, who wrote this? ... git blame, oh, me? This is evidence that we are constantly improving and as we do we realise that our former selves were misguided in some way. This is an endless path, we do not one day become enlightened and get bestowed a halo and aura by Richard Stallman. It stands to reason that there are always flaws in our understanding, this realisation is one of the humbling and empowering truths of programmer adulthood. If we had limitless understanding tests would be redundant and refactoring rare.

As developers we like to imagine ourselves as omniscient and infallible and don't like putting our mistakes on show, so we lean on git rebase. This fallacy is propagated by many solutions presented in blog posts or code samples that exclude the context of their genesis and teetering development. For code review, fine, but in general there is no point in fixing up your version control history so it looks like you are some zen programming god. Improvements come in increments, everything won't be solved in the 'BIG REWRITE'™. I'm not sure if it's a cultural thing, but there is this Japanese concept of 'kaizen': continuous, iterative improvement. I think this is a healthier philosophy than 'I am going to fix everything in one highway to the danger zone themed montage'.

We have to be pragmatic lest we become lost in the complexity of our work. Software is hard and stable optimal solutions take time; good engineering and clean coding can help but we have to be careful not to overreach or become swept up in heady currents of new trends.

Sunday, 8 January 2017

On Golang and Exceptions

I have been programming professionally in Golang for a couple of years now and I have to say that I really like the language. My first experience of Golang was a bit of a drop in the deep end, coming into a new job where I would be using Golang as my main language with no real experience. Yet, despite this, it did not take very long before I became productive. I believe that this is partly due to the simplicity of Golang and its density/economy of language features.

Golang was designed to be a small, strict and opinionated language. The small size reduces the required learning time, strictness ensures users do not form harmful habits such as ignoring warnings or leaving unused variables lying around and its opinionatedness puts an end to bikeshedding about things like brace placement. This is in contrast to a language such as C++, massive and sprawling and certainly intimidating to a newcomer. The size and complexity of C++ provides many places for the concealment of pitfalls. And an effective understanding of the quirks and gotchas of the language is deservedly highly valued in the corporate world. The problem with giving you this much rope is that it is long enough to hang yourself many times over. Sure, it is powerful but it shows little respect for your sanity if you are not well directed in your work. Golang also tries to avoid introducing magic where possible. By magic I mean a feature that hides a sufficient amount of complexity so as to appear 'magic' to the uninformed.

'Any sufficiently advanced technology is indistinguishable from magic' - Arthur C Clarke

One of the magical language features that got the chop in Golang is exceptions. Recently, when doing some work in Python I noticed that I didn't really miss exceptions, they complicated the control flow a lot and caused me much fear and consternation. This is because exceptions are magic, they can cause unexpected jumps in your code based on non-local conditions and inject complexity. They are another thing that you constantly have to think about when writing code. I find that multiple return values, available in both Python and Golang, are a much more intuitive and useful feature that largely obviates the need for exceptions.

'But it doesn't even have exceptions' - reaction of an old workmate when I told him I was now working in Golang.

I see how exceptions can be useful in standardising error reporting, which is great. We've all had to deal with a function with obscure error reporting, that say returns an int value, and we end up asking, does 0 denote an error, what do negative values mean? etc. However Golang also standardises this by providing the error type and interface providing a standard with room for extensibility.
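To show what that standard looks like in practice, here is a minimal sketch combining multiple return values with the built-in error interface. The `divide` function, the `errDivideByZero` sentinel and the `rangeError` type are all hypothetical names for this post; the only fixed part is that anything with an `Error() string` method counts as an error.

```go
package main

import (
	"errors"
	"fmt"
)

// errDivideByZero is a sentinel error that callers can compare against.
var errDivideByZero = errors.New("divide by zero")

// rangeError is a hypothetical custom error; any type with an
// Error() string method satisfies the built-in error interface.
type rangeError struct {
	value int
}

func (e *rangeError) Error() string {
	return fmt.Sprintf("value %d is out of range", e.value)
}

// divide returns the quotient and an error instead of raising an exception.
func divide(a, b int) (int, error) {
	if b == 0 {
		return 0, errDivideByZero
	}
	if a < 0 {
		return 0, &rangeError{value: a}
	}
	return a / b, nil
}

func main() {
	if _, err := divide(1, 0); err != nil {
		fmt.Println("error:", err) // error: divide by zero
	}
	q, err := divide(10, 2)
	fmt.Println(q, err) // 5 <nil>
}
```

There is no guessing what 0 or -1 means: the result and the failure travel in separate return values, and the control flow stays local.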

I understand that not allowing exceptions complicates the success case code as often 'if err != nil {...}' is liberally applied. However one really needs to consider if these minor gripes are worth adding extra complexity to the language and burdening the programmer with an extra concern.

Friday, 16 December 2016

Rebase as an Integration Strategy for Feature Branches

There are generally two reasons for using git rebase: 1) to tidy up/rearrange commits that aren't publicly in use, 2) as a strategy for integrating branches. This post discusses the second use case. Rebase gets a lot of bad press, I think that this is partly due to misunderstanding. It's like a dog that people kick, it bites someone, then it gets put down, so let's try and understand it and then play fetch with it or something instead.

So most people are familiar with merging as an integration strategy, the problem with merging is that it creates a non-linear history polluted with merge commits. The murky history manifests itself as ambiguity in tools like git log and increased difficulty in using git bisect, it generally makes archeological spelunking and working with history harder. Merges do have some benefits however, they naturally work well with pull requests and also preserve branch history (that may or may not be valuable to you). For a full discussion of pros and cons check out this excellent article.

Merge and Rebase Workflows

Starting steps, create a feature branch off the tip of up-to-date/ upstream master

 $ git checkout master
 $ git pull
 $ git checkout -b feature-branch

State of Play
      C             feature-branch
      /
A---B---D        master

A normal merge workflow

... do some work, stage it
 $ git commit -m "C"
 $ git checkout master
 $ git merge feature-branch
 $ git push origin master

Post-Merge
     C        feature-branch
    /   \
A---B---D        master

A normal rebase workflow

... do some work, stage it
 $ git commit -m "C"
 $ git rebase master
 $ git push origin feature-branch:master

Post Rebase
          C'     feature-branch
         /
A---B---D        master

The Rebase Workflow Analysed

The rebase command above takes master, forward ports C on top of it and sets the result to be feature-branch; this can be considered as 'rebasing' the feature commit onto the updated master. Through this process C is rewritten with a different SHA and is now a different commit, C'. This can be a sticking point in understanding, but an important feature of a commit is that it is immutable, so giving C a new parent means creating a new commit. Rebases rewrite commits whereas merges don't.

The git push means, push the ref before the colon to the ref after the colon on the remote origin. If your git push is being rejected as a non-fast forward, you are doing something wrong or someone has pushed in the time since you pulled, the blighters, repull master and rebase onto it again.

Consider Dry Run

If you are worried about what you are committing, note that you can always see what you are doing before you push it using the --dry-run argument to git push, which stops short of sending the actual update. You can then run git log on the SHA range it outputs.

 $ git push origin feature-branch:master --dry-run
 To git@github.com:aultimus/example-project
    d4f3294..6c53234 feature-branch -> master
 $ git log -U3 d4f3294..6c53234
 ...

On a personal note, I generally prefer rebase over merge for integrating feature branches; I am a bit keen on a nice clean history. The importance of well-kept history is a topic for a future post.

Sunday, 11 December 2016

Coupling Under Christmas Trees

So I have been reading a couple of programming books recently. Reading reviews of these books I encountered a common complaint, that they 'state the obvious' or 'say things one already knows'.

Some Amazon Reviews

'If you have 7+ years of java development dont buy it... Most of what he has said took me some time to work out for myself' - Clean Code

'A lot of common sense and stuff a seasoned programmer probably already know' - Clean Code

'What little it does say has been said several times before' - Pragmatic Programmer

I have also found this to be the case, however I do not necessarily consider that this detracts from the work. Having programmed for several years you tend to pick up common patterns, smells and designs; they tend to 'fall out' of the code and into use over time, implicitly forming in your monkey brain. However there is a risk that they become malformed when formed in isolation from wider discussion. There is a danger of developing bad habits and you know what they say about old dogs, never mind old programmers.

'The limits of my language mean the limits of my world' - Wittgenstein

Covering these topics can serve to cement your understanding and formalise definitions of concepts or patterns which perhaps you have already grasped by a thread, but have not fully unwound in a deeper analysis. Indeed, formalisation is an important process, having a common language and nomenclature is an important prerequisite for effective discussion and deeper analysis. Also many of the most important truths that merit discussion are self-evident ones, like tests are good and duplication is bad.

The Bit In Which I Try and Justify The Slightly Erotic Title

Cohesion and coupling are two such concepts, I have found that a good understanding of these topics, alongside their conscious application can help you to write more extensible and maintainable software. Since we spend the vast majority of our time maintaining rather than creating, optimising for maintenance can save you a lot of hassle down the line.

Coupling

In the Pragmatic Programmer, a helicopter's controls are used as an example of a tightly coupled system. Indeed helicopters are considered considerably more difficult to fly than aeroplanes as constant action is required to keep them in the air; left to their own devices, they will quickly plummet. Sound like any software systems you know? Every control creates a side effect that requires the application of another control in order to resolve, and using the second control requires more corrections and so on, ad infinitum.

'I never liked riding in helicopters because there's a fair probability that the bottom part will get going around as fast as the top part' - Lt. Col. John Wittenborn, USAFR

Coupling is frequently expressed in terms of orthogonality, a term derived from mathematics, but essentially a grown up synonym for de-coupled, you too can use this term to befuddle new developers.

Working with a non-orthogonal (tightly coupled) system, any change can lead to a host of side effects, reducing confidence in your ability to make changes. This can result in inertia and paralysis, as you do not fully comprehend the full effects of potential changes. This is exacerbated by the fact that such a system is likely not well tested. This is because the coupling is tight, and prohibitively so, rendering isolation of components for unit testing difficult. The resultant dearth in test coverage further contributes to a deeper, more pungent rot. Maintaining such a system is akin to some sort of feverish festive nightmare in which you are trapped in an eternal game of Jenga, but you are alarmed to discover that your hands have turned into claws.

'Talk to friends not strangers' - Clean Code

Orthogonality is often also expressed in reference to the 'Law of Demeter', which outlines design guidelines useful for designing an orthogonal system:
- Each unit should have only limited knowledge about other units: only units "closely" related to the current unit.
- Each unit should only talk to its friends; don't talk to strangers.
- Only talk to your immediate friends.

Something most developers have at least some understanding of is the train wreck. In addition to being generally bad practice, the presence of train wrecks is frequently a sign of a breach of the Law of Demeter. Train wrecks are long chained calls, symptomatic of overly friendly objects that have no respect for each other and are privy to each other's dirty implementation details.

To borrow an example from Clean Code:

final String outputDir = ctxt.getOptions().getScratchDir().getAbsolutePath();

Again, experienced developers will have certainly encountered this 'code smell', and will likely have realised that it is a rotten one. Now, this example may benefit from being broken into several lines, but our code here should certainly not have intimate knowledge of the implementation of scratchDir. Suppose this code is run on a test system that does not have access to the production file structure, our code is tightly coupled to the implementation of scratchDir and any change to the interface of scratchDir will necessitate changes here and to the multitude of other distantly related callers.
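Since most of this blog is about Go, here is a rough Go sketch of the same smell and one way of avoiding it. All the types and the `ScratchFilePath` method are hypothetical stand-ins for the objects in the chain above, not anything from the book.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// All types below are hypothetical stand-ins for the objects in the chain.
type scratchDir struct{ path string }

type options struct{ scratch scratchDir }

type buildContext struct{ opts options }

// ScratchFilePath tells the caller where a scratch file lives without
// exposing options or scratchDir, so their layout can change freely.
func (c buildContext) ScratchFilePath(name string) string {
	return filepath.Join(c.opts.scratch.path, name)
}

func main() {
	ctxt := buildContext{opts: options{scratch: scratchDir{path: "tmp"}}}

	// Train wreck: the caller reaches through two layers of internals.
	wreck := filepath.Join(ctxt.opts.scratch.path, "out.class")

	// Talking only to the immediate friend.
	friendly := ctxt.ScratchFilePath("out.class")

	fmt.Println(wreck == friendly) // true
}
```

Both lines produce the same path today, but only the first one breaks when somebody reorganises options or scratchDir.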

The Model, View, Controller (MVC) design pattern is a well known method for increasing maintainability via decoupling and the breaking out of responsibilities.

Christmas Trees

Imagine if you never threw away a christmas decoration, if instead you hung them all on your christmas tree, you featured on one of those Hoarders TV shows on Channel 5, it's the christmas special. So you kept all the crappy decorations made out of tissue paper that you made in primary school and the top of your tree shared angelic congestion issues the envy of heaven. Under the layer upon layer of decaying tinsel, the poor old tree strains under the additional weight of your magpie like compulsion.

You may have had the displeasure of encountering an object or class that bears a resemblance to such a tree, one where every bauble of information is pinned onto this object, until it begins to creak and strain under the complexity. Every time you receive a request to add extra functionality in this area, you groan, hold your breath, do it, wince, stack the technical debt higher, defer the refactor and try really hard to forget.

The Single Responsibility Principle

It is likely that such an overburdened class has many responsibilities and is in violation of the SRP (Single Responsibility Principle). This states that a class should only have one responsibility and thus generally only one reason to change; this rule helps us keep class size reasonable. So, our class should have one over-arching high level responsibility, while complex logic should be embodied in other classes that it is composed of or used in conjunction with. The SRP helps us reason in terms of the language agnostic concept of responsibilities in contrast to line count. This deserves its own post.

Low Cohesion and its Effects

The presence of tight coupling and low cohesion is usually a good indicator of the benefits of a refactor and represents technical debt, but the subject of refactoring also deserves its own post! For now let's try and define cohesion. Cohesion is a measure of how well an object hangs together as a logical whole. In our Christmas tree example we note that the parts bear little relevance to each other, colours and styles clashing garishly. Technically, cohesion is measured by how many of a class's instance variables each of its methods uses. A class having methods that only utilise a small subset of its variables is said to have low cohesion, a class where each variable is used by every method is said to be maximally cohesive. Note that maximum cohesion is rarely a goal, but low cohesion is undesirable.

Similarly to a tightly coupled class, in an uncohesive class it is often difficult to make changes with confidence, fully aware of the knock on effects, and to have comprehensive tests. The difficulty in testing is apparent if you follow a thought exercise.

Take an uncohesive class, replace a class method with a function call in which each class variable is an argument and the return value is a composite value of all class attributes and the method return value. Now attempt to write exhaustive unit tests for this function over the range of possible inputs and outputs, hard, eh? We can see that the potential for side effects of such methods is large and testing is a combinatorial disaster. In addition, it is likely impossible to keep a full understanding of such a class in your head, thus we can see how low cohesion is detrimental to effective maintenance and extensibility.
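The thought exercise might look something like this in Go. The `report` type, its fields and both functions are hypothetical, and a real offender would have far more than four fields.

```go
package main

import "fmt"

// report is a hypothetical low-cohesion type: its fields have little to do
// with one another, and each method touches only a small subset of them.
type report struct {
	title   string
	rows    []int
	retries int
	owner   string
}

// addRow only uses rows, yet as a method it can reach every field.
func (r *report) addRow(n int) int {
	r.rows = append(r.rows, n)
	return len(r.rows)
}

// addRowFunc is the same operation rewritten as in the thought exercise:
// every field goes in as an argument and every field comes back out,
// alongside the original return value.
func addRowFunc(title string, rows []int, retries int, owner string, n int) (string, []int, int, string, int) {
	rows = append(rows, n)
	return title, rows, retries, owner, len(rows)
}

func main() {
	r := &report{title: "Q1"}
	fmt.Println(r.addRow(7)) // 1

	_, rows, _, _, count := addRowFunc("Q1", nil, 0, "", 7)
	fmt.Println(rows, count) // [7] 1
}
```

Written out like this, the signature makes the size of the test surface obvious: four inputs and four outputs that have nothing to do with adding a row.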

Bit of a rambling one, but food for thought, and it links into some other topics I'd like to discuss. Anyway, I hope that this can help you to write more maintainable and extensible code, or at the least provoke thought and discussion that does so.

Until next time