
Thursday, 30 May 2019

Practising Clean Code in Node.JS

Having recently read Robert C. Martin's book, "Clean Code: A Handbook of Agile Software Craftsmanship", I began to look again at some of the code I had written before, specifically in Football Simulation Engine (FSE).

The book explores what makes code 'clean' and details why code writers should, and would want to, follow some core principles that make reading, understanding and working with their code as simple as possible.

With a view to applying the key principles I learnt to my own code, this post walks through the thought process as we work through the code until it feels, or 'smells', clean.

9 Key Clean Code Principles

From my own reading I took away these key principles for writing and reviewing code:

  • Readability is key
    • Code should fit on the screen
    • Small functions improve readability
    • Code should read like a newspaper; top to bottom, left to right
    • Remove ambiguity
    • Stick to personal or team convention
    • Question: Can you read your code without pausing?
  • Leave code cleaner than you found it
    • Remove unused variables
    • Remove duplication
    • Add test cases where needed
    • Build ‘factories’ / ‘common’ functions
    • Check readability
  • Single Purpose Functions
    • Keep functions under 150 characters wide and under 20 lines long
    • Single Responsibility Principle
    • Arguments: 0 best, 1 great, 2 OK, 3 bad, more than 3: stop!
    • No Flag Arguments
    • No Output arguments
    • Functions should be close together for readability
    • Write third party interface as a function
    • Don’t pass or return NULL
  • Name things for their purpose
    • Be consistent
    • Agree Conventions as a team
    • Names should make the purpose obvious
    • Reduce Ambiguity - ‘addedNumbers’ could be ‘velocityPlusDistance’
    • Should include side effects
    • You can refactor another person’s naming
    • No Encoding of names
  • Refine, refactor and be comfortable with review
    • Code reviews are normal and non-personal
    • Be comfortable receiving and giving code reviews
    • Read the code. Any pauses?
    • Read the code. Any ambiguity?
    • Check for Duplication
    • Complete when tests 100% pass
  • Tests are as important as your code
    • Follow Test Driven Development
    • Should be testable with ‘One Step’
    • One concept per test
    • Don’t skip trivial tests
    • Test extra where bugs are found
    • Big tests will never get run
    • ‘Learning tests’ on third party code are free tests
    • Tests should have readability at heart
  • Comments are a necessary evil
    • Delete inappropriate, redundant, poorly written, obsolete or old comments
    • “To Do” comments are OK if they add value and are time boxed
    • If writing a comment, ask why?
    • Never comment out code
  • Reduce/Remove duplication
    • Build common libraries
    • Look for patterns in code
    • Not all duplication can be removed, that’s OK
  • Code should ‘smell’ clean
    • If you have to reread a line of code, something is off
    • Structure comes before convention
    • Are there enough tests?
    • Planned to return to complete work later? Why not do it now?

Applying Clean Code to an existing Project

For this blog we will be looking at the FSE v2.2.0 - setPositions.js file. To do this we first must ask ourselves some key questions:
  1. Are my changes isolated and can they be rolled back?
  2. Do I have enough tests?
  3. What is the test coverage?

Are my changes Isolated? Yes. I have created a new git branch and can roll back the changes if needed.

Do I have enough tests? Applying Clean Code is going to mean a lot of refactoring, changing naming conventions, reducing functions, separating out concerns etc. Tests are key. I currently have limited tests for this code thus I will have to write some.

What is the test coverage? To understand how much of the code the existing tests exercise, I installed a module called nyc, which runs the mocha tests in Node.JS and reports, as text or HTML, the number of lines, branches, functions and statements being covered.
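A typical way to wire this up is via the package.json scripts, so coverage runs with a single command such as `npm run coverage`. The script names below are my own choice, not taken from the FSE repository:

```json
{
  "scripts": {
    "test": "mocha",
    "coverage": "nyc --reporter=text --reporter=html mocha"
  }
}
```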

Writing Tests:

With the initial code coverage at just 8% (the functions in this file are called by functions higher up in the code), I began writing some tests. The plan is to take each function, review its worth, check the existing naming conventions and see whether the function complies with the Single Purpose Function principle. Tests are then written against what the code should do rather than its current implementation.

To write the tests I will follow Test Driven Development (TDD), as recommended in the book, using the method from my previous post on implementing TDD tests in an existing Node.JS project.
If we look at the function setCornerPositions we can see straight away that it does not comply with the Single Purpose Function principle. In context, this function positions the players in a football/soccer match for a corner at any of the four corners of the rectangular pitch.

Original Set Corner Functions
This code performs four key functions: placing players in specific positions based on each of the corner locations, i.e. the 'top left', 'top right', 'bottom left' and 'bottom right' corners. The code should be split up accordingly.

When you look at the original code again for the first time in a while, you get the urge to throw everything away and start again. But a lot of thought has gone into existing code, even if it is not as clean as you would like, and that core thought process is what we retain by cleaning the code instead of dumping it.

Instead we write a single test and make small changes.


At first this test fails, because the function doesn't exist yet. I create a simple function called setTopLeftCornerPositions that takes matchDetails as input (always needed, because it denotes state in the simulator) and returns matchDetails as output (representing the change in state made by the function).

Initial Test Scenarios
This test passes and successfully checks the location of the attacking goalkeeper in their own half. However, it obviously doesn't cover very much functionality: it doesn't show where the other 21 players are, and it doesn't check ball information such as position or which team has the ball. A more comprehensive test might look something like the one below.

Comprehensive test cases

When 1 function becomes 4:

Because we are splitting setCornerPositions into four distinct functions, we need to test that the correct function is called in the correct scenario. To do this we write tests that call each of the four corners from the existing corner-calling location. The function keepInBoundaries needs four new tests that send it to each of the four functions being created.

The new tests pass in out-of-boundary positions and then check that the opposition goalkeeper is in the correct position, that a log has been inserted recording the correct set piece (corner) and that it has been assigned to the correct player.

Boundary tests
Initially only one of the four tests fails. That's because I have only created the single function so far, and that function passes back what it receives. The other three functions are then created, and the existing code is copied into them.

What can we clean:

Cleaned Corner code
All four functions aim to remove the ball from the existing players, which the original code does using two function calls: remove the ball from team 1 and remove the ball from team 2. That adds an extra line of code and an extra function that does the same thing; removing the ball from one team's players is just as single purpose as removing it from all players.

After the individual player positions are set, there is duplication in setting ball-specific details, such as assigning the ball to an attacking player, removing ball movement and assigning ball ownership to a team or player. This appears across all four corner functions, so it can become a single reusable function. This reduces duplication.

Improved Naming:

There are additional benefits to splitting the code into the four functions. The first is that we can reduce the number of parameters passed into each function, since it performs one task and one task only.
function setCornerPositions(team, opposition, side, matchDetails) {
If we know there is a corner in the top-left position, we know the team whose base position is on that side of the pitch must have kicked the ball out. We also know which team will be attacking (and should have possession of the ball) and which will be defending. All of this can be derived from the single 'matchDetails' parameter, which gives a greater understanding inside the function itself of where the data has come from, without having to locate the function call elsewhere in the code.

In the old code there are repeated references to 'team' and 'opposition', implying that 'team' has possession and 'opposition' is defending. However, this is ambiguous, since both teams are the 'team' from their own perspective and the other side the opposition. 'Attack' and 'Defence' convey much more clearly which side has possession of the ball.

Add Tests:

With these new tests we can see an increase in code coverage in terms of statements, lines and branches, which is satisfying to see.

                   STATEMENTS          BRANCHES          FUNCTIONS        LINES
setPositions.js    18.35% (129/703)    11.55% (29/251)   48.72% (19/39)   17.26% (117/678)   (before)
setPositions.js    29.16% (212/727)    18.01% (47/261)   42.86% (18/42)   29.2%  (205/702)   (after)

Repeat the new method:

From here the same method could be applied to setGoalKicks, which handles the ball going out over the same byline positions as for corners, but kicked out by the other team. This functionality also performed two distinct jobs, i.e. setting a 'top' side goal kick and a 'bottom' side goal kick, and required the team and opposition to be supplied even though this was implicit.

Set goal kick before
Set goal kick after cleaning
The function was split into two, made smaller using best practices for the current node version (v10.10.0), and the repeated code was pulled into its own function (setBallSpecificGoalKick). Whilst the first set of changes took considerable time, the second set was quick to replicate, with tests written before any change was made.

Test writing process:

  1. Write tests that adequately cover the functions being changed
  2. Evaluate that the written tests meet the requirements, i.e. all players for a goal kick should be in front of the goalkeeper taking the kick
  3. Make tests as explicit as possible to narrow the range of acceptable code outcomes
  4. Make naming as clear as possible

Test names should be readable at a glance; from experience, it is hard to understand where failures are happening when tests are called 'test1'…'testN'. Another benefit of a clear test case name is that you know straight away which test is failing, and with enough detail the name makes the test state clear as well.

This is also true for test variables. Just as in our normal code, they should leave no ambiguity, and it should be clear at a glance what each is for, i.e. the test 'describe' name, function names, and variable names. 'defencePlayer', for example, means the current defending player in the loop.

Well named test example

‘Smelling’ the code:

Having successfully replicated the process for setTopGoalKick and setBottomGoalKick, there is still a lot of code to cover, so the cleaning process can be applied to the next function, which sets set pieces, i.e. penalties and free kicks.

After writing tests for the expected behaviour of the soon-to-be-split function, it is important to get a smell of the code. This leads to a number of observations:
  1. The existing code does not calculate the set pieces correctly, and it is likely that specific set pieces are awarded incorrectly. Ouch. We have a bug!
  2. The new tests must reflect intention and requirements over validation of the existing code.
  3. The code itself 'smells' like it could be cleaner.
  4. By rewriting the set piece logic to be split per team, it is easier to understand the position of the ball, who has possession and the direction they are travelling. Splitting the functions removes ambiguity and will reduce the possibility of bugs.
  5. Any changes will affect how other functions call the set pieces, so tests need to be written for those too.
  6. The data sets used to generate the test scenarios take the most time to write. They need to be precise or the tests are worthless.
I will skip the exact steps for these changes, as they directly replicate what was done in the other functions. The process is repeated for throwIns, penalties and the minor functions.

So far, so clean:

Up to this point all the functions except one have been ‘cleaned’ following clean code principles. Test cases have increased from around 20 tests to 96 passing tests with the code coverage as follows;

Before relativePositions was cleaned (a particularly important, bulky function with high significance for the user experience):

                   STATEMENTS          BRANCHES          FUNCTIONS        LINES
setPositions.js    44.98% (336/747)    36.26% (99/273)   72.22% (39/54)   44.04% (314/713)

After relativePositions cleaned:

                   STATEMENTS          BRANCHES           FUNCTIONS       LINES
setPositions.js    69.11% (566/819)    63.61% (250/393)   97.06% (66/68)  66.26% (493/744)

Adding Comments:

So far no comments have been added to the code, because the naming allows the flow to be read much more easily than before, and for the most part the functions themselves are kept within the 20-line limit, with exceptions made where necessary. The rules are not strict; they are common sense, applied with best effort. Additionally, by using a linter (link) a standard can be applied that keeps lines shorter than a prescribed length.
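One way such rules might look, if using ESLint, is in the project's .eslintrc. The exact limits below are my own illustration of the principles above, not the project's actual configuration:

```json
{
  "rules": {
    "max-len": ["error", { "code": 150 }],
    "max-lines-per-function": ["warn", { "max": 20, "skipBlankLines": true }]
  }
}
```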

Separating Concerns:

The final part of the cleaning of the code looked at the structure of the file system. It is immediately obvious from the table above that the remaining function is far too large to sit intertwined with all the other functions. It takes up 30% of the code base and makes the file hard to navigate and hard to read.

On this basis setFreekicks could be isolated into its own file before the same process was applied. The following table shows how code coverage improved when the file was split.

After the file split, before cleaning setFreekicks.js:

                   STATEMENTS          BRANCHES           FUNCTIONS      LINES
setFreekicks.js    13.95% (41/294)     6.11% (8/131)      60% (3/5)      13.75% (40/291)
setPositions.js    100% (532/532)      92.37% (242/262)   100% (64/64)   100% (460/460)

Tests - 96 passing (121ms)

After cleaning setFreekicks.js:

                   STATEMENTS          BRANCHES           FUNCTIONS      LINES
setFreekicks.js    100% (383/383)      95.1% (194/204)    100% (39/39)   100% (315/315)
setPositions.js    98.89% (534/540)    91.35% (243/266)   100% (64/64)   98.72% (462/468)

Tests - 120 passing (179ms)


The tables above show that setPositions code coverage dropped slightly as a result of the changes made to setFreekicks. However, the parts of the code not covered appear to be edge cases, which can be ignored for now because of time limitations. Whilst it is not ideal to leave code untested, what was 13% coverage of a single file is now 100% and 99% across the same level of functionality.

Next Steps:

  • Find someone to review the code. Their ability to read the code will mark the success of the activity
  • Embed the tests into code deployment automation
  • Embed the linting of the file into code deployment automation
  • Recognise there will still be improvements to make
  • Work to achieve full test coverage of all the files

Thank you for reading through this - well done if you got all the way to the end, it was a long read!

Clean code is an important part of being able to share, reuse and understand code. Hopefully this practical application of clean code practices has shed some light on how the process might look.

Wednesday, 13 March 2019

Enforcing Test Driven Development into an Existing Project


In an earlier post I wrote about my observations on a book I recently read, Test-Driven Development: By Example by Kent Beck. The book explains Test-Driven Development (TDD) and how to apply it practically. From this I wanted to get going with the concept as quickly as possible, but without having to make an entirely new project.

I started to think about how I could apply the practices taught to an existing node module I have written (and blog about too often), hosted on npm.

In the chapter "How do you switch to TDD midstream", Kent describes how you might transition to using TDD and the issues that might arise from doing so. I have taken the liberty of summarising from my own perspective and will discuss how I implemented it in my latest football simulation engine code (blog).

After reading TDD and considering its application, there were 4 main factors (tips) for enforcing TDD in existing projects:
  • Test the requirements, not the code
  • Start with new features/bugs
  • Refactor code with testing in mind
  • Ignore working code

As I opened the code and assessed my kingdom, the extent of the problem became apparent. There were no tests, yet the project was thousands of lines of code, had 3 main exported functions and around 50 functions throughout, with no test data or examples for the end user.

Kent says in his book, 'The biggest problem is that code that isn't written with tests in mind typically isn't very testable', which happened to be my first thought as I dug into the large functions I had made. Where do I start? I began to write a list of the different parts and split them into a simple table with the headers Test, Ignore and Refactor, keeping in mind "What you don't do is go write tests for the whole thing and refactor the whole thing". I wanted to decide what could be tested, what should be tested and what shouldn't be tested.

Test: A new test should be written to test the functionality I expected from the requirements.
Ignore: Large bulky functions which called other functions and needed a lot of intense work to make it test ready.
Refactor: Code that could be refactored to make it testable.

Test                         | Ignore                  | Refactor
Input Validation             | Move Ball               | Set Freekick
Closest Player to Ball       | Find Possible Actions   | Set Corner
Closest Player to Position   | Resolve Ball Movement   | Set Penalty
Check Offside                | Decide Movement         | Formation Check
Split Number into N          |                         | Shot Made

The first and easiest item on the list was Input Validation, so I began to write the first test, for when a game's initiation input JSON is validated. The first test was simply that we got an object back. Success! This led to quick imitations for the other two main functions of the project: start second half and play iteration.

I then wrote some negative test cases by purposely inserting input data that shouldn't be valid. Importantly, the question I asked myself was: which test cases are valid based on the requirements, not on the code that had already been written? This caught a number of validations the validation functions were not already catching.

An example was when I removed the ball from the play iteration function input and the test failed - by not failing. How can a football simulation work without the ball? This is why Tip 1 is: Test the requirements, not the code

mocha.it('playIteration no ball with team', async() => {
  let providedItJson = './test/input/badInput/noBallWithTeam.json'
  try {
    let outputIteration = await validation.playIteration(providedItJson)
    expect(outputIteration).to.be.an('Error')
  } catch (err) {
    expect(err).to.be.an('Error')
    let expectedOutput = 'Provide: position,withPlayer,Player,withTeam,direction,ballOverIterations'
    expect(err.toString()).to.have.string(expectedOutput)
  }
})

Having fleshed out all the required checks on the input validation for the three main functions (mostly negative cases), I felt assured that the code would be able to tell the user when their input was wrong and, more specifically, why it was wrong.

At the same time, I was doing some house cleaning of the code and some general refactoring. Additionally, a few bugs were raised in the GitHub repository by users of the module. 

Some related to queries on parameter names, others were refactoring suggestions, but some referenced specific issues being seen in the code. Players were getting stuck, the closest player to a position was incorrectly assigned and the ball was moving too quickly. Investigations also caused me to raise my own issues which I could also test. 

Bugs and new features for existing code mean that the requirements are set out before any changes or new code is written. The code should behave in a certain way and currently doesn’t. From this you can essentially work on a blank canvas in terms of writing tests (using TDD). Tip 2: Start with new features/bugs

For the bug of incorrectly calculating closestPlayerToBall, there was no way to validate the issue because there were no tests. I started to write a test case by placing players on the 'pitch' where I knew which player was closest to a position or to the ball. The test is then simple: given the input data, does the function correctly identify which player is closest to the ball?

Once we run the first test - and potentially get a FAIL - we can start TDD from the beginning. What is the simplest thing we can do to the existing code to make it PASS? Following the TDD method, the code was altered until it passed the conditions successfully, at which point we had a passing test for the closest player, using TDD as the method.

Writing tests for the whole project in this way and then ensuring they PASS would conform the project to TDD, and I believe it would reduce the number of small bugs that end up being raised in the Github repository. However, it is a long process that would take up a lot of productive time when there are other bug fixes and new features to address.

Another major issue was the way that shots are made. A player makes a shot, a bunch of logic plays out and then either the goalkeeper has the ball or it’s a goal kick. All without so much as a little ball or player movement. This leads to the next step in the introduction of TDD - Tip 3 - Refactor code with testing in mind.

The shot made function had been several hundred lines long and included a variety of methods within. This doesn’t test easily. As code was edited for v2.1.2 I started to make changes to make it as simple as possible, abstracting out functionality into smaller functions and thus making each component easier to test using TDD. 

It’s important to remember that this isn’t the time for TDD yet. TDD is the process and steps for creating code that passes the test. Following the rule of only writing tests for new features or bugs this extraction of logic into smaller forms sets us up for TDD without investing unproductive time into its making.

In the list I created at the start there appears to be no real difference between some of the functions in the Ignore and the Refactor slots. There are a number of reasons to ignore specific functions, for example:

  • The function is very large and/or complex
  • The function cannot easily be split into smaller functions
  • The tests would take a long time to write
  • The function isn't broken

For example, in FootballSimulationEngine: Decide Movement cannot easily be split into smaller functions and is very large; Find Possible Actions would take a long time to write tests for (because of its complicated logic) and, in any case, simply isn't broken; Resolve Ball Movement requires a lot of random number generation and mathematics, which makes it difficult to test; and Move Ball isn't broken either.

This analysis focuses heavily on the code I have written and not the requirement which goes against tip 1, but this is mainly because of Tip 4 - Ignore working code. If it isn’t broken, don’t fix it.

Put in another way, it has taken two years (more or less) to get the code to where it is today, with time etched away from weekends and evenings. To use TDD in the whole project would mean to essentially start from scratch. In his book, Kent states “Spending money without making it is generally speaking not a sustainable process.” And as a subscriber to the statement ‘time is money’ it seems more responsible and a better use of time to slowly introduce TDD as bugs appear and are fixed, or new features are included. 

By v2.1.2 the list looked a bit better. In future editions I can see some of the 'Ignore' items moving into the 'Refactor' bracket, the 'Refactor' tasks moving into the 'Test' section, and the 'Test' tasks being removed as their tests are completed. See below for the table as of the v2.1.2 commit.

Test                         | Ignore                  | Refactor
Input Validation             | Move Ball               | Set Freekick
Closest Player to Ball       | Find Possible Actions   | Set Corner
Closest Player to Position   | Resolve Ball Movement   | Set Penalty
Check Offside                | Decide Movement         | Formation Check
Split Number into N          |                         | Shot Made
Shot Made                    |                         |



 If you want to see a wider use of TDD in the FootballSimulationEngine code I’d love to see more enhancements raised in the Github repo. Thanks for reading.