Friday, September 21, 2012

Introduction to Unit Testing Using Python Unittest

Unit testing is an extremely powerful tool. It directly helps to ensure three aspects of good software: verifiability, maintainability, and extensibility. Unit testing is as much a process of software design as it is a tool. There are many great tutorials on HOW to use Python's unittest, and of course the Python documentation is an excellent resource. I am going to focus on WHY to test your software. Below are a couple of short unit testing examples using Python's built-in unittest package.


Verifiability

How does one verify that their program works?  With unit testing we can create specific functions or groups of functions to target our code.  For example:

def sum_numbers(num_one, num_two):
   """return an integer, the sum of two numbers"""
   return num_one + num_two

We can easily create a test for this using Python's built-in unittest module. Tests are created by subclassing unittest.TestCase:

import unittest

from mymodule.functions import sum_numbers

class TestFunctions(unittest.TestCase):

    def test_add_numbers_success(self):
        self.assertEqual(sum_numbers(2, 2), 4)


On Python 2.7+, running python -m unittest discover in our package will automatically search for test files (by default, files matching test*.py) and run the TestCase subclasses it finds. There are a couple of things to note in the above example. All test classes must subclass unittest.TestCase, and test method names must start with test. The focal point of any test is its assertions. The assertions are what dictate whether a test passes or fails. Although this is a contrived example, it displays how easy it is to isolate our functions and control exactly what inputs they receive! If we were to create one test method (or more) for every function in our project we would quickly grow a test suite. When making ANY change to our code it becomes trivial to run through every single function in our project and verify that nothing has broken! Imagine if we had a web app and every time we added a feature we had to run through EVERY possible page/action by hand. It could take a very long time.


Maintainability

Bugs are a part of software development. It is extremely important to minimize them, but when they do happen it is just as important to fix them quickly. That requires a process that helps isolate a bug so that it is easy to reproduce, easy to correct, and easy to verify that it has been fixed. Unit testing helps with all of these. Suppose the sum_numbers function is at the heart of a website. It gets all sorts of user input data, and occasionally some faulty data slips through. If a non-numeric string is passed as one of the parameters alongside a number, it will result in a TypeError! It is trivial to isolate and reproduce this bug. We do need to give a little thought to what should be returned for invalid input. For this example let's return None. We can then create another test method like:


class TestFunctions(unittest.TestCase):

    def test_add_numbers_success(self):
        self.assertEqual(sum_numbers(2, 2), 4)

    def test_add_numbers_string_bug(self):
        self.assertEqual(sum_numbers('a', 2), None)


Running the above code reproduces the string error. Since we haven't fixed our code yet, the new test will not pass (unittest reports it as an error because the TypeError is raised inside sum_numbers). We can then change our sum_numbers function to handle invalid input:


def sum_numbers(num_one, num_two):
    """return an integer, the sum of two numbers, can't trust user input"""
    try:
        return int(num_one) + int(num_two)
    except (TypeError, ValueError):
        # int() raises ValueError for strings like 'a' and TypeError for None
        return None


Running our tests again will result in two passing tests. We successfully isolated the bug, reproduced the bug, and verified the bug has been fixed! We now also have a test trail assuring us the bug has been addressed. Pretty cool.
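To see the whole cycle in one place, here is a minimal, self-contained sketch that puts the fixed function and both tests in a single file and runs them in-process. The run_tests helper is my own addition for illustration, and catching both TypeError and ValueError is an assumption of this sketch rather than the only reasonable choice.

```python
import unittest


def sum_numbers(num_one, num_two):
    """Return the integer sum of two values, or None for invalid input."""
    try:
        return int(num_one) + int(num_two)
    except (TypeError, ValueError):
        # Assumption for this sketch: any value int() rejects yields None.
        return None


class TestFunctions(unittest.TestCase):

    def test_add_numbers_success(self):
        self.assertEqual(sum_numbers(2, 2), 4)

    def test_add_numbers_string_bug(self):
        self.assertIsNone(sum_numbers('a', 2))


def run_tests():
    """Hypothetical helper: run the suite and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFunctions)
    return unittest.TextTestRunner(verbosity=2).run(suite)


if __name__ == '__main__':
    run_tests()
```

Saving this as a single file and running it with python should report both tests. In a real project the function and the tests would live in separate modules, as in the examples above.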


Extensibility

With tests it becomes very easy to help an app grow. A test suite provides a safety net for an application. We can programmatically run through every function of an app in a short amount of time, something that could take hours to do manually. Some test suites take hours to run even when automated; manually testing a codebase that large would not be feasible at all! As long as we keep designing our apps in a modular, unit-based way we can easily add functions and tests for those individual functions. Another benefit of unit testing is how easy it makes refactoring code. Suppose we had thought it was a good idea at the time to write our original function like:


def sum_numbers(num_one, num_two):
   """return an integer, the sum of two numbers"""
   return sum([num_one, num_two])


Assuming we had the same test method as before:


def test_add_numbers_success(self):
    self.assertEqual(sum_numbers(2, 2), 4)



This test is focused on the output of our function. It assures us the output is as expected. This allows us to change what happens inside the function and still have the test acting as a safety net. We can rewrite (refactor) the internals of our functions and guarantee they still behave the way we originally tested them! The version above passes the test because it performs the action we want it to. We could change the function to remove the list and the sum call:


def sum_numbers(num_one, num_two):
   """return an integer, the sum of two numbers"""
   return num_one + num_two


We cleaned up our function and ensured that it still behaves the way we designed it to!
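As a rough illustration of the safety net, the sketch below keeps both versions side by side under different names (the names sum_numbers_with_list and sum_numbers_refactored are made up for this example) and runs the same assertions against each. In practice you would simply replace the function body and rerun the existing test.

```python
import unittest


def sum_numbers_with_list(num_one, num_two):
    """Original version: builds a list and calls sum()."""
    return sum([num_one, num_two])


def sum_numbers_refactored(num_one, num_two):
    """Refactored version: plain addition."""
    return num_one + num_two


class TestRefactor(unittest.TestCase):

    def test_both_versions_agree(self):
        # The test only looks at outputs, so it does not care which
        # implementation produced them.
        for implementation in (sum_numbers_with_list, sum_numbers_refactored):
            self.assertEqual(implementation(2, 2), 4)
            self.assertEqual(implementation(-1, 1), 0)


if __name__ == '__main__':
    unittest.main()
```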


Testing is a powerful tool that should be very heavily considered. It helps us verify our functions are working the way we intended, helps us easily maintain our applications, and helps us extend them. Correct application design will help us isolate our problems. Testing can take a significant amount of time, but the benefits it offers far outweigh any downsides.

Sunday, September 9, 2012

Why Unit Testing is important

One of the most controversial topics in programming is unit testing, or testing in general. There are a number of strong arguments on both sides of the issue.

For the past 10 months I have been freelancing. During this time I have been exposed to a wide variety of code created by many individuals of vastly varying skill levels. All of these projects have been PHP websites and web apps. Of course this has led to tons and tons of different architectures. All of these projects have had one troubling thing in common: they were created with absolutely no thought about maintainability. This is, in part, because of the industry. Boutiques and contractors are not the ones maintaining the app. The goal is to ship a working product in as little time as possible. Unit testing doesn't play into this because of the time investment involved. A significant amount of time is required to determine a testing strategy and to write the actual tests. I have had many instances where writing tests takes AS LONG AS writing my actual functions! Having to budget in up to 50% more time to write tests is undesirable for every party involved.

This test-less, unmaintainable strategy actually works pretty well (it is an industry standard) as long as the sites never need features added. During a 3 month contract with a PHP boutique we had a number of recurring contracts. This involved maintaining web sites which were created 5-10 years ago. Many of these sites were from a different era, relying on register_globals and completely prone to SQL injection. So the solution is simple, right? Fix the security holes, add new features, and deploy!? No. Many of the sites did not have any sort of structure to the programs. Each file generally had one function or none, and hundreds of lines of code. Fixing bugs meant wading through lots of code that was more or less unrelated to the problem. Why not fix this code soup? The issue is that there are set deadlines. People don't see refactoring the whole site into something that can be maintained as a good use of time.

Writing code as if it were going to be unit tested will resolve many issues.  Unit testing is important because it helps us think in terms of maintaining code.  The most important aspects of unit testing are usually overlooked:

Thinking about program structure.  This NEEDS to be thought about, even for small websites.  Sitting down and writing whatever comes to one's head is a sure way to reduce the quality of code and to remain a mediocre programmer.


Designing units, what are logical sections

I think a simple way to do this is to go through long code and comment what each section does.  For example:

// log in a user
// check user permissions
// get friends of user
// etc

When doing this it becomes very, very apparent what should comprise a "section".  It would make sense to have a login_user function, or a check_permission function.  Even if they are only used one time, it still makes sense to create functions for these.  This helps with the next point.
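Here is a small sketch of what those commented sections might look like once pulled out into functions. It is written in Python to match the other examples on this blog, and everything in it (the in-memory USERS dictionary, the field names, the plain-text passwords) is a toy assumption for illustration, not how a real site should store users.

```python
# Toy in-memory "database"; a real site would query its user store and
# would never keep plain-text passwords.
USERS = {
    'alice': {'name': 'alice', 'password': 'secret',
              'permissions': {'view_friends'}, 'friends': ['bob']},
    'bob': {'name': 'bob', 'password': 'hunter2',
            'permissions': set(), 'friends': ['alice']},
}


def login_user(users, username, password):
    """Return the matching user record, or None if the credentials are wrong."""
    user = users.get(username)
    if user is None or user['password'] != password:
        return None
    return user


def check_permission(user, permission):
    """Return True if the user holds the named permission."""
    return permission in user['permissions']


def get_friends(users, user):
    """Return the records of the user's friends that still exist."""
    return [users[name] for name in user['friends'] if name in users]


if __name__ == '__main__':
    current_user = login_user(USERS, 'alice', 'secret')
    if current_user and check_permission(current_user, 'view_friends'):
        print([friend['name'] for friend in get_friends(USERS, current_user)])
```

Each function now has one job and takes its inputs as parameters, which is exactly what makes it possible to test it in isolation.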

Thinking in terms of maintaining code.
    - how will this code be added to?

Actually thinking about program design will make maintaining and extending your code so much easier.  Say that, in addition to the Facebook login that version 1 of the site uses, a client wants to add Twitter auth too.  With a login_user function this is pretty easy.  All login code can be located in this function.
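One way this could look, sketched in Python: login_user becomes the single entry point and looks up the right provider in a dictionary. The provider functions here are hypothetical stand-ins that just echo their input; the real Facebook and Twitter flows are far more involved and are not shown.

```python
def login_with_facebook(token):
    """Hypothetical stand-in for the real Facebook login flow."""
    return {'provider': 'facebook', 'token': token}


def login_with_twitter(token):
    """Hypothetical stand-in for the real Twitter login flow."""
    return {'provider': 'twitter', 'token': token}


# Adding a new provider means adding one function and one entry here.
LOGIN_PROVIDERS = {
    'facebook': login_with_facebook,
    'twitter': login_with_twitter,
}


def login_user(provider, token):
    """Single entry point for every login method the site supports."""
    try:
        handler = LOGIN_PROVIDERS[provider]
    except KeyError:
        raise ValueError('unsupported login provider: %s' % provider)
    return handler(token)
```

The rest of the site only ever calls login_user, so version 2 can add Twitter without touching any of the calling code.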
 

Learning to create software as a series of related units takes practice, but it pays off in the long run as code is easier to work on and easier to extend.

Sunday, August 12, 2012

VIM productivity

Vim Pedal

A couple of weeks ago my pinky fingers became so strained it was painful.  This was due to the location of the control key on my keyboard (http://www.kinesis-ergo.com/freestyle.htm).  When typing normally, i.e. using a web browser, word processor, email, etc., I hadn't noticed.  But Control is a modifier key in VIM and is used for many commands, requiring me to use it constantly.

Over the last couple of weeks a couple of sites had articles about VIM clutch pedals (https://github.com/alevchuk/vim-clutch).  I loved the idea but was very reluctant to construct one myself, so I searched for prefabbed/complete pedals.  There were a number of programmable pedals, including some very nice ones by Kinesis.  The issue was that the programming software is for Windows.  Once again, I did not want this to become a hobby; I just wanted something I could buy and start using.

After some searching I discovered Delcom Products Inc., who make complete programmable clutch pedals.  "Programming" one involves moving pins around to create keyboard key codes, which in turn are just interpreted as keyboard input! Perfect: no hassle and no custom software.  I promptly ordered one.  When it arrived it was trivial to open it up and change the key code to control (included is a huge sheet of key codes and corresponding pin combinations).  The build quality is very good, and the pedal feels very sturdy.  Now I no longer have to stretch my pinkies to reach the control key!



Vim monitor


I am a large proponent of a single monitor setup. Instead of relying on hardware for screen space I enjoy relying on software!  This includes virtual desktops and key combinations to place windows on certain parts of the screen.  The main strengths of this are:

  1. No bezel! Multiple monitor setups are always split across a bezel: a huge piece of material interrupting the display.
  2. We can't focus on two monitors at the same time.  If I could make my left eye look at a left monitor and my right eye look at a right monitor then a dual monitor setup would be perfect.  The problem is I can't, so there will always be one monitor that cannot be actively used.
  3. There is no comfortable multiple monitor setup.  Two monitors can be set up so that there is a "main" monitor directly in front and a second monitor off to the side, which requires craning the neck.  Or the monitors are placed perfectly symmetrically, with the bezels touching directly in front of one's eyes.  This means the neck needs to be turned to view either monitor.  Three monitor setups "fix" this because one monitor can be placed in front with one on either side.

I have been using a 1920x1200 resolution monitor which has been pretty good.  When I have website requirements I split the screen in 2 with the requirements on one side and a browser on the other.  Vim is on a different desktop.  The amount of time it takes to switch from developing in VIM to the browser to view changes (ctrl+alt+left) is less than the time it would take to move my head to view another monitor.

Just today I tried programming on a 2560x1440 resolution monitor!
My mind was blown.  Normally I can have 4 windows open in VIM and view them perfectly, and open up to 6 if necessary.  With the higher resolution monitor 6 windows are nothing! Pretty amazing.

Wednesday, May 9, 2012

Lenovo wins for Linux

Since I switched over to linux as my main OS 2.5 years ago I've had a couple of major hardware conflicts that have kept me from running on certain devices or hardware components.  Linux is maturing greatly.  A wide, ever increasing range of components is supported.

Luck played a large role in my switching over to linux.  I had an asus g60vx gaming laptop at the time.  Installing ubuntu 10.04 was absolutely hassle free and all hardware was supported by default.  In addition there existed proprietary drivers for my nVidia graphics card.  The first install is absolutely crucial in retaining or switching a user to linux (something the community would absolutely benefit from).  Switching is becoming less of an ordeal.  Manufacturers like Lenovo and Dell make it very simple to switch over with full hardware support.

Both Lenovo and Dell have 'Ubuntu Certified' machines.  Any speed bump in installing and running a system smoothly can ruin someone's chances of converting.  Individuals do not want to deal with technical issues, especially on their recreation machines.  If Ubuntu didn't offer a list of approved hardware and laptops, I personally would have just gone with OSX and virtualized linux.  Not because I like mac (quite the opposite) but because it is POSIX and it just works with the hardware.  Having to debug, troubleshoot, and fiddle with hardware/software all day at work turns me off doing it at home.  I no longer like spending hours of my small amount of free time debugging linux hardware issues.

There are a couple of reasons I became a Lenovo loyalist:

1.  They offer ubuntu certified machines with compatible hardware.

     The hardware components they choose to use have open source drivers.  When I install Fedora or Ubuntu things just work.  Unfortunately there are still reports on message boards of some certified machines performing poorly.  I've lucked out and have not had an issue with the T and E series.

2.  Hardware is completely business focused.

    Lenovo hardware is absolutely no frills.  This can be seen in their designs, which have barely changed since the first thinkpads.  Lenovos aren't pretty.  They are functional, reliable, have an extremely high build quality and are military spec.  Plastic is used all around.  But the 'cockpit' is the most comfortable of any laptop I have ever used.  The keyboard is by far the nicest and most comfortable offered on any portable device.

3. Finally, being conscious of Linux users goes a long way to earn repeat business.

   Most of the large companies offer some sort of linux compatible machines.  Dell has some laptops and HP has many business desktops.  Lenovo goes that extra step and has most of its business/professional laptops certified.


Companies like Lenovo and Dell are helping to lower the barrier to entry for new Linux users.  When linux is not a hassle it increases a user's chance of regularly using linux.

Sunday, February 19, 2012

Installing PHP PDO extensions for Drupal 7 on ubuntu 10.04

I ran into some confusion installing the PHP PDO extensions for php5.  Apparently installation has changed significantly across PHP versions over the last couple of years.  The catch is that the PDO extension is installed by default with php5.  To enable it (on Ubuntu), add the extension to your php.ini file:

extension=pdo.so

pdo_mysql now comes bundled with the php5-mysql package in Ubuntu.  So if you are installing on php5, please ignore the many posts online that suggest installing it through pear or pecl; if you try these ways you will be very disappointed.

Once php5-mysql is installed, all that is required is enabling it in your php.ini:

extension=pdo_mysql.so

Restart Apache and you should be good to go!

Wednesday, February 15, 2012

Django Class Based Views Tutorial -- why I like them!

With the release of Django 1.3 developers can use class based views in addition to function based views.  Django includes some generic prebuilt views that help to solve some basic problems.  Although they are a great starting point, at some point you will probably find yourself having to write a custom class based view.  I hope to clearly explain how that is done.


What are they?

Most of the other functionality in Django is already class based: models, forms, middleware.  For this reason views have always kind of stuck out.  They are function based.  An entry in urls.py maps to one function.  That function takes a request as a parameter and must return a response object.  Class based views apply the same OO concepts that are available in other parts of the framework to views.


Why use them?

A common scenario is having a page with a form.  A GET request renders the form and a POST request processes the form.  Although these two processes often share the same url they are very different.  I never have liked cramming these into one function.  

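from django.shortcuts import redirect, render_to_response

def form(request):
    if request.method == 'GET':
        return render_to_response('the_form_template.html')
    elif request.method == 'POST':
        # do some processing for the form here, then
        # redirect if successful ('/thanks/' is a placeholder URL)
        return redirect('/thanks/')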


Oftentimes I have found myself with some fairly complicated things taking place inside either one of these conditionals.  Code can grow difficult to read, with lots of nested conditionals.  Readability is just a preference of mine and might not bother everyone.  What bothers me the most is that these are separate processes that just so happen to share a url.  That does not mean they should share a function though.  Class based views solve this problem.  They recognize that these are 2 sides of the same coin but present them in a more readable way while keeping CRUD operations straightforward.  A class based view will automatically call the method named after the request's HTTP method.  Now using the new class based views:

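from django.http import HttpResponseRedirect
from django.views.generic.base import View, TemplateResponseMixin

class TestView(TemplateResponseMixin, View):
    template_name = 'the_form_template.html'

    def get(self, request):
        # render template_name with an empty context; you could also
        # instantiate your form and render a template as normal, in which
        # case there would be no need for the TemplateResponseMixin
        return self.render_to_response({})

    def post(self, request):
        # process the form, then redirect ('/thanks/' is a placeholder URL)
        return HttpResponseRedirect('/thanks/')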


Additionally, in urls.py there is a small change.  Instead of putting the location of the function in patterns as a string, such as (r'^yourregex/$', 'project.app.views.function'), you import your class and call as_view() on it:


from myproject.myapp.views import TestView
(r'^yourregex/$', TestView.as_view())


If someone makes a request with an HTTP method that the view does not implement, Django returns a 405 (Method Not Allowed) response.
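To make the idea concrete, here is a simplified, framework-free sketch of method-based dispatch. It is not Django's actual View class (SimpleView and DemoFormView are names invented for this example, and responses are reduced to plain tuples), but it illustrates the same idea: look up a handler named after the lowercased HTTP method and fall back to a 405 when there is none.

```python
class SimpleView(object):
    """Framework-free stand-in for a class based view (not Django's View)."""

    http_method_names = ['get', 'post', 'put', 'delete', 'head', 'options']

    def dispatch(self, method):
        # Look up a handler named after the lowercased HTTP method.
        name = method.lower()
        handler = None
        if name in self.http_method_names:
            handler = getattr(self, name, None)
        if handler is None:
            return self.http_method_not_allowed(method)
        return handler()

    def http_method_not_allowed(self, method):
        return 405, 'Method Not Allowed: %s' % method


class DemoFormView(SimpleView):

    def get(self):
        return 200, 'render the form'

    def post(self):
        return 302, 'redirect after processing'


if __name__ == '__main__':
    view = DemoFormView()
    print(view.dispatch('GET'))
    print(view.dispatch('DELETE'))
```

DemoFormView only defines get and post, so a DELETE request falls through to the 405 branch without the subclass having to do anything.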


What to use them for?


Whenever I have a url that does different things for different request methods I like to use class based views.  I also use them when I have common tasks to do on each request.  For a while I was finding that I had to write a lot of csv download functions, and all of them would go something like: 1) validate input, 2) get data, 3) render data as csv.
I found that this was cleanly implemented by creating a generic CSVView.  Every time I had a different download to do I subclassed it and implemented the get_data method appropriately.
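The shape of that pattern can be sketched without Django at all. The class below is not my original CSVView; CSVExport, OrdersExport, the params dictionary and the sample rows are all made up for illustration. In a real view, the string returned by render would be placed in an HTTP response with a text/csv content type.

```python
import csv
import io


class CSVExport(object):
    """Generic base: validate input, get data, render the data as CSV."""

    header = ()

    def validate(self, params):
        """Subclasses may override; accept everything by default."""
        return True

    def get_data(self, params):
        """Subclasses must return an iterable of rows."""
        raise NotImplementedError

    def render(self, params):
        if not self.validate(params):
            raise ValueError('invalid input')
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator='\n')
        writer.writerow(self.header)
        writer.writerows(self.get_data(params))
        return buffer.getvalue()


class OrdersExport(CSVExport):
    """Hypothetical download: only get_data and validate are specific."""

    header = ('id', 'total')

    def validate(self, params):
        return 'year' in params

    def get_data(self, params):
        # Stand-in for a real database query.
        return [(1, '9.99'), (2, '19.50')]


if __name__ == '__main__':
    print(OrdersExport().render({'year': 2012}))
```

Each new download is then a small subclass, and the shared validate/render plumbing only has to be tested once.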

For reference, Django has some great documentation on its generic views, but is lacking in documentation on class based views.  The best source of information on them right now is the source code, easily accessible on github: https://github.com/django/django/blob/master/django/views/generic/base.py


Friday, February 10, 2012

Simple Steps to Problem Solving

For the last couple months, every time I find myself with some downtime I try and answer questions on stackoverflow.com.  I see many questions regarding error logs, and built in functions.  Most of these can be solved by developing a problem solving approach.  There are a couple aspects to this approach.

1)  Read the documentation
99% of the time the best source of information about a library, built-in extension, piece of code, etc. is the documentation that comes with it.  I have only very rarely run into code whose documentation was very poor.  In my experience I usually skim over documentation, thinking that I can just look at the code samples.  I end up being wrong most of the time.  I can't even remember how many times I've been stumped and the answers to my questions were in the basic documentation.  Reading every word of the relevant documentation is never a waste of time; after all, it's the information the developers believe is necessary to use their code.
 
2) Error logs are not lying
When errors happen they are logged somewhere.  The tricky part then becomes finding out where the logs are.  The other day I was setting up an Amazon EC2 instance.  I forgot that I had whitelisted the users that were allowed to connect through ssh.  I kept getting a publickey ssh error when trying to connect from my remote machine.  The trick to solving this was to find out where the error log was located (on the actual EC2 instance), and once I found it, it clearly indicated that my account was not allowed to connect.  There are so many questions on Stack Overflow involving 500 errors or PHP errors.  Knowing where the error logs are and how to read them is an indispensable part of problem solving.
Exceptions are not lying either
In addition to the recommendation above: errors do not lie.  When the interpreter reports an error on line x, there is usually an error on line x (or just before it).  Copying the error message and pasting it into google will usually yield a result quickly.
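As a small Python illustration of how much an exception already tells you, the sketch below captures a traceback as text. The function names parse_age and describe_failure are invented for this example. The traceback names the file, the line number and the function where things went wrong, and its final line names the exception type and message, which is usually the part worth searching for.

```python
import traceback


def parse_age(raw):
    """Convert user input to an integer; fails on non-numeric text."""
    return int(raw)


def describe_failure(raw):
    """Return the full traceback text if parsing fails, else an empty string."""
    try:
        parse_age(raw)
    except ValueError:
        return traceback.format_exc()
    return ''


if __name__ == '__main__':
    print(describe_failure('abc'))
```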

3) Learn to google
Writing effective google queries is a skill.  I wish I had some advice for how to do it.  I think the problem is that the internet has such an abundance of information that even a poor googler will eventually find what they are looking for, which rewards poor search habits instead of correcting them.  Learning to effectively google for an answer is a crucial skill.


4) Learn to rely on yourself
Developing good troubleshooting and problem solving skills takes practice.  At first it will be easier to ask other people whenever you run into a roadblock.  But breaking away from this habit will be best in the long run.