Thursday, 25 July 2013

A Place for the Tests

In a previous article, I created an empty module for some utilities. Now it is time to populate it and to develop tests for its subroutines.

A Place for the Tests

First, I need to create a place for the tests. By convention, a t/ subdirectory is used:

  $ mkdir ~/perl5/lib/t

And I'll create a subdirectory for each module:

  $ mkdir ~/perl5/lib/t/MyUtils
  $ cd ~/perl5/lib/t/MyUtils

The trim() Function

One of the functions missing from Perl is a trim function, which gets rid of excess white space. But writing one with a universal interface takes some thought.

It should do three things:

  • Remove any leading white space.
  • Remove any trailing white space.
  • Replace sequences of white space with a single space.

The code is straightforward:

  # --------------------------------------
  #       Name: trim
  #      Usage: $text | @text = trim( @text );
  #    Purpose: Remove excess white space.
  # Parameters: @text -- A list of text to modify
  #    Returns: $text -- A line of text to be returned in scalar context
  #             @text -- A list of text to be returned in list context
  #
  sub trim {
      my @text = @_;

      for my $text ( @text ){
          $text =~ s{ \A \s+ }{}msx;
          $text =~ s{ \s+ \z }{}msx;
          $text =~ s{ \s+ }{ }gmsx;
      }

      return wantarray ? @text : $text[0];
  }
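
As a quick sketch of how the two calling contexts behave, here is the same sub called once in scalar context and once in list context. The sub is repeated so the snippet runs on its own, and the sample strings are made up for illustration.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Copy of trim() from above so this snippet runs on its own.
sub trim {
    my @text = @_;

    for my $text ( @text ){
        $text =~ s{ \A \s+ }{}msx;
        $text =~ s{ \s+ \z }{}msx;
        $text =~ s{ \s+ }{ }gmsx;
    }

    return wantarray ? @text : $text[0];
}

# Scalar context: only the first trimmed string comes back.
my $line = trim( "  hello \t  world \n" );
print "[$line]\n";    # prints "[hello world]"

# List context: every argument is trimmed and returned.
my @lines = trim( '  one', 'two  ', ' three   four ' );
print "[$_]\n" for @lines;
```

Note that in scalar context any arguments after the first are trimmed and then silently dropped, which is worth keeping in mind when calling it.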

Creating the Tests

Start by opening the test file with my favourite editor, ViM:

  $ cd ~/perl5/lib/t/MyUtils
  $ gvim 00-trim.t

The tests will be run in the sorted ASCII order of their file names, so putting 00 at the front of the name means this is the first test to run. The .t extension indicates that the file is an executable test script.
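
To see why the leading zero matters, here is a small sketch of a plain string sort on some file names. Only 00-trim.t comes from this article; the other names are hypothetical, and the sketch assumes the names are compared as plain strings, as described above.

```perl
use strict;
use warnings;

# Zero-padded prefixes (hypothetical names apart from 00-trim.t).
my @padded = sort qw( 10-format.t 02-pad.t 00-trim.t );
print "padded:   @padded\n";      # prints "padded:   00-trim.t 02-pad.t 10-format.t"

# Without padding, "10" sorts before "2" because strings are
# compared character by character.
my @unpadded = sort qw( 10-format.t 2-pad.t 0-trim.t );
print "unpadded: @unpadded\n";    # prints "unpadded: 0-trim.t 10-format.t 2-pad.t"
```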

Adding some code to 00-trim.t:

  #!/usr/bin/env perl

  use strict;
  use warnings;

Save this and make it executable:

  $ chmod a+x 00-trim.t

Now to start building the tests. Perl comes with a number of modules for testing. You can read about them in perldoc perlmodlib. I decided to use Test::More.
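
For anyone who has not used Test::More before, here is a minimal sketch of three of its most common functions. The values being tested are arbitrary examples and have nothing to do with MyUtils.

```perl
use strict;
use warnings;
use Test::More;

# ok() passes when its first argument is true.
ok( 1 + 1 == 2, 'ok() passes on a true value' );

# is() compares two scalars as strings.
is( lc('ABC'), 'abc', 'is() compares two scalars' );

# is_deeply() walks two data structures and compares them element by element.
is_deeply( [ sort qw( b a ) ], [ qw( a b ) ], 'is_deeply() compares structures' );

# Tell the harness how many tests were expected to run.
done_testing( 3 );
```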

Below is the complete script. The tests are placed in an array of hashes, @tests, which makes it easier to add more tests. If the expected value is a scalar, trim() is called in scalar context. If it is an array reference, trim() is called in list context.

  #!/usr/bin/env perl

  use strict;
  use warnings;

  my @tests = (
    {
      arguments => [ 'The quick brown fox jumped over the lazy dogs.', ],
      expected  => 'The quick brown fox jumped over the lazy dogs.',
      test_name => 'no change',
    },
    {
      arguments => [ '               The quick brown fox jumped over the lazy dogs.', ],
      expected  => 'The quick brown fox jumped over the lazy dogs.',
      test_name => 'remove leading spaces',
    },
    {
      arguments => [ 'The quick brown fox jumped over the lazy dogs.               ', ],
      expected  => 'The quick brown fox jumped over the lazy dogs.',
      test_name => 'remove trailing spaces',
    },
    {
      arguments => [ '               The quick brown fox jumped over the lazy dogs.               ', ],
      expected  => 'The quick brown fox jumped over the lazy dogs.',
      test_name => 'remove leading & trailing spaces',
    },
    {
      arguments => [ 'The            quick            brown            fox            jumped            over            the            lazy            dogs.', ],
      expected  => 'The quick brown fox jumped over the lazy dogs.',
      test_name => 'many internal spaces',
    },
    {
      arguments => [ '               The            quick            brown            fox            jumped            over            the            lazy            dogs.               ', ],
      expected  => 'The quick brown fox jumped over the lazy dogs.',
      test_name => 'spaces all over the place',
    },
    {
      arguments => [
                     'The quick brown fox jumped over the lazy dogs.',
                     '               The quick brown fox jumped over the lazy dogs.',
                     'The quick brown fox jumped over the lazy dogs.               ',
                     '               The quick brown fox jumped over the lazy dogs.               ',
                     'The            quick            brown            fox            jumped            over            the            lazy            dogs.',
                     '               The            quick            brown            fox            jumped            over            the            lazy            dogs.               ',
                   ],
      expected  => [
                      'The quick brown fox jumped over the lazy dogs.',
                      'The quick brown fox jumped over the lazy dogs.',
                      'The quick brown fox jumped over the lazy dogs.',
                      'The quick brown fox jumped over the lazy dogs.',
                      'The quick brown fox jumped over the lazy dogs.',
                      'The quick brown fox jumped over the lazy dogs.',
                   ],
      test_name => 'array interface',
    },
  );

  use Test::More;
  BEGIN{ use_ok( 'MyUtils' ); } # test #1: check to see if module can be compiled

  use MyUtils qw( trim ); # import the trim() function

  # do each test
  for my $test ( @tests ){

    # if expected is not a scalar, then test in list context
    if( my $ref =  ref( $test->{expected} )){

      # tested function returns an array
      if( $ref eq 'ARRAY' ){
        my @actual = trim( @{ $test->{arguments} } );
        is_deeply( \@actual, $test->{expected}, $test->{test_name} );

      # only arrays can be tested (so far)
      }else{
        die "cannot handle $ref references\n";

      } # end if ref eq 'ARRAY'

    # test in scalar context
    }else{
      my $actual = trim( @{ $test->{arguments} } );
      is_deeply( \$actual, \$test->{expected}, $test->{test_name} );

    } # end if ref()
  }

  # add 1 for use_ok() in BEGIN
  done_testing( 1 + scalar( @tests ) );
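
Because the tests live in a table, covering other kinds of white space is only a matter of adding entries. The sketch below shows a few extra cases (tabs, newlines, empty input) that are not in the script above. It assumes the trim() shown earlier, copied inline as a stand-in so the snippet runs without MyUtils installed, and it uses is() because every expected value here is a plain scalar.

```perl
use strict;
use warnings;
use Test::More;

# Stand-in copy of trim() so the sketch runs without MyUtils installed.
sub trim {
    my @text = @_;

    for my $text ( @text ){
        $text =~ s{ \A \s+ }{}msx;
        $text =~ s{ \s+ \z }{}msx;
        $text =~ s{ \s+ }{ }gmsx;
    }

    return wantarray ? @text : $text[0];
}

# Extra cases, in the same shape as the entries in @tests.
my @more_tests = (
  {
    arguments => [ "\tThe quick\tbrown fox\t", ],
    expected  => 'The quick brown fox',
    test_name => 'tabs',
  },
  {
    arguments => [ "The quick\n  brown fox\n", ],
    expected  => 'The quick brown fox',
    test_name => 'newlines',
  },
  {
    arguments => [ '', ],
    expected  => '',
    test_name => 'empty string',
  },
  {
    arguments => [ "  \t\n ", ],
    expected  => '',
    test_name => 'only white space',
  },
);

# All of these are scalar-context tests.
for my $test ( @more_tests ){
  my $actual = trim( @{ $test->{arguments} } );
  is( $actual, $test->{expected}, $test->{test_name} );
}

done_testing( scalar @more_tests );
```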

4 comments:

  1. Great article. It is probably worth mentioning the perl5i module, which provides many of the "missing" features of Perl:

    https://metacpan.org/module/perl5i#trim

    With perl5i, you could simply say:

    $trimmed = " quick brown fox "->trim;

  2. String::Util also provides a trim() function. And you might want to load it with Util::Any.

    I would advise against context-sensitive functions (ones that return either a scalar or an array depending on how they are called). See why in this talk: http://aaroncrane.co.uk/talks/calamitous_context
    Users who want to process a list can use map():

    my @results = map { trim($_) } @arguments;

    And it makes your tests much simpler, thanks to the separation of concerns: map() is responsible for applying the operation to all elements of a list, so you don't have this responsibility.


    In terms of style, if you keep the trim() function as it is, I would refactor the loop over the tests slightly differently, to make it (IMHO) more readable and maintainable. I hope you will welcome my participation in the big experiment of doing things in different ways. Tell me what you think :-) :

    # Process all tests
    for my $test ( @tests ){

    my ($expected, $arguments, $test_name) = @{$test}{qw( expected arguments test_name )};

    # Only handle arrays and plain scalars (ref() returns '' for a non-reference)
    my $type = ref $expected;
    die "cannot handle $type references\n" if $type ne '' && $type ne 'ARRAY';

    if($type eq '') {
    # Scalar
    my $result = trim( @{$arguments} );
    is_deeply( \$result, \$expected, $test_name );
    }
    else {
    # Array
    my @results = trim( @{$arguments} );
    is_deeply( \@results, $expected, $test_name );
    }

    } # end for(@tests)

    Replies
    1. Woops, my indentation disappeared.
      In fact, I would delete the #Scalar comment: it's really obvious with the line above (^c^)

  3. First off, thanks for writing a tutorial encouraging people to write tests. We need more of these. :-)

    A question though: Why use /ms on your regexen? Is it just because you should "always" use those? I think, from a tutorial viewpoint, it might be better to explain what these mean rather than encourage people to use them without understanding them. In this case, /m isn't doing you any good since you're not using ^ or $ (in fact, you've avoided the issue neatly by using \A and \z). And the /s isn't doing you any good since you're not using . in the regex.

    That was the main thing that jumped out at me when I read it.
