Sunday, 7 July 2013

A Look At My Utility Library

Although there are many useful utility subroutines in Perl's standard modules, there are some that are missing. I'm going to spend the next few posts describing a few.

Where to Put Things

Since I want to use these subroutines in my scripts, I want to put the module somewhere Perl can easily find it. I have perlbrew in a directory called ~/perl5/perlbrew, so I decided to create a library directory under ~/perl5:

    cd ~/perl5
    mkdir lib
    cd lib
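
Perl does not search ~/perl5/lib on its own, so a script has to add the directory to @INC. Here is a minimal sketch using the standard lib pragma, assuming the directory created above and that HOME is set in the environment. Setting PERL5LIB in the shell would be an alternative.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Put the personal library directory at the front of @INC.
# Assumes the ~/perl5/lib directory created above and a defined $ENV{HOME}.
use lib "$ENV{HOME}/perl5/lib";

# Show where Perl will now look for modules.
print "$_\n" for @INC;
```

Once MyUtils.pm exists in that directory, a `use MyUtils;` placed after the `use lib` line should find it.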

Adding Content

I shall call my utility module MyUtils.pm. I open it with my favourite editor, which creates the file automatically, and add the following lines:

    #!/
    #
    #   Title   : MyUtils
    #   Purpose : Some useful subroutines
    #

    # --------------------------------------
    # Package
    package MyUtils;

Pragmatics

I'm using Perl v5.18.0, so the module requires that version and turns on strict and warnings. The UTF-8 support comes next.

    # --------------------------------------
    # Pragmatics

    use v5.18.0;

    use strict;
    use warnings;

UTF-8 for Everything

Add some pragmatics for UTF-8.

    # UTF-8 for everything
    use utf8;
    use warnings   qw( FATAL utf8 );
    use open       qw( :encoding(UTF-8) :std );
    use charnames  qw( :full :short );
    binmode( DATA, qw( :encoding(UTF-8) ));

use utf8;

This allows UTF-8 characters to be used in the module's source code.
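
To see what the pragma changes, here is a small sketch that measures the same literal with and without it. It assumes the script file itself is saved as UTF-8 and that the literal is the precomposed character U+00E9.

```perl
use strict;
use warnings;

my ( $chars, $bytes );
{
    use utf8;       # the literal is decoded into one character
    $chars = length "é";
}
{
    no utf8;        # the same literal is seen as two raw bytes
    $bytes = length "é";
}

printf "with utf8: %d, without utf8: %d\n", $chars, $bytes;
```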

use warnings qw( FATAL utf8 );

This changes warnings about UTF-8 errors into exceptions.
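
As a rough illustration, "Wide character in print" is one of the warnings in the utf8 category. The sketch below prints a character above 0xFF to an in-memory file handle that has no encoding layer; with the pragma in effect the warning should arrive as an exception that eval can catch. The handle and variable names are only for this example.

```perl
use strict;
use warnings;
use warnings qw( FATAL utf8 );

# An in-memory file handle with no encoding layer on it.
open my $fh, '>', \my $buffer
    or die "Cannot open in-memory file: $!";

# Printing a wide character here triggers a utf8 warning,
# which FATAL turns into an exception.
my $ok    = eval { print {$fh} "\x{263A}"; 1 };
my $error = $ok ? '' : $@;
close $fh;

print "caught: $error" if $error;
```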

use open qw( :encoding(UTF-8) :std );

This gives every open call an :encoding(UTF-8) layer by default. The :std part also adds an :encoding(UTF-8) layer to STDIN, STDOUT, and STDERR.
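
A small round trip shows the default layer at work. This is only a sketch: the temporary directory and the file name smiley.txt are made up for the example. One character goes out, three bytes land on disk, and one character comes back.

```perl
use strict;
use warnings;
use open qw( :encoding(UTF-8) :std );
use File::Temp qw( tempdir );

# Hypothetical scratch file for the demonstration.
my $path = tempdir( CLEANUP => 1 ) . '/smiley.txt';

# No layer is named in the open call; the pragma supplies it.
open my $out, '>', $path or die "Cannot write $path: $!";
print {$out} "\x{263A}";
close $out or die "Cannot close $path: $!";

open my $in, '<', $path or die "Cannot read $path: $!";
my $text = <$in>;
close $in or die "Cannot close $path: $!";

my $bytes = -s $path;
printf "%d character, %d bytes on disk\n", length $text, $bytes;
```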

use charnames qw( :full :short );

Enables the use of named characters in strings and regular expressions. Example: my $smiley = "\N{WHITE SMILING FACE}";
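
The two import tags select different naming styles. In this short sketch, :full covers the official Unicode names and :short covers the script:letter form. charnames::viacode goes the other way, from a code point to its name.

```perl
use strict;
use warnings;
use charnames qw( :full :short );

my $smiley = "\N{WHITE SMILING FACE}";   # :full, the official Unicode name
my $alpha  = "\N{greek:alpha}";          # :short, the script:letter form

printf "U+%04X U+%04X\n", ord $smiley, ord $alpha;

# Look a name up from its code point.
my $name = charnames::viacode(0x263A);
print "$name\n";
```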

binmode( DATA, qw( :encoding(UTF-8) ));

Adds the :encoding(UTF-8) layer to the <DATA> file handle.

Version Number

And every piece of Perl code should have its own version number:

    # --------------------------------------
    # Version
    our $VERSION = v1.0.0;
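
One thing the version number buys is that callers can ask for a minimum version. The sketch below uses a stand-in package, MyUtils::Demo, which is hypothetical and defined inline so the example runs on its own. The VERSION method is the one every package inherits from UNIVERSAL.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real module.
package MyUtils::Demo;
our $VERSION = v1.0.0;

package main;

# Asking for the current version succeeds.
my $new_enough = eval { MyUtils::Demo->VERSION('v1.0.0'); 1 } ? 1 : 0;

# Asking for a newer version than the module has throws an exception.
my $too_old = eval { MyUtils::Demo->VERSION('v2.0.0'); 1 } ? 0 : 1;

print "new enough: $new_enough, too old for v2.0.0: $too_old\n";
```

Writing `use MyUtils v1.0.0;` in a script performs the same check at load time.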

Exports

Now for what it exports. The standard module Exporter is used for this. I decided not to export anything by default so there won't be a name collision with anything in my scripts. But I want to be able to export some of the subs on request. That means I will be placing their names in the @EXPORT_OK list. (But it shall be empty for now.)

    # --------------------------------------
    # Exports
    use base qw( Exporter );
    our @EXPORT = qw( );
    our @EXPORT_OK = qw(
    );
    our %EXPORT_TAGS = (
        all  => [ @EXPORT, @EXPORT_OK ],
    );
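
To show how export-on-request looks from the caller's side, here is a sketch with a stand-in package and a made-up sub, trim. Neither is part of the module yet. Both packages sit in one file, so import is called directly instead of going through use.

```perl
use strict;
use warnings;

# Hypothetical stand-in for MyUtils with one exportable sub.
package MyUtils::Demo;
use base qw( Exporter );
our @EXPORT      = qw( );
our @EXPORT_OK   = qw( trim );
our %EXPORT_TAGS = ( all => [ @EXPORT, @EXPORT_OK ] );

# Remove leading and trailing whitespace.
sub trim {
    my ($text) = @_;
    $text =~ s/\A\s+|\s+\z//g;
    return $text;
}

package main;

# A plain "use MyUtils::Demo;" imports nothing.
MyUtils::Demo->import();
my $exported_by_default = defined &main::trim ? 1 : 0;

# "use MyUtils::Demo qw( trim );" imports on request.
MyUtils::Demo->import( qw( trim ) );
my $trimmed = trim('  hello  ');

print "default export: $exported_by_default, trimmed: '$trimmed'\n";
```

Asking for the :all tag instead of a name would import everything listed in @EXPORT and @EXPORT_OK.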

Modules

Now to add some modules:

    # --------------------------------------
    # Modules

    # Standard modules
    use Carp;
    use English qw( -no_match_vars ) ;  # Avoids regex performance penalty
    use File::Glob ':bsd_glob';
    use List::Util;
    use POSIX;
    use Scalar::Util;
    use Storable qw( dclone );

    # CPAN modules
    use List::MoreUtils;
    use Const::Fast;
    use Regexp::Common;

Modules Used

Looking at each module individually:

use Carp;

Damian Conway recommends in Perl Best Practices that carp and croak should be used instead of warn and die, respectively, since they report errors from the calling sub's point of view.
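
The difference shows up in the line number of the message. In this sketch the package MyUtils::Demo and the sub half are made up for illustration. The error should point at the line in main that made the bad call, not at the croak line inside the sub.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a utility module.
package MyUtils::Demo;
use Carp;

# Return half of an even number; complain to the caller otherwise.
sub half {
    my ($n) = @_;
    croak "half() needs an even number, got $n" if $n % 2;
    return $n / 2;
}

package main;

my $call_line = __LINE__ + 1;    # the line of the call below
my $ok    = eval { MyUtils::Demo::half(3); 1 };
my $error = $ok ? '' : $@;

print $error;
print "the call was on line $call_line\n";
```

Note that the caller is in a different package from the sub. When both are in the same package, Carp looks further up the call stack for someone to blame.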

use English qw( -no_match_vars ) ; # Avoids regex performance penalty

This allows me to use English names for Perl's special variables (see perlvar). The parameter, -no_match_vars, tells it not to load the names for the match variables, because including them would slow down every regular expression.
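
A few of the names side by side with their punctuation equivalents, as a quick sketch:

```perl
use strict;
use warnings;
use English qw( -no_match_vars );

# $EVAL_ERROR is $@
my $ok     = eval { die "boom\n"; 1 };
my $caught = $EVAL_ERROR;
print "caught: $caught";

# $PROCESS_ID is $$
print "pid: $PROCESS_ID\n";

# $LIST_SEPARATOR is $"
my $joined;
{
    local $LIST_SEPARATOR = ', ';
    my @numbers = ( 1, 2, 3 );
    $joined = "@numbers";
}
print "$joined\n";
```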

use File::Glob ':bsd_glob';

This replaces the default glob function. The default function treats spaces as argument separators, which means it won't work with files or directories that have spaces in their names. The BSD function allows spaces in names.

Example:

    # use default glob function
    my @files = glob( '*.c *.cpp' );

vs:

    # use BSD glob function
    use File::Glob qw( :bsd_glob );
    my @files = ( glob( '*.c' ), glob( '*.cpp' ));
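
Here is a self-contained sketch of the case that matters, a path with spaces in it. The directory and file names are made up, and everything is created in a temporary directory that is cleaned up afterwards.

```perl
use strict;
use warnings;
use File::Glob qw( :bsd_glob );
use File::Temp qw( tempdir );

# A directory and a file that both have a space in their names.
my $dir = tempdir( CLEANUP => 1 ) . '/my project';
mkdir $dir or die "Cannot create $dir: $!";

for my $name ( 'main.c', 'old copy.c' ) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh or die "Cannot close $dir/$name: $!";
}

# With the BSD glob the whole string is one pattern, spaces included.
my @files = glob( "$dir/*.c" );
@files = sort @files;

print "$_\n" for @files;
```

With the default glob, the same pattern would be split at the space into two separate patterns.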

The following modules are included because I sometimes use some subs from them.

use List::Util;
use POSIX;
use Scalar::Util;
use Storable qw( dclone );
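
One thing to keep in mind with the lines above: List::Util and Scalar::Util export nothing unless asked, so with a bare use their subs need the full package name. POSIX exports many of its functions by default, and dclone is imported by name. A short sketch of one sub from each:

```perl
use strict;
use warnings;
use List::Util;
use POSIX;
use Scalar::Util;
use Storable qw( dclone );

my @numbers = ( 3, 1, 4, 1, 5 );

# Nothing was imported from List::Util, so use the full name.
my $total   = List::Util::sum(@numbers);
my $largest = List::Util::max(@numbers);

# floor comes from POSIX's default exports.
my $rounded = floor( 7 / 2 );

# The same applies to Scalar::Util as to List::Util.
my $numeric = Scalar::Util::looks_like_number('1e3') ? 1 : 0;

# dclone makes a deep copy, so changing the copy leaves the original alone.
my %original = ( list => [ 1, 2, 3 ] );
my $copy     = dclone( \%original );
push @{ $copy->{list} }, 4;

printf "sum %d, max %d, floor %d, numeric %d, original %d items, copy %d items\n",
    $total, $largest, $rounded, $numeric,
    scalar @{ $original{list} }, scalar @{ $copy->{list} };
```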

Modules from CPAN

These are modules from CPAN that I find useful.

use Const::Fast;

Sets a flag in the variables so they can't be changed. It also sets the flag in any nested arrays or hashes.

use List::MoreUtils;

More utilities for lists.

use Regexp::Common;

A set of regular expressions for many things.

Finishing the Module

And some final sections in the module. I'll be adding to these in later articles. And, of course, the final statement returns a non-false value to indicate the module was loaded correctly.

    # --------------------------------------
    # Configuration Parameters

    # --------------------------------------
    # Subroutines

    1;

Complete Module

The complete module looks like this:

    #!/
    #
    #   Title   : MyUtils
    #   Purpose : Some useful subroutines
    #

    # --------------------------------------
    # Package
    package MyUtils;

    # --------------------------------------
    # Pragmatics

    use v5.18.0;

    use strict;
    use warnings;

    # UTF-8 for everything
    use utf8;
    use warnings   qw( FATAL utf8 );
    use open       qw( :encoding(UTF-8) :std );
    use charnames  qw( :full :short );
    binmode( DATA, qw( :encoding(UTF-8) ));

    # --------------------------------------
    # Version
    our $VERSION = v1.0.0;

    # --------------------------------------
    # Exports
    use base qw( Exporter );
    our @EXPORT = qw( );
    our @EXPORT_OK = qw(
    );
    our %EXPORT_TAGS = (
        all  => [ @EXPORT, @EXPORT_OK ],
    );

    # --------------------------------------
    # Modules

    # Standard modules
    use Carp;
    use English qw( -no_match_vars ) ;  # Avoids regex performance penalty
    use File::Glob ':bsd_glob';
    use List::Util;
    use POSIX;
    use Scalar::Util;
    use Storable qw( dclone );

    # CPAN modules
    use List::MoreUtils;
    use Const::Fast;
    use Regexp::Common;

    # --------------------------------------
    # Configuration Parameters

    # --------------------------------------
    # Subroutines

    1;
