Monday, April 17, 2006

Don't get bitten by "\n" in Perl scripting

The Perl code below is meant to build a hash of key/value pairs from the input read through the diamond operator, and then check whether each entry read from ANOTHER_FILE exists in that hash table.

Will the Perl code below work as expected?

    while ( my $next_call = <> )
    {
            @key_value = split /\s+/, $next_call;
            $value = pop(@key_value);
            $key = pop(@key_value);
            $drv_func_call{$key} = $value;
    }

    while ( my $next_line = <ANOTHER_FILE> )
    {
            if ( ! $drv_func_call{$next_line} )
            {
                    printf "NOT CALLED: %s\n", $next_line;
            }
    }

Unfortunately, it will not. The reason is that $next_line, as read from ANOTHER_FILE, still has "\n" at the end of the string. That is, the lookup key is the function name followed by "\n". The $key stored in the hash, on the other hand, was produced by "split" on whitespace, so it has no newline attached. The two strings never match, and $drv_func_call{$next_line} returns no value as a result.
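The mismatch can be reproduced without any files. The following is only a minimal sketch: the sample strings ("drv.c foo_open 3" and "foo_open") are made up for illustration and are not taken from the real input.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for one line of each input file.
my %drv_func_call;
my @key_value = split /\s+/, "drv.c foo_open 3\n";
my $value = pop @key_value;     # last field
my $key   = pop @key_value;     # second-to-last field, no newline attached
$drv_func_call{$key} = $value;

my $next_line = "foo_open\n";   # what a line read from ANOTHER_FILE looks like

# The raw line still carries "\n", so it is a different key.
my $raw_hit = exists $drv_func_call{$next_line} ? 1 : 0;

# A copy with the trailing newline stripped matches the stored key.
( my $clean = $next_line ) =~ s/\n\z//;
my $clean_hit = exists $drv_func_call{$clean} ? 1 : 0;

printf "raw hit: %d, clean hit: %d, raw length: %d, clean length: %d\n",
    $raw_hit, $clean_hit, length($next_line), length($clean);
```

Printing the lengths makes the invisible extra character obvious: the raw line is one character longer than the key stored in the hash.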

Perl provides the chomp function to remove a trailing "\n" from a string, which fixes the problem.

The following is quoted from: http://perldoc.perl.org/functions/chomp.html

  • chomp VARIABLE

  • chomp( LIST )

  • chomp

    This safer version of chop removes any trailing string that corresponds
    to the current value of $/ (also known as $INPUT_RECORD_SEPARATOR in the
    English module). It returns the total number of characters removed from
    all its arguments. It's often used to remove the newline from the end of
    an input record when you're worried that the final record may be missing
    its newline. When in paragraph mode ($/ = ""), it removes all trailing
    newlines from the string. When in slurp mode ($/ = undef) or fixed-length
    record mode ($/ is a reference to an integer or the like, see perlvar)
    chomp() won't remove anything. If VARIABLE is omitted, it chomps $_.
    Example:

        while (<>) {
            chomp; # avoid \n on last field
            @array = split(/:/);
            # ...
        }

    If VARIABLE is a hash, it chomps the hash's values, but not its keys.


    You can actually chomp anything that's an lvalue, including an assignment:

        chomp($cwd = `pwd`);
        chomp($answer = <STDIN>);

    If you chomp a list, each element is chomped, and the total number of
    characters removed is returned.


    If the encoding pragma is in scope then the lengths returned are
    calculated from the length of $/ in Unicode characters, which is not
    always the same as the length of $/ in the native encoding.


    Note that parentheses are necessary when you're chomping anything that
    is not a simple variable. This is because chomp $cwd = `pwd`; is
    interpreted as (chomp $cwd) = `pwd`;, rather than as
    chomp( $cwd = `pwd` ) which you might expect. Similarly, chomp $a, $b is
    interpreted as chomp($a), $b rather than as chomp($a, $b).
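
(End of the quotation.) A few of the behaviours described there can be checked with a short script. This is only a sketch with made-up strings, not part of the perldoc text.

```perl
use strict;
use warnings;

my $line = "alpha\n";
my $removed = chomp($line);              # one "\n" removed; $line is now "alpha"

my $no_newline = "beta";
my $removed_none = chomp($no_newline);   # nothing to remove, string unchanged

my @lines = ( "a\n", "b\n", "c" );
my $removed_total = chomp(@lines);       # each element chomped, total returned

# Paragraph mode: all trailing newlines are removed from the string.
my $para = "text\n\n\n";
{
    local $/ = "";
    chomp($para);
}

# Parentheses are required when chomping an assignment.
chomp( my $assigned = "gamma\n" );

print "$removed $removed_none $removed_total [$para] [$assigned]\n";
# prints: 1 0 2 [text] [gamma]
```

Because chomp returns a count rather than the modified string, writing something like `my $clean = chomp($line);` stores a number in $clean, not the text.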

So the code above can be fixed as follows:

    while ( my $next_call = <> )
    {
            @key_value = split /\s+/, $next_call;
            $value = pop(@key_value);
            $key = pop(@key_value);
            $drv_func_call{$key} = $value;
    }

    while ( my $next_line = <ANOTHER_FILE> )
    {
            chomp($next_line);
            if ( ! $drv_func_call{$next_line} )
            {
                    printf "NOT CALLED: %s\n", $next_line;
            }
    }
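
For anyone who wants to try the fix without preparing input files, here is a self-contained sketch of the same logic. Several things in it are my own assumptions rather than part of the original script: the sample data, the in-memory filehandles $calls_fh and $names_fh (standing in for <> and ANOTHER_FILE), the @not_called array, and the use of "exists" instead of a truth test.

```perl
use strict;
use warnings;

# Hypothetical in-memory inputs standing in for <> and ANOTHER_FILE.
my $calls_text = "drv.c foo_open 3\ndrv.c foo_close 1\n";
my $names_text = "foo_open\nfoo_read\nfoo_close";   # last line has no "\n"

open my $calls_fh, '<', \$calls_text or die "cannot open calls: $!";
open my $names_fh, '<', \$names_text or die "cannot open names: $!";

# Build the hash: second-to-last field is the key, last field is the value.
my %drv_func_call;
while ( my $next_call = <$calls_fh> ) {
    my @key_value = split /\s+/, $next_call;
    my $value = pop @key_value;
    my $key   = pop @key_value;
    $drv_func_call{$key} = $value;
}

my @not_called;
while ( my $next_line = <$names_fh> ) {
    chomp $next_line;    # strip the trailing "\n", if there is one
    # exists tests for the key itself, so a stored value of 0 still counts.
    push @not_called, $next_line unless exists $drv_func_call{$next_line};
}

printf "NOT CALLED: %s\n", $_ for @not_called;
# prints: NOT CALLED: foo_read
```

The "exists" test is a design choice: with the plain truth test `! $drv_func_call{$next_line}`, a key whose stored value is 0 or an empty string would also be reported as not called.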

