Sort::Naturally(3pm) | User Contributed Perl Documentation | Sort::Naturally(3pm) |
Sort::Naturally -- sort lexically, but sort numeral parts numerically
  @them = nsort(qw(
   foo12a foo12z foo13a foo 14 9x foo12 fooa foolio Foolio Foo12a
  ));
  print join(' ', @them), "\n";
Prints:
9x 14 foo fooa foolio Foolio foo12 foo12a Foo12a foo12z foo13a
(Or "foo12a" + "Foo12a" and "foolio" + "Foolio" might be switched, depending on your locale.)
This module exports two functions, "nsort" and "ncmp"; they are used in implementing my idea of a "natural sorting" algorithm. Under natural sorting, numeric substrings are compared numerically, and other word-characters are compared lexically.
This is the way I define natural sorting:
  foo       => "foo",  -1
  foobar    => "foo",  -1, "bar"
  foo13     => "foo",  13,
  foo13xyz  => "foo",  13, "xyz"
That's so that "foo" will come before "foo13", which will come before "foobar".
The "nsort" function takes a list of strings and returns a sorted copy of the list.
This is what most people will want to use:
@stuff = nsort(...list...);
When nsort needs to compare non-numeric substrings, it uses Perl's "cmp" operator in scope of a "use locale". And when nsort needs to lowercase things, it uses Perl's "lc" function in scope of a "use locale". If you want nsort to use other functions instead, you can specify them in an arrayref as the first argument to nsort:
  @stuff = nsort(
    [
      \&string_comparator,   # optional
      \&lowercaser_function  # optional
    ],
    ...list...
  );
If you want to specify a string comparator but no lowercaser, then the options list is "[\&comparator, '']" or "[\&comparator]". If you want to specify no string comparator but a lowercaser, then the options list is "['', \&lowercaser]".
Any comparator you specify is called as "$comparator->($left, $right)", and, like a normal Perl "cmp" replacement, must return -1, 0, or 1 depending on whether the left argument is stringwise less than, equal to, or greater than the right argument.
Any lowercaser function you specify is called as "$lowercased = $lowercaser->($original)". The routine must not modify its $_[0].
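As a minimal sketch of the calling convention described above, the following passes both options to nsort. The names bytewise_cmp and ascii_lc are hypothetical helpers invented for this illustration; they are not part of Sort::Naturally and merely stand in for whatever comparator and lowercaser you prefer.

```perl
use strict;
use warnings;
use Sort::Naturally;   # exports nsort and ncmp

# Hypothetical comparator: plain "cmp" outside the scope of "use locale".
# It returns -1, 0, or 1, as a "cmp" replacement must.
sub bytewise_cmp {
    my ($left, $right) = @_;
    return $left cmp $right;
}

# Hypothetical lowercaser: folds ASCII letters only. It works on a copy,
# so the caller's $_[0] is never modified.
sub ascii_lc {
    my ($original) = @_;
    (my $lowercased = $original) =~ tr/A-Z/a-z/;
    return $lowercased;
}

my @stuff = nsort( [ \&bytewise_cmp, \&ascii_lc ], qw(foo13a foo12z Foo12a foo) );
print join(' ', @stuff), "\n";
```

To supply only one of the two, use the forms given above: "[\&bytewise_cmp]" for a comparator alone, or "['', \&ascii_lc]" for a lowercaser alone.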
Often, when sorting non-string values like this:
@objects_sorted = sort { $a->tag cmp $b->tag } @objects;
...or even in a Schwartzian transform, like this:
  @strings =
    map $_->[0],
    sort { $a->[1] cmp $b->[1] }
    map { [$_, make_a_sort_key_from($_)] }
    @_;
...you might want something that replaces not "sort", but "cmp". That's what Sort::Naturally's "ncmp" function is for. Call it with the syntax "ncmp($left,$right)" instead of "$left cmp $right", but otherwise it's a fine replacement:
  @objects_sorted = sort { ncmp($a->tag, $b->tag) } @objects;

  @strings =
    map $_->[0],
    sort { ncmp($a->[1], $b->[1]) }
    map { [$_, make_a_sort_key_from($_)] }
    @_;
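For a version that can be run as-is, here is a small sketch using hash references in place of objects. The records and their "tag" field are made up for this illustration and stand in for the "$obj->tag" accessor used above.

```perl
use strict;
use warnings;
use Sort::Naturally;   # exports nsort and ncmp

# Hypothetical records, invented for this example.
my @records = (
    { tag => 'chapter10' },
    { tag => 'chapter9'  },
    { tag => 'chapter2'  },
);

# ncmp takes the place of "cmp" inside the sort block.
my @records_sorted = sort { ncmp($a->{tag}, $b->{tag}) } @records;

print join(' ', map { $_->{tag} } @records_sorted), "\n";
```

With a plain "cmp", "chapter10" would sort ahead of "chapter2"; with "ncmp" the numeric parts are compared as numbers, so the tags should come out as chapter2, chapter9, chapter10.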
Just as "nsort" can take a different string comparator and/or lowercaser, so can "ncmp", by passing an arrayref as the first argument:
  ncmp(
    [
      \&string_comparator,   # optional
      \&lowercaser_function  # optional
    ],
    $left, $right
  )
You might get string comparators from Sort::ArbBiLex.
  if(@set >= SOME_VERY_BIG_NUMBER) {
    no locale; # vroom vroom
    @sorted = sort(@set);  # feh, good enough
  } elsif(@set >= SOME_BIG_NUMBER) {
    use locale;
    @sorted = sort(@set);  # feh, good enough
  } else {
    # but keep it pretty for normal cases
    @sorted = nsort(@set);
  }
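The fragment above leaves SOME_VERY_BIG_NUMBER and SOME_BIG_NUMBER undefined. One way to wrap it in a reusable routine might look like the sketch below. The threshold values and the name sort_for_size are assumptions made for this illustration only; suitable cutoffs depend on your data and machine.

```perl
use strict;
use warnings;
use Sort::Naturally;   # exports nsort and ncmp

# Hypothetical thresholds, picked only for illustration.
use constant SOME_VERY_BIG_NUMBER => 100_000;
use constant SOME_BIG_NUMBER      => 10_000;

# Hypothetical helper: returns a sorted copy of its arguments, falling
# back to cheaper built-in sorts as the list grows.
sub sort_for_size {
    my @set = @_;
    my @sorted;
    if (@set >= SOME_VERY_BIG_NUMBER) {
        no locale;               # fastest: plain string sort
        @sorted = sort @set;
    }
    elsif (@set >= SOME_BIG_NUMBER) {
        use locale;              # locale-aware, but not natural
        @sorted = sort @set;
    }
    else {
        @sorted = nsort(@set);   # natural sort for normal cases
    }
    return @sorted;
}

my @sorted = sort_for_size(qw(item10 item9 item2));
print join(' ', @sorted), "\n";
```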
Copyright 2001, Sean M. Burke "sburke@cpan.org", all rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.
Sean M. Burke "sburke@cpan.org"
2022-11-19 | perl v5.36.0 |