Hacker News

Significant features introduced for recent versions of Perl

2024-02-02 10:33 13194 sheet.shiar.nl

The most significant features introduced for recent versions of the Perl scripting language.

Core security support is provided for 3 years, so typical users should run at least 5.34. Stable distributions such as Debian 10 maintain 5.28+. Enterprise platforms retain versions as old as 5.10.

use feature "module_true": default in use 5.37 and up, also no feature "bareword_filehandles"
sub ($var ||= default): assign values when false (or undefined on //=) instead of omitted
/(*{ … })/: optimistic eval: (?{ … }) with regex optimisations enabled
class: define object classes: packages with field variables and method subroutines (feature, experimental)
${^LAST_SUCCESSFUL_PATTERN}: explicit variable to access the previous match as in s//…/
%{^HOOK}: perform tasks require__before and require__after when calling require
PERL_RAND_SEED: environment variable to set srand random number seed
is_tainted: builtin function to check variable tainting like Scalar::Util::tainted
export_lexically: builtin function to export named functions into the calling scope
Unicode: v15.0

use v5.36: use warnings; use feature qw'signatures isa'; no feature qw'indirect multidimensional switch'
use builtin: namespace for interpreter functions, such as weaken and blessed from Scalar::Util, ceil/floor from POSIX, and trim like String::Util (experimental)
is_bool(!0): distinguish scalar variable types (by builtin functions) for data interoperability
for my ($k, $v) (%hash): iterate over multiple values at a time (including builtin::indexed for arrays) (feature, experimental)
defer {}: queue code to be executed when going out of scope (feature, experimental)
try {} finally {}: run code at the end of a try construct regardless of failure (feature, experimental)

try {} catch: exception handling similar to eval blocks (feature, experimental)
/{,n}/: empty lower bound quantifier is accepted as shorthand for 0
\x{ … }: insignificant space within curly braces, also for \b{}, \g{}, \k{}, \N{}, \o{} as well as /{m,n}/ quantifiers
0o0: octal prefix 0o alternative to 0… and oct
re::optimization(qr//): debug regular expression optimization information discovered at compile time
no feature …: disable discouraged practices of bareword_filehandles and multidimensional array emulation

isa: infix operator to check class instance (feature, experimental until 5.36)
$min < $_ <= $max: chained comparison repeats inner part as $min < $_ and $_ <= $max
/\p{Name=$var}/: match Unicode Name property like \N{} but with interpolation and subpatterns
open F, '+>>', undef: respect append mode on temporary files with mixed access
no feature 'indirect': disable indirect object notation such as new Class instead of Class->new
streamzip: program distributed with core IO::Compress::Base to compress stdin into a zip container
Unicode: v13.0

/(?<=var+)/: variable length lookbehind assertions (experimental until 5.36)
m(\p{nv=/.*/}): match unicode properties by regular expressions (experimental)
my $state if 0: workaround for state (deprecated since v5.10!) is now prohibited
qr'\N': Delimiters must be graphemes; unescaped { illegal; \N in single quotes
Unicode: v12.1

delete %hash{…}: hash slices can be deleted with key+value pairs
/(*…)/: alphabetic synonyms for assertions, e.g. (*atomic:…) for (?>…) and (*nlb:…) for (?<!…) (experimental until 5.31.6)
/(*script_run:)/: enforces all characters to be from the same script (experimental until 5.31.6)
state @a: persistent lexical array or hash variables (in addition to scalars)
perl -i -pe die: safe in-place editing: files are replaced only after successful completion
${^SAFE_LOCALES}: locales are thread-safe on supported systems, indicated by this variable
Unicode: v10.0
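
As a rough illustration of two entries in this group, the sketch below deletes a key/value hash slice and keeps a persistent state array. It assumes Perl 5.28 or later; the hash contents and the sub name `remember` are made up for the example.

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical data, only for illustration.
my %config = (host => 'localhost', port => 8080, debug => 1);

# Key/value slice delete (5.28+): removes the keys and returns key => value pairs.
my %removed = delete %config{qw(port debug)};

# state arrays can be initialised with a list since 5.28;
# the initialisation runs once and the contents persist between calls.
sub remember {
    state @seen = ('start');
    push @seen, @_;
    return scalar @seen;   # number of items recorded so far
}

remember('a');
my $count = remember('b', 'c');

print join(',', map { "$_=$removed{$_}" } sort keys %removed), "\n";
print "remaining: @{[ sort keys %config ]}; seen: $count\n";
```

On older interpreters the `delete %config{...}` line is a compile-time error, so this form is only an option where 5.28 can be assumed.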

<<~EOT: indented here-docs, strips same whitespace before delimiter in each line
@{^CAPTURE}: array of last match's captures, so ${^CAPTURE}[0] is $1
//xx: extended modifier to also ignore whitespace in bracketed character classes
use Test2::V0: generic testing framework to replace Test::* and TAP::*
Unicode: v9.0
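
A minimal sketch of the indented here-doc and `@{^CAPTURE}` entries, assuming Perl 5.26 or later. The `usage` sub and the date string are invented for the example.

```perl
use strict;
use warnings;

sub usage {
    # <<~ removes the closing delimiter's indentation from every line,
    # so the text can be indented along with the surrounding code.
    return <<~EOT;
        Usage: tool [options]
          -v  verbose
        EOT
}

my @parts;
if ('2024-02-03' =~ /^(\d+)-(\d+)-(\d+)$/) {
    # @{^CAPTURE} holds the groups of the last successful match,
    # so ${^CAPTURE}[0] is the same value as $1.
    @parts = @{^CAPTURE};
}

print usage();
print "year=$parts[0] month=$parts[1] day=$parts[2]\n";
```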

printf '%.*2$x': reordered precision arguments
/\b{lb}/: line break boundary type (position suitable for hyphenation)
/faster/: various significant speedups, notably matching fixed substrings, /i on caseless languages, 64-bit arithmetic, scope overhead
Unicode: v8.0

\$alias =: aliasing via reference (scoped as of v5.25.3) (experimental)
<<>>: safe readline ignoring open flags in arguments
/()/n: flag to disable numbered capturing, turning () into (?:)
/\b{}/: boundary types: gcb (grapheme cluster), sb (sentence), wb (word)
&.: & | ^ ~ consistently numeric, dotted operators for strings (feature, experimental until 5.28)
use re 'strict': apply stricter syntax rules to regular expression patterns (experimental)
0x.beep+0: hexadecimal floating point notation with binary power; printf '%a' to display
??: single match shorthand (deprecated since v5.14) requires the operator m?PATTERN?
Unicode: v7.0

sub ($var): subroutine signatures (feature, experimental until 5.36)
%hash{…}: hash slices return key+value pairs
[]->@*: postfix dereferencing (also e.g. $scalar->$* for $$scalar) (feature, experimental until 5.23.1)
use warnings 'once'; $a: variables $a and $b are exempt from used once warnings
Unicode: v6.3
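
The three syntax additions in this group combine naturally. The following is a sketch only, written against Perl 5.24 (where postfix dereferencing no longer needs a feature flag); `describe` and the sample data are hypothetical.

```perl
use v5.24;                 # enables strict and say; postfix deref is stable here
use warnings;
use feature 'signatures';  # experimental until 5.36
no warnings 'experimental::signatures';

# Signature with a default value for the second parameter.
sub describe ($user, $sep = ', ') {
    # %hash{...} returns key => value pairs; $ref->%{...} is its postfix form.
    my %public = $user->%{qw(name lang)};
    return join $sep, map { "$_=$public{$_}" } sort keys %public;
}

my $user  = { name => 'larry', lang => 'perl', secret => 'hunter2' };
my @langs = [qw(perl raku)]->@*;   # postfix array dereference

say describe($user);
say scalar @langs;
```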

${^LAST_FH}: last read filehandle (used by $.)
/(?[ a + b ])/: regex set operations (character subtraction -, union +, intersection &, xor ^) (experimental until 5.36)
my sub: lexical subroutines (also state, our); buggy before v5.22 (experimental until 5.26)
next $expression: loop controls allow runtime expressions
no warnings 'experimental::…': mechanism for experimental features, as of now required for smartmatch
Unicode: v6.2

__SUB__: current subroutine reference (feature)
fc, "\F": unicode foldcase to compare case-insensitively (feature)
"\N{}": automatic use charnames qw( :full :short )
Unicode: v6.1
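
To make `__SUB__` and `fc` more concrete, here is a small sketch assuming Perl 5.16 or later. `use v5.16` turns on both features along with `unicode_strings`; the helper name `same_word` is an assumption of this example.

```perl
use v5.16;      # enables strict plus the current_sub, fc and unicode_strings features
use warnings;

# __SUB__ lets an anonymous sub call itself without referring to
# the variable that holds it.
my $factorial = sub {
    my ($n) = @_;
    return $n <= 1 ? 1 : $n * __SUB__->($n - 1);
};

# fc applies Unicode case folding, which handles cases that lc and uc miss,
# such as the German sharp s folding to "ss".
sub same_word {
    my ($x, $y) = @_;
    return fc($x) eq fc($y);
}

say $factorial->(5);
say same_word("STRASSE", "stra\x{DF}e") ? "equal" : "different";
```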

s///r: non-destructive substitution
/(?{ m() })/: regular expressions can be nested in /(?{})/ and /(??{})/ (experimental until 5.20)
/dalu: regexp modifiers to restrict character classes: either default, ascii, locale, or unicode semantics.
use re '/flags': customise default modifiers
/(?^)/: construct to reset to default modifiers
FH->method: filehandle method calls load IO::File on demand (eg. STDOUT->flush)
\o{}: escape sequence for octal values beyond \777
use JSON: interface with data in JavaScript Object Notation {decode_json <>}
use HTTP::Tiny: minimal HTTP/1.1 client without LWP::UserAgent overhead
Unicode: v6.0+#8
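
A short sketch of two of the regex entries above, `s///r` and the `/a` modifier, assuming Perl 5.14 or later. The sample strings are made up.

```perl
use strict;
use warnings;

my @raw = ('  alpha ', 'beta  ', ' gamma');

# s///r returns the modified copy and leaves the original untouched,
# so it can be used inside map without clobbering @raw.
my @clean = map { s/^\s+|\s+$//gr } @raw;

# /a restricts \d to ASCII digits; without it \d also matches other
# Unicode digits such as U+0663 ARABIC-INDIC DIGIT THREE.
my $is_ascii_number = "\x{0663}" =~ /^\d$/a ? 1 : 0;

print join('|', @clean), "\n";
print "ascii digit: $is_ascii_number\n";
```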

package version: package NAME VERSION shorthand for our $VERSION
...: yada-yada operator: code placeholder
use 5.012: implicit strict if use VERSION >= v5.12
… when: when is now allowed to be used as a statement modifier
use overload 'qr': customisable conversion to regular expressions
/\N/: inverse \n to match any character except newline regardless of /s (experimental until 5.18)
each $ref etc.: array and hash container functions accept references
Unicode: v5.2

//: defined-or operator
~~: smart-match operator to compare different data types (updated in v5.10.1) (experimental)
say: print with newline, equivalent to print @_, "\n" (feature)
given: switch statement to smart-match with when/default (feature, experimental)
/(?<name>)/: named capture buffers into %+
/(?1)/: recursive regular expression patterns
/(?|)/: resets capture numbering for each contained branch
/.++/: possessive quantifiers ?+, *+, ++ to match greedily
s/keep\K//: floating positive lookbehind, efficient alternative for s/(keep)/$1/
/p: optionally preserve ${^MATCH} variables (avoiding $& penalty until COW in v5.20)
/\v/, /\h/: vertical and horizontal whitespace escapes (\V \H to invert); also /\R/ for newlines
my $_: lexically scoped version of the default variable
state: persistent my variables (scalars only until 5.28) (feature)
use autodie: replace builtin functions to throw exceptions instead of returning failure {eval {open ...} or $@->matches("open") || die}
use IO::Compress::Zip: various file compression standards {zip IO::Uncompress::Gunzip->new("test.gz") => "recompressed.zip"}
use Time::Piece: timestamps as objects {localtime->year > 1900}
use File::Fetch: generic data retrieval/download {File::Fetch->new(uri => "http://localhost/")->fetch(to => \$slurp)}
Unicode: v5.0.0
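
Several of the 5.10 additions show up together in everyday text processing. The sketch below parses an invented log line with named captures, then uses defined-or and `\K`; the log format is an assumption, not something from the article.

```perl
use strict;
use warnings;
use feature 'say';

my $line = '2024-02-03 16:05:12 [warn] disk usage at 91%';

my %entry;
if ($line =~ /^(?<date>\S+) \s+ (?<time>\S+) \s+ \[(?<level>\w+)\] \s+ (?<msg>.*)$/x) {
    %entry = %+;     # named captures are available through %+
}

# Defined-or: fall back only when the value is undef (0 and '' are kept).
my $retries = $entry{retries} // 3;

# \K keeps everything matched before it out of the replacement.
(my $masked = $line) =~ s/usage at \K\d+/NN/;

say "$entry{level}: $entry{msg}";
say "retries=$retries";
say $masked;
```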

no utf8: full unicode support, utf8 pragma only for script encoding
use open: file handle behaviour altered by PerlIO layers {binmode $fh, ":bytes"}
open $fh, '-|', @cmd: open list to fork a command without spawning a shell
open $fh, '>', \$var: perl scalars as virtual files
printf '%1$s', @args: syntax to use parameters out of order
1_2_3 == 123: underscores between digits allowed in numeric constants
use bignum: transparent big number support {length 1e100 == 101}
use if: conditional module inclusion {no if $] >= 5.022, "warnings", "redundant"}
use sort: override sort() algorithm
use Digest: calculate various message digests (data hashes) {$hash = sha256_hex($data)}
use Encode: character set conversion {encode("utf8", decode("iso-8859-1", $octets))}
use File::Temp: create a temporary file or directory safely {$fh = tempfile();}
use List::Util: general-utility list subroutines {@cards = shuffle 0..51}
use Locale::Maketext: various localization and internationalization in Locale::* and I18N::*
use Memoize: remember function results, trading space for time {memoize "stat"}
use MIME::Base64: base64 encoded strings as in email attachments
use Test::More: modern framework for unit testing {is $got, $expected}
use Time::HiRes: high resolution timers {$μs = [gettimeofday]; sleep .1; $elapsed = tv_interval $μs}
Unicode: v3.2.0
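
The I/O entries in this group can be tried with nothing but core Perl. This is a sketch under the assumption of a Unix-like system; it pipes from the running interpreter (`$^X`) only so that no external command is needed.

```perl
use strict;
use warnings;
use List::Util qw(sum max);

# In-memory file: print into a scalar as if it were a file.
my $buffer = '';
open my $out, '>', \$buffer or die "cannot open in-memory file: $!";
printf {$out} "%2\$s has %1\$d items\n", 3, 'cart';   # explicit argument indexes
close $out;

# List-form pipe open: arguments go straight to the program, no shell involved.
open my $pipe, '-|', $^X, '-e', 'print 1_000 + 234' or die "cannot fork: $!";
my $sum = <$pipe>;
close $pipe;

print $buffer;
print "piped: $sum; sum=", sum(1 .. 4), " max=", max(3, 9, 2), "\n";
```

Because the command is passed as a list, an argument containing spaces or shell metacharacters reaches the child process unchanged.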

use warnings: pragma to enable warnings in lexical scope
use utf8: experimental unicode semantics (completed in v5.8) (experimental until 5.8)
use charnames: string escape \N{} to insert named character
our: declare global variables
v1.2.3: represent strings as vector of ordinals, useful in version numbers (printf '%vd' to display)
0b0: binary numbers in literals, printf '%b', and oct
sub :lvalue: subroutine attribute to return a modifiable value (experimental until 5.20)
open my $fh, $mode, $expr: file handles in scoped scalars, third argument for unambiguous file name
pack 'q': 64-bit integer support (also large files >2GiB) (experimental until 5.8.1)
sort $coderef (): comparison function can be a subroutine reference; prototype ($$) to pass elements as normal @_
CHECK {}: special block called at end of compilation
/[[:…:]]/: POSIX character class syntax such as /[[:alpha:]]/
Unicode: v3.0.1

Comments

By tasty_freeze 2024-02-03 16:05, 1 reply

Over the past 10-12 years, Perl has been getting some significant speed boosts, on the order of 30% overall. It is of course still a very slow language when compared to compiled languages, but I'll take it.

https://blogs.perl.org/users/dimitrios_kechagias/2022/11/per...

I still code in Perl a fair bit for work. One great improvement I have taken advantage of is subroutine signatures, rather than unpacking @_ at the head of each subroutine. It is much easier to tell what the sub expects, and if the caller passes the wrong number of args it is an error (raised at run time, when the call is made).
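
A small sketch of the signature style described here, assuming Perl 5.20 or later; `area` and `area_old` are made-up functions. In Perl 5 the argument-count check fires when the call is executed, which is why the example traps it with eval.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';   # needed before 5.36, harmless after

# With a signature the expected arguments are visible in the declaration.
sub area ($width, $height) {
    return $width * $height;
}

# The same sub written the older way, unpacking @_ by hand.
sub area_old {
    my ($width, $height) = @_;
    return $width * $height;
}

my $ok = area(3, 4);

# A call with the wrong number of arguments dies when it is executed.
my $error = eval { area(3); 1 } ? '' : $@;

print "area: $ok\n";
print "error: $error";
```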

The main draw for me is that regex is built into the syntax of the language, and not a library. If the task requires a lot of pattern matching and string manipulation and speed isn't that critical, I reach for Perl.

By temporallobe 2024-02-03 18:15

Perl aficionado here - I love Perl for its regex as well. Whenever I need to do serious regex work (log parsing, for example), I pull out Perl. Many of the engineers I’ve worked with also don’t realize that it makes an extremely good shell interaction tool (much more powerful and intuitive than Bash scripting, for example) and I have developed many a TUI with it. I eventually learned Python as well, but always saw Perl as superior in many ways, especially when raw performance isn’t a consideration.

By librasteve 2024-02-03 13:51, 6 replies

perl has the distinction of being designed by a linguist and, for many coders (like me), it has a very smooth and easy feeling …

… others dislike the idea of $ sigils as “noun” markers and @ sigils to denote plurals but imo these features trick your brain into engaging these natural language concepts.

In contrast, I feel that Java is “heavy”, Python is “sciency” and so on.

The other distinctive aspect of perl is that it does not seek to constrain and block the coder. Strongly typed and opinionated languages (like Rust and Haskell) can be frustrating since they force you to code their way. That’s fine - it’s in the contract. perl is for when you just want a bag of tools to get the job done without becoming a wrestling match.

This flexibility also means that non standard code is easier to write in perl. The tool gets a bad reputation at the hands of poor coding practices. A higher level of trust and self-discipline is needed.

perl6 - now renamed to www.raku.org, continues the spirit with a cleaned up syntax and a lot more features in the core language

By bloopernova 2024-02-03 16:47

I really like Raku, but haven't yet tried to integrate it into anything I do. I was looking at fizzbuzz implementations on https://web.archive.org/web/20240116132452/https://rosettaco... (as you do), and liked the simplicity of a Raku solution:

  for 1 .. 100 {
    when $_ %% (3 & 5) { say 'FizzBuzz'; }
    when $_ %% 3       { say 'Fizz'; }
    when $_ %% 5       { say 'Buzz'; }
    default            { .say; }
  }

By ReleaseCandidat 2024-02-03 15:22, 1 reply

> perl has the distinction of being designed by a linguist

... who won the IOCCC 2 times! https://www.ioccc.org/winners.html#W

By librasteve 2024-02-03 22:17

in 1986 and 1987 ... yikes

https://www.linkedin.com/pulse/18th-december-day-perl-progra...

Obfuscation coding contests, Perl golf tournaments to create the shortest feasible code for a specified function, and Perl-language poetry collections have all grown up as a result of Perl's versatility.

Your takeaway may well be that perl is not the kind of thing that your boss would like. My takeaway is that perl (and raku) is versatile and engenders a sense of -Ofun.

By tetha 2024-02-03 15:19, 1 reply

> This flexibility also means that non standard code is easier to write in perl. The tool gets a bad reputation at the hands of poor coding practices. A higher level of trust and self-discipline is needed.

And this becomes a problem in teams.

I do somewhat miss perl, because perl and some libraries did some things better than later languages. You could express some ideas very concisely. Or, croak and carp only reporting functions crossing module or library boundaries in a stack trace (instead of some java teams complaining that the log aggregation refuses to collect 50kb+ sized stacktraces, not even exaggerating there).
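
The Carp behaviour mentioned here can be seen in a few lines. This is only a sketch: the `Temperature` package is hypothetical, and the point is that `croak` blames the caller's line rather than the line inside the library.

```perl
use strict;
use warnings;

package Temperature {
    use Carp qw(croak);

    # Hypothetical library function: rejects values below absolute zero.
    sub to_kelvin {
        my ($celsius) = @_;
        croak "temperature below absolute zero: $celsius" if $celsius < -273.15;
        return $celsius + 273.15;
    }
}

package main;

my $error = eval { Temperature::to_kelvin(-300); 1 } ? '' : $@;   # the line croak reports
my $line  = __LINE__ - 1;

# The message ends with the file and line of the call site in package main.
print $error;
```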

Alternatives like python force you into a more maintainable style. Like, don't get me wrong, you can write unmaintainable code in any language, and python would be my language of choice to make a horrible non-understandable mess with the extensive meta programming capabilities.

But with perl, you kinda have to put in effort to stay on the simple path. If you leave that part, it quickly becomes very dark. And I know some of my co-workers.

By Enk1du 2024-02-03 15:49, 1 reply

For me, it's code review that makes me write in a maintainable style. It's the team that's responsible for a code base and objections are raised whenever anyone thinks that they'll struggle to understand this or that line. We also enforce consistency. Yes, you _can_ choose any character for quoting lists, but for the love of Pete, you _will_ follow how it's done in every other case (in that file, at least).

I'll miss you qw//, but I was out-voted. :(

By sonofhans 2024-02-03 16:56, 1 reply

qw// was always my favorite, too — distinctive, and quick to type.

By librasteve 2024-02-03 22:19

  qw/one two three/;

in raku is

  <one two three>

By dontupvoteme 2024-02-03 14:49, 1 reply

Indeed, Linguistics is a sincerely underappreciated field.

It's telling that perl is/was absolutely functional at its core -- a pidgin of bash and <proper programming>, if you will.

By librasteve 2024-02-03 22:23

yeah, raku takes the perl functional heritage and adds a bit

  my $logger = -> $m { say $m; $m };
  my $add-five = -> $x { $x + 5 };
  my $add-five-and-log = $add-five o $logger;

  say $add-five-and-log(25); # Prints 25, then prints 30

https://dev.to/rawleyfowler/functional-programming-with-raku...

By washadjeffmad 2024-02-03 14:13

Surprisingly (but not really), I found myself in a position to harness my late-90s perl for cgi experience for the sciences after bioinformatics began to take off.

Python makes quite a few things easier if only because of how "teachable" it is between colleagues, but the really useful tools that bear more in common with the human relationship with data as text-at-scale were part and parcel to the POSIX-likes. The biggest challenges I faced were getting IT to let me use Linux at work and convincing peers to try it.

By archargelod 2024-02-03 14:20, 2 replies

> perl6 - now renamed to www.raku.org, continues the spirit ...

With perl7 still being worked on, is it really a continuation, or are they just branching off? I'm still not sure what their relation to each other is now. Why would one choose to use perl over raku or vice versa?

By 7thaccount 2024-02-03 14:37, 2 replies

My understanding is that there is Perl 5 (which is Perl) and a completely separate language known as Raku that was originally supposed to be Perl 6. This happened as Perl 6 took so long (similar to Duke Nukem Forever) that large portions of the Perl community decided it wasn't a realistic upgrade path and the languages had a hard fork.

Perl7 seems to be Perl5 with some changes to the defaults. They can't upgrade to Perl6 as that would be needlessly confusing. If foresight were perfect, Perl6 would have always been a different language name so Perl proper would continue without confusion. The problem was that they had no idea how hard it would be to make Perl6 and thus thought it would be ready in a few years (my understanding) and everyone would upgrade. Raku/Perl6 is SERIOUSLY COOL. I think it just needs a larger community and that's a chicken/egg problem.

By chromatic 2024-02-03 22:13, 1 reply

Your understanding is mostly accurate.

Perl 6 was intended from the start to be the next major release of Perl (at various times, a replacement for a 5.10 or 5.12 or 5.14), and it was intended to have a backwards compatibility mode to run 5.8 (or 5.10 or 5.12) code in the same process, with full interoperability.

As time went by, that plan became less and less likely. Some people came up with the idea that Perl and P6 were "sister languages", both to have new major releases. I think this happened sometime around 2009 or so, maybe as early as 2007.

Also by 2011 or so, the P6 developers effectively scuttled the backwards compatibility plan and code written to that point, but I've argued that their plan to replace Parrot was a mistake enough here and elsewhere already. (Sometimes I wonder, now that MoarVM is older than Parrot was when Parrot was declared unsuitable, if they've achieved their promised speed and compatibility goals.)

By 7thaccount 2024-02-04 2:11

Thanks Chromatic!

By librasteve 2024-02-03 23:20, 1 reply

Well the raku community may be small (1300 ish on reddit) but it is very welcoming and there is lots to help out with (RakuAST anyone?). There are 2200 or so modules on www.raku.land (plus Inline::Perl5, Inline::Python, CFFI and so on) - I learn something every day by rubbing shoulders with some amazing experts!

By 7thaccount 2024-02-04 22:00

Yep. I hope I didn't come across as negative. It's a good community from what I've seen.

By librasteve 2024-02-03 23:11

(conceptually) it's a fork - long overdue really since the whole stack has diverged

By csdvrx 2024-02-03 14:13, 1 reply

I like perl. Before, I had written a toy web server in perl (it was fun and I learned a lot), then I ported it to cosmopolitan to have it run everywhere!

This year I've written a fdisk replacement in perl, to make hybrid MBR+GPT for bootable media.

First, I wanted to check how they were made, but then I decided I wanted to programmatically write hybrid MBR in a way that's easier than gdisk and that offers more control than xorriso.

It's not complete yet, but the partition reading feature was already very helpful to understand the ins and outs of the mfg59 layout that's so popular for optical media, and the final gpt tweaks should only take a few more days.

perl allowed me to write it very quickly and to make sure it will work reliably for the years to come.

By kstrauser 2024-02-03 15:11, 1 reply

Bummer you’d been downvoted for that. It sounds like quite the ambitious project that you’ve nearly got working. Cool!

But I’m also saying this while holding a sharp stick and slowly backing away. That is not the sort of thing I’d expect someone to write in Perl, and I’m experiencing an odd mix of “that’s amazing!” and “what on earth were you thinking, my friend?!”

By csdvrx 2024-02-03 16:21, 1 reply

> Bummer you’d been downvoted for that.

Some people have an instinctual dislike of things they've been told it's fashionable to hate. I resent that, because all programming languages are interesting in their own ways.

> It sounds like quite the ambitious project that you’ve nearly got working. Cool!

Oh it's already working, it just needs more polish :)

Check https://github.com/csdvrx/hdisk if you're interested

Actually, I'll try to submit it!

> That is not the sort of thing I’d expect someone to write in Perl, and I’m experiencing an odd mix of “that’s amazing!” and “what on earth were you thinking, my friend?!”

I wanted to do it quickly :)

For decoding weird formats that mix little and big endian, I think perl pack/unpack is the fastest way.
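
As a rough sketch of what that looks like, the record below is invented for the example (it is not the layout hdisk actually parses): a 4-byte tag followed by one 32-bit field in each byte order.

```perl
use strict;
use warnings;

# Hypothetical 12-byte record: a 4-byte tag, a 32-bit little-endian value
# and a 32-bit big-endian value.
my $record = pack 'a4 V N', 'DEMO', 2048, 4096;

# V = unsigned 32-bit little-endian, N = unsigned 32-bit big-endian.
my ($tag, $le_value, $be_value) = unpack 'a4 V N', $record;

# The > and < modifiers force a byte order on other integer formats:
# the same two bytes read as big-endian, then as little-endian.
my ($big, $little) = unpack 'S> S<', "\x01\x02\x01\x02";

printf "%s %d %d\n", $tag, $le_value, $be_value;
printf "%d %d\n", $big, $little;
printf "raw bytes: %s\n", unpack 'H*', $record;
```

One template string describes the whole record, so mixed byte orders need no manual shifting or masking.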

Also, for computing crc32, I didn't have to bother much :)

By kstrauser 2024-02-03 16:50, 1 reply

I dig it, or at least the fact that you stuck it out. Well done. :)

Huh, wonder if people who aren’t familiar with the subject saw “GPT” and thought you were going off into AI gibberish land?

By csdvrx 2024-02-03 18:30

> Huh, wonder if people who aren’t familiar with the subject saw “GPT” and thought you were going off into AI gibberish land?

If they don't know GPT is a type of partition table, I think their stereotypical dislike of perl is the least of their problems!