A few weeks ago I needed to do some line-based manipulation that kinda went
beyond what you can easily do with awk. My old-SysAdmin brain kicked in and
the first thought was: if you're going to use awk and sed, you might as well
use perl. Thing is, I really can't remember the last time I wrote
even a oneliner in perl; maybe 2011, in my last SysAdmin-like position.
Since then I've been using python for almost everything, so why not? Well, the
python interpreter does not have an equivalent of perl's -n switch; and
while we're at it, -a, -F and -p are also interesting for this.
So I wrote a little program for that. Based on those switch names, I called it
pefan. As python does not have perl's special variables, and in particular
$_ and @_, the wrapper sets the line variable for each line of the input,
and, if you use the -a or -F switches, the data variable with the list
that results from splitting the line.
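
To make the idea concrete, here is a minimal sketch of such a wrapper loop. It is not pefan's actual code: the function name run_script and its parameters are made up for illustration, and it assumes the script is simply run with exec() against a namespace that holds line and data.

```python
def run_script(script, lines, split=False, split_char=None, auto_print=True):
    """Run `script` once per input line; return the lines that would be printed.

    Hypothetical helper, not pefan's real implementation.
    """
    code = compile(script, "<script>", "exec")
    namespace = {}
    output = []
    for raw in lines:
        # Like perl's $_, but under an explicit name.
        namespace["line"] = raw.rstrip("\n")
        # -a / -F: also expose the split fields; split_char=None splits on whitespace.
        if split or split_char is not None:
            namespace["data"] = namespace["line"].split(split_char)
        exec(code, namespace)
        # -p: emit whatever the script left in `line`.
        if auto_print:
            output.append(namespace["line"])
    return output


# Upper-case the second colon-separated field of every line.
result = run_script("line = data[1].upper()", ["a:b:c\n", "d:e:f\n"], split_char=":")
print(result)
```

The script can reassign line to change what gets printed, which is roughly how -p behaves with $_ in perl.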
While reading the perlrun manpage to write this post, I found out
that perl's -i and even -s sound useful, so I'll be adding support for those in the
future. I'm also thinking of adding support for curly-brace-based block
definitions, to make oneliners easier to write. Yes, it's a travesty, but it's
all in line with my push to make python more SysAdmin friendly.
In the meantime, I added a couple of switches I find useful too. See the whole usage:
usage: pefan.py [-h] [-a] -e SCRIPT [-F SPLIT_CHAR] [-i] [-M MODULE_SPEC]
                [-m MODULE_SPEC] [-N] [-n] [-p] [--no-print] [-r RANDOM]
                [-s SETUP] [-t [FORMAT]] ...

Tries to emulate Perl's (Yikes!) -peFan switches.

positional arguments:
  FILE                  Files to process. If omitted or file name is '-',
                        stdin is used. Notice you can use '-' at any point in
                        the list; f.i. "foo bar - baz".

optional arguments:
  -h, --help            show this help message and exit
  -a, --split           Turns on autosplit, so the line is split in elements.
                        The list of elements go in the 'data' variable.
  -e SCRIPT, --script SCRIPT
                        The script to run inside the loop.
  -F SPLIT_CHAR, --split-char SPLIT_CHAR
                        The field delimiter. This implies [-a|--split].
  -i, --ignore-empty    Do not print empty lines.
  -M MODULE_SPEC, --import MODULE_SPEC
                        Import modules before running any code. MODULE_SPEC can
                        be MODULE or MODULE,NAME,... The latter uses the 'from
                        MODULE import NAME, ...' variant. MODULE or NAMEs can
                        have a :AS_NAME suffix.
  -m MODULE_SPEC        Same as [-M|--import]
  -N, --enumerate-lines
                        Prepend each line with its line number, like less -N
                        does.
  -n, --iterate         Iterate over all the lines of inputs. Each line is
                        assigned in the 'line' variable. This is the default.
  -p, --print           Print the resulting line. This is the default.
  --no-print            Don't automatically print the resulting line, the
                        script knows what to do with it
  -r RANDOM, --random RANDOM
                        Print only a fraction of the output lines.
  -s SETUP, --setup SETUP
                        Code to be run as setup. Run only once after importing
                        modules and before iterating over input.
  -t [FORMAT], --timestamp [FORMAT]
                        Prepend a timestamp using FORMAT. By default prints it
                        in ISO-8601.

FORMAT can use Python's strftime()'s codes (see
https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior).
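
The MODULE_SPEC syntax is the least obvious part of that help text, so here is a rough sketch of how such a spec could be turned into an import statement. Again, this is an illustration and not pefan's code: import_statement is a hypothetical name, and the sketch assumes that a :AS_NAME suffix on MODULE is simply ignored in the 'from MODULE import ...' variant, since python has no syntax for it there.

```python
def import_statement(module_spec):
    """Translate MODULE[:AS] or MODULE,NAME[:AS],... into an import statement.

    Hypothetical helper, not pefan's real implementation.
    """
    def with_alias(item):
        # "numpy:np" -> "numpy as np"; "os" -> "os"
        name, _, alias = item.partition(":")
        return f"{name} as {alias}" if alias else name

    module, *names = module_spec.split(",")
    if names:
        # Assumption: an alias on MODULE makes no sense here, so drop it.
        module_name = module.partition(":")[0]
        imported = ", ".join(with_alias(name) for name in names)
        return f"from {module_name} import {imported}"
    return f"import {with_alias(module)}"


print(import_statement("os"))
print(import_statement("numpy:np"))
print(import_statement("os.path,join,basename:bn"))
```

The resulting string could then be passed to exec() with the same namespace the script runs in, so the imported names are visible to the code given with -e.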
Go get it here.